Devlink backport: Fix mlx5 driver hangs due to mlx5_sf_hw_table_init
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
linux-bluefield (Ubuntu) | Invalid | Undecided | Unassigned |
Jammy | Fix Committed | Undecided | Unassigned |
Bug Description
Summary:
Machine hangs when loading the OFED 2310 mlx5 driver on BlueField
How to reproduce:
# load the OFED driver
Reason:
BF got stuck; the observed call trace includes "mlx5_sf_
dmesg from minicom:
[ 726.569928] INFO: task systemd-udevd:297 blocked for more than 604 seconds.
[ 726.576895] Tainted: G OE 5.15.0-
[ 726.584101] "echo 0 > /proc/sys/
[ 726.591913] task:systemd-udevd state:D stack: 0 pid: 297 ppid: 280 flags:0x0000000d
[ 726.600248] Call trace:
[ 726.602680] __switch_
[ 726.606159] __schedule+
[ 726.609634] schedule+0x64/0x140
[ 726.612850] schedule_
[ 726.617453] __mutex_
[ 726.622141] __mutex_
[ 726.626396] mutex_lock+
[ 726.629695] devlink_
[ 726.634386] mlx5_sf_
[ 726.639882] mlx5_init_
[ 726.645791] probe_one+
[ 726.650307] local_pci_
[ 726.654043] pci_device_
[ 726.658039] really_
[ 726.661600] __driver_
[ 726.666029] driver_
[ 726.670198] __driver_
[ 726.674106] bus_for_
[ 726.677927] driver_
[ 726.681486] bus_add_
[ 726.685307] driver_
[ 726.689129] __pci_register_
[ 726.693386] __init_
[ 726.698425] do_one_
[ 726.702248] do_init_
[ 726.705983] load_module+
[ 726.709543] __do_sys_
[ 726.713885] __arm64_
[ 726.718401] invoke_
[ 726.722137] el0_svc_
[ 726.726913] do_el0_
[ 726.730215] el0_svc+0x48/0x160
[ 726.733341] el0t_64_
[ 726.737597] el0t_64_
[ 847.401924] INFO: task systemd-udevd:297 blocked for more than 724 seconds.
[ 847.408891] Tainted: G OE 5.15.0-
How to fix:
This is related to
https:/
and we need to backport/
Patches are below
Backport: f655dacb59ac net: devlink: remove unused locked functions
Backport: 012ec02ae441 netdevsim: convert driver to use unlocked devlink API during init/fini
Cherry-pick: eb0e9fa2c635 net: devlink: add unlocked variants of devlink_
SKIP: 72a4c8c94efa mlxsw: convert driver to use unlocked devlink API during init/fini
Backport: 70a2ff89369d net: devlink: add unlocked variants of devlink_dpipe*() functions
Cherry-pick: 755cfa69c4ec net: devlink: add unlocked variants of devlink_sb*() functions
Cherry-pick: c223d6a4bf6d net: devlink: add unlocked variants of devlink_resource*() functions
Cherry-pick: 852e85a704c2 net: devlink: add unlocked variants of devling_trap*() functions
Cherry-pick: e26fde2f5bef net: devlink: avoid false DEADLOCK warning reported by lock
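The trace above shows the driver init path blocked in mutex_lock() under a devlink_ call, and the listed patches add unlocked variants of devlink functions. As a rough userspace analogy only (not kernel code), the sketch below shows why that split matters, assuming the hang comes from an init path calling a lock-taking function while the same lock is already held. Every name here (`fake_devlink`, `fake_devlink_resource_register`, `fake_devl_resource_register`, and so on) is hypothetical, and a pthread error-checking mutex is used so that the second lock attempt returns EDEADLK instead of blocking forever as the kernel mutex does in the trace.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-in for a devlink instance guarded by one mutex. */
struct fake_devlink {
    pthread_mutex_t lock;
    int resources;
};

/* Set up an error-checking mutex so a self-deadlock is reported, not hung. */
static int fake_devlink_init(struct fake_devlink *dl)
{
    pthread_mutexattr_t attr;
    int err = pthread_mutexattr_init(&attr);

    if (err)
        return err;
    pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    err = pthread_mutex_init(&dl->lock, &attr);
    pthread_mutexattr_destroy(&attr);
    dl->resources = 0;
    return err;
}

/* "Unlocked" variant: the caller must already hold dl->lock. */
static void fake_devl_resource_register(struct fake_devlink *dl)
{
    /* On an error-checking mutex, trylock reports EBUSY when it is held. */
    assert(pthread_mutex_trylock(&dl->lock) == EBUSY);
    dl->resources++;
}

/* "Locked" variant: takes dl->lock itself, then calls the unlocked one. */
static int fake_devlink_resource_register(struct fake_devlink *dl)
{
    int err = pthread_mutex_lock(&dl->lock);

    if (err)
        return err; /* EDEADLK if this thread already owns the lock */
    fake_devl_resource_register(dl);
    pthread_mutex_unlock(&dl->lock);
    return 0;
}

/* Init path that holds the lock and calls the locked variant: the bug shape. */
static int init_with_locked_api(struct fake_devlink *dl)
{
    int err;

    pthread_mutex_lock(&dl->lock);
    err = fake_devlink_resource_register(dl);
    pthread_mutex_unlock(&dl->lock);
    return err;
}

/* Same init path converted to the unlocked variant: the fix shape. */
static int init_with_unlocked_api(struct fake_devlink *dl)
{
    pthread_mutex_lock(&dl->lock);
    fake_devl_resource_register(dl);
    pthread_mutex_unlock(&dl->lock);
    return 0;
}
```

In this sketch, `init_with_locked_api()` reports EDEADLK and registers nothing, while `init_with_unlocked_api()` succeeds; with a non-error-checking mutex the first path would simply block, which is the shape of the hung systemd-udevd task above.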
Thanks!
summary: |
- Devlink backport: Fix mlx5 driver hangs |
+ Devlink backport: Fix mlx5 driver hangs due to mlx5_sf_hw_table_init |
Changed in linux-bluefield (Ubuntu): | |
status: | New → Invalid |
Changed in linux-bluefield (Ubuntu Jammy): | |
status: | New → Fix Committed |
tags: |
added: verification-done-jammy-linux-bluefield removed: verification-needed-jammy-linux-bluefield |
This bug is awaiting verification that the linux-bluefield/5.15.0-1031.33 kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-jammy-linux-bluefield' to 'verification-done-jammy-linux-bluefield'. If the problem still exists, change the tag 'verification-needed-jammy-linux-bluefield' to 'verification-failed-jammy-linux-bluefield'.
If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.
See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!