Comment 38 for bug 2004262

Jeff Hillman (jhillman) wrote : Re: [Bug 2004262] Re: Intel E810 NICs driver in causing hangs when booting and bonds configured

I am no longer in a position to recreate the scenario.

On Mon, May 22, 2023, 2:38 AM Ubuntu Kernel Bot <email address hidden>
wrote:

> This bug is awaiting verification that the
> linux-allwinner/5.19.0-1012.12 kernel in -proposed solves the problem.
> Please test the kernel and update this bug with the results. If the
> problem is solved, change the tag 'verification-needed-kinetic' to
> 'verification-done-kinetic'. If the problem still exists, change the tag
> 'verification-needed-kinetic' to 'verification-failed-kinetic'.
>
> If verification is not done by 5 working days from today, this fix will
> be dropped from the source code, and this bug will be closed.
>
> See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on
> how to enable and use -proposed. Thank you!
>
>
> ** Tags added: kernel-spammed-kinetic-linux-allwinner
>
> --
> You received this bug notification because you are subscribed to the bug
> report.
> https://bugs.launchpad.net/bugs/2004262
>
> Title:
> Intel E810 NICs driver in causing hangs when booting and bonds
> configured
>
> Status in linux package in Ubuntu:
> Confirmed
> Status in linux source package in Jammy:
> Fix Released
> Status in linux source package in Kinetic:
> Fix Released
> Status in linux source package in Lunar:
> Confirmed
>
> Bug description:
> [Impact]
> * Intel E810-family NICs cause system hangs when booting with bonding
> enabled
> * This happens due to the driver unplugging auxiliary devices
> * The unplug event happens under RTNL lock context, which causes a
> deadlock where the RDMA driver waits for the RTNL lock to complete removal
>
> [Test Plan]
> * Users have reported that after setting up bonding on switch and
> server side, the system will hang when starting network services
>
> [Fix]
> * The upstream patch defers unplugging/re-plugging of the auxiliary
> device, so that it's not performed under the RTNL lock context.
> * Fix was introduced by commit:
> 248401cb2c46 ice: avoid bonding causing auxiliary plug/unplug
> under RTNL lock
>
> [Regression Potential]
> * Regressions would manifest in devices that support RDMA
> functionality and have been added to a bond
> * We should look out for auxiliary devices that haven't been properly
> unplugged, or that cause further issues with
> ice_plug_aux_dev()/ice_unplug_aux_dev()
>
>
> [Original Description]
> jammy 22.04.1
> linux-image-generic 5.15.0-58-generic
> Intel E810-XXV Dual Port NICs in Dell PowerEdge R650
>
> - 5.15 in jammy -> reproducible
> - 5.19 in hwe-edge -> reproducible
> - 6.2.rc6 in the mainline build -> works
> - Intel's ice driver 1.10.1.2.2 -> works
>
> After bonding is enabled on switch and server side, the system will
> hang while initializing Ubuntu. The kernel loads, but around the start
> of the Network Services the system can hang, sometimes for 5 minutes
> and in other cases indefinitely.
>
> The message:
>
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" systemd-resolve
> blocked for more than 120 seconds
>
> appears, and eventually the network services keep attempting to start
> and never do. This happens with or without DHCP enabled.
>
> Tried this same setup with the hwe-22.04, hwe-20.04, hwe-22.04-edge and
> linux-oem kernels and all exhibit the same failure.
>
> To work around this, installing version 1.10.1.2.2 of the Intel 'ice'
> driver works. The system doesn't even remotely hang at startup
> and all networking functions remain working (ping, DNS, general
> accessibility).
>
> The driver can be found at
> https://downloadmirror.intel.com/763930/ice-1.10.1.2.2.tar.gz
> ---
> ProblemType: Bug
> AlsaDevices:
> total 0
> crw-rw---- 1 root audio 116, 1 Jan 31 13:08 seq
> crw-rw---- 1 root audio 116, 33 Jan 31 13:08 timer
> AplayDevices: Error: [Errno 2] No such file or directory: 'aplay'
> ApportVersion: 2.20.11-0ubuntu82.3
> Architecture: amd64
> ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord'
> AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq',
> '/dev/snd/timer'] failed with exit code 1:
> CRDA: N/A
> CasperMD5json:
> {
> "result": "skip"
> }
> DistroRelease: Ubuntu 22.04
> InstallationDate: Installed on 2023-01-27 (3 days ago)
> InstallationMedia: Ubuntu-Server 22.04.1 LTS "Jammy Jellyfish" - Release amd64 (20220809)
> IwConfig: Error: [Errno 2] No such file or directory: 'iwconfig'
> MachineType: Dell Inc. PowerEdge R650
> Package: linux (not installed)
> PciMultimedia:
>
> ProcFB: 0 mgag200drmfb
> ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-5.15.0-58-generic
> root=UUID=668aab7c-abe9-434b-a810-acc6eab76cbc ro fsck.mode=skip
> ProcVersionSignature: Ubuntu 5.15.0-58.64-generic 5.15.74
> RelatedPackageVersions:
> linux-restricted-modules-5.15.0-58-generic N/A
> linux-backports-modules-5.15.0-58-generic N/A
> linux-firmware 20220329.git681281e4-0ubuntu3.9
> RfKill: Error: [Errno 2] No such file or directory: 'rfkill'
> Tags: jammy uec-images
> Uname: Linux 5.15.0-58-generic x86_64
> UpgradeStatus: No upgrade log present (probably fresh install)
> UserGroups: N/A
> _MarkForUpload: True
> dmi.bios.date: 09/14/2022
> dmi.bios.release: 1.8
> dmi.bios.vendor: Dell Inc.
> dmi.bios.version: 1.8.2
> dmi.board.name: 0PJ7YJ
> dmi.board.vendor: Dell Inc.
> dmi.board.version: A01
> dmi.chassis.type: 23
> dmi.chassis.vendor: Dell Inc.
> dmi.modalias:
> dmi:bvnDellInc.:bvr1.8.2:bd09/14/2022:br1.8:svnDellInc.:pnPowerEdgeR650:pvr:rvnDellInc.:rn0PJ7YJ:rvrA01:cvnDellInc.:ct23:cvr:skuSKU=0912;ModelName=PowerEdgeR650:
> dmi.product.family: PowerEdge
> dmi.product.name: PowerEdge R650
> dmi.product.sku: SKU=0912;ModelName=PowerEdge R650
> dmi.sys.vendor: Dell Inc.
>
> To manage notifications about this bug go to:
> https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2004262/+subscriptions
>
>