xhci_hcd 0000:00:14.0: WARN Set TR Deq Ptr cmd failed due to incorrect slot or ep state
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Linux | Confirmed | High | |
Bug Description
Recently my external USB drive enclosure has begun to stop working after a bit of I/O activity (copy jobs etc.). This wasn't the case until recently. I use this enclosure as an archive backup and plug it in every month or so.
It is important to note that this issue is/was being tracked here: https:/
195 patch in comment # 176 fixes the issue.
$ uname -a
Linux kambuntu 5.11.0-25-generic #27~20.04.1-Ubuntu SMP Tue Jul 13 17:41:23 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
$ dmesg -T
[Sun Aug 15 10:47:19 2021] usb 2-2: new SuperSpeed Gen 1 USB device number 10 using xhci_hcd
[Sun Aug 15 10:47:19 2021] usb 2-2: New USB device found, idVendor=152d, idProduct=0539, bcdDevice= 1.00
[Sun Aug 15 10:47:19 2021] usb 2-2: New USB device strings: Mfr=1, Product=2, SerialNumber=5
[Sun Aug 15 10:47:19 2021] usb 2-2: Product: USB to ATA/ATAPI Bridge
[Sun Aug 15 10:47:19 2021] usb 2-2: Manufacturer: JMicron
[Sun Aug 15 10:47:19 2021] usb 2-2: SerialNumber: DCC10435415F
[Sun Aug 15 10:47:19 2021] usb-storage 2-2:1.0: USB Mass Storage device detected
[Sun Aug 15 10:47:19 2021] usb-storage 2-2:1.0: Quirks match for vid 152d pid 0539: 4000000
[Sun Aug 15 10:47:19 2021] scsi host4: usb-storage 2-2:1.0
[Sun Aug 15 10:47:20 2021] scsi 4:0:0:0: Direct-Access WDC WD30 EFRX-68AX9N0 PQ: 0 ANSI: 5
[Sun Aug 15 10:47:20 2021] scsi 4:0:0:1: Direct-Access WDC WD30 EFRX-68AX9N0 PQ: 0 ANSI: 5
[Sun Aug 15 10:47:20 2021] scsi 4:0:0:2: Direct-Access WDC WD30 EFRX-68AX9N0 PQ: 0 ANSI: 5
[Sun Aug 15 10:47:20 2021] scsi 4:0:0:3: Direct-Access WDC WD30 EFRX-68AX9N0 PQ: 0 ANSI: 5
[Sun Aug 15 10:47:20 2021] sd 4:0:0:0: Attached scsi generic sg1 type 0
[Sun Aug 15 10:47:20 2021] scsi 4:0:0:1: Attached scsi generic sg2 type 0
[Sun Aug 15 10:47:20 2021] sd 4:0:0:0: [sdb] Very big device. Trying to use READ CAPACITY(16).
[Sun Aug 15 10:47:20 2021] sd 4:0:0:0: [sdb] 5860533168 512-byte logical blocks: (3.00 TB/2.73 TiB)
[Sun Aug 15 10:47:20 2021] sd 4:0:0:2: Attached scsi generic sg3 type 0
[Sun Aug 15 10:47:20 2021] sd 4:0:0:1: [sdc] Very big device. Trying to use READ CAPACITY(16).
[Sun Aug 15 10:47:20 2021] sd 4:0:0:3: Attached scsi generic sg4 type 0
[Sun Aug 15 10:47:20 2021] sd 4:0:0:0: [sdb] Write Protect is off
[Sun Aug 15 10:47:20 2021] sd 4:0:0:0: [sdb] Mode Sense: 28 00 00 00
[Sun Aug 15 10:47:20 2021] sd 4:0:0:1: [sdc] 5860533168 512-byte logical blocks: (3.00 TB/2.73 TiB)
[Sun Aug 15 10:47:20 2021] sd 4:0:0:0: [sdb] No Caching mode page found
[Sun Aug 15 10:47:20 2021] sd 4:0:0:0: [sdb] Assuming drive cache: write through
[Sun Aug 15 10:47:20 2021] sd 4:0:0:1: [sdc] Write Protect is off
[Sun Aug 15 10:47:20 2021] sd 4:0:0:1: [sdc] Mode Sense: 28 00 00 00
[Sun Aug 15 10:47:20 2021] sd 4:0:0:2: [sdd] Very big device. Trying to use READ CAPACITY(16).
[Sun Aug 15 10:47:20 2021] sd 4:0:0:1: [sdc] No Caching mode page found
[Sun Aug 15 10:47:20 2021] sd 4:0:0:1: [sdc] Assuming drive cache: write through
[Sun Aug 15 10:47:20 2021] sd 4:0:0:3: [sde] Very big device. Trying to use READ CAPACITY(16).
[Sun Aug 15 10:47:20 2021] sd 4:0:0:3: [sde] 5860533168 512-byte logical blocks: (3.00 TB/2.73 TiB)
[Sun Aug 15 10:47:20 2021] sd 4:0:0:3: [sde] Write Protect is off
[Sun Aug 15 10:47:20 2021] sd 4:0:0:3: [sde] Mode Sense: 28 00 00 00
[Sun Aug 15 10:47:20 2021] sd 4:0:0:3: [sde] No Caching mode page found
[Sun Aug 15 10:47:20 2021] sd 4:0:0:3: [sde] Assuming drive cache: write through
[Sun Aug 15 10:47:20 2021] sd 4:0:0:2: [sdd] 5860533168 512-byte logical blocks: (3.00 TB/2.73 TiB)
[Sun Aug 15 10:47:20 2021] sd 4:0:0:2: [sdd] Write Protect is off
[Sun Aug 15 10:47:20 2021] sd 4:0:0:2: [sdd] Mode Sense: 28 00 00 00
[Sun Aug 15 10:47:20 2021] sd 4:0:0:2: [sdd] No Caching mode page found
[Sun Aug 15 10:47:20 2021] sd 4:0:0:2: [sdd] Assuming drive cache: write through
[Sun Aug 15 10:47:22 2021] sdc: sdc1
[Sun Aug 15 10:47:22 2021] sde: sde1
[Sun Aug 15 10:47:22 2021] sdb: sdb1
[Sun Aug 15 10:47:22 2021] sdd: sdd1
[Sun Aug 15 10:47:22 2021] sd 4:0:0:1: [sdc] Attached SCSI disk
[Sun Aug 15 10:47:22 2021] sd 4:0:0:3: [sde] Attached SCSI disk
[Sun Aug 15 10:47:22 2021] sd 4:0:0:0: [sdb] Attached SCSI disk
[Sun Aug 15 10:47:22 2021] sd 4:0:0:2: [sdd] Attached SCSI disk
[Sun Aug 15 11:00:35 2021] usb 2-2: USB disconnect, device number 10
[Sun Aug 15 11:00:35 2021] xhci_hcd 0000:00:14.0: WARN Set TR Deq Ptr cmd failed due to incorrect slot or ep state.
ProblemType: Bug
DistroRelease: Ubuntu 20.04
Package: ubuntu-
ProcVersionSign
Uname: Linux 5.11.0-25-generic x86_64
NonfreeKernelMo
ApportVersion: 2.20.11-
Architecture: amd64
CasperMD5CheckR
CrashDB: ubuntu
CurrentDesktop: KDE
Date: Sun Aug 15 11:30:57 2021
InstallationDate: Installed on 2021-03-26 (142 days ago)
InstallationMedia: Kubuntu 20.04.2.0 LTS "Focal Fossa" - Release amd64 (20210209.1)
PackageArchitec
ProcEnviron:
LANGUAGE=en_CA:en
PATH=(custom, no user)
XDG_RUNTIME_
LANG=en_CA.UTF-8
SHELL=/bin/bash
SourcePackage: ubuntu-
Symptom: ubuntu-
UpgradeStatus: No upgrade log present (probably fresh install)
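
For anyone who wants to check their own logs for the same pattern (a USB disconnect immediately followed by the xhci_hcd warning), here is a minimal Python sketch. It is not part of the original report: the function name `find_deq_ptr_warnings` and both regular expressions are assumptions derived only from the dmesg lines quoted above, so other kernels may word the messages differently.

```python
import re

# Matches the xHCI warning quoted in this report and captures the PCI address.
WARN_RE = re.compile(
    r"xhci_hcd (?P<pci>[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-9a-f]): "
    r"WARN Set TR Deq Ptr cmd failed"
)
# Matches "usb 2-2: USB disconnect, ..." and captures the port path.
DISCONNECT_RE = re.compile(r"usb (?P<port>[\d.-]+): USB disconnect")


def find_deq_ptr_warnings(dmesg_text):
    """Return (pci_address, last_disconnected_port) for each warning line."""
    hits = []
    last_port = None
    for line in dmesg_text.splitlines():
        match = DISCONNECT_RE.search(line)
        if match:
            last_port = match.group("port")
            continue
        match = WARN_RE.search(line)
        if match:
            hits.append((match.group("pci"), last_port))
    return hits


sample = """\
[Sun Aug 15 11:00:35 2021] usb 2-2: USB disconnect, device number 10
[Sun Aug 15 11:00:35 2021] xhci_hcd 0000:00:14.0: WARN Set TR Deq Ptr cmd failed due to incorrect slot or ep state.
"""
print(find_deq_ptr_warnings(sample))
```

Feeding it the output of `dmesg` saved to a file should list each affected controller together with the port that dropped off the bus just before the warning.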

|
#2 |

|
#3 |
Hmm, that's strange; perhaps this is a USB host problem. Please provide the dmesg output of your system.

|
#4 |
Created attachment 281677
dmesg output before reboot

|
#5 |
Created attachment 281679
dmesg output after reboot

|
#6 |
We have this xhci_hcd warning in the bad case:
xhci_hcd 0000:15:00.0: WARN Set TR Deq Ptr cmd failed due to incorrect slot or ep state
Not sure where it comes from. But I notice you are using the AMD IOMMU, which we have had trouble with in different drivers.
You could try to disable the IOMMU via a kernel boot parameter and check whether that improves things. You could also try testing this patch if possible:
https:/
If none of that helps, I will prepare some rt2800 patches to check whether this is caused by one of the v4.19 .. v4.20 rt2800 commits:
0240564430c0 rt2800: flush and txstatus rework for rt2800mmio
adf26a356f13 rt2x00: use different txstatus timeouts when flushing
5022efb50f62 rt2x00: do not check for txstatus timeout every time on tasklet
0b0d556e0ebb rt2800mmio: use txdone/txstatus routines from lib
5c656c71b1bf rt2800: move usb specific txdone/txstatus routines to rt2800lib
f483039cf51a rt2x00: use simple_
But I would rather suspect a problem introduced in the AMD IOMMU or usb/xhci drivers.

|
#7 |
I tried disabling the IOMMU, and I also compiled the 4.20.15 kernel from source with that patch applied, but the wifi didn't work in either case.

|
#8 |
Created attachment 281711
rt2x00_
Please test this patch and report whether it makes the problem go away.

|
#9 |
The problem is still there after applying that patch.

|
#10 |
You need to report this bug to the USB maintainers. I'm changing the topic and component, but USB bugs should be reported directly to the mailing list.

|
#11 |
Please send bug report to <email address hidden>

|
#12 |
I can confirm this issue. I can also confirm that other USB devices are affected, too (mostly if plugged into a USB3 port).
For example:
ID 7392:7710 Edimax Technology Co., Ltd (mt7601u)
WARN Set TR Deq Ptr cmd failed due to incorrect slot or ep state.
dmesg doesn't show IOMMU warnings, so I assume it is a problem introduced in the usb/xhci driver.

|
#13 |
(In reply to Michael from comment #10)
> I can confirm this issue. I can also confirm that other USB devices are
> affected, too (mostly if plugged into a USB3 port).
> For example:
> ID 7392:7710 Edimax Technology Co., Ltd (mt7601u)
> WARN Set TR Deq Ptr cmd failed due to incorrect slot or ep state.
>
> dmesg doesn't show IOMMU warnings, so I assume it is a problem introduced in
> usb/xhci driver.
I think this affects only a specific hardware configuration (I've tried using my wifi stick on a different machine and it worked without problems).
Which hardware are you using? Maybe there are some parts we have in common.
My hardware configuration:
CPU: AMD Ryzen 3 2200G, Motherboard: MSI B350 PC MATE
GPU: AMD Radeon RX 580 8GB

|
#14 |
@ Bernhard
The parts we have in common : AMD RYZEN
AMD RYZEN 1700 MSI X370 KRAIT, MSI AERO GTX1080Ti, 5.0.6-arch1-1-ARCH (system was also affected by IOMMU issue - but that is fixed)
Affected USB WiFi devices (tested):
ID 148f:3070 Ralink Technology, Corp. RT2870/RT3070 Wireless Adapter (ALFA AWUS036NH - rt2800usb)
ID 148f:3070 Ralink Technology, Corp. RT2870/RT3070 Wireless Adapter (ipTime/ zioncom - rt2800usb)
ID 7392:7710 Edimax Technology Co., Ltd (mt7601u)
ID 7392:a812 Edimax Technology Co., Ltd (Edimax EW-7811USC - rtl88xxau)
ID 148f:761a Ralink Technology, Corp. MT7610U ("Archer T2U" 2.4G+5G WLAN Adapter - mt76x0)
ID 0b05:17d1 ASUSTek Computer, Inc. AC51 802.11a/b/g/n/ac Wireless Adapter [Mediatek MT7610U]
ID 0a12:0001 Cambridge Silicon Radio, Ltd Bluetooth Dongle (HCI mode)
I'm sure there are more.
After fixing some driver/IOMMU issues, Stanislaw found that it could possibly be an xhci/driver issue. I share his opinion.
You can read more about the issues here:
https:/
and the fixed IOMMU issue here:
https:/

|
#15 |
FTR: I think those two commits could help:
commit 6cbcf596934c8e1
Author: Mathias Nyman <email address hidden>
Date: Fri Mar 22 17:50:15 2019 +0200
xhci: Fix port resume done detection for SS ports with LPM enabled
commit d92f2c59cc2cbca
Author: Mathias Nyman <email address hidden>
Date: Fri Mar 22 17:50:17 2019 +0200
xhci: Don't let USB3 ports stuck in polling state prevent suspend
Also, I'm not sure whether the issue was reported to the proper maintainer. If not, and the problem is not already fixed in the latest upstream, either a bisection will be needed to proceed with this bug, or a properly informative bug report must be filed with the proper maintainer.

|
#16 |
@ Stanislaw, thanks for additional information.
@ Bernhard, have you already sent this bug report to linux-usb mailing list?
can we change affected kernel version from 4.20 to >= 4.20, because 5.0.6 is affected, too?

|
#17 |
Yes, I already sent this to the mailing list, but I got no response unfortunately.
I've changed the affected kernel version btw.

|
#18 |
@ Bernhard, thanks for your answer. So there is no need for me to report this issue, too.

|
#19 |
I just tried the two patches Stanislaw mentioned, but the problem is still there.

|
#20 |
Tried them, too, some days ago, but they didn't solve the issue.
Just downloaded 5.1rc3, but I don't expect a working driver (usb/host) inside.

|
#21 |
Tested an ASUS X555U (Intel i5-6200 - 5.0.6-arch1-1-ARCH) and that system is affected, if the device is plugged into one of the USB3 ports. The device is working, if plugged into the USB2 port.

|
#22 |
I just tried replacing the xhci_ring.c file with the version from the 4.19 kernel, and that solved the problem. Then I patched the code step by step until the problem occurred again.
The change in the function "static int process_

|
#23 |
(In reply to Bernhard from comment #20)
> I just tried replacing the xhci_ring.c file with the version from the 4.19
> kernel, that solved the problem. Then I started patching the code until the
> problem occurs again.
> The change in the function "static int process_
> problem, it's part of this patch:
> https:/
> drivers/
Good findings, great. This seems to be part of
commit f8f80be501aa2f1
Author: Mathias Nyman <email address hidden>
Date: Thu Sep 20 19:13:37 2018 +0300
xhci: Use soft retry to recover faster from transaction errors
Just add information you found in the posted linux-usb email and CC "Mathias Nyman <email address hidden>" to make sure he is aware of the problem.

|
#24 |
The issue isn't fixed in 5.1rc3, so it looks like Mathias Nyman isn't aware of the problem yet.

|
#25 |
Still present in 5.1.2

|
#26 |
This issue is really funny:
running
ID 0b05:17d1 ASUSTek Computer, Inc. AC51 802.11a/b/g/n/ac Wireless Adapter [Mediatek MT7610U]
on kernel
$ uname -r
5.1.7-arch1-1-ARCH
will spam the log after the known WARN
[43163.034783] mt76x0u 1-10.2:1.0 wlp3s0f0u10u2: renamed from wlan0
[43163.351656] usb 1-10.2: USB disconnect, device number 6
[43163.352176] xhci_hcd 0000:03:00.0: WARN Set TR Deq Ptr cmd failed due to incorrect slot or ep state.
with tons of failed vendor requests:
[43160.683383] mt76x0u 1-10.2:1.0: vendor request req:06 off:c3dc failed:-71
[43160.813398] mt76x0u 1-10.2:1.0: vendor request req:06 off:c3e0 failed:-71
[43160.943415] mt76x0u 1-10.2:1.0: vendor request req:06 off:c3e4 failed:-71
[43161.073440] mt76x0u 1-10.2:1.0: vendor request req:06 off:c3e8 failed:-71
[43161.203439] mt76x0u 1-10.2:1.0: vendor request req:06 off:c3ec failed:-71
[43161.333458] mt76x0u 1-10.2:1.0: vendor request req:06 off:c3f0 failed:-71
[43161.463468] mt76x0u 1-10.2:1.0: vendor request req:06 off:c3f4 failed:-71
[43161.593561] mt76x0u 1-10.2:1.0: vendor request req:06 off:c3f8 failed:-71
[43161.723502] mt76x0u 1-10.2:1.0: vendor request req:06 off:c3fc failed:-71
[43161.853512] mt76x0u 1-10.2:1.0: vendor request req:06 off:108c failed:-71
....
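
A side note on reading these lines: the `failed:-71` is a negated errno value, and on Linux errno 71 is EPROTO (protocol error). The sketch below, which assumes the exact log format quoted above, counts such failures per error name. `summarize_vendor_failures` and the one-entry lookup table are hypothetical helpers written for this illustration, not anything taken from the mt76 driver.

```python
import re
from collections import Counter

# Linux errno value seen in this thread; hard-coded so the sketch does not
# depend on the errno numbering of the machine that runs it.
LINUX_ERRNO_NAMES = {71: "EPROTO"}

FAIL_RE = re.compile(
    r"vendor request req:(?P<req>[0-9a-f]+) "
    r"off:(?P<off>[0-9a-f]+) failed:-(?P<err>\d+)"
)


def summarize_vendor_failures(log_text):
    """Count failed vendor requests per errno name."""
    counts = Counter()
    for match in FAIL_RE.finditer(log_text):
        code = int(match.group("err"))
        counts[LINUX_ERRNO_NAMES.get(code, f"errno {code}")] += 1
    return counts


sample = """\
[43160.683383] mt76x0u 1-10.2:1.0: vendor request req:06 off:c3dc failed:-71
[43160.813398] mt76x0u 1-10.2:1.0: vendor request req:06 off:c3e0 failed:-71
[43161.853512] mt76x0u 1-10.2:1.0: vendor request req:06 off:108c failed:-71
"""
print(dict(summarize_vendor_failures(sample)))
```

If every failure maps to the same error, as here, that points at the transport (host controller or link) rather than at individual register offsets.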

|
#27 |
If the same device is connected to an Intel Core I5-6200 system (USB3 port), the log looks different to the AMD RYZEN system.
[ 204.231872] mt76x0u 1-1:1.0: rx urb failed: -71
[ 204.231901] mt76x0u 1-1:1.0: rx urb failed: -71
[ 204.231940] mt76x0u 1-1:1.0: rx urb failed: -71
[ 204.231980] mt76x0u 1-1:1.0: rx urb failed: -71
[ 204.232020] mt76x0u 1-1:1.0: rx urb failed: -71
[ 204.232188] mt76x0u 1-1:1.0: rx urb failed: -71
[ 204.232226] mt76x0u 1-1:1.0: rx urb failed: -71
[ 204.232275] mt76x0u 1-1:1.0: rx urb failed: -71
[ 204.232304] mt76x0u 1-1:1.0: rx urb failed: -71
[ 204.232345] mt76x0u 1-1:1.0: rx urb failed: -71
[ 204.233284] xhci_hcd 0000:00:14.0: WARN Cannot submit Set TR Deq Ptr
[ 204.233291] xhci_hcd 0000:00:14.0: A Set TR Deq Ptr command is pending.
[ 204.263427] mt76x0u 1-1:1.0: TX DMA did not stop
[ 207.596726] mt76x0u 1-1:1.0: Warning: MAC TX did not stop!
[ 209.650050] mt76x0u 1-1:1.0: Warning: MAC RX did not stop!
[ 209.651133] mt76x0u 1-1:1.0: RX DMA did not stop
Also, I noticed some changes in xhci-ring.c between 5.1.7 and 5.2-rc4. Maybe they'll fix the problem. I haven't tested it yet.

|
#28 |
I already tried the 5.2-rc3 kernel and the problem isn't fixed yet. There were no changes in the xhci driver between rc3 and rc4, so it's very unlikely that the problem doesn't occur in the 5.2-rc4 kernel.

|
#29 |
Thanks for the information. I skipped 5.2rc1 ... rc3.
But with your information, there is no real need for me to run some more tests.
Unfortunately it looks like the issue has been backported to older kernel versions (4.19), because I got some issue reports here, too:
https:/
and 90% of my devices don't work any longer.

|
#30 |
When did it get back ported? I'm on 4.19.48 and haven't had a problem with this version...

|
#31 |
It's just a guess, because of this post:
https:/
But it looks like the device was working before that post.
I can't test it, because I don't have such a device.
I tested a TP-LINK Archer T2UH and this device is not working on 4.19.46 arm (Raspberry Pi).

|
#32 |
Yes, rt2800usb is working fine on 4.19.46.

|
#33 |
hcxdumptool running on kernel 4.19.46 arm doesn't receive packets on several different devices. In this case:
ID 0b05:17d1 ASUSTek Computer, Inc. AC51 802.11a/b/g/n/ac Wireless Adapter [Mediatek MT7610U]
INFO: cha=1, rx=0, rx(dropped)=0, tx=18, err=0, aps=0 (0 in range)
while a few other devices are still working
INFO: cha=1, rx=805, rx(dropped)=0, tx=93, err=0, aps=29 (21 in range)
BTW:
I'm only running/testing devices whose drivers support monitor mode and packet injection.
What is very interesting on that ARM kernel is that dmesg doesn't show any WARNs.

|
#34 |
Still no fix?
$ uname -r
5.1.11-arch1-1-ARCH
and most of the USB devices (WiFi, Bluetooth, ...) are still not working:
[32942.700591] usb 1-10.4: new full-speed USB device number 7 using xhci_hcd
[32944.721410] usb 1-10.4: New USB device found, idVendor=0a12, idProduct=0001, bcdDevice=52.76
[32944.721412] usb 1-10.4: New USB device strings: Mfr=0, Product=2, SerialNumber=0
[32945.069015] Bluetooth: hci0: hardware error 0x37
How about kernel 5.2?

|
#35 |
Some USB card readers are also affected (connected to USB 3 port):
$ uname -r
5.1.12-arch1-1-ARCH
[ 3510.100114] usb 2-2: new SuperSpeed Gen 1 USB device number 2 using xhci_hcd
[ 3510.134121] usb 2-2: New USB device found, idVendor=058f, idProduct=6387, bcdDevice= 0.02
[ 3510.134126] usb 2-2: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[ 3510.134128] usb 2-2: Product: Intenso Ultra Line
[ 3510.134130] usb 2-2: Manufacturer: ALCOR
...
[ 5129.997608] usb 1-1: reset high-speed USB device number 7 using xhci_hcd
[ 5130.218618] sd 9:0:0:0: [sdb] tag#0 FAILED Result: hostbyte=DID_ERROR driverbyte=
[ 5130.218631] sd 9:0:0:0: [sdb] tag#0 CDB: Read(10) 28 00 00 20 c3 c0 00 00 20 00
[ 5130.218637] print_req_error: I/O error, dev sdb, sector 2147264 flags 80700
I really wonder why that issue hasn't been fixed, yet, because many, many devices are affected.

|
#36 |
The list of changes for 5.2-rc6 contains these two commits:
Mathias Nyman (2):
usb: xhci: Don't try to recover an endpoint if port is in error state.
xhci: detect USB 3.2 capable host controllers correctly
I think this could be the fix for this issue.

|
#37 |
Great, thanks for the information. The issue is really ugly, because many USB devices are affected (HDD, card reader, Bluetooth, WLAN, ... the list is long).
I'll check 5.2-rc6.

|
#38 |
Just tried 5.2-rc6, but unfortunately I still have the same issue.

|
#39 |
Thanks for the information. I tested 5.2-rc6, too. Even a USB 3.0 HDD isn't working.

|
#40 |
Now running mainline kernel 5.2 and the issue still exists.
Tested on this device:
ID 148f:3070 Ralink Technology, Corp. RT2870/RT3070 Wireless Adapter
but the same applies to many other devices, too
dmesg after plugging in the device:
[75.482165] usb 1-2: new high-speed USB device number 6 using xhci_hcd
[75.639236] usb 1-2: New USB device found, idVendor=148f, idProduct=3070, bcdDevice= 1.01
[75.639238] usb 1-2: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[75.639239] usb 1-2: Product: 802.11 n WLAN
[75.639240] usb 1-2: Manufacturer: Ralink
[75.639241] usb 1-2: SerialNumber: 1.0
[75.952611] usb 1-2: reset high-speed USB device number 6 using xhci_hcd
[76.107232] ieee80211 phy1: rt2x00_set_rt: Info - RT chipset 3070, rev 0201 detected
[76.120228] ieee80211 phy1: rt2x00_set_rf: Info - RF chipset 0005 detected
[76.121079] ieee80211 phy1: Selected rate control algorithm 'minstrel_ht'
[76.130873] usbcore: registered new interface driver rt2800usb
[76.194447] audit: type=1130 audit(156283349
[76.195313] rt2800usb 1-2:1.0 wlp0s20f0u2: renamed from wlan0
[76.216178] ieee80211 phy1: rt2x00lib_
[76.241382] ieee80211 phy1: rt2x00lib_
[76.544022] ieee80211 phy1: rt2x00usb_
[77.562305] ieee80211 phy1: rt2800_
[77.562316] ieee80211 phy1: rt2800usb_
...
followed by this message on access to the interface:
[341.598563] xhci_hcd 0000:00:14.0: WARN Cannot submit Set TR Deq Ptr
[341.598573] xhci_hcd 0000:00:14.0: A Set TR Deq Ptr command is pending.

|
#185 |
My controller has the PCI ID 43bb, so I've added "PCI_DEVICE_

|
#186 |
@Stanislaw, I'm running an older mobo and a RYZEN 1700.
I don't need CPU power - GPU power is more important for me (crypto analysis).

|
#187 |
[Continuing my first report in comment:https:/
$ lspci -k -nn | grep -B2 xhci
02:00.0 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset USB 3.1 XHCI Controller [1022:43d5] (rev 01)
Subsystem: ASMedia Technology Inc. Device [1b21:1142]
Kernel driver in use: xhci_hcd
I have adapted the patch by Mr. Gruszka [https:/
$ uname -a
Linux voidx 5.4.95_1 #1 SMP PREEMPT 1612063540 x86_64 GNU/Linux
If someone has some spare time to glance at it or comment on my error ;)
(diff available for 30 days) @
https:/

|
#188 |
(In reply to alpir from comment #182)
> I tried patch from comment 147. The error "WARN Set TR Deq Ptr cmd failed
> due to incorrect slot or ep state" has gone. But behavior USB3.1 still the
> same.
[snip]
> But if you connect them to USB2, then there are no errors at all.
alpir, I think you are experiencing a different issue that cannot be solved by simply disabling Soft Retry. Some more fixes are possibly needed for handling your xHCI/USB hardware. Maybe you can try the patch from comment 139? If this is a regression, maybe you can bisect to find the offending commit? Anyway, your problems will most likely require the expertise of Mathias Nyman, the xhci driver maintainer.

|
#189 |
(In reply to biopsin from comment #185)
> [Continuing my first report in
> comment:https:/
As in alpir's case, this will most likely require some different fixes, but you can check whether disabling Soft Retry works. You can disable it as shown in comment 147.
> $ lspci -k -nn | grep -B2 xhci
> 02:00.0 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] 400 Series
> Chipset USB 3.1 XHCI Controller [1022:43d5] (rev 01)
> Subsystem: ASMedia Technology Inc. Device [1b21:1142]
> Kernel driver in use: xhci_hcd
>
[snip]
> If someone has some spare time to glance at it or comment on my error ;)
> (diff available for 30 days) @
> https:/
ASMedia is subsystem_
diff --git a/drivers/
index 906a0e08821e.
--- a/drivers/
+++ b/drivers/
@@ -102,6 +102,9 @@ static void xhci_pci_
id = pci_match_
+ printk("vendor: 0x%04x device 0x%04x subvendor 0x%04x subdevice 0x%04x\n",
+ pdev->vendor, pdev->device, pdev->subsystem
+
if (id && id->driver_data) {
If those are indeed subsystem IDs, I think there is a bug in the existing xhci-pci.c quirks code:
if (pdev->vendor == PCI_VENDOR_
if (pdev->vendor == PCI_VENDOR_
if (pdev->vendor == PCI_VENDOR_
and those checks should be replaced by pdev->subsystem
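
To make the vendor versus subsystem distinction concrete, here is a small Python sketch that parses `lspci -k -nn` style text and compares the two kinds of match. It only illustrates the idea discussed above and is not kernel code: the dictionary keys, `parse_lspci`, `matches_vendor` and `matches_vendor_or_subvendor` are hypothetical names, and the ASMedia ID 0x1b21 is taken from the lspci output quoted in this thread.

```python
import re

# Matches "[vvvv:dddd]" ID pairs; class codes such as "[0c03]" have no colon.
ID_RE = re.compile(r"\[([0-9a-f]{4}):([0-9a-f]{4})\]")
ASMEDIA = 0x1B21  # vendor ID as shown in the quoted lspci output


def parse_lspci(text):
    """Parse device lines and their 'Subsystem:' lines into dictionaries."""
    devices = []
    for line in text.splitlines():
        ids = ID_RE.findall(line)
        if not ids:
            continue
        vendor, device = (int(part, 16) for part in ids[-1])
        if line.lstrip().startswith("Subsystem:"):
            if devices:  # attach to the most recent device line
                devices[-1]["subvendor"] = vendor
                devices[-1]["subdevice"] = device
        elif not line[:1].isspace():
            devices.append({"vendor": vendor, "device": device,
                            "subvendor": None, "subdevice": None})
    return devices


def matches_vendor(dev, vendor_id):
    """Check only the primary vendor ID."""
    return dev["vendor"] == vendor_id


def matches_vendor_or_subvendor(dev, vendor_id):
    """Check the primary vendor ID and the subsystem vendor ID."""
    return vendor_id in (dev["vendor"], dev["subvendor"])


sample = """\
02:00.0 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset USB 3.1 XHCI Controller [1022:43d5] (rev 01)
\tSubsystem: ASMedia Technology Inc. Device [1b21:1142]
\tKernel driver in use: xhci_hcd
"""
dev = parse_lspci(sample)[0]
print(matches_vendor(dev, ASMEDIA), matches_vendor_or_subvendor(dev, ASMEDIA))
```

For the controller quoted above, a check on the primary vendor ID alone does not match ASMedia, while a check that also looks at the subsystem vendor does, which is the gap the comment describes.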

|
#190 |
Created attachment 295065
asmedia_
This patch applies the existing xhci ASMedia quirks to ASMedia subdevices as well.
Looking at the changelog history, those quirks helped with some USB disk issues, so perhaps the patch could help with the disk issues reported here, i.e. the alpir and biopsin cases. Please test.

|
#191 |
None of the patches (comments 139, 147, 188) solved my problem.

|
#192 |
@Gruszka
Your patch [https:/
I'm currently testing it with my setup and kernel 5.4.95_x86_64.
Tested against one PATA and one SATA drive; so far I see no ill effects, but I also can't confirm or deny that it does anything in this short timespan, and much has changed since my initial post last year. I will at least continue applying it now and then throughout this year and report anything newsworthy. Thank you for your time and help!

|
#193 |
Created attachment 295151
Dmesg of a Toshiba USB 3.0 HDD connected to USB 3.0 front port and back port.
I am having this error on Linux 5.10.10-051010 while trying to connect a USB 3.0 hard disk, Toshiba Touro 4TB (HitachiGST). If I connect the disk to a USB 2.0 port it works flawlessly.
The kernel shows a different kind of error depending on whether I connect the HDD to the front or back USB 3.0 ports of the motherboard MSI X470 Gaming Plus MAX.
lspci -vnnt:
> -[0000:00]-+-00.0 Advanced Micro Devices, Inc. [AMD] Family 17h (Models
> 00h-0fh) Root Complex [1022:1450]
> +-00.2 Advanced Micro Devices, Inc. [AMD] Family 17h (Models
> 00h-0fh) I/O Memory Management Unit [1022:1451]
> +-01.0 Advanced Micro Devices, Inc. [AMD] Family 17h (Models
> 00h-1fh) PCIe Dummy Host Bridge [1022:1452]
> +-01.1-[01]----00.0 Samsung Electronics Co Ltd NVMe SSD
> Controller SM981/PM981/PM983 [144d:a808]
> +-01.3-
> [1022:43d0]
> | +-00.1 Advanced Micro Devices, Inc. [AMD] 400
> Series Chipset SATA Controller [1022:43c8]
> | \-00.2-
> | +-01.0-[22]----00.0 Realtek
> Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit
> Ethernet Controller [10ec:8168]
> | +-02.0-[23]--
> | +-03.0-[24]--
> | +-04.0-[25]--
> | \-08.0-[26]----00.0 ASMedia
> Technology Inc. ASM1142 USB 3.1 Host Controller [1b21:1242]
> +-02.0 Advanced Micro Devices, Inc. [AMD] Family 17h (Models
> 00h-1fh) PCIe Dummy Host Bridge [1022:1452]
> +-03.0 Advanced Micro Devices, Inc. [AMD] Family 17h (Models
> 00h-1fh) PCIe Dummy Host Bridge [1022:1452]
> +-03.1-[27]--+-00.0 Advanced Micro Devices, Inc. [AMD/ATI]
> Ellesmere [Radeon RX 470/480/
> | \-00.1 Advanced Micro Devices, Inc. [AMD/ATI]
> Ellesmere HDMI Audio [Radeon RX 470/480 / 570/580/590] [1002:aaf0]
> +-04.0 Advanced Micro Devices, Inc. [AMD] Family 17h (Models
> 00h-1fh) PCIe Dummy Host Bridge [1022:1452]
> +-07.0 Advanced Micro Devices, Inc. [AMD] Family 17h (Models
> 00h-1fh) PCIe Dummy Host Bridge [1022:1452]
> +-07.1-[28]--+-00.0 Advanced Micro Devices, Inc. [AMD]
> Zeppelin/
> | +-00.2 Advanced Micro Devices, Inc. [AMD] Family 17h
> (Models 00h-0fh) Platform Security Processor [1022:1456]
> | \-00.3 Advanced Micro Devices, Inc. [AMD] Zeppelin
> USB 3.0 Host controller [1022:145f]
> +-08.0 Advanced Micro Devices, Inc. [AMD] Family 17h (Models
> 00h-1fh) PCIe Dummy Host Bridge [1022:1452]
> +-08.1-[29]--+-00.0 Advance...

|
#194 |
Created attachment 295183
Dmesg of a OnePlus 7 Pro connecting in USB 3.1 gen1 mode. No errors.
(In reply to raul from comment #191)
Connecting a OnePlus 7 Pro smartphone does not show any error. This phone has a USB 3.1 gen1 port and connects in that mode without errors. I can navigate the filesystem as one would expect.

|
#195 |
Same issue with a Seagate Portable 4 TB USB 3.0 drive that I connect with usb-storage quirks as its UAS implementation is problematic. Random hangs that flood dmesg with errors.
lsusb -tv
/: Bus 02.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/4p, 5000M
ID 1d6b:0003 Linux Foundation 3.0 root hub
|__ Port 3: Dev 2, If 0, Class=Mass Storage, Driver=usb-storage, 5000M
ID 0bc2:231a Seagate RSS LLC Expansion Portable
Errors in dmesg start like this...
xhci_hcd 0000:00:10.0: WARN Cannot submit Set TR Deq Ptr
xhci_hcd 0000:00:10.0: A Set TR Deq Ptr command is pending.
usb 3-3: reset SuperSpeed Gen 1 USB device number 3 using xhci_hcd
sd 5:0:0:0: [sdd] tag#0 FAILED Result: hostbyte=
sd 5:0:0:0: [sdd] tag#0 CDB: Read(16) 88 00 00 00 00 00 a4 01 ed 78 00 00 00 10 00 00
After that:
task:usb-storage state:D stack: 0 pid: 286 ppid: 2 flags:0x00004000
Call Trace:
__schedule+
? usleep_
schedule+
schedule_
? __prepare_
__wait_
usb_sg_
usb_stor_
usb_stor_
usb_stor_
? __prepare_
? __wait_
usb_stor_
? storage_
kthread+
? __kthread_
ret_from_

|
#196 |
(In reply to Zak from comment #193)
>
>
> Errors in dmesg start like this...
>
> xhci_hcd 0000:00:10.0: WARN Cannot submit Set TR Deq Ptr
> xhci_hcd 0000:00:10.0: A Set TR Deq Ptr command is pending.
There are recent major changes in this area in the xhci driver.
The above message no longer exists, new message in this case is
"Set TR Deq already pending, don't submit for x"
Can you try this on a 5.12-rc kernel?
Thanks
Mathias
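
Since several wordings of this message appear in the thread, a small classifier can help when comparing logs from different kernel versions. This is a minimal sketch under the assumption that the message texts are exactly the ones quoted in the comments above; the labels and the `classify` function are hypothetical names chosen for illustration.

```python
# Message fragments exactly as quoted in this thread, keyed by made-up labels.
VARIANTS = {
    "cmd_failed": "WARN Set TR Deq Ptr cmd failed due to incorrect slot or ep state",
    "cannot_submit": "WARN Cannot submit Set TR Deq Ptr",
    "pending_old": "A Set TR Deq Ptr command is pending",
    "pending_new": "Set TR Deq already pending, don't submit for",
}


def classify(line):
    """Return the label of the first known message found in the line, else None."""
    for label, needle in VARIANTS.items():
        if needle in line:
            return label
    return None


print(classify("xhci_hcd 0000:00:10.0: WARN Cannot submit Set TR Deq Ptr"))
print(classify("xhci_hcd 0000:00:10.0: A Set TR Deq Ptr command is pending."))
```

Lines that return `None` are simply unrelated to this warning family.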

|
#197 |
Created attachment 296259
xhci no soft retry for Intel xhci 8086:06ed and 8086:31a8
Hi
I am having this issue on 2 systems when I plug in
a Hoco Hub HB16. The Hoco Hub HB16 is a 6 in 1 adapter that
includes
Type-C to USB3.0 x3
Type-C to HDMI
Type-C to RJ45 Ethernet (RealTek RTL8153, linux loads driver rtl8153b-2)
Type-C to Type-C(PD2.0)
USB Billboard device
Also when the device is plugged into a Windows10 machine
for the first time it presents a disk that contains the RTL8153
drivers, the user is provided with an option to install these. This
"disk" is not visible later.
The 2 systems where this device failed both reported
"WARN Set TR Deq Ptr cmd failed due to incorrect slot or ep state."
Both systems have Ubuntu Mate 20.10
$ uname -a
5.8.0-48-generic #54-Ubuntu SMP Fri Mar 19 14:25:20 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
1. Dell XPS 9500 (Intel(R) Core(TM) i5-10300H CPU @ 2.50GHz)
$ sudo lspci -k -nn | grep -B2 xhci
00:14.0 USB controller [0c03]: Intel Corporation Comet Lake USB 3.1 xHCI Host Controller [8086:06ed]
Subsystem: Dell Comet Lake USB 3.1 xHCI Host Controller [1028:097d]
Kernel driver in use: xhci_hcd
Kernel modules: xhci_pci
--
7:00.0 USB controller [0c03]: Intel Corporation JHL7540 Thunderbolt 3 USB Controller [Titan Ridge 4C 2018] [8086:15ec] (rev 06)
Subsystem: Dell JHL7540 Thunderbolt 3 USB Controller [Titan Ridge 4C 2018] [1028:097d]
Kernel driver in use: xhci_hcd
Kernel modules: xhci_pci
2. Seed Studio Odyssey J4105 (Intel(R) Celeron(R) J4105 CPU @ 1.50GHz)
$ sudo lspci -k -nn | grep -B3 xhci
00:15.0 USB controller [0c03]: Intel Corporation Device [8086:31a8] (rev 03)
DeviceName: Onboard - Other
Subsystem: Intel Corporation Device [8086:7270]
Kernel driver in use: xhci_hcd
Kernel modules: xhci_pci
I applied the changes in Stanislaw's patch at comment 176, I added the
PCI IDs to match both my systems.
I can confirm that with the patch applied both systems no longer reported the
issue "WARN Set TR Deq Ptr cmd failed due to incorrect slot or ep state."
Just to note that on the Dell XPS I use the Dell DA20 Adapter which is a Type-C
to USB and HDMI adapter. This appears to have an ASIX Elec. Corp. AX88179
USB 3.0 to Gigabit Ethernet which I don't have any issues with.

|
#198 |
Encountered this with a PCI-e card using ASMedia Technology Inc. ASM1142 USB 3.1 Host Controller
Moved to my native "Intel Corporation Device a3af" USB bus, this error disappeared (though other problems remain in my case)
Linux 5.10.33
Potentially noteworthy: when I got my Talos II, I tried to move this ASMedia USB PCI-e card to it, and found it was immediately shut down by the IOMMU whenever I tried to use it at all. It seems the firmware is garbage.
IIRC, someone was getting close to an open source firmware replacement without those issues... would be interesting to see if it helps with this bug as well.

|
#199 |
same problem
5.12.12-arch1-1 #1 SMP PREEMPT Fri, 18 Jun 2021 21:59:22 +0000 x86_64 GNU/Linux
GPD Pocket
00:00.0 Host bridge [0600]: Intel Corporation Atom/Celeron/
Subsystem: Intel Corporation Device [8086:7270]
Kernel driver in use: iosf_mbi_pci
00:02.0 VGA compatible controller [0300]: Intel Corporation Atom/Celeron/
DeviceName: Onboard IGD
Subsystem: Intel Corporation Device [8086:7270]
Kernel driver in use: i915
Kernel modules: i915
00:0b.0 Signal processing controller [1180]: Intel Corporation Atom/Celeron/
Subsystem: Intel Corporation Device [8086:7270]
Kernel driver in use: proc_thermal
Kernel modules: processor_
00:14.0 USB controller [0c03]: Intel Corporation Atom/Celeron/
Subsystem: Intel Corporation Device [8086:7270]
Kernel driver in use: xhci_hcd
Kernel modules: xhci_pci
00:1a.0 Encryption controller [1080]: Intel Corporation Atom/Celeron/
Subsystem: Intel Corporation Device [8086:7270]
Kernel modules: mei_txe
00:1c.0 PCI bridge [0604]: Intel Corporation Atom/Celeron/
Kernel driver in use: pcieport
00:1f.0 ISA bridge [0601]: Intel Corporation Atom/Celeron/
Subsystem: Intel Corporation Device [8086:7270]
Kernel modules: lpc_ich
01:00.0 Network controller [0280]: Broadcom Inc. and subsidiaries BCM4356 802.11ac Wireless Network Adapter [14e4:43ec] (rev 02)
Subsystem: Gemtek Technology Co., Ltd Device [17f9:0036]
Kernel driver in use: brcmfmac
Kernel modules: brcmfmac
# dmidecode 3.3
Getting SMBIOS data from sysfs.
SMBIOS 3.0.0 present.
Table at 0x5B8DE000.
Handle 0x0000, DMI type 0, 24 bytes
BIOS Information
Vendor: American Megatrends Inc.
Version: 5.11
Release Date: 06/28/2017
Address: 0xF0000
Runtime Size: 64 kB
ROM Size: 4 MB
Characteristics:
PCI is supported
BIOS is upgradeable
BIOS shadowing is allowed
Boot from CD is supported
Selectable boot is supported
BIOS ROM is socketed
EDD is supported
5.25"/1.2 MB floppy services are supported (int 13h)
3.5"/720 kB floppy services are supported (int 13h)
3.5"/2.88 MB floppy services are supported (int 13h)
Print screen service is supported (int 5h)
Serial services are supported (int 14h)
Printer services are supported (int 17h)
ACPI is supported
USB legacy is supported
BIOS boot specification is supported
Targeted content distribution is supported
UEFI is supported
BIOS Revision: 5.11
Handle 0x0001, DMI type 1, 27 bytes
System Information
Manufacturer: Default string
Product Name: Default string
Version: Default string
Serial Number: Default string
UUID: 03000200-
Wake-up ...

M K S (muhkamsad) wrote : | #1 |
- CurrentDmesg.txt.txt (150.8 KiB, text/plain; charset="utf-8")
- Dependencies.txt (2.3 KiB, text/plain; charset="utf-8")
- ProcCpuinfoMinimal.txt (1.4 KiB, text/plain; charset="utf-8")
description: | updated |
description: | updated |
Changed in ubuntu-release-upgrader: | |
importance: | Unknown → High |
status: | Unknown → Confirmed |

|
#200 |
I have the same problem with kernels 5.13.12 and 5.14.0-rc7:
dmesg:
xhci_hcd 0000:00:14.0: WARN Set TR Deq Ptr cmd failed due to incorrect slot or ep state.
journalctl:
ago 24 18:38:40 SERVER kernel: sd 4:0:0:0: [sda] tag#3 FAILED Result: hostbyte=

Launchpad Janitor (janitor) wrote : | #201 |
Status changed to 'Confirmed' because the bug affects multiple users.
Changed in ubuntu-release-upgrader (Ubuntu): | |
status: | New → Confirmed |

Sertac TULLUK (stulluk) wrote : | #202 |
I also experience exactly the same issue with multiple USB devices (USB-WIFI or a USB-Webcam), but only on my brand-new AMD mainboard (ASRock model: B550M-HDV).
I tried both focal and hirsute with the latest kernels on my old PC (ASUSTeK model: M5A78L-M LX3) and on my Intel NUC (NUC8BEB), and the issue does not happen there (tried with the same USB-WIFI and USB-Webcam devices).
The issue is easily reproducible by inserting the USB-WIFI device and then executing "ip a" in a shell.

|
#204 |
I also have exactly the same problem, but with slightly different hardware.
In my case it's a USB DAC branded as "Qudelix-5K". As far as I understand, it's a USB 1 device.
[ 174.358189] usb 5-2.3.2.2.1.1: new full-speed USB device number 17 using xhci_hcd
[ 174.475229] usb 5-2.3.2.2.1.1: New USB device found, idVendor=0a12, idProduct=4025, bcdDevice=19.70
[ 174.475232] usb 5-2.3.2.2.1.1: New USB device strings: Mfr=1, Product=8, SerialNumber=3
[ 174.475233] usb 5-2.3.2.2.1.1: Product: Qudelix-5K USB DAC/MIC 48KHz
[ 174.475234] usb 5-2.3.2.2.1.1: Manufacturer: QTIL
[ 174.475235] usb 5-2.3.2.2.1.1: SerialNumber: ABCDEF0123456789
It produces corrupted sound (actually just noise) after a few seconds of playback when connected to a Dell WD19TB Thunderbolt docking station. The issue happens with the USB-A ports on the dock plus one Type-C port (front). The second Type-C port (labeled "Type-C with Thunderbolt 3 port") works.
When the noise occurs, I get the following in dmesg:
xhci_hcd 0000:3a:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 5 comp_code 1
xhci_hcd 0000:3a:00.0: Looking for event-dma 00000000ffe940f0 trb-start 00000000ffe94100 trb-end 00000000ffe94100 seg-start 00000000ffe94000 seg-end 00000000ffe94ff0
xhci_hcd 0000:3a:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 5 comp_code 1
xhci_hcd 0000:3a:00.0: Looking for event-dma 00000000ffe949b0 trb-start 00000000ffe949c0 trb-end 00000000ffe949c0 seg-start 00000000ffe94000 seg-end 00000000ffe94ff0
I've tried adding and removing extra USB hubs (originally the Qudelix was plugged into the monitor's internal USB3 hub), but even when plugged directly into the dock it produces corrupted sound.
Another important detail: this dock has built-in Ethernet with the r8153 chipset mentioned above.
After reading the comments here, I tried disabling soft retry using the following patch:
diff --git a/drivers/
index 1c9a7957c45c.
--- a/drivers/
+++ b/drivers/
@@ -189,10 +189,11 @@ static void xhci_pci_
if (pdev->vendor == PCI_VENDOR_
+ xhci->quirks |= XHCI_NO_SOFT_RETRY;
}
if (pdev->vendor == PCI_VENDOR_ID_INTEL &&
And it completely fixed the issue for me. The DAC produces clear sound even when connected through a chain of two hubs!
PS.
lspci -k -nn | grep -B2 xhci
00:14.0 USB controller [0c03]: Intel Corporation Comet Lake PCH-LP USB 3.1 xHCI Host Controller [8086:02ed]
Subsystem: Hewlett-Packard Company Comet Lake PCH-LP USB 3.1 xHCI Host Controller [103c:8724]
Kernel driver in use: xhci_hcd
Kernel modules: xhci_pci
--
37:00.0 USB controller [0c03]: Intel Corporation JHL7540 Thunderbolt 3 USB Controller [Titan Ridge 4C 2018] [8086:15ec] (rev 06)
Subsystem: Hewlett-P...
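
One way to read the two "Looking for event-dma" lines in this comment is to subtract event-dma from trb-start. The sketch below does that for any such lines fed on stdin. It assumes the line format shown here, and the function name event_dma_offsets is made up for illustration. For both quoted lines the difference is 16 bytes, which is the size of one xHCI TRB, so the event appears to point at the TRB just before the one the driver was looking for.

```shell
#!/bin/sh
# Hypothetical helper: reads xhci_hcd "Looking for event-dma ..." lines on
# stdin and prints (trb-start - event-dma) in bytes, one result per line.
event_dma_offsets() {
    # Keep only the two hex addresses from matching lines.
    sed -n 's/.*event-dma \([0-9a-f]*\) trb-start \([0-9a-f]*\).*/\1 \2/p' |
    while read -r event start; do
        # Shell arithmetic accepts hex constants with a 0x prefix.
        echo $(( 0x$start - 0x$event ))
    done
}
```

Typical use would be `dmesg | event_dma_offsets`. Lines that do not match the pattern are ignored.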

|
#205 |
It turns out the problem was the cable: it was too long. A shorter USB 3.0 cable (1.8 m) allowed a stable connection. On the same Linux 5.13 (the previous dmesg was from Linux 5.10), the longer 3 m cable kept failing, while with the 1.8 m cable the HDD works without issue.
(In reply to raul from comment #191)

|
#206 |
Hi,
I also have issues with USB3 on my Debian 10 with kernel 5.10.0-
Aug 6 13:20:14 media-server kernel: [ 964.069355] scsi host17: uas_eh_
Aug 6 13:20:14 media-server kernel: [ 964.197532] usb 2-1: reset SuperSpeed Gen 1 USB device number 2 using xhci_hcd
Aug 6 13:20:14 media-server kernel: [ 964.219053] scsi host17: uas_eh_
Aug 6 13:20:18 media-server kernel: [ 968.137601] task:sync state:D stack: 0 pid:12237 ppid: 11291 flags:0x00004324
Aug 6 13:20:18 media-server kernel: [ 968.137607] Call Trace:
Aug 6 13:20:18 media-server kernel: [ 968.137621] __schedule+
Aug 6 13:20:18 media-server kernel: [ 968.137630] schedule+0x3c/0xa0
Aug 6 13:20:18 media-server kernel: [ 968.137635] io_schedule+
Aug 6 13:20:18 media-server kernel: [ 968.137644] wait_on_
Aug 6 13:20:18 media-server kernel: [ 968.137651] ? __page_
Aug 6 13:20:18 media-server kernel: [ 968.137657] wait_on_
Aug 6 13:20:18 media-server kernel: [ 968.137663] __filemap_
Aug 6 13:20:18 media-server kernel: [ 968.137673] ? sync_inodes_
Aug 6 13:20:18 media-server kernel: [ 968.137679] filemap_
Aug 6 13:20:18 media-server kernel: [ 968.137684] iterate_
Aug 6 13:20:18 media-server kernel: [ 968.137691] ksys_sync+0x7c/0xb0
Aug 6 13:20:18 media-server kernel: [ 968.137697] __do_sys_
Aug 6 13:20:18 media-server kernel: [ 968.137704] do_syscall_
Aug 6 13:20:18 media-server kernel: [ 968.137709] entry_SYSCALL_
Aug 6 13:20:18 media-server kernel: [ 968.137714] RIP: 0033:0x7fc4ec0529aa
Aug 6 13:20:18 media-server kernel: [ 968.137717] RSP: 002b:00007ffcdd
Aug 6 13:20:18 media-server kernel: [ 968.137723] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fc4ec0529aa
Aug 6 13:20:18 media-server kernel: [ 968.137725] RDX: 0000000000000001 RSI: 0000000000000000 RDI: 00000000a8002000
Aug 6 13:20:18 media-server kernel: [ 968.137728] RBP: 0000000000000000 R08: 0000555ba9703dcf R09: 00007ffcddf4afe2
Aug 6 13:20:18 media-server kernel: [ 968.137730] R10: 00007fc4ec01a201 R11: 0000000000000246 R12: 0000000000000001
Aug 6 13:20:18 media-server kernel: [ 968.137733] R13: 0000000000000001 R14: 00007ffcddf49158 R15: 0000000000000000
affects: | ubuntu-release-upgrader → linux |
Changed in ubuntu-release-upgrader (Ubuntu): | |
status: | Confirmed → Invalid |

|
#207 |
Hello everyone,
I encountered the problem with kernel 6.0.0-rc3 on a Lenovo T470 laptop with a USB3 ASIX network card. The system was started with the parameter intel_idle.
I have another similar setup (same laptop and same USB3 network card, but with Linux 6.0.0-rc2) that has been running for 8 days; it was started without the parameter intel_idle.
The distribution is Slackware 15 (64 bit).
This is the full output of dmesg.
Any feedback is welcome.
Marco
[ 0.000000] Linux version 6.0.0-rc3 (root@Cherepakha) (gcc (GCC) 11.2.0, GNU ld version 2.37-slack15) #1 SMP PREEMPT_DYNAMIC Tue Aug 30 16:07:18 CEST 2022
[ 0.000000] Command line: auto BOOT_IMAGE=Linux ro root=10303 intel_idle.
[ 0.000000] x86/fpu: Supporting XSAVE feature 0x001: 'x87 floating point registers'
[ 0.000000] x86/fpu: Supporting XSAVE feature 0x002: 'SSE registers'
[ 0.000000] x86/fpu: Supporting XSAVE feature 0x004: 'AVX registers'
[ 0.000000] x86/fpu: Supporting XSAVE feature 0x008: 'MPX bounds registers'
[ 0.000000] x86/fpu: Supporting XSAVE feature 0x010: 'MPX CSR'
[ 0.000000] x86/fpu: xstate_offset[2]: 576, xstate_sizes[2]: 256
[ 0.000000] x86/fpu: xstate_offset[3]: 832, xstate_sizes[3]: 64
[ 0.000000] x86/fpu: xstate_offset[4]: 896, xstate_sizes[4]: 64
[ 0.000000] x86/fpu: Enabled xstate features 0x1f, context size is 960 bytes, using 'compacted' format.
[ 0.000000] signal: max sigframe size: 1616
[ 0.000000] BIOS-provided physical RAM map:
[ 0.000000] BIOS-e820: [mem 0x0000000000000
[ 0.000000] BIOS-e820: [mem 0x000000000009d
[ 0.000000] BIOS-e820: [mem 0x00000000000e0
[ 0.000000] BIOS-e820: [mem 0x0000000000100
[ 0.000000] BIOS-e820: [mem 0x0000000040000
[ 0.000000] BIOS-e820: [mem 0x0000000040400
[ 0.000000] BIOS-e820: [mem 0x000000008b79c
[ 0.000000] BIOS-e820: [mem 0x0000000090653
[ 0.000000] BIOS-e820: [mem 0x0000000090654
[ 0.000000] BIOS-e820: [mem 0x000000009b52d
[ 0.000000] BIOS-e820: [mem 0x000000009b59a
[ 0.000000] BIOS-e820: [mem 0x000000009b5ff
[ 0.000000] BIOS-e820: [mem 0x00000000f0000
[ 0.000000] BIOS-e820: [mem 0x00000000fd000
[ 0.000000] BIOS-e820: [mem 0x00000000fec00
[ 0.000000] BIOS-e820: [mem 0x00000000fed00
[ 0.000000] BIOS-e820: [mem 0x00000000fed10
[ 0.000000] BIOS-e820: [mem 0x00000000fed84
[ 0.000000] BIOS-e820: [mem 0x00000000fee00
[ 0.000000] BIOS-e820: [mem 0x00...

|
#208 |
Hello everyone,
unfortunately it happened again (system started without parameters):
[ 9.561808] br0: port 2(eth1) entered forwarding state
[95735.974041] usb 2-1: USB disconnect, device number 2
[95735.974215] xhci_hcd 0000:00:14.0: WARN Set TR Deq Ptr cmd failed due to incorrect slot or ep state.
[95735.974439] xhci_hcd 0000:00:14.0: WARN Set TR Deq Ptr cmd failed due to incorrect slot or ep state.
[95735.974471] ax88179_178a 2-1:1.0 eth1: unregister 'ax88179_178a' usb-0000:00:14.0-1, ASIX AX88179 USB 3.0 Gigabit Ethernet
[95735.974523] ax88179_178a 2-1:1.0 eth1: Failed to read reg index 0x0002: -19
[95735.974532] ax88179_178a 2-1:1.0 eth1: Failed to write reg index 0x0002: -19
[95735.974595] br0: port 2(eth1) entered disabled state
[95735.974783] device eth1 left promiscuous mode
[95735.974790] br0: port 2(eth1) entered disabled state
[95735.992489] ax88179_178a 2-1:1.0 eth1 (unregistered): Failed to write reg index 0x0002: -19
[95735.992503] ax88179_178a 2-1:1.0 eth1 (unregistered): Failed to write reg index 0x0001: -19
[95735.992510] ax88179_178a 2-1:1.0 eth1 (unregistered): Failed to write reg index 0x0002: -19
[95736.215301] usb 2-1: new SuperSpeed USB device number 4 using xhci_hcd
[95736.566562] ax88179_178a 2-1:1.0 eth1: register 'ax88179_178a' at usb-0000:00:14.0-1, ASIX AX88179 USB 3.0 Gigabit Ethernet, 00:0e:c6:81:79:01
Marco

|
#209 |
I also have the issue. I'm using Proxmox 7.2 (Debian Bullseye) on a Lenovo M910q (Core i7-7700T) with two TP-Link UE300 (RTL8153) USB to 1GbE Ethernet adapters. Each one is stable in a lower USB slot. Swapping the adapters does not change the behavior; only the USB device in the higher slot is affected. Moving to different ports makes no difference.
It is easily reproducible with the following commands. Basically, I bring up bond0 again, which works initially; then I get the xhci_hcd warning and the link goes down again. System details are also below.
root@higgins:~# dmesg -C ; ifup -a ; ip link | grep enx ; \
> dmesg -H ; dmesg -C ; sleep 70 ; \
> ip link | grep enx ; dmesg -H
3: enxd03745be5afc: <BROADCAST,
16: enx54af9786ab11: <BROADCAST,
[Sep 3 11:05] device enx54af9786ab11 entered promiscuous mode
[ +0.001236] bond0: (slave enx54af9786ab11): Enslaving as a backup interface with a down link
[ +0.006363] vmbr0: the hash_elasticity option has been deprecated and is always 16
[ +0.013972] r8152 2-4:1.0 enx54af9786ab11: Promiscuous mode enabled
[ +0.001344] r8152 2-4:1.0 enx54af9786ab11: carrier on
3: enxd03745be5afc: <BROADCAST,
17: enx54af9786ab11: <BROADCAST,
[Sep 3 11:05] bond0: (slave enx54af9786ab11): link status definitely up, 1000 Mbps full duplex
[Sep 3 11:06] usb 2-4: USB disconnect, device number 12
[ +0.001544] xhci_hcd 0000:00:14.0: WARN Set TR Deq Ptr cmd failed due to incorrect slot or ep state.
[ +0.001435] bond0: (slave enx54af9786ab11): Releasing backup interface
[ +0.029081] device enx54af9786ab11 left promiscuous mode
[ +0.316190] usb 2-4: new SuperSpeed USB device number 13 using xhci_hcd
[ +0.022053] usb 2-4: New USB device found, idVendor=2357, idProduct=0601, bcdDevice=30.00
[ +0.001297] usb 2-4: New USB device strings: Mfr=1, Product=2, SerialNumber=6
[ +0.001337] usb 2-4: Product: USB 10/100/1000 LAN
[ +0.001261] usb 2-4: Manufacturer: TP-Link
[ +0.001208] usb 2-4: SerialNumber: 000001
[ +0.137200] usb 2-4: reset SuperSpeed USB device number 13 using xhci_hcd
[ +0.049197] r8152 2-4:1.0: load rtl8153a-4 v2 02/07/20 successfully
[ +0.030905] r8152 2-4:1.0 eth0: v1.12.12
[ +0.007834] r8152 2-4:1.0 enx54af9786ab11: renamed from eth0
root@higgins:~#
-------
System Details
-------
root@higgins:~# uname -a
Linux higgins 5.15.39-4-pve #1 SMP PVE 5.15.39-4 (Mon, 08 Aug 2022 15:11:15 +0200) x86_64 GNU/Linux
root@higgins:~# lspci -k -nn | grep -B2 xhci
00:14.0 USB controller [0c03]: Intel Corporation 200 Series/Z370 Chipset Family USB 3.0 xHCI Controller [8086:a2af]
Subsystem: Lenovo 200 Series/Z370 Chipset Family USB 3.0 xHCI Controller [17aa:310b]
Kernel driver in use: xhci_hcd
Kernel modules: xhci_pci
root@higgins:~# lsusb -tv
/: Bus 02.Port 1: D...

|
#210 |
(In reply to Sean Kennedy from comment #205)
> I also have the issue. Using Proxmox 7.2 (Debian Bullseye) with a Lenovo
> M910q core-i7-7700T, using two TPLink UE300 (RTL8153) USB to 1Gbe Ethernet
> adapters. Each one is stable in a lower USB slot. Swapping the adapters does
> not change the behavior and only impacts the USB device in the higher slot.
> Changes to different ports without change.
Update: I tried a different dongle (2.5GbE), with two hard drives also attached to the system. It doesn't matter where the 2.5GbE dongle is attached; it eventually errors with "WARN Set TR Deq Ptr cmd failed". The error rate is only around six times a day right now:
8156 Realtek Semiconductor Corp. USB 10/100/1G/2.5G LAN
# dmesg -T | grep xhci
[Tue Sep 6 13:37:13 2022] xhci_hcd 0000:00:14.0: xHCI Host Controller
[Tue Sep 6 13:37:13 2022] xhci_hcd 0000:00:14.0: new USB bus registered, assigned bus number 1
[Tue Sep 6 13:37:13 2022] xhci_hcd 0000:00:14.0: hcc params 0x200077c1 hci version 0x100 quirks 0x0000000000009810
[Tue Sep 6 13:37:13 2022] usb usb1: Manufacturer: Linux 5.15.39-4-pve xhci-hcd
[Tue Sep 6 13:37:13 2022] xhci_hcd 0000:00:14.0: xHCI Host Controller
[Tue Sep 6 13:37:13 2022] xhci_hcd 0000:00:14.0: new USB bus registered, assigned bus number 2
[Tue Sep 6 13:37:13 2022] xhci_hcd 0000:00:14.0: Host supports USB 3.0 SuperSpeed
[Tue Sep 6 13:37:13 2022] usb usb2: Manufacturer: Linux 5.15.39-4-pve xhci-hcd
[Tue Sep 6 13:37:13 2022] usb 2-1: new SuperSpeed USB device number 2 using xhci_hcd
[Tue Sep 6 13:37:14 2022] usb 2-3: new SuperSpeed USB device number 3 using xhci_hcd
[Tue Sep 6 13:37:14 2022] usb 2-4: new SuperSpeed USB device number 4 using xhci_hcd
[Tue Sep 6 14:39:22 2022] xhci_hcd 0000:00:14.0: WARN Set TR Deq Ptr cmd failed due to incorrect slot or ep state.
[Tue Sep 6 14:39:22 2022] usb 2-4: new SuperSpeed USB device number 5 using xhci_hcd
[Tue Sep 6 18:44:01 2022] xhci_hcd 0000:00:14.0: WARN Set TR Deq Ptr cmd failed due to incorrect slot or ep state.
[Tue Sep 6 18:44:01 2022] xhci_hcd 0000:00:14.0: WARN Set TR Deq Ptr cmd failed due to incorrect slot or ep state.
[Tue Sep 6 18:44:02 2022] usb 2-4: new SuperSpeed USB device number 6 using xhci_hcd
[Tue Sep 6 22:19:06 2022] xhci_hcd 0000:00:14.0: WARN Set TR Deq Ptr cmd failed due to incorrect slot or ep state.
[Tue Sep 6 22:19:07 2022] usb 2-4: new SuperSpeed USB device number 7 using xhci_hcd
Since this drops the device from the system and takes the link offline, I created a simple script, run by cron once a minute, that detects when no Ethernet devices are UP and then runs ifup -a. It's clunky but works.
crontab:
# m h dom mon dow command
* * * * * /root/fixnet.sh >/dev/null 2>&1
fixnet.sh:
#!/bin/sh
# Count the enx* (USB Ethernet) interfaces that report UP.
STATE=$(ip link | grep " enx" | grep -c UP)
if [ "$STATE" -gt 0 ]; then
    # All good. Exit.
    exit 0
fi
# No USB Ethernet interface is UP: bring all configured interfaces back up.
/usr/sbin/ifup -a
sleep 20
if ping -c 1 10.0.0.1 | grep -q "1 received"; then
    # Network looks good. Exit.
    exit 0
fi
sleep 310
if ! ping -c 1 10.0.0.1 | grep -q "1 received"; then
    # The network is still down.
    systemctl reboot
fi
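
To put a number on the "around six times a day" estimate, the warnings in dmesg -T output can be tallied per calendar day. This is a minimal sketch, assuming the bracketed timestamp format shown in the log above (weekday, month, day, time, year); the function name count_deq_warnings_per_day is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: reads `dmesg -T` text on stdin and prints
# "<month> <day> <count>" for each day that logged the warning.
count_deq_warnings_per_day() {
    awk '
        /Set TR Deq Ptr cmd failed/ {
            # With default field splitting, $2 is the month and $3 the day.
            day = $2 " " $3
            if (!(day in count)) order[++n] = day   # remember first-seen order
            count[day]++
        }
        END { for (i = 1; i <= n; i++) print order[i], count[order[i]] }
    '
}
```

Typical use would be `dmesg -T | count_deq_warnings_per_day`. Days are printed in the order they first appear in the log.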
no longer affects: | ubuntu-release-upgrader (Ubuntu) |

|
#211 |
I'm using a 2.5gb ethernet usb device and getting this error intermittently (a dozen times per day).
$ uname -a
Linux hephaestus 5.4.0-135-generic #152-Ubuntu SMP Wed Nov 23 20:19:22 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
$ lsusb
<snip>
Bus 003 Device 016: ID 0bda:8156 Realtek Semiconductor Corp. USB 10/100/1G/2.5G
This is what plays out via /var/log/syslog each time:
Dec 21 10:26:47 hephaestus kernel: [346923.166782] usb 3-4: USB disconnect, device number 15
Dec 21 10:26:47 hephaestus kernel: [346923.166913] xhci_hcd 0000:00:14.0: WARN Set TR Deq Ptr cmd failed due to incorrect slot or ep state.
Dec 21 10:26:47 hephaestus kernel: [346923.166927] cdc_ncm 3-4:2.0 eth1: unregister 'cdc_ncm' usb-0000:00:14.0-4, CDC NCM
Dec 21 10:26:47 hephaestus kernel: [346923.167071] xhci_hcd 0000:00:14.0: WARN Set TR Deq Ptr cmd failed due to incorrect slot or ep state.
Dec 21 10:26:47 hephaestus kernel: [346923.170644] xhci_hcd 0000:00:14.0: WARN Set TR Deq Ptr cmd failed due to incorrect slot or ep state.
Dec 21 10:26:47 hephaestus dhclient[320734]: receive_packet failed on eth1: Network is down
Dec 21 10:26:47 hephaestus systemd[1]: Stopping ifup for eth1...
Dec 21 10:26:47 hephaestus dhclient[325522]: Killed old client process
Dec 21 10:26:47 hephaestus ifdown[325522]: Killed old client process
Dec 21 10:26:47 hephaestus kernel: [346923.478913] usb 3-4: new SuperSpeed Gen 1 USB device number 16 using xhci_hcd
Dec 21 10:26:47 hephaestus kernel: [346923.499567] usb 3-4: New USB device found, idVendor=0bda, idProduct=8156, bcdDevice=31.00
Dec 21 10:26:47 hephaestus kernel: [346923.499573] usb 3-4: New USB device strings: Mfr=1, Product=2, SerialNumber=6
Dec 21 10:26:47 hephaestus kernel: [346923.499577] usb 3-4: Product: USB 10/100/1G/2.5G LAN
Dec 21 10:26:47 hephaestus kernel: [346923.499580] usb 3-4: Manufacturer: Realtek
Dec 21 10:26:47 hephaestus kernel: [346923.499583] usb 3-4: SerialNumber: 001000001
Dec 21 10:26:47 hephaestus kernel: [346923.523736] cdc_ncm 3-4:2.0: MAC-Address: xx:xx:xx:xx:xx:xx
Dec 21 10:26:47 hephaestus kernel: [346923.523742] cdc_ncm 3-4:2.0: setting rx_max = 16384
Dec 21 10:26:47 hephaestus kernel: [346923.523836] cdc_ncm 3-4:2.0: setting tx_max = 16384
Dec 21 10:26:47 hephaestus kernel: [346923.524578] cdc_ncm 3-4:2.0 eth1: register 'cdc_ncm' at usb-0000:00:14.0-4, CDC NCM, xx:xx:xx:xx:xx:xx
Dec 21 10:26:47 hephaestus systemd-
Dec 21 10:26:47 hephaestus systemd-
Dec 21 10:26:47 hephaestus systemd[1]: Found device USB_10_
(then things start back up and the ethernet link goes live again after about 10 seconds)
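
Several logs in this thread show the same ordering: a "USB disconnect" line followed right away by the Set TR Deq Ptr warning. The sketch below pulls out which port dropped each time, reading kernel log text on stdin. It assumes the message wording seen in these logs, and ports_dropped_with_warning is an illustrative name, not an existing tool.

```shell
#!/bin/sh
# Hypothetical helper: prints the USB port (e.g. "3-4") of every disconnect
# that is followed by a "Set TR Deq Ptr cmd failed" warning.
ports_dropped_with_warning() {
    awk '
        match($0, /usb [0-9]+-[0-9.]+: USB disconnect/) {
            # Remember the port of the most recent disconnect.
            port = substr($0, RSTART + 4, RLENGTH - 4)
            sub(/:.*/, "", port)
            next
        }
        /Set TR Deq Ptr cmd failed/ && port != "" {
            print port
            port = ""   # report each disconnect once, even if the warning repeats
        }
    '
}
```

For example, `grep kernel: /var/log/syslog | ports_dropped_with_warning | sort | uniq -c` would show how often each port is affected.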

|
#212 |
FYI: I built a 5.4 kernel with the patch discussed earlier in this thread, and I still get the error multiple times per day.
(In reply to James H from comment #207)
> I'm using a 2.5gb ethernet usb device and getting this error intermittently
> (a dozen times per day).
>
> $ uname -a
> Linux hephaestus 5.4.0-135-generic #152-Ubuntu SMP Wed Nov 23 20:19:22 UTC
> 2022 x86_64 x86_64 x86_64 GNU/Linux
>
>
> $ lsusb
> <snip>
> Bus 003 Device 016: ID 0bda:8156 Realtek Semiconductor Corp. USB
> 10/100/1G/2.5G
>
>
>
> This is what plays out via /var/log/syslog each time:
>
> Dec 21 10:26:47 hephaestus kernel: [346923.166782] usb 3-4: USB disconnect,
> device number 15
> Dec 21 10:26:47 hephaestus kernel: [346923.166913] xhci_hcd 0000:00:14.0:
> WARN Set TR Deq Ptr cmd failed due to incorrect slot or ep state.
> Dec 21 10:26:47 hephaestus kernel: [346923.166927] cdc_ncm 3-4:2.0 eth1:
> unregister 'cdc_ncm' usb-0000:00:14.0-4, CDC NCM
> Dec 21 10:26:47 hephaestus kernel: [346923.167071] xhci_hcd 0000:00:14.0:
> WARN Set TR Deq Ptr cmd failed due to incorrect slot or ep state.
> Dec 21 10:26:47 hephaestus kernel: [346923.170644] xhci_hcd 0000:00:14.0:
> WARN Set TR Deq Ptr cmd failed due to incorrect slot or ep state.
> Dec 21 10:26:47 hephaestus dhclient[320734]: receive_packet failed on eth1:
> Network is down
> Dec 21 10:26:47 hephaestus systemd[1]: Stopping ifup for eth1...
> Dec 21 10:26:47 hephaestus dhclient[325522]: Killed old client process
> Dec 21 10:26:47 hephaestus ifdown[325522]: Killed old client process
> Dec 21 10:26:47 hephaestus kernel: [346923.478913] usb 3-4: new SuperSpeed
> Gen 1 USB device number 16 using xhci_hcd
> Dec 21 10:26:47 hephaestus kernel: [346923.499567] usb 3-4: New USB device
> found, idVendor=0bda, idProduct=8156, bcdDevice=31.00
> Dec 21 10:26:47 hephaestus kernel: [346923.499573] usb 3-4: New USB device
> strings: Mfr=1, Product=2, SerialNumber=6
> Dec 21 10:26:47 hephaestus kernel: [346923.499577] usb 3-4: Product: USB
> 10/100/1G/2.5G LAN
> Dec 21 10:26:47 hephaestus kernel: [346923.499580] usb 3-4: Manufacturer:
> Realtek
> Dec 21 10:26:47 hephaestus kernel: [346923.499583] usb 3-4: SerialNumber:
> 001000001
> Dec 21 10:26:47 hephaestus kernel: [346923.523736] cdc_ncm 3-4:2.0:
> MAC-Address: xx:xx:xx:xx:xx:xx
> Dec 21 10:26:47 hephaestus kernel: [346923.523742] cdc_ncm 3-4:2.0: setting
> rx_max = 16384
> Dec 21 10:26:47 hephaestus kernel: [346923.523836] cdc_ncm 3-4:2.0: setting
> tx_max = 16384
> Dec 21 10:26:47 hephaestus kernel: [346923.524578] cdc_ncm 3-4:2.0 eth1:
> register 'cdc_ncm' at usb-0000:00:14.0-4, CDC NCM, xx:xx:xx:xx:xx:xx
> Dec 21 10:26:47 hephaestus systemd-
> naming scheme 'v245'.
> Dec 21 10:26:47 hephaestus systemd-
> is unset or enabled, the speed and duplex are not writable.
> Dec 21 10:26:47 hephaestus systemd[1]: Found device USB_10_
> (then things start back up and the ethernet link goes live again after about
> 10 seconds)

Sven Mohr (svmohr) wrote : | #213 |
I also get random disconnects on kernel 6.3.0-7-generic with a Samsung T7 Shield external SSD. Unfortunately, this error is hard to reproduce; it usually takes hours before it first occurs.
System:
Kernel: 6.3.0-7-generic arch: x86_64 bits: 64 compiler: N/A Console: pty pts/10 Distro: Ubuntu
23.10 (Mantic Minotaur)
Machine:
Type: Server System: Supermicro product: C9Z390-PGW v: 0123456789 serial: <filter>
Mobo: Supermicro model: C9Z390-PGW v: 1.01A serial: <filter> UEFI: American Megatrends v: 1.3
date: 06/03/2020
CPU:
Info: 8-core model: Intel Core i9-9900K bits: 64 type: MT MCP arch: Coffee Lake rev: D cache:
L1: 512 KiB L2: 2 MiB L3: 16 MiB
Speed (MHz): avg: 3687 high: 5002 min/max: 800/5000 cores: 1: 5002 2: 3600 3: 3600 4: 3600
5: 3600 6: 3600 7: 3600 8: 3600 9: 3600 10: 3600 11: 3600 12: 3600 13: 3600 14: 3600 15: 3600
16: 3600 bogomips: 115200
Flags: avx avx2 ht lm nx pae sse sse2 sse3 sse4_1 sse4_2 ssse3 vmx
/: Bus 02.Port 1: Dev 1, Class=root_hub, Driver=
ID 1d6b:0003 Linux Foundation 3.0 root hub
|__ Port 4: Dev 10, If 0, Class=Mass Storage, Driver=uas, 10000M
ID 04e8:61fb Samsung Electronics Co., Ltd
BOOT_IMAGE=
io-pci vfio_pci.
[349280.239403] usb 2-4: USB disconnect, device number 9
[349280.239689] xhci_hcd 0000:00:14.0: WARN Set TR Deq Ptr cmd failed due to incorrect slot or ep state.
[349280.239695] usb 2-4: cmd cmplt err -108
[349280.239702] sd 9:0:0:0: [sdh] tag#13 uas_zap_pending 0 uas-tag 1 inflight: CMD
[349280.239705] sd 9:0:0:0: [sdh] tag#13 CDB: Write(16) 8a 00 00 00 00 00 d3 28 e4 00 00 00 00 d8 00 00
[349280.239724] sd 9:0:0:0: [sdh] tag#13 FAILED Result: hostbyte=
[349280.239726] sd 9:0:0:0: [sdh] tag#13 CDB: Write(16) 8a 00 00 00 00 00 d3 28 e4 00 00 00 00 d8 00 00
[349280.239728] I/O error, dev sdh, sector 3542672384 op 0x1:(WRITE) flags 0x8800 phys_seg 27 prio class 2
[349280.239741] device offline error, dev sdh, sector 3542674432 op 0x1:(WRITE) flags 0x8800 phys_seg 35 prio class 2
[349280.239747] device offline error, dev sdh, sector 3542672640 op 0x1:(WRITE) flags 0x8800 phys_seg 24 prio class 2
[349280.239750] device offline error, dev sdh, sector 3542677504 op 0x1:(WRITE) flags 0x8800 phys_seg 45 prio class 2
[349280.239753] device offline error, dev sdh, sector 3542680576 op 0x1:(WRITE) flags 0x8800 phys_seg 41 prio class 2
[349280.239788] device offline error, dev sdh, sector 3542663168 op 0x1:(WRITE) flags 0x8800 phys_seg 35 prio class 2
[349280.239793] device offline error, dev sdh, sector 3542663680 op 0x1:(WRITE) flags 0x8800 phys_seg 29 prio class 2
[349280.239799] device offline error, dev sdh, sector 3542663936 op 0x1:(WRITE) flags 0x8800 phys_seg 26 prio class 2
[349280.299534] sd 9:0:0:0: [sdh] Synchronizing SCSI cache
[349280.523475] sd 9:0:0:0: [sdh] Synchronize Cache(10) failed: Result: hostbyte=DID_ERROR driverbyte=...

|
#215 |
I have this problem on Raspberry Pi hosts with 6.8.0, while Intel hosts with the same USB Ethernet adapter do not complain.
# uname -a
Linux XX 6.8.0-1013-raspi #14-Ubuntu SMP PREEMPT_DYNAMIC Wed Oct 2 15:14:53 UTC 2024 aarch64 aarch64 aarch64 GNU/Linux
# dmesg
[170385.819352] usb 3-1: Disable of device-initiated U1 failed.
[170385.819381] usb 3-1: Disable of device-initiated U2 failed.
[170385.841365] xhci-hcd xhci-hcd.0: WARN Set TR Deq Ptr cmd failed due to incorrect slot or ep state.
[170385.841386] cdc_ncm 3-1:2.0 enx3c8cf860689d: unregister 'cdc_ncm' usb-xhci-hcd.0-1, CDC NCM (NO ZLP)
[170385.841640] xhci-hcd xhci-hcd.0: WARN Set TR Deq Ptr cmd failed due to incorrect slot or ep state.
[170385.873312] usb 3-1: Enable of device-initiated U1 failed.
[170385.937301] usb 3-1: Enable of device-initiated U2 failed.
[170386.161431] usb 3-1: reset SuperSpeed USB device number 2 using xhci-hcd
[170386.301112] cdc_ncm 3-1:2.0: MAC-Address: 3c:8c:f8:60:68:9d
[170386.301121] cdc_ncm 3-1:2.0: setting rx_max = 16384
[170386.301181] cdc_ncm 3-1:2.0: setting tx_max = 16384
[170386.301953] cdc_ncm 3-1:2.0 eth1: register 'cdc_ncm' at usb-xhci-hcd.0-1, CDC NCM (NO ZLP), 3c:8c:f8:60:68:9d
[170386.358603] cdc_ncm 3-1:2.0 enx3c8cf860689d: renamed from eth1
# lsusb
...
Bus 003 Device 002: ID 20f4:e02c TRENDnet TUC-ET2G(v2.0R)
...
Same device works on:
# uname -a
Linux YY 6.8.0-48-generic #48~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Mon Oct 7 11:24:13 UTC 2 x86_64 x86_64 x86_64 GNU/Linux
# lsusb
...
Bus 004 Device 003: ID 20f4:e02c TRENDnet TUC-ET2G(v2.0R)
Bus 004 Device 002: ID 20f4:e02c TRENDnet TUC-ET2G(v2.0R)
...

|
#216 |
Just FYI,
[code]
Linux version 6.8.0-48-generic (buildd@
[/code]
exhibits the same problem. In my case it is a Raspberry Pi Pico (or rather an RP2040-based board) connected to a USB2 hub that calls itself "USB 3 HUB".
Not sure if it is related, but when I checked lsusb to verify what was going on, it was still listing a device that I had connected and disconnected 3 weeks ago.

|
#217 |
I just ran into this bug on a Gigabyte Z790 UD AC motherboard.
xhci_hcd WARN Set TR Deq Ptr cmd failed due to incorrect slot or ep state
Adding the kernel command line option intel_iommu=off seemingly helped slightly, improving boot times (slow USB initialization). However, I subsequently locked up the BIOS/EFI by manually setting bad memory SPD timings, and reluctantly reset the CMOS by shorting the pins with a screwdriver. After reconfiguring the BIOS to sane, stable OEM settings (e.g. deactivating the overclocking Turbo modes), the xhci_hcd error is now gone (checked with dmesg |grep fail)! The error seemed to appear mysteriously after activating XMP. I've activated XMP in the past without problems, but then XMP was only active briefly alongside CPU Turbo mode. Weird!
I'm really, really not liking EFI/UEFI. Too many mysterious bugs popping up.
For those still monitoring: try hard-resetting your CMOS (shorting the pins with a screwdriver) and then loading the BIOS/EFI default settings. I can only shrug as to whether XMP or CPU Turbo modes have any effect.

|
#218 |
I also briefly rechecked all USB connections on the back of the computer case for loose connections, and left intel_iommu=off active in the kernel command line options.

|
#219 |
Oh, and one more BIOS/EFI option that is likely more important: I left ASPM Native disabled. Previously, I had ASPM Native (O/S ASPM support) enabled.

|
#220 |
Trying to narrow down the cause of this xhci_hcd failure or slow boot, I've been debugging power-on via keyboard/mouse. The xhci_hcd failure showed up again, not when enabling the ErP setting (located below the power-on via keyboard/mouse settings), but after merely toggling the ErP enable/disable option in the BIOS/EFI screen and then rebooting/power cycling. ErP (Energy-Related Products) mode is fairly directly related to powering on via keyboard/mouse. What exactly is happening here I do not know, as BIOS/EFI settings tend to be elusive, especially when a setting modifies other values on another BIOS/EFI screen, or settings that are not displayed at all.
Likely a PCI(E) device, in this case xhci_hcd (the USB bus), is having problems waking after power-off because of ASPM or ErP. If that is the case, recommendations elsewhere suggest disabling ASPM and/or ErP.
NOTE: The Fast/Ultra-fast boot BIOS/EFI settings also provide further refinement for waking via keyboard/mouse, so some settings might be interacting with ErP, and maybe ASPM. Since BIOS/EFI documentation nowadays (in my case Gigabyte's) is increasingly lacking, I can only guess based on how the PC boots.
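Since the BIOS/EFI screens are hard to interpret, it may help to check what the running kernel itself reports for ASPM. On kernels built with PCIe ASPM support, /sys/module/pcie_aspm/parameters/policy lists the available policies with the active one in square brackets. The sketch below only extracts that bracketed word; active_aspm_policy is a hypothetical helper name, and the file may be absent if ASPM support is not compiled in.

```shell
# Print the active ASPM policy, i.e. the word inside [brackets], from the
# policy string read on stdin.
# (active_aspm_policy is a hypothetical helper name for illustration.)
active_aspm_policy() {
    sed -n 's/.*\[\([^]]*\)].*/\1/p'
}

# Typical use on a running system:
#   active_aspm_policy < /sys/module/pcie_aspm/parameters/policy
```

If the firmware option cannot be trusted, booting with the kernel parameter pcie_aspm=off is another way to rule ASPM in or out, with the caveat that it only affects what the kernel does, not ErP or other firmware-side power settings.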

|
#221 |
# uname -a
Linux 1void 6.12.9_1 #1 SMP PREEMPT_DYNAMIC Fri Jan 10 00:53:27 UTC 2025 x86_64 GNU/Linux
OK. I'm getting somewhere and starting to get a pretty good picture of what is likely happening!
The xhci_hcd failure from this bug is occurring on my second bus (e.g. dmesg | grep 2-4 -> usb 2-4). The second USB bus on the Gigabyte "Z790 UD AC" board is the 5 Gb/s USB device!
lsusb -t
Bus 002.Port 001: Dev 001, Class=root_hub, Driver=xhci_hcd/9p, 20000M/x2
Port 004: Dev 009, If 0, Class=Hub, Driver=hub/4p, 5000M
Port 006: Dev 018, If 0, Class=Mass Storage, Driver=usb-storage, 5000M
DEVICE IDs idVendor=0451, idProduct=8041 (hub)
dmesg |grep 2-4
[ 6.497899] usb 2-4: new SuperSpeed USB device number 2 using xhci_hcd
[ 6.509171] usb 2-4: New USB device found, idVendor=0451, idProduct=8041, bcdDevice= 1.00
[ 6.509175] usb 2-4: New USB device strings: Mfr=0, Product=0, SerialNumber=0
[ 6.509932] hub 2-4:1.0: USB hub found
[ 6.509959] hub 2-4:1.0: 4 ports detected
It looks like this is occurring on the Gigabyte Z790 UD AC "1 x USB 3.2 Gen 2" bus/ports: the red port, rated for 5 Gb/s and reported as 5000 Mb/s by the Linux USB utilities.
Since this 5 Gb/s USB tech is likely new, I'm likely seeing a bug either in the BIOS/EFI initialization of the on-board USB 5 Gb/s PCI device or in the kernel driver's initialization of that device. I'm more or less guessing that the bug is Linux kernel driver related, as I'm not seeing similarly significant delays with Windows 11.
This bug still occurs, apparently at random, on reboots and cold starts alike, without any apparent pattern.
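To double-check what link speed a device actually negotiated, the kernel exposes a per-device speed attribute in sysfs, in Mb/s. A rough sketch for turning that number into a name follows; usb_speed_label is a made-up helper, and the device path 2-4 is simply taken from the dmesg output above. By the USB-IF naming, 5000 corresponds to SuperSpeed (USB 3.2 Gen 1), while Gen 2 links report 10000.

```shell
# Map the value of a USB device's sysfs "speed" attribute (in Mb/s) to the
# usual marketing name. (usb_speed_label is a hypothetical helper name.)
usb_speed_label() {
    case "$1" in
        1.5)   echo "Low Speed (USB 1.x)" ;;
        12)    echo "Full Speed (USB 1.x)" ;;
        480)   echo "High Speed (USB 2.0)" ;;
        5000)  echo "SuperSpeed, 5 Gb/s (USB 3.2 Gen 1)" ;;
        10000) echo "SuperSpeed+, 10 Gb/s (USB 3.2 Gen 2)" ;;
        20000) echo "SuperSpeed+, 20 Gb/s (USB 3.2 Gen 2x2)" ;;
        *)     echo "unknown" ;;
    esac
}

# Typical use, with the device name taken from dmesg (here 2-4):
#   usb_speed_label "$(cat /sys/bus/usb/devices/2-4/speed)"
```

This only names the negotiated speed. It does not say whether the port, the hub, or the cable was the limiting factor.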

|
#222 |
Scanning through all the relevant posts for further information on this Gigabyte Z790 UD AC motherboard, it looks like some quirks are already applied by the xhci_hcd driver in Linux kernel 6.12.9_1.
# dmesg |grep xhci
[ 6.010722] xhci_hcd 0000:00:14.0: xHCI Host Controller
[ 6.011225] xhci_hcd 0000:00:14.0: new USB bus registered, assigned bus number 1
[ 6.012603] xhci_hcd 0000:00:14.0: hcc params 0x20007fc1 hci version 0x120 quirks 0x0000000200009810
[ 6.014181] xhci_hcd 0000:00:14.0: xHCI Host Controller
[ 6.014659] xhci_hcd 0000:00:14.0: new USB bus registered, assigned bus number 2
[ 6.014968] xhci_hcd 0000:00:14.0: Host supports USB 3.2 Enhanced SuperSpeed
[ 6.016517] usb usb1: Manufacturer: Linux 6.12.9_1 xhci-hcd
[ 6.022716] usb usb2: Manufacturer: Linux 6.12.9_1 xhci-hcd
Denotes this PCI controller is parent of the USB #2 hub.
$ lspci -k -nn | grep -B3 xhci
00:14.0 USB controller [0c03]: Intel Corporation Raptor Lake USB 3.2 Gen 2x2 (20 Gb/s) XHCI Host Controller [8086:7a60] (rev 11)
DeviceName: Onboard - Other
Subsystem: Gigabyte Technology Co., Ltd Device [1458:5007]
Kernel driver in use: xhci_hcd
Kernel modules: mei_me, xhci_pci
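The bus-to-controller relationship can also be confirmed from sysfs rather than inferred from dmesg ordering: the root hub entry for a bus (e.g. /sys/bus/usb/devices/usb2) is a symlink that resolves to a path underneath the owning PCI device. A small sketch, assuming the usual /sys/devices/pci.../<address>/usbN layout; pci_addr_from_sysfs_path is a hypothetical helper name.

```shell
# Extract the last PCI address (domain:bus:device.function) that appears in a
# resolved sysfs device path given as the first argument.
# (pci_addr_from_sysfs_path is a hypothetical helper name.)
pci_addr_from_sysfs_path() {
    printf '%s\n' "$1" \
        | grep -o '[0-9a-f]\{4\}:[0-9a-f]\{2\}:[0-9a-f]\{2\}\.[0-7]' \
        | tail -n 1
}

# Typical use for USB bus 2:
#   pci_addr_from_sysfs_path "$(readlink -f /sys/bus/usb/devices/usb2)"
```

On the system above this would be expected to give 0000:00:14.0, matching the address in the dmesg and lspci output, but that is an inference from the logs rather than something verified on that machine.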

|
#223 |
PROBLEM: Identified a USB 3.0 Cable likely either going bad or out-of-
Cable Matters SuperSpeed USB 3.0 Type A to B Cable in Black 10 Feet
Item model number : 200007-BLACK-10
Date First Available : October 20, 2011
Manufacturer : Cable Matters
ASIN : B00C7RZPJ0
SCENARIO: The ten-foot USB 3.0 rated cable connects a motherboard USB-3 port to the USB-3 rated hub built into the display, with two USB-2 peripheral input devices connected to the display's USB hub. Although the cable is connected through the #1 USB-3 motherboard hub, errors were being reported on the #2 USB-3 motherboard hub/ports.
SOLUTION: Connect the lengthy ten-foot USB-3 cable that was exhibiting errors/failures on the motherboard USB-3 hub to a motherboard USB-2 hub instead. This recently purchased motherboard has USB-3 ports, whereas my older motherboard had only USB-2 ports. A while ago I also noted problems with a ten-foot USB-2 rated cable and a USB printer plugged into an external hub, with similar errors/failures printed by the kernel; I subsequently plugged the printer's USB cable directly into the computer's motherboard/PCI-E USB hub. NOTE: It's probably a good idea to label these ten-foot or otherwise lengthy cables as being for use with USB-2 ports only.
It really baffles me that, to date, there are no decent front-ends or back-ends that report bad cables and/or connections due to lack of power/electrical current!

|
#224 |
On Wed, Jan 15, 2025 at 08:28:35PM +0000, <email address hidden> wrote:
> Denotes this PCI controller is parent of the USB #2 hub.
The controller where I see it is:
magigamix:~> lspci -k -nn | grep -B3 xhci
00:14.0 USB controller [0c03]: Intel Corporation Tiger Lake-H USB 3.2 Gen 2x1 xHCI Host Controller [8086:43ed] (rev 11)
DeviceName: Onboard - Other
Subsystem: Gigabyte Technology Co., Ltd Tiger Lake-H USB 3.2 Gen 2x1 xHCI Host Controller [1458:5007]
Kernel driver in use: xhci_hcd
Kernel modules: xhci_pci
I think this is an older device, i.e. not supporting the latest
superspeed, red plugs.
Someone (you) just reported something with suspend/resume that might
not work as intended. My system never suspends/sleeps. It is simply ON
24/7. It now has an uptime of 2 months, but if the display wouldn't
crash every now and then, it would have had an uptime on the order of
2 years.
Roger.
After upgrading to the 4.20 kernel (I was using 4.19 previously), my USB wifi stick doesn't work until I reboot the system. This issue happens every time I start my PC (only when the system was shut down; it doesn't happen after rebooting). The wifi driver in use is rt2800usb. I tried restarting NetworkManager, but this didn't change anything.