Opened 7 years ago

Closed 5 years ago

#13792 closed bug (fixed)

xhci: stall error does not recover

Reported by: GregCrain
Owned by: nobody
Priority: normal
Milestone: R1/beta2
Component: Drivers/USB/XHCI
Version: R1/Development
Keywords:
Cc:
Blocked By:
Blocking: #14756
Platform: All

Description

I have an early revision NEC USB 3.0 chip.

: PCI:   class_base 0c, class_function 03, class_api 30
: PCI:   vendor 1033: NEC Corporation
: PCI:   device 0194: uPD720200 USB 3.0 Host Controller
: PCI:   info: Serial bus controller (USB controller, XHCI)

usb xhci -1: interface version: 0x0096
usb xhci -1: structural parameters: 1:0x04000820 2:0x00000011 3:0x00000000
usb xhci -1: capability params: 0x014042cb


During normal operation, a sequence of events occurs and transfers seem ok:
usb xhci -1: SubmitTransfer()
usb xhci -1: Ding Dong! slot:1 endpoint 1
usb xhci -1: event[14] = 32 (0x000000000d8a1020 0x01000000 0x02018001)
usb xhci -1: slot=1 epno=1 remainder=0 status=1 halted=0

. . .

With some additional debugging code borrowed from FreeBSD:

/* check if error means halted */
halted = (completionCode != COMP_SHORT_PACKET
	&& completionCode != COMP_SUCCESS);

TRACE_ALWAYS("slot=%u epno=%u remainder=%lu status=%u halted=%u\n",
	slot, endpointNumber, remainder, completionCode, halted);

But at some point, in the function HandleTransferComplete(xhci_trb* trb), a Stall Error occurs:

usb xhci -1: slot=1 epno=1 remainder=9 status=6 halted=1

The Stall Error is indicated by status=6, the TRB completion code.
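For reference, here is a minimal sketch of how the status= and remainder= values in the log lines above map onto the status word of a Transfer Event TRB. It assumes the xHCI layout (completion code in bits 31:24, remaining transfer length in bits 23:0). COMP_STALL and the helper names are illustrative and not taken from the driver. 0x06000009 is a constructed value matching the stalled log line, not a captured one.

```cpp
#include <cstdint>

// Completion codes, numbered as in the xHCI specification.
// COMP_STALL is an assumed name; the other two appear in the snippet above.
enum : uint8_t {
	COMP_SUCCESS = 1,
	COMP_STALL = 6,
	COMP_SHORT_PACKET = 13
};

// Bits 31:24 of the event's status word hold the completion code.
constexpr uint8_t
CompletionCode(uint32_t status)
{
	return uint8_t(status >> 24);
}

// Bits 23:0 hold the number of bytes that were not transferred.
constexpr uint32_t
Remainder(uint32_t status)
{
	return status & 0x00ffffff;
}

// Same heuristic as the FreeBSD-derived check: anything other than
// success or a short packet is treated as a halted endpoint.
constexpr bool
IsHalted(uint8_t completionCode)
{
	return completionCode != COMP_SHORT_PACKET
		&& completionCode != COMP_SUCCESS;
}

// 0x01000000 is the status word from the successful event logged earlier.
static_assert(CompletionCode(0x01000000) == COMP_SUCCESS, "status=1");
static_assert(Remainder(0x01000000) == 0, "remainder=0");
static_assert(!IsHalted(CompletionCode(0x01000000)), "halted=0");

// Constructed status word for the stalled transfer (status=6 remainder=9).
static_assert(CompletionCode(0x06000009) == COMP_STALL, "status=6");
static_assert(Remainder(0x06000009) == 9, "remainder=9");
static_assert(IsHalted(CompletionCode(0x06000009)), "halted=1");
```

Note that this check is broader than "stalled": it also flags completion codes such as transaction errors as halted.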

Then at some point after:

usb error xhci -1: _LinkDescriptorForPipe max transfers count exceeded 8

No interrupts occur after this.

It occurs very soon on my 0x0096 revision chipset, but I believe that it happens on other chipsets eventually.

The driver doesn't seem to recover from a Stall Error, or to react to it at all. It eventually stops. Transfers are still being issued, as lines like this keep appearing:

usb xhci -1: SubmitTransfer()

but no more interrupts occur.

Change History (7)

comment:1 by diver, 7 years ago

Component: - General → Drivers/USB/XHCI

comment:2 by pulkomandy, 6 years ago

Milestone: Unscheduled → R1/beta2

comment:3 by waddlesplash, 6 years ago

Blocking: 14756 added

comment:4 by waddlesplash, 6 years ago

This is now largely unreproducible, it seems, after db360a20648 & hrev52890, according to various reports on IRC. My guess is that the first commit was the fix: we were creating a NULL descriptor and trying to submit it as a transfer, which of course did nothing, so we never got any reply.

We still don't handle stall errors properly (per the spec we need to reset the endpoint), and perhaps things still are not so good on "early-revision" controllers, so Greg if you could re-test this, that'd be great.
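As a rough sketch of what "reset the endpoint" involves at the command level: the xHCI specification defines a Reset Endpoint command (TRB type 14), typically followed by a Set TR Dequeue Pointer command (TRB type 16) to move the ring past the failed transfer. For non-control endpoints, a ClearFeature(ENDPOINT_HALT) request to the device is usually needed as well. The struct and helper names below are hypothetical, not the driver's, and the cycle bit of each command TRB is left for the command ring code to set when it queues the command.

```cpp
#include <cstdint>

// Command TRB type IDs from the xHCI specification.
constexpr uint32_t kTrbTypeResetEndpoint = 14;
constexpr uint32_t kTrbTypeSetTRDequeue = 16;

// Hypothetical plain layout of a command TRB (four 32-bit words).
struct CommandTrb {
	uint64_t address;
	uint32_t status;
	uint32_t flags;
};

// Slot ID in bits 31:24, endpoint ID (DCI) in bits 20:16, TRB type in
// bits 15:10. Bit 0 (cycle) is filled in when the TRB is queued.
constexpr uint32_t
CommandFlags(uint32_t type, uint8_t slot, uint8_t endpointId)
{
	return (uint32_t(slot) << 24) | (uint32_t(endpointId & 0x1f) << 16)
		| (type << 10);
}

// Step 1: take the endpoint out of the Halted state.
constexpr CommandTrb
MakeResetEndpoint(uint8_t slot, uint8_t endpointId)
{
	return CommandTrb{0, 0,
		CommandFlags(kTrbTypeResetEndpoint, slot, endpointId)};
}

// Step 2: point the controller at the TRB to resume from. The pointer is
// 16-byte aligned; bit 0 carries the dequeue cycle state.
constexpr CommandTrb
MakeSetTRDequeue(uint8_t slot, uint8_t endpointId, uint64_t dequeue,
	bool cycleState)
{
	return CommandTrb{
		(dequeue & ~uint64_t(0xf)) | (cycleState ? uint64_t(1) : uint64_t(0)),
		0, CommandFlags(kTrbTypeSetTRDequeue, slot, endpointId)};
}

static_assert(MakeResetEndpoint(1, 1).flags == 0x01013800, "type 14");
static_assert(MakeSetTRDequeue(1, 1, 0x0d8a1020, true).flags == 0x01014000,
	"type 16");
```

The slot and endpoint values in the checks are arbitrary examples, and 0x0d8a1020 is just reused from the log above as a plausible 16-byte-aligned address.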

comment:5 by waddlesplash, 6 years ago

It seems that the two commits I referenced in my previous message did improve the situation on quite a lot of hardware, but talking to GregCrain and kallisti5 on IRC, it seems their hardware still fails -- specifically, the transfers stall during or shortly after partition identification of the disks.

So, at least we do get some transfers before everything grinds to a halt; so we are "almost correct" at this point, it seems.

comment:6 by waddlesplash, 6 years ago

Please retest after hrev52966; this fixed a nasty race condition which was likely related.

comment:7 by waddlesplash, 5 years ago

Resolution: fixed
Status: new → closed

Greg confirmed to me that he can now boot successfully via USB3 on this device following the Event Data changes.
