Opened 8 years ago

Closed 5 years ago

Last modified 5 years ago

#12929 closed bug (fixed)

[XHCI] crash in _LinkDescriptorForPipe()

Reported by: diver Owned by: nobody
Priority: normal Milestone: R1/beta2
Component: Drivers/USB/XHCI Version: R1/Development
Keywords: Cc: korli
Blocked By: Blocking: #12915, #12951, #13912, #14798, #15013
Platform: All

Description

hrev50496 running in VirtualBox 5.0.26.

KDL often occurs on VM resume after pause.

Attachments (1)

kdl.png (44.7 KB ) - added by diver 8 years ago.

Download all attachments as: .zip

Change History (21)

by diver, 8 years ago

Attachment: kdl.png added

comment:1 by diver, 8 years ago

Owner: changed from nobody to korli
Status: newassigned

comment:2 by korli, 8 years ago

Owner: changed from korli to nobody

comment:3 by diver, 8 years ago

Blocking: 12951 added

comment:4 by pulkomandy, 6 years ago

Milestone: UnscheduledR1/beta2

comment:5 by waddlesplash, 5 years ago

Still happens?

comment:6 by waddlesplash, 5 years ago

Blocking: 13912 added

comment:7 by waddlesplash, 5 years ago

Found an easy way to reproduce this:

  1. Connect USB MIDI device to a USB3 port (any Android phone running 6.0+ can operate in USB MIDI device mode)
  2. Run find /dev
  3. Instant KDL with this backtrace.

comment:8 by waddlesplash, 5 years ago

Summary: [XHCI] crash in LinkDescriptioinForPipe()[XHCI] crash in _LinkDescriptorForPipe()

comment:9 by waddlesplash, 5 years ago

Blocking: 14798 added

comment:10 by waddlesplash, 5 years ago

Blocking: 12915 added

comment:11 by waddlesplash, 5 years ago

With tracing on:

usb xhci -1: SubmitTransfer()
usb xhci -1: SubmitControlRequest() length 0
usb xhci -1: _LinkDescriptorForPipe
usb xhci -1: _LinkDescriptorForPipe current 7, next 0

And then the page fault occurs.

comment:12 by waddlesplash, 5 years ago

Cc: korli added

So, it appears that since the ControlRequest length is 0, we won't get any TDs at all:

	// keep one trb for linking
	int32 tdCount = (trbCount + XHCI_MAX_TRBS_PER_TD - 2)
		/ (XHCI_MAX_TRBS_PER_TD - 1);

Probably the page fault is just LinkDescriptorForPipe trying to link the TD, since we wrapped from 7 -> 0, but it can't because it has no link TD.

korli, you originally added the logic above, can you please comment here?

comment:13 by korli, 5 years ago

Control requests actually happen here and don't call CreateDescriptorChain() but CreateDescriptor()

Anyway for normal requests, I see that Linux adds one trb when the buffer length is zero: https://github.com/torvalds/linux/blob/master/drivers/usb/host/xhci-ring.c#L2954 We should probably do the same. This means the trbCount is to increment, the tdCount computation seems already correct.

BTW I'm not sure this has to do with this bug, since ATM CreateDescriptorChain() returns NULL for a zero buffer length, _LinkDescriptorForPipe() shouldn't be called at all. Maybe I misread the code.

comment:14 by waddlesplash, 5 years ago

BTW I'm not sure this has to do with this bug, since ATM CreateDescriptorChain() returns NULL for a zero buffer length, _LinkDescriptorForPipe() shouldn't be called at all. Maybe I misread the code.

Well, if we should make sure trbCount is non-zero, then tdCount will of course be 1 in that case, yes?

Control requests actually happen ​here and don't call ​CreateDescriptorChain() but ​CreateDescriptor()

OK, then in that case I really don't understand how we get a page fault here.

Is there a reason we allocate/free descriptors for every request? That seems a little slow to say the least.

in reply to:  14 comment:15 by korli, 5 years ago

Replying to waddlesplash:

Is there a reason we allocate/free descriptors for every request? That seems a little slow to say the least.

This mirrors what EHCI does already, aka the USB stack manages memory chunks for the bus driver. Yes, xhci_td could be cached, but I suppose the overhead isn't noticeable.

in reply to:  14 comment:16 by korli, 5 years ago

Replying to waddlesplash:

Well, if we should make sure trbCount is non-zero, then tdCount will of course be 1 in that case, yes?

Yes

OK, then in that case I really don't understand how we get a page fault here.

SubmitControlRequest() could at least check _LinkDescriptorForPipe() return value and exit, instead of calling Ring() and returning B_OK.

comment:17 by waddlesplash, 5 years ago

SubmitControlRequest() could at least check _LinkDescriptorForPipe() return value and exit, instead of calling Ring() and returning B_OK.

Except the page fault occurs inside _LinkDescriptorForPipe, so I'm not sure what difference that will make?

I'll commit that change and the trbCount one, though.

comment:18 by waddlesplash, 5 years ago

Done in hrev52888.

comment:19 by waddlesplash, 5 years ago

Resolution: fixed
Status: assignedclosed

Fixed the KDL in hrev52963. The underlying cause still exists, though, but a new ticket should be made for it.

comment:20 by waddlesplash, 5 years ago

Blocking: 15013 added
Note: See TracTickets for help on using tickets.