Opened 9 months ago

Last modified 9 months ago

#18531 new bug

kernel panic when unplugging sd card reader

Reported by: pulkomandy Owned by: mmlr
Priority: normal Milestone: Unscheduled
Component: Drivers/USB Version: R1/beta4
Keywords: Cc:
Blocked By: Blocking:
Platform: All

Description

When unplugging my SD card reader (not currently mounted) I get this panic.

Attachments (3)

DSC_0008.JPG (3.7 MB) - added by pulkomandy 9 months ago.
listusb.txt (1.3 KB) - added by pulkomandy 9 months ago.
kdl.jpg (1.2 MB) - added by pulkomandy 9 months ago.

Change History (10)

by pulkomandy, 9 months ago

Attachment: DSC_0008.JPG added

comment:1 by waddlesplash, 9 months ago

Appears to be a use-after-free in the device_manager. This may or may not be XHCI's fault; it could be the stack or device_manager too. I'll have to read the code carefully.

comment:2 by waddlesplash, 9 months ago

Component: Drivers/USB/XHCI → Drivers/USB
Owner: changed from waddlesplash to mmlr

The XHCI driver just calls "delete" on the passed device, the same as the generic BusManager implementation does. So the problem is not there.

Reading through the Stack code, I don't really see how we could wind up in a double-free here. The USB stack Device code sets the device_node pointer to NULL after calling "unregister", and we have multiple locks acquired at this point, so it isn't really possible that we are double-unregistering something, as far as I can tell. Even if there were two Device pointers, that would mean passing 0xdeadbeef directly to the device_manager's unregister function, which I don't think is happening because it would have faulted earlier.
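To illustrate the idiom described here (unregister the node, then set the pointer to NULL, all while holding a lock), here is a minimal sketch. It is not the actual USB stack code: FakeNode, FakeDevice, and fake_unregister_node are hypothetical names invented for this example, and std::mutex stands in for whatever locks the stack really holds.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical stand-in for a device_manager node; not Haiku's real type.
struct FakeNode {
    int unregisterCount = 0;
};

// Hypothetical stand-in for the device_manager's unregister entry point.
inline void fake_unregister_node(FakeNode* node)
{
    assert(node != nullptr);
    node->unregisterCount++;
}

// Hypothetical stand-in for the USB stack's Device object.
class FakeDevice {
public:
    explicit FakeDevice(FakeNode* node) : fNode(node) {}

    // Returns true only if this call actually unregistered the node.
    bool Unregister()
    {
        std::lock_guard<std::mutex> locker(fLock);
        if (fNode == nullptr)
            return false;   // already unregistered, nothing left to do

        fake_unregister_node(fNode);
        fNode = nullptr;    // a second call can no longer reach the node
        return true;
    }

private:
    std::mutex fLock;
    FakeNode* fNode;
};
```

With this shape, a second Unregister() on the same object is a no-op, which is why a double-unregister through a single Device looks unlikely. It does not protect against two separate objects holding the same node pointer.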

So that points to a double-free in the device manager, or some other fault I've missed in the USB stack itself. Either way, the XHCI driver is not at fault.

comment:3 by waddlesplash, 9 months ago

This is actually especially weird because device_nodes appear to be reference-counted. I guess the next thing to establish is exactly where the deadbeef pointer came from, and whether it's a device_node or something else entirely.
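As a rough sketch of what extra checking could look like, the following shows a reference count that records an over-release, plus a helper that tests whether a pointer value looks like the freed-memory fill pattern. Everything here is an assumption made for illustration: FakeRefCountedNode and looks_freed are hypothetical names, and the 0xdeadbeef pattern is simply taken from the pointer seen in the panic.

```cpp
#include <cassert>
#include <cstdint>

// Assumed fill pattern for freed memory, taken from the pointer in the panic.
constexpr uint32_t kFreedPattern = 0xdeadbeef;

// True if a pointer-sized value consists of the fill pattern, either as a
// 32-bit value or repeated across both halves of a 64-bit value.
inline bool looks_freed(uint64_t value)
{
    const uint32_t low = static_cast<uint32_t>(value);
    const uint32_t high = static_cast<uint32_t>(value >> 32);
    return low == kFreedPattern && (high == kFreedPattern || high == 0);
}

// Hypothetical reference-counted node; not the real device_node.
class FakeRefCountedNode {
public:
    void Acquire()
    {
        assert(fRefCount > 0);  // acquiring a dead node would be a bug
        fRefCount++;
    }

    // Returns true when the last reference was dropped.
    bool Release()
    {
        if (fRefCount <= 0) {
            fOverReleased = true;   // one release too many: record it
            return false;
        }
        return --fRefCount == 0;
    }

    bool OverReleased() const { return fOverReleased; }

private:
    int32_t fRefCount = 1;  // the creator holds the first reference
    bool fOverReleased = false;
};
```

A check like looks_freed() only says that a value was read out of already-freed memory; it does not say which object that memory used to belong to, so it narrows the question rather than answering it.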

Could you run syslog | tail 50 at the KDL prompt? This may help in diagnosing what happened leading up to the crash.

comment:4 by pulkomandy, 9 months ago

I forgot to mention what's (possibly) special about this device: it is a multi-card reader (SD, CompactFlash, and a few others) so it results in multiple mass storage disks. If I remember correctly, they are represented as SCSI LUNs. Maybe this leads to some confusion on the usb mass storage driver side?

I will screenshot the syslog when I'm back home.

by pulkomandy, 9 months ago

Attachment: listusb.txt added

by pulkomandy, 9 months ago

Attachment: kdl.jpg added

comment:5 by pulkomandy, 9 months ago

Attached the listusb (but I think it reveals nothing special) and the errors seen in KDL when unplugging the device.

Could it be that the disk storage is busy trying to scan all the empty card slots and timing out on them, so that there are in-progress transfers when the device is unplugged?

comment:6 by waddlesplash, 9 months ago

Yes, listusb is nothing special.

In-progress transfers should make no difference. The stack will deal with cancelling them during teardown, and XHCI will further check state before final deletion.

The more concerning thing is the "Unknown Device Error", which may be indicative of whatever problem leads to this use-after-free / double-free. By the time we get to actually deleting the device, all drivers attached to it should have been notified and torn down. The fact that we get such an error seems to indicate that this didn't happen as expected.
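The expected ordering (notify every attached driver first, delete the device only afterwards) can be sketched as follows. This is a simplified model, not the real stack: FakeDriver, FakeUsbDevice, and their methods are hypothetical names invented for this example.

```cpp
#include <cassert>
#include <vector>

// Hypothetical driver attachment; not the real USB stack interface.
struct FakeDriver {
    bool removed = false;

    // Called when the device the driver is attached to goes away.
    void DeviceRemoved() { removed = true; }
};

// Hypothetical device that tracks which drivers are attached to it.
class FakeUsbDevice {
public:
    void Attach(FakeDriver* driver)
    {
        assert(driver != nullptr);
        fDrivers.push_back(driver);
    }

    // Step 1: tell every attached driver that the device is gone.
    void NotifyRemoved()
    {
        for (FakeDriver* driver : fDrivers)
            driver->DeviceRemoved();
    }

    // Step 2: final deletion is only safe once every driver has let go.
    bool SafeToDelete() const
    {
        for (const FakeDriver* driver : fDrivers) {
            if (!driver->removed)
                return false;
        }
        return true;
    }

private:
    std::vector<FakeDriver*> fDrivers;
};
```

If a check like SafeToDelete() were false at deletion time, some driver would still be holding on to the device, which matches the kind of situation the error message hints at.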

comment:7 by waddlesplash, 9 months ago

Adding more tracing in the device manager is probably the way to go here, I guess.
