Opened 4 years ago

Closed 3 years ago

Last modified 3 years ago

#16794 closed bug (fixed)

XHCI: Divide error exception in SubmitNormalRequest

Reported by: LSS37040
Owned by: waddlesplash
Priority: normal
Milestone: R1/beta4
Component: Drivers/USB/XHCI
Version: R1/Development
Keywords:
Cc: korli, x512
Blocked By:
Blocking: #16586, #16603, #16878, #17287
Platform: All

Description

I have an SD/MMC/MS card reader on my case that is connected to the motherboard via a USB header. When the card reader is empty, DriveSetup first hangs for a few seconds on startup, then the system crashes with a panic.

It happens on R1/beta2 as well as on the most recent nightly, hrev54954. At the time it blocked a successful installation, since I needed to format a partition to BeFS before installing the system to it.

I was able to work around it by putting the Haiku install image on an SD card, inserting it into the card reader, and installing from there. When the card reader is populated, DriveSetup runs without any issue.

I'm attaching the photo I took when the panic happened. It mentions XHCI, so I think it's possible that all USB ports/headers on my motherboard (whether 2.0 or 3.0) are provided through USB 3.0 host controllers.

The motherboard is ASRock X570 Taichi Razer Edition.

PS: From my past experiences, some chassis card readers are known to cause issues under certain circumstances, when unpopulated (empty).

Attachments (6)

IMG_20210213_204101.jpg (1.9 MB ) - added by LSS37040 4 years ago.
Panic when opening DriveSetup with the card reader unpopulated (empty).
photo_2021-06-08 00.16.16.jpeg (213.0 KB ) - added by exstrim401 3 years ago.
Panic when touching usb wireless mouse dongle
IMG_20210727_233017.jpg (2.3 MB ) - added by LSS37040 3 years ago.
Divide Error Exception reproduced on R1 Beta3.
IMG_20210912_204926.jpg (1.7 MB ) - added by LSS37040 3 years ago.
"endpoint not initialized" KDL when accessing DriveSetup with the chassis SD card reader empty.
IMG_20210919_205814.jpg (1.7 MB ) - added by LSS37040 3 years ago.
KDL with empty chassis card reader on hrev55436.
IMG_20210921_204639.jpg (2.1 MB ) - added by LSS37040 3 years ago.
hrev55443 "vm_page_fault" KDL.

Change History (38)

by LSS37040, 4 years ago

Attachment: IMG_20210213_204101.jpg added

Panic when opening DriveSetup with the card reader unpopulated (empty).

comment:1 by waddlesplash, 4 years ago

Cc: korli added
Summary: DriveSetup causes panic with an empty card reader. → XHCI: Divide error exception in SubmitNormalRequest

I was under the impression that this is what hrev54876 was supposed to fix. korli, any ideas?

comment:2 by diver, 4 years ago

Was the attached panic photo made with beta2 or hrev54954?

comment:3 by LSS37040, 4 years ago

This photo was made on an installed hrev54954 system.

I managed to install it by putting the ISO on an SD card, inserting it into the card reader, and booting from there. This way the card reader is not empty and I'm able to proceed with the installation without issues.


comment:4 by korli, 4 years ago

Well, SubmitNormalRequest should fail in all cases if trbSize is found to be zero (original patch: https://review.haiku-os.org/c/haiku/+/3611/1 )
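As an illustration of the kind of guard described here, the following is a minimal sketch and not the actual Haiku code. It assumes the division in question is a ceiling division of the transfer length by `trbSize`; the helper name `CountTrbs` and its parameters are hypothetical.

```cpp
#include <cstddef>

// Hypothetical helper (not from xhci.cpp): computes how many TRBs of
// trbSize bytes are needed to cover transferLength bytes.
// Returns false instead of dividing when trbSize is zero, so the caller
// can fail the request rather than trigger a divide error exception.
bool CountTrbs(std::size_t transferLength, std::size_t trbSize,
	std::size_t& trbCount)
{
	if (trbSize == 0)
		return false;

	// Ceiling division: round up so a partial final chunk gets its own TRB.
	trbCount = (transferLength + trbSize - 1) / trbSize;
	return true;
}
```

A caller would check the return value and fail the request when it is false, leaving `trbCount` untouched.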

comment:5 by waddlesplash, 4 years ago

I wonder if the problem here is that the xhci_endpoint is actually not initialized. Adding checks for that may solve this. However I am not sure how we can wind up in a state where the endpoint is not initialized but a Pipe exists...
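A check of this kind could look roughly like the sketch below. It is only an illustration under stated assumptions: `EndpointSketch`, its fields, and `CheckEndpoint` are hypothetical stand-ins and do not reflect the real `xhci_endpoint` layout.

```cpp
#include <cstddef>

// Hypothetical result codes for the sketch.
enum class SubmitStatus { Ok, EndpointNotInitialized };

// Hypothetical stand-in for an endpoint; the real xhci_endpoint differs.
struct EndpointSketch {
	void* ring = nullptr;       // transfer ring, null until the endpoint is configured
	std::size_t trbSize = 0;    // zero until the endpoint is configured
};

// Rejects a request up front when the endpoint was never set up,
// so the failure is reported by name instead of surfacing later
// as a division by zero.
SubmitStatus CheckEndpoint(const EndpointSketch* endpoint)
{
	if (endpoint == nullptr || endpoint->ring == nullptr
		|| endpoint->trbSize == 0) {
		return SubmitStatus::EndpointNotInitialized;
	}
	return SubmitStatus::Ok;
}
```

Such a check only makes the failure visible earlier; it does not explain how a Pipe can exist for an uninitialized endpoint.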

comment:6 by waddlesplash, 4 years ago

Blocking: 16878 added

in reply to:  5 comment:7 by LSS37040, 4 years ago

Replying to waddlesplash:

I wonder if the problem here is that the xhci_endpoint is actually not initialized. Adding checks for that may solve this. However I am not sure how we can wind up in a state where the endpoint is not initialized but a Pipe exists...

I'm not an expert, but I do recall that some BIOSes/bootloaders didn't play well with empty card readers (I either needed to insert a card or hide the reader somehow).

I think the BIOS could interpret empty card readers in all kinds of ways, and in some cases this could cause certain BIOS/system calls to return invalid values that make the caller freak out.

by exstrim401, 3 years ago

Attachment: photo_2021-06-08 00.16.16.jpeg added

Panic when touching usb wireless mouse dongle

comment:8 by exstrim401, 3 years ago

I have the same problem when touching usb wireless mouse dongle

comment:9 by LSS37040, 3 years ago

Just tried reinstalling the same system with R1 Beta3 and the issue persists. I still need to populate the chassis SD card reader to prevent DriveSetup from crashing.

Also, while the error is the same (Divide Error Exception), the stacktrace looks a bit different this time.

by LSS37040, 3 years ago

Attachment: IMG_20210727_233017.jpg added

Divide Error Exception reproduced on R1 Beta3.

comment:11 by waddlesplash, 3 years ago

That patch would "fix" the KDL, but really just masks it. The code is structured such that those values should never be 0, and if they are, something has gone wrong somewhere else earlier. One such path was cut off by a patch of korli's, but something else must be going wrong in order to hit this result, and I would strongly prefer that be fixed rather than masking the problem.

If there are certain devices that reliably trigger the panic when removing/etc. them, perhaps I or someone else can try to look into it using that; or other information about what reliably causes it.

comment:12 by jessicah, 3 years ago

Perhaps adding a trace at https://github.com/haiku/haiku/blob/master/src/add-ons/kernel/bus_managers/usb/usb.cpp#L470 to check if packetCount is non-zero?

This gets used at https://github.com/haiku/haiku/blob/master/src/add-ons/kernel/busses/usb/xhci.cpp#L2949, maybe a possible divide by zero?

Running an objdump on xhci and searching for div instructions yielded the following functions:

  • GetUSBID
  • FinishTransfers
  • _Resize
  • PhysicalMemoryAllocator
  • Allocate
  • Deallocate

Also note that in the last attachment, that's being triggered by a stat syscall, pretty basic code there.

comment:13 by waddlesplash, 3 years ago

All the stack traces in question are Bulk or Interrupt operations, not Isochronous (which only audio/video devices use), though; and all of them appear to be in the Submit function, not the Finish function.

comment:14 by waddlesplash, 3 years ago

Platform: x86-64 → All

comment:15 by KapiX, 3 years ago

I am able to reliably trigger this with my slightly bent wireless keyboard dongle. Since it's bent, sometimes it doesn't work, and fiddling with it makes it cycle between on and off very fast. Then it doesn't take long before this KDL comes up.


comment:16 by waddlesplash, 3 years ago

If I have surmised as to the nature of the problem correctly, this patch should fix the problem: https://review.haiku-os.org/c/haiku/+/4421

KapiX, if you can reproduce it so easily, you seem to be the ideal candidate to build and test it. :)

comment:17 by waddlesplash, 3 years ago

Blocking: 16603 added

comment:18 by waddlesplash, 3 years ago

Blocking: 16586 added

comment:19 by waddlesplash, 3 years ago

hrev55404 should probably change this KDL from "division by zero" to "endpoint is not initialized".

comment:20 by LSS37040, 3 years ago

Tested hrev55409 and hrev55410.

An empty card reader still KDLs the system when starting DriveSetup.

However, when the system KDLs only a few lines of white area appear at the top of the screen and then the system appears frozen without showing any actual KDL messages, not even "endpoint is not initialized".

comment:21 by waddlesplash, 3 years ago

Probably because it is trying to use the usb_keyboard kernel debug addon. You may get a backtrace by blacklisting that.

by LSS37040, 3 years ago

Attachment: IMG_20210912_204926.jpg added

"endpoint not initialized" KDL when accessing DriveSetup with the chassis SD card reader empty.

comment:22 by LSS37040, 3 years ago

Yeah, after disabling usb_keyboard kernel debug addon I'm able to get an "endpoint not initialized" KDL backtrace. I've attached one. I can reliably reproduce this by accessing DriveSetup with the chassis SD card reader left empty.

comment:23 by waddlesplash, 3 years ago

I might have found a way to reproduce a perhaps closely related bug in QEMU by rapidly adding and removing USB devices: a NULL dereference in USB ECM called from the explore thread, apparently into a device descriptor. It seems the device descriptor was already removed, but that should only have occurred in the explore thread itself.

comment:24 by waddlesplash, 3 years ago

Milestone: Unscheduled → R1/beta4
Resolution: → fixed
Status: new → closed

I believe this should be fixed in hrev55429. There is a pending change on Gerrit which may resolve some other race conditions which should be merged pretty soon and resolve whatever other problems remain.

by LSS37040, 3 years ago

Attachment: IMG_20210919_205814.jpg added

KDL with empty chassis card reader on hrev55436.

comment:25 by LSS37040, 3 years ago

Resolution: fixed → (cleared)
Status: closed → reopened

Just tested hrev55436 and the issue persists under the same circumstances (the presence of an empty chassis card reader), with a different KDL.

Now the panic says "USB object did not become unbusy!"

comment:26 by waddlesplash, 3 years ago

I guess this is probably due to the Control pipe features being in use. Seems we need a more cogent strategy for that teardown.
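To make the "did not become unbusy" condition concrete, here is a minimal sketch of a busy counter with a bounded wait at teardown. It is an assumption-laden illustration, not the Haiku USB stack's implementation: the class `BusyTracker` and its methods are hypothetical names.

```cpp
#include <atomic>
#include <chrono>
#include <thread>

// Hypothetical sketch: users of an object bump a busy count while they
// hold it, and teardown waits (with a deadline) for the count to reach zero.
class BusyTracker {
public:
	void Acquire() { fBusy.fetch_add(1); }
	void Release() { fBusy.fetch_sub(1); }

	// Returns true once nobody holds the object, or false if the
	// deadline passes first (the "did not become unbusy" case).
	bool WaitForIdle(std::chrono::milliseconds timeout) const
	{
		const auto deadline = std::chrono::steady_clock::now() + timeout;
		while (fBusy.load() != 0) {
			if (std::chrono::steady_clock::now() >= deadline)
				return false;
			std::this_thread::sleep_for(std::chrono::milliseconds(1));
		}
		return true;
	}

private:
	std::atomic<int> fBusy{0};
};
```

In this model, a holder that never calls `Release()` (for example, a control pipe still in use) makes `WaitForIdle()` time out, which is why the order of teardown matters.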

comment:27 by waddlesplash, 3 years ago

This change should hopefully resolve the problem: https://review.haiku-os.org/c/haiku/+/4491/1

When test builds are available, I will post a link here.

comment:28 by waddlesplash, 3 years ago

X512 reports that both the KDL and the original bug in #16969 were not immediately reproducible with those changes, so I just merged them in hrev55442. Please retest with that.

comment:29 by LSS37040, 3 years ago

Tested hrev55443. DriveSetup no longer KDLs the system, but it's not functioning correctly.

With an empty chassis card reader, it takes several minutes to show the main screen, and even after that the DriveSetup window looks totally frozen. I can't do anything with it, such as scrolling down to see what an empty card reader would look like, or closing the window.

For that I'll file a separate issue as the KDL, which is the original topic of this issue, is now gone.

comment:30 by LSS37040, 3 years ago

Ack... spoke too soon. That was tested during the install phase.

This time I actually installed the system and tried opening DriveSetup in the newly installed environment, and I got a very different KDL, "vm_page_fault".

What I did was open DriveSetup while my card reader was empty, then open some folders while DriveSetup was loading, which eventually led to the KDL.

by LSS37040, 3 years ago

Attachment: IMG_20210921_204639.jpg added

hrev55443 "vm_page_fault" KDL.

comment:31 by waddlesplash, 3 years ago

Resolution: → fixed
Status: reopened → closed

That appears to be a separate issue in a different component, please open a new ticket for it.

comment:32 by LSS37040, 3 years ago

Blocking: 17287 added