Opened 16 years ago
Closed 10 years ago
#2367 closed bug (fixed)
Media checker blocks in USB when booting from USB
| Reported by: | axeld | Owned by: | mmlr |
|---|---|---|---|
| Priority: | normal | Milestone: | R1 |
| Component: | Drivers/USB | Version: | R1/pre-alpha1 |
| Keywords: | | Cc: | |
| Blocked By: | | Blocking: | |
| Platform: | All | | |
Description
I've installed Haiku on a 128 MB USB stick, and booted it from there. Booting itself went fine, but applications using the disk device API (like mountvolume, DriveSetup) hang when I start them.
The media checker waits on some EHCI finisher which never seems to return. When I disable the media checker in the kernel, the lockup does not happen, and the above-mentioned apps work nicely.
Tested with hrev25898.
Attachments (2)
Change History (16)
comment:1 by , 16 years ago
Description: | modified (diff) |
---|---|
Priority: | normal → high |
comment:2 by , 16 years ago
Status: | new → assigned |
---|
comment:3 by , 16 years ago
I just checked here with hrev25882 (no changes in that regard compared to hrev25898) and I am not able to reproduce this. I booted off the stick and ran DriveSetup and also mountvolume. Both worked as expected and didn't hang.
I find it a bit strange that it waits for the EHCI finisher, as this thread just finishes transfers and calls the callbacks. Even if a callback queued a new transfer, this would be done asynchronously, so it cannot really deadlock there (at least not with itself). Why do you think it waits for the finisher? Can you provide some sort of debug output?
I can only imagine that, as mentioned above, the device simply never acts on a queued transfer (although then the controller should still return the transfer with a timeout at some point). What would surprise me then is if other IO to that device still worked, as it should simply lock up usb_disk then.
comment:4 by , 16 years ago
There is only a single USB disk attached, plus the built-in SATA mass storage (two drives). I don't have a serial output from that machine, but I'll try to get more specific data.
The media checker itself waited for some disk device lock, and the lock owner (I don't remember who) was waiting for the EHCI finisher.
I probably don't have the time to do this before Thursday, though.
comment:5 by , 16 years ago
When you're at that machine again, could you please check what chip it uses for EHCI? There seem to be workarounds applied in other EHCI drivers namely for broken VIA chips that simply lose completion interrupts...
comment:6 by , 16 years ago
"I've installed Haiku on a 128 MB USB stick.."
How did you install Haiku on a 128 MB stick? The image size is 262 MB (it contains about 100 MB of free space, but 262 - 100 = 162, and 162 > 128). Are you sure you made a correct boot disk? How did you make it?
comment:7 by , 16 years ago
To mmlr: added image of media checker stack trace. Seems I could have remembered better... :-)
To miqlas: just remove all optional packages, and Haiku installs fine on smaller images. The image was actually only 100 MB in size, I don't remember how much free space was left, though.
comment:8 by , 16 years ago
Axel, any news regarding the EHCI chip in use? If it really is a VIA or ATI one, a workaround for lost interrupts might be in order. I'll attach a patch to this ticket that should do pretty much that by setting a timeout on the semaphore acquisition in the finisher thread, so that it unconditionally wakes up once every ms. Could you please try with that and check if it solves the problem? If so, I would like to blacklist the chip you have there to always use such a workaround.
by , 16 years ago
Attachment: | ehci_finish_every_ms.diff added |
---|
Possible workaround for lost interrupts on broken controllers.
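The idea behind the patch can be modelled outside the kernel. The following is only a rough Python simulation, not the contents of ehci_finish_every_ms.diff: the `Controller` class, the `run` helper and all attribute names are made up for illustration. It assumes a finisher thread that normally sleeps on a semaphore released by the completion interrupt, and shows how a timed wait lets it notice finished transfers even when that interrupt is lost.

```python
import threading
import time


class Controller:
    """Toy model of a host controller whose completion interrupt may get lost."""

    def __init__(self, lose_interrupts):
        self.lose_interrupts = lose_interrupts
        self.finish_sem = threading.Semaphore(0)  # released by the "interrupt"
        self.lock = threading.Lock()
        self.done_in_hardware = []  # transfers the device has finished
        self.completed = []         # transfers the finisher has processed
        self.running = True

    def hardware_completes(self, transfer):
        with self.lock:
            self.done_in_hardware.append(transfer)
        if not self.lose_interrupts:
            self.finish_sem.release()  # normal interrupt path

    def finisher(self, wakeup_interval):
        # wakeup_interval=None blocks until an interrupt arrives;
        # a number makes the thread wake up at least that often (seconds).
        while self.running:
            self.finish_sem.acquire(timeout=wakeup_interval)
            with self.lock:
                finished, self.done_in_hardware = self.done_in_hardware, []
            self.completed.extend(finished)


def run(lose_interrupts, wakeup_interval, wait=0.2):
    """Complete one transfer and report what the finisher saw within `wait` seconds."""
    ctrl = Controller(lose_interrupts)
    thread = threading.Thread(target=ctrl.finisher, args=(wakeup_interval,))
    thread.start()
    ctrl.hardware_completes("transfer-1")
    deadline = time.monotonic() + wait
    while time.monotonic() < deadline and not ctrl.completed:
        time.sleep(0.005)
    seen = list(ctrl.completed)
    ctrl.running = False
    ctrl.finish_sem.release()  # unblock the finisher so it can exit
    thread.join()
    return seen
```

Calling `run(True, None)` models the suspected bug (interrupt lost, finisher never wakes up), while `run(True, 0.001)` models the workaround with a 1 ms wakeup. The cost of the workaround is a thread that wakes up constantly, which is presumably why it would only be enabled for blacklisted chips.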
comment:9 by , 16 years ago
I've added a timeout in usb_disk in hrev26082, which might solve or at least work around this issue. Could you please retry with that?
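For readers unfamiliar with the pattern, here is a minimal Python sketch of waiting for a transfer with a timeout. It is not the hrev26082 change itself: `wait_for_completion`, `simulate` and the cancel callback are hypothetical names, and the real driver works with kernel semaphores rather than `threading.Semaphore`.

```python
import threading


def wait_for_completion(notify, cancel_queued_transfers, timeout):
    """Return True if the completion callback fired within `timeout` seconds."""
    if notify.acquire(timeout=timeout):
        return True
    # The device never answered: give up on the request so the caller
    # (and any lock it holds) is not blocked forever.
    cancel_queued_transfers()
    return False


def simulate(device_delay, timeout):
    """device_delay=None models a device that never acts on the transfer."""
    notify = threading.Semaphore(0)
    cancelled = []
    timer = None
    if device_delay is not None:
        # Stand-in for the completion callback releasing the semaphore.
        timer = threading.Timer(device_delay, notify.release)
        timer.start()
    ok = wait_for_completion(notify, lambda: cancelled.append(True), timeout)
    if timer is not None:
        timer.join()
    return ok, bool(cancelled)
```

With a responsive device, `simulate(0.01, 1.0)` reports a completed transfer and no cancellation. With `simulate(None, 0.05)` the wait gives up after the timeout and the cancel callback runs, so whoever holds the disk device lock can return an error instead of blocking the media checker indefinitely.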
comment:10 by , 16 years ago
The timeout seems to have successfully worked around the issue. I get the following messages in syslog (tons of):
usb_disk: sending the command block wrapper failed
usb_ehci: qtd (0x0f015f00) error: 0x80008d40
...
usb_disk: acquire_sem failed while waiting for data transfer
Not sure what this means; maybe the command couldn't even be sent in the first place? Is it possible to differentiate between devices where it makes sense to check for media, and those where it doesn't?
The EHCI controller is an Intel one (0x265c), as are the UHCI controllers (ICH6).
comment:11 by , 16 years ago
Milestone: | R1/alpha1 → R1 |
---|---|
Priority: | high → normal |
Since the lockup is gone, I'm changing the milestone.
comment:12 by , 16 years ago
Replying to axeld:
Not sure what this means; maybe the command couldn't even be sent in the first place? Is it possible to differentiate between devices where it makes sense to check for media, and those where it doesn't?
Well, it makes sense to check for media when it's declared as removable. Most USB drives are labeled removable though, so this is not really a good way of telling. What could and probably should be done is to just stop checking for media changes when the test unit ready command doesn't work.
In any case, could you check with a revision >= hrev28934 to see if the fixed reset recovery solves this issue?
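The suggestion to stop checking for media changes once test unit ready keeps failing could look roughly like the Python sketch below. Everything in it is an assumption made for illustration: the `MediaPoller` class, the `max_failures` threshold of 3, and the choice to count consecutive failures are not taken from usb_disk.

```python
class MediaPoller:
    """Polls for media until the probe command has failed too many times in a row."""

    def __init__(self, test_unit_ready, max_failures=3):
        self.test_unit_ready = test_unit_ready  # callable, raises OSError on failure
        self.max_failures = max_failures        # hypothetical threshold
        self.failures = 0
        self.enabled = True
        self.commands_sent = 0

    def poll(self):
        """Return True/False for media presence, or None if unknown or disabled."""
        if not self.enabled:
            return None  # device was judged not to support the check
        self.commands_sent += 1
        try:
            ready = self.test_unit_ready()
        except OSError:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.enabled = False  # stop bothering the device
            return None
        self.failures = 0  # a success resets the consecutive-failure count
        return ready


def always_fails():
    raise OSError("command block wrapper failed")
```

A device that never answers the command is then probed only `max_failures` times, for example `MediaPoller(always_fails).poll()` called ten times sends three commands and returns `None` each time, instead of filling the syslog with one error per poll.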
comment:13 by , 15 years ago
As mentioned in the comment above, disabling media checking on devices that don't seem to support it was implemented some time ago. The fixed reset recovery might have improved things as well, so please retest if possible.
comment:14 by , 10 years ago
Resolution: | → fixed |
---|---|
Status: | in-progress → closed |
As mentioned in the comments above:
- Timeout handling in usb_disk has been implemented.
- The test unit ready command is now disabled on devices that fail it too often.
The issue should therefore be long gone.
Does this happen with the boot USB stick alone or are there other mass storage devices? If the media checker blocks waiting for a transfer that never finishes (which is possible as there is no timeout handling in usb_disk yet) then theoretically all IO to the boot volume should block too.