Opened 17 years ago

Closed 15 years ago

#2083 closed bug (fixed)

EHCI bus hanging at boot.

Reported by: bga
Owned by: mmlr
Priority: blocker
Milestone: R1
Component: Drivers/USB
Version: R1/pre-alpha1
Keywords:
Cc: vegarwa@…
Blocked By:
Blocking:
Platform: All

Description

Tried this with the latest revision, hrev24979, and the problem is still there. When booting, the system hangs when the EHCI bus driver tries to "steal" the host controller from the BIOS. I could not get serial debug output (my computer has no serial port), so I took a picture of the point where it stops (I hope it is clear enough). Note that the boot actually proceeds pretty fast up to the line that says:

More than 99% interrupts of vector 7 are unhandled

It takes several minutes for a new message to appear (probably because the previous message was repeated around 4000 times, as the output shows).

Note that around one week ago it worked flawlessly on my computer.

Attachments (15)

output.JPG (411.3 KB ) - added by bga 17 years ago.
Debug output (crappy picture, sorry)
lspci.txt (8.8 KB ) - added by pieterpan 17 years ago.
lspci from linux
lsusb.txt (34.6 KB ) - added by pieterpan 17 years ago.
lsusb from linux
diff_no_usb_trace.diff (755 bytes ) - added by pieterpan 17 years ago.
can't get USB tracing output with this... why not?
DSC02671 (Custom).JPG (155.0 KB ) - added by pieterpan 17 years ago.
bunch of pictures of all usb related output
DSC02672 (Custom).JPG (95.5 KB ) - added by pieterpan 17 years ago.
bunch of pictures of all usb related output
DSC02673 (Custom).JPG (103.0 KB ) - added by pieterpan 17 years ago.
bunch of pictures of all usb related output
DSC02674 (Custom).JPG (93.7 KB ) - added by pieterpan 17 years ago.
bunch of pictures of all usb related output
DSC02676 (Custom).JPG (93.3 KB ) - added by pieterpan 17 years ago.
bunch of pictures of all usb related output
DSC02677 (Custom).JPG (88.3 KB ) - added by pieterpan 17 years ago.
bunch of pictures of all usb related output
DSC02678 (Custom).JPG (149.5 KB ) - added by pieterpan 17 years ago.
bunch of pictures of all usb related output
lspci.log (1.3 KB ) - added by Minoru-kun 16 years ago.
laptop's lspci
lsusb.log (247 bytes ) - added by Minoru-kun 16 years ago.
laptop's lsusb
2008-02-27_r29337_ehci_hanging.log (58.0 KB ) - added by vegardw 16 years ago.
Complete serial log from boot attempt
echi_handover.diff (1.1 KB ) - added by vegardw 15 years ago.
Attaching the changes I did as a diff


Change History (51)

by bga, 17 years ago

Attachment: output.JPG added

Debug output (crappy picture, sorry)

comment:1 by mmlr, 17 years ago

Status: new → assigned

It seems that the BIOS at first does not hand over control, but then decides to do so after all, which causes an interrupt flood. With hrev24980 I've increased the amount of time the handover code tries to get the controller, and it now also disables interrupts if the handover fails. Could you please check whether this allows the system to boot, and if so, whether the EHCI controller then actually failed or succeeded in its initialization? You should see corresponding lines in the syslog. In fact, Linux does not really care about a failed handover. It just force-clears the BIOS-owned bit, disables all the SMIs and then goes on, assuming it now owns the controller. If it turns out that there are BIOSes that do not complete the handover at all, then we might want to do the same thing in the future.
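To make the two handover strategies described in this comment concrete, here is a minimal sketch against a fake register set. It is not the Haiku or Linux code: FakeLegacySupport, TakeOwnership and the poll count are hypothetical names chosen for illustration, and only the two semaphore bits (BIOS owned at bit 16, OS owned at bit 24 of the EHCI legacy support register) are taken from the EHCI specification.

```cpp
#include <cassert>
#include <cstdint>

// Semaphore bits of the EHCI USBLEGSUP capability register.
constexpr uint32_t kBiosOwned = 1u << 16;
constexpr uint32_t kOsOwned = 1u << 24;

// Hypothetical stand-in for the controller's PCI config space. A real
// driver would go through the PCI module instead.
struct FakeLegacySupport {
	uint32_t legsup;		// USBLEGSUP
	uint32_t smiControl;	// USBLEGCTLSTS (SMI enable bits)
	int readsBeforeRelease;	// negative: the BIOS never lets go

	uint32_t Read()
	{
		if (readsBeforeRelease == 0)
			legsup &= ~kBiosOwned;
		else if (readsBeforeRelease > 0)
			readsBeforeRelease--;
		return legsup;
	}
};

enum class Handover { Clean, Forced };

Handover
TakeOwnership(FakeLegacySupport& regs, int maxPolls)
{
	regs.legsup |= kOsOwned;	// request ownership

	Handover result = Handover::Forced;
	for (int i = 0; i < maxPolls; i++) {
		if ((regs.Read() & kBiosOwned) == 0) {
			result = Handover::Clean;
			break;
		}
		// a real driver would sleep between polls here
	}

	if (result == Handover::Forced)
		regs.legsup &= ~kBiosOwned;	// stop waiting, clear the bit ourselves

	regs.smiControl = 0;	// in both cases, disable all SMI sources
	return result;
}

void
HandoverExample()
{
	FakeLegacySupport slowBios = { kBiosOwned, 0x3f, 3 };
	assert(TakeOwnership(slowBios, 20) == Handover::Clean);

	FakeLegacySupport stubbornBios = { kBiosOwned, 0x3f, -1 };
	assert(TakeOwnership(stubbornBios, 20) == Handover::Forced);
	assert((stubbornBios.legsup & kBiosOwned) == 0);
}
```

Disabling the SMI sources in both outcomes is the part that matters for the interrupt flood: a BIOS that still has SMIs armed can keep firing after the OS believes it owns the controller.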

comment:2 by bga, 17 years ago

The system booted, but I am not sure ehci was ever initialized. The only references to ehci in the syslog are from the PCI bus scan. I then searched for usb and besides the PCI scan, I only found the following messages:

KERN: loaded driver /boot/beos/system/add-ons/kernel/drivers/dev/input/usb_hid

and

KERN: loaded driver /boot/beos/system/add-ons/kernel/drivers/dev/bus/usb_raw

Mouse is connected through USB and is working. Also:

~> ls /dev/bus/usb/
0 1 2 3 4 5 6 raw

and

~> ls /dev/bus/usb/3
1 hub

~> ls /dev/bus/usb/5
1 hub

(those probably are the only 2 devices I have connected through USB. The mouse and a webcam).

If you need any other information, let me know.

comment:3 by mmlr, 17 years ago

You could use usb_dev_info to get the info about all of the hubs (i.e. "usb_dev_info /dev/bus/usb/x/hub"). Their product string will tell you whether they are UHCI or EHCI root hubs. Usually you have a 1:3 ratio between UHCI and EHCI devices. So I guess with the 7 devices that 6 of them (0-5) should be UHCI and one should be EHCI (6) with one EHCI controller probably missing. If you have USB 2.0 devices you can just plug them into every port you have and see where they get published. If they end up under a UHCI root hub then the EHCI controller that's responsible for these ports wasn't initialized.

comment:4 by bga, 17 years ago

You were right. 6 was EHCI. Just to be sure I connected a USB card reader to the computer and got this (this is a multi-port reader with only one MS Pro attached to it):

KERN: usb_disk: device reports a lun count of 4
KERN: usb_disk: vendor_identification "Generic "
KERN: usb_disk: product_identification "USB SD Reader "
KERN: usb_disk: product_revision_level "1.00"
KERN: usb_disk: operation 0x00 failed at the SCSI level
KERN: usb_disk: vendor_identification "Generic "
KERN: usb_disk: product_identification "USB CF Reader "
KERN: usb_disk: product_revision_level "1.01"
KERN: usb_disk: operation 0x00 failed at the SCSI level
KERN: usb_disk: vendor_identification "Generic "
KERN: usb_disk: product_identification "USB SM Reader "
KERN: usb_disk: product_revision_level "1.02"
KERN: usb_disk: operation 0x00 failed at the SCSI level
KERN: usb_disk: vendor_identification "Generic "
KERN: usb_disk: product_identification "USB MS Reader "
KERN: usb_disk: product_revision_level "1.03"
KERN: usb_disk: operation 0x00 failed at the SCSI level
KERN: usb_disk: request_sense: key: 0x06; asc: 0x28; ascq: 0x00;
KERN: usb_disk: request_sense: media changed
KERN: usb_disk: operation 0x00 failed at the SCSI level
KERN: usb_ehci: qtd (0x0393d100) error: 0x00088d40
KERN: usb_ehci: qtd (0x0393d300) error: 0x80008d40
KERN: usb_ehci: qtd (0x0393d480) error: 0x000d8d40
KERN: usb_ehci: qtd (0x0393d680) error: 0x80008d40
KERN: usb_ehci: qtd (0x0393d800) error: 0x000d8d40
KERN: usb_disk: receiving the command status wrapper failed
KERN: usb_ehci: qtd (0x0393dc00) error: 0x80008d40
KERN: usb_disk: failed to update capacity

After this the system stops responding. I can, for example, type ls in the terminal, but it will hang just after I hit enter. I also cannot start any other apps, and the ones that are running eventually hang too. Then I remove the card reader from the USB port and get this:

KERN: usb_ehci: qtd (0x0393df80) error: 0x001f8049
KERN: usb_disk: sending the command block wrapper failed
KERN: usb_ehci: qtd (0x0393e100) error: 0x00080248
KERN: usb_ehci: qtd (0x0393e300) error: 0x00080248
KERN: usb_ehci: qtd (0x0393e500) error: 0x00080248
KERN: usb_ehci: qtd (0x0393e700) error: 0x001f8049
KERN: usb_disk: sending the command block wrapper failed
KERN: usb_ehci: qtd (0x0393e880) error: 0x00080248
KERN: usb_ehci: qtd (0x0393ea80) error: 0x00080248
KERN: usb_ehci: qtd (0x0393ec80) error: 0x00080248
KERN: usb_disk: failed to update capacity
KERN: usb_ehci: qtd (0x0393ee80) error: 0x001f8049
KERN: usb_disk: sending the command block wrapper failed
KERN: usb_ehci: qtd (0x0393f000) error: 0x00080248
KERN: usb_ehci: qtd (0x0393f200) error: 0x00080248
KERN: usb_ehci: qtd (0x0393f400) error: 0x00080248
KERN: usb_ehci: qtd (0x0393f600) error: 0x001f8049
KERN: usb_disk: sending the command block wrapper failed
KERN: usb_ehci: qtd (0x0393f780) error: 0x00080248
KERN: usb_ehci: qtd (0x0393f980) error: 0x00080248
KERN: usb_ehci: qtd (0x0393fb80) error: 0x00080248
KERN: usb_ehci: qtd (0x0393fd80) error: 0x001f8049
KERN: usb_disk: sending the command block wrapper failed
KERN: usb_ehci: qtd (0x0393ff00) error: 0x00080248
KERN: usb_ehci: qtd (0x03940100) error: 0x00080248
KERN: usb_ehci: qtd (0x03940300) error: 0x00080248
KERN: usb_disk: failed to update capacity
KERN: usb_ehci: qtd (0x03940500) error: 0x001f8049
KERN: usb_disk: sending the command block wrapper failed
KERN: usb_ehci: qtd (0x03940680) error: 0x00080248
KERN: usb_ehci: qtd (0x03940880) error: 0x00080248
KERN: usb_ehci: qtd (0x03940a80) error: 0x00080248
KERN: usb_ehci: qtd (0x03940c80) error: 0x001f8049
KERN: usb_disk: sending the command block wrapper failed
KERN: usb_ehci: qtd (0x03940e00) error: 0x00080248
KERN: usb_ehci: qtd (0x03941000) error: 0x00080248
KERN: usb_ehci: qtd (0x03941200) error: 0x00080248
KERN: usb_ehci: qtd (0x03941400) error: 0x001f8049
KERN: usb_disk: sending the command block wrapper failed
KERN: usb_ehci: qtd (0x03941580) error: 0x00080248
KERN: usb_ehci: qtd (0x03941880) error: 0x00080248
KERN: usb_ehci: qtd (0x03941a80) error: 0x00080248
KERN: usb_disk: failed to update capacity
KERN: usb_ehci: qtd (0x03941c80) error: 0x001f8049
KERN: usb_disk: sending the command block wrapper failed
KERN: usb_ehci: qtd (0x03941e00) error: 0x00080248
KERN: usb_ehci: qtd (0x03942000) error: 0x00080248
KERN: usb_ehci: qtd (0x03942200) error: 0x00080248
KERN: usb_ehci: qtd (0x03942400) error: 0x001f8049
KERN: usb_disk: sending the command block wrapper failed
KERN: usb_ehci: qtd (0x03942580) error: 0x00080248
KERN: usb_ehci: qtd (0x03942780) error: 0x00080248
KERN: usb_ehci: qtd (0x03942980) error: 0x00080248
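For anyone trying to read the "qtd ... error" values in these logs: assuming the logged number is the raw token dword of the EHCI queue element transfer descriptor (an assumption, the driver source would need to confirm it), it can be decoded with the bit layout from the EHCI specification. The struct and function names below are made up for this sketch.

```cpp
#include <cassert>
#include <cstdint>

// Fields of an EHCI qTD token, per the layout in the EHCI specification.
struct QtdToken {
	bool active;
	bool halted;
	bool dataBufferError;
	bool babble;
	bool transactionError;
	bool missedMicroFrame;
	unsigned pid;			// 0 = OUT, 1 = IN, 2 = SETUP
	unsigned errorCounter;	// CERR, retries left
	bool interruptOnComplete;
	unsigned bytesToTransfer;
	bool dataToggle;
};

QtdToken
DecodeQtdToken(uint32_t token)
{
	QtdToken result;
	result.active = (token & (1u << 7)) != 0;
	result.halted = (token & (1u << 6)) != 0;
	result.dataBufferError = (token & (1u << 5)) != 0;
	result.babble = (token & (1u << 4)) != 0;
	result.transactionError = (token & (1u << 3)) != 0;
	result.missedMicroFrame = (token & (1u << 2)) != 0;
	result.pid = (token >> 8) & 0x3;
	result.errorCounter = (token >> 10) & 0x3;
	result.interruptOnComplete = (token & (1u << 15)) != 0;
	result.bytesToTransfer = (token >> 16) & 0x7fff;
	result.dataToggle = (token & (1u << 31)) != 0;
	return result;
}

void
QtdTokenExample()
{
	// 0x00080248: an 8 byte SETUP stage, halted with a transaction error
	QtdToken setup = DecodeQtdToken(0x00080248);
	assert(setup.halted && setup.transactionError);
	assert(setup.pid == 2 && setup.bytesToTransfer == 8);

	// 0x001f8049: a 31 byte OUT transfer, halted with a transaction error
	QtdToken out = DecodeQtdToken(0x001f8049);
	assert(out.halted && out.transactionError);
	assert(out.pid == 0 && out.bytesToTransfer == 31);
}
```

Read this way, the values logged after unplugging are OUT and SETUP transfers that ran out of retries, which is what one would expect once the device is gone. The values logged while the reader was attached (for example 0x00088d40) are IN transfers that halted without any error bit and with retries left, which would be consistent with the device stalling the request.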

And everything starts working again.

So it seems EHCI is at least kinda working. :)

comment:5 by pieterpan, 17 years ago

Tried booting Haiku on my laptop, which I hadn't done in a while. It used to boot fine, but now it also hangs. Sometimes the last debug output is "usb_uhci: successfully started the controller". Sometimes it gets further and shows "More than 99% interrupts of vector 7 are unhandled". Once, after entering KDL with F12 and continuing, it actually booted. I have tried to reproduce this, but it wouldn't work again. How does one use the bfs_shell to remove the usb_uhci driver to see if it continues booting?

comment:6 by mmlr, 17 years ago

If it explicitly says that it loaded and started UHCI correctly then I wouldn't suspect the problem to be UHCI. It could be EHCI that is loaded after UHCI. If you build your installation yourself then you can simply remove "<usb>ehci" or "<usb>uhci" or both from the build/jam/HaikuImage.

comment:7 by meanwhile, 17 years ago

Bug #1919 looks related, as the same line "More than 99% interrupts of vector 7 are unhandled" shows up (that's a very superficial argument, matching my insight perfectly :p).

comment:8 by mmlr, 17 years ago

Bug #1919 is most probably not related to this one (no USB controller present at the mentioned interrupt vector). That interrupts are unhandled is a common symptom that can be triggered by most drivers if they activate interrupts but fail to handle/acknowledge them correctly.
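As an illustration of why that message is only a symptom, here is a toy model of per-vector interrupt accounting. It is a sketch, not the Haiku kernel code: IrqVector, the handler signature and the exact 99% check are assumptions made for the example.

```cpp
#include <cassert>
#include <functional>
#include <vector>

enum class IrqResult { Unhandled, Handled };

// Hypothetical model of one interrupt vector shared by several drivers.
struct IrqVector {
	std::vector<std::function<IrqResult()>> handlers;
	long total = 0;
	long unhandled = 0;

	void Dispatch()
	{
		bool handled = false;
		for (auto& handler : handlers) {
			if (handler() == IrqResult::Handled)
				handled = true;
		}
		total++;
		if (!handled)
			unhandled++;
	}

	bool MostlyUnhandled() const
	{
		return total > 0 && unhandled * 100 > total * 99;
	}
};

// Simulates a device raising `count` interrupts on a vector whose only
// driver either acknowledges them or does not.
bool
SimulateFlood(int count, bool driverAcknowledges)
{
	IrqVector vector;
	vector.handlers.push_back([driverAcknowledges]() {
		return driverAcknowledges
			? IrqResult::Handled : IrqResult::Unhandled;
	});
	for (int i = 0; i < count; i++)
		vector.Dispatch();
	return vector.MostlyUnhandled();
}

void
FloodExample()
{
	assert(SimulateFlood(1000, false));	// nobody claims the interrupt
	assert(!SimulateFlood(1000, true));	// a driver acknowledges it
}
```

Any driver on the vector can produce the first case, so the message alone does not identify the USB controller as the culprit.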

comment:9 by pieterpan, 17 years ago

I'm building Haiku from Ubuntu. Removing ehci in HaikuImage 'fixed' the problem indeed, it boots again. Anything I can try for debugging?

comment:10 by mmlr, 17 years ago

You could enable debug output for EHCI by adding #define TRACE_USB at the top of src/add-ons/kernel/busses/usb/ehci.cpp. Then (adding EHCI back to the image first of course) you can check the debug output (best via serial debugging or otherwise by enabling on-screen debug output) and post here where it gets stuck exactly.

comment:11 by pieterpan, 17 years ago

For some reason that does not yield any additional output, though looking at the code I should get plenty. I added EHCI back, of course (though I probably would have forgotten on the first try if you hadn't reminded me :-) ). I attached the diff of my attempt against the latest revision. I did a jam clean and rebuild, no change. Even enabling ENABLE_TRACING did not help. Am I missing something?

by pieterpan, 17 years ago

Attachment: lspci.txt added

lspci from linux

by pieterpan, 17 years ago

Attachment: lsusb.txt added

lsusb from linux

by pieterpan, 17 years ago

Attachment: diff_no_usb_trace.diff added

can't get USB tracing output with this... why not?

comment:12 by mmlr, 17 years ago

Well, that tracing and TRACE_USB have nothing to do with each other ;-). TRACE_USB is enough to define all the TRACE() statements to dprintf, which will/should cause the debug output to appear. You could globally enable TRACE_USB by adding that define at the top of "src/add-ons/kernel/bus_managers/usb/usb_p.h". This will enable debug output for all of the USB stack and the host controller drivers. Possibly it's not EHCI, but another UHCI controller that fails in that case.
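The general pattern behind such a switch looks roughly like the following. This is a userland sketch of the idea, not a copy of usb_p.h: debug_printf stands in for the kernel's dprintf, and the message counter exists only so the effect can be checked.

```cpp
#include <cassert>
#include <cstdarg>
#include <cstdio>

#define TRACE_USB	// comment this out to silence TRACE()

static int sMessagesPrinted = 0;

// Stand-in for the kernel's dprintf(); counts messages for the example.
static void
debug_printf(const char* format, ...)
{
	va_list args;
	va_start(args, format);
	vprintf(format, args);
	va_end(args);
	sMessagesPrinted++;
}

#ifdef TRACE_USB
#	define TRACE(...)	debug_printf(__VA_ARGS__)
#else
#	define TRACE(...)	do { } while (0)
#endif

// Always printed, independent of TRACE_USB.
#define TRACE_ALWAYS(...)	debug_printf(__VA_ARGS__)

int
TraceExample()
{
	TRACE("usb_uhci: installing interrupt handler\n");
	TRACE_ALWAYS("usb_uhci: successfully started the controller\n");
	return sMessagesPrinted;
}
```

Because the macro is resolved at compile time, the define has to be visible before the header that declares TRACE() is processed, and every file that uses it has to be rebuilt. That is one plausible reason a define added in a single .cpp file can appear to have no effect.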

comment:13 by pieterpan, 17 years ago

I knew that :-), but since I got no output I gave it a try anyway... I have enabled TRACE_USB in usb_p.h as you said, and now we get lots of output; I hope it is useful for you. Sorry it took a while. I had taken the pics with my nice camera, but we're in the middle of a move, so that camera is now in our new house :) You'll have to make do with the crappy camera. The most interesting bit probably is this (the "Last message repeated..." line only appeared after I pressed F12):

usb_uhci: installing interrupt handler
usb_uhci: host controller halted
Last message repeated 4459366 times

Let me know if I can try something else

by pieterpan, 17 years ago

Attachment: DSC02671 (Custom).JPG added

bunch of pictures of all usb related output

by pieterpan, 17 years ago

Attachment: DSC02672 (Custom).JPG added

bunch of pictures of all usb related output

by pieterpan, 17 years ago

Attachment: DSC02673 (Custom).JPG added

bunch of pictures of all usb related output

by pieterpan, 17 years ago

Attachment: DSC02674 (Custom).JPG added

bunch of pictures of all usb related output

by pieterpan, 17 years ago

Attachment: DSC02676 (Custom).JPG added

bunch of pictures of all usb related output

by pieterpan, 17 years ago

Attachment: DSC02677 (Custom).JPG added

bunch of pictures of all usb related output

by pieterpan, 17 years ago

Attachment: DSC02678 (Custom).JPG added

bunch of pictures of all usb related output

comment:14 by mmlr, 16 years ago

Bruno, does this still occur? In hrev25918 I have changed EHCI to forcibly take over the controller in case the BIOS fails to hand it over. Could you please check if the EHCI controllers actually work now? Best by checking a USB 2.0 device on the previously disabled ports, and/or by checking the presence with usb_dev_info.

Pieterpan, the problem you see is in fact not EHCI but UHCI. Could you please verify whether it still happens in current revisions? If it does, please open a new bug report for that.

comment:15 by bga, 16 years ago

I will check but I can not do that until next week when I am back home (I am in the US right now).

comment:16 by bga, 16 years ago

During boot I get the message that the BIOS is not giving up the controller and that it will ignore it. The next message seems to indicate that the controller was correctly initialized.

I still cannot connect my USB card reader, though. If I do, the system stops completely until I remove it (and sometimes it does not recover even after that).

comment:17 by pieterpan, 16 years ago

Okay I'm back from my vacation in the USA. Indeed the problem now seems related to OHCI. I added my issues to #1044.

comment:18 by Minoru-kun, 16 years ago

I got the same error on my BenQ A52-R20 laptop yesterday:

usb_ohci: successfully started the controller.
usb_ehci: the host controller is bios-owned
usb_ehci: claiming ownership of the host controller
usb_ehci: controller is still bios-owned, waiting
Last message repeated 19 times
usb_ehci: successfully started the controller
USB ControlPipe: timeout waiting for queued request to complete
USB ControlPipe: timeout waiting for queued request to complete
usb_ehci: qtd (0x02ab7680) error: 0x00080248
USB ControlPipe: timeout waiting for queued request to complete
usb_ehci: qtd (0x02ab7680) error: 0x00080248
USB BusManager: error while setting device address
usb_ehci: lowspeed device connected, giving up
get_boot_partitions(): boot volume message:
<...>

I tried building Haiku from source on my Linux box (without usb_ehci), but it still doesn't boot, reporting that the usb_ehci module is not found.

What can I do? I have attached my lspci/lsusb logs.

by Minoru-kun, 16 years ago

Attachment: lspci.log added

laptop's lspci

by Minoru-kun, 16 years ago

Attachment: lsusb.log added

laptop's lsusb

comment:19 by mmlr, 16 years ago (in reply to comment:18)

Replying to Minoru-kun:

I've got the same error on my BenQ A52-R20 laptop yesterday.: ... USB ControlPipe: timeout waiting for queued request to complete usb_ehci: qtd (0x02ab7680) error: 0x00080248 ... What can i do? I attach my lspci/lsusb logs.

Sadly it seems you have one of those boards that have broken interrupt/timer routing. What most probably happens is that there is no interrupt received when the transfer descriptors are done. USB (and most other devices for that matter) cannot work without receiving these interrupts however. Please see bug #2342 and #2335 for more details about the IXP problems.

This shouldn't be an isolated problem though. If you manage to boot at all, you should notice that other devices, most prominently network, will probably not work. If you could verify whether or not this problem only affects USB and no other device that'd be helpful.

comment:20 by Minoru-kun, 16 years ago

In that case, why does Linux work fine? (Because it boots from HDD/CD drive, do you mean?) Otherwise, maybe the IXP hacks can be ported to Haiku?

comment:21 by mmlr, 16 years ago (in reply to comment:20)

Replying to Minoru-kun:

In that case, why does Linux work fine? (Because it boots from HDD/CD drive, do you mean?) Otherwise, maybe the IXP hacks can be ported to Haiku?

Because Linux specifically works around the broken hardware, and yes, that is supposed to be ported over to Haiku. That is indeed what is tracked in bug #2342.

comment:22 by anevilyak, 16 years ago

How recently was that tested? hrev27333 in theory ported over such a workaround, unless more is needed with respect to IO-APIC or whatnot.

comment:23 by Minoru-kun, 16 years ago

Replying to anevilyak

How recently was that tested? hrev27333 in theory ported over such a workaround, unless more is needed with respect to IO-APIC or whatnot.

I encountered my problem on 27498 :)

comment:24 by nielx, 16 years ago

I'm having a similar problem with unhandled interrupts. EHCI freezes my system. Only when I remove and/or plug in a device the system unfreezes.

Is there anything I can do to diagnose the problem, or anything I can search for to fix?

comment:25 by vegardw, 16 years ago

Cc: vegarwa@… added

Having what seems to be the same or a similar problem. Haiku hangs during boot, the last line in the serial log is:

usb ehci -1: claiming ownership of the host controller

Recompiling with #define TRACE_USB uncommented in src/add-ons/kernel/bus_managers/usb/usb_p.h gives me the following:

usb ohci -1: iospace offset: 0xfbdfe000
usb ohci -1: smm is in control of the host controller
usb ohci -1: ownership change successful
usb ohci -1: successfully started the controller
usb ohci -1: iospace offset: 0xfbdfd000
usb ohci -1: smm is in control of the host controller
usb ohci -1: ownership change successful
usb ohci -1: successfully started the controller
usb ohci -1: iospace offset: 0xfbdfc000
usb ohci -1: smm is in control of the host controller
usb ohci -1: ownership change successful
usb ohci -1: successfully started the controller
usb ohci -1: iospace offset: 0xfbdfb000
usb ohci -1: smm is in control of the host controller
usb ohci -1: ownership change successful
usb ohci -1: successfully started the controller
usb ohci -1: iospace offset: 0xfbdf9000
usb ohci -1: smm is in control of the host controller
usb ohci -1: ownership change successful
usb ohci -1: successfully started the controller
usb ehci -1: claiming ownership of the host controller

If I remove "<usb>ehci" from "build/jam/HaikuImage" and recompile Haiku boots without problems.

Tested with

  • Haiku hrev29337
  • Compiled with GCC2 in ubuntu
  • Running from real partition

Hardware

The problem occurs on my desktop computer with the following hardware.

  • Asus M3A78 motherboard (Chipset: AMD 770 / AMD SB700)
  • AMD Phenom 9850 Black Edition Quad-Core CPU
  • 2x2GB DDR2 PC6400 RAM
  • Samsung SATA HDD

Reproducible

Always; it happens on every boot unless I remove <usb>ehci from the image.

by vegardw, 16 years ago

Attachment: 2008-02-27_r29337_ehci_hanging.log added

Complete serial log from boot attempt

comment:26 by vegardw, 16 years ago (in reply to comment:25)

Replying to vegardw:

recompiling with #define TRACE_USB uncommented in src/add-ons/kernel/bus_managers/usb/usb_p.h gives me the following:
(...)

Sorry, pasted from the wrong log file, that was from the one compiled without TRACE_USB defined. Should be:

usb ehci: searching devices
usb ehci: found device at IRQ 11
usb ehci -1: constructing new EHCI host controller driver
usb ehci -1: map physical memory 0xfbdff000 (base: 0xfbdff000; offset: 0); size: 256
usb ehci -1: mapped capability registers: 0x809da000
usb ehci -1: mapped operational registers: 0x809da020
usb ehci -1: structural parameters: 0x00102306
usb ehci -1: capability parameters: 0x0000a072
usb ehci -1: extended capabilities register at 160
usb ehci -1: claiming ownership of the host controller
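The "structural parameters" and "capability parameters" values in this log are the EHCI HCSPARAMS and HCCPARAMS registers, and the "extended capabilities register at 160" line follows directly from the second one. A small decoder, using the field layout from the EHCI specification (the struct and function names are made up for this sketch):

```cpp
#include <cassert>
#include <cstdint>

struct EhciParams {
	unsigned ports;					// N_PORTS, HCSPARAMS bits 3:0
	unsigned portsPerCompanion;		// N_PCC, HCSPARAMS bits 11:8
	unsigned companionControllers;	// N_CC, HCSPARAMS bits 15:12
	bool addressing64;				// HCCPARAMS bit 0
	unsigned extendedCapOffset;		// EECP, HCCPARAMS bits 15:8
};

EhciParams
DecodeEhciParams(uint32_t hcsparams, uint32_t hccparams)
{
	EhciParams result;
	result.ports = hcsparams & 0xf;
	result.portsPerCompanion = (hcsparams >> 8) & 0xf;
	result.companionControllers = (hcsparams >> 12) & 0xf;
	result.addressing64 = (hccparams & 0x1) != 0;
	result.extendedCapOffset = (hccparams >> 8) & 0xff;
	return result;
}

void
EhciParamsExample()
{
	// Values from the log above.
	EhciParams params = DecodeEhciParams(0x00102306, 0x0000a072);
	assert(params.ports == 6);
	assert(params.portsPerCompanion == 3);
	assert(params.companionControllers == 2);
	assert(params.extendedCapOffset == 160);	// 0xa0 in PCI config space
}
```

So this controller reports six ports shared with two companion controllers, and its legacy support capability sits at PCI config offset 0xa0, which is where the ownership request is written.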

comment:27 by mmlr, 16 years ago

Can you please retry with hrev29354 and see if the behaviour and/or output changes?

comment:28 by vegardw, 16 years ago

No changes in behaviour or output for hrev29354 for me. Still hangs at "claiming ownership of the host controller":

usb stack 0: module busses/usb/ohci successfully loaded
usb stack 0: looking for module busses/usb/ehci
usb ehci: ehci init module
usb stack 0: adding module busses/usb/ehci
usb ehci: searching devices
usb ehci: found device at IRQ 11
usb ehci -1: constructing new EHCI host controller driver
usb ehci -1: map physical memory 0xfbdff000 (base: 0xfbdff000; offset: 0); size: 256
usb ehci -1: mapped capability registers: 0x80690000
usb ehci -1: mapped operational registers: 0x80690020
usb ehci -1: structural parameters: 0x00102306
usb ehci -1: capability parameters: 0x0000a072
usb ehci -1: extended capabilities register at 160
usb ehci -1: claiming ownership of the host controller

comment:29 by vegardw, 16 years ago

When booting Ubuntu Linux 8.10 on the computer I have the following messages in the dmesg output

[    4.872585] ehci_hcd 0000:00:12.2: applying AMD SB600/SB700 USB freeze workaround
[    5.092602] ehci_hcd 0000:00:13.2: applying AMD SB600/SB700 USB freeze workaround

A couple of quick google searches after yields the following

Seems like a bug in the SB700 chipset, could this be related to the hanging I'm experiencing in Haiku?

I can attach the complete output from dmesg in Ubuntu if that is of any help

comment:30 by bga, 16 years ago

Just an update on something I noticed. I have USB ports on the back of my computer (4) and on its front panel (another 4). I take it they are connected to two different physical USB controllers.

If I connect anything to the back ports, I have no problems (I have a webcam connected to those and it is recognized by Haiku, although it does not work, as UVC support has only just started being implemented). But anything I connect to the ports on the front panel results in the IRQ flood (if it is plugged in during boot, the boot process hangs; if it is plugged in while the system is running, the system locks up).

Based on this I would guess the problem could be due to IRQ sharing between the problematic controller and some other hardware? If so, your (mmlr's) recent changes to try to work around things like this didn't work, unfortunately.

comment:31 by bga, 16 years ago

Re-reading a past post of mine, I remembered that a USB Mass Storage card reader I have actually works up to a point even when connected to the front ports. What does not work for sure and results in the IRQ flood:

1 - Bluetooth USB dongle.
2 - My printer.

So, what could cause the IRQ flood on some devices and not on others?

-Bruno

comment:32 by bga, 16 years ago

Just as an addendum, I have been getting this quite frequently. I don't know how it happens, but usually I notice it when doing "svn up" on the Haiku tree. Usually directories inside the .svn dirs are the ones affected, so I am inclined to think this is happening because bad data is being written to the directory entry when these files are updated. It *COULD* be related to advisory file locking, as the files are locked by svn when they are being updated. In fact, the first indication that the problem occurred is that when doing svn up I get a message that I need to run svn cleanup (which usually fails, and checking the syslog shows the "Bad data" message).

comment:33 by mmlr, 16 years ago

Can you please retry with hrev29931? This might be an issue with broken BIOSes.

comment:34 by vegardw, 15 years ago

EHCI is still hanging at boot for me with hrev35770. I have done some investigating today, and it hangs at the sPCIModule->write_pci_config call at line 188 in src/add-ons/kernel/busses/usb/ehci.cpp, which never returns. If I add a TRACE_ALWAYS on the next line, it is never printed. Also, "the host controller is bios owned" at line 185 is not printed, so the check (legacySupport & EHCI_LEGSUP_BIOSOWNED) evaluates to 0.

184     if (legacySupport & EHCI_LEGSUP_BIOSOWNED)
185         TRACE_ALWAYS("the host controller is bios owned\n");
186
187     TRACE_ALWAYS("claiming ownership of the host controller\n");
188     sPCIModule->write_pci_config(fPCIInfo->bus, fPCIInfo->device,
189         fPCIInfo->function, extendedCapPointer + 3, 1, 1);

I then took a look at the FreeBSD source, dev/usb/controller/ehci.c, and noticed that they only do the write if the host controller is BIOS owned.

I then changed the Haiku driver so that the write is only done if the host controller is BIOS owned, and Haiku boots fine on my computer with that change applied:

Index: src/add-ons/kernel/busses/usb/ehci.cpp
===================================================================
--- src/add-ons/kernel/busses/usb/ehci.cpp	(revision 35770)
+++ src/add-ons/kernel/busses/usb/ehci.cpp	(working copy)
@@ -181,12 +181,13 @@
 		uint32 legacySupport = sPCIModule->read_pci_config(fPCIInfo->bus,
 			fPCIInfo->device, fPCIInfo->function, extendedCapPointer, 4);
 		if ((legacySupport & EHCI_LEGSUP_CAPID_MASK) == EHCI_LEGSUP_CAPID) {
-			if (legacySupport & EHCI_LEGSUP_BIOSOWNED)
+			if (legacySupport & EHCI_LEGSUP_BIOSOWNED) {
 				TRACE_ALWAYS("the host controller is bios owned\n");
 
-			TRACE_ALWAYS("claiming ownership of the host controller\n");
-			sPCIModule->write_pci_config(fPCIInfo->bus, fPCIInfo->device,
-				fPCIInfo->function, extendedCapPointer + 3, 1, 1);
+				TRACE_ALWAYS("claiming ownership of the host controller\n");
+				sPCIModule->write_pci_config(fPCIInfo->bus, fPCIInfo->device,
+					fPCIInfo->function, extendedCapPointer + 3, 1, 1);
+			}
 
 			for (int32 i = 0; i < 20; i++) {
 				legacySupport = sPCIModule->read_pci_config(fPCIInfo->bus,

I don't know if the way I did it is the correct way to fix it, but it worked for me.

by vegardw, 15 years ago

Attachment: echi_handover.diff added

Attaching the changes I did as a diff

comment:35 by mmlr, 15 years ago

I've applied something similar in hrev35780. It also skips the wait loop in that case, as it would be a no-op. I'm still in the "broken BIOS" department though, as indicating that we want ownership shouldn't cause this. The specs describe this as a fairly unconditional action to take when the OS intends to use the controller exclusively. Since they also state that the BIOS may only take ownership during POST and when the OS gives up control, this interpretation can work though (since the BIOS obviously didn't care during POST, and giving up ownership can never happen since we can't claim it in the first place due to this exact problem). It pretty much kills off the possibility of actively handing control of the host controller back to the BIOS, but we don't do that anyway.
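For reference, the resulting logic can be modelled in isolation like this. It is a sketch against a fake config space, not the code from hrev35780: FakeConfigSpace and ClaimIfBiosOwned are hypothetical names, and only the register layout (capability ID 0x01 in the low byte, BIOS owned at bit 16, OS owned at bit 24, hence the byte write at offset + 3) follows the EHCI specification and the diff above.

```cpp
#include <cassert>
#include <cstdint>

constexpr uint32_t kCapIdMask = 0xff;
constexpr uint32_t kCapIdLegacySupport = 0x01;
constexpr uint32_t kBiosOwnedBit = 1u << 16;
constexpr uint32_t kOsOwnedBit = 1u << 24;

// Hypothetical stand-in for PCI config space that counts writes.
struct FakeConfigSpace {
	uint8_t bytes[256] = {};
	int writes = 0;

	uint32_t Read32(unsigned offset) const
	{
		return bytes[offset] | (uint32_t(bytes[offset + 1]) << 8)
			| (uint32_t(bytes[offset + 2]) << 16)
			| (uint32_t(bytes[offset + 3]) << 24);
	}

	void Write8(unsigned offset, uint8_t value)
	{
		bytes[offset] = value;
		writes++;
	}
};

// Returns true if an ownership request was written. The register is left
// alone when the BIOS does not claim the controller to begin with.
bool
ClaimIfBiosOwned(FakeConfigSpace& config, unsigned capOffset)
{
	uint32_t legacySupport = config.Read32(capOffset);
	if ((legacySupport & kCapIdMask) != kCapIdLegacySupport)
		return false;
	if ((legacySupport & kBiosOwnedBit) == 0)
		return false;	// nothing to take over, skip write and wait loop

	config.Write8(capOffset + 3, 1);	// sets the OS owned bit (bit 24)
	return true;
}

void
ClaimExample()
{
	FakeConfigSpace notOwned;
	notOwned.bytes[0xa0] = 0x01;	// legacy support capability, BIOS bit clear
	assert(!ClaimIfBiosOwned(notOwned, 0xa0));
	assert(notOwned.writes == 0);

	FakeConfigSpace owned;
	owned.bytes[0xa0] = 0x01;
	owned.bytes[0xa2] = 0x01;	// BIOS owned (bit 16)
	assert(ClaimIfBiosOwned(owned, 0xa0));
	assert((owned.Read32(0xa0) & kOsOwnedBit) != 0);
}
```

The first case is the one from comment:34: the BIOS owned bit is clear, so no write reaches the register that was triggering the hang.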

comment:36 by mmlr, 15 years ago

Resolution: fixed
Status: in-progress → closed

This bug report is a mess. The original report concerned an interrupt storm (probably fixed), then there were reports of spurious and unacknowledged interrupts (fixed), there was a report about missing interrupts due to a bad chipset (different issue), and eventually there was the SMI storm (most probably fixed in hrev35780). I'm therefore going to close this report. Please file new reports if you still encounter any of these issues.
