Opened 16 years ago

Closed 16 years ago

Last modified 16 years ago

#2361 closed bug (duplicate)

Hard lockup during boot since r25836

Reported by: drackham
Owned by: mmlr
Priority: normal
Milestone: R1
Component: - General
Version: R1/pre-alpha1
Keywords:
Cc:
Blocked By:
Blocking:
Platform: All

Description

Starting with hrev25836 (default support for booting from USB) my machine doesn't boot off an IDE drive past the 3rd icon. Serial debug output stops after:

ahci: ahci_supports_device
usb_uhci: no devices found

At this point I can't even get into KDL by hitting F12 on my ps2 keyboard. Disabling USB in the BIOS or reverting to hrev25835 avoids this problem.

This machine only has ehci and ohci USB support.

Complete serial debug log attached.

Attachments (1)

serial_debug.txt (23.2 KB ) - added by drackham 16 years ago.
Serial debug output


Change History (14)

by drackham, 16 years ago

Attachment: serial_debug.txt added

Serial debug output

comment:1 by mmlr, 16 years ago

Owner: changed from axeld to mmlr
Status: new → assigned

I assume that there's an interrupt flood when initializing OHCI.

comment:2 by mmlr, 16 years ago

Could you enable tracing in OHCI by adding a #define TRACE_USB at the top of src/add-ons/kernel/busses/usb/ohci.cpp and check with enabled on-screen debug output where this ends up? I suspect that it is a BIOS handover issue, but as I don't have any on-board OHCI controllers I cannot test this.
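For readers unfamiliar with this kind of compile-time switch, here is a minimal sketch of how a TRACE_USB define can gate trace output. The double-parenthesis call style quoted later in this ticket suggests a macro of roughly this shape, but the exact definition in the Haiku tree is an assumption here, and debug_printf, gDebugLog and search_devices are hypothetical stand-ins so the example builds outside a kernel.

```cpp
#include <cassert>
#include <cstdarg>
#include <cstdio>
#include <string>

// Hypothetical stand-in for the kernel's dprintf(): it appends to a string
// so the effect of the macro can be inspected in user space.
std::string gDebugLog;

void debug_printf(const char* format, ...)
{
    assert(format != nullptr);
    char buffer[256];
    va_list args;
    va_start(args, format);
    vsnprintf(buffer, sizeof(buffer), format, args);
    va_end(args);
    gDebugLog += buffer;
}

#define TRACE_USB  // remove this line to compile all trace calls away

#ifdef TRACE_USB
#   define TRACE(x) debug_printf x
#else
#   define TRACE(x) /* expands to nothing */
#endif

// Hypothetical driver function that only exists to exercise the macro.
void search_devices()
{
    // The inner parentheses pass the whole argument list as one macro argument.
    TRACE(("usb_ohci: searching devices\n"));
    TRACE(("usb_ohci: found device at IRQ %d\n", 3));
}
```

With the define present, calling search_devices() appends both lines to gDebugLog; without it, the calls expand to empty statements and cost nothing at run time.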

comment:3 by drackham, 16 years ago

With TRACE_USB the debug ends up with:

. . .
ahci: ahci_supports_device
usb_uhci: no devices found
usb_ohci_module: init module
usb_ohci: searching devices
usb_ohci: found device at IRQ 3
usb_ohci: constructing new OHCI Host Controller Driver
usb_ohci: iospace offset: 0xfdfff000
usb_ohci: mapped operational registers: 0x80157000
usb_ohci: version 1.0, legacy support
usb_ohci: SMM is in control of the host controller

comment:4 by mmlr, 16 years ago

Could you please try with hrev25915. If it doesn't work I'd be interested in the trace output again to see if the handover from SMM actually completes.

comment:5 by drackham, 16 years ago

Tested with hrev25915 and hrev25924. Bug still occurs. The *only* diff from the original trace output is the capitalization of the very last line:

usb_ohci: smm is in control of the host controller

comment:6 by mmlr, 16 years ago

Ah great, crappy hardware or crappy BIOS. Could you please add the following lines to ohci.cpp just after the TRACE(("usb_ohci: smm is in control of the host controller\n"));:

dprintf("usb_ohci: interrupt enable is 0x%08lx\n", _ReadReg(OHCI_INTERRUPT_ENABLE));
dprintf("usb_ohci: command status is 0x%08lx\n", _ReadReg(OHCI_COMMAND_STATUS));
dprintf("usb_ohci: control is 0x%08lx\n", _ReadReg(OHCI_CONTROL));

And then post here what this outputs?

comment:7 by drackham, 16 years ago

Tested with hrev25926 + requested dprintfs. Debug output ends with:

usb_ohci_module: init module
usb_ohci: searching devices
usb_ohci: found device at IRQ 3
usb_ohci: constructing new OHCI Host Controller Driver
usb_ohci: iospace offset: 0xfdfff000
usb_ohci: mapped operational registers: 0x8015b000
usb_ohci: version 1.0, legacy support
usb_ohci: smm is in control of the host controller
usb_ohci: interrupt enable is 0xc0000042
usb_ohci: command status is 0x00000000
usb_ohci: control is 0x00000184
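As a reading aid, the two non-zero values above can be decoded against the register layout in the OHCI 1.0a specification. The sketch below is not driver code; the function names and the short flag labels are chosen for this example only, and only the bit positions are taken from the specification.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Names the bits set in an HcInterruptEnable value (bit positions per OHCI 1.0a).
std::string DecodeInterruptEnable(uint32_t value)
{
    struct Flag { uint32_t mask; const char* name; };
    const Flag flags[] = {
        { 1u << 0, "SO" },    // scheduling overrun
        { 1u << 1, "WDH" },   // writeback done head
        { 1u << 2, "SF" },    // start of frame
        { 1u << 3, "RD" },    // resume detected
        { 1u << 4, "UE" },    // unrecoverable error
        { 1u << 5, "FNO" },   // frame number overflow
        { 1u << 6, "RHSC" },  // root hub status change
        { 1u << 30, "OC" },   // ownership change
        { 1u << 31, "MIE" }   // master interrupt enable
    };

    std::string result;
    for (const Flag& flag : flags) {
        if ((value & flag.mask) != 0) {
            if (!result.empty())
                result += ' ';
            result += flag.name;
        }
    }
    return result;
}

// Names the functional state and the flag bits set in an HcControl value.
std::string DecodeControl(uint32_t value)
{
    const char* const states[] = { "reset", "resume", "operational", "suspend" };

    std::string result = "HCFS=";
    result += states[(value >> 6) & 0x3];
    if (value & (1u << 2)) result += " PLE";   // periodic list enable
    if (value & (1u << 3)) result += " IE";    // isochronous enable
    if (value & (1u << 4)) result += " CLE";   // control list enable
    if (value & (1u << 5)) result += " BLE";   // bulk list enable
    if (value & (1u << 8)) result += " IR";    // interrupt routing (SMM owns the controller)
    if (value & (1u << 9)) result += " RWC";   // remote wakeup connected
    if (value & (1u << 10)) result += " RWE";  // remote wakeup enable
    return result;
}
```

DecodeInterruptEnable(0xc0000042) yields "WDH RHSC OC MIE" and DecodeControl(0x00000184) yields "HCFS=operational PLE IR". In other words, the controller is operational and still routed to SMM, with the master enable and the root hub status change interrupt already switched on.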

comment:8 by mmlr, 16 years ago

Ok, that's interesting. I suppose this is due to the roothub status change interrupt being enabled that fires as soon as the handover completes (without an installed handler). Could you please try adding the following in ohci.cpp after where you put the dprintfs:

cpu_status former = disable_interrupts();

And then directly after the for loop:

_WriteReg(OHCI_INTERRUPT_DISABLE, OHCI_ALL_INTERRUPTS);
_WriteReg(OHCI_INTERRUPT_STATUS, OHCI_ALL_INTERRUPTS);
restore_interrupts(former);

This should stop the interrupts from taking down the system before they can be disabled. In any case thanks for your patience and testing!
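The intent of that change can be sketched against a mock controller. Everything below is a simplified user-space model, not the code in ohci.cpp: MockController, the k-prefixed constants, the polling loop without a delay, and the DisableCpuInterrupts/RestoreCpuInterrupts helpers (standing in for disable_interrupts() and restore_interrupts()) are all assumptions made for illustration. Only the bit positions follow the OHCI 1.0a register layout.

```cpp
#include <cassert>
#include <cstdint>

const uint32_t kInterruptRouting = 1u << 8;         // HcControl.IR
const uint32_t kOwnershipChangeRequest = 1u << 3;   // HcCommandStatus.OCR
const uint32_t kRootHubStatusChange = 1u << 6;      // RHSC interrupt bit
const uint32_t kMasterInterruptEnable = 1u << 31;   // MIE
const uint32_t kAllInterrupts = 0xc000007f;         // every defined interrupt bit

// Mock of the register file, seeded with the values reported in comment:7.
struct MockController {
    uint32_t control = 0x00000184;
    uint32_t commandStatus = 0;
    uint32_t interruptEnable = 0xc0000042;
    uint32_t interruptStatus = 0;
    int pollsUntilHandover = 3;  // hypothetical firmware latency

    uint32_t ReadControl()
    {
        // Model the firmware releasing the controller a few polls after the
        // request, leaving a root hub status change pending.
        if ((commandStatus & kOwnershipChangeRequest) != 0
            && --pollsUntilHandover <= 0) {
            control &= ~kInterruptRouting;
            commandStatus &= ~kOwnershipChangeRequest;
            interruptStatus |= kRootHubStatusChange;
        }
        return control;
    }

    // Writing 1 bits to the disable register clears the matching enable bits.
    void WriteInterruptDisable(uint32_t mask) { interruptEnable &= ~mask; }
    // The status register is write-1-to-clear.
    void WriteInterruptStatus(uint32_t mask) { interruptStatus &= ~mask; }

    bool InterruptAsserted() const
    {
        return (interruptEnable & kMasterInterruptEnable) != 0
            && (interruptEnable & interruptStatus) != 0;
    }
};

bool gCpuInterruptsEnabled = true;

bool DisableCpuInterrupts()
{
    bool former = gCpuInterruptsEnabled;
    gCpuInterruptsEnabled = false;
    return former;
}

void RestoreCpuInterrupts(bool former) { gCpuInterruptsEnabled = former; }

// Requests ownership with CPU interrupts off, then masks and acknowledges all
// controller interrupts before the CPU can see them again.
bool TakeOwnership(MockController& controller)
{
    bool former = DisableCpuInterrupts();
    controller.commandStatus |= kOwnershipChangeRequest;

    bool released = false;
    for (int i = 0; i < 100 && !released; i++)
        released = (controller.ReadControl() & kInterruptRouting) == 0;

    controller.WriteInterruptDisable(kAllInterrupts);
    controller.WriteInterruptStatus(kAllInterrupts);
    RestoreCpuInterrupts(former);
    return released;
}
```

In this model the pending root hub status change is cleared while the CPU still has interrupts off, so InterruptAsserted() is false by the time interrupts are restored and no handler-less interrupt can fire.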

comment:9 by drackham, 16 years ago

I tried inserting the code you suggested but it didn't seem to make a difference. I'm now thinking I may be experiencing a deeper problem.

It seems that the call to snooze(1000) on line 211 of ohci.cpp never returns. I reverted to the repo version, and replaced the snooze(1000) with a spin(1000). With this change the trace gets as far as the "usb_ohci: ownership change successful" but then never returns from snooze(USB_DELAY_BUS_RESET) on line 234. In fact, even a snooze(1) inserted at the top of OHCI::AddTo() never returns.

I'm guessing this must be the same bug as #2335 which I also experience on this machine. It's a Gateway with an MS-7173 motherboard, single P4 processor, 1G RAM, Phoenix Award BIOS v6.00PG. It runs Linux and Windows XP well.

I'm happy to offer any help that I can. Once I finally get this machine to boot I hope to do more. Thank *you* Michael for all your patience and coding!

comment:10 by dustin howett, 16 years ago

This happens to me (it stops in the exact same place), but only after my recent systimer modifications.

comment:11 by mmlr, 16 years ago

Ah ok, so it's one of those broken ATI boards then. The problem is simply that timer interrupts don't work as intended and therefore the timer in thread_block_with_timeout_locked() in src/system/kernel/thread.cpp never gets fired. As the OHCI init now already happens in main2, it probably blocks main2 forever, which stops the complete boot process. In any case this and probably #2335 too are caused by #2342. Will close this as a duplicate of #2342 therefore.
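Why spin() got further than snooze() in comment:9 can be illustrated with a toy model. This is not kernel code: SimulatedMachine, Spin and Snooze are hypothetical names, time is a plain counter, and the give-up limit exists only so the example terminates where the real system would hang.

```cpp
#include <cassert>
#include <cstdint>

struct SimulatedMachine {
    uint64_t now;              // simulated time in microseconds
    bool timerInterruptsWork;  // false models the faulty board

    explicit SimulatedMachine(bool timersWork)
        : now(0), timerInterruptsWork(timersWork) {}
};

// Busy-wait: only needs the time counter to advance, no interrupt involved.
bool Spin(SimulatedMachine& machine, uint64_t microseconds)
{
    uint64_t deadline = machine.now + microseconds;
    while (machine.now < deadline)
        machine.now++;
    return true;
}

// Blocking wait: the thread is woken only when the timer interrupt fires at
// the deadline. Returns false if no wakeup arrived within giveUpAfter.
bool Snooze(SimulatedMachine& machine, uint64_t microseconds,
    uint64_t giveUpAfter)
{
    assert(giveUpAfter > microseconds);
    uint64_t deadline = machine.now + microseconds;
    uint64_t limit = machine.now + giveUpAfter;

    bool woken = false;
    while (!woken && machine.now < limit) {
        machine.now++;
        if (machine.now >= deadline && machine.timerInterruptsWork)
            woken = true;
    }
    return woken;
}
```

With timerInterruptsWork set to false, Spin() still returns while Snooze() never sees its wakeup. That matches the observation that replacing snooze(1000) with spin(1000) only moved the hang to the next snooze() call.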

comment:12 by mmlr, 16 years ago

Resolution: duplicate
Status: assigned → closed

comment:13 by dustin howett, 16 years ago

I have an nForce board. Then again, it's possible that it's my timer changes that cause it, since it only happens on my builds.
