Opened 11 years ago

Closed 11 years ago

#2776 closed bug (fixed)

Regression: r27752 broke booting

Reported by: luroh
Owned by: bonefish
Priority: critical
Milestone: R1/alpha1
Component: System
Version: R1/pre-alpha1
Keywords:
Cc:
Blocked By:
Blocking:
Has a Patch: no
Platform: All

Description

Since hrev27752, I get the following after the red rocket icon has lit up: "PANIC: page fault, but interrupts were disabled. Touching address 0x00000020 from eip 0x8003ac1b"

Picture and serial output from hrev27757 attached.

Attachments (6)

panic.png (48.2 KB) - added by luroh 11 years ago.
haiku-serial-port.txt (30.3 KB) - added by luroh 11 years ago.
serial_r27758.txt.zip (34.2 KB) - added by luroh 11 years ago.
serial_r27761.txt.zip (35.4 KB) - added by luroh 11 years ago.
serial_r27763.txt.zip (35.4 KB) - added by luroh 11 years ago.
team_images_1.txt (3.2 KB) - added by luroh 11 years ago.


Change History (23)

Changed 11 years ago by luroh

Attachment: panic.png added

Changed 11 years ago by luroh

Attachment: haiku-serial-port.txt added

comment:1 Changed 11 years ago by bonefish

If you build your images yourself, please update to hrev27758 and enable tracing in src/system/kernel/module.cpp (uncomment the #define TRACE_MODULE line).
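
For readers unfamiliar with this kind of switch, here is a minimal sketch of how a compile-time tracing define is typically wired up. It is not copied from module.cpp: printf stands in for the kernel's dprintf so the sketch builds in user space, and load_module_sketch is a hypothetical function used only for illustration.

```cpp
#include <cstdio>

// Uncommenting this line is what "enable tracing" amounts to in this sketch.
//#define TRACE_MODULE

#ifdef TRACE_MODULE
	// Double parentheses let the macro forward a whole printf argument list.
#	define TRACE(x) printf x
#else
	// With tracing off, the call site compiles to nothing.
#	define TRACE(x) do {} while (false)
#endif

// Hypothetical stand-in for a module loading routine.
static int load_module_sketch(const char* name)
{
	TRACE(("load_module_sketch(%s)\n", name));
	return name != nullptr ? 0 : -1;
}
```

Because the macro is resolved by the preprocessor, the image has to be rebuilt after the define is toggled.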

comment:2 Changed 11 years ago by luroh

Done, serial_r27758.txt attached.

Changed 11 years ago by luroh

Attachment: serial_r27758.txt.zip added

comment:3 Changed 11 years ago by bonefish

Thanks! I fixed a bug in hrev27760 that might be related or even cause this bug. If the problem still persists, please provide a new serial output, since I also added some more debug output.

comment:4 Changed 11 years ago by luroh

Still with us, serial_r27761.txt attached.

Changed 11 years ago by luroh

Attachment: serial_r27761.txt.zip added

comment:5 Changed 11 years ago by luroh

Summary: changed from "Regression: r27752 broke booting in VMware" to "Regression: r27752 broke booting"

FWIW, same happens on real hw as well.

comment:6 Changed 11 years ago by bonefish

I've no idea what's wrong. Everything works fine here, in VMware as well as on real hardware, gcc 2 and gcc 4. The module debug output from your syslog looks as it should.

The stack crawl suggests that gUSBStack in the USB bus manager module is NULL. According to the module debug output the initialization function that should set it is invoked, it is not uninitialized later, nor is a second image with the USB bus manager loaded. So everything should be fine.

You could try to also enable USB tracing (uncomment the #define TRACE_USB in src/add-ons/kernel/bus_managers/usb/usb_p.h). Maybe Michael has an idea.
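
A side note on why a NULL gUSBStack fits the panic message: a fault at a small address such as 0x00000020 is what one would expect when a member is read through a NULL object pointer, since the address touched is the NULL base plus the member's offset. The sketch below only illustrates that arithmetic. The layout is hypothetical (the real class in the USB bus manager will differ), and which member actually sits at offset 0x20 is not known from this ticket.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical layout, chosen only so that one member lands at offset 0x20.
struct StackSketch {
	uint8_t	padding[0x20];	// stand-in for whatever members come first
	int32_t	objectCount;	// hypothetical member at offset 0x20
};

// Stand-in for a global like gUSBStack that was never set.
static StackSketch* gStackSketch = nullptr;

// Computes the address a read of gStackSketch->objectCount would touch,
// without performing the (invalid) access itself.
static uintptr_t fault_address_for_object_count()
{
	return reinterpret_cast<uintptr_t>(gStackSketch)
		+ offsetof(StackSketch, objectCount);
}
```

With the pointer left at NULL, the computed address equals the member offset, which is why the faulting address alone hints at a NULL base pointer.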

comment:7 Changed 11 years ago by luroh

It's like déjà vu all over again. ;)

Thanks for the advice, USB tracing enabled in serial_r27763.txt.

Changed 11 years ago by luroh

Attachment: serial_r27763.txt.zip added

comment:8 Changed 11 years ago by luroh

bonefish: Reading a recent commit log of yours, I guess you might be building Haiku with -j2 or some such option, correct? Now, I'll be the first to admit that I don't know the first thing about jam, but I do know that I have had problems with concurrent jam jobs in the past, i.e., ending up with various degrees of different program behaviour. This prompted me to go back to jam -q and the occasional jam -aq. Could this be the cause of what we're seeing here? You not being able to repeat the problem, I mean.

comment:9 Changed 11 years ago by anevilyak

Milestone: changed from R1 to R1/alpha1
Priority: changed from normal to critical

Hi luroh,

Just curious, when it KDLs, can you grab the output of team_images 1 from the kernel debugger? I'm wondering if the same thing is happening that happened to me where the PCI bus manager's image was loaded twice, which resulted in a NULL ptr due to the second image not having been completely initialized. I can't test myself right now due to systems being packed for moving.
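
The double-load scenario described above can be sketched as follows. This is a simplified user-space model, not kernel code: ImageSketch and BusSketch are hypothetical names, and the only point is that each loaded image carries its own copy of the module's globals, so initializing one copy leaves the other NULL.

```cpp
// Hypothetical stand-in for the object a bus manager keeps in a global.
struct BusSketch {
	int deviceCount;
};

// Models one loaded image of a module, including its private globals.
struct ImageSketch {
	BusSketch*	gBus = nullptr;	// per-image copy of the global pointer

	// Models the module's init hook storing the global.
	void Init(BusSketch* bus) { gBus = bus; }

	// A caller that reaches an image whose init hook never ran sees NULL.
	bool IsUsable() const { return gBus != nullptr; }
};
```

If a client ends up calling into the second image, it dereferences that image's still-NULL pointer even though the first image was initialized correctly.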

comment:10 Changed 11 years ago by luroh

anevilyak: Hi! Doesn't look like it, team_images_1.txt excerpt attached.

Changed 11 years ago by luroh

Attachment: team_images_1.txt added

comment:11 Changed 11 years ago by julun

Hi,

For me it's a bit different: I don't actually get that KDL here. My .vmx file has been changed to use 2 CPUs and 512 MB RAM. Reverting that gives the KDL shown in the panic.png screenshot, but with two CPUs it simply stops, with no KDL. As for jam, I've been using it like jam -aqj 4 for months without any problems.

Karsten

comment:12 Changed 11 years ago by luroh

julun: I can confirm, numvcpus = "2" makes it hang at the red rocket, pressing F12 does nothing.

comment:13 Changed 11 years ago by julun

Hi,

I've uploaded such an unbootable image; you can get it here:

www.julun.de/haiku/haiku.tar.bz2

Karsten

comment:14 Changed 11 years ago by bonefish

Blocking: 2778 added

(In #2778) Duplicate of #2776.

comment:14 Changed 11 years ago by bonefish

Blocking: 2778 removed
Owner: changed from axeld to bonefish
Status: changed from new to assigned

My VMware is already configured for 1 CPU, but I can reproduce the problem with qemu.

comment:15 Changed 11 years ago by mmlr

There is (broken) code in the USB module to handle R5, where the module is loaded multiple times. If that were the case and one of the modules got unloaded afterwards (which never happens on R5), a crash would occur. But from the syslogs supplied here, it doesn't look like this is happening at all: the corresponding message "usb_module: uninit" is not present, and the module doesn't appear to be loaded multiple times either. I'll #ifdef that code out anyway though.
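
To illustrate the hazard in general terms: if a module can be initialized more than once, a single uninit must not tear down state that another load still relies on. The sketch below shows one common guard, a load counter. It is not the USB module's actual code, which this ticket does not show, and all names in it are hypothetical.

```cpp
// Hypothetical shared state, standing in for something like the USB stack.
struct SharedStackSketch {
	int value = 42;
};

static SharedStackSketch* gSharedStack = nullptr;
static int gLoadCount = 0;

// Called once per load; only the first load creates the shared state.
static void module_init_sketch()
{
	if (gLoadCount++ == 0)
		gSharedStack = new SharedStackSketch;
}

// Called once per unload; only the last unload destroys the shared state.
static void module_uninit_sketch()
{
	if (gLoadCount > 0 && --gLoadCount == 0) {
		delete gSharedStack;
		gSharedStack = nullptr;
	}
}
```

Without the counter, the first unload would delete the object while a remaining load still held a pointer to it, which is the kind of crash described above.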

comment:16 Changed 11 years ago by bonefish

Resolution: fixed
Status: changed from assigned to closed

Fixed in hrev27767.
