Opened 9 years ago

Closed 9 years ago

#5861 closed bug (fixed)

r36542 Kernel Panics on Boot

Reported by: mbrumbelow
Owned by: bonefish
Priority: critical
Milestone: R1/alpha2
Component: Drivers/ACPI
Version: R1/Development
Keywords:
Cc: fredrik.holmqvist@…
Blocked By:
Blocking:
Has a Patch: no
Platform: x86

Description (last modified by tqh)

hrev36542 kernel panics on boot on DELL Inspiron and HP laptops. It works fine on a desktop system. Screenshots from the two laptops are attached.

Attachments (4)

IMG_0707.jpg (437.8 KB) - added by mbrumbelow 9 years ago.
Screenshot of HP laptop
IMG_0711.jpg (459.2 KB) - added by mbrumbelow 9 years ago.
Screenshot of DELL laptop
IMG_0727.jpg (473.3 KB) - added by mbrumbelow 9 years ago.
Still panics on DELL with trunk build hrev36564, included "threads" command as requested.
IMG_0734.jpg (458.5 KB) - added by mbrumbelow 9 years ago.
Still panics on HP with trunk build hrev36564, included "threads" command as requested.


Change History (22)

Changed 9 years ago by mbrumbelow

Attachment: IMG_0707.jpg added

Screenshot of HP laptop

Changed 9 years ago by mbrumbelow

Attachment: IMG_0711.jpg added

Screenshot of DELL laptop

comment:1 Changed 9 years ago by tqh

Cc: fredrik.holmqvist@… added
Description: modified (diff)

These look like two different issues; the HP one is ACPI-related.

comment:2 Changed 9 years ago by tqh

Description: modified (diff)

Oops, wrote in wrong area. Additional info:

I'm not so sure about the Dell one. Could you try disabling ACPI at boot? (See http://www.haiku-os.org/docs/userguide/en/bootloader.html )

comment:3 Changed 9 years ago by mbrumbelow

Both boot with ACPI turned off.

comment:4 Changed 9 years ago by tqh

It looks like it panics while initializing the embedded controller: it is waiting for a signal and there is nothing in the run queue. Should it really panic? You should be able to type "continue" in these KDL sessions.

comment:5 Changed 9 years ago by mbrumbelow

Typing "continue" keeps me in KDL on both machines.

comment:6 Changed 9 years ago by tqh

According to deadyak, we should have an idle thread to switch to, but that doesn't seem to be the case.

comment:7 Changed 9 years ago by tqh

Are you building for trunk or are these alpha2 images?

comment:8 Changed 9 years ago by mbrumbelow

They are alpha2 images.

comment:9 Changed 9 years ago by tqh

Owner: changed from nobody to bonefish
Status: new → assigned

Ingo, from what I can tell it should use reschedule_no_op, not reschedule. And the idle_threads should already exist. Do you have any idea what might be going on?

I think we could make acpi_embedded_controller use polling this early if needed, but I'd like to know what is going on. (See http://dev.haiku-os.org/browser/haiku/trunk/src/add-ons/kernel/bus_managers/acpi/acpi_embedded_controller.cpp#L674 )

comment:10 in reply to: 9; Changed 9 years ago by bonefish

Replying to tqh:

Ingo, from what I can tell it should use reschedule_no_op, not reschedule. And the idle_threads should already exist. Do you have any idea what might be going on?

The early kernel initialization is basically single threaded and performed by what will later become "idle thread 1" (there's a very small amount of per-CPU initialization happening on off-boot CPUs, too). Interrupts are disabled, which imposes several restrictions on the code executed. E.g. locking primitives can be used as long as it is guaranteed that they do not wait. Mutexes, recursive_locks, and rw_locks are fine, since no other thread has been running at that point. Anything that causes the idle thread to wait (snooze(), not yet triggered condition variables) is a problem though. My guess is that the ACPI code is doing something like this.

I'll add a panic() in scheduler_reschedule_no_op(), triggered when the calling thread is not continuing to run. It will fail early and thus help to identify the perpetrator.

In the meantime the output of the KDL command "threads" would be helpful.
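As an illustration of the fail-early check described above, here is a minimal, self-contained sketch. It is not the code from the Haiku tree: the `Thread` struct, the `ThreadState` enum, and the exception-throwing `panic()` stand-in are all hypothetical simplifications, assumed only for this example.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical, simplified model; these are not the Haiku kernel's types.
enum class ThreadState { Ready, Waiting };

struct Thread {
    std::string name;
    ThreadState next_state;  // what the thread wants to be after rescheduling
};

// Stand-in for the kernel's panic(): in this sketch it simply throws.
[[noreturn]] inline void panic(const std::string& message)
{
    throw std::runtime_error(message);
}

// Before the real scheduler runs there is no other thread to switch to, so
// "rescheduling" can only mean "keep running the caller". If the caller asked
// to wait instead, report it immediately rather than failing somewhere later.
inline void reschedule_no_op(const Thread& current)
{
    if (current.next_state != ThreadState::Ready)
        panic("reschedule_no_op() called by non-ready thread: " + current.name);
}
```

With a check like this, the stack trace at the panic points at the code that tried to wait, which is the "perpetrator" the comment refers to.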

comment:11 in reply to: 10; Changed 9 years ago by bonefish

Replying to bonefish:

I'll add a panic() in scheduler_reschedule_no_op(), triggered when the calling thread is not continuing to run. It will fail early and thus help to identify the perpetrator.

Done in hrev36560 (trunk).

comment:12 Changed 9 years ago by tqh

Added some code in hrev36565 to let acpi_embedded_controller poll instead of waiting during bootup. I think it should help. It is only available in builds from trunk at the moment.
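For readers unfamiliar with the idea, polling here means repeatedly reading a status value until it changes, instead of blocking until an interrupt or signal arrives. The sketch below is only an illustration under stated assumptions: `poll_until_set`, `WaitResult`, the callback-based status read, and the attempt limit are hypothetical names invented for this example and are not taken from acpi_embedded_controller.cpp.

```cpp
#include <cstdint>
#include <functional>

enum class WaitResult { Ready, TimedOut };

// Hypothetical helper: read the status repeatedly until any bit in `mask`
// is set, giving up after `max_attempts` reads. Nothing here blocks or
// sleeps, so the calling thread never has to be descheduled.
inline WaitResult poll_until_set(const std::function<std::uint8_t()>& read_status,
    std::uint8_t mask, int max_attempts)
{
    for (int attempt = 0; attempt < max_attempts; ++attempt) {
        if ((read_status() & mask) != 0)
            return WaitResult::Ready;
    }
    return WaitResult::TimedOut;
}
```

The bounded attempt count is a design choice of this sketch: it keeps a device that never responds from hanging the boot forever.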

Changed 9 years ago by mbrumbelow

Attachment: IMG_0727.jpg added

Still panics on DELL with trunk build hrev36564, included "threads" command as requested.

comment:13 Changed 9 years ago by tqh

Ah, I think that image will help a lot. Does it work on the other machine now?

I guess we are not allowed to snooze during boot. I think we can switch to use a spinlock while booting. Similar to what is done in hrev36565 for the embedded controller.

comment:14 in reply to: 13; Changed 9 years ago by bonefish

Replying to tqh:

Ah, I think that image will help a lot. Does it work on the other machine now?

I guess we are not allowed to snooze during boot. I think we can switch to use a spinlock while booting. Similar to what is done in hrev36565 for the embedded controller.

Yep, you can spin() instead of snooze().
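To make the difference concrete: snoozing asks the scheduler to put the thread to sleep, while spinning burns CPU time in a loop until the delay has passed. Below is a minimal user-space sketch of the spinning approach. `spin_for` is a hypothetical name for this example, and the use of `std::chrono::steady_clock` is an assumption made so the code runs anywhere; it is not the kernel's spin() implementation.

```cpp
#include <chrono>

// Busy-wait for at least `duration` without ever blocking or yielding.
// Returns the time actually spent, rounded down to whole microseconds.
inline std::chrono::microseconds spin_for(std::chrono::microseconds duration)
{
    using clock = std::chrono::steady_clock;
    const auto start = clock::now();
    while (clock::now() - start < duration) {
        // Intentionally empty: keep the CPU busy instead of sleeping.
    }
    return std::chrono::duration_cast<std::chrono::microseconds>(
        clock::now() - start);
}
```

Because the loop never hands control to a scheduler, a delay like this can be used where no other thread is available to run, at the cost of wasting CPU time for the whole duration.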

Changed 9 years ago by mbrumbelow

Attachment: IMG_0734.jpg added

Still panics on HP with trunk build hrev36564, included "threads" command as requested.

comment:15 Changed 9 years ago by mbrumbelow

The good news is that, unlike before, the stack traces are now consistent.

comment:16 Changed 9 years ago by tqh

In hrev36569 I'm trying to avoid using snooze and to use spinlocks instead.

comment:17 Changed 9 years ago by mbrumbelow

Updated build to hrev36571 and it works on all machines. Syslog is clean, no error or warning conditions. Please close.

comment:18 Changed 9 years ago by tqh

Resolution: fixed
Status: assigned → closed

I meant to say spin, not spinlocks; sorry for the confusion. It is not a synchronization primitive. mbrumbelow, thanks for the fast response.
