Opened 4 years ago
Last modified 2 years ago
#16546 assigned bug
KDL, laptop no longer boots
Reported by: | luroh | Owned by: | korli |
---|---|---|---|
Priority: | normal | Milestone: | Unscheduled |
Component: | System/Kernel | Version: | R1/Development |
Keywords: | boot-failure | Cc: | |
Blocked By: | | Blocking: | |
Platform: | x86 |
Description
hrev54600, gcc2h
Laptop with i7-7500U no longer boots, this regressed somewhere between R1/beta2 and now. Sometimes it drops me into KDL, sometimes it just shows the blue background with the mouse pointer stuck in the center.
Attachments (4)
Change History (30)
by , 4 years ago
Attachment: | hrev54600.png added |
---|
comment:1 by , 4 years ago
Milestone: | Unscheduled → R1/beta3 |
---|---|
Priority: | normal → high |
comment:2 by , 4 years ago
comment:3 by , 4 years ago
Can you type in KDL, for instance "syslog"? There should be a line "using Intel C-states".
by , 4 years ago
Attachment: | hrev54605_syslog1.png added |
---|
by , 4 years ago
Attachment: | hrev54605_syslog2.png added |
---|
comment:5 by , 4 years ago
Thanks, I can't find a reason for this to happen. We could eventually add an assert.
comment:6 by , 4 years ago
Sure, no hurry. I can build straight to disk on this machine so turnaround time for testing is short, should you get any ideas.
comment:7 by , 4 years ago
Hi,
Please try hrev54870. The error message should be different and will give us more information about where the problem could be.
comment:9 by , 4 years ago
timeStep is either set to BASE_TIME_STEP (500) or BASE_TIME_STEP / 4 (125), and these are clearly non-negative.
So something is probably producing a negative result, but it's not obvious how that could happen. The other value involved is the delta between two successive current_time calls, which could only go wrong if time went backwards.
Or it could be that the number of CPUs changed and we are accessing the array out of range.
I don't see any reason one of these would happen, so I just logged all the values involved in the computation. We can then move our attention to the one that's not behaving as expected and investigate further.
by , 4 years ago
Attachment: | hrev54896.png added |
---|
comment:11 by , 4 years ago
So it's the idleTime being negative.
It's computed this way:
bigtime_t start = system_time();
// go in suspend mode and wait until we need to wakeup...
bigtime_t delta = system_time() - start;
idleTime = (idleTime + delta) / 2;
The only thing that I can imagine going wrong here is if system_time() somehow goes back in time?
Its implementation is based on rdtsc multiplied with a conversion factor to get microseconds. We are sure that the two calls to it will be run on the same CPU here so that shouldn't be a problem with de-synchronized TSC between two CPU cores.
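To make the "rdtsc multiplied with a conversion factor" idea concrete, here is a minimal sketch of such a conversion. It is not the actual Haiku implementation: the helper names, the 32.32 fixed-point layout, and the 2.9 GHz figure in the usage note are assumptions chosen for illustration only.

```cpp
#include <cstdint>

// Hypothetical helper (not a Haiku function): build a 32.32 fixed-point
// factor that turns TSC ticks into microseconds. tscHz is the measured TSC
// frequency in ticks per second and must be non-zero.
inline uint64_t
make_conversion_factor(uint64_t tscHz)
{
	// factor = 2^32 * 1000000 / tscHz
	return (uint64_t(1000000) << 32) / tscHz;
}

// Hypothetical helper: convert a raw tick count to microseconds, computing
// ticks * factor / 2^32 without needing a 128-bit intermediate.
inline int64_t
ticks_to_microseconds(uint64_t ticks, uint64_t factor)
{
	uint64_t high = ticks >> 32;
	uint64_t low = ticks & 0xffffffffu;
	// (high * 2^32 + low) * factor / 2^32 == high * factor + (low * factor) / 2^32
	return int64_t(high * factor + ((low * factor) >> 32));
}
```

With a factor derived from the true frequency, 2.9e9 ticks on an assumed 2.9 GHz TSC come out at roughly one second. If calibration had measured only half the real frequency, the same tick count would be reported as roughly two seconds, so every timeout derived from it would be off by the same ratio.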
The idle time value converted to hex: 0xc00bb0b000000001. Not sure what to make of that.
We could ignore the delta values that we find to be negative, but is that the proper fix, or is there some deeper problem at play here? Could it be a problem with the conversion factor used by system_time? It's computed by matching the rdtsc changes with the PC programmable timer, and the code in the bootloader looks like it can fail silently if it doesn't manage to compute a stable value after 20 tries. It will still give a "best guess", but that guess could be completely wrong, and in particular it could result in overflow of the system_time computations.
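As a rough illustration of the "ignore negative deltas" option, the sketch below wraps the averaging step from this comment in a guard. The function name update_idle_time and the bigtime_t typedef are stand-ins made up for this example, not kernel code, and whether dropping such samples is the right fix remains the open question raised above.

```cpp
#include <cstdint>

typedef int64_t bigtime_t;	// stand-in for Haiku's bigtime_t (microseconds)

// The value logged in the ticket. Its top bit is set, so it is negative
// when read as a signed 64-bit integer (assuming the usual two's complement
// conversion).
const bigtime_t kLoggedIdleTime = bigtime_t(0xc00bb0b000000001ULL);

// Hypothetical helper: fold a new idle measurement into the running
// average, skipping samples where the clock appears to have gone backwards.
inline bigtime_t
update_idle_time(bigtime_t idleTime, bigtime_t start, bigtime_t end)
{
	bigtime_t delta = end - start;
	if (delta < 0)
		return idleTime;	// implausible sample, keep the previous average
	return (idleTime + delta) / 2;
}
```

For example, update_idle_time(100, 1000, 1300) yields 200, while a backwards step such as update_idle_time(100, 1300, 1000) leaves the average at 100. A guard like this only hides the symptom; it says nothing about why system_time() would misbehave.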
comment:12 by , 4 years ago
hrev54937, gcc2h
No KDL, just blue background with mouse pointer stuck in the middle, no desktop.
follow-up: 14 comment:13 by , 4 years ago
No KDL, just blue background with mouse pointer stuck in the middle, no desktop.
Can you enter KDL by keyboard (Ctrl+Alt+SysRq+D)? If you can, please type teams, press the Enter key, and take a photo of the screen.
comment:14 by , 4 years ago
Can you enter KDL by keyboard (Ctrl+Alt+SysRq+D)?
If someone could provide a patch to reduce it to Ctrl+Alt+D, perhaps.
According to the manual, Fn+S should emulate SysRq but it doesn't work (horrible Lenovo keyboard).
follow-up: 16 comment:15 by , 4 years ago
If someone could provide a patch to reduce it to Ctrl+Alt+D, perhaps.
Print screen key should work as SysRq.
follow-up: 18 comment:17 by , 4 years ago
Did you try both the 32-bit and 64-bit versions of Haiku? The system_time implementation is a bit different between them; if one works but not the other, that would be a likely place to check.
comment:18 by , 4 years ago
Did you try both 32 and 64bit versions of Haiku?
Can't remember but I'll give it a try, good idea.
comment:19 by , 4 years ago
Yes, 64-bit works. Come to think of it, it may very well have been the case that gcc2h never worked on this machine, sorry about that.
comment:20 by , 4 years ago
Platform: | All → x86 |
---|
comment:21 by , 4 years ago
So I suspect something is not working as expected with the code to compute the conversion factor for system time: https://git.haiku-os.org/haiku/tree/src/system/boot/arch/x86/arch_cpu.cpp
Can you check this?
From the bootloader menu, go in debug options -> display current bootloader log.
See if one of these log lines is visible:
"needed %" B_PRIu32 " quick samples for TSC calibration\n"
"needed %" B_PRIu32 " slow samples for TSC calibration\n"
If one of these is 20 or larger, it means we didn't manage to properly find the timer frequency. As a result, everything involving system_time would be broken, including anything that tries to sleep for some number of microseconds.
If that's the case, the behavior could be different between EFI and BIOS booting, since different timers are used in each case.
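The reason a sample count of 20 is suspicious can be sketched with a generic retry loop. This is not the code from arch_cpu.cpp: the CalibrationResult struct, the calibrate function, the tolerance parameter, and the "two consecutive samples agree" rule are all assumptions for illustration. Only the limit of 20 tries comes from this ticket.

```cpp
#include <cstdint>

const uint32_t kMaxSamples = 20;	// retry limit mentioned in this ticket

struct CalibrationResult {
	uint64_t	frequency;		// best estimate, in ticks per second
	uint32_t	samplesUsed;	// how many samples were taken
	bool		stable;			// false when the retry limit was hit
};

// Hypothetical calibration loop. sample(i) stands in for one measurement of
// the TSC frequency against a reference timer. The loop stops as soon as two
// consecutive samples agree within the given tolerance.
template<typename SampleFunc>
CalibrationResult
calibrate(SampleFunc sample, uint64_t tolerance)
{
	uint64_t previous = sample(0);
	for (uint32_t i = 1; i < kMaxSamples; i++) {
		uint64_t current = sample(i);
		uint64_t difference = current > previous
			? current - previous : previous - current;
		if (difference <= tolerance)
			return { (current + previous) / 2, i + 1, true };
		previous = current;
	}
	// No two consecutive samples agreed: fall back to the last measurement
	// as a "best guess" and report that it is not trustworthy.
	return { previous, kMaxSamples, false };
}
```

In this sketch, a run that ends with samplesUsed equal to 20 means the loop never converged and the returned frequency is only a guess, which is the situation the bootloader log would reveal.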
comment:24 by , 3 years ago
Keywords: | boot-failure added |
---|
comment:26 by , 2 years ago
Milestone: | R1/beta4 → Unscheduled |
---|---|
Priority: | high → normal |
No reply, bumping out of the milestone.
Apologies, R1/beta2 doesn't fully boot either, it hangs on the blue background with an immobile mouse pointer in the center of the screen. No KDL (tried 10 restarts).
This laptop has booted in the past but I am not sure when. Maybe something else changed at some point, it has received a few BIOS updates in the past year or so.
I guess the good news is that I now at least get dropped to KDL 50% of the time as opposed to just being stuck with a blue background.