Opened 11 years ago
Closed 11 years ago
#10063 closed bug (fixed)
Long blank screen (~17s) before Haiku boot screen
Reported by: | Kev | Owned by: | axeld |
---|---|---|---|
Priority: | normal | Milestone: | R1 |
Component: | System/Boot Loader | Version: | R1/Development |
Keywords: | Cc: | ||
Blocked By: | Blocking: | ||
Platform: | x86 |
Description
Not sure if it's PM-related, but builds from a few months ago went almost immediately to the Haiku boot screen with the icons in the centre, whereas hrev46164 has a 17-second delay at a blank screen before the screen with the icons. Since that's longer than it used to take to boot in total, I almost thought the computer had frozen--during this time, the numlock light would not toggle when numlock was pressed on the keyboard.
Is this something that will be optimized? If not, maybe some kind of placeholder screen instead of a blank screen, or better yet, some kind of indication as to what it's doing during this?
Attachments (8)
Change History (46)
comment:1 by , 11 years ago
Component: | System → System/Boot Loader |
---|---|
Owner: | changed from | to
Type: | enhancement → bug |
Version: | R1/Package Management → R1/Development |
comment:2 by , 11 years ago
Booting off an internal hard disk with 4 BFS partitions, one ext2, and one Linux LVM, using Haiku's BootManager.
by , 11 years ago
Attachment: | eMachines T3104 specs added |
---|
by , 11 years ago
Attachment: | listdev.txt added |
---|
by , 11 years ago
Attachment: | listimage-grep-drivers.txt added |
---|
by , 11 years ago
Attachment: | listusb.txt added |
---|
follow-up: 7 comment:3 by , 11 years ago
Pure curiosity, is there any difference if you try booting with that card reader unplugged?
comment:4 by , 11 years ago
I have a same problem with my Acer W500, but delay is about 5 seconds. I was happend only in package management version. After PM-branch merge it always happens. In old Haiku splash screen and first icon are displayed immediately.
System boot from internal SDD. No external hardware plugged in.
follow-up: 6 comment:5 by , 11 years ago
Interestingly, I can reproduce it with vbox 4.1.8, but can't with the 4.2.18.
follow-up: 8 comment:6 by , 11 years ago
Replying to diver:
Interestingly, I can reproduce it with vbox 4.1.8, but can't with the 4.2.18.
Could you provide syslogs from both. Maybe there's some obvious difference.
@Kev: Unfortunately your syslog wrapped at the end of the boot process and the new one only shows the part after the interesting output.
comment:7 by , 11 years ago
Replying to anevilyak:
Pure curiosity, is there any difference if you try booting with that card reader unplugged?
It's built-in. There might be an internal cable, did you want me to open it up? Or just disable it in the BIOS?
comment:8 by , 11 years ago
Replying to bonefish:
@Kev: Unfortunately your syslog wrapped at the end of the boot process and the new one only shows the part after the interesting output.
Oh, I didn't know it could do that. Should I delete the syslog and reboot? Or how does it work?
comment:9 by , 11 years ago
When the syslog file has reached the configured maximum size it is renamed to "syslog.old" and a new "syslog" file is started. So, when in doubt, attaching both files (or concatenating them) is usually a good idea. But yes, removing the syslog file and rebooting should result in a complete one.
comment:11 by , 11 years ago
I can't spot anything suspicious in the syslog. Unfortunately there aren't timestamps for all syslog entries. In fact only packagefs and package_daemon have any. Checking those:
KERN: packagefs [37985322: 15] Failed to open packages activation file: No such file or directory [...] KERN: packagefs [40359443: 15] bytes saved: 518900
There's where the packagefs volumes are mounted. Takes about 2.4 s in total, which is OK. That's after the fourth icon on the boot screen has been highlighted, but before the fifth. The next entries with timestamps are from the package daemon:
KERN: package_daemon [44966964: 352] root at "/boot" (device: 3, node: 524288) registered KERN: package_daemon [44977559: 352] volume at "/boot/system" registered KERN: package_daemon [44991094: 352] volume at "/boot/common" registered KERN: package_daemon [44998737: 352] volume at "/boot/home/config" registered [...] KERN: package_daemon [45517282: 360] active package: "zlib_x86_devel-1.2.8-3-x86_gcc2.hpkg" [...] KERN: package_daemon [46106324: 360] active package: "libogg-1.3.0-2-x86_gcc2.hpkg" KERN: package_daemon [46113401: 360] Volume::InitialVerify((nil), (nil)) KERN: package_daemon [46256281: 360] Volume::InitialVerify(): volume at "/boot/system" is consistent [...] KERN: package_daemon [46707585: 360] active package: "vision-0.9.7.r944-1-x86_gcc2.hpkg" [...] KERN: package_daemon [47007521: 360] active package: "make-3.82-5-x86_gcc2.hpkg" KERN: package_daemon [47016619: 360] Volume::InitialVerify(0x3b4bf28, (nil)) KERN: package_daemon [47032323: 360] Volume::InitialVerify(): volume at "/bKERN: oot/common" is consistent KERN: package_daemon [47048583: 360] Volume::InitialVerify(0x3b4be00, 0x3b4bf28) KERN: package_daemon [47065806: 360] Volume::InitialVerify(): volume at "/boot/home/config" is consistent
So the package daemon is started about 4.6 s after the packagefs volumes have been mounted -- which is pretty much the same I see here for a qemu boot -- and it needs 2.1 s to read in all package infos and verify that everything looks good. It's only about 0.5 s here, but I guess that's OK, given that your machine is a bit slower. That is done in the background anyway. After the Booscript starts the package daemon, it starts the net server and then the app server.
Assuming the blank screen you're referring to is blue, the delay you're experiencing is something between the start of the app server and the initialization of Tracker. In that case I suppose the ton of Radeon output can be ignored. The line
KERN: bfs: bfs_create_index:2095: File or Directory already exists
is caused by Tracker. And in between there isn't much: Some input server related driver initializations (PS/2), an attempt to load usb_midi (so the midi server has probably already been started), and some DHCP messages.
Unfortunately without timestamps it is impossible to say what exactly takes so long. In the Bootscript there's some synchronous waiting for the mount server going on, right before Tracker is started. Maybe something is going wrong there.
Is there disk activity during the wait period? Do you have a movable mouse cursor?
comment:12 by , 11 years ago
The blank screen is before any icons are highlighted. It's before the screen with the icons that get highlighted, i.e., two screens before the desktop [textual BootManager; blank screen we're talking about; black screen with centred Haiku logo and loading icons that highlight; Deskbar/Tracker] so there's certainly no mouse cursor yet. If it's blue, it's a blue that's nearly black.
I'll have to get back to you about disk activity, since my desktop is currently buried. I didn't notice anything audible though.
comment:13 by , 11 years ago
Ah, I totally misunderstood you. So that's even before the boot screen, i.e. in the boot loader. I don't see anything out of the ordinary in the syslog.
I don't suppose you can capture the serial debug output live, to see where exactly it takes so long?
comment:14 by , 11 years ago
Yes, exactly. Thanks for looking through the syslog anyway though.
So, there's no disk activity the majority of the time. Then there's a bit of it, and a second later, the black screen with centred logo and loading icons comes up, and a second after that, the first three icons light up almost simultaneously, and just after that, the fourth, where it pauses for a bit, but that's always been the longest icon, even on BeOS back in the day.
I don't have a serial cable, unfortunately... :|
Is there a way to add more timestamps to the syslog to help with this?
comment:15 by , 11 years ago
I have attached a boot loader package that includes timestamps in the debug output.
comment:16 by , 11 years ago
Thanks bonefish. Excuse my ignorance, but I couldn't figure out from the PM wiki/google/trying stuff out on the command line how to install it. Expander (came up when I double-clicked) gave me the file but then I can't copy it because /boot/system is read-only.
I tried dragging and dropping it into /boot/home/config/packages, but then I got this cryptic dialog I'll attach, talking about changing from X to X, or ignoring. Which am I supposed to choose? Or was this the wrong way to install?
by , 11 years ago
Attachment: | screenshot1.png added |
---|
dialog dragging .hpkg to /boot/home/config/packages
comment:17 by , 11 years ago
You should be able to drop it into /boot/system/packages, that one is writable and is where the actual loader resides. The dialog you got was because of the conflict between having a version of it in system and a version in config.
comment:18 by , 11 years ago
Yes, the boot loader package must be moved to /boot/system/packages, replacing the one already there. The architecture conflict the dialog mentions is probably due to the fact that the boot loader I attached has been build with gcc 4, while I assume you're using a gcc 2 Haiku. I'm attaching a gcc 2 version...
comment:19 by , 11 years ago
Uhm, Trac renamed the attached file. You'll have to rename it back to haiku_loader.hpkg
after downloading.
comment:20 by , 11 years ago
Okay, done. Attaching new syslogs after rebooting with a renamed and installed gcc2 version.
BTW, is gcc4 the build to try now? I was testing the gcc4hybrid nightlies for a while, and it sounded from when I'd submit bug reports that gcc2hybrid was more what people were expected to test, since it was more like the alphas.
comment:21 by , 11 years ago
KERN: [25764044007] 0x001059d0 Partition::_Mount check for file_system: BFS Filesystem KERN: [55494398204] load kernel kernel_x86...
So there's a gap of almost 30 billion cycles. It's only about 2.7 billion cycles here. I'm attaching a new boot loader package with more output in between the two entries.
gcc2hybrid is still the official version. I'm just using gcc 4 for development ATM.
comment:22 by , 11 years ago
It seems you have re-uploaded the same syslog. There's no new debug output and the timestamps are exactly the same as before.
comment:24 by , 11 years ago
Oh, sorry, I had two paths going on. I reuploaded, you're right. Just a sec.
by , 11 years ago
Attachment: | syslog.2.txt added |
---|
comment:25 by , 11 years ago
KERN: [26016788551] packagefs_mount_file(): parsing package KERN: [55518308594] packagefs_mount_file(): creating root node
OK, parsing the system package takes so long. Here comes a version with more debug output.
comment:26 by , 11 years ago
(I redid it, BTW :)...in case in a flood of e-mail notifications it looked like I hadn't. If you saw already no worries & sorry :) )
comment:27 by , 11 years ago
No worries, I see the notification.
Unfortunately some of the output is missing. I assume the syslog buffer the boot loader uses to temporarily store the output is too small. Attaching a new boot loader with larger buffer.
comment:28 by , 11 years ago
KERN: [27080301734] data KERN: [54341438117] data: 4312 entries KERN: [54346780477] add-ons KERN: [55623077114] add-ons: 433 entries
Parsing the entries in "data" takes about 27 billion cycles, those in "add-ons" 1.3. For comparison for a qemu run on my machine I get (different order):
[4336010619] add-ons [4517829003] add-ons: 432 entries [4570087101] data [6931807086] data: 4282 entries
i.e. 2.4 and 0.18 billion cycles -- between 7 and 11 times fewer cycles. That doesn't sound like it could be caused just by a difference in CPU performance. Please fire up a Terminal and enter:
time package list /system/packages/haiku.hpkg > /dev/null
comment:29 by , 11 years ago
/> time package list /system/packages/haiku.hpkg > /dev/null real 0m0.451s user 0m0.414s sys 0m0.025s
comment:30 by , 11 years ago
OK, thanks. That verifies that it isn't a CPU performance related issue -- my test machine is only somewhat faster (0.380 s). It also verifies that the package reading code as such isn't to blame, because it obviously doesn't even take half a second to list the package contents. That leaves the packagefs code in the boot loader and the functionality provided by the boot loader. It can't be I/O related, since all I/O happens before already.
Since quite a few objects are created, let's get some heap allocator timing statistics. With 2.6 billion cycles spent in malloc()
and free()
on my test system it looks like a pretty good candidate. Attaching a new boot loader package...
comment:31 by , 11 years ago
KERN: [58958381088] total malloc() cycles: 18373374263 KERN: [58966183437] total free() cycles: 11621297448
We've found the culprit. Almost 30 billion cycles spent in malloc()
and free()
. I'll have a look how the heap allocator can be optimized.
comment:32 by , 11 years ago
Very cool! Thanks, bonefish! This is way over my head, so I'm glad you were able to sort it out. =)
follow-up: 35 comment:34 by , 11 years ago
Huzzah! It's down to about 3 seconds before the logo+highlighting-icons screen.
Although, this update gave a weird side-effect: instead of the logo, there is a row of inactive icons above the inactive-becoming-active icons. I.e. two rows and the bottom ones light up, and no logo.
comment:35 by , 11 years ago
Replying to Kev:
Although, this update gave a weird side-effect: instead of the logo, there is a row of inactive icons above the inactive-becoming-active icons. I.e. two rows and the bottom ones light up, and no logo.
That will happen if the loader and kernel are built with different distro compatibility settings (I.e. one official and one not).
comment:36 by , 11 years ago
Could you attach a syslog for the latest version? I'd like to check whether everything looks as expected.
by , 11 years ago
Attachment: | haiku_loader.hpkg added |
---|
Boot loader with timestamps in debug output (gcc 2 version)
comment:38 by , 11 years ago
Resolution: | → fixed |
---|---|
Status: | new → closed |
KERN: [25555834965] packagefs_mount_file(2) [...] KERN: [26839593187] packagefs_mount_file() done
About 1.3 billion cycles for mounting packagefs sounds much better -- and I suppose most the time is actually spent sending the debug output to the serial port. So that will improve when I remove it. And with:
KERN: [28830202364] total malloc() cycles: 13380260 KERN: [28837534949] total free() cycles: 12700004
I would definitely consider the heap allocator issue fixed. While I expected a good speedup for malloc()
, I'm a bit surprised wrt. to free()
, since it basically uses the same algorithm as before. Anyway, I'm not complaining... :-)
To sum it up:
KERN: [22823449496] options = 0 [...] KERN: [24169884445] VESA compatible graphics! [...] KERN: [26846322372] load kernel kernel_x86... KERN: [27711964892] video mode: 1024x768x32
We have about 4.9 billion cycles from the first boot loader output to about the time the splash screen should be shown. It's about 1.3 billion cycles for mounting packagefs and about 0.9 billion cycles for loading the kernel, which I don't think we can optimize a lot. Apparently it takes more than 1.3 billion cycles to enumerate the VESA modes -- respectively print them to serial output. So I suppose disabling serial debug output in the boot loader would actually be the biggest opportunity for optimization left. We might want to do that for R1.
Anyway, I pushed my changes in hrev46192 and consider the main issue fixed.
That sounds more like a bug of some kind, as this hasn't been the case on any system I've seen it running on. More details on the hardware would be interesting, as well as what's being booted off in this case (hard disk, usb stick, CD, ...).