Opened 11 years ago

Closed 11 years ago

#10063 closed bug (fixed)

Long blank screen (~17s) before Haiku boot screen

Reported by: Kev Owned by: axeld
Priority: normal Milestone: R1
Component: System/Boot Loader Version: R1/Development
Keywords: Cc:
Blocked By: Blocking:
Platform: x86

Description

Not sure if it's PM-related, but builds from a few months ago went almost immediately to the Haiku boot screen with the icons in the centre, whereas hrev46164 has a 17-second delay at a blank screen before the screen with the icons. Since that's longer than it used to take to boot in total, I almost thought the computer had frozen--during this time, the numlock light would not toggle when numlock was pressed on the keyboard.

Is this something that will be optimized? If not, maybe some kind of placeholder screen instead of a blank screen, or better yet, some kind of indication as to what it's doing during this?

Attachments (8)

eMachines T3104 specs (1.1 KB ) - added by Kev 11 years ago.
listdev.txt (3.1 KB ) - added by Kev 11 years ago.
listimage-grep-drivers.txt (1.5 KB ) - added by Kev 11 years ago.
listusb.txt (8.4 KB ) - added by Kev 11 years ago.
screenshot1.png (13.8 KB ) - added by Kev 11 years ago.
dialog dragging .hpkg to /boot/home/config/packages
syslog.2.txt (537.0 KB ) - added by Kev 11 years ago.
haiku_loader.hpkg (257.2 KB ) - added by bonefish 11 years ago.
Boot loader with timestamps in debug output (gcc 2 version)
syslog.txt (990.0 KB ) - added by Kev 11 years ago.
with latest refined bootloader

Download all attachments as: .zip

Change History (46)

comment:1 by anevilyak, 11 years ago

Component: SystemSystem/Boot Loader
Owner: changed from nobody to axeld
Type: enhancementbug
Version: R1/Package ManagementR1/Development

That sounds more like a bug of some kind, as this hasn't been the case on any system I've seen it running on. More details on the hardware would be interesting, as well as what's being booted off in this case (hard disk, usb stick, CD, ...).

comment:2 by Kev, 11 years ago

Booting off an internal hard disk with 4 BFS partitions, one ext2, and one Linux LVM, using Haiku's BootManager.

by Kev, 11 years ago

Attachment: eMachines T3104 specs added

by Kev, 11 years ago

Attachment: listdev.txt added

by Kev, 11 years ago

Attachment: listimage-grep-drivers.txt added

by Kev, 11 years ago

Attachment: listusb.txt added

comment:3 by anevilyak, 11 years ago

Pure curiosity, is there any difference if you try booting with that card reader unplugged?

comment:4 by X512, 11 years ago

I have a same problem with my Acer W500, but delay is about 5 seconds. I was happend only in package management version. After PM-branch merge it always happens. In old Haiku splash screen and first icon are displayed immediately.

System boot from internal SDD. No external hardware plugged in.

Last edited 11 years ago by X512 (previous) (diff)

comment:5 by diver, 11 years ago

Interestingly, I can reproduce it with vbox 4.1.8, but can't with the 4.2.18.

in reply to:  5 ; comment:6 by bonefish, 11 years ago

Replying to diver:

Interestingly, I can reproduce it with vbox 4.1.8, but can't with the 4.2.18.

Could you provide syslogs from both. Maybe there's some obvious difference.

@Kev: Unfortunately your syslog wrapped at the end of the boot process and the new one only shows the part after the interesting output.

in reply to:  3 comment:7 by Kev, 11 years ago

Replying to anevilyak:

Pure curiosity, is there any difference if you try booting with that card reader unplugged?

It's built-in. There might be an internal cable, did you want me to open it up? Or just disable it in the BIOS?

in reply to:  6 comment:8 by Kev, 11 years ago

Replying to bonefish:

@Kev: Unfortunately your syslog wrapped at the end of the boot process and the new one only shows the part after the interesting output.

Oh, I didn't know it could do that. Should I delete the syslog and reboot? Or how does it work?

comment:9 by bonefish, 11 years ago

When the syslog file has reached the configured maximum size it is renamed to "syslog.old" and a new "syslog" file is started. So, when in doubt, attaching both files (or concatenating them) is usually a good idea. But yes, removing the syslog file and rebooting should result in a complete one.

comment:10 by Kev, 11 years ago

There, that's syslog.old with syslog appended.

comment:11 by bonefish, 11 years ago

I can't spot anything suspicious in the syslog. Unfortunately there aren't timestamps for all syslog entries. In fact only packagefs and package_daemon have any. Checking those:

KERN: packagefs [37985322:    15] Failed to open packages activation file: No such file or directory
[...]
KERN: packagefs [40359443:    15]   bytes saved:               518900

There's where the packagefs volumes are mounted. Takes about 2.4 s in total, which is OK. That's after the fourth icon on the boot screen has been highlighted, but before the fifth. The next entries with timestamps are from the package daemon:

KERN: package_daemon [44966964:   352] root at "/boot" (device: 3, node: 524288) registered
KERN: package_daemon [44977559:   352] volume at "/boot/system" registered
KERN: package_daemon [44991094:   352] volume at "/boot/common" registered
KERN: package_daemon [44998737:   352] volume at "/boot/home/config" registered
[...]
KERN: package_daemon [45517282:   360] active package: "zlib_x86_devel-1.2.8-3-x86_gcc2.hpkg"
[...]
KERN: package_daemon [46106324:   360] active package: "libogg-1.3.0-2-x86_gcc2.hpkg"
KERN: package_daemon [46113401:   360] Volume::InitialVerify((nil), (nil))
KERN: package_daemon [46256281:   360] Volume::InitialVerify(): volume at "/boot/system" is consistent
[...]
KERN: package_daemon [46707585:   360] active package: "vision-0.9.7.r944-1-x86_gcc2.hpkg"
[...]
KERN: package_daemon [47007521:   360] active package: "make-3.82-5-x86_gcc2.hpkg"
KERN: package_daemon [47016619:   360] Volume::InitialVerify(0x3b4bf28, (nil))
KERN: package_daemon [47032323:   360] Volume::InitialVerify(): volume at "/bKERN: oot/common" is consistent
KERN: package_daemon [47048583:   360] Volume::InitialVerify(0x3b4be00, 0x3b4bf28)
KERN: package_daemon [47065806:   360] Volume::InitialVerify(): volume at "/boot/home/config" is consistent

So the package daemon is started about 4.6 s after the packagefs volumes have been mounted -- which is pretty much the same I see here for a qemu boot -- and it needs 2.1 s to read in all package infos and verify that everything looks good. It's only about 0.5 s here, but I guess that's OK, given that your machine is a bit slower. That is done in the background anyway. After the Booscript starts the package daemon, it starts the net server and then the app server.

Assuming the blank screen you're referring to is blue, the delay you're experiencing is something between the start of the app server and the initialization of Tracker. In that case I suppose the ton of Radeon output can be ignored. The line

KERN: bfs: bfs_create_index:2095: File or Directory already exists

is caused by Tracker. And in between there isn't much: Some input server related driver initializations (PS/2), an attempt to load usb_midi (so the midi server has probably already been started), and some DHCP messages.

Unfortunately without timestamps it is impossible to say what exactly takes so long. In the Bootscript there's some synchronous waiting for the mount server going on, right before Tracker is started. Maybe something is going wrong there.

Is there disk activity during the wait period? Do you have a movable mouse cursor?

comment:12 by Kev, 11 years ago

The blank screen is before any icons are highlighted. It's before the screen with the icons that get highlighted, i.e., two screens before the desktop [textual BootManager; blank screen we're talking about; black screen with centred Haiku logo and loading icons that highlight; Deskbar/Tracker] so there's certainly no mouse cursor yet. If it's blue, it's a blue that's nearly black.

I'll have to get back to you about disk activity, since my desktop is currently buried. I didn't notice anything audible though.

comment:13 by bonefish, 11 years ago

Ah, I totally misunderstood you. So that's even before the boot screen, i.e. in the boot loader. I don't see anything out of the ordinary in the syslog.

I don't suppose you can capture the serial debug output live, to see where exactly it takes so long?

comment:14 by Kev, 11 years ago

Yes, exactly. Thanks for looking through the syslog anyway though.

So, there's no disk activity the majority of the time. Then there's a bit of it, and a second later, the black screen with centred logo and loading icons comes up, and a second after that, the first three icons light up almost simultaneously, and just after that, the fourth, where it pauses for a bit, but that's always been the longest icon, even on BeOS back in the day.

I don't have a serial cable, unfortunately... :|

Is there a way to add more timestamps to the syslog to help with this?

comment:15 by bonefish, 11 years ago

I have attached a boot loader package that includes timestamps in the debug output.

comment:16 by Kev, 11 years ago

Thanks bonefish. Excuse my ignorance, but I couldn't figure out from the PM wiki/google/trying stuff out on the command line how to install it. Expander (came up when I double-clicked) gave me the file but then I can't copy it because /boot/system is read-only.

I tried dragging and dropping it into /boot/home/config/packages, but then I got this cryptic dialog I'll attach, talking about changing from X to X, or ignoring. Which am I supposed to choose? Or was this the wrong way to install?

by Kev, 11 years ago

Attachment: screenshot1.png added

dialog dragging .hpkg to /boot/home/config/packages

comment:17 by anevilyak, 11 years ago

You should be able to drop it into /boot/system/packages, that one is writable and is where the actual loader resides. The dialog you got was because of the conflict between having a version of it in system and a version in config.

comment:18 by bonefish, 11 years ago

Yes, the boot loader package must be moved to /boot/system/packages, replacing the one already there. The architecture conflict the dialog mentions is probably due to the fact that the boot loader I attached has been build with gcc 4, while I assume you're using a gcc 2 Haiku. I'm attaching a gcc 2 version...

comment:19 by bonefish, 11 years ago

Uhm, Trac renamed the attached file. You'll have to rename it back to haiku_loader.hpkg after downloading.

Last edited 11 years ago by anevilyak (previous) (diff)

comment:20 by Kev, 11 years ago

Okay, done. Attaching new syslogs after rebooting with a renamed and installed gcc2 version.

BTW, is gcc4 the build to try now? I was testing the gcc4hybrid nightlies for a while, and it sounded from when I'd submit bug reports that gcc2hybrid was more what people were expected to test, since it was more like the alphas.

comment:21 by bonefish, 11 years ago

KERN: [25764044007] 0x001059d0 Partition::_Mount check for file_system: BFS Filesystem
KERN: [55494398204] load kernel kernel_x86...

So there's a gap of almost 30 billion cycles. It's only about 2.7 billion cycles here. I'm attaching a new boot loader package with more output in between the two entries.

gcc2hybrid is still the official version. I'm just using gcc 4 for development ATM.

comment:22 by bonefish, 11 years ago

It seems you have re-uploaded the same syslog. There's no new debug output and the timestamps are exactly the same as before.

comment:23 by Kev, 11 years ago

Umm...okay. That's odd, I'll try again...

comment:24 by Kev, 11 years ago

Oh, sorry, I had two paths going on. I reuploaded, you're right. Just a sec.

by Kev, 11 years ago

Attachment: syslog.2.txt added

comment:25 by bonefish, 11 years ago

KERN: [26016788551] packagefs_mount_file(): parsing package
KERN: [55518308594] packagefs_mount_file(): creating root node

OK, parsing the system package takes so long. Here comes a version with more debug output.

comment:26 by Kev, 11 years ago

(I redid it, BTW :)...in case in a flood of e-mail notifications it looked like I hadn't. If you saw already no worries & sorry :) )

comment:27 by bonefish, 11 years ago

No worries, I see the notification.

Unfortunately some of the output is missing. I assume the syslog buffer the boot loader uses to temporarily store the output is too small. Attaching a new boot loader with larger buffer.

comment:28 by bonefish, 11 years ago

KERN: [27080301734]   data
KERN: [54341438117]   data: 4312 entries
KERN: [54346780477]   add-ons
KERN: [55623077114]   add-ons: 433 entries

Parsing the entries in "data" takes about 27 billion cycles, those in "add-ons" 1.3. For comparison for a qemu run on my machine I get (different order):

[4336010619]   add-ons
[4517829003]   add-ons: 432 entries
[4570087101]   data
[6931807086]   data: 4282 entries

i.e. 2.4 and 0.18 billion cycles -- between 7 and 11 times fewer cycles. That doesn't sound like it could be caused just by a difference in CPU performance. Please fire up a Terminal and enter:

time package list /system/packages/haiku.hpkg > /dev/null

comment:29 by Kev, 11 years ago

/> time package list /system/packages/haiku.hpkg > /dev/null

real    0m0.451s
user    0m0.414s
sys     0m0.025s
Last edited 11 years ago by Kev (previous) (diff)

comment:30 by bonefish, 11 years ago

OK, thanks. That verifies that it isn't a CPU performance related issue -- my test machine is only somewhat faster (0.380 s). It also verifies that the package reading code as such isn't to blame, because it obviously doesn't even take half a second to list the package contents. That leaves the packagefs code in the boot loader and the functionality provided by the boot loader. It can't be I/O related, since all I/O happens before already.

Since quite a few objects are created, let's get some heap allocator timing statistics. With 2.6 billion cycles spent in malloc() and free() on my test system it looks like a pretty good candidate. Attaching a new boot loader package...

comment:31 by bonefish, 11 years ago

KERN: [58958381088] total malloc() cycles: 18373374263
KERN: [58966183437] total free() cycles:   11621297448

We've found the culprit. Almost 30 billion cycles spent in malloc() and free(). I'll have a look how the heap allocator can be optimized.

comment:32 by Kev, 11 years ago

Very cool! Thanks, bonefish! This is way over my head, so I'm glad you were able to sort it out. =)

comment:33 by bonefish, 11 years ago

Please give the attached new boot loader a try.

comment:34 by Kev, 11 years ago

Huzzah! It's down to about 3 seconds before the logo+highlighting-icons screen.

Although, this update gave a weird side-effect: instead of the logo, there is a row of inactive icons above the inactive-becoming-active icons. I.e. two rows and the bottom ones light up, and no logo.

in reply to:  34 comment:35 by anevilyak, 11 years ago

Replying to Kev:

Although, this update gave a weird side-effect: instead of the logo, there is a row of inactive icons above the inactive-becoming-active icons. I.e. two rows and the bottom ones light up, and no logo.

That will happen if the loader and kernel are built with different distro compatibility settings (I.e. one official and one not).

comment:36 by bonefish, 11 years ago

Could you attach a syslog for the latest version? I'd like to check whether everything looks as expected.

by bonefish, 11 years ago

Attachment: haiku_loader.hpkg added

Boot loader with timestamps in debug output (gcc 2 version)

comment:37 by bonefish, 11 years ago

New boot loader version with previously introduced bug fixed.

by Kev, 11 years ago

Attachment: syslog.txt added

with latest refined bootloader

comment:38 by bonefish, 11 years ago

Resolution: fixed
Status: newclosed
KERN: [25555834965] packagefs_mount_file(2)
[...]
KERN: [26839593187] packagefs_mount_file() done

About 1.3 billion cycles for mounting packagefs sounds much better -- and I suppose most the time is actually spent sending the debug output to the serial port. So that will improve when I remove it. And with:

KERN: [28830202364] total malloc() cycles: 13380260
KERN: [28837534949] total free() cycles:   12700004

I would definitely consider the heap allocator issue fixed. While I expected a good speedup for malloc(), I'm a bit surprised wrt. to free(), since it basically uses the same algorithm as before. Anyway, I'm not complaining... :-)

To sum it up:

KERN: [22823449496] options = 0
[...]
KERN: [24169884445] VESA compatible graphics!
[...]
KERN: [26846322372] load kernel kernel_x86...
KERN: [27711964892] video mode: 1024x768x32

We have about 4.9 billion cycles from the first boot loader output to about the time the splash screen should be shown. It's about 1.3 billion cycles for mounting packagefs and about 0.9 billion cycles for loading the kernel, which I don't think we can optimize a lot. Apparently it takes more than 1.3 billion cycles to enumerate the VESA modes -- respectively print them to serial output. So I suppose disabling serial debug output in the boot loader would actually be the biggest opportunity for optimization left. We might want to do that for R1.

Anyway, I pushed my changes in hrev46192 and consider the main issue fixed.

Note: See TracTickets for help on using tickets.