#12258 closed bug (fixed)
[launch_daemon] increased boot time
Reported by: | diver | Owned by: | axeld |
---|---|---|---|
Priority: | normal | Milestone: | Unscheduled |
Component: | Servers/launch_daemon | Version: | R1/Development |
Keywords: | Cc: | ||
Blocked By: | Blocking: | ||
Platform: | All |
Description
It looks like boot time has increased after switching to launch_daemon. I can't tell for sure but it seems to spend several seconds at the rocket icon. I have a feeling app_server now starts longer.
Attachments (7)
Change History (53)
comment:1 by , 9 years ago
comment:2 by , 9 years ago
I'm not sure, we stay longer on the boot screen but after that we get immediately to the desktop with Tracker and Deskbar running already here. Before, the app_server would start earlier but then there would be just the empty desktop and mouse cursor for some time.
comment:3 by , 9 years ago
In VBox it spends about 9 seconds at the rocket icon whereas it took BeOS only 6 seconds to fully boot from bootman :P So I think it's just too much. Do we have a boot profiler somewhere?
comment:4 by , 9 years ago
For the comparison to be fair, you need to build the kernel in release mode and without serial output. Not sure about boot profilers, maybe we should add timestamps to the syslog?
comment:5 by , 9 years ago
There is an option to add timestamps to the syslog already. However, we also have a profiler for the boot process; I even made it work again before doing the launch_daemon changes. See this commit: http://cgit.haiku-os.org/haiku/commit/?id=035e3e77ed4550c9e9e5d932d02125be0d80d04c
Basically, you copy the 'build/config_headers/kernel_debug_config.h' to 'build/user_config_headers', and set SYSTEM_PROFILER
to 1. From the running system you can then call profile -r
to get the data. This can then be analyzed using different tools.
BTW, it's totally expected that the time spent on the rocket icon will be longer now. However, the total time to have the system booted should be reduced. If that's not the case, we should find out what's going on :-)
comment:6 by , 9 years ago
Just updated Haiku to launch_daemon in VMware Fusion. Boot time increased from 6 seconds (yeah, like in BeOS!) to 11 seconds. So there's definitely a regression :/
comment:7 by , 9 years ago
Takes 19 seconds to go from the VirtualBox logo to a fully-booted Haiku (Tracker/Deskbar running) here. Took about half that before launch_daemon. So yes, there is most certainly a regression.
comment:8 by , 9 years ago
Boot time for pre-launch daemon 16 seconds Boot time for post-launch daemon 25 seconds This is on real hardware, both installs sharing same hard drive on separate partitions. Compared to Windows, this is still pretty good, so not a big deal about a few extra seconds.
comment:9 by , 9 years ago
If we can at all possibly help it, I'd prefer we didn't stop at "pretty good" and try to get all the way to "great" or beyond. :)
comment:10 by , 9 years ago
How much differs the measurement when you stop once the HD stops working?
The original Bootscript launched a number of services only after Tracker. The launch_daemon no longer does this, and massively parallelizes the boot sequence. I think 9 seconds are a huge slow down, so it'd be worth investigating whether the launch_daemon is at fault, or the lower layers of Haiku slow down that much when they have to work in parallel.
In any case, I have a short pause during the boot, and that's probably not a good thing :-)
follow-up: 12 comment:11 by , 9 years ago
and massively parallelizes the boot sequence
It should only launch as many jobs at once as there are cores -- launch, wait 200ms or so for that to cool down, etc. Right?
comment:12 by , 9 years ago
Replying to waddlesplash:
and massively parallelizes the boot sequence
It should only launch as many jobs at once as there are cores -- launch, wait 200ms or so for that to cool down, etc. Right?
If the boot process was CPU bound, reducing the number of active threads would make some -- though I doubt the performance hit with more active threads would be relevant. However, I think the boot process is mainly I/O bound, so the number of cores/hardware threads probably isn't a useful criterion.
comment:13 by , 9 years ago
At least a 5 second boot delay was fixed in hrev49561. I'm unsure what should happen with this ticket, as other concerns have been raised within the comments and not all of the increase in boot time is explained by the 5 second timeout.
comment:14 by , 9 years ago
I updated to hrev49591 and the boot and shutdown seemed to be faster. Just to make sure, I timed the process, and was indeed faster by about 4 seconds (comparable to pre-launch daemon times).
comment:15 by , 9 years ago
The previous post was for 64 bit. After updating the x86_gcc2 build to the same rev, I noted that boot and shutdown are considerably slower by several seconds.
follow-up: 23 comment:17 by , 9 years ago
I did a series of boots for Alpha 4.1, 32 bit and 64 bit. To be consistent, I decided the system to be booted as soon as the blue screen appears. Alpha 4.1 is a consistent 12 second boot. Next up is 64 bit at 18 seconds. 32 bit took 21 seconds. 32 bit would not shut down at all on two consecutive attempts. This issue is already tracked in another ticket. Otherwise, shutdown times are comparable for all three builds.
comment:18 by , 9 years ago
My earlier attempt at timing were probably based on a different criteria of when the system was considered to be fully booted up. I guess to be consistent in testing from now on I'll use the criteria from the last series I.e. The system is fully booted as soon as the blue screen appears.
comment:19 by , 9 years ago
So, it appears that launch daemon boot times are still lagging boot script times on the order of 6 to 9 seconds.
follow-up: 29 comment:20 by , 9 years ago
Please note comment:2. Testing agains A4.1 is irrelevant for this ticket as this release happened to long ago. What's important here is testing against pre-launch_daemon build.
P.S. Axel, is it possible to return the previous behaviour of app_server and show cursor ASAP. This way the overall feeling of boot speed increases dramatically.
comment:21 by , 9 years ago
Ok. I'll install a pre-launch daemon build and try a series of boots using the same criteria as the last series of tests.
comment:22 by , 9 years ago
The results of testing pre-launch daemon (hrev49408 x86_gcc2) boot times against the most current revision (hrev49591 x86_gcc2) did yield boot times that were about 2 or 3 seconds faster. I got consistent boot times of 18 -19 seconds, compared to 21 - 22. For whatever reason, 64 bit boots faster than either of these. Out of curiosity, I'll try a comparison of 64 bit boot times tomorrow.
comment:23 by , 9 years ago
Replying to vidrep:
I did a series of boots for Alpha 4.1, 32 bit and 64 bit. To be consistent, I decided the system to be booted as soon as the blue screen appears.
Unfortunately this isn't a good criteria at all. The introduction of the launch_daemon made the (userland) boot process more parallel. This means more servers are started at the same time. While this may make the individual server start up more slowly (simply due to more happening in the system at the same time), the overall boot time, i.e. the time it takes until all servers are ready, may still be reduced.
The blue screen may therefore appear later because the app_server init takes longer. However at the point it does appear, all the other servers have started as well and may even be ready completing the boot process. Before the launch_daemon the app_server would be started early and the blue screen appeared early due to that, all the other servers had to wait for this to happen though and would only be started then, extending the boot process away from the blue screen appearing.
There currently is no reliable way to determine when the system is fully booted. Making the criteria "once all servers are launched" will for example strongly favor the launch_daemon against the previous boot method because it launches them all at the same time. This doesn't mean that the launched servers are actually usable at that point though (the blue screen isn't necessarily shown yet at that point for example), making this criteria bogus.
One of the last things to happen in either boot method is the UserBootScript. It's still far from ideal as in the previous method it is launched in parallel with non-essential servers but after initial volumes are mounted, while with the launch_daemon it is launched in parallel to the initial volume mounting. Still it'd probably be the easiest way to get a more or less useful number out of this by putting a symlink to AboutSystem into "/boot/home/config/settings/boot/launch" (which will cause it to be launched when the UserBootScript runs) and take a look at the uptime when it appears.
Another measure could be to wait for the system to become idle after boot (monitoring the CPU use in ProcessController for example). This still has the problem for cases where the startup of a server isn't CPU but IO bound, but in the normal case it should be reasonably accurate.
comment:24 by , 9 years ago
one can use alert "uptime
" instead of starting AboutSystem, as the latter updates the time in its window. Or just wait for the HDD led to stop blinking.
follow-up: 26 comment:25 by , 9 years ago
Or even add uptime >> ~/uptime.log
at the end of the UserBootScript.
comment:26 by , 9 years ago
Replying to diver:
Or even add
uptime >> ~/uptime.log
at the end of the UserBootScript.
Yes, very clever, except that "up 0 minutes" doesn't really tell you much...
As another note: As the launch_daemon starts things in parallel, it profits from multiple CPUs. When running things in a single CPU VM the results between pre- and post-launch_daemon should be similar.
I've tested this over here with a gcc2 image before and after the launch daemon in kvm and the single CPU configuration was the same (10 vs. 10 seconds) while with 4 emulated CPUs (on a quad-core host) the results with the launch_daemon were better (11 vs. 12 seconds). I attribute the slowdown with more cores to emulation overhead and slow CPU bootstrap. The numbers are so similar that it's hard to tell if the launch_daemon helps things, it certainly doesn't make things drastically worse for me though. This will likely vary with hardware/emulation software used though.
A possible explanation for the 64 bit builds being quicker (apart from the better compiler and instruction set) could also be that there just aren't all drivers present.
comment:27 by , 9 years ago
Until a criterion is established for what constitutes "fully booted", then any further testing is just wasted time. I'm beginning to suspect that we may not have a problem here at all. Is there some method whereby the system itself can be automatically timed during the boot process and logged? Of greater concern to myself is the intermittent "no shutdown" problem, which manifested itself in recent builds.
comment:28 by , 9 years ago
I've added the system_time bin command which prints system_time() (the uptime in microseconds) and a script for determining the "boot complete" point in hrev49598. I'll also attach these to this ticket (for x86_gcc2) as on older hrevs they will obviously not be available. The established criterion is "all servers (and Tracker & Deskbar) are started and respond to messages", which is probably as close as one can get to the definition of a usable system. Note that this isn't super precise either as the used waitfor has a generous 100ms delay loop when waiting, but it should be reasonably close.
To take measurements, put the system_time binary and boot_time_logger.sh script into /boot/home/config/non-packaged/bin
and add the following line to /boot/home/config/settings/boot/UserBootScript
:
boot_time_logger.sh >> /boot/home/Desktop/boot_time.txt
After each reboot a new line should appear in that file on the desktop telling the uptime after the script waited for a reply of all servers.
With the results we should be able to finally decide if this ticket can be closed.
A side note regarding the shutdown problem: The launch_daemon is definitely innocent in this case, as it doesn't change anything in that regard. I have observed a regular crash in the BMediaRoster looper when the media_server quits during shutdown, which is a recent regression. In my configuration this does not prevent shutdown, but it might be different depending on timing.
follow-up: 34 comment:29 by , 9 years ago
Replying to diver:
P.S. Axel, is it possible to return the previous behaviour of app_server and show cursor ASAP. This way the overall feeling of boot speed increases dramatically.
Nothing really changed there; the app_server will show the blue screen once an application tries to connect to it -- and that's also when the input_server is launched. On my system the time span between I last see the rocket icon, and a completely booted system (ie. moving cursor on Tracker with all windows open) is less than a second.
One way to improve on that would be to delay the blue screen until the first window opens. I've already started doing something like that in my "background" branch on github (https://github.com/axeld/haiku/tree/background). However, it's not yet complete.
comment:30 by , 9 years ago
To take measurements, put the system_time binary and boot_time_logger.sh script into /boot/home/config/non-packaged/bin and add the following line to /boot/home/config/settings/boot/UserBootScript: boot_time_logger.sh >> /boot/home/Desktop/boot_time.txt
I did as instructed. However, after a reboot, the "boot_time.txt" created on the desktop is empty.
follow-up: 32 comment:31 by , 9 years ago
You probably forgot to chmod +x /boot/home/config/non-packaged/bin/system_time
.
comment:32 by , 9 years ago
Replying to diver:
You probably forgot to
chmod +x /boot/home/config/non-packaged/bin/system_time
.
The solution was to chmod the boot time logger script rather than the bin. Anyway, your tip helped solve the problem. Thanks!
comment:33 by , 9 years ago
Attached are my test results using hrev49605 x86_gcc2 and hrev49292 x86_gcc2 (pre-launch daemon). They both have virtually identical boot times. Just as an experiment I copied the same bin and boot script to Alpha 4.1, and uncommented #application/x-vnd.haiku-package_daemon in the boot script, since the package daemon was non-existent. I attached this as well simply for sake of comparison. LD_boot_time (launch daemon boot time) PLD_boot_time (pre-launch daemon boot time) Alpha_boot_time (Alpha 4.1 boot time)
by , 9 years ago
Attachment: | LD_boot_time.txt added |
---|
by , 9 years ago
Attachment: | PLD_boot_time.txt added |
---|
by , 9 years ago
Attachment: | Alpha_boot_time.txt added |
---|
comment:34 by , 9 years ago
Replying to axeld:
Nothing really changed there; the app_server will show the blue screen once an application tries to connect to it -- and that's also when the input_server is launched. On my system the time span between I last see the rocket icon, and a completely booted system (ie. moving cursor on Tracker with all windows open) is less than a second.
One way to improve on that would be to delay the blue screen until the first window opens. I've already started doing something like that in my "background" branch on github (https://github.com/axeld/haiku/tree/background). However, it's not yet complete.
Delay the blue screen? I was meaning exactly an opposite :) I meant if it was possible to show blue screen with cursor faster (as was with Boostscript). Quoting PulkoMandy: "we stay longer on the boot screen but after that we get immediately to the desktop with Tracker and Deskbar running already here. Before, the app_server would start earlier but then there would be just the empty desktop and mouse cursor for some time."
follow-up: 36 comment:35 by , 9 years ago
But that was already fixed, at least for me, in hrev49561. I get the empty blue screen again. Now most of the time spent on boot on my machine is on the "disk" icon, which I think is before launch daemon is even started. It is longer if I have more disks connected to the system (audio CD inserted, SD card, USB drives, etc); but this is not related to the launch daemon.
So I think we can close this ticket now and try to optimize the early parts of the boot process to get it back to alpha4 speed. According to vidrep's benchmarks the launch daemon only causes a 200ms increase in boot time, and I guess this is with only one CPU?
follow-up: 37 comment:36 by , 9 years ago
Resolution: | → fixed |
---|---|
Status: | new → closed |
Replying to pulkomandy:
So I think we can close this ticket now and try to optimize the early parts of the boot process to get it back to alpha4 speed. According to vidrep's benchmarks the launch daemon only causes a 200ms increase in boot time, and I guess this is with only one CPU?
Yes, let's close this.
Note that the alpha4 had serial debugging disabled by default, so if these tests were all with default settings, it's fully expected to have a boot time difference of several seconds. Local tests showed that boot time dropped from the above mentioned 10 to somewhat over 6 seconds when serial debugging was disabled, so it can be quite dramatic (because serial output is really slow and there's a lot of it, sometimes more and sometimes less depending on the drivers involved).
So for the sake of comparison it'd be interesting to have the boot times of a post launch_daemon build with serial debugging disabled in /boot/home/config/settings/kernel/drivers/kernel
.
comment:37 by , 9 years ago
Replying to mmlr:
Note that the alpha4 had serial debugging disabled by default, so if these tests were all with default settings, it's fully expected to have a boot time difference of several seconds.
The comparison with alpha 4 is also not quite fair, because, as had been reported, the introduction of package management slowed down the boot several seconds. One reason for that is probably the additional serial debug output. The hope was that the packaged system would actually boot faster, due to the compressed code increasing the effective I/O throughput. However, under virtualization the host I/O caches negate that effect -- I/O will be very fast (after the first boot) while the decompression time slows down the packaged boot. And unfortunately the current compression algorithm used is relatively slow -- switching to a faster one (e.g. LZ4) would help quite a bit.
follow-ups: 39 46 comment:38 by , 9 years ago
Resolution: | fixed |
---|---|
Status: | closed → reopened |
Hi,
I see that this ticket is closed as fixed, but I experienced an evident boot delay with hrev50067, where the launch_daemon is present. The last time that i run Haiku on my computer (Pentium dual core E5300 @ 2.60GHz - 6 GB DDR3 of ram) was about one year ago, and the boot time was fairly quick (on my system I have the same hpkgs and the same settings).
But now, on hrev50067, I noticed that the boot screen will delay for about 12 seconds at the disk icon.This delay is also occurred when I reinstalled Haiku, so without any hpkgs and without startup items.
and is never happened for me the last time that I booted Haiku one year ago.
I attempted to run the boot_time_logger, then I opened the boot_time log and there was only a message which stated "Failed to get a reply for application/x-vnd.Haiku-cddb_daemon"; in facts the cddb_daemon is not currently active on my system, although that is present in the file /boot/system/data/launch/system
Maybe is due to this bug ticket:12634 ?
comment:39 by , 9 years ago
Replying to Giova84:
I attempted to run the boot_time_logger, then I opened the boot_time log and there was only a message which stated "Failed to get a reply for application/x-vnd.Haiku-cddb_daemon"; in facts the cddb_daemon is not currently active on my system, although that is present in the file /boot/system/data/launch/system
Well, i removed the reference to the cddb_daemon from the boot_time_logger and now I register the boot time: it takes 23342952 microseconds, which are 23,342952 seconds. Is absolutely too much compared to my previous installation from one year ago.
follow-up: 42 comment:40 by , 9 years ago
It could be something else than the launch_daemon, for example, booting is slower currently if you have a lot of things in /system/packages/administrative. Maybe try cleaning the old "state" directories there?
follow-up: 43 comment:41 by , 9 years ago
@Giova84: It may very well be related to the issue you reported with #12634, i.e. something AHCI related keep filling up your syslog.
comment:42 by , 9 years ago
Replying to pulkomandy:
It could be something else than the launch_daemon, for example, booting is slower currently if you have a lot of things in /system/packages/administrative. Maybe try cleaning the old "state" directories there?
I cleaned the old state directories but this doesn't changed anything about the boot time. However I also noticed that if I also boot Haiku hrev50067 from the live cd (in the live cd there is no old state directories :-) ) , occurs the identical issue at the boot screen: the disk icon hangs for a lot of seconds. I didn't see this behaviour before on my system.
comment:43 by , 9 years ago
Replying to humdinger:
@Giova84: It may very well be related to the issue you reported with #12634, i.e. something AHCI related keep filling up your syslog.
There is a way to temporarly disable the syslog? In the /boot/home/config/settings/kernel/drivers/kernel file I just disabled the serial debug output.
by , 9 years ago
Attachment: | previous_syslog added |
---|
by , 9 years ago
comment:44 by , 9 years ago
However, for investigation reasons, I attached the syslog file and the previous_syslog file (the second one due to the fact that the plain syslog is spammed with the same messages).
comment:45 by , 9 years ago
Resolution: | → fixed |
---|---|
Status: | reopened → closed |
The launch_daemon is started when the rocket icon lits up.
comment:46 by , 9 years ago
Replying to Giova84:
Hi,
I see that this ticket is closed as fixed, but I experienced an evident boot delay with hrev50067, Maybe is due to this bug ticket:12634 ?
Well, the bug described in ticket:12634 and in ticket:12415 seems the culprit of this boot delay. As I've said in ticket:12415#comment:13 I tried with an old revision (hrev49662) and Haiku is booted fast as usual.
Boot times definitely took a hit here in VBox, I've been meaning to file a bug and provide some metrics but admittedly laziness got the better of me.