Opened 18 years ago

Last modified 4 years ago

#1071 new enhancement

Identify and fix performance bottlenecks in the kernel and I/O subsystems

Reported by: axeld
Owned by: nobody
Priority: normal
Milestone: R1.1
Component: System/Kernel
Version:
Keywords:
Cc: kaoutsis@…
Blocked By:
Blocking:
Platform: All

Description

Just to give an example, the current syscall mechanism is much slower than it should be.

Attachments (6)

libMicro-make-output.txt (1.9 KB ) - added by kaoutsis 18 years ago.
vm.cpp.diff (345 bytes ) - added by kaoutsis 18 years ago.
area-results.tar.bz2 (90.5 KB ) - added by kaoutsis 17 years ago.
Some results!
area_tests.results (7.6 KB ) - added by kaoutsis 17 years ago.
Updated with fragmentation results
area_creation_test.cpp (17.8 KB ) - added by kaoutsis 17 years ago.
area_tests2.results (3.2 KB ) - added by kaoutsis 17 years ago.


Change History (24)

comment:1 by kaoutsis, 18 years ago

Cc: kaoutsis@… added

comment:2 by kaoutsis, 18 years ago

How could I help here?

comment:3 by axeld, 18 years ago

Run/write performance test apps, see where Haiku sucks in comparison with BeOS and/or FreeBSD/Linux/Windows, then find out why it does - and try to fix it :-) Alternatively, you could write profiling code that can be used in the kernel and use the output from that to see what should be optimized.

For example you could create an app that allocates and frees lots of kernel resources (like sems, areas, ports, ...), maybe even in multiple threads, and see how the performance is compared to BeOS.
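A minimal sketch of such a test app, not taken from the Haiku tree: the helper name average_pair_cost_us and its parameters are hypothetical. It runs create/destroy pairs on several threads and reports the wall-clock time divided by the total number of pairs. Thread start-up is included in the measurement, so use many iterations, and the two callables must be safe to call concurrently.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdlib>
#include <thread>
#include <vector>

// Hypothetical helper: runs `iterations` create/destroy pairs on each of
// `threadCount` threads. Returns the elapsed wall-clock time in
// microseconds divided by the total number of pairs executed.
template <typename Create, typename Destroy>
double
average_pair_cost_us(unsigned threadCount, std::size_t iterations,
    Create create, Destroy destroy)
{
    assert(threadCount > 0 && iterations > 0);

    using Clock = std::chrono::steady_clock;
    std::vector<std::thread> workers;
    const Clock::time_point start = Clock::now();

    for (unsigned i = 0; i < threadCount; i++) {
        // The captured references stay valid because every worker is
        // joined before this function returns.
        workers.emplace_back([&]() {
            for (std::size_t n = 0; n < iterations; n++) {
                auto resource = create();
                destroy(resource);
            }
        });
    }
    for (std::thread& worker : workers)
        worker.join();

    const std::chrono::duration<double, std::micro> elapsed
        = Clock::now() - start;
    return elapsed.count() / (double(threadCount) * double(iterations));
}
```

On Haiku or BeOS one would pass the kernel calls of interest, for instance a lambda calling create_sem(0, "bench") and another calling delete_sem(). On other platforms the harness can be tried with std::malloc and std::free as stand-ins, although an optimizing compiler may remove such a pair.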

in reply to:  3 comment:4 by kaoutsis, 18 years ago

Replying to axeld: OK, nice idea. For a start, I will write some small apps to compare haiku and hrev5.

comment:5 by axeld, 18 years ago

Note that "src/tests/system/benchmarks" already contains some performance testers. There are also some public portable tests that might be interesting. When not testing some special BeOS/Haiku API, platform-independent tests are preferred, of course.

comment:6 by axeld, 18 years ago

Oh, and thanks :-)

in reply to:  5 comment:7 by kaoutsis, 18 years ago

Replying to axeld: I am willing to port (or rather, compile and run on both hrev5 and haiku) LMbench - Tools for Performance Analysis from http://www.bitmover.com/lm/lmbench/ , which should also reveal other problems (missing headers, missing functions in libroot, etc.) before R1 comes. Is lmbench worth the effort or not?

comment:8 by axeld, 18 years ago

lmbench is quite dated, but it might still be worth the effort (after all, it tests various things :-)). Also interesting could be libmicro from Sun: http://www.opensolaris.org/os/community/performance/libmicro/ . But beyond those UNIX-heavy test suites, something that compares ports/sems/area creation/deletion/etc. between BeOS and Haiku is definitely worth a look, too.

in reply to:  8 comment:9 by kaoutsis, 18 years ago

Replying to axeld: I have both on my disk now, and I have successfully compiled lmbench2 on linux. On both hrev5 and haiku they complained about missing headers... For libmicro I will attach the make output. For the area test I will attach my first attempt to write something useful here.

by kaoutsis, 18 years ago

Attachment: libMicro-make-output.txt added

by kaoutsis, 18 years ago

Attachment: vm.cpp.diff added

comment:10 by kaoutsis, 18 years ago

Even though I am aware of the new protections that vm.cpp has, I exercised all the switches that create_area can take (B_ANY_KERNEL_ADDRESS, etc.) for testing purposes. So I managed to kdl hrev5... with my own hands :) I am planning to write more tests for areas, sems, ports, etc. to study bug #1071.

Playing around with it, I made a small patch for vm.cpp. Log: if the requested area has zero size, don't let the app crash (in this case in strcpy); return B_BAD_VALUE instead, as hrev5 does.

But I am still skeptical about whether this is right. The userspace program might want to create the area with zero size (as a kind of initialization) and call resize_area with some real value later. In that case some other protection would be needed to avoid crashing. (I don't have any idea for this yet.) It's up to you.

by kaoutsis, 17 years ago

Attachment: area-results.tar.bz2 added

Some results!

comment:11 by kaoutsis, 17 years ago

If the above measurements are correct, what I found, briefly, is this: a) create_area() on haiku is faster than on beos; b) delete_area() on haiku is significantly slower.

by kaoutsis, 17 years ago

Attachment: area_tests.results added

Updated with fragmentation results

comment:12 by kaoutsis, 17 years ago

  • Attached a new file with test results: area_tests2.results.
  • The file area_creation_test.cpp has been updated (it now writes only to the first byte of each area).

by kaoutsis, 17 years ago

Attachment: area_creation_test.cpp added

comment:13 by kaoutsis, 17 years ago

Summary of area_creation_test:

  • haiku's create_area is indeed faster than hrev5.
  • the delete_area() is still an issue (tested with hrev22045; I guess it needs an update)
  • It seems that touching a page costs double on haiku!

Todo: update the area_tests2.results more regularly with the new revisions.
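The three costs compared above (creation, touching a page, deletion) can be timed separately. The sketch below is a portable POSIX stand-in that uses anonymous mmap()/munmap() rather than create_area()/delete_area(). It is not the attached area_creation_test.cpp, and the names PhaseCosts and measure_mapping_phases are made up for illustration.

```cpp
#include <chrono>
#include <cstddef>
#include <sys/mman.h>

// Hypothetical result record: cost of each phase in microseconds.
struct PhaseCosts {
    double map_us;
    double touch_us;
    double unmap_us;
    bool ok;    // false if mapping or unmapping failed
};

// Maps `size` bytes of anonymous memory, writes to the first byte only,
// unmaps the region again, and times each of the three steps.
PhaseCosts
measure_mapping_phases(std::size_t size)
{
    using Clock = std::chrono::steady_clock;
    using Micro = std::chrono::duration<double, std::micro>;
    PhaseCosts costs = {0.0, 0.0, 0.0, false};

    const Clock::time_point t0 = Clock::now();
    void* address = mmap(nullptr, size, PROT_READ | PROT_WRITE,
        MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    const Clock::time_point t1 = Clock::now();
    if (address == MAP_FAILED)
        return costs;

    // The first write to the mapping triggers a page fault for the
    // first page; the other pages are left untouched.
    *static_cast<volatile char*>(address) = 1;
    const Clock::time_point t2 = Clock::now();

    const int result = munmap(address, size);
    const Clock::time_point t3 = Clock::now();

    costs.map_us = Micro(t1 - t0).count();
    costs.touch_us = Micro(t2 - t1).count();
    costs.unmap_us = Micro(t3 - t2).count();
    costs.ok = result == 0;
    return costs;
}
```

A single call is noisy, so a real comparison would repeat it many times and over a range of sizes before drawing conclusions.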

by kaoutsis, 17 years ago

Attachment: area_tests2.results added

comment:14 by kaoutsis, 17 years ago

I updated the file area_tests2.results with hrev24702. Some comments after running the test program for 2 hours, allocating and freeing all the available memory in various ways:

  • haiku's create_area() has an almost constant cost of 21 - 25 us (in some very rare cases, 1 per 5, the cost may reach 87 us, which is the maximum value that haiku gives)
  • haiku's delete_area() has been significantly improved since hrev22045; now the value is related to the number and the size of the areas. The overall cost has not only been reduced, but also stays the same regardless of the "memory overhead" of the system. Still, the haiku numbers are somewhat bigger than the hrev5 equivalents.
  • touching a page costs a bit more than it did in hrev22045 (approximately 20% - 25%)

Stressing the system more (allocating and freeing all the available memory for more than an hour with the test program), the cost of touching a page increases by a further 10% - 15%, but after two hours it seems to stabilize at this maximum value.
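Observations like "almost constant, with a rare maximum" are easier to compare across revisions when each run is reduced to a few numbers. A minimal sketch, assuming the per-call costs have been collected into a vector; Summary and summarize are hypothetical names, not part of the attached test program.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical summary of a series of per-call costs (e.g. in us).
struct Summary {
    double minimum;
    double median;
    double maximum;
};

// Takes the samples by value so the caller's data is not reordered.
Summary
summarize(std::vector<double> samples)
{
    assert(!samples.empty());
    std::sort(samples.begin(), samples.end());

    const std::size_t count = samples.size();
    // For an even number of samples, average the two middle values.
    const double median = count % 2 == 1
        ? samples[count / 2]
        : (samples[count / 2 - 1] + samples[count / 2]) / 2.0;

    return {samples.front(), median, samples.back()};
}
```

The median stays close to the typical cost even when a few outliers occur, while the maximum records the worst case separately.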

comment:15 by bonefish, 17 years ago

Thanks for the update! As expected Michael's kernel heap fixes have fixed the worst problem. I suppose now the page fault performance deserves to be looked into -- I wouldn't see why we should be slower than BeOS.

in reply to:  15 comment:16 by kaoutsis, 17 years ago

Replying to bonefish:

Thanks for the update! As expected Michael's kernel heap fixes have fixed the worst problem. I suppose now the page fault performance deserves to be looked into -- I wouldn't see why we should be slower than BeOS.

yes, that would be great.

comment:17 by axeld, 15 years ago

Owner: changed from axeld to nobody
Version: R1/pre-alpha1

comment:18 by pulkomandy, 4 years ago

Milestone: R1 → R1.1