Opened 12 years ago

Last modified 5 months ago

#1074 assigned enhancement

Identify and fix performance bottlenecks in our app_server

Reported by: axeld Owned by: jua
Priority: normal Milestone: R1
Component: Servers/app_server Version:
Keywords: Cc: kaoutsis
Blocked By: Blocking:
Has a Patch: no Platform: All

Description

The current app_server implementation is mostly not optimized yet. As this is actually noticeable, and we aren't that close to the speed of BeOS R5 yet, this should be changed, though :-)

Change History (21)

comment:1 Changed 12 years ago by kaoutsis

Cc: kaoutsis@… added

comment:2 Changed 12 years ago by kaoutsis

I find this topic interesting. Can you think of a way that I could help?

comment:3 Changed 12 years ago by wkornewald

Cc: kaoutsis stippi bonefish added; kaoutsis@… removed

Well, you'd have to use a profiler and compile with profiling options. Stephan has already done some profiling. I think you need to add this to the Jamfiles for the app_server (Ingo will know better):

{
	SubDirCcFlags -pg ;
	SubDirC++Flags -pg ;
}

I also found a short intro to this topic. I've rarely used profilers, though, and I don't know if you will also need to set linker flags in the Jamfile. You could just try and see if it works. I suggest that you ask on the main mailing list if you need more (=better ;) help.

comment:4 Changed 12 years ago by axeld

It will also require us to measure the impact of context switches, and to find ways to reduce their cost (i.e. by trying to make sure that drawing threads always use their full quantum if there is something left to draw).
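For anyone who wants a rough first number for the cost of such thread hand-offs, here is a minimal sketch in portable C++. It is not app_server code: `measure_handoffs` and `HandoffResult` are names made up for this illustration. Two threads pass a turn back and forth through a mutex and a condition variable, and the elapsed time is divided over the hand-offs. On a multi-core machine a hand-off is not necessarily a full context switch on one CPU, and the measured time also includes thread start-up, so treat the result as an upper bound only.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

// Result of the experiment: how many hand-offs happened and how long they took.
struct HandoffResult {
	long handoffs;
	std::chrono::nanoseconds elapsed;
};

// Two threads take strict turns; every turn wakes the other thread up.
HandoffResult measure_handoffs(long rounds)
{
	assert(rounds > 0);

	std::mutex lock;
	std::condition_variable changed;
	bool pingTurn = true;
	long handoffs = 0;

	auto player = [&](bool isPing) {
		for (long i = 0; i < rounds; ++i) {
			std::unique_lock<std::mutex> guard(lock);
			// Sleep until it is this thread's turn.
			changed.wait(guard, [&] { return pingTurn == isPing; });
			pingTurn = !isPing;
			++handoffs;
			changed.notify_one();
		}
	};

	auto start = std::chrono::steady_clock::now();
	std::thread ping(player, true);
	std::thread pong(player, false);
	ping.join();
	pong.join();
	auto end = std::chrono::steady_clock::now();

	return HandoffResult{handoffs,
		std::chrono::duration_cast<std::chrono::nanoseconds>(end - start)};
}
```

Calling `measure_handoffs(100000)` and dividing `elapsed` by `handoffs` gives an average cost per hand-off. Comparing that with the time a typical drawing command takes would hint at how much a thread loses when it gives up its quantum early.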

comment:5 Changed 12 years ago by bonefish

The SubDirCcFlags/SubDirC++Flags should work. You can also add the compiler flags in your build/jam/UserBuildConfig file, which is the recommended way, since you don't need to meddle with any Jamfiles that are under version control. To add the "-pg" flag in src/servers/app and all subdirectories use the following:

  AppendToConfigVar CCFLAGS : HAIKU_TOP src servers app : -pg : global ;
  AppendToConfigVar C++FLAGS : HAIKU_TOP src servers app : -pg : global ;

If you want to add it only for specific directories, replace the "global" by "local" and add respective lines for the directories you're interested in.

comment:6 Changed 12 years ago by kaoutsis

Does the -pg profiling switch work for sure in gcc 2.95.3-beos-060710? If I enable the switches, it seems that I cannot produce gmon.out:

  $ gprof -v
  GNU gprof 2.15
  $ gprof MyCompiledAndLinkedWithProf
  gmon.out: No such file or directory

comment:7 Changed 12 years ago by wkornewald

Did you run the app before you started the profiler? I.e.:

  $ MyCompiledAndLinkedWithProf
  $ gprof MyCompiledAndLinkedWithProf

comment:8 in reply to:  7 Changed 12 years ago by kaoutsis

Replying to wkornewald: Yes, I did.

comment:9 in reply to:  5 ; Changed 12 years ago by kaoutsis

Replying to bonefish: Very elegant. Now one more thing to finish that work, for linking: on my R5 with gcc 2.95.3-beos-060710 I need to link against /boot/develop/lib/x86/i386-mcount.o to resolve the profiling symbols.

comment:10 Changed 12 years ago by axeld

You're misusing gprof :-) You're supposed to run the executable, and quit it (killing it won't help), then it writes out a file called something like profile* - and *that* file is the one you have to use gprof on. Because of these semantics, you can more or less only use profiling in the app_server test environment. Or write your own profiling code (instead of using the one provided by i386-mcount.o).

comment:11 in reply to:  10 Changed 12 years ago by kaoutsis

Replying to axeld:

You're misusing gprof :-)

This is a nice opportunity to clear up this issue:

a) I followed the guide http://www.network-theory.co.uk/docs/gccintro/gccintro_80.html that Waldemar kindly provided. It doesn't work on R5 as described (did I miss something?), but I believe it would on Linux (I will test it tomorrow to be sure).

b) I just found this, taken from issue 20 of our old newsletter: [...] Daniel Reinhold writes: [...] There is a GNU profiler that is included with many POSIX systems called 'gprof'. It is included with BeOS R5 as well (or at least with the developer tools mentioned above), but frankly, it doesn't work. However, Be did provide their own profiler called "profile". [...]

Could you please clarify the situation: does gprof work on R5, should we use Be's profile, or may we use Daniel's Ezprof (it works fine) and extend it to fit our needs?

You're supposed to run the executable, and quit it (killing it won't help),

Yes, I already do that, ...

then it writes out a file called something like profile* -

Yes, this file is generated...

and *that* file is the one you have to use gprof on.

How?

Because of these semantics, you can more or less only use profiling in the app_server test environment. Or write your own profiling code (instead of using the one provided by i386-mcount.o).

That is very important information to go on. Thanks!

comment:12 in reply to:  9 ; Changed 12 years ago by bonefish

Replying to kaoutsis:

Replying to bonefish: Very elegant. Now one more thing to finish that work, for linking: on my R5 with gcc 2.95.3-beos-060710 I need to link against /boot/develop/lib/x86/i386-mcount.o to resolve the profiling symbols.

Just use the LinkAgainst rule to add it:

  LinkAgainst app_server : /boot/develop/lib/x86/i386-mcount.o ;

Note that the dependency is old, so the first time you might need to remove the generated executable or otherwise trigger re-linking.

comment:13 in reply to:  12 Changed 12 years ago by kaoutsis

Replying to bonefish: Thank you.

comment:14 Changed 12 years ago by kaoutsis

gprof works perfectly on Linux, as described in Waldemar's short intro (but unfortunately not on R5). Also, gprof is found in the binutils package.

comment:15 Changed 12 years ago by axeld

Sorry, my bad. Under BeOS, you can just use "bprof" instead of "gprof"; usage is the same.

comment:16 Changed 12 years ago by stippi

Hi,

if someone manages to get gprof/bprof working properly on R5 (for use in the test environment), that would be great. Don't forget to tell me how... :-)

I did manage to get some useful info from ezprof, but I fixed all issues to the point that the info retrieved from ezprof is now pretty much inconclusive.

I have thought a lot about the performance issues and have come to the conclusion that the problem could very well be in the way "update sessions" work. Currently, the app_server maintains repainting on the *window* level of things. There are two update sessions (per window): a current and a pending update session. The client window paints the views that touch the dirty region of the current update session. The problem here is that the mechanism is probably too "coarse": the update session cannot grow after the window has been informed that repainting needs to be done, so additional dirty regions are placed in the next (pending) update session. When the window finally gets to repainting a particular view after having painted a few others, large portions of it might already be dirty in the next update session, so it has to be painted again. If the update sessions worked on the *view* level instead of the window level (as seems to be the case in R5, since there is one update message per view), the dirty region for each view could keep growing on the server side for much longer (while the client window is busy painting other views), and the window might have to paint the view only once.

I have yet to think of a way to move the update stuff to the view level without wasting too many resources on all those regions. But I think this is definitely one very important reason for the current "slowness" of the app_server. The drawing speed itself is likely not the problem for the majority of the views you see during typical use.

HTH, Stephan
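To make the window-level versus view-level argument above a bit more concrete, here is a toy model in C++. It is only a sketch under strong assumptions and not how app_server is implemented: dirty regions are reduced to one dirty flag per view, painting a view takes exactly one "tick", all views start out dirty, and the names `Invalidation`, `paints_window_level` and `paints_view_level` are invented for this illustration. Each function counts how many view paints the client ends up doing.

```cpp
#include <cassert>
#include <set>
#include <vector>

// A view gets invalidated while the paint operation number `tick` is running.
struct Invalidation {
	int tick;
	int view;
};

// Window-level sessions: the current session is frozen once painting starts,
// so anything invalidated meanwhile lands in the pending session.
int paints_window_level(int viewCount, const std::vector<Invalidation>& events)
{
	std::set<int> current;
	std::set<int> pending;
	for (int view = 0; view < viewCount; ++view)
		current.insert(view);

	int tick = 0;
	int paints = 0;
	while (!current.empty()) {
		for (int view : current) {
			(void)view;	// which view is painted does not matter for the count
			for (const Invalidation& event : events) {
				assert(event.view >= 0 && event.view < viewCount);
				if (event.tick == tick)
					pending.insert(event.view);
			}
			++paints;
			++tick;
		}
		// The pending session becomes the current one.
		current.swap(pending);
		pending.clear();
	}
	return paints;
}

// View-level tracking: a view's dirty state keeps accumulating until the
// client actually starts painting that view.
int paints_view_level(int viewCount, const std::vector<Invalidation>& events)
{
	std::vector<bool> dirty(viewCount, true);
	int tick = 0;
	int paints = 0;
	bool paintedSomething = true;
	while (paintedSomething) {
		paintedSomething = false;
		for (int view = 0; view < viewCount; ++view) {
			if (!dirty[view])
				continue;
			// The client takes the whole accumulated dirty state of this view.
			dirty[view] = false;
			for (const Invalidation& event : events) {
				assert(event.view >= 0 && event.view < viewCount);
				if (event.tick == tick)
					dirty[event.view] = true;
			}
			++paints;
			++tick;
			paintedSomething = true;
		}
	}
	return paints;
}
```

With four views and a single event `{0, 3}` (view 3 is invalidated again while view 0 is being painted), the window-level model paints five times because view 3 is painted once per session, while the view-level model paints four times. The model ignores the memory cost of keeping a region per view, which is exactly the open question mentioned above.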

comment:17 Changed 12 years ago by kaoutsis

I also did some research on ezprof, since I am interested in making it work on Haiku too. The main problem with ezprof is, well... speed. From my point of view, ezprof would be very useful if it could be sped up. To give an example of the slowness: I took collatz.c from http://www.network-theory.co.uk/docs/gccintro/gccintro_80.html to examine with ezprof, and the results were the same as with gprof (on Linux), with one difference: ezprof took 3 hours (!) to complete. "profile", the fastest profiling tool (it is a tool that comes with bprof), gives the same results on the same machine in 20 seconds. PS: I guess collatz.c is the test for profilers :-)

bye,

Vasilis
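For readers without the guide at hand, a workload in the spirit of that Collatz test could look like the sketch below. This is not the collatz.c file from the guide. The function names `collatz_steps` and `longest_collatz` are chosen for this illustration, and the code is C++ rather than C. It is meant as a tiny CPU-bound program where a profiler should attribute nearly all of the time to one hot function.

```cpp
#include <cassert>
#include <cstdint>

// Number of steps the Collatz iteration needs to bring n down to 1.
std::uint64_t collatz_steps(std::uint64_t n)
{
	assert(n >= 1);
	std::uint64_t steps = 0;
	while (n != 1) {
		n = (n % 2 == 0) ? n / 2 : 3 * n + 1;
		++steps;
	}
	return steps;
}

// Longest sequence for any start value in [1, limit]. Practically all of the
// run time is spent in collatz_steps(), which a profile should show.
std::uint64_t longest_collatz(std::uint64_t limit)
{
	std::uint64_t best = 0;
	for (std::uint64_t start = 1; start <= limit; ++start) {
		std::uint64_t steps = collatz_steps(start);
		if (steps > best)
			best = steps;
	}
	return best;
}
```

Calling `longest_collatz` with a large limit from `main()` and building with the -pg flags discussed earlier would give each profiler the same workload to compare.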

comment:18 Changed 12 years ago by axeld

"profile" is using a very different concept, though: it just periodically checks where the executable is and counts the functions. That's very cheap, but also very inexact. Also, profilers that work via instrumenting like ezprof make function calls much more expensive, and thus, may distort the results.
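The cost of instrumentation can be illustrated with a small sketch. It is not how ezprof or i386-mcount.o work internally. `ScopedProbe`, `profile_table` and `tiny_function` are hypothetical names, and the table is not thread-safe. Every call to an instrumented function pays for two clock reads and a table lookup, which for a function as small as the one below is far more work than the function body itself.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical per-function statistics collected by the instrumentation.
struct ProfileEntry {
	std::uint64_t calls = 0;
	std::chrono::nanoseconds total{0};
};

// Global table, keyed by function name (single-threaded use only).
std::map<std::string, ProfileEntry>& profile_table()
{
	static std::map<std::string, ProfileEntry> table;
	return table;
}

// Instrumentation hook: the constructor runs on function entry, the
// destructor on function exit.
class ScopedProbe {
public:
	explicit ScopedProbe(const char* name)
		: fName(name), fStart(std::chrono::steady_clock::now())
	{
	}

	~ScopedProbe()
	{
		ProfileEntry& entry = profile_table()[fName];
		entry.calls++;
		entry.total += std::chrono::duration_cast<std::chrono::nanoseconds>(
			std::chrono::steady_clock::now() - fStart);
	}

private:
	const char* fName;
	std::chrono::steady_clock::time_point fStart;
};

// A function so small that the probe dominates its run time.
int tiny_function(int x)
{
	ScopedProbe probe("tiny_function");
	return x + 1;
}
```

The call count this yields is exact, which is the strength of instrumenting, but the recorded time for `tiny_function` mostly measures the probe. A sampling tool like "profile" would barely slow the function down, at the price of only estimating where time goes.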

comment:19 Changed 9 years ago by axeld

Cc: stippi bonefish removed
Owner: changed from axeld to nobody
Version: R1/pre-alpha1

comment:20 Changed 2 years ago by pulkomandy

Owner: changed from nobody to jua
Status: changed from new to assigned

comment:21 Changed 5 months ago by waddlesplash

Is it worth leaving this open? We can open new tickets for specific still-existing issues.
