Opened 10 years ago

Last modified 4 years ago

#3011 new enhancement

Filling Tracker windows with many files takes too long

Reported by: stippi Owned by: aldeck
Priority: normal Milestone: Unscheduled
Component: Applications/Tracker Version: R1/pre-alpha1
Keywords: Cc:
Blocked By: Blocking: #3974
Has a Patch: no Platform: All

Description (last modified by aldeck)

If you don't have a folder with many files (like 30000+), you can reproduce it by running a query for all files (Alt-F + Enter). I have at least over 58000 files on both partitions and I am still watching the files dripple into the list as of now. The first couple thousands will go in quickly enough, although it could be argued it should be faster. But after 20000 files, it becomes apparent that something somewhere doesn't scale very well, and I am strongly guessing that it's the list insertion in Tracker. The sorting to find the right index is probably binary search based and fast enough, but then I suppose (without having looked at the code), that it updates the vertical position of all the following items. If that's indeed what's happening, one could also mark the position as invalid until encountering the first item with an invalid position (which would result in items with invalid positions starting at some index until the last item). The correct position could then be retrieved once it's needed (when drawing). Maybe there is even more room for optimization in the code. (Now my list contains 66500 entries and the harddrive LED is barely flashing with lots of CPU usage inbetween.)

Change History (25)

comment:1 Changed 10 years ago by stippi

Description: modified (diff)

comment:2 Changed 10 years ago by bonefish

I recall having looked into the poor scaling some years ago. Graphical updates (respectively updates of GUI related data) might be a problem as well, but one is definitely that the underlying model is a BList (or BObjectList) which is O(n) for random inserts, which is exactly what happens due to the sorting.

Rumors have it that there's a profiling tool for Haiku to examine such stuff. ;-)

comment:3 in reply to:  2 Changed 10 years ago by aldeck

Replying to bonefish:

[...] one is definitely that the underlying model is a BList (or BObjectList) which is O(n) for random inserts, which is exactly what happens due to the sorting.

Definitely, i've seen lots of O(n) (or worse) stuff in PoseView.cpp.

Rumors have it that there's a profiling tool for Haiku to examine such stuff. ;-)

I'd really like to use that :) Do we have some docs/example/notes on how to use it yet or should i dig the sources?

comment:4 Changed 10 years ago by stippi

Just try profile --help. :-)

comment:5 in reply to:  4 ; Changed 10 years ago by aldeck

Replying to stippi:

Just try profile --help. :-)

Hmm :) nice and simple. However i get lots of "no functions were hit" on the threads i'm interested in. Tried different options without luck...

comment:6 in reply to:  5 ; Changed 10 years ago by bonefish

Replying to aldeck:

Replying to stippi:

Just try profile --help. :-)

Hmm :) nice and simple. However i get lots of "no functions were hit" on the threads i'm interested in. Tried different options without luck...

That doesn't sound right. Will have a look...

comment:7 in reply to:  6 Changed 10 years ago by bonefish

Replying to bonefish:

Replying to aldeck:

Replying to stippi:

Just try profile --help. :-)

Hmm :) nice and simple. However i get lots of "no functions were hit" on the threads i'm interested in. Tried different options without luck...

That doesn't sound right. Will have a look...

Fixed in hrev28441.

If you have Linux/*BSD and can run KCachegrind, I can recommend the valgrind (callgrind) output. It allows to see both inclusive and exclusive function costs as well as call relationships. It's a bit annoying to find the right threads (hint: switch to "Time" (is in ms) and absolute costs).

The results are definitely interesting. :-)

comment:8 Changed 10 years ago by aldeck

Thanks for the quick fix! Now i'm not too sure if i want to see those results :D

comment:9 Changed 10 years ago by aldeck

Ok, found several perpetrators, and some info on a bug! (see #3054)

On pose creation three ops are to blame:

  • Model::CloseNode
  • PoseView::AddMimeTypes
  • FSClipboardFindNode (in BPose constructor)
  • and PoseView::FindPose to a lesser extent.

I'll investigate further in the next days, not accepting the ticket though, lacking time... Please continue the work on this if you'd like, i'm temporarily off.

comment:10 Changed 10 years ago by anevilyak

Is that profiling data for AddPosesTask? Or does it take into account the window thread as well? Most of the work for populating is actually done by ContainerWindow, the addposestask thread just grabs entry refs, does some minor ops on them and packages them up into a message that's sent to the window thread to process them and add to the visible list, etc.

comment:11 Changed 10 years ago by aldeck

Sorry forgot to precise. This is only the window thread, AddPosesTask works well (performance wise).

The problematic calls are coming from the CreatePoses code path in the window thread. This was the profiling of opening a directory for browsing with ~20000 items in it.

The most expensive call is FSClipoardFindNodeMode done on each pose creation. We briefly discussed about this with Rene on irc, it is only used to determine if a pose should be drawn transparent (ex: cut and paste). The costy op here is the locking of the clipboard. So a solution has to be found to get this info without locking the clipboard for each pose. Ideas welcome :)

As for the other calls (CloseNode, AddMimeTypes), they started to show their slowness when profiling with the FSClipboardFindNodeMode disabled, but that might be due to another side effect (#3054 will tell us maybe).

comment:12 Changed 10 years ago by aldeck

Owner: changed from axeld to aldeck
Status: newassigned

Ok, i made some good progress on this one, i've reimplemented AddMimeTypes wich had exponential complexity, still FSClipoardFindNodeMode to do (and FindPose maybe). While investigating, i found that the potential speedup will be almost x10 for a 20000 files folder, going from 120s too 17s. Still some work to do, accepting the ticket.

comment:13 Changed 10 years ago by aldeck

For reference, the AddMimeType issue has been fixed in hrev28891 .

Now i'd like to address the 'FindPose' issue, see this check before each Pose creation:

line 1610 in PoseView.cpp :

if (FindPose(model) || FindZombie(model->NodeRef())) {
			// we already have this pose, don't add it
			watch_node(model->NodeRef(), B_STOP_WATCHING, this);
			delete model;
			if (resultingPoses)
				resultingPoses[modelIndex] = NULL;
			continue;
		}

The FindPose search being o(n) this hurts performance a lot. Searching the ZombieList should be ok as it should be quite small if not empty most of the time.

Now i've reviewed a lot the code and i can't imagine a case where a Model could be added two times... I added debug output here and it never happens with my tests. Was this a mistake or a paranoid check? Ideas welcomed :)

comment:14 Changed 10 years ago by aldeck

Note that the search use a comparison of the entry_ref of the models so we're looking for a case where a Pose with an equal entry_ref already exists.

comment:15 in reply to:  13 Changed 10 years ago by stippi

Description: modified (diff)

Replying to aldeck:

For reference, the AddMimeType issue has been fixed in hrev28891 .

Now i'd like to address the 'FindPose' issue, see this check before each Pose creation:

line 1610 in PoseView.cpp :

if (FindPose(model) || FindZombie(model->NodeRef())) {
			// we already have this pose, don't add it
			watch_node(model->NodeRef(), B_STOP_WATCHING, this);
			delete model;
			if (resultingPoses)
				resultingPoses[modelIndex] = NULL;
			continue;
		}

The FindPose search being o(n) this hurts performance a lot. Searching the ZombieList should be ok as it should be quite small if not empty most of the time.

Now i've reviewed a lot the code and i can't imagine a case where a Model could be added two times... I added debug output here and it never happens with my tests. Was this a mistake or a paranoid check? Ideas welcomed :)

comment:16 Changed 10 years ago by aldeck

Description: modified (diff)

It looks like you edited the decription by mistake :) Let me quote your answer here:

stippi:

It looks to me like a paranoid check, but maybe it prevents a corner case because of some stuff

happening asynchronously. I can't really tell. Maybe turn it into an assert which is not compiled

in in release mode?

Yes, i could do that. Otherwise if we really need that, it's always possible to do a check for dupes on the whole poseList, only one time, after adding all the poses. Using a temporary hash_set of entry_refs and checking if insertions succeeds. This should be much faster.

comment:17 Changed 10 years ago by anevilyak

Some more improvements made in hrev29198. Please check again.

comment:18 Changed 10 years ago by anevilyak

How is it with hrev29395 or later?

comment:19 Changed 10 years ago by stippi

It's much faster, but it's still slowing down noticeably after a while. I would leave the ticket open for the time being.

comment:20 Changed 10 years ago by stippi

Actually, once the files are in the cache, it's not too bad at all. It's night and day compared to ZETA.

comment:21 in reply to:  19 Changed 10 years ago by anevilyak

Replying to stippi:

It's much faster, but it's still slowing down noticeably after a while. I would leave the ticket open for the time being.

No worries, there are plenty more opportunities for optimization, will need to run the latest code in profile and see what the culprits are now, though I know for sure that Tracker's drawing code could use a cleanup/rewrite. You'd probably be the better candidate than me for that part though :)

In any case, could you try something for me?

http://dev.haiku-os.org/browser/haiku/trunk/src/kits/tracker/PoseView.cpp#L102 - change this to 50 and see if that makes any perceptible difference for you. We'll have to be careful changing that parameter too heavily though since it's somewhat of a balance between performance and responsiveness while populating (this controls how many entries the disk reader thread sends to the window thread per batch to populate, though it's also enforced by a time constraint (see http://dev.haiku-os.org/browser/haiku/trunk/src/kits/tracker/PoseView.cpp#L1365 ). Let me know please, that parameter could probably be tweaked a bit for more modern systems with faster disk/mem subsystems.

comment:22 Changed 10 years ago by aldeck

Milestone: UnscheduledR1
Owner: changed from aldeck to anevilyak
Status: assignednew

Reassigning to you Rene as you seem more motivated than me on this one atm :) Don't forget to bench your progresses, it's always interesting.

comment:23 Changed 10 years ago by anevilyak

Owner: changed from anevilyak to aldeck

Back to you since your spatial cache rewrites will probably impact this issue :)

comment:24 Changed 4 years ago by waddlesplash

Milestone: R1Unscheduled

Moving Tracker enhancement tickets out of R1 milestone -- Tracker's source code comes from BeOS R5, so it already has all the features it did on R5.

comment:25 Changed 4 years ago by diver

Blocking: 3974 added

(In #3974) Most likely a dupe of #3011.

Note: See TracTickets for help on using tickets.