Opened 14 months ago
Last modified 14 months ago
#18650 new bug
[query-cli] it takes a lot of time with the initial search
Reported by: | tzu_mi | Owned by: | axeld |
---|---|---|---|
Priority: | normal | Milestone: | Unscheduled |
Component: | File Systems/BFS | Version: | R1/Development |
Keywords: | | Cc: | |
Blocked By: | | Blocking: | |
Platform: | All | | |
Description
Hi, I and some other users have noticed that the query-cli takes a long time when a command is run for the first time, but becomes much quicker if the same search is repeated later, as if it builds up a cache. I'm experiencing this on Walter hrev 57315 x86_64 and previous revisions (I cannot remember how long I have been seeing this, but it has been quite a long time; I never reported it before, my fault).
~/Desktop> time query -af '((name=="**")&&(META:group=="**"))'
[omitted]
real 0m20.704s
user 0m0.054s
sys 0m2.914s
~/Desktop> time query -af '((name=="**")&&(META:group=="**"))'
[omitted]
real 0m0.761s
user 0m0.051s
sys 0m0.708s
time query -af '((name=="**")&&(BEOS:TYPE=="text/memo"))'
[omitted]
real 0m19.852s
user 0m0.051s
sys 0m10.166s
time query -af '((name=="**")&&(BEOS:TYPE=="text/memo"))'
[omitted]
real 0m1.251s
user 0m0.048s
sys 0m1.202s
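A minimal sketch for reproducing the cold-versus-warm comparison above in one go. The helper name `time_runs` is hypothetical, and it assumes a `date` that supports `+%s` (whole-second resolution only, so it is coarser than `time`).

```shell
# time_runs N CMD [ARGS...]: run CMD N times and print the wall-clock
# seconds each run took. Output of CMD itself is discarded.
time_runs() {
    runs=$1
    shift
    i=1
    while [ "$i" -le "$runs" ]; do
        start=$(date +%s)
        "$@" > /dev/null 2>&1
        end=$(date +%s)
        echo "run $i: $((end - start))s"
        i=$((i + 1))
    done
}

# Example (on Haiku, right after boot, using the query from this report):
# time_runs 2 query -af '((name=="**")&&(META:group=="**"))'
```

If the slowdown is the one described here, the first line should show a much larger value than the second.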
Change History (5)
follow-up: 3 comment:1 by , 14 months ago
Component: | - General → File Systems/BFS |
---|---|
Owner: | changed from | to
Platform: | x86-64 → All |
comment:2 by , 14 months ago
What's considered a large filesystem? The first query -a name=*whatever* is always pretty slow on a 5 GB partition for me. (I should probably script a dummy query on boot, just so it gets ready for when I actually want to use it :-P).
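The "dummy query on boot" idea could be sketched roughly as below, for example as lines added to ~/config/settings/boot/UserBootscript. This is only an illustration: the function name `warm_up_query` and the pattern `*warmup*` are made up, and whether a throwaway query warms up whatever makes the first real query slow has not been verified.

```shell
# warm_up_query [CMD]: run one throwaway name query in the background so
# that a later interactive query may find the data already cached.
# CMD defaults to Haiku's `query`; the pattern is an arbitrary placeholder.
warm_up_query() {
    cmd=${1:-query}
    if command -v "$cmd" > /dev/null 2>&1; then
        "$cmd" -a 'name=="*warmup*"' > /dev/null 2>&1 &
    else
        echo "warm_up_query: $cmd not found" >&2
        return 1
    fi
}

# Example: call it once near the end of the boot script.
# warm_up_query
```

Running it in the background keeps the boot from waiting on the slow first search.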
Also... doesn't this have more to do with PackageFS than BFS? I mean... on BFS the indexes are "there" on the disk, but for activated packages they need to be computed/gathered on each boot, or am I just mistaken about how that works?
Edit: did some tests, I see that /boot/system only indexes: last_modified, size, name, and BEOS:APP_SIG.
So my slow queries are down to either non-indexed attributes, or the difference between exact-match and "glob" queries (e.g. name=test vs name=*test*), plus my slow hardware and number of files.
Edit 2: Some timing results on my system (sounds on par with what I get on my 32 bits beta4, 5 GB partition, with similar amount of small files).
comment:3 by , 14 months ago
Replying to waddlesplash:
IF you are running queries on unindexed attributes, or on an especially large filesystem, I think this is expected and there's not much to be done about it.
Not so big, 57.25 GiB, 4096 bytes/block, on a USB3 pendrive, it's obviously indexed
~/Desktop> lsindex /boot
Audio:Album
Audio:Artist
Audio:Track
BEOS:APP_SIG
BEOS:LOCALE_LANGUAGE
BEOS:LOCALE_SIGNATURE
Calendar:ID
Category:Name
Event:Category
Event:Description
Event:End
Event:Name
Event:Place
Event:Start
Event:Status
Event:Updated
MAIL:account
MAIL:account_id
MAIL:beam/identity
MAIL:beam/imap-uid
MAIL:cc
MAIL:chain
MAIL:draft
MAIL:flags
MAIL:from
MAIL:name
MAIL:pending_chain
MAIL:priority
MAIL:read
MAIL:reply
MAIL:status
MAIL:subject
MAIL:thread
MAIL:to
MAIL:when
MEMO:keyw
MEMO:title
META:address
META:city
META:company
META:country
META:county
META:email
META:fax
META:group
META:hphone
META:keyw
META:mphone
META:name
META:nickname
META:state
META:status
META:url
META:wphone
META:zip
Media:Genre
Media:Rating
Media:Title
Media:Year
_signature
_status
_trk/qrylastchange
_trk/recentQuery
be:deskbar_item_status
last_modified
name
size
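To check which attributes of a query are backed by an index, the listing can be filtered mechanically. The sketch below uses a hypothetical helper, `missing_indexes`, and assumes the listing arrives on stdin with one index name per line.

```shell
# missing_indexes ATTR...: read an index listing (one name per line) on
# stdin and print every ATTR that does not appear in it.
missing_indexes() {
    listing=$(cat)
    for attr in "$@"; do
        # -x matches whole lines only, -F treats the name as a literal string
        if ! printf '%s\n' "$listing" | grep -qxF -- "$attr"; then
            echo "$attr"
        fi
    done
}

# Example (on Haiku):
# lsindex /boot | missing_indexes name META:group BEOS:TYPE
```

Applied to the listing above, this would print only BEOS:TYPE, since name and META:group both appear in it and BEOS:TYPE does not.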
~/Desktop> df
Type      Total     Free      Flags   Device                   Mounted on
--------- --------- --------- ------- ------------------------ -----------------
bfs       57.3 GiB  40.2 GiB  QAM-P-W /dev/disk/usb/0/0/0      /boot
comment:4 by , 14 months ago
Even when specifying only the /boot volume and with no results (I've deleted all the person files), the first search hangs for ~20 seconds; the second search with the same command takes half a second.
~/Desktop> time query -v /boot -f '((name=="**")&&(META:group=="**"))'
real 0m21.497s
user 0m0.051s
sys 0m2.513s
~/Desktop> time query -v /boot -f '((name=="**")&&(META:group=="**"))'
real 0m0.428s
user 0m0.049s
sys 0m0.378s
comment:5 by , 14 months ago
I guess there's no query optimization à la RDBMS? Every file has a name; I don't know how big and spread out that index may get, or whether those 20 seconds can be improved. If you query for ((META:group=="**")&&(name=="**")) or leave name out, the time taken is quite different.
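To compare the two term orderings without retyping the nested parentheses, a small helper can build the query string. `and_query` is a hypothetical name; it only reproduces the ((a)&&(b)) form used in this ticket.

```shell
# and_query TERM1 TERM2: print a two-term query in the ((a)&&(b)) form.
and_query() {
    printf '((%s)&&(%s))\n' "$1" "$2"
}

# Example (on Haiku). Each ordering should be measured after a fresh boot,
# because a repeated search is much faster regardless of term order.
# a='name=="**"'; b='META:group=="**"'
# time query -v /boot -f "$(and_query "$a" "$b")"
# time query -v /boot -f "$(and_query "$b" "$a")"
```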