#19080 closed bug (fixed)
Query term order shouldn't matter, but does.
Reported by: | humdinger | Owned by: | axeld |
---|---|---|---|
Priority: | normal | Milestone: | R1/beta6 |
Component: | File Systems/BFS | Version: | R1/beta5 |
Keywords: | Cc: | ||
Blocked By: | Blocking: | ||
Platform: | All |
Description
This is hrev57849, 64bit.
The order in which you enter the terms for a query should not matter, but apparently it does. Consider these two queries that should have the same result:
- query -a "(((MAIL:when>%-3 days%)&&(MAIL:subject=="*[cC][oO][mM][mM][iI][tT]*"))&&(BEOS:TYPE=="text/x-email"))"
- query -a "(((MAIL:subject=="*[cC][oO][mM][mM][iI][tT]*")&&(MAIL:when>%-3 days%))&&(BEOS:TYPE=="text/x-email"))"
While query 1 returns the correct few dozen mails from the last 3 days, query 2 returns over 3,000 mails going all the way back (to 2019 for me, as I don't have older mails on this computer).
Not good...
Attachments (2)
Change History (16)
comment:1 by , 3 months ago
Component: | Kits/libtracker.so → File Systems/BFS |
---|---|
Owner: | changed from | to
comment:2 by , 3 months ago
Does this reproduce on something besides BFS with emails; perhaps on packagefs?
comment:3 by , 3 months ago
With "on packagefs" you mean like the "system" volume?
There isn't much with attributes around there...
I just searched for applications with a "sk" AND "b" in their name, then tried the other way around. That did work. As did searching for audio files on a BFS volume, querying for Artist && Album combinations.
Maybe it has something to do with the MAIL:when attribute not being of type "string", unlike MAIL:subject?
Also curious, why does query -a "(last_modified>%-1 days%)"
spit out every file, not just the ones modified since yesterday? Maybe worth another ticket...
comment:4 by , 3 months ago
It's possible that relative date-based queries are somehow broken and that's the problem here.
comment:5 by , 5 weeks ago
So, the handling of relative dates in queries happens in userland, not the kernel:
$ strace -e "open_query" query -a "(((MAIL:subject=="*[cC][oO][mM][mM][iI][tT]*")&&(MAIL:when>%-3 days%))&&(BEOS:TYPE=="text/x-email"))"
[ 412] open_query(0x3, "(((MAIL:subject==*[cC][oO][mM][mM][iI][tT]*)&&(MAIL:when>1731906000))&&(BEOS:TYPE==text/x-email))", 0x61, 0x0, 0xffffffff, 0x0) = 0x3 (28 us)
The remaining question is why the comparison isn't working here.
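To illustrate the kind of substitution visible in the trace above, here is a minimal Python sketch that rewrites a `%<n> <unit>%` placeholder into absolute epoch seconds before the query string is handed on. The function name `expand_relative_dates`, the supported units, and the regular expression are assumptions made for this illustration; this is not Haiku's actual date parser, which the ticket does not show.

```python
import re
import time

# Seconds per unit. Only a few units are handled in this sketch (assumption).
_UNIT_SECONDS = {"minutes": 60, "hours": 3600, "days": 86400}

# Matches placeholders such as %-3 days% (hypothetical, simplified grammar).
_RELATIVE = re.compile(r"%(-?\d+) (minutes|hours|days)%")


def expand_relative_dates(query, now=None):
    """Replace %<n> <unit>% placeholders with absolute epoch seconds."""
    if now is None:
        now = int(time.time())

    def substitute(match):
        offset = int(match.group(1)) * _UNIT_SECONDS[match.group(2)]
        return str(now + offset)

    return _RELATIVE.sub(substitute, query)


# A fixed "now", chosen so that three days earlier equals the timestamp
# that appears in the open_query() call in the trace.
now = 1731906000 + 3 * 86400
print(expand_relative_dates("(MAIL:when>%-3 days%)", now=now))
# prints (MAIL:when>1731906000)
```

The point is only that the kernel receives a plain integer comparison, so the relative-date syntax itself cannot be what makes the two term orders behave differently.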
comment:6 by , 5 weeks ago
The last_modified problems were indeed a regression; I fixed those in hrev58355.
I don't really have emails here to test with; any chance you could come up with a few that trigger this problem and zip them up? (If you can get the problem to reproduce on RAMFS, so much the better. Just copy a few emails over to a RAMFS volume and then run the query on that volume only.)
I guess somewhere around here there was that "data demo package" which had a number of emails in it; perhaps I can see if there's a few in there which are suitable to test this with...
comment:7 by , 5 weeks ago
(Well, you'll have to create the relevant indexes on the RAMFS volume, too.)
comment:8 by , 5 weeks ago
Actually, I found another issue and fixed that in hrev58356. So please see if that outright fixes this bug.
I would also be interested to see the size of these indexes (lsindex can report that). I think the reason the queries behave differently when ordered differently is that the indexes are large, so the query code winds up thinking that querying either index is basically the same, but that's not the case at all. We should probably alter the scoring method to be more precise; but let's see if we can just get the bug fixed first.
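As a purely illustrative toy model of that hypothesis (not the BFS scoring code, which this ticket does not show), the Python sketch below picks which index to drive a query from. The names `coarse_score`, `precise_score`, and `pick_index`, the cap value, and the "smaller index is cheaper" rule are all assumptions. The two index sizes are the ones reported by lsindex in this ticket.

```python
# Index sizes in bytes, as reported by lsindex -l in this ticket.
INDEX_SIZES = {"MAIL:subject": 13512704, "MAIL:when": 6884352}


def coarse_score(size, cap=1_000_000):
    """Hypothetical score that saturates: every large index looks the same."""
    return min(size, cap)


def precise_score(size):
    """Hypothetical score that keeps the full size, so large indexes differ."""
    return size


def pick_index(terms, score):
    """Pick the term with the lowest score.

    min() returns the first minimal element, so a tie is decided by the
    order in which the terms were written in the query.
    """
    return min(terms, key=lambda term: score(INDEX_SIZES[term]))


for order in (["MAIL:when", "MAIL:subject"], ["MAIL:subject", "MAIL:when"]):
    print(order, "coarse:", pick_index(order, coarse_score),
          "precise:", pick_index(order, precise_score))
```

In this toy model the saturating score ties, so the choice follows term order, while the finer score picks the same index for both orders. Whether the real scoring behaves this way is exactly what the index sizes are meant to help confirm.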
comment:9 by , 5 weeks ago
Well done! Testing on hrev58356, the queries from above have the same results! \o/
This is the lsindex -l of the partition where I keep my mails:
Text 11/18/2023 11:18 AM 6144 Audio:Album
Text 11/18/2023 11:18 AM 5120 Audio:Artist
Text 11/18/2023 08:53 AM 6144 BEOS:APP_SIG
Text 11/18/2023 11:18 AM 4096 BEOS:LOCALE_LANGUAGE
Text 11/18/2023 11:18 AM 4096 BEOS:LOCALE_SIGNATURE
Text 05/27/2024 07:41 PM 2048 Calendar:ID
Text 05/27/2024 07:41 PM 2048 Category:Name
Text 05/27/2024 07:41 PM 2048 Event:Category
Text 05/27/2024 07:41 PM 2048 Event:Description
Int-32 05/27/2024 07:41 PM 2048 Event:End
Text 05/27/2024 07:41 PM 2048 Event:Name
Text 05/27/2024 07:41 PM 2048 Event:Place
Int-32 05/27/2024 07:41 PM 2048 Event:Reminder
Int-32 05/27/2024 07:41 PM 2048 Event:Start
Text 05/27/2024 07:41 PM 2048 Event:Status
Int-32 05/27/2024 07:41 PM 2048 Event:Updated
Text 10/26/2024 05:25 PM 2048 Feed:name
Text 10/26/2024 05:25 PM 2048 Feed:source
Text 10/26/2024 05:25 PM 2048 Feed:status
Text 11/18/2023 11:18 AM 1105920 MAIL:account
Int-32 11/18/2023 11:18 AM 1103872 MAIL:account_id
Text 11/18/2023 11:18 AM 1080320 MAIL:cc
Text 11/18/2023 11:18 AM 2048 MAIL:chain
Int-32 11/18/2023 11:18 AM 2048 MAIL:draft
Text 11/18/2023 11:18 AM 18432 MAIL:flags
Text 11/18/2023 11:18 AM 1866752 MAIL:from
Text 11/18/2023 11:18 AM 1627136 MAIL:name
Text 11/18/2023 11:18 AM 2048 MAIL:pending_chain
Text 11/18/2023 11:18 AM 20480 MAIL:priority
Int-32 11/18/2023 11:18 AM 453632 MAIL:read
Text 11/18/2023 11:18 AM 5228544 MAIL:reply
Text 11/18/2023 11:18 AM 1105920 MAIL:status
Text 11/18/2023 11:18 AM 13512704 MAIL:subject
Text 11/18/2023 11:18 AM 11706368 MAIL:thread
Text 11/18/2023 11:18 AM 1644544 MAIL:to
Int-32 11/18/2023 11:18 AM 6884352 MAIL:when
Text 11/18/2023 11:18 AM 4096 META:address
Text 11/18/2023 11:18 AM 3072 META:address2
Text 11/18/2023 11:18 AM 3072 META:aim
Text 11/18/2023 11:18 AM 3072 META:anniversary
Text 11/18/2023 11:18 AM 3072 META:birthday
Text 11/18/2023 11:18 AM 3072 META:cell
Text 11/18/2023 11:18 AM 3072 META:children
Text 11/18/2023 11:18 AM 4096 META:city
Text 11/18/2023 11:18 AM 3072 META:company
Text 11/18/2023 11:18 AM 3072 META:country
Text 11/18/2023 11:18 AM 2048 META:county
Text 11/18/2023 11:18 AM 4096 META:email
Text 11/18/2023 11:18 AM 3072 META:email2
Text 11/18/2023 11:18 AM 3072 META:email3
Text 11/18/2023 11:18 AM 3072 META:email4
Text 11/18/2023 11:18 AM 3072 META:fax
Text 11/18/2023 11:18 AM 2048 META:firstname
Text 11/18/2023 11:18 AM 4096 META:group
Text 11/18/2023 11:18 AM 3072 META:hphone
Text 11/18/2023 11:18 AM 3072 META:icq
Text 11/18/2023 11:18 AM 3072 META:jabber
Text 11/18/2023 11:18 AM 2048 META:lastname
Text 11/18/2023 11:18 AM 3072 META:mphone
Text 11/18/2023 11:18 AM 2048 META:name
Text 11/18/2023 11:18 AM 3072 META:nickname
Text 11/18/2023 11:18 AM 3072 META:pager
Text 11/18/2023 11:18 AM 3072 META:spouse
Text 11/18/2023 11:18 AM 3072 META:state
Text 11/18/2023 11:18 AM 46080 META:title
Text 11/18/2023 11:18 AM 121856 META:url
Text 11/18/2023 11:18 AM 3072 META:url2
Text 11/18/2023 11:18 AM 3072 META:url3
Text 11/18/2023 11:18 AM 3072 META:waddress
Text 11/18/2023 11:18 AM 3072 META:waddress2
Text 11/18/2023 11:18 AM 3072 META:wcity
Text 11/18/2023 11:18 AM 3072 META:wcountry
Text 11/18/2023 11:18 AM 3072 META:wcphone
Text 11/18/2023 11:18 AM 3072 META:wfax
Text 11/18/2023 11:18 AM 3072 META:wphone
Text 11/18/2023 11:18 AM 3072 META:wstate
Text 11/18/2023 11:18 AM 3072 META:wzip
Text 11/18/2023 11:18 AM 3072 META:yahoo
Text 11/18/2023 11:18 AM 4096 META:zip
Text 11/18/2023 11:18 AM 6144 Media:Genre
Int-32 11/18/2023 11:18 AM 3072 Media:Rating
Text 11/18/2023 11:18 AM 6144 Media:Title
Int-32 11/18/2023 11:18 AM 7168 Media:Year
Text 11/18/2023 11:18 AM 2048 _signature
Text 11/18/2023 11:18 AM 2048 _status
Int-32 11/18/2023 09:10 AM 2048 _trk/qrylastchange
Int-32 11/18/2023 09:10 AM 3072 _trk/recentQuery
Text 11/18/2023 11:18 AM 2048 be:deskbar_item_status
Int-64 11/18/2023 08:53 AM 6213632 last_modified
Text 11/18/2023 08:53 AM 33856512 name
Int-64 11/18/2023 08:53 AM 8140800 size
comment:10 by , 4 weeks ago
I've posted https://review.haiku-os.org/c/haiku/+/8593 to address this.
Please test both with and without that patch, checking to see which query runs faster: the one with MAIL:subject first, or the one with MAIL:when first. Before the patch, the one with MAIL:when first should be faster (maybe even considerably faster); after the patch, they should be equivalent. (Note that the second run of either query will probably be faster than the first.)
comment:11 by , 4 weeks ago
Thanks for working on this!
Here are my findings. I tried with these 2 queries:
query -a "(((MAIL:when>%-100 days%)&&(MAIL:subject=="*[cC][oO][mM][mM][iI][tT]*"))&&(BEOS:TYPE=="text/x-email"))"
query -a "(((MAIL:subject=="*[cC][oO][mM][mM][iI][tT]*")&&(MAIL:when>%-100 days%))&&(BEOS:TYPE=="text/x-email"))"
Both return 863 mails. I first ran them on a current nightly: query 1 three times and query 2 three times, with a reboot between each run to avoid caching.
The results:
hrev58356, 64bit, not patched:
Run | real | user | sys |
---|---|---|---|
Query 1, run 1 | 0m0,728s | 0m0,052s | 0m0,221s |
Query 1, run 2 | 0m0,769s | 0m0,059s | 0m0,248s |
Query 1, run 3 | 0m0,720s | 0m0,056s | 0m0,217s |
Query 2, run 1 | 0m2,095s | 0m0,059s | 0m0,778s |
Query 2, run 2 | 0m2,031s | 0m0,056s | 0m0,752s |
Query 2, run 3 | 0m1,980s | 0m0,050s | 0m0,713s |
hrev58363+1, 64bit, patched:
Run | real | user | sys |
---|---|---|---|
Query 1, run 1 | 0m0,689s | 0m0,049s | 0m0,188s |
Query 1, run 2 | 0m0,687s | 0m0,055s | 0m0,197s |
Query 1, run 3 | 0m0,733s | 0m0,062s | 0m0,216s |
Query 2, run 1 | 0m0,719s | 0m0,054s | 0m0,217s |
Query 2, run 2 | 0m0,668s | 0m0,045s | 0m0,184s |
Query 2, run 3 | 0m0,681s | 0m0,053s | 0m0,193s |
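For a quick comparison, the "real" timings above can be averaged per query and build. The following Python sketch does that; the helper names `parse_real` and `mean_real` and the dictionary layout are assumptions for illustration, while the values are copied from the measurements above.

```python
from statistics import mean


def parse_real(value):
    """Convert a `time` value such as '0m0,728s' (comma decimal) to seconds."""
    minutes, seconds = value.rstrip("s").split("m")
    return int(minutes) * 60 + float(seconds.replace(",", "."))


# "real" times copied from the measurements above, keyed by (build, query).
REAL_TIMES = {
    ("hrev58356, not patched", 1): ["0m0,728s", "0m0,769s", "0m0,720s"],
    ("hrev58356, not patched", 2): ["0m2,095s", "0m2,031s", "0m1,980s"],
    ("hrev58363+1, patched", 1): ["0m0,689s", "0m0,687s", "0m0,733s"],
    ("hrev58363+1, patched", 2): ["0m0,719s", "0m0,668s", "0m0,681s"],
}


def mean_real(build, query_no):
    """Mean wall-clock time in seconds over the three runs."""
    return mean(parse_real(v) for v in REAL_TIMES[(build, query_no)])


for build, query_no in REAL_TIMES:
    print(f"{build}, query {query_no}: {mean_real(build, query_no):.3f} s")
```

On these numbers, query 2 averages roughly 2.04 s against 0.74 s for query 1 without the patch, and roughly 0.69 s against 0.70 s with it.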
comment:12 by , 4 weeks ago
An even more significant result than I was expecting. Thanks for testing!
comment:13 by , 4 weeks ago
Milestone: | Unscheduled → R1/beta6 |
---|---|
Resolution: | → fixed |
Status: | new → closed |
Fix merged in hrev58365.