Opened 14 years ago

Closed 8 years ago

Last modified 8 years ago

#6734 closed bug (fixed)

query not working when specifying an interval

Reported by: zuMi Owned by: axeld
Priority: normal Milestone: R1
Component: Kits/Storage Kit Version: R1/Development
Keywords: query, interval, range Cc:
Blocked By: Blocking:
Platform: All

Description

I tried to get e-mails with a similar query formula

(MAIL:when=>%date1%)&&(MAIL:when<=%date2%)

but from Tracker and query cli command there aren't any results, same formula on BeOS R5 works.

I haven't any other indexes of TIME type in my volumes so I didn't made other tests, I have to try with INT yet.

Change History (14)

comment:1 by anevilyak, 14 years ago

Component: File SystemsKits/Storage Kit
Owner: changed from nobody to axeld
Version: R1/alpha2R1/Development

comment:2 by anevilyak, 14 years ago

Some simple tests like (last_modified >= %%-2 days%%) work for me, but it breaks when I try compound conditions like the above. Not yet sure if the problem lies in the query parser in the storage kit or if it's further down.

in reply to:  2 comment:3 by zuMi, 14 years ago

Replying to anevilyak:

Some simple tests like (last_modified >= %%-2 days%%) work for me, but it breaks when I try compound conditions like the above. Not yet sure if the problem lies in the query parser in the storage kit or if it's further down.

yes, as I wrote above it happens only when you query for a TIME frame, I've just tried with INTEGER indexes and there is not any issue.

comment:4 by zuMi, 14 years ago

I'm experiencing something weird again, when I do a query with MAIL:when && something else I got only few or any elements, like if I do:

((MAIL:when>%last 1 year%)) I get a lot of stuff

((MAIL:status=="Read")) a lot again

((MAIL:when>%last 1 year%)&&(MAIL:status=="Read")) I get nothing.

Version 0, edited 14 years ago by zuMi (next)

comment:5 by AGMS, 10 years ago

Looks like the query system is still broken in hrev48702 (the one with fixed BFS index storage of long attributes, like MAIL:cc). I copied over 90000 e-mails to a freshly formatted volume, and only 109 show up in the query that uses the huge MAIL:cc index (Text 01/20/2015 06:55 PM 1996800 MAIL:cc):

/CommonHaiku> query -v /CommonHaiku/agmsmith '(MAIL:cc="*")' | wc

109 1241 15367

/CommonHaiku> query -v /CommonHaiku/agmsmith '(name="*")&&(MAIL:cc="*")' | wc

92046 695183 10674488

/CommonHaiku> query -v /CommonHaiku/agmsmith '(name="*")' | wc

92046 695183 10674488

So, is the parsing broken or is the attribute value turned to junk data in the index or something else?

comment:6 by axeld, 8 years ago

Resolution: fixed
Status: newclosed

Fixed in hrev50906. It was a problem specific to the handling of type TIME. The B+tree correctly mapped it as int32, so that simple queries that can use the index worked fine. As soon as your query gets more complex, one index is chosen to "run" the query. The nodes that this index lookup returns will then be matched against the query.

That's why MAIL:status==Read did not work, while the variant MAIL:status==*Read* did; in the first case, the MAIL:status index was chosen, as this was a direct lookup (= cheap). In the second case, the MAIL:when index has been chosen, as looking up a pattern is expensive. As soon as the query code came across MAIL:when, it stumbled upon a TIME value, and didn't know what to do with it.

comment:7 by jessicah, 8 years ago

Out of curiosity, why is time using an int32 value? Shouldn't this be int64?

From the BeBook: B_TIME_TYPE: 64-bit microsecond time_t data.

comment:8 by AGMS, 8 years ago

last_modified is indeed an Int-64 index in BeOS. However, the query parser converts it to seconds, not microseconds, when using the value in comparisons. So there should be a divide by 1000000 somewhere in there.

comment:9 by AGMS, 8 years ago

Hmmm, no, seems to be doing bit shifting instead of a divide by 1000000 for last_modified in Query.cpp. I wonder if that breaks things, like comparing dates to %today% and that sort of thing.

comment:10 by axeld, 8 years ago

To jessicah: time_t is an int32, as defined in time.h. That's a value in seconds, not microseconds. So whereever this information comes from, it's wrong.

To AGMS: the strange handling of last_modified is inherited from BeOS. They used a time_t, shifted it around, and or'd some random value in the lower bits to keep the number of duplicates low, as duplicates are handled rather inefficiently in BFS. Haiku does a little bit better as it doesn't use a random value, but actually a bigtime_t, and the lower bits from that value -- you lose some, but not all microsecond information this way.

comment:11 by AGMS, 8 years ago

Looks like Haiku is consistent in using shifted units, "last_modified>=%today%" works.

Though that means you can't do a query with a time_t value (seconds since 1970) as a constant, as you can in BeOS (no more using the output from "date +%s"). Not a big loss.

comment:12 by axeld, 8 years ago

%today% is parsed via parsedate(), and returns a time_t. IIRC the query is run against the time_t value, not the crippled bigtime_t one, so using date +%s should work just fine.

comment:13 by AGMS, 8 years ago

Testing... Yes, it does work, last_modified can use number of seconds since 1970.

comment:14 by jessicah, 8 years ago

https://www.haiku-os.org/legacy-docs/bebook/TheSupportKit_Constants.html#Constants_Type_Codes

I'm surprised we use a 32-bit int here though; given how long releases take and that many other OSes use a 64-bit int already, we may run into the year 2038 problem rather quickly ;) haha

Note: See TracTickets for help on using tickets.