#9529 closed enhancement (fixed)
Use native text searching instead of grep
Reported by: | X512 | Owned by: | phoudoin |
---|---|---|---|
Priority: | normal | Milestone: | Unscheduled |
Component: | Applications/TextSearch | Version: | R1/Development |
Keywords: | Cc: | phoudoin | |
Blocked By: | Blocking: | ||
Platform: | All |
Description
Currently TextSearch works very slowly because starting and quitting the grep team is a big overhead. It would be good to have a native search engine. For example, Pe's multi-file search is much faster than the built-in Haiku app TextSearch. Pe can find text in the Haiku sources in a reasonable time.
Change History (17)
comment:1 by , 12 years ago
comment:2 by , 12 years ago
Given the reasoning for a native solution, it doesn't make sense to use ack at all.
comment:3 by , 12 years ago
It depends on why grep is constantly stopped and started.
Taking a very quick look at the code for TextSearch it calls grep individually for every file, in which case I can see why it is so slow. Since barely any of grep's features are used it does seem stupid to use it this way, when a native solution with PCRE for regular expressions would likely be much faster.
But if the usage of grep or ack could be made more intelligent (such as taking advantage of the recursion options) and if popen was used instead of redirecting to a file to get the results(!!!), I bet it could be made much faster without having to recreate grep natively.
Of course if Pe already includes a multi-file, regular expression supporting search, then maybe that could be extracted into a small system library which could be used by various things, including TextSearch. But you know how I hate recreating code and repeating things.
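The popen idea from this comment can be sketched roughly as follows. This is an illustration in Python, not TextSearch's actual code: the function name grep_tree is hypothetical, and it assumes a grep that supports the -r, -n, -H and -I options (GNU grep does). It starts grep once for a whole directory tree and reads the matches from a pipe as they arrive, instead of redirecting them to a file.

```python
import subprocess


def grep_tree(pattern, root):
    """Yield grep's raw "path:line:text" output lines for one recursive run.

    Hypothetical helper for illustration only.
    """
    # -r: recurse into directories, -n: print line numbers,
    # -H: always print the file name, -I: skip binary files.
    cmd = ["grep", "-r", "-n", "-H", "-I", "-e", pattern, "--", root]
    with subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,
        text=True,
        errors="replace",
    ) as proc:
        # Lines are consumed as grep produces them; no temporary file.
        for line in proc.stdout:
            yield line.rstrip("\n")
```

A caller could display each yielded line as soon as it arrives, so results show up while grep is still walking the tree.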
comment:4 by , 11 years ago
Cc: | added |
---|
comment:5 by , 10 years ago
Milestone: | R1 → Unscheduled |
---|---|
Owner: | changed from | to
Status: | new → assigned |
comment:7 by , 10 years ago
Resolution: | fixed |
---|---|
Status: | closed → reopened |
Commit reverted in hrev48971.
follow-up: 9 comment:8 by , 10 years ago
This was reverted "As per discussion on the ML."
What mailing list would that be? I haven't seen any discussion on any mailing list.
comment:9 by , 10 years ago
comment:10 by , 10 years ago
For the record, the problems with the reverted code:
- It was reading the whole file to memory, which could fail for big files
- It did not support using regular expressions for searching, only plain text search
The better way to fix this is to run grep once for all the files, instead of once for each file. Or, if we really want the tool to run without grep, it needs to work in a similar way: support regular expressions, and read each file one line at a time instead of loading the whole file into memory. This is, I think, more work than needed, unless an existing library can do the work?
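As a rough sketch of what "similar to grep" means here (regular expressions, one line at a time), the following Python snippet may help. It is not the reverted code and not a proposal for TextSearch's implementation; the name search_lines and the UTF-8 decoding choice are assumptions made for the example.

```python
import re


def search_lines(path, pattern, ignore_case=False):
    """Yield (line_number, line) for each line matching a regular expression.

    Hypothetical helper for illustration only.
    """
    regex = re.compile(pattern, re.IGNORECASE if ignore_case else 0)
    # Iterating over the file object reads it line by line through a buffer,
    # so the whole file is never held in memory at once.
    with open(path, "r", encoding="utf-8", errors="replace") as handle:
        for number, line in enumerate(handle, start=1):
            if regex.search(line):
                yield number, line.rstrip("\n")
```

Memory use is bounded by the longest line rather than by the file size, which addresses the first problem listed above; the regular expression support addresses the second.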
comment:11 by , 10 years ago
What solution is used in Pe? Can it be moved into a separate library and reused by both Pe and TextSearch?
comment:12 by , 9 years ago
Well, according to:
- Pe's FindInFiles reads the whole file into memory, too :-\
- but it does support regular expressions.
Back to square one.
comment:13 by , 9 years ago
Owner: | changed from | to
---|---|
Status: | reopened → assigned |
deassigning various things from me
comment:14 by , 7 years ago
Maybe an intermediate solution could be used here: TextSearch could pipe the null-terminated list of file names to search into this command:
xargs --null grep SEARCH_PATTERN
That would leverage grep's power without having to rewrite it in TextSearch... We could even parallelize the search that way:
xargs --null --max-procs=NB_CPU grep -n SEARCH_PATTERN
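To make the pipeline concrete, here is a small Python sketch of the same idea. It is an illustration under stated assumptions, not the code that went into TextSearch: grep_files is a hypothetical name, -0 is the short form of --null, and -H is added so that grep prints the file name even when xargs hands it a single file.

```python
import os
import subprocess


def grep_files(pattern, paths, ignore_case=False, fixed=False):
    """Search all paths through one xargs + grep pipeline.

    Hypothetical helper for illustration only. Returns grep's output lines
    in "path:line:text" form.
    """
    if not paths:
        return []
    grep_cmd = ["grep", "-n", "-H"]
    if ignore_case:
        grep_cmd.append("-i")
    if fixed:
        grep_cmd.append("-F")  # plain text search, no regular expression
    grep_cmd += ["-e", pattern]
    # NUL-terminated names survive spaces and newlines in file names.
    names = b"".join(os.fsencode(p) + b"\0" for p in paths)
    proc = subprocess.run(
        ["xargs", "-0"] + grep_cmd,
        input=names,
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,
    )
    # The exit status is ignored on purpose: xargs reports a non-zero
    # status when a grep invocation finds no match, which is not an error.
    return proc.stdout.decode("utf-8", errors="replace").splitlines()
```

The process count then depends on how many names xargs fits on one command line, not on the number of files. The sketch leaves out the parallel variant, since the output of concurrent grep processes can interleave.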
comment:15 by , 7 years ago
Owner: | changed from | to
---|---|
Status: | assigned → in-progress |
comment:16 by , 7 years ago
Implemented piping all filenames at once to xargs + grep in hrev51525
Searching "houdoin" case insensitive, plain text on Haiku's src root folder:
Implementation | Duration |
---|---|
Previous: | 708s (11'48s) |
Newer: | 14s |
Should work for everything, not only my name :-)
comment:17 by , 7 years ago
Resolution: | → fixed |
---|---|
Status: | in-progress → closed |
While this is probably a pretty reasonable enhancement request, I would like to recommend the tool "ack" for searching through source code:
http://betterthangrep.com/
The standalone version is very easy to install into Haiku by just downloading and copying it to ~/config/bin, then making it executable:
http://betterthangrep.com/ack-standalone
In fact another way to improve TextSearch would just be to use ack, or have it as an option.