Re: Use beagle to read (eh, grep) source code



Hi,

Nice work!

If you're comfortable hacking around in the Beagle codebase, you could
probably make the tokenizer (called an analyzer in Lucene parlance)
act more appropriately for code so that things like underscores aren't
stripped out.  Take a look at the BeagleAnalyzer class in
beagled/LuceneCommon.cs for more info.
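
As a rough illustration of the difference (plain shell, not Beagle's or
Lucene's actual analyzer code; the function names tokenize_plain and
tokenize_code are made up for this example), compare splitting on every
non-alphanumeric character with splitting that keeps underscores inside
a token:

```shell
# Split input into tokens on every character that is not a letter or
# digit, so an identifier like ENGLISH_STOP_WORDS becomes three tokens.
tokenize_plain() {
    printf '%s\n' "$1" | tr -cs '[:alnum:]' '\n' | sed '/^$/d'
}

# Same idea, but '_' counts as part of a token, so identifiers that
# contain underscores survive as a single token.
tokenize_code() {
    printf '%s\n' "$1" | tr -cs '[:alnum:]_' '\n' | sed '/^$/d'
}

tokenize_plain 'ENGLISH_STOP_WORDS = x;'
tokenize_code 'ENGLISH_STOP_WORDS = x;'
```

With the second style of tokenizing, a query for the whole identifier
could match directly instead of having to be split into separate words.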

Joe

On Wed, May 19, 2010 at 11:12 AM, Haojun Bao <baohaojun gmail com> wrote:
> Hi, all
>
> Beagle put grep on steroids for me :-) Thanks a lot, y'all beagle hackers!
>
> The idea is simple and practical: run beagle-static-query first, then use
> grep on the results.
>
> For example, to grep for "ENGLISH_STOP_WORDS" in the beagle source code, I
> first run beagle-static-query:
>
>    beagle-static-query\
>     --add-static-backend /src/beagle/.beagle\
>     --backend none\
>     --max-hits 100000\
>     'ENGLISH STOP WORDS'
>
> (note how I figured out the `_' character should be removed when
> beagling:-)
>
> Then I grep for the original regexp only in the following files, because
> beagle has already determined that these are the only files containing
> all three words of 'ENGLISH STOP WORDS':
>
>    /src/beagle/beagled/ExtractContent.cs
>    /src/beagle/beagled/LuceneCommon.cs
>    /src/beagle/beagled/Lucene.Net/Analysis/Standard/StandardAnalyzer.cs
>    /src/beagle/beagled/Lucene.Net/Analysis/StopAnalyzer.cs
>    /src/beagle/beagled/Snowball.Net/Lucene.Net/Analysis/Snowball/SnowballAnalyzer.cs
>    /src/beagle/NEWS
>
> This way, even with the ~2 gigabytes of Android source code, you can
> usually grep and get the results in a few seconds. Best of all, it works
> not only with source code, but with any text files.
>
> If you are interested, the source code is at
>
>   git://github.com/baohaojun/windows-config.git
>
> And there's a detailed README at http://github.com/baohaojun/windows-config/raw/master/gcode/beagle/beagle-grep-readme.org
>
>
> _______________________________________________
> dashboard-hackers mailing list
> dashboard-hackers gnome org
> http://mail.gnome.org/mailman/listinfo/dashboard-hackers
>
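
The second stage of the approach described in the quoted message can be
sketched roughly as below. This is only a sketch under assumptions: the
helper name grep_candidates is made up, and it expects one plain file
path per line on stdin. Whatever is needed to turn the actual
beagle-static-query output into such a list is not shown, and the -H
option relies on GNU or BSD grep.

```shell
# Read candidate file paths on stdin (one per line) and run the exact
# pattern against those files only, instead of against the whole tree.
grep_candidates() {
    pattern=$1
    while IFS= read -r file; do
        # The index may list files that no longer exist; skip those.
        if [ -f "$file" ]; then
            # -H prints the file name, -n the line number. A file with
            # no match is not an error here, hence the "|| true".
            grep -Hn -e "$pattern" -- "$file" || true
        fi
    done
}

# Hypothetical use, assuming the query prints plain paths:
#   beagle-static-query \
#       --add-static-backend /src/beagle/.beagle \
#       --backend none \
#       --max-hits 100000 \
#       'ENGLISH STOP WORDS' | grep_candidates 'ENGLISH_STOP_WORDS'
```

The index lookup only narrows the file list; grep still does the exact
matching, so files that contain the three words separately but not the
identifier itself drop out in this step.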

