GUADEC redux and roadmap



Hey guys,

As you can see from my flood of emails earlier today, I am digging
myself out from underneath a mountain of backlogged emails, blog posts,
and free-agent hockey news. :)

My talk and the BOF on Beagle at GUADEC both went very well, I think.
Lots of people asked good questions, and a few came up to me afterward
to talk about how they want to integrate Beagle with their apps.  Very
exciting.

I also got the chance to meet Daniel Drake, Chris Orr, and Max Wiehle
and see Fredrik Hedberg again.  It was nice meeting/seeing all of you.

There are supposed to be archives of all the talks, but the site
referenced off the GUADEC page doesn't seem to exist, so I don't know
if the talks will ever be up.  Generally speaking I don't like to put
slides up because they're often useless without the context of the
speaker.  Also, my laptop appears to be broken, so I can't get the
slides off of it right now.  I'll post them once I can get into my
machine.

A couple of weeks ago Kevin put up a good roadmap document for 0.2.8,
here: http://beagle-project.org/RoadMap.  I wanted to give a heads up on
what my larger-scale plans are.  Probably none of these will make it
into 0.2.8, unfortunately, but hopefully they won't be too far behind:

        * Beagle on Battery - Right now Beagle will throttle itself and
        index slowly if you are on battery power.  Unfortunately, most
        people want Beagle to not index at all in such a case.  I need
        to investigate what will be involved in that and implement it.
        Not a ton of work, but I need to figure out what the right
        behavior is.
        
        * Metadata store - Jon and I talked about this at GUADEC and
        after thinking about it for a few days, I think we have an
        approach that will work.  Whether or not it scales is yet to be
        seen, however.  Right now we are storing all of our metadata in
        Lucene.  This works fine for largely immutable text-based
        values, but doesn't work well when you want to deal with
        external metadata (think Nautilus emblems, Leaftag or F-Spot
        tags, etc.).  The plan currently is to move storing of all
        metadata into a database and use Lucene only for its text
        indexing capabilities.  This will allow us to (a) adapt better
        to changes in external metadata stores, (b) store metadata
        ourselves on behalf of applications, and (c) make keyword
        searches simpler.  I started prototyping something on my laptop
        on the plane; I'll need to play with it a little bit more if I
        can get my laptop going again.
        
        * Using a single pool for indexes - Right now there is one index
        per backend.  This was fine initially, but when you have a mail
        index with over 400,000 emails and a tomboy index with only 9
        notes, you can see how distributing them more evenly would be
        much more efficient.  I want to change the way we store our
        indexes so that all backends simply write to one index layer
        that distributes the data evenly.  I also think this will
        help us a *lot* on memory usage as the number of backends
        continues to grow.
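
To make the single-pool idea above a bit more concrete, here is a rough
sketch in Python (Beagle itself is C#, so this is purely illustrative).
The IndexPool class, its fixed shard count, and the "put new documents
in the least-loaded shard" rule are all assumptions made up for this
example, not Beagle code or a settled design.

```python
class IndexPool:
    """Hypothetical pooling layer: backends write here, not to their own index."""

    def __init__(self, shard_count):
        if shard_count < 1:
            raise ValueError("shard_count must be at least 1")
        # Each shard stands in for one on-disk index: uri -> (backend, text).
        self._shards = [dict() for _ in range(shard_count)]
        # Remembers which shard holds each uri so updates and removals find it.
        self._location = {}

    def add(self, backend, uri, text):
        if uri in self._location:
            # Re-indexing an existing document keeps it in the same shard.
            shard = self._location[uri]
        else:
            # New documents go to whichever shard is currently smallest.
            shard = min(range(len(self._shards)),
                        key=lambda i: len(self._shards[i]))
            self._location[uri] = shard
        self._shards[shard][uri] = (backend, text)

    def remove(self, uri):
        shard = self._location.pop(uri, None)
        if shard is not None:
            del self._shards[shard][uri]

    def shard_sizes(self):
        return [len(shard) for shard in self._shards]


pool = IndexPool(shard_count=4)
for n in range(400):
    pool.add("mail", "email://inbox/%d" % n, "message body %d" % n)
for n in range(9):
    pool.add("tomboy", "note://tomboy/%d" % n, "note text %d" % n)
print(pool.shard_sizes())
```

With four shards, the 400 mail documents and 9 notes end up within one
document of each other per shard, rather than one huge index and one
tiny one.  A real version would also have to decide how a query fans
out across the shards, which this sketch leaves out.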
        
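For the metadata store item above, the split between "database for
metadata" and "Lucene for text only" might look roughly like the
following Python sketch.  Everything here is an assumption for
illustration: SQLite stands in for whatever database ends up being
used, the toy TextIndex class stands in for Lucene, and the schema and
function names are hypothetical.

```python
import sqlite3
from collections import defaultdict


class TextIndex:
    """Toy inverted index standing in for Lucene; it only knows about text."""

    def __init__(self):
        self._postings = defaultdict(set)

    def add(self, uri, text):
        for term in text.lower().split():
            self._postings[term].add(uri)

    def search(self, term):
        # Return a copy so callers can modify the result freely.
        return set(self._postings.get(term.lower(), set()))


class MetadataStore:
    """Mutable key/value metadata kept in a database, outside the text index."""

    def __init__(self):
        self._db = sqlite3.connect(":memory:")
        self._db.execute(
            "CREATE TABLE metadata (uri TEXT, key TEXT, value TEXT, "
            "PRIMARY KEY (uri, key, value))"
        )

    def set(self, uri, key, values):
        # Replace every value for (uri, key) in one transaction.
        # The text index is never touched, so nothing gets re-indexed.
        with self._db:
            self._db.execute(
                "DELETE FROM metadata WHERE uri = ? AND key = ?", (uri, key)
            )
            self._db.executemany(
                "INSERT OR IGNORE INTO metadata VALUES (?, ?, ?)",
                [(uri, key, value) for value in values],
            )

    def find(self, key, value):
        rows = self._db.execute(
            "SELECT uri FROM metadata WHERE key = ? AND value = ?", (key, value)
        )
        return {row[0] for row in rows}


def search(text_index, metadata, term, key=None, value=None):
    """Text hits, optionally narrowed by an exact keyword match."""
    hits = text_index.search(term)
    if key is not None:
        hits &= metadata.find(key, value)
    return hits
```

The point of the split is that changing a tag or emblem becomes a
database update only; the document's text never has to be re-indexed,
and a keyword search is a plain lookup instead of a Lucene query.
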
There are also various bugs to look into.  There are reports that Beagle
isn't indexing all files in extremely large directories with thousands
of files; we're optimizing indexes (in many cases very large indexes)
more often than we need to; and the gaim log backend should be changed
to be an indexable generator.  Those would also be good to tackle if
someone has time.

I'm also planning on writing some docs: some high-level block diagrams
on the Beagle architecture, reviewing and cleaning up the filter docs,
and writing a simple tutorial on how to write a backend.

Joe



