Beagle 0.0.7



I'm pleased to announce the release of Beagle 0.0.7.

This is the first version of Beagle that will work without an inotify-enabled
kernel.  If inotify is not available, Beagle will watch a few key
directories with FAM and will crawl the rest.
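
As a rough illustration of that fallback, here is a minimal sketch in Python.
It is not Beagle's actual code (Beagle is written in C#); the function name
plan_monitoring and the explicit key_dirs argument are assumptions made up
for this example.

```python
import os

def plan_monitoring(root, key_dirs, have_inotify):
    """Split the directories under root into (watched, crawled).

    With inotify, every directory is watched.  Without it, only the
    directories listed in key_dirs are watched (standing in for FAM)
    and everything else is left to a periodic crawl.
    """
    key = {os.path.abspath(d) for d in key_dirs}
    watched, crawled = [], []
    for dirpath, _dirnames, _filenames in os.walk(root):
        path = os.path.abspath(dirpath)
        if have_inotify or path in key:
            watched.append(path)
        else:
            crawled.append(path)
    return sorted(watched), sorted(crawled)
```

Calling plan_monitoring(home, [documents_dir], have_inotify=False) would put
only documents_dir in the watched list and leave every other directory to
the crawler.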


OUR MANY URLS
-------------

To download the 0.0.7 tarball, visit the Beagle web page at
http://www.gnome.org/projects/beagle

There is lots of useful information about compiling and using Beagle on the
wiki:
http://www.beaglewiki.org

If you are running SuSE or the Novell Linux Desktop, we have an open carpet
server with snapshots and packages for all of the dependencies:
http://segfault.cam.novell.com

Joe Gasiorek writes a regular Beagle newsletter.  You can read it at:
http://www.beaglewiki.org/index.php/Newsletters

Nat Friedman made some cool movies that demonstrate Beagle in action:
http://nat.org/demos

The latest gossip is available at:
http://www.planetbeagle.org

We still talk about Beagle on the dashboard-hackers mailing list:
http://mail.gnome.org/mailman/listinfo/dashboard-hackers

This year marks the centenary of Jules Verne's death:
http://en.wikipedia.org/wiki/Jules_Verne


WHAT IS BEAGLE?
---------------
 
Beagle is a tool for indexing and searching your data.  It is in an early
stage of development and should be considered experimental.  Beagle is
improving rapidly on many fronts, but it is not yet stable enough for
full-time, everyday use.
 
The Beagle daemon transparently monitors your data and updates the index
to reflect any changes.  On an inotify-enabled system, these updates happen
more-or-less in real time.  So for example,
 
* Files are immediately indexed when they are created, are re-indexed
  when they are modified, and are dropped from the index upon
  deletion.
* E-mails are indexed upon arrival.
* IM conversations are indexed as you chat, a line at a time.
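
The file life cycle in the first bullet can be pictured with a toy in-memory
index.  This is only a sketch in Python: the class ToyIndex and its method
names are hypothetical, and the real index is kept by Lucene, not by a
dictionary.

```python
class ToyIndex:
    """In-memory stand-in for the real index: maps a path to its words."""

    def __init__(self):
        self.docs = {}

    def on_created(self, path, text):
        # A new file is indexed as soon as the event arrives.
        self.docs[path] = set(text.lower().split())

    def on_modified(self, path, text):
        # Re-indexing replaces the old entry wholesale.
        self.docs[path] = set(text.lower().split())

    def on_deleted(self, path):
        # A deleted file is dropped; unknown paths are ignored.
        self.docs.pop(path, None)

    def search(self, word):
        word = word.lower()
        return sorted(p for p, words in self.docs.items() if word in words)
```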
 
Beagle uses the Lucene indexing system from the prodigious Doug
Cutting.

Best is a graphical tool for searching the index that the daemon creates.
Best doesn't query the index directly; it passes the search terms to the
daemon and the daemon sends any matches back to Best.  Best then renders the
results and allows you to perform useful actions on the matching objects.

Indexing your data requires a fair amount of computing power, but the Beagle
daemon tries to be as unobtrusive as possible.  It contains a scheduler that
works to prioritize tasks and control CPU usage, based on whether or not
you are actively using your workstation.
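
To give a feel for the idea, here is a minimal sketch of such a scheduler in
Python.  It is an illustration only, not the daemon's implementation: the
class name ToyScheduler, the priority numbers and the two-second delay are
all assumptions chosen for the example.

```python
import heapq
import itertools

class ToyScheduler:
    """Hands out the most urgent task, slowing down while the user is busy."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # tie-breaker keeps FIFO order

    def add(self, priority, name):
        # Lower numbers are more urgent.
        heapq.heappush(self._heap, (priority, next(self._counter), name))

    def next_task(self, user_is_active):
        """Return (name, delay_seconds), or None if no work is queued."""
        if not self._heap:
            return None
        _priority, _seq, name = heapq.heappop(self._heap)
        # Hypothetical policy: wait before each task while the user is
        # active, run flat out when the workstation is idle.
        delay = 2.0 if user_is_active else 0.0
        return name, delay
```

The caller would sleep for the returned delay before running the task, so
indexing backs off while you are working and catches up when you step away.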


DEPENDENCY HECK
---------------

Beagle has many dependencies, and thus can be difficult to compile.
It requires:
* The full Mono stack, including Gtk#. (We all use 1.1.4, and you probably
  should too, but 1.0.x will also work.)
* D-BUS 0.23.2
* Evolution-sharp 0.6
* Gecko-sharp
* Gsf-sharp
* Gmime

For the best possible Beagle experience, you should also have:
* An inotify 0.18-enabled kernel


CHANGES SINCE 0.0.6.1
---------------------

Backends:
* All backends now work without inotify (Jon Trowbridge, Fredrik Hedberg)
* Lots of file system backend improvements (Jon)
* Deal with missing Evo mail directories (Joe Shaw, Lukas Lipka,
  Daniel Drake)
* Detect Evo summary file versions and skip ones we won't support (Joe)
* Launcher backend clean-up (Daniel)
* In the mail backend, don't index mail headers or non-text parts as
  text (Jon)

Daemon/Infrastructure:
* Don't check access before setting EAs (Lukas)
* De-inotification of directory creation (Daniel, Christopher Orr)
* Before we serialize any XML, check for invalid characters (Joe)
* More fixes for filenames containing @ (Jon)
* Fixed logging for helper process (Jon)
* Don't complain so loudly if we can't set EAs on files (Daniel)
* Properly handle --fg and --bg in the beagled script (Jon)
* Look at VmRSS to decide when to restart the helper, not VmSize (Jon)
* Set the helper max memory size relative to the initial footprint (Jon)
* Added environment variables to override where beagle looks for your
  files and where it writes its indexes (Jon)
* TextCache fixes (Jon)
* Reset the stored path if a file already has attributes but appears to have
  been moved, copied or renamed (Jon)
* Properly shut down the helper if beagled terminates after an add
  but before flushing the indexer (Jon)
* Properly deal with multiple entries for one path in the fallback
  file attributes sqlite db (Jon)
* While indexing, filter out stuff that is obviously not text (Jon)
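
The two helper-memory items above can be sketched as follows.  On Linux,
VmSize in /proc/<pid>/status counts all mapped virtual address space, while
VmRSS counts only the pages resident in physical memory.  This Python sketch
is not the daemon's code: the function names and the growth_factor of 3.0
are assumptions for illustration.

```python
def parse_vm_rss_kb(status_text):
    """Return the VmRSS value in kB from /proc/<pid>/status text, or None."""
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            return int(line.split()[1])
    return None

def should_restart(initial_rss_kb, current_rss_kb, growth_factor=3.0):
    """True once resident memory exceeds growth_factor times the
    footprint measured at start-up (growth_factor is hypothetical)."""
    return current_rss_kb > initial_rss_kb * growth_factor
```

Because the limit is relative to the initial footprint, the same rule works
whether the helper starts out small or large.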

Filters:
* Greatly improved handling of .ppt files (Veerapuram Varadhan)
* Strip junk characters from .doc files (Varadhan)
* Support odt format in OpenOffice filter (Varadhan)
* Index contents of hyperlink fields in OpenOffice documents (Varadhan)
* Handle exif dates gracefully (Daniel)
* Support both libexif 0.5 and 0.6 (Daniel)
* Fail gracefully if an exception is thrown while a filter is pulling
  text (Jon)
* Don't index text/plain files that are suspiciously large (Jon)
* Lots of misc. clean-up, fixes for compile-time warnings, etc. (Varadhan)

UI:
* Print out nice error messages on dbus exceptions in beagle-query
  (Daniel)
* If we're unable to launch a process from Best, don't crash (Joe)
* Show snippets in presentation tiles (Jon)

Everything Else:
* D-BUSology (Joe)
* Wiki (Joe Gasiorek)
* Newsletter (Joe G.)
* Fixed spelling of Tom von Schwerdtner's name... sorry! (Jon)
* All the stuff I forgot (All the people I forgot)


KNOWN ISSUES
------------
 
It doesn't take that much ingenuity to confuse the file system backend.
Certain operations are not yet fully implemented -- in particular, the right
thing doesn't happen when you move a file.

The beagle daemon grows over time, using more and more memory...  but it now
grows *much* more slowly than in previously released versions.  It still needs to
be periodically killed and restarted, but the time for it to get too big is
now best measured in days rather than hours.  Our Mono GC issues are mostly
resolved; our problems now seem to be leaks in the C# D-BUS bindings.

D-BUS is not fully thread-safe.  There are race conditions that can cause
beagled to crash.

Sometimes the daemon or its associated helper process fails to shut down
cleanly.  Occasionally you will need to kill a beagle-related process by hand.

At this point in development, we cannot commit to stable APIs or file formats.
You will almost certainly need to delete your indexes and start again at
some point in the future.  In fact, you should probably delete your indexes and
start again when upgrading to 0.0.7: you need to re-index to get the benefit
of some new optimizations.




