ANNOUNCE: Beagle 0.2.0

From: Joe Shaw <joeshaw novell com>
To: dashboard-hackers gnome org
Cc: gnome-announce-list gnome org
Subject: ANNOUNCE: Beagle 0.2.0
Date: Fri, 20 Jan 2006 21:29:50 -0500
Hi,

I'm pleased to announce the release of Beagle 0.2.0.

This version of Beagle features a brand new user interface, formerly
codenamed Holmes, which is a vast improvement to the user experience
over the older UI, Best.  This new UI features native widgets, a more
effective use of screen space, grouping by document type, and much,
much more.

You can see a screenshot of the new beagle-search in action here:
http://beagle-project.org/images/5/54/Beagle-search.png

As a side effect of this new user interface, the dependencies on
gecko-sharp 2.0 and a Mozilla runtime (either Mozilla, Firefox, or
XULrunner) have been dropped.

This version also features a plethora of bug fixes and myriad new
features.  Among them, the dreaded tmpfile bug has been fixed, and
memory usage should be much improved in this release.


OUR MANY URLS
-------------

To download the 0.2.0 tarball or learn more, visit the Beagle wiki at:
http://www.beagle-project.org

The latest gossip is available at:
http://www.planetbeagle.org

Nat Friedman made some cool movies that demonstrate Beagle in action:
http://nat.org/demos

We still talk about Beagle on the dashboard-hackers mailing list:
http://mail.gnome.org/mailman/listinfo/dashboard-hackers

In 1835 and 1836, Ohio and Michigan fought a war over a strip of land
which included the city of Toledo:
http://en.wikipedia.org/wiki/Toledo_War


WHAT IS BEAGLE?
---------------
 
Beagle is a tool for indexing and searching your data.  Beagle is improving
rapidly on many fronts, and should work well enough for everyday use.
 
The Beagle daemon transparently monitors your data and updates the index
to reflect any changes.  On an inotify-enabled system, these updates happen
more-or-less in real time.  So for example,
 
* Files are immediately indexed when they are created, are re-indexed
  when they are modified, and are dropped from the index upon deletion.
* E-mails are indexed upon arrival.
* IM conversations are indexed as you chat, a line at a time.
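
As a rough illustration of the create/modify/delete behaviour listed above,
here is a minimal Python sketch of an index that reacts to file events.  It is
not Beagle's code (Beagle is written in C# on top of Lucene and inotify); the
ToyIndex class and its method names are hypothetical, and the "index" is just
an in-memory dictionary.

```python
from collections import defaultdict


class ToyIndex:
    """Hypothetical in-memory inverted index keyed by file path."""

    def __init__(self):
        self.terms_by_path = {}                 # path -> set of terms
        self.paths_by_term = defaultdict(set)   # term -> set of paths

    def on_created(self, path, text):
        # Index a new file: record each lowercased word against its path.
        terms = set(text.lower().split())
        self.terms_by_path[path] = terms
        for term in terms:
            self.paths_by_term[term].add(path)

    def on_deleted(self, path):
        # Drop every term that pointed at the deleted file.
        for term in self.terms_by_path.pop(path, set()):
            self.paths_by_term[term].discard(path)

    def on_modified(self, path, text):
        # Re-indexing is simply drop-then-add.
        self.on_deleted(path)
        self.on_created(path, text)

    def search(self, term):
        return sorted(self.paths_by_term.get(term.lower(), set()))
```

In a real setup the three on_* methods would be driven by file system
notifications rather than called by hand.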

Beagle also provides Firefox and Epiphany extensions that allow web pages
to be indexed as the user visits them.

Beagle uses the Lucene indexing system from the prodigious Doug
Cutting.

Beagle includes a GTK-based graphical tool for searching the index
that the daemon creates.  This application doesn't query the index
directly; it passes the search terms to the daemon and the daemon
sends any matches back.  The user interface then renders the results
and allows you to perform useful actions on the matching objects.

Indexing your data requires a fair amount of computing power, but the Beagle
daemon tries to be as unobtrusive as possible.  It contains a scheduler that
works to prioritize tasks and control CPU usage, based on whether or not
you are actively using your workstation.
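
To make the scheduling idea concrete, here is a small Python sketch of a
priority queue that holds back non-urgent work while the user is active.  This
is an assumption-laden toy, not Beagle's scheduler: the ToyScheduler class, the
three priority levels, and the user_is_active flag are all invented for
illustration.

```python
import heapq
import itertools


class ToyScheduler:
    IMMEDIATE, DELAYED, IDLE = 0, 1, 2  # hypothetical priority levels

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # tie-breaker keeps FIFO order

    def add(self, priority, name):
        heapq.heappush(self._heap, (priority, next(self._counter), name))

    def next_task(self, user_is_active):
        """Return the next task name to run, or None to stay quiet."""
        if not self._heap:
            return None
        priority, _, name = self._heap[0]
        # While the user is busy, hold back everything but immediate work.
        if user_is_active and priority != self.IMMEDIATE:
            return None
        heapq.heappop(self._heap)
        return name
```

A caller would poll next_task() in a loop, passing in whatever signal it has
for user activity, and sleep whenever it gets None back.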


DEPENDENCY HECK
---------------

Beagle has many dependencies, and thus can be difficult to compile.
It requires:
* Mono 1.1.10 or better, along with the full Mono stack
* gtk-sharp 2.3.90 or better
* GMime 2.1.19
* Libexif 0.5.7 or better

For the best possible Beagle experience, you should also have:
* Mono 1.1.13
* Evolution-sharp 0.10.2
* libgsf 1.12.1 and gsf-sharp 0.6
* Either wv 1.2.0, or a *patched* wv 1.0.3 --- the patch is available from
  http://users.avafan.com/~fredrik/beagle/wv-libole2-readonly.patch
* An inotify 0.24-enabled kernel.  Inotify is in the mainline Linux
  kernel as of 2.6.13.


CHANGES SINCE 0.1.4
-------------------

UI/Tools:
* New user interface, named beagle-search.  (Lukas Lipka, Dan Winship,
  Fredrik Hedberg, Joe Shaw)
* Old user interface (Best) removed.  (Dan)
* Explicitly blacklist the documentation index when searching using
  beagle-search.  (Joe)
* Fix up the beagle-extract-content testing tool to clean up after
  tmpfiles.  (Joe)
* Many fixes to the bludgeon testing tool.  (Joe)

Daemon/Infrastructure:
* Add a one minute delay in starting the indexing process when the
  daemon is started.  (Joe)
* FINALLY fix the dreaded stale tmpfile bug.  (Joe)
* Cache Lucene IndexReaders whenever we can, drastically reducing the
  number of allocations made and saving memory.  (Joe)
* Move to XdgMime for mime type detection.  (Bera)
* Fix a compatibility problem with Mono 1.1.11 and newer that caused
  settings to not be loaded correctly.  (Joe)
* Fix a pegged CPU and memory problem when loading the text cache for
  some files with non-ASCII characters.  (Joe)
* Always use lowercase for file extension queries.  (Bera)
* Allow clients to set the QueryDomain they wish to search in.  (Joe)
* When generating snippets, get six prior and following words rather
  than two for better context.  (Joe)
* Store search-only properties as unstored Lucene fields.  (Bera)
* Use a special namespace to hint filters instead of a stored
  property.  (Bera)
* Fix an exception in the inotify code when a file is deleted before a
  change can be handled.  (Joe)
* Fix a bug in which a stream was being saved to disk twice in the
  middle of filtering.  (Joe)
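
The snippet change above (six words of context on each side instead of two)
can be sketched in a few lines of Python.  The snippet() function and its
parameters are hypothetical; Beagle's real snippet code is C# and handles
highlighting and multiple hits, which this sketch does not.

```python
def snippet(words, hit_index, context=6):
    """Return the matched word plus up to `context` words on each side."""
    start = max(0, hit_index - context)
    end = min(len(words), hit_index + context + 1)
    return " ".join(words[start:end])
```

Passing context=2 reproduces the older, narrower window, which makes the
difference in surrounding text easy to compare.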

Backends:
* Fix a nasty memory inefficiency when crawling over read-only files
  in the file system backend.  (Joe)
* Use GMime's new StreamWrapper in the Evolution and KMail backends,
  which should dramatically reduce allocations and save memory.  (Joe)
* Remove gmime dependency from Akregator backend.  (Bera)
* Update Blam backend to match the current file format.  (Bera)
* Read KMail folder locations from the kmailrc file.  (Bera, Vaclav
  Slavik)
* Changed KMail folder detection to allow stale index files.  (Bera)

Filters:
* Also use GMime's StreamWrapper in the mail filter.  (Joe)
* Store email addresses as non-keywords, so that searching for just
  email addresses works again.  (Joe, Bera)
* Break up email addresses into fragments, so that searching for "foo"
  or "bar" will match "foo@bar.com".  (Joe)
* Update PNG filter to be entirely managed and extract more metadata.
  (Larry Ewing)
* Update JPEG and PNG filters to extract embedded XMP data.  (Larry)
* Added Ruby Filter.  (Uwe Hermann)
* Store message ID and references from email for tracking
  conversations.  (Bera)
* Truncate indexing of shell scripts to 20k.  (Joe)
* Touchups to the Source and OpenOffice filters.  (Lukas)
* Index "Type" in .desktop files.  (Bera)
* Use the "meta" namespace for meta tags in HTML files.  (Bera)
* Fix some memory management issues in mail filter.  (Joe)
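
For the email address fragment change above, a minimal Python sketch of the
idea follows.  The exact splitting rules Beagle uses are not given in this
announcement, so splitting on every non-alphanumeric character is an
assumption, and address_fragments() is a hypothetical name.

```python
import re


def address_fragments(address):
    """Split an address into lowercase searchable pieces (assumed rule)."""
    parts = re.split(r"[^A-Za-z0-9]+", address)
    return [part.lower() for part in parts if part]
```

Indexing these fragments alongside the full address is what lets a query for
just the local part or just the domain find the message.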

Bindings:
* Many libbeagle updates.  (Joe, Bera)
* Many Python binding updates.  (Bera)
* Added Python example script.  (Raphael Slinckx)

Translations:
* Updated Canadian English translation.  (Adam Weinberger)
* Updated Danish translation.  (Lasse Bang Mikkelsen)
* Updated Dutch translation.  (Tino Meinen)
* Updated Hungarian translation.  (Gabor Kelemen)
* Updated Finnish translation.  (Ilkka Tuohela)
* Updated Norwegian Bokmål translation.  (Øivind Hoel)
* Updated Vietnamese translation.  (Clytie Siddall)

Everything Else:
* Remove gecko-sharp2 and Mozilla dependencies.  (Dan)
* Detect Mono version to work around SharpZipLib compatibility.  (Joe)

KNOWN ISSUES
------------

Yes, we know we use too much memory.  We are working on it.

Extreme spikes in memory usage have been observed in some cases.  Certain
extremely large documents (particularly large HTML files) can temporarily
degrade your system's performance while they are being indexed.  In most
of these cases, the memory is reclaimed by the system relatively quickly after
the document is indexed.  There are other still-unexplained cases of excessive
memory use, particularly on SMP systems.

The file system backend is now much more robust than ever before.  However, there
are still race conditions that can occur with certain combinations of
file system operations.  In some cases it might be necessary to stop and
restart the daemon.

The CHM filter has been disabled for this release because the HTML
filter it is based upon has changed, and it has not been updated.

At this point in development, we cannot commit to stable APIs or file formats.
You will almost certainly need to delete your indexes and start again at some
point in the future.