[ANNOUNCE] Tracker 0.0.4 "Now indexing at Warp speed"



The next vastly improved version of the Tracker Search engine, Indexer framework and first class object/metadata/tag database is now available at:

http://www.gnome.org/~jamiemcc/tracker/

Tarball : http://www.gnome.org/~jamiemcc/tracker/tracker-0.0.4.tar.gz
Deb : http://www.gnome.org/~jamiemcc/tracker/DEB/tracker_0.0.4_i386.deb
RPM (FC5): http://www.gnome.org/~jamiemcc/tracker/RPM/

All discussion related to tracker happens on the tracker mailing list (http://mail.gnome.org/mailman/listinfo/tracker-list)


INTRODUCTION

Tracker is a powerful desktop-neutral first class object database, tag/metadata database, search tool and indexer.

Tracker is also extremely fast and very efficient with your system's memory compared with some competing frameworks, and is by far the fastest and most memory-efficient Nautilus search and Deskbar backend currently available.

It consists of a common object database that allows entities to have an almost infinite number of properties, metadata (both embedded/harvested as well as user definable), a comprehensive database of keywords/tags and links to other entities.

It provides additional features for file based objects including context linking and audit trails for a file object.

It has the ability to index, store, harvest metadata, retrieve and search all types of files and other first class objects.

First class object support includes : Files, Documents, Music, Images, Videos, Playlists*, Notes*, Applications*, Contacts*, Emails*, Conversations*, Appointments*, Tasks*, Bookmarks*, History* and Projects*

(* these services are not currently indexed but will be in later versions)


More information on Tracker can also be found at http://freedesktop.org/wiki/Software/Tracker


USE CASES

Tracker is the most powerful open source metadata database and indexer framework currently available. Because it is built around a combined indexer and SQL database rather than a dedicated indexer, it has much more powerful use cases:

* Provide search and indexing facilities similar to those on other systems

* Common database storage for all first class objects (EG a common music/photo/contacts/email/bookmarks/history database) complete with additional metadata and tags/keywords

* Comprehensive one stop solution for all applications needing an object database, powerful search (via RDF Query), first class methods, related metadata and user definable metadata/tags

* Can provide a full semantic desktop with metadata everywhere

* Can provide powerful criteria based searching suitable for creating smart file dialogs and vfolder systems

* Can provide a more intelligent desktop using statistical metadata



NEW CHANGES

* Hugely optimised indexing when many files are waiting to be indexed (especially when you first run trackerd)

* Mass queueing of files no longer blocks the main thread meaning super fast searches can still be performed during heavy indexing

* Eliminated cpu bottlenecks and improved thread synchronisations so Tracker now hits the ground running when indexing (approx 500+ files indexed per minute on inotify enabled systems)

* Extended metadata support for more Exif fields

* Added more service types

* Improved build and support for FC5 (includes RPMs)

* Redesigned Database around version 5 of the *embedded* in-process mysql database library

* Now uses the auto repair facility provided by mysql to automatically repair damaged database files so you need never worry about losing your precious data.

* Moved virtually all DB logic into stored procedures, which provides a clean separation of DB logic and application logic

* Added support for parsing dates in various formats including conversion to/from ISO 8601 format

* Fixed MsWord filter to prevent looping (WvText causes inotify to report file write change causing endless looping)

* Redesigned DB structure to be more generic and service orientated

* Added support for service types (first class objects) to DB

* Cleaned up code warnings and fixed potential crasher (thanks to patch from Nate Nielsen)

* Fixed issues with argv handling (thanks to patch from Dan Nicolaescu)

* Fixed build issue for Fedora Core 5 (thanks to patch from Dan Nicolaescu)
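
To illustrate the date handling item above, here is a minimal Python sketch of normalising timestamps in several formats to ISO 8601 and back. It is not Tracker's code (Tracker is written in C); the list of accepted formats and the function names are assumptions chosen for the example, with the colon-separated form being the layout Exif uses for its date fields.

```python
from datetime import datetime

# Hypothetical set of accepted input formats (an assumption for this sketch;
# the formats Tracker itself accepts are not listed in this announcement).
CANDIDATE_FORMATS = (
    "%Y-%m-%dT%H:%M:%S",  # ISO 8601, extended form
    "%Y%m%dT%H%M%S",      # ISO 8601, basic form
    "%Y:%m:%d %H:%M:%S",  # Exif-style timestamp
    "%Y-%m-%d %H:%M:%S",  # common SQL-style timestamp
)

def to_iso8601(text):
    """Return text as an ISO 8601 string, or None if no format matches."""
    for fmt in CANDIDATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).isoformat()
        except ValueError:
            continue  # try the next candidate format
    return None

def from_iso8601(text):
    """Parse an extended-form ISO 8601 string back into a datetime."""
    return datetime.strptime(text, "%Y-%m-%dT%H:%M:%S")

print(to_iso8601("2006:04:07 14:30:00"))
print(from_iso8601("2006-04-07T14:30:00").year)
```

Trying the formats in a fixed order keeps the conversion predictable: an unrecognised string yields None instead of a guess.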



FEATURES

* Desktop-neutral design (it's a freedesktop product built around other freedesktop technologies like DBus and XDGMime but contains no GNOME specific dependencies)

* Very memory efficient and non-leaking (typical RAM usage 4 - 6 MB). Unlike some other indexers, tracker is designed and built to run well on lower memory systems with typically 128MB or 256MB memory. It should even be efficient enough to use on some mobile devices.

* Non-bloated and written in C for maximum efficiency.

* Small size and minimal dependencies make it easy to bundle into various distros, including live CDs.

* Fast indexing and unobtrusive - no need to index stuff overnight. Tracker runs at nice+10 so it should have a minimal impact on your system.

* Implements the freedesktop specification for metadata http://freedesktop.org/wiki/Standards/shared-filemetadata-spec

* Extracts embedded File, Image, Document and Audio type metadata from files.

* Extracts embedded metadata from HTML, PDF, PS, OLE2 (DOC, XLS, PPT), OpenOffice (sxw), StarOffice (sdw), DVI, MAN, MP3 (ID3v1 and ID3v2), OGG, WAV, EXIV2, JPEG, GIF, PNG, TIFF, DEB, RPM, TAR(.GZ), ZIP, ELF, REAL, RIFF (AVI), MPEG, QT and ASF files

* Supports the W3C's RDF Query syntax for querying metadata

* Provides support for both free text search (like Beagle/Google) as well as structured searches using RDF Query

* Responds in real time to file system changes to keep its metadata database up to date and in sync

* Fully extendable with custom metadata - you can store, retrieve, register and search via RDF Query all your own custom metadata

* Can extract a file's contents as plain text and index them

* Provides text filters for PDF, MS Office, OpenOffice (all versions), HTML and PS files.

* Can provide thumbnailing on the fly



INSTALLATION (from source):


The following build dependency is needed:

The embedded mysql library libmysqld.a version 5.0.19 or higher is required, complete with corresponding header files and the mysql_config program. The libmysqld.a library is always statically linked so it is not a runtime dependency. It can be found in Debian/Ubuntu package libmysqlclient15-dev. Other distros should check their mysql client libs to see if it is present.

If not present, you can compile libmysqld.a by downloading the source tarball for an appropriate version (5.0.19+) from the bottom of page http://dev.mysql.com/downloads/mysql/5.0.html

The following configure flags are recommended if building libmysqld.a from source: --without-server --with-embedded-server --enable-assembler --with-mysqld-ldflags --with-client-ldflags


Run time dependencies (also needed for build) :

libdbus (0.50 +)

dbus-glib bindings (0.50 +)

glib (2.6+)

zlib

libvorbisfile3 (1.1+)


Additional recommended packages:

libextractor (0.5.9+) (tracker has a streamlined version inlined)
wv (1.0.2+)
poppler (pdftotext)



COMPILATION

To compile Tracker, use the following commands:

./configure --prefix=/usr --enable-static --with-pic CFLAGS=-I/usr/include/exiv2
make
sudo make install

If you install using any other prefix then you might have problems with files not being installed correctly.

(You may need to copy and amend the dbus service file to the correct directory and/or might need to update ld_conf if you install into non standard directories.)




Compile Options

Tracker has several compile options to enable/disable certain features. The following are available (all options should be passed as command line parameters to autogen.sh, EG ./autogen.sh --disable-fam):

--disable-fam : this option omits support of FAM/GAMIN with tracker

--disable-inotify : this option omits support for inotify with tracker

--enable-libextractor : this option forces the use of the prepackaged streamlined version of libextractor in tracker



RUNNING TRACKER


To run tracker, you need to manually start the tracker daemon trackerd. By default trackerd will index your entire home directory.

You can also pass a directory root to be indexed as a command line parameter if you don't want your entire home directory indexed. EG:

"trackerd /home/jamie/Documents"

You can restrict Tracker to a subset of your home directory, or add folders outside your home directory, by editing the tracker.cfg file in ~/.Tracker (which is created when you first run trackerd) and setting WatchDirectoryRoots to a semicolon delimited list of directories (full path required!)

EG:

WatchDirectoryRoots=directory1;directory2;directory3
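
As a rough illustration of the format, the Python sketch below splits a WatchDirectoryRoots value on semicolons and rejects anything that is not a full path. It is not how trackerd reads its config; the function name and the example directories are made up for the sketch, and tolerating stray whitespace or empty entries is an assumption rather than documented behaviour.

```python
import posixpath

def parse_watch_roots(value):
    """Split a semicolon delimited WatchDirectoryRoots value into full paths."""
    roots = []
    for entry in value.split(";"):
        entry = entry.strip()
        if not entry:
            continue  # assumption: ignore a trailing or doubled semicolon
        if not posixpath.isabs(entry):
            raise ValueError("full path required: %r" % entry)
        roots.append(entry)
    return roots

# Hypothetical example line; the directories are placeholders.
line = "WatchDirectoryRoots=/home/jamie/Documents;/home/jamie/Music;/mnt/data"
key, _, value = line.partition("=")
print(key, parse_watch_roots(value))
```

A relative entry such as "Documents" raises ValueError here, which mirrors the "full path required" rule above.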

On the first run, Tracker will automatically create a new database and start populating it with metadata by browsing through the user's home directory (or the root folder(s) specified).

On subsequent runs, Tracker will start up much much faster and will only ever incrementally index files (IE files that have changed since last index).

If installed correctly, the tracker daemon (trackerd) can also be started automatically via Dbus activation.


Tracker And Nautilus Search

Once you have installed Tracker and have some indexed contents, you should now compile Nautilus (ver 2.13.4 or higher), which should auto detect that tracker is installed and automatically compile in tracker support. You are now ready to appreciate a powerful and super efficient C-based indexer in all its glory... happy hunting!

To make sure trackerd always starts when you log in to GNOME, you will need to add it to gnome-session (select sessions from the preferences menu, select the startup programs tab and then add /usr/bin/trackerd). For non-GNOME installations, see the desktop docs for how to auto start an application for your particular desktop.

Tracker and Deskbar applet

Tracker is also integrated in GNOME's deskbar applet. Please see that applet for more info.



COMMAND LINE TOOLS

Tracker comes with a number of command line apps that you can use:

"tracker-extract FILE" - this extracts embedded metadata from FILE and prints to stdout

"tracker-search SEARCHTERM" - this performs a Google-like search using SEARCHTERM to retrieve all matching files where SEARCHTERM appears in any searchable metadata

"tracker-query" - this reads from STDIN an RDF Query that specifies the search criteria for various fields. It prints to STDOUT all matching files. You can see some example queries in the RDF-Query-examples folder. You can run the examples as "tracker-query < RDFFILE"

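If you want to call these tools from a script, a small Python wrapper along the following lines may be a useful starting point. It only uses the invocations described above ("tracker-search SEARCHTERM" and "tracker-query < RDFFILE"); the helper names are made up for this sketch, and treating the output as one match per line is an assumption, so check the actual output of your build first.

```python
import subprocess

def run_tool(argv, stdin_path=None):
    """Run a command, optionally feeding a file on stdin, and return its
    non-empty stdout lines. Raises CalledProcessError on a non-zero exit."""
    if stdin_path is None:
        result = subprocess.run(argv, capture_output=True, text=True, check=True)
    else:
        with open(stdin_path) as query_file:
            result = subprocess.run(argv, stdin=query_file,
                                    capture_output=True, text=True, check=True)
    # Assumption: the tool prints one result per line.
    return [line for line in result.stdout.splitlines() if line]

def search(term):
    """Equivalent of: tracker-search SEARCHTERM"""
    return run_tool(["tracker-search", term])

def query(rdf_file):
    """Equivalent of: tracker-query < RDFFILE"""
    return run_tool(["tracker-query"], stdin_path=rdf_file)
```

Passing the arguments as a list (rather than a single shell string) means a search term containing spaces or quotes is handed to tracker-search unchanged.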

--
Mr Jamie McCracken
http://jamiemcc.livejournal.com/



