Re: [Tracker] New CVS version
- From: Jamie McCracken <jamiemcc blueyonder co uk>
- To: Laurent Aguerreche <laurent aguerreche free fr>
- Cc: Tracker List <tracker-list gnome org>
- Subject: Re: [Tracker] New CVS version
- Date: Tue, 22 Aug 2006 18:44:45 +0100
Laurent Aguerreche wrote:
On Tuesday 22 August 2006 at 19:00 +0200, Marcus Fritzsch wrote:
Veeery nice work!!
here's a small patch introducing some debian/ubuntu build depends - i
am not sure about the versions though, so I took them from ubuntu
dapper (w/o ubuntu specific revision things)
Looking forward to seeing tracker/sqlite in action :)
I removed the libmysqlclient15-dev dependency (and so the libssl-dev one) due to
the upcoming use of sqlite (or other things). :-)
yeah these dependencies will be optional (you will be able to choose
which backend to use - sqlite or mysql)
I can't say which will be better at the end of the day, but even so I will
support both (sqlite's SQL syntax is 99% compatible with mysql, although
it does not support stored procedures yet, so it won't be too much work to
support both)
But now, there are new dependencies: one with -lmagic, and one on Pango.
I think -lmagic is for libmagic-dev?
( http://packages.debian.org/unstable/libdevel/libmagic-dev )
The configure script is missing a test for it.
there is no .pc file for libmagic so instead we check for visibility of
magic.h in configure.in
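
As a rough illustration of what such a header check can guard, here is a minimal C sketch. It assumes the configure check defines HAVE_MAGIC_H (the macro name autoconf's AC_CHECK_HEADERS would generate for magic.h; whether Tracker's configure.in does this is an assumption), and describe_buffer is a hypothetical helper, not a Tracker function. Without the macro the libmagic calls compile out and the function simply reports failure.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#ifdef HAVE_MAGIC_H      /* assumed to be set by the configure header check */
#include <magic.h>
#endif

/* Hypothetical helper: copies libmagic's description of `buf` into `out`.
 * Returns 1 on success, 0 if libmagic is not compiled in or the lookup fails. */
int describe_buffer(const void *buf, size_t len, char *out, size_t out_size)
{
    assert(out != NULL && out_size > 0);
    out[0] = '\0';
#ifdef HAVE_MAGIC_H
    int ok = 0;
    magic_t cookie = magic_open(MAGIC_NONE);
    if (cookie == NULL)
        return 0;
    /* NULL loads the default magic database */
    if (magic_load(cookie, NULL) == 0) {
        const char *desc = magic_buffer(cookie, buf, len);
        if (desc != NULL) {
            /* the returned string belongs to the cookie, so copy it out */
            strncpy(out, desc, out_size - 1);
            out[out_size - 1] = '\0';
            ok = 1;
        }
    }
    magic_close(cookie);
    return ok;
#else
    (void)buf;
    (void)len;
    return 0;
#endif
}
```

When built with HAVE_MAGIC_H the program also needs to be linked with -lmagic.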
Is Pango really required? On Debian it depends on:
- libcairo2;
- libfontconfig1;
- libfreetype6;
- libglib2.0-0
- libx11-6;
- libxft2;
- zlib1g.
And libx11-6 has dependencies on some Xorg elements: libxau6, libxdmcp6,
libx11-data and x11-common.
etc.
So it makes Tracker depend on many things which will install a big part
of X, perhaps all of X on some distributions...
pango is only required for its word breaking ability with languages that
do not contain word break characters.
in the new indexer, we break words as follows:
1) we use libmagic to determine if it's an ASCII and/or English file and
therefore use non-utf8 techniques to break and parse words very quickly
2) if utf-8 we use utf-8 techniques
3) if text has no spaces and is not ASCII/English we assume it might be
CJK or some other language that does not contain word break characters.
In this case we use the pango word break functionality to break up words
correctly. This is incredibly slow compared to 1 and 2 above (it takes
15+ minutes to break 1MB of text compared to less than a second with 1)
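
The three-way choice above could be sketched in C roughly as follows. This is only an illustration under stated assumptions: a plain byte scan stands in for the libmagic detection, the pango path is represented by an enum value rather than a real call, and BreakStrategy, choose_strategy and count_ascii_words are hypothetical names, not taken from the tracker source.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical labels for the three paths described above */
typedef enum { BREAK_ASCII, BREAK_UTF8, BREAK_PANGO } BreakStrategy;

/* Assumption: a byte scan replaces the libmagic check; all bytes < 0x80 */
static int is_ascii(const unsigned char *s, size_t len)
{
    for (size_t i = 0; i < len; i++)
        if (s[i] >= 0x80)
            return 0;
    return 1;
}

/* True if the text contains at least one ASCII whitespace byte */
static int has_space(const unsigned char *s, size_t len)
{
    for (size_t i = 0; i < len; i++)
        if (s[i] < 0x80 && isspace(s[i]))
            return 1;
    return 0;
}

BreakStrategy choose_strategy(const char *text)
{
    assert(text != NULL);
    const unsigned char *s = (const unsigned char *)text;
    size_t len = strlen(text);

    if (is_ascii(s, len))
        return BREAK_ASCII;   /* case 1: fast byte-wise breaking */
    if (has_space(s, len))
        return BREAK_UTF8;    /* case 2: utf-8 text with visible breaks */
    return BREAK_PANGO;       /* case 3: possibly CJK, needs the slow path */
}

/* Fast path for case 1: a word is a run of ASCII letters or digits */
size_t count_ascii_words(const char *text)
{
    assert(text != NULL);
    size_t words = 0;
    int in_word = 0;
    for (const unsigned char *p = (const unsigned char *)text; *p; p++) {
        int is_word_char = (*p < 0x80) && isalnum(*p);
        if (is_word_char && !in_word)
            words++;
        in_word = is_word_char;
    }
    return words;
}
```

Only text that reaches BREAK_PANGO would pay the slow word-break cost, so the cheap checks run first.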
We use no other functionality of pango in tracker so if anyone knows of a
free C lib that can do word breaks in a language independent manner then
let me know.
--
Mr Jamie McCracken
http://jamiemcc.livejournal.com/