Re: [Tracker] Running tracker on an Ubuntu server box?
- From: Martyn Russell <martyn imendio com>
- To: tigerf web de
- Cc: tracker-list gnome org
- Subject: Re: [Tracker] Running tracker on an Ubuntu server box?
- Date: Tue, 21 Apr 2009 09:53:15 +0100
tigerf wrote:
Hello from Germany,
Hi there :)
I'm looking for a way to use Tracker on an Ubuntu 8.10 server
edition (!) to index huge numbers (60,000+) of .doc and .pdf files.
I should mention that Tracker 0.6.x has a hard limit on the amount of
data you can store for full text searching. We currently use QDBM for
the index, and it has a 2 GB file size limit.
This means that once the index reaches that size, nothing further will
be indexed. This has recently led to the "Can not index word" error we
have been seeing in bug reports.
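To get a feel for how close an existing index is to that limit, a small check along these lines may help. This is only a sketch: the path of the index file is not given here, so it is passed in by the caller, and the function name `index_headroom` is made up for illustration. It also assumes the 2 GB limit means 2 * 1024^3 bytes.

```python
import os

# Assumption: the 2 GB limit is treated as 2 * 1024**3 bytes.
QDBM_LIMIT_BYTES = 2 * 1024**3


def index_headroom(index_path, limit=QDBM_LIMIT_BYTES):
    """Return (size_in_bytes, fraction_of_limit_used) for an index file.

    index_path is supplied by the caller; this sketch does not assume
    where Tracker keeps its index on disk.
    """
    size = os.path.getsize(index_path)
    return size, size / limit
```

Calling `index_headroom(path)` on the index file returns its size and the fraction of the limit already used, which could be logged periodically while indexing a large tree.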
In the 0.7 branch, which we are working on in parallel, we are using
SQLite instead of QDBM. This should extend what is possible here, and
it also adds partial match searching (e.g. "foo*" finds "foobar"), which
is another feature missing from 0.6.x.
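As a rough illustration of what partial matching looks like on top of SQLite, here is a minimal sketch using Python's sqlite3 module and SQLite's GLOB operator. The `words` table and the `prefix_search` helper are invented for this example and are not Tracker's actual schema or API.

```python
import sqlite3


def prefix_search(conn, pattern):
    """Return indexed words matching a glob pattern such as 'foo*'."""
    rows = conn.execute(
        "SELECT word FROM words WHERE word GLOB ? ORDER BY word",
        (pattern,),
    )
    return [row[0] for row in rows]


# Hypothetical one-column word table, kept in memory for the example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE words (word TEXT PRIMARY KEY)")
conn.executemany(
    "INSERT INTO words VALUES (?)",
    [("foo",), ("foobar",), ("food",), ("bar",)],
)

print(prefix_search(conn, "foo*"))  # ['foo', 'foobar', 'food']
```

Note that GLOB is case-sensitive; a real index would probably normalise case before storing words.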
Unfortunately, I couldn't say whether 60k files would actually reach
the limit, because it really does depend on how many words are in
those files. My estimate is that you wouldn't be far off the limit with
that many files though. Perhaps others who have had this QDBM error
could comment on how many files they have, to give a rough estimate here.
Is there somewhere a how-to or is the whole idea simply unrealistic?
I don't think so; Tracker just might not be able to cope with those
volumes for now. Of course, the limit I mentioned above is per user.
If you are doing this for multiple users, things get trickier, but the
QDBM limit is less of a problem.
Background:
I'm currently setting up a LAMP + Samba server to replace a Windows box,
which offers thousands of documents to its Windows clients via network
shares. I'm using Apache & PHP as a frontend, which translates the
user's web-page input into a commandline using "find". The results are
parsed and returned to the users via HTML pages containing links to the
matching files found.
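For reference, the find-based approach described above might look roughly like the following. It is sketched in Python rather than PHP, and the function names and the `/files` base URL are hypothetical. The pattern is passed to find as a separate argument instead of being pasted into a shell string, which avoids shell injection from user input.

```python
import html
import subprocess
from urllib.parse import quote


def find_files(root, name_pattern):
    """Run find with a case-insensitive name pattern; return sorted paths."""
    result = subprocess.run(
        ["find", root, "-type", "f", "-iname", name_pattern, "-print0"],
        capture_output=True,
        text=True,
        check=True,
    )
    # -print0 separates paths with NUL bytes, so odd file names survive.
    return sorted(p for p in result.stdout.split("\0") if p)


def render_links(paths, base_url="/files"):
    """Turn matching paths into an HTML list of links (base_url is made up)."""
    items = []
    for path in paths:
        href = base_url + quote(path)
        items.append(
            f'<li><a href="{html.escape(href)}">{html.escape(path)}</a></li>'
        )
    return "<ul>\n" + "\n".join(items) + "\n</ul>"
```

A request for all Word documents would then be something like `render_links(find_files("/srv/share", "*.doc"))`, where `/srv/share` stands in for the shared directory.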
I expect the combination of Tracker as a backend and a command-line
querying tool could be much faster and more flexible than the current approach.
Tracker has a daemon that monitors changes in the filesystem and updates
its index accordingly, I guess. Is there an easier way to access
Tracker's index via PHP? Some kind of SQL interface, maybe?
My requirements are basically the same as those of an ordinary NAS; one can
imagine my Ubuntu box as a NAS box which offers only an HTML user
interface.
Sorry for my possibly newbie questions, but my Linux knowledge is rather
limited at the moment and I didn't find a better place to ask.
No need to apologise, thanks for asking!
--
Regards,
Martyn