Re: [Tracker] Tracker performance: MaxTextToIndex



On Fri, 2007-06-29 at 18:18 +1000, Simon Cullen wrote:
I have been looking for a desktop search that will index *all* of my
PDFs---some of which are thousands of pages long.  Google only turns
up one hit when I search for MaxTextToIndex -- the variable in the
tracker.cfg file which sets the "maximum size of text in bytes to
index from a file's text contents". I have tried setting this to
various things---but I cannot seem to get tracker to go further than
(say) 20 pages into a document... 

Is there a way I can just disable the limit---so that it will go on
until the end of any file it finds?  I only use tracker to index
PDFs---so it doesn't have to deal with any of the other junk on my
system. I don't mind if the index is large---but if there is a better
system I should consider for this, I'd love to hear about it.  

I've recently been forced into trying Google Desktop, which is fine,
but suffers from an even stricter limit in this regard (about 10
pages?). 

Any handy hints will be VERY much appreciated,


We only index the first 10,000 unique words in any one document and/or
the first 1 MB of text, whichever limit is reached first.

I believe the maximum text size limit can be adjusted in the config file,
but the word limit is hardcoded (it needs to be changed to use a config
variable). Note that both settings greatly affect the memory usage of
trackerd when indexing.
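
To illustrate how the two limits interact, here is a rough Python sketch.
It is not Tracker's actual code: the function name, the constants, and the
word-splitting rule are all made up for the example, and it assumes the
byte cap is applied before the unique-word cap.

```python
import re

# Illustrative defaults taken from the limits described above.
MAX_TEXT_BYTES = 1024 * 1024   # first 1 MB of text
MAX_UNIQUE_WORDS = 10_000      # first 10,000 unique words


def words_to_index(text, max_bytes=MAX_TEXT_BYTES, max_unique=MAX_UNIQUE_WORDS):
    """Return the unique words kept under both caps, in first-seen order."""
    # Cap the text by size in bytes; a multibyte character split by the
    # cut is dropped rather than raising a decode error.
    clipped = text.encode("utf-8")[:max_bytes].decode("utf-8", errors="ignore")

    seen = {}  # dict used as an insertion-ordered set
    for match in re.finditer(r"\w+", clipped):
        word = match.group().lower()
        if word in seen:
            continue  # repeated words do not count against the cap
        if len(seen) >= max_unique:
            break     # unique-word cap reached; the rest is ignored
        seen[word] = None
    return list(seen)
```

In this sketch, raising only the byte limit does not help once a long
document has already produced 10,000 distinct words, which would match the
behaviour of indexing stopping early in a very large PDF.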

I will look at adding support for both settings in the tracker-preferences
UI in the near future.

jamie




