Re: [Tracker] status of 0.6.90?
- From: Martyn Russell <martyn imendio com>
- To: jamie mccrack gmail com
- Cc: Tracker-List <tracker-list gnome org>
- Subject: Re: [Tracker] status of 0.6.90?
- Date: Mon, 06 Oct 2008 16:58:04 +0100
Jamie McCracken wrote:
There are also a load of other issues that need correcting:
1) enumerating and crawling directories needs to be done in the indexer
(and pass directories to watch back to the daemon). Daemon can then run
as nice 0 and normal ionice instead of nice 19 as only cpu/io heavy ops
will be searches and queries, which need to be as fast as possible
I really want to do this ASAP. This will reduce DBus traffic
significantly not to mention it should be faster and reduce the amount
of memory duplication we have with strings existing in the indexer and
daemon. The daemon will be REALLY lightweight then and not need to be
nice()d as Jamie says, so I can't wait to do this.
2) indexing needs to do all files in a directory before indexing the
directory itself to prevent files not getting indexed if daemon is
stopped in mid-index
This is REALLY important. I plan to do this after #1.
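
The ordering in item 2 amounts to a bottom-up walk. Below is a minimal sketch of that idea in Python, not Tracker's actual crawler; `index_order` is a hypothetical helper name. If a directory is only recorded once everything inside it has been handled, an interrupted run leaves the directory unrecorded and it gets picked up again next time.

```python
import os


def index_order(root):
    """Yield paths so that everything inside a directory precedes the directory."""
    # topdown=False makes os.walk report subdirectories before their parent.
    for dirpath, _dirnames, filenames in os.walk(root, topdown=False):
        for name in sorted(filenames):
            yield os.path.join(dirpath, name)
        # The directory itself comes last, after all of its contents.
        yield dirpath
```

An indexer built this way would mark a directory as done only when it reaches that final yield for it.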
3) Needs to be fully backward compatible with API and config options.
It's likely that we will have to force a reindex for 0.6.6 as I'm not sure
the db will work with old versions. I would rather have sqlite fts and xesam
db support as well as flattened tables in, to prevent the next few
versions forcing a reindex after each upgrade if that is the case (i.e. I
would rather do the reindex once if possible)
I think we already have this. Right now we have a db-version.txt that is
in ~/.cache/tracker/. If it doesn't exist or the file version is too
old, we force a reindex anyway right now. I did this pre-merge.
The API should be compatible. Need to check with Michael about that
issue I replied to in my last email. That seems to be the only thing left.
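
For reference, the version check described above could look roughly like this. It is a sketch only: the file name db-version.txt comes from this thread, but the single-integer file format, the value of `CURRENT_DB_VERSION` and both function names are assumptions made for illustration.

```python
from pathlib import Path

# Hypothetical: the database version the running build expects.
CURRENT_DB_VERSION = 2


def needs_reindex(version_file, current=CURRENT_DB_VERSION):
    """Return True if the version file is missing, unreadable or too old."""
    try:
        stored = int(Path(version_file).read_text().strip())
    except (OSError, ValueError):
        # No file, or contents that are not a plain integer: reindex.
        return True
    return stored < current


def write_version(version_file, current=CURRENT_DB_VERSION):
    """Record the current version once a reindex has been started."""
    path = Path(version_file)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(f"{current}\n")
```

Treating an unreadable file the same as a missing one keeps the failure mode safe: the worst case is one extra reindex.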
4) Db connections should use the sqlite3_soft_heap_limit call to limit heap
usage of sqlite (sqlite will run away and eat memory while indexing if
you don't - these are not leaks and will not show up in valgrind!)
see http://sqlite.org/c3ref/soft_heap_limit.html and note following:
A negative or zero value for N in a call to sqlite3_soft_heap_limit(N)
means that there is no soft heap limit and sqlite3_release_memory() will
only be called when memory is completely exhausted. The default value
for the soft heap limit is zero.
Ergo sqlite will happily eat all memory until you run out before
attempting to free a single byte unless we set a value for above
Eeek. We should definitely do this.
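
A quick way to try the effect of a soft heap limit is sketched below. Note the substitution: it uses Python's sqlite3 module and the `PRAGMA soft_heap_limit` statement rather than the C call sqlite3_soft_heap_limit(N) discussed above, and it assumes a SQLite build recent enough to have that pragma. The helper name and the 8 MiB figure are arbitrary choices for illustration, not tuned values.

```python
import sqlite3


def set_soft_heap_limit(conn, limit_bytes):
    """Set SQLite's soft heap limit and return the limit now in effect."""
    if limit_bytes <= 0:
        # Zero or a negative value means "no soft heap limit".
        raise ValueError("limit_bytes must be positive")
    row = conn.execute(f"PRAGMA soft_heap_limit = {int(limit_bytes)}").fetchone()
    if row is None:
        raise RuntimeError("this SQLite build does not support the pragma")
    return row[0]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    print(set_soft_heap_limit(conn, 8 * 1024 * 1024))
```

The limit is advisory: SQLite tries to release cache memory to stay under it but does not fail allocations because of it. In current SQLite the setting applies to the whole process, not to a single connection.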
5) probably some fine tuning and default settings for throttle might
need to be adjusted
Right now I find the throttle just right. Maybe it throttles too much? It
can certainly be faster given we ALWAYS throttle at the moment.
6) indexing email attachments? Might be a regression if we don't, as
0.6.6 did. I need to think how this fits in with xesam though, as we don't
want to add them to the email or files index as they are of a different source
(probably we will have a separate index/db for each source -
archive, attachment et al. as they are not files or emails as such)
I would have to investigate this.
7) email optimisations - really slow for large mboxes. mbox could be
optimised to store the last known offset and record details to prevent a full
scan when new emails are appended. Needs to be done smartly (i.e. verify
last record structure and UID at known offset) so we don't screw up if
the mbox was compacted or changed beyond recognition. I will likely restore
the junkemail table to speed up junk checking for mbox too.
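
The offset idea in item 7 can be sketched as follows. This is an illustration under simplifying assumptions, not the planned implementation: every line starting with "From " is treated as a message separator, and the separator line itself stands in for the "record structure and UID" check. `scan_mbox` and the shape of its state tuple are hypothetical.

```python
def scan_mbox(path, state=None):
    """Return (offsets_of_new_messages, new_state).

    state is (offset, from_line) for the last message seen, or None.
    """
    start, resumed = 0, False
    offsets = []
    with open(path, "rb") as f:
        if state is not None:
            offset, from_line = state
            f.seek(offset)
            # Resume only if the remembered message still starts at that offset.
            if f.readline() == from_line:
                start, resumed = offset, True
        f.seek(start)
        last = state if resumed else None
        pos = start
        for line in iter(f.readline, b""):
            if line.startswith(b"From "):
                offsets.append(pos)
                last = (pos, line)
            pos += len(line)
    if resumed:
        # The first hit is the already-known message we resumed from.
        offsets = offsets[1:]
    return offsets, last
```

If the check at the stored offset fails (mbox compacted, truncated or rewritten), the function falls back to a full scan from the start, which is the safe behaviour asked for above.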
I will try and do 4, 5 and 7 above over the weekend. I think Martyn said
he will do 1, 2 and 3 soon.
Great! :)
--
Regards,
Martyn