Re: [Tracker] work items for 0.6.5




On Fri, 2008-02-01 at 22:32 +0800, Halton Huo wrote:
I will take Chinese New Year leave until Feb. 17. If time allows,
I can do 1) and 3).


thanks - for ubuntu, feature freeze is 14th feb so let me know if you
can't do it by then


On Wed, 2008-01-30 at 13:05 -0500, Jamie McCracken wrote:
There are a few things that need doing before 0.6.5, so if any of you
has time to spare (I have little), feel free to volunteer:

1) add an optional auto-pause feature to tracker-applet which pauses
trackerd for a few secs whenever a keyboard or mouse event is detected
(use gdk to listen to X event queue for this just like screensavers
do). 

this feature would prevent trackerd from slowing down the computer
whenever the user is actively using the machine
Seems like a good feature. One question: if the user wants to do something
while trackerd indexes at the same time, meaning he does not mind the
slowdown, can we offer a choice?

How about adding a key to enable/disable this keyboard/mouse detection?
Default value is true.

yes but it would be in the applet (a check menu item called auto-pause
underneath the pause item), not tracker-prefs, as it's applet-specific
and trackerd cannot depend on X
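
A minimal sketch of the auto-pause bookkeeping only, leaving out the gdk/X side. The names (AutoPause, auto_pause_on_input, auto_pause_should_pause) and the 5 second window are assumptions for illustration, not tracker-applet code; the applet would call the input hook from wherever it sees keyboard or mouse events and tell trackerd to pause or resume based on the result.

```c
#include <assert.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical: how long to keep trackerd paused after the last input event. */
#define AUTO_PAUSE_SECS 5

typedef struct {
    bool   enabled;      /* state of the "auto-pause" check menu item */
    bool   seen_input;   /* false until the first event arrives */
    time_t last_input;   /* time of the most recent keyboard/mouse event */
} AutoPause;

/* Call this from whatever hook reports keyboard or mouse activity. */
static void auto_pause_on_input(AutoPause *ap, time_t now)
{
    assert(ap != NULL);
    ap->seen_input = true;
    ap->last_input = now;
}

/* True if indexing should currently be paused. */
static bool auto_pause_should_pause(const AutoPause *ap, time_t now)
{
    assert(ap != NULL);
    if (!ap->enabled || !ap->seen_input)
        return false;
    return difftime(now, ap->last_input) < AUTO_PAUSE_SECS;
}
```

Keeping the decision in a plain function like this means the check menu item only has to flip `enabled`, and trackerd itself never needs to know about X.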



2) Ignore maildirs - currently these are treated as text files but it's
best to ignore them for now as they can overload the files db with
millions of mail msgs

a maildir always has a directory with 3 subdirs - "cur",  "new" and "tmp"
- if we detect a directory with just those subdirs then we should ignore
it and not process anything in it (in the future if we have good
email detection we can process them as emails but it's too tricky to do
so atm)
What is this for? I'm not clear.


maildirs are usually non-hidden folders in your home directory rather
than hidden folders like .evolution so tracker currently indexes each
mail message in them as a file - we need to stop that. Basically all it
needs to do is check the current directory being indexed and if it *only*
contains three sub dirs called "cur", "new" and "tmp" then we know it's a
maildir and should skip it.
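
The check described above could be sketched roughly as follows, using plain POSIX calls rather than glib. The function name looks_like_maildir and the 4096 byte path buffer are assumptions for illustration; it follows the rule from this thread literally (exactly those three subdirectories and nothing else).

```c
#include <assert.h>
#include <dirent.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/* Sketch: returns true if `path` contains exactly the three
 * subdirectories "cur", "new" and "tmp" and nothing else. */
static bool looks_like_maildir(const char *path)
{
    assert(path != NULL);

    DIR *dir = opendir(path);
    if (dir == NULL)
        return false;

    bool has_cur = false, has_new = false, has_tmp = false, only_those = true;
    struct dirent *entry;

    while (only_those && (entry = readdir(dir)) != NULL) {
        const char *name = entry->d_name;
        if (strcmp(name, ".") == 0 || strcmp(name, "..") == 0)
            continue;

        char child[4096];   /* hypothetical limit on the joined path */
        struct stat st;
        int n = snprintf(child, sizeof child, "%s/%s", path, name);
        bool is_dir = n > 0 && (size_t) n < sizeof child &&
                      stat(child, &st) == 0 && S_ISDIR(st.st_mode);

        if (is_dir && strcmp(name, "cur") == 0)
            has_cur = true;
        else if (is_dir && strcmp(name, "new") == 0)
            has_new = true;
        else if (is_dir && strcmp(name, "tmp") == 0)
            has_tmp = true;
        else
            only_those = false;   /* any other entry: not a bare maildir */
    }

    closedir(dir);
    return only_those && has_cur && has_new && has_tmp;
}
```

The indexer would call this once per directory before descending into it and skip the whole directory when it returns true.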



3) Constantly changing files - we should ignore these, especially
bittorrent ones. Perhaps keep a small size-limited stack of recently
indexed files and if one of those files has been changed more than 10
times in a few minutes we should ignore it until trackerd next
restarts

It is like prison rule. 

Suppose the stack looks like this:

file_name      first_change_time     change_accounts
/a             11111                 3
/b             22222                 8
/c             33333                 1

When a file is changed, 
  if ((current_time - first_change_time) > MAX_DURATION) {
    // window expired: start a new one, counting this change
    first_change_time = current_time;
    change_accounts = 1;
    // reflect this change
  } else {
    change_accounts++;
    if (change_accounts < MAX_CHANGE_TIMES) {
      // reflect this change
    } else {
      // ignore this change
    }
  }

that's pretty much it but make sure it applies to files only (and not
emails or conversations)
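
For reference, a self-contained C sketch of that per-file check. The struct and function names are made up for illustration, and the 180 second window is an assumption (the thread only says "more than 10 times in a few minutes"). It restarts the count whenever the window expires.

```c
#include <assert.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical limits, not values taken from tracker. */
#define MAX_DURATION      180   /* window length in seconds */
#define MAX_CHANGE_TIMES  10    /* changes allowed per window */

typedef struct {
    time_t first_change_time;   /* start of the current window */
    int    change_count;        /* changes seen inside the window */
} ChangeRecord;

/* Record one change at `now`; returns true if it should be indexed,
 * false if the file is changing too often and should be ignored. */
static bool change_record_update(ChangeRecord *rec, time_t now)
{
    assert(rec != NULL);

    if (rec->change_count == 0 ||
        difftime(now, rec->first_change_time) > MAX_DURATION) {
        /* First change, or the window expired: start a new window. */
        rec->first_change_time = now;
        rec->change_count = 1;
        return true;
    }

    rec->change_count++;
    return rec->change_count <= MAX_CHANGE_TIMES;
}
```

The first time this returns false for a file, the caller would put that file on the ignore list so it stays ignored until trackerd restarts, rather than becoming eligible again once the window rolls over.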

when we ignore the file - add it to an ignore list in memory. When
trackerd exits, save that list to file. When trackerd is next restarted
load the list and index the entries and then reset it.
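
One way the ignore list could be kept and persisted, as a rough sketch. The type and function names, the size limits and the one-path-per-line file format are all assumptions for illustration (paths containing newlines would not survive this format), not how trackerd stores anything.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define IGNORE_MAX   64     /* hypothetical cap on ignored files */
#define PATH_LEN     1024   /* hypothetical maximum path length */

typedef struct {
    char   paths[IGNORE_MAX][PATH_LEN];
    size_t count;
} IgnoreList;

/* Add a path unless it is already listed, too long, or the list is full. */
static void ignore_list_add(IgnoreList *list, const char *path)
{
    assert(list != NULL && path != NULL);
    for (size_t i = 0; i < list->count; i++)
        if (strcmp(list->paths[i], path) == 0)
            return;
    if (list->count < IGNORE_MAX && strlen(path) < PATH_LEN)
        strcpy(list->paths[list->count++], path);
}

/* On exit: write one path per line. Returns 0 on success, -1 on error. */
static int ignore_list_save(const IgnoreList *list, const char *filename)
{
    assert(list != NULL && filename != NULL);
    FILE *f = fopen(filename, "w");
    if (f == NULL)
        return -1;
    for (size_t i = 0; i < list->count; i++)
        fprintf(f, "%s\n", list->paths[i]);
    return fclose(f) == 0 ? 0 : -1;
}

/* On startup: read the saved paths back, then delete the file so a
 * stale list is not loaded again. Returns -1 if there was no file. */
static int ignore_list_load(IgnoreList *list, const char *filename)
{
    assert(list != NULL && filename != NULL);
    list->count = 0;
    FILE *f = fopen(filename, "r");
    if (f == NULL)
        return -1;
    char line[PATH_LEN];
    while (list->count < IGNORE_MAX && fgets(line, sizeof line, f) != NULL) {
        line[strcspn(line, "\n")] = '\0';   /* strip the newline */
        if (line[0] != '\0')
            strcpy(list->paths[list->count++], line);
    }
    fclose(f);
    remove(filename);
    return 0;
}
```

After loading, the caller would index each entry once and then set `count` back to 0, which matches the "index the entries and then reset it" step.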


Stop here if my thinking is totally wrong.

If my prototype is not wrong, the question is where to keep this stack?
(1) In memory
   There will be a long list of all changed files, even those changed
only once.
   
(2) In database
   Add a property for each file in Service table. Could be slower.


it would not be slower as such because we are updating the mtime in that
table anyhow whenever it changes. However we don't want to do a db change
at this point so (1) would be better for now


Any idea?

Use a fixed size LIFO stack of 50 items max in memory 

you can use a static array if you like or, if you prefer, the glib
double-ended queue

(http://library.gnome.org/devel/glib/unstable/glib-Double-ended-Queues.html)


but make sure you pop tail to keep its size limited
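
A sketch of the static-array variant, written as a ring buffer so that pushing onto a full list overwrites the oldest entry (the same effect as push-head plus pop-tail on a glib queue). RecentFiles, recent_push, recent_find and the 256 byte name limit are assumptions for illustration; only the limit of 50 comes from this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define RECENT_MAX  50     /* size limit from the thread */
#define NAME_LEN    256    /* hypothetical; longer names are truncated */

typedef struct {
    char   names[RECENT_MAX][NAME_LEN];
    size_t head;    /* slot the next push will write to */
    size_t count;   /* entries in use, at most RECENT_MAX */
} RecentFiles;

/* Push a file name; once full, the oldest entry (the tail) is overwritten. */
static void recent_push(RecentFiles *r, const char *name)
{
    assert(r != NULL && name != NULL);
    snprintf(r->names[r->head], NAME_LEN, "%s", name);
    r->head = (r->head + 1) % RECENT_MAX;
    if (r->count < RECENT_MAX)
        r->count++;
}

/* Search from newest to oldest; returns the slot holding `name`,
 * or -1 if it is not in the list. */
static int recent_find(const RecentFiles *r, const char *name)
{
    assert(r != NULL && name != NULL);
    for (size_t i = 1; i <= r->count; i++) {
        size_t slot = (r->head + RECENT_MAX - i) % RECENT_MAX;
        if (strcmp(r->names[slot], name) == 0)
            return (int) slot;
    }
    return -1;
}
```

A zero-initialised `RecentFiles` is a valid empty list, so a single static instance needs no setup, and memory use stays fixed no matter how many files change.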

thanks

jamie




