Re: [Tracker] tracker-extract not being restarted



On 12/05/14 15:06, Ralph Böhme wrote:
Hi Martyn

thanks for taking time and explaining things!

No problem :)

On 12.05.2014 at 15:11, Martyn Russell <martyn lanedo com> wrote:

On 12/05/14 14:07, Ralph Böhme wrote:
Upon login. Otherwise, never IIRC.

I see. From the .desktop file, right? But then, how are
extraction requests passed to tracker-extract? Not via DBUS?
Because if they were passed via DBUS, DBUS could/should start it
as necessary. So I'm missing some architectural point somewhere,
but where? :)

We use nie:data-source <foo> and if it's unset, we know we've only
done the first pass index on the file system (i.e. basic file
data).

Hm. But then, how is tracker-extract made aware of filesystem
events, i.e. new files? I thought tracker-miner-fs sets up the watches

So:

1. tracker-miner-fs uses inotify (and other backends), it is signalled on a new file.

2. tracker-miner-fs processes that file adding basic information (file size, etc).

3. tracker-extract is started on login and listens for GraphUpdated, a signal from tracker-store. The tracker-store process is the only one that can insert data. On start up, tracker-extract also queries for a list of IDs of resources that it won't be notified of and which have no nie:data-source set. These are the files it knows it needs to process.

4. tracker-extract then runs with that list of files and also queues any new files from the GraphUpdated signal.
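To make the four steps above concrete, here is a rough toy model in Python. It is only a sketch of the flow, not Tracker code: ToyStore, ToyExtractor, the "data-source" key and the callback list standing in for GraphUpdated are all made up for illustration, and no real D-Bus or SPARQL is involved.

```python
from collections import deque


class ToyStore:
    """Hypothetical stand-in for tracker-store: the only place data is inserted."""

    def __init__(self):
        self.resources = {}    # resource id -> dict of properties
        self.subscribers = []  # callbacks standing in for GraphUpdated listeners
        self._next_id = 1

    def insert(self, props):
        """First pass (what tracker-miner-fs does): add basic file data."""
        rid = self._next_id
        self._next_id += 1
        self.resources[rid] = dict(props)
        for callback in self.subscribers:
            callback(rid)      # simplified: only inserts notify in this toy
        return rid

    def update(self, rid, props):
        self.resources[rid].update(props)

    def ids_without_data_source(self):
        """Resources that have only had the first pass done."""
        return [rid for rid, p in self.resources.items()
                if "data-source" not in p]


class ToyExtractor:
    """Hypothetical stand-in for tracker-extract (second pass)."""

    def __init__(self, store):
        self.store = store
        # On start up: pick up the backlog we were never notified about.
        self.queue = deque(store.ids_without_data_source())
        # From now on: queue anything new announced by the store.
        store.subscribers.append(self.queue.append)

    def run(self):
        done = []
        while self.queue:
            rid = self.queue.popleft()
            # Assumption for this toy: the second pass marks the resource done.
            self.store.update(rid, {"data-source": "extract"})
            done.append(rid)
        return done
```

A file inserted before ToyExtractor is created ends up in the start-up backlog, and one inserted afterwards arrives through the callback; run() then processes both, in that order.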

and notifies other processes (tracker-extract) via DBUS IPC ?

We used to use DBus and that would instantiate processes for us, but now we have this "passive" approach. It has advantages and disadvantages. The advantage is, tracker-miner-fs is simpler and doesn't have to deal with the mess of broken files crashing tracker-extract and timeouts, etc. Also it means the basic file information is added to Tracker quicker than it was previously because we have no delay on tracker-extract. Another advantage, half realised (bits can still be improved), is that we can process files of the same type as a group and prioritise files of particular types. We have mime type priorities currently I think. Disadvantages include having to wait for the second pass indexing to complete :)
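As a rough illustration of the grouping and prioritising idea, here is a small Python sketch. The priority table and the function names are invented for the example; they are not the priorities tracker-extract actually uses.

```python
import heapq
from itertools import count

# Hypothetical priorities, lower number = processed earlier.
MIME_PRIORITY = {"audio/": 0, "image/": 1}
DEFAULT_PRIORITY = 9


def priority_for(mime):
    """Look up the priority for a mime type by prefix."""
    for prefix, prio in MIME_PRIORITY.items():
        if mime.startswith(prefix):
            return prio
    return DEFAULT_PRIORITY


def extraction_order(files):
    """Order (path, mime) pairs by priority, then mime type, then arrival.

    Sorting on the mime type as the second key keeps files of the same
    type next to each other, so they can be handled as a group.
    """
    seq = count()  # tie-breaker that preserves arrival order
    heap = [(priority_for(mime), mime, next(seq), path) for path, mime in files]
    heapq.heapify(heap)
    return [heapq.heappop(heap)[3] for _ in range(len(heap))]
```

For example, given a PDF, two MP3s, an Ogg file and a PNG in arbitrary arrival order, the two MP3s come out first and together, then the Ogg file, then the PNG, and the PDF last.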

tracker-extract has _ALWAYS_ crashed because of the nature of the files and libraries we're using being the weakest point. With this in mind, it's quite bad that we don't have some sort of watchdog or way to keep it up and running. We could easily do this in tracker-miner-fs. Part of me thinks we should actually have a tracker-control --daemon which watches all processes, keeps some sort of journal / log going and also keeps processes like tracker-extract running. I should add, this is something that nearly EVERY embedded solution does itself with some script which keeps miner-fs running (not even tracker-extract), because there is no other way to guarantee Tracker is there to catch all events.

The only downside I see is repeated restart attempts causing a circular problem with tracker-extract. But that's easily remedied; we had to do something like this in tracker-miner-fs before already.
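For what it's worth, a watchdog with a guard against that circular restart problem could look roughly like the Python sketch below. This is not what tracker-miner-fs does or did; the function names, the limit of 5 restarts per 60 seconds and the 1 second pause are all assumptions chosen for the example.

```python
import subprocess
import time


def should_restart(crash_times, now, max_restarts=5, window=60.0):
    """Allow a restart unless there were too many crashes inside the window.

    crash_times includes the crash that just happened.
    """
    recent = [t for t in crash_times if now - t <= window]
    return len(recent) <= max_restarts


def watch(cmd, max_restarts=5, window=60.0):
    """Keep cmd running; give up if it crashes too often in a short time.

    Returns the exit code of the last run.
    """
    crash_times = []
    while True:
        code = subprocess.Popen(cmd).wait()
        if code == 0:
            return 0  # clean exit, nothing to restart
        now = time.monotonic()
        crash_times.append(now)
        if not should_restart(crash_times, now, max_restarts, window):
            return code  # crash loop detected, stop retrying
        time.sleep(1.0)  # short pause before the next attempt
```

Here cmd would be whatever command line launches tracker-extract on the system in question. A real watchdog would probably also want to log each crash and skip the file that triggered it, which this sketch leaves out.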

Happy to help :)

--
Regards,
Martyn

Founder & Director @ Lanedo GmbH.
http://www.linkedin.com/in/martynrussell

