[Tracker] Some thoughts on tracker-extract interaction and error handling



Hey,

As of now, embedded metadata extraction works in a pretty simple manner:
tracker-extract processes files one by one in the main thread, and
tracker-miner-fs both controls the request rate and handles errors in
extraction.

When handling errors, tracker-miner-fs relies a lot on current
tracker-extract behavior: it immediately discards a file if there's
suspicion that it's causing some trouble in the extractor (mostly
timeout/no_reply dbus conditions). The heuristic is that if it's the
oldest file in the current request, it must be the one that
tracker-extract was working on.
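
To make the heuristic concrete, here is a minimal sketch in Python (not
the actual tracker-miner-fs code). The function name and the shape of
`pending` are made up for illustration: a dict mapping each file to the
time its request was sent.

```python
def blame_oldest(pending):
    """Return the file to blame for an extractor error, or None.

    `pending` is a hypothetical dict mapping file URI -> time the
    request was sent. Since the extractor works on one file at a time,
    the oldest outstanding request is assumed to be the culprit.
    """
    if not pending:
        return None
    return min(pending, key=pending.get)
```

This only holds while extraction is strictly sequential; once several
files are in flight at the same time, the oldest request is no longer
necessarily the one that caused the error.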

So, let's assume tracker-extract goes threaded for extraction, and
distinguishes between extract modules that are safe to spawn in different
threads, in a single thread only, or even in the main thread only
(drifting off topic here, but it's also something I'd like to see done
eventually).
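
Just to illustrate the three kinds of modules, a rough Python sketch of
how requests could be dispatched. Everything here (ThreadPolicy,
Dispatcher, the module names) is hypothetical and only mirrors the idea,
not any existing tracker-extract API.

```python
from concurrent.futures import Future, ThreadPoolExecutor
from enum import Enum


class ThreadPolicy(Enum):
    ANY_THREAD = 1     # safe to run in several threads at once
    SINGLE_THREAD = 2  # must always run in one dedicated thread
    MAIN_THREAD = 3    # must run in the calling (main) thread


class Dispatcher:
    def __init__(self, workers=4):
        self._pool = ThreadPoolExecutor(max_workers=workers)
        self._dedicated = {}  # module name -> one-worker executor

    def submit(self, module, policy, extract, path):
        """Run extract(path) according to the module's policy."""
        if policy is ThreadPolicy.ANY_THREAD:
            return self._pool.submit(extract, path)
        if policy is ThreadPolicy.SINGLE_THREAD:
            if module not in self._dedicated:
                self._dedicated[module] = ThreadPoolExecutor(max_workers=1)
            return self._dedicated[module].submit(extract, path)
        # MAIN_THREAD: run inline and wrap the outcome in a Future so
        # callers can treat all three cases the same way.
        future = Future()
        try:
            future.set_result(extract(path))
        except Exception as exc:
            future.set_exception(exc)
        return future
```

With something like this, a module that hangs in its dedicated thread
blocks only the files queued behind it for that same module.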

If that's the case, error handling could get rather complicated. no_reply
dbus errors very often mean a crash, and would happen for every ongoing
request. Timeouts often mean that an extractor module has gone wild, be
it because of a bug there, corrupt data... Timeouts could apply to just
a single file, or cascade to a whole mimetype family, depending on the
extractor module behind it.

So I think the lowest common denominator would be to have
tracker-miner-fs enter some "failsafe extraction" mode, where on
extractor error:

1) the file is added to a "failed extraction" list
2) the miner is paused
3) it waits for all ongoing extractor replies
  3a) if extraction failed, those files are added to the list too
4) it goes through the list and requests metadata again, one by one
  4a) if a request fails again, it gives up on that file
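
The flow above could look roughly like this in Python. It's only a
sketch under assumptions: all the callables (pause, resume,
wait_for_pending, extract_one) are hypothetical stand-ins for whatever
the miner would really use, and resuming the miner at the end is my
assumption, since the list doesn't spell it out.

```python
def failsafe_extraction(first_failed, pause, resume,
                        wait_for_pending, extract_one):
    """Retry failed files one by one after an extractor error.

    first_failed:     file whose request triggered the error
    pause / resume:   hypothetical callables controlling the miner
    wait_for_pending: returns [(path, ok), ...] once every in-flight
                      request has been answered (or has failed)
    extract_one:      synchronous single-file request, returns bool
    Returns (recovered, given_up) as two lists of paths.
    """
    failed = [first_failed]       # start the "failed extraction" list
    pause()                       # no new requests while recovering
    try:
        # Collect every other in-flight request that also failed.
        for path, ok in wait_for_pending():
            if not ok:
                failed.append(path)

        recovered, given_up = [], []
        # Retry sequentially, so a second failure points at one file.
        for path in failed:
            if extract_one(path):
                recovered.append(path)
            else:
                given_up.append(path)
        return recovered, given_up
    finally:
        resume()                  # assumption: go back to normal mode
```

For example, if "a.jpg" triggers the error, "c.pdf" fails while draining
the pending requests, and only "c.pdf" fails again on retry, the result
is (["a.jpg"], ["c.pdf"]).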

This does make sense for tracker-extract crashes, but could still
potentially take a long time in timeout conditions for the "extractor
module in single thread" case, so, thoughts?

Cheers,
  Carlos



