Re: [Tracker] [systemd-devel] How to use cgroups for Tracker?



On Thu, Oct 23, 2014 at 1:24 AM, Philip Van Hoof <philip codeminded be> wrote:
Don't try to work around limitations of kernel APIs by
implementing inherently unscalable algorithms in userspace.
I mean, you implemented something that scales O(n), with n the
number of dirs. That's what you need to fix, there's no way
around that. Just looking for magic wands in cgroups and
scheduling facilities to make an algorithm that fundamentally
scales badly acceptable is not going to work.

The problem with allowing unprivileged processes is that a
misbehaving process will cause the kernel to buffer events for
it, forcing the kernel to dynamically allocate memory. Not
something kernel or inotify developers will like a lot.

This is fixable, by enforcing a size limit on the queue. As the
limit is hit the algorithm should coalesce queued events based on
subtrees. For example, if one event for /foo/bar/buzz and one for
/foo/bar/waldo is queued, and the queue hits its limit, both are
replaced by an entry for /foo/bar which is marked with a new flag
that some event was lost below this subtree. For clients this would
then mean that when they receive this they must rescan that
specific subtree again, but not the whole tree.

It's a simple algorithm, the max number of entries stays fixed,
but performance doesn't go completely horrible when the limit is
reached.

Such a scheme should be implemented in fanotify on the kernel
side.
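
The coalescing scheme described above can be illustrated with a small
userspace sketch. This is not fanotify code and not an existing kernel
API; the names CoalescingQueue, push, covers and the lost_below flag are
made up for the illustration. It assumes absolute POSIX paths, and it
picks the deepest directory that absorbs at least two queued entries,
which is one possible policy among several.

```python
import posixpath


def ancestors(path):
    """Yield every proper ancestor directory of an absolute path, deepest first."""
    while path != "/":
        path = posixpath.dirname(path)
        yield path


def covers(directory, path):
    """True if path is the directory itself or lies somewhere below it."""
    return path == directory or path.startswith(directory.rstrip("/") + "/")


def depth(directory):
    """Number of path components: "/" is 0, "/foo" is 1, "/foo/bar" is 2."""
    return directory.rstrip("/").count("/")


class CoalescingQueue:
    """Bounded event queue (hypothetical sketch, not a kernel interface)."""

    def __init__(self, limit):
        if limit < 1:
            raise ValueError("limit must be at least 1")
        self.limit = limit
        self.entries = []  # (path, lost_below) tuples in arrival order

    def push(self, path):
        # Skip exact duplicates and anything below an already-coalesced
        # subtree: the client has to rescan that subtree anyway.
        if any(p == path or (lost and covers(p, path)) for p, lost in self.entries):
            return
        self.entries.append((path, False))
        while len(self.entries) > self.limit:
            self._coalesce_once()

    def _coalesce_once(self):
        # Candidate directories, in a deterministic order.
        candidates = dict.fromkeys(d for p, _ in self.entries for d in ancestors(p))

        def absorbed(d):
            return sum(covers(d, p) for p, _ in self.entries)

        # Deepest directory that would merge at least two entries. "/" always
        # qualifies when two or more distinct paths are queued.
        target = max((d for d in candidates if absorbed(d) >= 2), key=depth)

        merged, placed = [], False
        for p, lost in self.entries:
            if covers(target, p):
                if not placed:
                    merged.append((target, True))  # "events lost below here"
                    placed = True
            else:
                merged.append((p, lost))
        self.entries = merged


if __name__ == "__main__":
    q = CoalescingQueue(limit=2)
    for path in ("/foo/bar/buzz", "/foo/bar/waldo", "/etc/passwd"):
        q.push(path)
    print(q.entries)  # [('/foo/bar', True), ('/etc/passwd', False)]
```

The queue never holds more than `limit` entries, and a client that sees
an entry with the flag set only has to rescan that one subtree.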

And clients must be prepared to rescan the subtrees named by
coalesced events, which can be large (when /foo/bar/waldo and
/foo/bar/buzz are directories with large numbers of files,
coalescing to /foo/bar means tracker-miner-fs suddenly has a lot
of work to do).

But yes. The coalesce solution in fanotify would be a good idea to
allow unprivileged processes. Probably better than what FSEvents does.

That's a good one indeed; coalescing events in that way in the kernel
looks quite a sane approach. Still, one single process in userspace
doing all the control of what changed when (like FSEvents does) may
actually behave much nicer, as other processes could ask for all
changes coalesced since a specific timestamp (e.g. since that process
was run). I'm not thinking of Tracker here; think of a program which
runs sporadically, but when it runs it wants to know what changed
since the last time it was run (e.g. a backup app).
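
To make the "changes since a timestamp" idea concrete, here is a rough
sketch of the bookkeeping such a userspace process might do. It is not
how FSEvents is implemented; ChangeJournal, record and changed_since are
hypothetical names, and the sketch assumes events are recorded with
non-decreasing timestamps.

```python
import bisect


class ChangeJournal:
    """In-memory log of (timestamp, path) changes (hypothetical sketch)."""

    def __init__(self):
        self._times = []  # kept sorted because events arrive in time order
        self._paths = []

    def record(self, timestamp, path):
        # Assumption: the caller never records an event older than the last one.
        self._times.append(timestamp)
        self._paths.append(path)

    def changed_since(self, timestamp):
        """Paths changed at or after timestamp, each reported once."""
        start = bisect.bisect_left(self._times, timestamp)
        # dict.fromkeys coalesces repeated paths and keeps first-seen order.
        return list(dict.fromkeys(self._paths[start:]))


if __name__ == "__main__":
    journal = ChangeJournal()
    journal.record(10, "/home/user/a.txt")
    journal.record(20, "/home/user/b.txt")
    journal.record(30, "/home/user/a.txt")
    # A backup app that last ran at t=20 asks what it missed.
    print(journal.changed_since(20))
```

A real service would also have to persist the journal and bound its
size, for instance with the subtree coalescing discussed earlier in the
thread.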

Anyway, please remember that being privileged isn't the only reason
why Tracker can't use fanotify. Its API being fd-based, it works on
existing open files only; e.g. it won't report file deletes or move
events, among other things. If we want some recursive monitoring
approach with all CREATE/UPDATE/DELETE/MOVE events, something new
needs to be implemented, or inotify somehow improved to handle that.

-- 
Aleksander
https://aleksander.es

