Re: [Tracker] [systemd-devel] How to use cgroups for Tracker?



On 20/10/14 21:12, Lennart Poettering wrote:
On Tue, 14.10.14 15:35, Martyn Russell (martyn lanedo com) wrote:

Hej Lennart,

I am not entirely sure what cgroups would really give you that
sched_setscheduler(), ioprio_set(), setrlimit() wouldn't give you

It's another approach, more of a sandbox in my mind than a priority system (which the above APIs are really - possibly with the exception of setrlimit()). I will get on to APIs in a bit...

Tracker has been used on embedded platforms in the past with its own cgroups because it was more effective than the other APIs. It was recently suggested again in the bug above.

anyway, after all tracker is only a single process, no?

No, there are several:

1. tracker-store (handles DB writes mainly and ontology / DB schema maintenance).

2. tracker-miner-fs (finds files and sends their primary data to the store to be added to the DB. Primary data is purely information about files, like size, name, etc., with no file-type-specific metadata such as song duration).

3. tracker-extract (handles file type metadata extraction - e.g. the song duration).

4. Other data-miners (tracker-miner-apps, etc)

Typically, the data miners use ioprio_set() and sched_setscheduler(). tracker-extract (and ONLY tracker-extract) also used setrlimit() until about 2 weeks ago, when we removed it because it produced more problems than it solved.

The store only uses ioprio_set() - due to all the disk writes we perform there. We don't set the scheduling here because apps on the desktop are using the store too and it should be responsive, so instead we try to make the processes feeding the store more idle or lower priority to drip feed instead.
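To make the "more idle" part concrete, here is a rough sketch of a process dropping itself to an idle scheduling class. It is only an illustration in Python, not Tracker's actual code: the SCHED_IDLE policy and the helper name make_process_idle() are my assumptions, and ioprio_set() is left out because the Python standard library has no wrapper for it.

```python
import os


def make_process_idle(pid=0):
    """Try to move a process to the SCHED_IDLE scheduling policy.

    pid=0 means the calling process. Returns True if the request
    succeeded and False if the platform or kernel refused it.
    The helper name is hypothetical; only the os.* calls are real.
    """
    sched_idle = getattr(os, "SCHED_IDLE", None)  # only defined on Linux
    if sched_idle is None:
        return False
    try:
        # SCHED_IDLE requires a static priority of 0.
        os.sched_setscheduler(pid, sched_idle, os.sched_param(0))
    except OSError:
        return False
    return True


if __name__ == "__main__":
    print("now idle:", make_process_idle())
```

A miner would call something like this once at startup, while the store would skip it so that desktop apps querying it stay responsive.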

Moreover tracker is unprivileged and runs in user context (not system
context), right? In that case access to cgroupfs is not available, you
have to go through systemd's per-user APIs. However, currently moving
user sessions to system is not complete, hence I fear it's too early to
make this change. Also, at least initially, even if we'd establish
systemd for users widely I have doubts we'll support much more than
CPUShares= for unprivileged users.

All completely true.
Though my suggestion was to try to install a cgroup by the packager that could be generally used by Tracker. I don't see Tracker needing to change the details of the group.

libcgroup is obsolete btw. And while this is currently not strictly
enforced there is supposed to be only one writer to the cgroup tree,
and on most systems that is systemd.

Interesting, I didn't know that. I can scrap that branch I started then :)

Anyway, what precisely are you trying to do?

Even using the kernel APIs, we still get bug reports about crazy CPU use and/or disk use. Since employing sched_setscheduler(), I think the situation is much better, but some people really want Tracker to
  a) only perform when they're away from their computer OR
  b) be completely unnoticeable.

Now, we can do a) without cgroups, but I believe b) would be better done using cgroups.

Why don't the kernel API calls pointed out above suffice?

I think it depends on the API and who you talk to.

For me, the APIs work quite well. However, we still get bug reports. I find this quite hard to quantify personally because the filesystems, hardware and version of Tracker all come into play and can make quite some difference.

The one API that doesn't really work for us is setrlimit(), mainly because we have to guess the memory threshold (which we never get right) and we get a lot of SIGABRTs that get reported as bugs. I suppose we could catch SIGABRT and exit gracefully, but lately, we've agreed (as a team) that if an extractor or library we depend on uses 2 GB of memory and brings a smaller system to its knees, it's a bug and we should just fix the extractor/library, not try to compensate for it. Sadly, there are always these sorts of bugs and it's precisely why tracker-extract is a separate process.
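To show why a guessed threshold is so brittle, here is a small sketch that runs a child process under an address-space limit and lets it over-allocate. The assumptions are mine: I use RLIMIT_AS (the thread doesn't say which resource was limited), the 1 GiB and 2 GiB figures are arbitrary, and the helper name is made up. Note that in Python the failure surfaces as a MemoryError, whereas a C program whose allocator aborts on failure would die with SIGABRT instead.

```python
import resource
import subprocess
import sys

# Child program: ask for 2 GiB and report whether the limit refused it.
CHILD_CODE = """
import sys
try:
    buf = bytearray(2 * 1024 ** 3)
except MemoryError:
    sys.exit(3)  # allocation refused by the limit
sys.exit(0)
"""


def run_with_address_space_limit(code, limit_bytes):
    """Run `code` in a fresh interpreter with RLIMIT_AS capped.

    Returns the child's exit status. Hypothetical helper for illustration.
    """
    def apply_limit():
        # Runs in the child just before exec; never raise the soft
        # limit above an existing hard limit.
        _soft, hard = resource.getrlimit(resource.RLIMIT_AS)
        new_soft = limit_bytes
        if hard != resource.RLIM_INFINITY:
            new_soft = min(limit_bytes, hard)
        resource.setrlimit(resource.RLIMIT_AS, (new_soft, hard))

    result = subprocess.run([sys.executable, "-c", code],
                            preexec_fn=apply_limit)
    return result.returncode


if __name__ == "__main__":
    status = run_with_address_space_limit(CHILD_CODE, 1024 ** 3)
    print("child exit status:", status)
```

Any file that legitimately needs more than the guessed limit fails in exactly the same way as a runaway extractor, which is the kind of report we kept getting.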

I suppose what we're left with is cgroups that would really bottleneck the process (tracker-extract is the main one I am thinking of here) in terms of memory and disk use.
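For what it's worth, going through systemd rather than writing to the cgroup tree directly might look roughly like the sketch below. It only builds a systemd-run command line and does not run it. The helper name, the CPUShares value of 128 and the bare "tracker-extract" command are placeholders of mine, and since Lennart doubts much more than CPUShares= will be available to unprivileged users, the memory and disk limits I actually want are not covered here.

```python
def build_scope_command(command, properties=None):
    """Build an argv that runs `command` in a transient user scope.

    `properties` maps systemd resource-control property names to values.
    The default of CPUShares=128 is an arbitrary illustrative value.
    """
    props = {"CPUShares": 128} if properties is None else properties
    argv = ["systemd-run", "--user", "--scope"]
    for name, value in props.items():
        argv += ["-p", f"{name}={value}"]
    argv.append("--")  # end of systemd-run options
    return argv + list(command)


if __name__ == "__main__":
    # "tracker-extract" stands in for the real installed binary path.
    print(" ".join(build_scope_command(["tracker-extract"])))
```

Whether this works at all depends on the per-user systemd instance being available, which is exactly the part that is not complete yet.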

Thanks Lennart, any suggestions you have would be appreciated :)

--
Regards,
Martyn

