Re: [Tracker] Reviving the project, a first attempt
- From: Philip Van Hoof <philip codeminded be>
- To: Martyn Russell <martyn lanedo com>
- Cc: Jos van den Oever <jos vandenoever info>, Tracker mailing list <tracker-list gnome org>
- Subject: Re: [Tracker] Reviving the project, a first attempt
- Date: Thu, 22 Nov 2012 12:21:10 +0100
On Thu, 2012-11-22 at 10:52 +0000, Martyn Russell wrote:
On 11/22/2012 09:01 AM, Philip Van Hoof wrote:
Hello Philip,
Hey Martyn!
Buried under its own weight of complexity the project is stifled. Why do
I think this?
The project isn't dead. I should point this out. It's slowed up due to
the change in funding clearly.
Right, it's not dead, it's slowed down. I agree. You made several fine
releases and patches are being reviewed. Great job on that.
[CUT]
I always wondered, Jürg, where the name vstore came from? But it was a
fantastic branch and piece of work that you did. It clearly steered the
project in the direction of SPARQL and Nepomuk as ontology. Thanks.
I don't recall what the branch was about actually. At the height of our
development, I was merging ca. 6 branches a week into master. Hard to
keep up with all of them ;)
That branch was the SPARQL and Nepomuk stuff, the big redesign point
where the indexer got separated from the SPARQL endpoint. The branch got
removed apparently but I recall that vstore was its name.
[CUT]
This is not the case anymore. And I heard from developers of a new phone
OS being developed that Tracker is again used, that it was again a hard
Which one?
They have not officially announced this and I don't want the developers
who discussed it with me in private to feel sorry that they did. You'll
figure it out, Martyn ;-)
[CUT]
I would also like to thank our top contributors and the people who
worked on Qt based libraries built on top of libtracker-sparql for
spreading the truth about our team and Tracker. You guys know who you
are, I don't have to name you ;-)
Don't forget your input. You made quite a sizeable contribution and made
quite some difference. ;)
And oh my God I'm writing so much text just to make a simple point ..
It's definitely not an imposter writing this then :P
Thanks for the nice compliments!
The API libtracker-extract's tracker_extract_client_get_metadata is not
public enough because Tracker relies too heavily on the file system
miner. Today it is time to change this.
I agree that it's too heavily relying on the miner.
There is a good reason for this. The filesystem information, name, size,
mtime, etc is all handled by the miner-fs. You could likely solve this
issue by "chaining" extractors and have a basic file extractor which
gets this information so the miner isn't doing it.
Yes, I agree.
This is the reason why miner-fs is injecting SPARQL, because it
concatenates extractor specific SPARQL with file system general SPARQL.
We always wanted to redesign this, as it was an extra IPC callback that
we could avoid. So improvement here would also be beneficial for
Tracker's FS miner in the form of less IPC overhead. It's win-win.
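
To make the "chaining" idea above concrete, here is a minimal sketch in Python. It is not Tracker code: the function names (basic_file_extractor, stub_image_extractor, build_insert) are hypothetical, the image extractor is a stand-in that returns fixed values, and the ontology properties used (nfo:fileName, nfo:fileSize, nfo:fileLastModified, nfo:width) are assumed to be modelled this way. It only shows a basic file extractor contributing the file-system-general part, so that a caller other than miner-fs could obtain one complete INSERT.

```python
import os
from datetime import datetime, timezone
from pathlib import Path
from typing import Callable, List

# Hypothetical type: an extractor maps a path to "predicate object" pairs.
Extractor = Callable[[str], List[str]]


def _literal(text: str) -> str:
    """Quote a string as a SPARQL literal, escaping backslashes and quotes."""
    return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'


def basic_file_extractor(path: str) -> List[str]:
    """File-system-general facts: the part miner-fs contributes today."""
    st = os.stat(path)
    mtime = datetime.fromtimestamp(st.st_mtime, tz=timezone.utc)
    return [
        "a nfo:FileDataObject",
        "nfo:fileName " + _literal(os.path.basename(path)),
        "nfo:fileSize %d" % st.st_size,
        "nfo:fileLastModified " + _literal(mtime.strftime("%Y-%m-%dT%H:%M:%SZ")),
    ]


def stub_image_extractor(path: str) -> List[str]:
    """Stand-in for a format-specific extractor; the values are made up."""
    return ["a nfo:Image", "nfo:width 640"]


def build_insert(path: str, extractors: List[Extractor]) -> str:
    """Run every extractor in the chain and merge the results into one INSERT."""
    subject = "<" + Path(path).resolve().as_uri() + ">"
    predicates: List[str] = []
    for extract in extractors:
        predicates.extend(extract(path))
    return "INSERT {\n  " + subject + " " + " ;\n    ".join(predicates) + " .\n}"
```

A caller would pass the chain explicitly, e.g. build_insert(path, [basic_file_extractor, stub_image_extractor]), and send the resulting string to the SPARQL endpoint itself.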
Phone builders want to rid themselves of file system mining. Instead
they want to let MTP daemons, who deal with incoming files, do the
processing and extraction of file meta data. They don't want to
configure with DConf or a GKeyFile to point to a directory where the MTP
daemon will write files, at all.
Right, there are different ways which data can come in and we shouldn't
restrict ourselves to the filesystem. That makes sense. But we don't. It
just so happens the miner-fs is the main way people get data into
tracker-store.
Right now. We can change this, but I think the right developers need to
be reactivated or at least provide a supportive role to a contributor
when or if somebody new starts working on this.
Instead they want their MTP daemon to use a simple API that will trigger
tracker-extract into extracting the file and then writing the SPARQL
INSERT to the SPARQL endpoint.
One of the things I would love to see happen (or add) is a command line
option or way to inject SPARQL from tracker-extract into the store. We
have a hack for this right now with tracker-control -f $FILE and a dbus
API. The main problem with this is that the filesystem data is not there
for files (which are the main use case in tracker right now).
Yes, this could certainly be part of such a redesign or even an
intermediate step towards it.
[CUT]
Tomorrow's phone builders might not even use a file system. Why would
inter app data sharing then necessarily depend on file system indexing?!
You know that the miner-fs doesn't have to be a daemon and can index on
demand (instead of by inotify) right now in stable releases right? The
miner-fs can also be left out of the build with --disable-miner-fs (I think).
I know, but I don't think this is sufficient. An MTP daemon doesn't want
to call system(). They'll need deeper and better defined integration.
File system indexing is of course important, but only for users who need
it. Like a desktop. A desktop needs it. A phone might not need it. And
if it does, they understandably want to limit its use.
I would like to propose to start with adapting libtracker-extract to be
fully documented, to change tracker_extract_client_get_metadata's API in
such a way that it is truly obvious for a platform builder, integrator
or app developer of, for example, an MTP daemon to call it in order to get
the file's meta data to be inserted into tracker-store before the MTP
daemon had to write the file itself.
I was under the impression that it was already. If someone is paying for
this or wants patch review, I am happy to step up.
Awesome
To make it possible to call this on a .tmp-XYZ file for a file that will
later be renamed to Girlfriend.JPEG in the DCIM folder of the phone.
Well, this isn't actually easy to solve even if you move away from
miner-fs. If you're returning the full SPARQL including things like the
file name, size, mtime, etc. then these details change. You either
change the SPARQL and wait before injecting it to the store, or post
process by updating the store details when it changes.
As a team we did a lot of things that were not easy to solve ;-)
You can't have it both ways. You either want the data early and have to
cope with changes like the name changing OR you wait and have the data
in its final (albeit maybe for a small time) state.
Yep
Right now this ain't possible, because libtracker-extract is too focused
on being "just a tool library for the filesystem miner".
Well, I would say it's more that the miner-fs is _THE_ only one using
it, so it's not so bad given that.
Agree
If you mean to suggest we separate this into a new project, I think that
might be a good idea. Same for the miner-fs. Possible for
libtracker-sparql too? Some investigation would be needed, there are
core libraries that we depend on in all cases and might cause problems...
I don't think that separating or splitting the subprojects of Tracker is
right now needed and/or a good idea. Long term it probably is.
One of the recent issues I've had with Tracker is, I can't find it on
Google - I think Rob mentioned this way back at some GUADEC. The name is
quite generic. I have been asked several times why we have so many
things in the tree and if we can disable or split out things. I think
RedHat recently asked if we could do this, I am sure Debian maintainers
have too.
Yes. Back then my opinion was that a rename was not needed and would at
that point in time hurt the project's team adhesion.
Today the situation is different and if all former team members and / or
a new group of contributors taking a lead role in the project agree,
then I think a rename (long term goal) would not be a bad idea.
Sadly, the name "Tracker" has been given a bad reputation for false
reasons. I think the Intel MeeGo attempt for example wrongly accused
Tracker of being a reason why Harmattan MeeGo didn't succeed.
Start of thread here:
http://lists.meego.com/pipermail/meego-architecture/2011-March/000081.html
This was my response:
http://lists.meego.com/pipermail/meego-architecture/2011-March/000113.html
https://mail.gnome.org/archives/tracker-list/2011-March/msg00033.html
A rename might undo that. I'm still not much of a fan of yielding to
reputation pressure from clueless people who, without doing much
investigation (as we did), make false statements.
It's not really the Linux way IMO to have everything in one monolithic
module. So I wouldn't mind splitting things out.
I agree.
To make language bindings for it like for JS, Dalvik, MonoTouch, Qt.
That would be good. The API is quite small too, shouldn't take much effort.
Right
It ought to be a library for all application developers, just like how
libtracker-sparql is such a library: obvious in API, well documented,
suitable for wrapping it with for example a Qt layer and all that stuff.
:) interesting. There is a reason why it's not a library. We often have
crashes for whatever reason.
Yes, I don't think tracker-extract should cease being a process. A
library that does IPC to tracker-extract is probably the right solution.
That or a strong warning that a no-extract-process libtracker-extract
can crash as it relies on a wide variety of libraries having to cope
with a wide variety of file formats.
A libtracker-extract could also be done like how libstreamanalyzer was
done, but I consider libstreamanalyzer's integration, adaptation and /
or merge with what is now the Tracker project a long term goal.
I'm adding Jos in CC. Hey Jos, start of thread here:
https://mail.gnome.org/archives/tracker-list/2012-November/msg00009.html
Sometimes, it's just that the system library was updated and now our
extractor crashes. Sometimes, it's problematic files which cause crashes.
That's why we use a daemon/program to do extraction, because the people
using the extractor don't die. I think making this into a library
presents some interesting situations we would need to consider like that.
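
The isolation argument can be shown with a small Python sketch. This is not how tracker-extract is implemented (that is a separate daemon reached over IPC). The helper extract_in_child and the two child snippets are hypothetical, and the "extractor" is just a one-line script. The point is only that a crash in the child is reported to the caller instead of taking the caller down.

```python
import subprocess
import sys
from typing import Optional


def extract_in_child(child_code: str, path: str, timeout: float = 10.0) -> Optional[str]:
    """Run extraction code in a separate interpreter process.

    Returns the child's stdout, or None if the child crashed, exited with a
    non-zero status, or did not finish within the timeout.
    """
    try:
        proc = subprocess.run(
            [sys.executable, "-c", child_code, path],
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return None
    if proc.returncode != 0:
        return None
    return proc.stdout


# Hypothetical "extractors": one succeeds, one simulates a crash in a
# third-party parsing library by aborting the process.
GOOD = "import sys; print('nfo:fileName \"%s\"' % sys.argv[1])"
CRASH = "import os; os.abort()"
```

Calling extract_in_child(CRASH, "a.jpg") returns None while the calling process keeps running; a library that did the parsing in-process would have died with it.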
I fully agree.
I think whoever starts with improving libtracker-extract in this
direction, perhaps by renaming, copying or refactoring to a new library
the API tracker_extract_client_get_metadata, will revive the project to
its original glory.
I don't really view the project as "losing" its glory. It's just
slowed down, matured even you could say.
Yes ok. Still, it had more glory a few years ago. I think :-)
Kind regards,
Philip
--
Philip Van Hoof
Software developer
Codeminded BVBA - http://codeminded.be