Re: [Tracker] A simplified tracker idea

On 07/01/10 03:55, Yuan Yijun wrote:
2010/1/6 Martyn Russell <martyn lanedo com>:


You forgot to reply to the mailing list too ;)

I hope you don't mind, but I have posted this to the Tracker mailing list
for others to see and comment on.

I am honored, and thank you for replying to my stupid idea. Really,
thank you. I got the idea after an exciting IRC meeting (#fedora-zh),
wrote it down in a few minutes, and then sent you such an immature
item for review. I cannot thank you enough.

No problem. Of course, we have a lot of experience with this stuff by now and a lot of it is not obvious or easy.

OK, so I have a few questions:

- Why don't you like tracker?

I don't mean the Tracker project specifically, but such systems in
general, ranging from "locate" to "aurora". My reasoning:
* applications may have a better understanding of their data.

This is usually true, yes, which is why we have a library and examples showing how to insert that data into Tracker (libtracker-miner).

* if searching is mandatory for an application, it can be made faster
and easier to use.

Can it? Let's say for example, you write an office type client (like oowriter) and you want to collaborate with a contact. How do you get a list of contacts to do that with?

Or, for a simpler example, say your application is an IM client and wants to store information and be able to find it easily. What backend and framework do you use to store and describe the relationships between the data so you can find it quickly and efficiently? Tracker does precisely this. The ontology (largely based on Nepomuk) provides a framework to relate the data. In the end this makes querying much easier and more powerful.
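To make the point about relating data a bit more concrete, here is a minimal sketch in Python of a toy in-memory triple store. It is not Tracker's API or its real ontology: the `TripleStore` class, the `ex:` property names and the contact identifiers are all made up for illustration, loosely in the style of an RDF vocabulary. It only shows why a shared vocabulary lets one query span data pushed by different applications.

```python
# Toy in-memory triple store. All names here (TripleStore, the "ex:" properties,
# the contact ids) are hypothetical and are not taken from Tracker or Nepomuk.
from collections import defaultdict


class TripleStore:
    def __init__(self):
        # predicate -> list of (subject, object) pairs
        self._by_predicate = defaultdict(list)

    def insert(self, subject, predicate, obj):
        self._by_predicate[predicate].append((subject, obj))

    def subjects(self, predicate, obj):
        """Subjects that carry the given predicate/object pair."""
        return {s for s, o in self._by_predicate[predicate] if o == obj}

    def objects(self, subject, predicate):
        """Objects linked from a subject through the given predicate."""
        return [o for s, o in self._by_predicate[predicate] if s == subject]


def contacts_with_im(store):
    """Full names of contacts that have at least one IM address, sorted."""
    names = []
    for contact in store.subjects("rdf:type", "ex:Contact"):
        if store.objects(contact, "ex:hasIMAddress"):
            names.extend(store.objects(contact, "ex:fullName"))
    return sorted(names)


store = TripleStore()
# An IM client pushes what it knows about a contact.
store.insert("contact:1", "rdf:type", "ex:Contact")
store.insert("contact:1", "ex:fullName", "Alice Example")
store.insert("contact:1", "ex:hasIMAddress", "alice@im.example")
# An address book pushes a contact without any IM details.
store.insert("contact:2", "rdf:type", "ex:Contact")
store.insert("contact:2", "ex:fullName", "Bob Example")

print(contacts_with_im(store))
```

Because both applications describe contacts with the same vocabulary, a third application can ask for "contacts I can chat with" without knowing who wrote the data.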

Using Tracker also means the data is stored efficiently in a database and you don't have to do all that yourself in your application x, y and z (it is done once). The data inside Tracker is also available between all applications instead of having to hack your way into someone else's database design (like we did with Evolution) just to find out email information or contact information.

* there is no technical limit on reusing related components. Though it
is hard to do so in the beginning, applications will eventually agree
on how to handle a specific type of content, and then share the same
implementation. Before that happens, it should be made easier for
applications to implement their own cache.

Applications rarely share an interest in reusing data. What tends to happen is that people use whatever they can to get the job done in the best way they know.

- I don't know how Windows Search 4.0 works. How does it work?

I cannot see its code either. It is not important here, since it only
works on windows.

I mean from a topological point of view. Is it the same as it always was (i.e. each search goes through the entire file system)?

- Your idea is to have a collaborative tracker library used by applications

Yes, not only an indexing service but also related UI components. Maybe
the "file chooser dialog" is comparable. Every application that deals
with files needs that dialog, and it is complex and gets a lot of love.

Actually, I didn't realise this until recently, but we have support for that already in Tracker. The GtkFileChooser has a method to allow searching for the file you want and we recently updated the code to work with 0.7 too. We also integrate with Totem for finding videos/audio. We have integration with Nautilus for tagging files too.

I think there are other places using Tracker too, Evolution for example has a plugin. We are likely to end up adding support for some of the music apps because we are pushing a totem-playlist-parser patch that removes the GtkTreeView requirement in the API (so we will end up fixing the apps that break).

The integration is there and more can be added, but it doesn't happen overnight unfortunately :)

[it makes]
more sense to have applications push data to us in its raw sense than having
us try to keep up with the latest schema changes by application X (we have
done this with Evolution and it is nasty).

It seems my idea is to make every application work this way, and let
the service only store data passively.

That's generally how it works. The miner-fs application we ship does the legwork for your computer (i.e. it crawls existing data and inserts it into the store for you). This gets basic "file" information (size, dates, etc.); the extraction of embedded data comes after that.
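As a rough illustration of that two-step idea (basic file information first, embedded data later), here is a small Python sketch. It is not how miner-fs is implemented; `crawl` and `extract_embedded` are hypothetical names, and the "store" is just a dictionary.

```python
import os
import tempfile


def crawl(root):
    """First pass: walk root and record basic file information only."""
    index = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                info = os.stat(path)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            rel = os.path.relpath(path, root)
            index[rel] = {"size": info.st_size, "mtime": info.st_mtime}
    return index


def extract_embedded(path):
    """Placeholder for the second pass (tags, EXIF and so on); returns nothing here."""
    return {}


# Small demo on a throwaway directory so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as root:
    with open(os.path.join(root, "note.txt"), "w") as handle:
        handle.write("hello")
    demo_index = crawl(root)

print(sorted(demo_index))
print(demo_index["note.txt"]["size"])
```

Keeping the first pass this cheap is what lets a crawler cover a whole home directory quickly; the slower, per-format extraction can then run afterwards on only the files that need it.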

Regarding indexing music metadata (and other types of data), if
applications control their own "domain" instead of depending on
Tracker's abilities, the system may evolve faster. I don't think Tracker
can recognize all types of data; if it did, the system would be complex
and too slow. So let applications do that.

No, it can't, but it recognises all the important ones. It is complex, but it isn't slow; it is damn fast. Also, applications do control data in their "domain": IM clients generally don't write data about music.

<-- No standalone process. It is started with the application and quits
with it too, so the index is updated when the user actually needs the
data, and the user can wait. Nowadays computers are fast enough if the
indexing criteria are clear (and an initial indexing has been done).

-->  In practice this is completely flawed. What happens if application A
writes data in a complete fashion and application B is then fired up to do
something with the same data and doesn't use the same *complete* methods to
update the data? It then causes all sorts of problems. Another example is,
what if you use a terminal or some application which doesn't support
Tracker's collaborative library - then you know nothing about that content.
This is unacceptable.
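The failure mode described above can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the "index" is a dictionary of file sizes and modification times, `snapshot` and `diff` are hypothetical helpers, and the writes that bypass the index stand in for a terminal or any application that does not use the shared library.

```python
import os
import tempfile


def snapshot(root):
    """Map each file under root (relative path) to its (size, mtime_ns)."""
    state = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            info = os.stat(path)
            state[os.path.relpath(path, root)] = (info.st_size, info.st_mtime_ns)
    return state


def diff(index, current):
    """Return (added, removed, changed) paths between an old index and now."""
    added = sorted(set(current) - set(index))
    removed = sorted(set(index) - set(current))
    changed = sorted(p for p in set(index) & set(current) if index[p] != current[p])
    return added, removed, changed


with tempfile.TemporaryDirectory() as root:
    # A cooperating application writes a file and updates the index.
    with open(os.path.join(root, "a.txt"), "w") as handle:
        handle.write("one")
    index = snapshot(root)

    # Something that knows nothing about the index (say, a shell) writes too.
    with open(os.path.join(root, "b.txt"), "w") as handle:
        handle.write("two")
    with open(os.path.join(root, "a.txt"), "a") as handle:
        handle.write(" more")

    added, removed, changed = diff(index, snapshot(root))

print(added, removed, changed)
```

Until something rescans or watches the file system, the index simply does not know about `b.txt` or the new contents of `a.txt`, which is the gap a standalone process is there to close.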

Extending an application depends on the author's will and the users'
itches. If an application knows nothing about Tracker and cannot read
its content, that is quite normal nowadays and nothing breaks. If you
are not running Tracker on your system, which application would most
make you want to have Tracker turned on and usable? I just installed
Tracker; could you tell me how to maximize its use?

I would say the most visible ways of using it right now are:

  (the applet to quickly find your information)

  (this is being improved right now but is a bit lacking in features)

  (for tagging files)

  (for finding your files quickly when you need to choose one)

  (for finding all videos/audios)

If you use the command line a lot, then I recommend you try some of the command line apps. For more information see:

