Re: [Tracker] Semantic Desktop for Gnome



On 07/05/13 09:08, אנטולי קרסנר wrote:
Great news! And thanks for the feedback :)

I went through the various Tracker APIs visible in Devhelp, and I have a
few design questions, which you may be able to help me with.

Hello Tom.

1. Data storage:
Some file formats, such as binary data of all kinds (music, executables,
video, etc.), are stored as regular files, and only the metadata is
stored in Tracker's database. But many file formats are just text. For
example, my task management app represents the data as a graph of tasks,
and 100% of the data has semantic value. Same thing with notes and
probably many other file formats which describe data in simple XML.

So the question is: Is all the data supposed to be stored in the Tracker
database and fetched from it every time, or should I still store the XML
data as local files, and "report" semantic data to the Tracker database?

Clearly, my data *does* have some plain text, such as the task
descriptions. I guess it's not valuable to Semantic Desktop, but still,
I'm wondering how to store the data. Another idea may be like this:
Store the task descriptions and all other plain text in regular text
files, but all the semantic-desktop-useful data, like the tags and the
task graphs, will go to the Tracker database.

This is really a question which divides the team. Email is the classic example: do we, or could we, store emails in the database?

Sure we could. But some on the team feel Tracker is more about storing the metadata and location of the information than the information itself. This also keeps the database from becoming bloated.

Some data types are exempt as Ivan says. Contacts and SMS/MMS content on the Nokia devices were saved in the database (for example).

2. Data Mining

I see two basic approaches to using the database: Writing a miner which
reads my app's XML files and writes semantic data to Tracker, or having my
app write the data directly by itself. One clear difference is that if
the app writes directly, it's then able to avoid storing in regular
files what is already in the Tracker database, and thus avoid having
data stored twice.
Are there other considerations?

This divides the community too. There are 4 approaches here:

1. Many application developers prefer NOT to have an indexer running in the background, but to call an API which indexes their files on demand. This depends on your content, of course. With your tasks, for example, maybe you don't want the file holding the data to be indexed as well?

2. The miner-fs runs for all other cases, e.g. indexing PDFs, music, etc. left in your common XDG directories.

3. You can pick up your data with a specific miner, as you say. This really means another process and another daemon, so it's better to avoid it if you are writing an app that could use approach #1. What's important to realise here is that the miner-fs also indexes *file*-based information, e.g. name, size and dates, if that's important to you.

4. Your application inserts the data manually, using SPARQL insert statements. This gives you much more control over the data you put in but you are responsible for it! :)
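
To make approach #4 a bit more concrete, here is a rough sketch of how an app might build the SPARQL insert statement for one task. It is only an illustration under assumptions: the helper names (escape_literal, build_task_insert) and the urn:task: URN are made up for this example, and the ontology terms used (ncal:Todo, nie:title, nao:Tag, nao:prefLabel, nao:hasTag) should be checked against the ontologies installed with your Tracker version before relying on them.

```python
def escape_literal(text):
    """Escape a string for use inside a double-quoted SPARQL literal."""
    return (text.replace("\\", "\\\\")
                .replace('"', '\\"')
                .replace("\n", "\\n"))


def build_task_insert(task_urn, title, tags=()):
    """Build a SPARQL INSERT statement describing one task.

    Hypothetical helper: the class and property names below are
    assumptions, not something confirmed in this thread.
    """
    triples = []
    tag_nodes = []
    for i, tag in enumerate(tags):
        # One blank node per tag, labelled with the tag text.
        node = f"_:tag{i}"
        triples.append(
            f'{node} a nao:Tag ; nao:prefLabel "{escape_literal(tag)}" .'
        )
        tag_nodes.append(node)

    # The task itself: its type, its title, and links to its tags.
    task = f'<{task_urn}> a ncal:Todo ; nie:title "{escape_literal(title)}"'
    for node in tag_nodes:
        task += f" ; nao:hasTag {node}"
    triples.append(task + " .")

    return "INSERT {\n  " + "\n  ".join(triples) + "\n}"


if __name__ == "__main__":
    print(build_task_insert("urn:task:1", "Write the miner", ["gnome", "tracker"]))
```

The resulting string would then be handed to Tracker's SPARQL update interface (for example tracker_sparql_connection_update() in libtracker-sparql). Removing or updating that data later is also the app's job, which is the "you are responsible for it" part.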

--
Regards,
Martyn

Founder and CEO of Lanedo GmbH.

