Re: [Tracker] Semantic Desktop for Gnome



Hi Anatoly,

 Good questions!


On Tue, May 7, 2013 at 1:08 AM, אנטולי קרסנר <tombackton gmail com> wrote:
Great news! And thanks for the feedback :)

I went through the various Tracker APIs visible in Devhelp, and I have a
few design questions, which you may be able to help me with.


1. Data storage:
Some file formats, such as binary data of all kinds (music, executables,
video, etc.), are stored as regular files, and only the metadata is
stored in Tracker's database. But many file formats are just text. For
example, my task management app represents the data as a graph of tasks,
and 100% of the data has semantic value. Same thing with notes and
probably many other file formats which describe data in simple XML.

So the question is: Is all the data supposed to be stored in the Tracker
database and fetched from it every time, or should I still store the XML
data as local files, and "report" semantic data to the Tracker database?

Information that is only metadata (e.g. contacts, a simple note) can be stored directly in Tracker. There is no need to use a file. A file representation could be useful for sharing the information (like a vCard), but you can even generate it on demand.
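
As a rough illustration of "generate it on demand", here is a minimal Python sketch that turns contact fields into vCard 3.0 text. The function names and the choice of fields are assumptions made for the example; in a real app the values would come from a query against the store, and a complete exporter would handle more properties and line folding.

```python
def escape_vcard(text):
    """Escape characters that have a special meaning in vCard text values."""
    return (text.replace("\\", "\\\\")
                .replace(";", "\\;")
                .replace(",", "\\,")
                .replace("\n", "\\n"))


def contact_to_vcard(given_name, family_name, email=None):
    """Build a small vCard 3.0 document from already-known contact fields.

    Hypothetical helper: the parameters stand in for values the app would
    read from its own data or from the store.
    """
    lines = [
        "BEGIN:VCARD",
        "VERSION:3.0",
        # N is "Family;Given;Additional;Prefix;Suffix"
        f"N:{escape_vcard(family_name)};{escape_vcard(given_name)};;;",
        f"FN:{escape_vcard(given_name + ' ' + family_name)}",
    ]
    if email:
        lines.append(f"EMAIL;TYPE=INTERNET:{escape_vcard(email)}")
    lines.append("END:VCARD")
    # vCard lines are terminated by CRLF
    return "\r\n".join(lines) + "\r\n"


card = contact_to_vcard("Ada", "Lovelace", "ada@example.org")
print(card)
```

Nothing is written to disk here; the text only exists when someone asks for it, for example to attach it to a mail.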
 

Clearly, my data *does* have some plain text, such as the task
descriptions. I guess it's not valuable to Semantic Desktop, but still,
I'm wondering how to store the data. Another idea may be like this:
Store the task descriptions and all other plain text in regular text
files, but all the semantic-desktop-useful data, like the tags and the
task graphs, will go to the Tracker database.

Every resource in Tracker is an InformationElement and has a nie:plainTextContent property (IIRC) that you can use to store those small texts.

Those texts are very useful for plain-text search. You can then find your objects in tracker-needle, and you can use the text as a clause in your structured queries: "All notes, with their titles, sorted by modification date, with 'todo' in the content".
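
A query like the one above could look roughly as follows. This is only a sketch: it builds the SPARQL text in Python and assumes that notes are typed as nfo:Note, that the modification date is in nie:contentLastModified, and that full-text search is reached through fts:match. Please check the property names against the ontology installed on your system. The helper names are made up for the example.

```python
def escape_literal(text):
    """Escape a string for use inside a double-quoted SPARQL literal."""
    return (text.replace("\\", "\\\\")
                .replace('"', '\\"')
                .replace("\n", "\\n"))


def notes_matching(term):
    """Return SPARQL text: notes containing `term`, newest first.

    Assumed vocabulary: nfo:Note, nie:title, nie:contentLastModified
    and fts:match.
    """
    return (
        "SELECT ?note ?title WHERE {\n"
        "  ?note a nfo:Note ;\n"
        "        nie:title ?title ;\n"
        "        nie:contentLastModified ?modified ;\n"
        f'        fts:match "{escape_literal(term)}" .\n'
        "}\n"
        "ORDER BY DESC(?modified)"
    )


query = notes_matching("todo")
print(query)
```

The resulting string can then be handed to whatever you use to talk to the store (libtracker-sparql from the app, or the command-line tools while experimenting).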
 
2. Data Mining

I see two basic approaches to using the database: Writing a miner which
reads my app's XML files and writes semantic data to Tracker, or having my
app write the data directly by itself. One clear difference is that if
the app writes directly, it's then able to avoid storing in regular
files what is already in the Tracker database, and thus avoid having
data stored twice.
Are there other considerations?


We prefer that applications write directly into Tracker. Some reasons:
  * It saves the indexing roundtrip: Using files as an intermediate format means that the app writes the file, miner-fs discovers it and sends it to an extractor, which reads the file, parses the content, creates SPARQL and sends it to the store. All this is fairly quick, but it is an unnecessary delay that can be avoided by writing directly into the store.
  * It saves one "translation" of the content: The app already knows the information and its structure. Using a file means translating that information into a file format, which is then interpreted again by Tracker's extractor. These translations risk losing or misinterpreting the data.
  * It simplifies the program: It is easier to use Tracker directly than to write things to a file and wait for signals saying that a new file has been found.
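
To give an idea of what "writing directly" means, here is a minimal Python sketch that builds the SPARQL update for one task, with its description in nie:plainTextContent and its tags as nao:Tag resources. The function name, the example URI and the use of the generic nie:InformationElement class are assumptions for illustration; a real app would pick a more specific class from the ontology and would reuse existing tag resources instead of creating new blank nodes every time.

```python
def escape_literal(text):
    """Escape a string for use inside a double-quoted SPARQL literal."""
    return (text.replace("\\", "\\\\")
                .replace('"', '\\"')
                .replace("\n", "\\n"))


def build_task_insert(task_uri, title, description, tags=()):
    """Return SPARQL update text that stores one task in the store.

    Hypothetical helper; the vocabulary (nie:title, nie:plainTextContent,
    nao:Tag, nao:prefLabel, nao:hasTag) is assumed to match the ontology.
    """
    statements = []
    tag_nodes = []
    # One blank node per tag, labelled with the tag text
    for index, tag in enumerate(tags):
        node = f"_:tag{index}"
        tag_nodes.append(node)
        statements.append(
            f'  {node} a nao:Tag ; nao:prefLabel "{escape_literal(tag)}" .'
        )
    task = [
        f"  <{task_uri}> a nie:InformationElement ;",
        f'    nie:title "{escape_literal(title)}" ;',
        f'    nie:plainTextContent "{escape_literal(description)}"',
    ]
    # Link the task to each tag, keeping the ";" / "." punctuation valid
    for node in tag_nodes:
        task[-1] += " ;"
        task.append(f"    nao:hasTag {node}")
    task[-1] += " ."
    statements.extend(task)
    return "INSERT {\n" + "\n".join(statements) + "\n}"


update = build_task_insert(
    "urn:example:task1", "Buy milk", "Two litres, semi-skimmed", ["home"]
)
print(update)
```

The app sends this text to the store in one step, so there is no file to write, no miner to wait for and no extractor to parse anything back.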


 Regards,

Ivan


