Re: Finding and Reminding, tech issues, 3.0 and beyond

From: Jamie McCracken <jamie mccrack googlemail com>
To: Martyn Russell <martyn lanedo com>
Cc: Owen Taylor <otaylor redhat com>, gnome-shell-list gnome org, Alexander Larsson <alexl redhat com>, desktop-devel-list gnome org
Subject: Re: Finding and Reminding, tech issues, 3.0 and beyond
Date: Wed, 14 Apr 2010 13:28:01 -0400

One way around all this is to provide interfaces which could use tracker
or something else for base storage and notification of changes. 
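
To make that concrete, here is a rough sketch of what such an interface
could look like. It is only an illustration: the names TagStore,
InMemoryTagStore, add_tag, tags_for and subscribe are hypothetical and
are not part of tracker or any existing library. Applications would
code against the abstract class, and a tracker-backed (or other)
implementation could be swapped in behind it.

```python
from abc import ABC, abstractmethod
from collections import defaultdict
from typing import Callable, Dict, List, Set

# Hypothetical callback signature: (uri, tags on that uri after the change)
ChangeCallback = Callable[[str, Set[str]], None]


class TagStore(ABC):
    """Hypothetical interface: storage plus change notification."""

    @abstractmethod
    def add_tag(self, uri: str, tag: str) -> None: ...

    @abstractmethod
    def tags_for(self, uri: str) -> Set[str]: ...

    @abstractmethod
    def subscribe(self, callback: ChangeCallback) -> None: ...


class InMemoryTagStore(TagStore):
    """Stand-in backend for illustration; a real one would wrap tracker
    or some other store behind the same three methods."""

    def __init__(self) -> None:
        self._tags: Dict[str, Set[str]] = defaultdict(set)
        self._subscribers: List[ChangeCallback] = []

    def add_tag(self, uri: str, tag: str) -> None:
        if tag in self._tags[uri]:
            return  # nothing changed, so nobody is notified
        self._tags[uri].add(tag)
        for callback in self._subscribers:
            # Hand out a copy so subscribers cannot mutate stored state.
            callback(uri, set(self._tags[uri]))

    def tags_for(self, uri: str) -> Set[str]:
        return set(self._tags.get(uri, set()))

    def subscribe(self, callback: ChangeCallback) -> None:
        self._subscribers.append(callback)
```

The point of the split is that the UI only ever sees TagStore, so the
choice of base storage stays an implementation detail.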

In the future tracker may use CouchDB for storage of tags/notes and
other user input metadata, as being able to import/export to/from the
cloud and across machines is likely to be very important. (CouchDB is
schema-less and so is not suitable for direct use IMO without tracker's
standardised ontologies.)

I would add that private metadata (which only has meaning for a specific
app) should never be put in tracker (indeed our ontologies/schemas are
for shareable metadata only).

jamie


On Wed, 2010-04-14 at 18:04 +0100, Martyn Russell wrote:
> On 14/04/10 16:10, Alexander Larsson wrote:
> > On Fri, 2010-04-09 at 18:09 -0400, Owen Taylor wrote:
> >
> >> "User defined tags"
> >>
> >>    A completely flat view of all documents doesn't handle all users
> >>    or use cases. "Frequent filers" will want to be able to identify
> >>    projects and other subsets of files.
> >>
> >>    There's not a detailed plan for the user interface right now, but
> >>    technically this could be done a couple of ways.
> >>
> >>    We could use the traditional method of grouping by using
> >>    folders; and just make that look somewhat tag-like in the
> >>    UI. (Make selecting a folder show all the files in that folder
> >>    and all sub-folders. Allow creating a folder of files without
> >>    worrying where it was and automatically creating it in
> >>    ~/Documents.)
> >>
> >>    Or we could use a real tag-based approach with tags stored in
> >>    metadata. (multiple tags per file, tags orthogonal to folders.)
> >
> > Does tracker currently index gvfs/gio metadata?
> 
> No.
> 
> > That's a highly efficient
> > way to set small "non-extracted" metadata on files that will
> > automatically be copied/moved/etc when files are managed with nautilus
> > or other gio apis.
> 
> That sounds quite similar to what Tracker does, perhaps not as efficient 
> of course (due to infrastructure/IPC/etc).
> 
> What about files which are not copied/moved/etc with GIO APIs/commands?
> 
> >> Tracker
> >> =======
> >
> >>   * Using Tracker to extract and index metadata from files is
> >>     pretty uncontroversial. Using Tracker as the primary store
> >>     of information (such as tags) is more controversial - suddenly
> >>     the user's data is dependent on the use of Tracker.
> >
> > I'm personally of the opinion that we should use a separate store for
> > such metadata, and then index this with tracker. Which is why I created
> > the gvfs metadata storage:
> 
> It has certainly been suggested that reproducible metadata (from files) 
> should be kept in a separate database from the one holding tags and 
> more user-specific data. I think we agree there that it would be 
> preferable, however, it isn't done that way right now.
> 
> > http://blogs.gnome.org/alexl/2009/06/24/data-about-data/
> 
> Just as an update on the current state of Tracker regarding that post:
> 
> - "It uses a database, so each read operation is an IPC call that gets 
> resolved to a database query, causing performance issues", this is true 
> to some extent. We have been looking at improving this. In addition 
> there is a bug about a direct access library which Bastien filed. We are 
> considering it¹.
> 
> - "I don’t like the mixing of user specified data like custom icons with 
> auto-extracted or generated data...", we agree generally. Tags are a bit 
> of a unique situation at the moment (there might be other cases too).
> 
> - "The tracker database is a huge (gigabytes) complex database with 
> information from lots of sources, mostly autogenerated." This might be 
> true for 0.6, but 0.8 is much more efficient in terms of space. For my 
> collection (consisting of 174223 resources, 11450 images, 17750 audio 
> files, 151052 files, 20536 folders) the meta database is 344542208 
> bytes. That's quite a bit smaller than it used to be.
> 
> - "This risks the data not being backed up", we now have a journal which 
> backs up the data quite efficiently too. This replays all transactions 
> and even ontology changes.
> 
> - "Also, people having problems with tracker are prone to remove the 
> databases and reindexing just to see if that “fixes it”...", the only 
> people that come to us with those problems are extreme use and corner 
> cases or where testers have done something which warrants it. Generally 
> we don't get many people needing that with 0.8+.
> 
> - "...or due to database format changes on upgrades.", we have limited 
> support for this in 0.8 (depends on what changes in the ontology). For 
> 0.9 we have pretty much completed that now.
> 
> - "Also, the generic database model seems like overkill for the simple 
> stuff we want to store, like icon positions and spatial window 
> geometry.", it is no longer generic, which is a major contributing 
> factor to why 0.8 is so fast.
> 
> - "For instance, many people report that system performance when using 
> Tracker suffer. I’m sure this is fixable...", It was always fixable with 
> slower indexing, but that's usually not good enough. It is always a 
> trade off. This is not the case so much these days. At least on my 
> desktop I don't even notice when it is running. Others have said the 
> same thing.
> 
> ¹ https://bugzilla.gnome.org/show_bug.cgi?id=613255
>
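
For a sense of scale, the 0.8 figures quoted above (344542208 bytes for
174223 resources) can be turned into a total in MiB and an average per
resource. This is only back-of-the-envelope arithmetic on those two
numbers; the function name db_size_summary is made up for the example.

```python
DB_BYTES = 344_542_208   # meta database size quoted for 0.8
RESOURCES = 174_223      # total resources quoted for the same collection


def db_size_summary(db_bytes: int, resources: int) -> tuple:
    """Return (size in MiB, average bytes per resource)."""
    return db_bytes / 2**20, db_bytes / resources


mib, per_resource = db_size_summary(DB_BYTES, RESOURCES)
# prints "328.6 MiB total, 1978 bytes per resource"
print(f"{mib:.1f} MiB total, {per_resource:.0f} bytes per resource")
```

So the quoted database is roughly a third of a gigabyte, or about 2 kB
per indexed resource.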

References:
- Finding and Reminding, tech issues, 3.0 and beyond
  - From: Owen Taylor
- Re: Finding and Reminding, tech issues, 3.0 and beyond
  - From: Alexander Larsson
- Re: Finding and Reminding, tech issues, 3.0 and beyond
  - From: Martyn Russell
