Finding and Reminding, tech issues, 3.0 and beyond



I've attempted below to extract some of the technical bits from
http://live.gnome.org/GnomeShell/Design/Whiteboards/FindingAndReminding
and see how they line up with our current technology. This is just
notes, not yet a concrete plan.

- Owen

File management ideas and technology
====================================

"Things can safely fall off the desktop"
 
  The desktop is reconceptualized as "what you are working on", 
  "the most relevant items". Getting something off the desktop then
  shouldn't require an explicit filing decision by the user. The user
  should be able to let items "expire" with attention, or they should be
  able to "archive" an item to remove it from the desktop.

  There are two basic approaches here - one is to avoid storing
  things on the Desktop. Instead of seeing the Desktop as a separate
  location in the file selector, you'd have a checkbox:
  
   [ ] Pin to Desktop
   
  (or whatever the designers come up with), and that would create
  a symlink to the desktop.
  
  The other approach is when expiring or archiving to move files
  from ~/Desktop to an archival location like ~/Documents.
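
  As a rough sketch of the two approaches (not a proposal for where
  this code would live), here they are as plain file operations. The
  function names and the "refuse to overwrite" policy are my own
  assumptions for illustration:

```python
import shutil
from pathlib import Path


def pin_to_desktop(target: Path, desktop: Path) -> Path:
    """Approach 1: the file stays where it is; the desktop only gets a symlink."""
    link = desktop / target.name
    link.symlink_to(target.resolve())
    return link


def archive_from_desktop(item: Path, archive: Path) -> Path:
    """Approach 2: the file lives on the desktop until it is moved to the archive."""
    archive.mkdir(parents=True, exist_ok=True)
    destination = archive / item.name
    if destination.exists():
        # Assumed policy: never silently overwrite an archived file.
        raise FileExistsError(destination)
    return Path(shutil.move(str(item), str(destination)))
```

  With the first approach, unpinning is just deleting the symlink and
  the file never moves. With the second, the file's path changes when
  it is archived, so anything that remembers the old path has to cope.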
  
"Be able to treat non-local information the same as Places"

  Right now, the user has a couple of ways of organizing files, based on
  directories on the file system: "Music", "Documents", "Downloads".
  We want to be able to present other options for narrowing your items
  that might not correspond to directory structure. This could
  include "Frequent", "From Email", "Spreadsheets", and so forth.
  
  In general, this type of thing requires searching over all files
  to find the subset of files that share some meta-data property.
  This is a core operation for Tracker (and for other search engines
  like Beagle, though Tracker seems to have the most interest at
  the moment.)
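
  To make the operation concrete, here is a brute-force version of
  "find the files that share a metadata property", using the MIME type
  guessed from the file name as the property. This is deliberately not
  how Tracker does it (an indexer answers the question without walking
  the disk each time); the SELECTORS table and the select() name are
  made up for the example:

```python
import mimetypes
from pathlib import Path

# Hypothetical mapping from a selector label to MIME types.
SELECTORS = {
    "Spreadsheets": {
        "application/vnd.oasis.opendocument.spreadsheet",
        "application/vnd.ms-excel",
        "text/csv",
    },
}


def select(root: Path, label: str) -> list[Path]:
    """Return every file under root whose guessed MIME type matches the label.

    The result cuts across directories: a match in Documents and a match
    in Downloads end up in the same list.
    """
    wanted = SELECTORS[label]
    return sorted(
        path
        for path in root.rglob("*")
        if path.is_file() and mimetypes.guess_type(path.name)[0] in wanted
    )
```

  Every call rescans the whole tree, which is exactly the cost an
  index is there to avoid.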
    
"User defined tags"

  A completely flat view of all documents doesn't handle all users
  or use cases. "Frequent filers" will want to be able to identify
  projects and other subsets of files.
  
  There's not a detailed plan for the user interface right now, but
  technically this could be done a couple of ways.
  
  We could use the traditional method of grouping by using
  folders; and just make that look somewhat tag-like in the
  UI. (Make selecting a folder show all the files in that folder
  and all sub-folders. Allow creating a folder of files without
  worrying where it was and automatically creating it in
  ~/Documents.)
   
  Or we could use a real tag-based approach with tags stored in
  metadata. (multiple tags per file, tags orthogonal to folders.)
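
  The difference between the two shows up in a small sketch. The
  TagStore class is an in-memory stand-in I made up for "tags stored
  in metadata"; it says nothing about where the tags would really be
  kept:

```python
from collections import defaultdict
from pathlib import Path


def files_in_folder_tag(folder: Path) -> set[Path]:
    """Folder-as-tag: the 'tag' is the folder, including all sub-folders."""
    return {path for path in folder.rglob("*") if path.is_file()}


class TagStore:
    """Real tags: any number of tags per file, independent of location."""

    def __init__(self) -> None:
        self._files_by_tag: dict[str, set[Path]] = defaultdict(set)

    def tag(self, path: Path, *tags: str) -> None:
        for name in tags:
            self._files_by_tag[name].add(path)

    def files(self, name: str) -> set[Path]:
        # Return a copy so callers cannot modify the store by accident.
        return set(self._files_by_tag.get(name, ()))
```

  In the folder version a file can only be in one place, so it can
  only carry one "tag" (short of symlinks). In the tag version one
  file can show up under several tags without being moved.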
  
"Timeline view of files"

  For items that aren't on the desktop (the "slip") the default view
  is a chronological one with "yesterday", "last week", and so
  forth. So we need to be able to organize user's files this way.
  
  One approach is to keep track of user accesses and edits via
  Zeitgeist (or in simplified form by ~/.recently-used.xbel).
  
  The other approach would be to treat access/edit time as a
  metadata property, and to use Tracker to search over these
  properties.
  
  (Note that the timeline here only includes each item once,
  not once for each usage - I use "timeline" somewhat differently
  below)
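
  Whichever source the timestamps come from, the grouping itself is
  simple. A sketch, assuming we already have one "last used" time per
  item; the bucket boundaries (and the catch-all "older") are my
  guesses, not something the mockups specify:

```python
from datetime import date, datetime


def bucket(day: date, today: date) -> str:
    """Map a day to a coarse label relative to today (boundaries assumed)."""
    age = (today - day).days
    if age <= 0:
        return "today"
    if age == 1:
        return "yesterday"
    if age <= 7:
        return "last week"
    return "older"


def timeline(last_used: dict[str, datetime], now: datetime) -> dict[str, list[str]]:
    """Group items by bucket, newest first; each item appears exactly once."""
    groups: dict[str, list[str]] = {}
    newest_first = sorted(last_used.items(), key=lambda kv: kv[1], reverse=True)
    for name, when in newest_first:
        groups.setdefault(bucket(when.date(), now.date()), []).append(name)
    return groups
```

  Because the input is a single timestamp per item, this works equally
  well with a metadata property or with the latest entry from an event
  log.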
  
"Search"

  We want to be able to search - over the names of all
  documents, but also over extracted metadata such as
  document titles, and maybe over full text. This is definitely
  best supported by something like Tracker.

"Adding non-files to Desktop"

  Files won't be the primary interesting thing for all people;
  we probably want to provision for at least putting web
  bookmarks into the desktop area. (This is also interesting
  for people who want to have a GNOME desktop for their users
  configured in some particular way.)

  Probably the existing way we do web bookmarks for ~/Desktop will work.

Tracker
=======

In some testing, Tracker 0.8 seems enormously better behaved
than Tracker 0.6. It has very significant optimizations in how
it stores the tracker database on disk, and also, by default,
only indexes defined subdirs of $HOME. So, as of right now,
system-impact of Tracker isn't a big concern of mine, as it
would be for 0.6.

Possible concerns and considerations with Tracker:

 * RDF + SPARQL + a large collection of ontologies does present
   a significant new barrier to someone coming to the GNOME
   platform. While the basic concepts of RDF are quite simple,
   RDF serialization formats and SPARQL are new things people
   will have to learn, and there are some intimidating terms
   like "ontology".
   
   RDF is also popularly (and perhaps unfairly) seen as
   yesterday's fad.
   
 * There is a large abstraction barrier between the application
   and the underlying data storage. It's very hard to decipher
   or influence how storing data in RDF and running SPARQL queries
   maps into low-level database operations.
   
 * Indexing only a subset of the filesystem, while it does
   avoid performance traps like indexing into large Git
   repositories, could result in odd behavior from a user's
   point of view. If you edit a file in an unindexed part
   of your home directory, is it invisible when looking at
   your history?
   
   This may be partly satisfied by feeding accessed files
   into the Tracker indexed set file-by-file, either directly
   or via Zeitgeist.
   
 * Even when limiting Tracker to a subset of the home directory,
   it's likely still possible to run the system out of inotify
   handles. 
 
 * Using Tracker to extract and index metadata from files is
   pretty uncontroversial. Using Tracker as the primary store
   of information (such as tags) is more controversial - suddenly
   the user's data is dependent on the use of Tracker.
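
On the inotify point in the list above: an inotify watch covers a
single directory, not a tree, so watching a directory tree costs one
watch per directory in it. A quick way to get a feel for the numbers
on Linux is sketched below; the function names are mine, and this
only estimates the count rather than measuring what Tracker
actually uses:

```python
import os
from pathlib import Path
from typing import Optional

# Per-user limit on inotify watches, as exposed by the Linux kernel.
LIMIT_FILE = Path("/proc/sys/fs/inotify/max_user_watches")


def watches_needed(root: Path) -> int:
    """Count the directories under root (inclusive): one watch each."""
    return sum(1 for _dirpath, _dirnames, _filenames in os.walk(root))


def watch_limit() -> Optional[int]:
    """Return the per-user watch limit, or None if it cannot be read."""
    try:
        return int(LIMIT_FILE.read_text())
    except (OSError, ValueError):
        return None
```

Comparing watches_needed() for the indexed directories against
watch_limit() shows how much headroom is left for every other
program that also uses inotify.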
   
Zeitgeist
=========

The "properties of files" approach of Tracker works for a lot
of things. However, it is pretty much unsuitable for storing
time-based histories of actions. We can store the last time
a file was edited as a Tracker property. It's slightly harder
to store all the times the file was edited. It's considerably
harder to store all the times the file was edited including
the editing application for each access.

(Of course, anything can be stored in RDF; it's a perfectly
general format; however, the more that we have to create
anonymous nodes, the more different structures that we are
storing in the Tracker triple store, the harder it is going
to be to optimize, and the less suitable a straightforward
implementation of the triple-store backed by a sqlite database
is.)

My understanding is that the Tracker people have disclaimed
the log storage problem. The role of log-storage for projects
like "GNOME Activity Journal" is taken over by the Zeitgeist
daemon.

Concerns and thoughts concerning Zeitgeist:

 * There are two things where the event-logging approach
   of Zeitgeist really shines - first showing timelines
   (what was I doing two weeks ago on Thursday) and second 
   doing sophisticated computations over the past actions 
   of the user (what documents were typically edited at the 
   same time as this document.) Although these are
   interesting areas to explore, neither is central to
   current file management ideas in GNOME Shell for 3.0.
   
   The only thing I can think of in the current mockups
   that requires a Zeitgeist-like approach is the
   "Frequent" selector. Without a longitudinal view
   of usage, it's hard to answer "what are the most frequently 
   used documents in the last 30 days".
   
 * To a much greater extent than Tracker, Zeitgeist is
   designed to require applications to be modified to
   push events to it.
   
 * Zeitgeist is designed to be standalone and independent
   from Tracker, but also to be used in conjunction with it.
   This, at times, makes things not as good as they could be.
   For example, Tracker has a pretty sophisticated system to
   assign a UID to each file and track files as they
   move around the file system, but Zeitgeist, which
   identifies files by path, will lose a file as
   soon as it is moved - it doesn't piggyback off the
   work that Tracker is doing.
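
To illustrate the difference between a "last edited" property and an
event log, and why "Frequent" needs the latter, here is a sketch with
a made-up Event record. This is not Zeitgeist's actual data model or
API, just the minimum needed to show the shape of the problem:

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Event:
    path: str          # files identified by path, so a moved file looks new
    when: datetime
    application: str   # the editing application for this access


def last_edited(events: list[Event]) -> dict[str, datetime]:
    """What a single property per file can hold: only the latest time."""
    latest: dict[str, datetime] = {}
    for event in events:
        if event.path not in latest or event.when > latest[event.path]:
            latest[event.path] = event.when
    return latest


def frequent(events: list[Event], now: datetime,
             days: int = 30, limit: int = 5) -> list[str]:
    """Most frequently used paths within the window; needs the full log."""
    cutoff = now - timedelta(days=days)
    counts = Counter(e.path for e in events if e.when >= cutoff)
    return [path for path, _count in counts.most_common(limit)]
```

last_edited() throws away everything frequent() depends on, which is
the sense in which the property approach can't answer "most
frequently used in the last 30 days".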
 
Nautilus
========

The more we hide the hierarchical filesystem as a primary
way of looking at your files, the harder job Nautilus has
to explain what is going on. If in gnome-shell we transparently
merge together things in ~/Documents and things in ~/Downloads,
then the user doesn't have a mental model that there are
two separate places and some types of things are found in
one place and some types of things in the other place.

But we can't just consider Nautilus to be the backdoor to
the filesystem - the thing you use when you need to do something
low-level - because the overview is a place you go to find
things, to switch to them and get out. It's not meant to be
a place for spending lots of time manipulating things.

So for explicit manipulation of files (cleaning up,
filing, etc.) the user would probably still be using Nautilus.

Major modifications here are not going to happen
for 3.0, but as much as possible there needs to be alignment
so that things feel familiar between the two places.

Conclusions?
============

Not much yet - I think it will definitely be hard to implement
our ideas without something that looks a lot like Tracker, and
since we have Tracker, something that looks a lot like Tracker
is most likely Tracker :-) Zeitgeist seems less centrally crucial, 
but there is a role for event logging here. 

Further UI design is definitely needed to figure out what we
can do short-term for Nautilus/GtkFileChooser, etc.



