Re: Announcing Folks2



Hey,

Sorry for the late reply (I realise I said I would look at this a few
weeks ago, and completely failed to find the time). Here’s a bit of a
mash-up of a reply, since Travis has made most of the points I wanted to
make already.

On Fri, 12-04-2013 at 18:26 +0200, Xavier Claessens wrote:
Hello folks,

For the past 3 weeks I've been working weekends and nights to rewrite
Folks from scratch. Yes, I'm expecting a lot of hate and criticism, but
with my experience on the N900 and N9 address books I wanted to try my
own design ideas instead of starting from someone else's design and then
spending months making their wrong design decisions fast enough to be
usable. So even if this turns out to be useless work, I had a lot of fun
writing Folks2 and that's already enough for me :-)

For the record, I think the design decisions we made for folks1 are
pretty much all correct. The code which implements them is not great,
but it was originally written for correctness, with the intention of
coming back and optimising the slow bits later (which unfortunately
hasn’t happened until smcv came along).

The key difference between folks1 and folks2 is the use of a daemon to
store linking data. I’m very interested to see how that compares to
folks1 once smcv’s finished his performance work.

On Mon, 22-04-2013 at 13:03 -0700, Travis Reitter wrote:
On Fri, 2013-04-12 at 18:26 +0200, Xavier Claessens wrote:
Hello folks,

For the past 3 weeks I've been working weekends and nights to rewrite
Folks from scratch. Yes, I'm expecting a lot of hate and criticism, but
with my experience on the N900 and N9 address books I wanted to try my
own design ideas instead of starting from someone else's design and then
spending months making their wrong design decisions fast enough to be
usable. So even if this turns out to be useless work, I had a lot of fun
writing Folks2 and that's already enough for me :-)

http://cgit.collabora.com/git/user/xclaesse/folks2.git/

Folks2 is entirely written in C/GObject. It has its own soname and
should be parallel installable with Folks. However, it reuses the same
namespace and has an API close to the original Folks, so it probably
cannot be used in parallel in the same application (why would you do
that anyway?). It already has telepathy and EDS backends and is able to
load my roster in about 200ms. I've hacked a roster reusing
EmpathyRosterContact and EggListBox widgets on top of Folks2, and GTK is
now clearly the bottleneck. Using an old-school GtkTreeView, Folks2
displays my full roster almost instantaneously. By comparison, Empathy
takes 100% CPU for about 15s and its window goes grey because the WM
thinks it has crashed.

Good to hear about that huge speed-up. Have you tried it with a Google
account in EDS with ~6,000 contacts? That's the slowest use case for
Folks currently. See https://bugzilla.gnome.org/show_bug.cgi?id=689549

Indeed. It would also be good to see how this compares to folks1 after
smcv’s optimisation work is finished.

Just glancing over the API very briefly, a few things stand out:

* I'm assuming your "merging" is non-destructive as it is in Folks. If
  so, I'd be careful to always refer to it as "linking" which apparently
  is much clearer to users (that's why we've used that language
  ourselves) and obviously it's preferable to actually-destructive
  merging, as we know from Maemo. Since it's really a single concept in
  both the implementation and UI, we might as well be consistent with
  the language.

Totally agreed.

* And I'd recommend search functionality early for similar reasons. I've
  got a bit-rotted branch of it for Folks that I'm meaning to clean up
  and finally merge at some point, but it's something we should have
  considered early on, since it's very important. All that really
  matters is fuzzy string matching for names and phone numbers (which
  are their own special case). Any fancier types of searching (including
  booleans, etc.) really aren't worth implementing because regular
  people (and most irregular people) don't use them in real life.

More discussion about it (from the folks hackfest) here:
https://bugzilla.gnome.org/show_bug.cgi?id=646808

* using the vCard attribute structuring is tempting, and I considered it
  as well. My main question is, how does the client look when you're
  displaying the Individual's avatar, nickname, and presence type (as an
  icon)? If it's not too bad, it may be worthwhile. We created all the
  interfaces for convenience and type safety. I'm ambivalent about which
  approach is better.

I definitely prefer the approach folks1 took where we used type-safe
interfaces for properties, rather than vCards. Apart from all the nice
theoretical advantages of type safety (and not having to marshal
everything through strings, GValues or GVariants), clients only ever use
a well-defined set of properties anyway, so having support for arbitrary
key-value pairs is pointless.

Pasting Folks2's README for more details. Comments are of course
welcome.

Concepts
========

Backends (e.g. telepathy, eds) provide a set of personas, each of which
provides a common API on top of a backend-specific contact (e.g. TpContact,
EContact). Each persona has a unique identifier string which is opaque to the
user, but can be parsed by its backend to find the underlying contact. For
example, a telepathy persona's ID would look like
"telepathy:<account path>:<contact id>".
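
As a rough illustration of that ID layout (not taken from the Folks2 source),
the sketch below splits a persona ID into its backend name and the
backend-private remainder, then parses the telepathy form. The function and
type names are hypothetical. It assumes the account path is a D-Bus object
path, which cannot contain a colon, so the first colon after the backend name
ends the path.

```python
from typing import NamedTuple, Tuple


class PersonaId(NamedTuple):
    backend: str       # e.g. "telepathy" or "eds"
    backend_data: str  # opaque to everyone except the named backend


def split_persona_id(persona_id: str) -> PersonaId:
    """Split "<backend>:<backend data>" at the first colon."""
    backend, sep, rest = persona_id.partition(":")
    if not sep or not backend or not rest:
        raise ValueError(f"not a persona ID: {persona_id!r}")
    return PersonaId(backend, rest)


def parse_telepathy_data(backend_data: str) -> Tuple[str, str]:
    """Split "<account path>:<contact id>" (hypothetical helper).

    Assumption: the account path holds no colon, so everything after the
    first colon is the contact id, which may itself contain colons.
    """
    account_path, sep, contact_id = backend_data.partition(":")
    if not sep or not account_path or not contact_id:
        raise ValueError(f"not telepathy persona data: {backend_data!r}")
    return account_path, contact_id
```

Only the telepathy backend would call the second function; other code would
treat the whole string as opaque, as the README says.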

Individuals aggregate one or more personas and expose information gathered
from them. For example, an individual's presence is the presence of the "most
available" of its personas, its emails are the union of the emails of all its
personas, etc. An individual also has a unique identifier string. If it
contains only one persona, then that is the persona's identifier. Otherwise it
looks like "individual:<uuid>". An individual ID represents a unique set of
personas; adding or removing personas results in a new individual ID.
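
The two aggregation rules named in that paragraph can be sketched as follows.
This is only a model of the idea, not Folks2 code: the `Presence` values, their
ordering, and the `Persona` fields are assumptions made for the example.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import FrozenSet, Iterable


class Presence(IntEnum):
    # Assumed ordering: a higher value means "more available".
    OFFLINE = 0
    AWAY = 1
    AVAILABLE = 2


@dataclass(frozen=True)
class Persona:
    id: str
    presence: Presence = Presence.OFFLINE
    emails: FrozenSet[str] = frozenset()


def aggregate_presence(personas: Iterable[Persona]) -> Presence:
    """Presence of the "most available" persona; OFFLINE if there are none."""
    return max((p.presence for p in personas), default=Presence.OFFLINE)


def aggregate_emails(personas: Iterable[Persona]) -> FrozenSet[str]:
    """Union of the email addresses of all personas."""
    return frozenset().union(*(p.emails for p in personas))
```

For instance, an individual made of an offline EDS persona and an available
telepathy persona would report `Presence.AVAILABLE` and the emails of both.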

I think this special-casing could end up making the client code a little
awkward. I can see why it'd be useful but I'd encourage you to keep a
flat namespace for individual IDs and just let the client code check
whether there's a single Persona in case it cares about that. Otherwise,
I could see this causing trouble.

Agreed. We’ve had far too much trouble with different ID namespaces in
folks1. Deterministically producing the individual’s ID from its
personas’ IDs is good. If you really do want folks2 to replace folks1,
you might want to change it so that IDs are compatible between the two
(which, IIRC, should just mean using the same UID format for personas,
and the same hashing algorithm for generating individuals’ IDs). Then
IDs which are saved externally to folks won’t be broken in a transition
from folks1 to folks2.
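
A minimal sketch of what "deterministically producing the individual's ID from
its personas' IDs" could look like, using a flat namespace with no special case
for single-persona individuals. The function name, the choice of SHA-1, and the
length-prefix encoding are assumptions for illustration; this is not claimed to
be the algorithm folks1 uses, so it would not give folks1-compatible IDs as
written.

```python
import hashlib
from typing import Iterable


def individual_id(persona_ids: Iterable[str]) -> str:
    """Derive an ID that depends only on the set of persona IDs."""
    unique = sorted(set(persona_ids))  # order and duplicates must not matter
    if not unique:
        raise ValueError("an individual needs at least one persona")
    digest = hashlib.sha1()
    for pid in unique:
        data = pid.encode("utf-8")
        # Length-prefix each ID so ("ab", "c") and ("a", "bc") hash differently.
        digest.update(len(data).to_bytes(4, "big"))
        digest.update(data)
    return digest.hexdigest()
```

Because the ID is a pure function of the persona set, any process can compute
it without consulting a daemon, and adding or removing a persona yields a
different ID.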

Folksd is a DBus service centralizing merge information. It maps individual
IDs to the set of persona IDs merged together. It knows nothing other than
IDs, so clients must tell it explicitly which personas to merge together. Any
implicit merging logic must be done client-side, possibly by asking the user.

‘linked’, rather than ‘merged’?
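
To make the Folksd description concrete, here is an in-memory model of the
mapping it would hold, with no DBus involved. The class and method names are
hypothetical, and one behaviour is an assumption: linking a persona that
already belongs to an individual absorbs that whole individual and retires its
old ID, which matches the README's statement that changing the persona set
produces a new individual ID.

```python
import uuid
from typing import Dict, FrozenSet, Iterable


class LinkStore:
    """Maps individual IDs to the persona IDs linked together; nothing else."""

    def __init__(self) -> None:
        self._links: Dict[str, FrozenSet[str]] = {}
        self._owner: Dict[str, str] = {}  # persona ID -> individual ID

    def link(self, persona_ids: Iterable[str]) -> str:
        members = set(persona_ids)
        if len(members) < 2:
            raise ValueError("linking needs at least two personas")
        # Assumption: absorb any existing individual that shares a persona.
        for pid in list(members):
            old = self._owner.get(pid)
            if old is not None and old in self._links:
                members |= self._links.pop(old)
        new_id = f"individual:{uuid.uuid4()}"
        self._links[new_id] = frozenset(members)
        for pid in members:
            self._owner[pid] = new_id
        return new_id

    def personas_of(self, individual_id: str) -> FrozenSet[str]:
        if individual_id in self._links:
            return self._links[individual_id]
        if individual_id.startswith("individual:"):
            raise KeyError(individual_id)  # retired or unknown link
        # An unlinked persona is its own single-persona individual.
        return frozenset({individual_id})
```

The store never looks inside a persona ID beyond the "individual:" prefix
check, which mirrors the point that all matching logic lives in the client.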

Expected UI
===========

One very important goal is to limit, as much as possible, the duplication of
information between processes. Processes caring about only a few individuals
(call UI, chat window, file transfer handler, gnome-shell notifications, etc.)
should not fetch, or get change notifications about, any other individuals.
Only one process should fetch all individuals: the Contact List UI.

The contact list is expected to know all the details about all individuals, so
it is expected to be the process which runs the merging heuristics and tells
Folksd to merge some personas. It is its responsibility to decide whether to
ask for the user's confirmation first.

The contact list could also implement a DBus searching interface, so other
applications could ask "which individual has email address foo@example.com" and
it would return the individual ID. From that ID the application can ask Folksd
for the list of persona IDs and then fetch the needed information.

Out of scope
============

Compared with the original Folks project, a few things are out of Folks2's
scope:

 - Offline telepathy contacts caching: It is important to still have all
   information about individuals regardless of internet connectivity. It is
   thus necessary to cache telepathy contacts on disk for offline usage. It is
   expected that the Connection Manager will write that information to disk.
   When an account goes online/offline only the presence of the persona would
   change, and internally the persona wrapper would switch from a TpContact to a
   TpOfflineContact and vice versa.
   See https://bugs.freedesktop.org/show_bug.cgi?id=62378

Agreed. This should be moved out of folks1 and into Telepathy. It was
only implemented in folks1 because that was easiest at the time.

 - Merging heuristics: It is the UI which decides which personas to merge
   together. The UI could also decide to add information to e.g. a Google EDS
   book that would help its heuristics detect merged contacts on other
   devices. It would be the UI's responsibility to decide whether to trust that
   information and do implicit merging without asking the user.

I disagree. I think it’s important that a consistent view of links
between contacts is seen by all processes which use folks, and hence the
aggregation heuristics must be in the core of folks. Unless you’re
talking about heuristics for finding potential matches (folks1’s
PotentialMatch API)?

In general, I think this is an interesting project and I'm curious to
see how it could handle the points I raised above.

Me too. I think you’ve written an impressive amount of code (especially
given it was a spare time project — although the copyright headers say
Collabora??), and it would be nice to see it feed back into folks1. Or
perhaps for folks2 to replace folks1, though I think that would be more
work (both in adding features to folks2, testing them, and porting
clients without causing pain for everyone).

Philip


