On Fri, Jul 10, 2009 at 4:04 PM, Jamie McCracken
<jamie mccrack googlemail com> wrote:
On Fri, 2009-07-10 at 14:47 +0200, Philip Van Hoof wrote:
> I'm sure that if there are performance issues with Zeitgeist that the
> Zeitgeist team will eventually optimize them out.
>
> For example, indeed, by reimplementing them in a more performing
> programming language, or maybe even by implementing them as miner
> plugins, SPARQL functions or code running in Tracker's processes.
>
> I don't believe that Zeitgeist performing badly will affect Tracker's
> reputation.
Putting it in a more performant language won't help, as it's the
architecture that's suspect.
Here is what Zeitgeist is intending to do, as far as I can see:
1) Zeitgeist front end <-> dbus <-> Zeitgeist Daemon <-> dbus <-> Tracker
2) Gnome apps <-> dbus <-> Zeitgeist Daemon <-> dbus <-> Tracker
As you can see, in that architecture dbus is used twice for everything,
so you double the latency and CPU needed to copy and route data. This is
independent of the language used, although Python, being the slowest
language around, may well exacerbate it further.
Here is what I'm proposing, with the unnecessary middleware removed:
1) Zeitgeist front end <-> dbus <-> Tracker
2) Gnome apps <-> dbus <-> Tracker
Bear in mind that Zeitgeist wants to use Tracker for full-text searches
and tag searches as well as updates, and it's these searches which will
be delayed by their proposed architecture. That is my concern. If there
is stuff that Tracker does not or will not do, then they can use libs
instead of a daemon and still avoid the suspect architecture.
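To make the hop-count argument concrete, here is a minimal Python sketch that counts the dbus links in each layout written above. The function names and the `per_hop_ms` cost are hypothetical placeholders, not measurements; the only thing the sketch shows is that the middleware layout has twice as many dbus links as the direct one.

```python
# Hypothetical model: every "<-> dbus <->" link in a path costs one hop.
# per_hop_ms is a placeholder value, not a measured D-Bus latency.

def dbus_hops(path):
    """Count the dbus links in a path such as 'A <-> dbus <-> B'."""
    parts = [part.strip() for part in path.split("<->")]
    return parts.count("dbus")

def estimated_latency_ms(path, per_hop_ms=1.0):
    """Estimate the one-way latency as hops times an assumed per-hop cost."""
    return dbus_hops(path) * per_hop_ms

# The two layouts, copied from the message above.
with_daemon = "Zeitgeist front end <-> dbus <-> Zeitgeist Daemon <-> dbus <-> Tracker"
direct = "Zeitgeist front end <-> dbus <-> Tracker"

if __name__ == "__main__":
    for name, path in (("with daemon", with_daemon), ("direct", direct)):
        print(name, dbus_hops(path), "hop(s)", estimated_latency_ms(path), "ms")
```

Whatever the real per-hop cost turns out to be, the ratio between the two layouts stays the same in this model.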
I also feel the real fun in Zeitgeist is surely in the front end, the
timeline-based UI they will produce, and not in back-end work, which is
best left to us IMO.
As a Zeitgeist developer, I'd like to clarify something: our goal is to deliver a new user interface for managing/browsing documents in time for GNOME 3.0. What's extremely important to us is that we release early and often enough to receive feedback from the community and polish our work.
I don't know what Seif has discussed with the Tracker team at GCDS, and I don't speak for all of the Zeitgeist developers, but it really doesn't make any difference to me whether Zeitgeist's GUI accesses the database through Zeitgeist-Daemon (which itself pulls the information from Tracker) or through Tracker itself. A few things matter to me a lot more than the speed benefit:
1. A high-level API that wraps around SPARQL: I do like the option of using SPARQL for advanced queries, but right now we don't need that much power and it raises the barrier to entry for new developers.
2. A stable D-Bus API that we can use today. (I'm planning on releasing some code that I'm working on before Tracker 0.7 is even released.) We can't wait for Tracker to add support for tracking timestamps and stall all development until then.
3. The ability to quickly write indexers in any language and insert new documents into the database over D-Bus: As said above, speed isn't that much of a concern at the moment. It definitely makes sense to eventually rewrite Python indexers in C to improve the performance, but now is not the time for that. Time is a lot more valuable to us than speed.
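To illustrate the first point, here is a minimal Python sketch of the kind of high-level helper I mean: the caller passes a tag and gets back a SPARQL query string, without writing SPARQL by hand. The function names are hypothetical, and the ontology terms (`nie:url`, `nao:hasTag`, `nao:prefLabel`) are assumptions about the store's schema rather than a confirmed Tracker API. The sketch only builds the query text; sending it over D-Bus is left out.

```python
# Hypothetical helper: hides SPARQL behind a plain function call.
# The ontology terms below are assumed, not confirmed against any
# particular Tracker release.

def escape_literal(text):
    """Escape characters that would break a double-quoted SPARQL literal."""
    return (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
    )

def documents_with_tag(tag, limit=10):
    """Build a SPARQL query for the URLs of resources carrying a tag label."""
    return (
        "SELECT ?url WHERE { "
        "?doc nie:url ?url ; "
        "nao:hasTag ?tag . "
        '?tag nao:prefLabel "%s" '
        "} LIMIT %d" % (escape_literal(tag), int(limit))
    )

if __name__ == "__main__":
    print(documents_with_tag("holiday photos", limit=5))
```

A new developer would only need to call `documents_with_tag("holiday photos")`, while advanced users could still write raw SPARQL when they need the extra power.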
I'd say that at the moment Zeitgeist front end <-> dbus <-> Zeitgeist Daemon <-> dbus <-> Tracker makes sense because we already offer high-level APIs that applications can use today. I wouldn't mind migrating some of Zeitgeist Daemon's functionality to Tracker or a suite of Tracker indexers over time, and eventually it may be possible to integrate it into existing software and get rid of it entirely.
There's one other major advantage that using Zeitgeist as middleware (for now) gives us, and that's flexibility. If tomorrow we decide to extend our ontologies (and Events is a good example of this), then we want the ability to do it immediately. As GNOME 3 comes closer and our prototypes turn into deployable apps, we'll focus on speed and cutting out unnecessary steps wherever possible.
Regards,
Natan