Re: [Tracker] Review request : Bridge manager subsystem



On 16/08/09 09:05, Ivan Frade wrote:
Hi Adrien!

Hi all,

as part of my GSoC project, I implemented a system to allow Tracker to
index online resources. It's basically split into two parts:
1. The bridges are small programs which connect to a remote webservice
and import data into Tracker via its SPARQL interface. They are
standalone processes and all expose a common DBus interface. They are
started by DBus activation.

Great! BTW, we call those processes "miners".

Yep we do :)

2. A bridge manager, in charge of calling the bridges to ask them to
pull the data. Any program can talk to the bridges; here I implemented
a bridge manager as a Tracker subsystem.

Uhm, the original idea was that each bridge/miner has its own
configuration and is an independent entity, with no control at all
from Tracker itself.

Martyn was working on a generic configuration super-class, so you don't
need to write the configuration management from scratch for each
miner.

Yea, we already have a preliminary DBus API for this. Perhaps we could discuss this further with you, Adrien, to make sure the API is sufficient. Right now the file system miner is the only one we have to worry about, so our API might be a bit vague.

Basically, what it does is
list the available bridges, and call the synchronization method
(Pull()) at a given interval. The default interval is 300 seconds, and
is currently shared by all the bridges. It also exposes a DBus
interface which allows setting the Pull() interval and forcing a call
to Pull() on one or all of the bridges.
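
As an illustration only, the scheduling behaviour described above could be
sketched in Python roughly as follows. This is not the code from the
tracker-bridges branch: there is no DBus here, bridges are plain
callables, and the names BridgeManager, register, set_interval,
force_pull and tick are hypothetical. The current time is passed in
explicitly so the logic can be exercised without waiting.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

DEFAULT_INTERVAL = 300  # seconds; the shared default mentioned above


@dataclass
class BridgeManager:
    """Hypothetical in-process stand-in for the bridge manager."""

    interval: float = DEFAULT_INTERVAL
    bridges: Dict[str, Callable[[], None]] = field(default_factory=dict)
    last_pull: Dict[str, float] = field(default_factory=dict)

    def register(self, name: str, pull: Callable[[], None]) -> None:
        # In the real system this would be a DBus proxy for the bridge.
        self.bridges[name] = pull

    def set_interval(self, seconds: float) -> None:
        if seconds <= 0:
            raise ValueError("interval must be positive")
        self.interval = seconds

    def force_pull(self, now: float, name: Optional[str] = None) -> List[str]:
        """Call Pull() on one bridge, or on all of them if name is None."""
        names = [name] if name is not None else list(self.bridges)
        for n in names:
            self.bridges[n]()
            self.last_pull[n] = now
        return names

    def tick(self, now: float) -> List[str]:
        """Pull every bridge whose last pull is at least `interval` old."""
        due = [
            n for n in self.bridges
            if n not in self.last_pull
            or now - self.last_pull[n] >= self.interval
        ]
        for n in due:
            self.force_pull(now, n)
        return due
```

A caller would invoke tick() periodically (for example from a main loop
timer); a forced pull resets the timer for that bridge only.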

I prefer completely autonomous bridges/miners to a "poll" solution;
besides, the poll code would have to be in tracker-store and it doesn't
fit there. tracker-store is just a SPARQL DB with a DBus interface; no
extra logic (DB and SPARQL handling and backup stuff, that's all in
the store).

I agree with Ivan here. Miners should know how and when to pull data, not be polled from the store. Also the store shouldn't have to manage miners at all. This "sync" method is currently done by starting all miners on startup using a desktop file and letting them run the whole time (instead of exiting after inactivity like we used to).

Details :
The object basically lists the available bridges (they all have a
.desktop file in /usr/share/tracker/bridges), and keeps the list in
memory.
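
A minimal sketch of that listing step, assuming each bridge's .desktop
file has a standard "[Desktop Entry]" group with a "Name" key (the
thread does not say which keys the bridge files actually carry). The
function name list_bridges is hypothetical.

```python
import configparser
from pathlib import Path

BRIDGE_DIR = "/usr/share/tracker/bridges"  # directory named above


def list_bridges(directory=BRIDGE_DIR):
    """Return {bridge name: path of its .desktop file}."""
    bridges = {}
    root = Path(directory)
    if not root.is_dir():
        return bridges
    for path in sorted(root.glob("*.desktop")):
        # Desktop files may contain '%' field codes, so disable interpolation.
        parser = configparser.ConfigParser(interpolation=None, strict=False)
        parser.optionxform = str  # keep key case ("Name", not "name")
        try:
            parser.read(path, encoding="utf-8")
        except configparser.Error:
            continue  # skip files that do not parse
        if parser.has_section("Desktop Entry"):
            # Assumption: fall back to the file name if there is no Name key.
            name = parser.get("Desktop Entry", "Name", fallback=path.stem)
            bridges[name] = str(path)
    return bridges
```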

This part is really interesting, because we need a tracker-applet that
receives information from all those bridges. The idea I discussed with
Martyn was something like the network manager applet, but instead of
listing networks, listing miners and their status.

Yea, the way we planned to do this was by using the DBus API to get a list of names with the Tracker prefix for miners and expect them all to have the same base class API (pause/continue/get_status/etc).

When you click on the applet you would see a list like:

Filesystem    OK
RSS           Updating
Flickr        Paused

Maybe a pause/play button to pause/run the miners. To do this we need
all miners present in DBus, so maybe we can use your code to implement
this.
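
To make the idea concrete, here is a rough sketch of a shared miner base
class and the status list an applet might render. Everything here is
hypothetical: the class and method names are invented for illustration
(resume stands in for "continue", which is a Python keyword), and a real
applet would talk to the miners over DBus rather than hold them as
in-process objects.

```python
from abc import ABC, abstractmethod


class Miner(ABC):
    """Hypothetical common API: pause / resume / get_status."""

    def __init__(self, name):
        self.name = name
        self.paused = False

    def pause(self):
        self.paused = True

    def resume(self):
        self.paused = False

    def get_status(self):
        # A paused miner always reports "Paused", whatever it was doing.
        return "Paused" if self.paused else self.active_status()

    @abstractmethod
    def active_status(self):
        """Status string to report while not paused, e.g. "OK"."""


class StaticMiner(Miner):
    """Toy miner that reports a fixed status while running."""

    def __init__(self, name, status):
        super().__init__(name)
        self._status = status

    def active_status(self):
        return self._status


def format_status(miners):
    """Render one 'name  status' line per miner, names padded to align."""
    width = max((len(m.name) for m in miners), default=0)
    return "\n".join(
        f"{m.name.ljust(width)}  {m.get_status()}" for m in miners
    )
```

A pause/play button in the applet would then map onto pause() and
resume() for the selected miner.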

Yea, as Ivan says, this is what we want to do. This is quite important for users to see how things are progressing and to be able to control those processes.

Where to get the code :
git://git.mymadcat.com/tracker , branch tracker-bridges
Then, if you want to get some bridges to play with, first install
git://git.mymadcat.com/vapi
git://git.mymadcat.com/libtrackerbridge
git://git.mymadcat.com/bridge-manager
git://git.mymadcat.com/bridge-facebook
git://git.mymadcat.com/bridge-flickr
git://git.mymadcat.com/bridge-twitter
git://git.mymadcat.com/bridge-gdata

  WOW, I want to try this!

Nice, I will take a look this week some time.

Unknown issues :
I'm sure there are many... Please report them to me !

  One thing that is common to almost all miners is the
"connection_status" functions to know whether we are online, or
whether the connection is good enough to retrieve massive data. It
would be great to do it with an interface and multiple implementations,
in a library. How about creating a libtracker-miner library? (Something
similar to the old tracker-module library that Carlos created in 0.6?)
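
A sketch of what "an interface and multiple implementations" could look
like, written in Python purely for illustration. None of these names
come from libtracker-miner; ConnectionStatus, FixedConnectionStatus and
choose_action are hypothetical, and a real implementation would query
the system's network state instead of returning fixed values.

```python
from abc import ABC, abstractmethod


class ConnectionStatus(ABC):
    """Hypothetical interface shared by all online miners."""

    @abstractmethod
    def is_online(self) -> bool:
        """True if there is any network connection at all."""

    @abstractmethod
    def allows_bulk_transfer(self) -> bool:
        """True if the connection is good enough for massive data."""


class FixedConnectionStatus(ConnectionStatus):
    """Trivial implementation returning fixed answers (useful in tests)."""

    def __init__(self, online: bool, bulk: bool):
        self._online = online
        self._bulk = bulk

    def is_online(self) -> bool:
        return self._online

    def allows_bulk_transfer(self) -> bool:
        # Bulk transfer never makes sense while offline.
        return self._online and self._bulk


def choose_action(status: ConnectionStatus) -> str:
    """Decide what a miner should do given the connection state."""
    if not status.is_online():
        return "skip"
    return "full" if status.allows_bulk_transfer() else "light"
```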

Ivan, we currently have libtracker-miner in a separate branch which we are working on. It will have that DBus API shared amongst all miners and also the file system crawling basics for those miners which need to crawl some directories for their data.

We could include some more stuff in there which makes sense for > 1 miner (i.e. shared APIs).

Cheers

  Cheers, and congratulations on the good work!

Yes, I second that. Really good to see work done here.

--
Regards,
Martyn


