Search Bookmarks Driver (New Contributor Questions)
- From: John Stowers <john_stowers runbox com>
- To: dashboard-hackers gnome org
- Subject: Search Bookmarks Driver (New Contributor Questions)
- Date: Wed, 02 Feb 2005 13:53:19 +1300
Hey Everyone,
I've been reading the Beagle hackers guide and am really interested in
hacking around. I currently develop in C# on Windows, but the maturity
of the Mono project, the cool applications being developed with it
(Beagle, F-Spot, etc.) and the encouragement to contribute to projects
like Beagle have led me to begin my switch.
I really want to have a go at developing something useful for Beagle.
I'm thinking of a bookmark (read: bookmark content) indexer for Firefox
and other browsers.
The idea is that, upon noticing that the bookmark file has changed, it
will go off and check every website in the user's bookmark list,
download each one (text only), and index each bookmark's content (if the
website has been updated). I know I have about a million bookmarks with
thoroughly non-descriptive titles (hence just indexing the bookmark file
alone is useless), and I hope that this will allow me to find them. Does
this sound like it would be useful to anyone else but me?
So with that out of the way, I have a few questions on the general
operation of Beagle and how my thing will fit in (I'm a noob to Beagle
and to open source collaborative projects in general).
Sorry about all the questions... I'll add the answers to the wiki
when I know them.
1) I presume that what I want to do comes under the general heading of
an External Search Driver.
2) As per the hacking guide, I set up inotify events etc. When the
bookmark file changes, for each bookmark in the bookmark file:
- Download (text only)
- Compute the MD5 of the downloaded text
- Compare the MD5 with the old MD5 of the bookmark to see if the site
has been updated
- If so, add the bookmarked site to the index
- else
3) See below
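The per-bookmark check in step 2 could be sketched roughly as follows.
This is only a minimal illustration, not Beagle code: the class
BookmarkChangeTracker and its methods are hypothetical names, the old
hashes are kept in memory only (so nothing persists between runs), and
the download step is left out, with the page text passed in as a string.

```csharp
using System;
using System.Collections.Generic;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper (not part of Beagle): remembers the last MD5 seen
// for each bookmark URL so unchanged pages can be skipped.
public class BookmarkChangeTracker
{
    // url -> hex MD5 of the text last seen for that url (in memory only)
    private readonly Dictionary<string, string> lastHashes =
        new Dictionary<string, string>();

    // Hex-encoded MD5 of the page text, hashed as UTF-8 bytes.
    public static string Md5Hex(string text)
    {
        using (MD5 md5 = MD5.Create())
        {
            byte[] digest = md5.ComputeHash(Encoding.UTF8.GetBytes(text));
            StringBuilder sb = new StringBuilder(digest.Length * 2);
            foreach (byte b in digest)
                sb.Append(b.ToString("x2"));
            return sb.ToString();
        }
    }

    // Returns true when the url is new or its text differs from the last
    // call, and records the new hash. Returns false when nothing changed.
    public bool NeedsReindex(string url, string pageText)
    {
        string newHash = Md5Hex(pageText);
        string oldHash;
        if (lastHashes.TryGetValue(url, out oldHash) && oldHash == newHash)
            return false;
        lastHashes[url] = newHash;
        return true;
    }
}
```

Only when NeedsReindex returns true would the bookmark be handed to the
indexer. Keeping the url-to-hash map across restarts would need some
store on disk, which is what the persistence question below is about.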
Now for the questions
1) How does an external query driver add things to the Lucene index?
Looking at the code for other external drivers (Tomboy in this case):
Indexable indexable = NoteToIndexable (file, note);
Scheduler.Task task = NewAddTask (indexable);
task.Priority = priority;
task.SubPriority = 0;
// Is this the line where Beagle becomes aware of the info to index
// (and hence indexes it at some time in the future)?
ThisScheduler.Add (task);
2) How about persistence between instances, and the flow of the
operation? (Assume that when parsing the bookmark file I download each
bookmark to a temp file, say
~/beagle/bookmarkstemp/http://www.google.com.temp)
For example, if a bookmark changes and I call ThisScheduler.Add for that
bookmark again at some time in the future, is its uniqueness determined
by indexable.ContentUri? And is no harm done by calling
ThisScheduler.Add for each bookmark without deleting the old one from
the scheduler/index?
(Aside: instead of using flat files to store the bookmarked sites, can I
use a SQLite database? I'm just wondering what you guys think is the
tidiest solution to avoid having to reindex every website in a user's
bookmarks when the website may not have been updated, and how this
plugs in with each task in the scheduler and its associated
indexable.SetTextReader, which I presume, being a text reader, needs a
flat file to read text from.)
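A side note on the flat-file presumption: in .NET/Mono a
System.IO.TextReader does not have to be backed by a file, since
System.IO.StringReader is a TextReader over an in-memory string. The
small sketch below shows only that much. ReaderSketch and
FromDownloadedText are hypothetical names, and whether
indexable.SetTextReader accepts such a reader is an assumption that
would need checking against the Beagle source.

```csharp
using System;
using System.IO;

// Hypothetical helper (not part of Beagle).
public static class ReaderSketch
{
    // Wraps already-downloaded page text in a TextReader without
    // writing it to a temp file first.
    public static TextReader FromDownloadedText(string pageText)
    {
        return new StringReader(pageText);
    }
}
```

If that assumption holds, the temp files would only be needed for
remembering state between runs, not for feeding text to the indexer.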
Sorry about the barrage of questions, ;-)
John