Running beagle on a server over NFS



Hello beagle-pals,
	For those of you who run (or want to run) beagle on a central fileserver and 
use it over NFS on clients, here is an exercise which might lead you (us) to 
another solution.

The traditional way to index NFS-exported directories involves periodically 
scanning the filesystem and some clever relocation of the beagle index to 
speed things up. This method is slow. The usual practice is to create a 
static index on the fileserver and NFS-export it to the clients, while 
running beagle locally on the clients against that static index. This 
requires frequent rebuilding of the static index if it's not-so-static. A 
nicer solution would be to implement (fix) beagle-over-network or the beagle 
webservices. Here are some hints for a middle approach based on sharing the 
index between server and clients.
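(For reference, the static-index route typically means running 
beagle-build-index on the server, something like the following - the target 
path here is made up, and check the tool's man page for the exact syntax:
  $ beagle-build-index --target /exports/static-index /home
)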

This requires:
1) A proper NFS setup - from http://nfs.sourceforge.net/nfs-howto/client.html 
"... To begin using machine as an NFS client, you will need the portmapper 
running on that machine, and to use NFS file locking, you will also need 
rpc.statd and rpc.lockd  running on both the client and the server...". 
Beagle uses two databases - Lucene and SQLite. Lucene is able to handle 
multi-process/multi-machine database access; SQLite might have some problems 
with database locking, hence make sure locking is working.
2) The same paths exported over NFS, i.e. if the server exports the path to a 
beagle-indexed file as /foo/bar/home/file, then export it so that the client 
also sees it as /foo/bar/home/file. Otherwise, beagled on the client will 
complain that the file doesn't exist. (A sketch of both requirements follows 
this list.)
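
A minimal sketch of what that might look like (hostnames and paths are made 
up): one export on the server, mounted at the identical path on the client, 
plus a quick check that the locking services are registered.
on the server, in /etc/exports,
  /foo/bar/home  client.example.com(rw,sync)
on the client,
  $ mount -t nfs server.example.com:/foo/bar/home /foo/bar/home
to verify locking is available (run on both machines),
  $ rpcinfo -p | grep -E 'status|nlockmgr'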

Here is how to do it (as I write this, I realise it's fairly messy... 
hopefully it will get simplified if enough people use it and come up with 
nice tricks to do the same):
1) Don't export the whole .beagle to the client; export .beagle/Indexes 
and .beagle/TextCache (and, if you have any configuration, .beagle/Config). 
Make sure .beagle/TextCache is exported read-write; the others don't matter. 
(See the example exports after this list.)
2) On the server, run beagled as usual:
  $ beagled --allow-backend Files ...<other options>
      - beagled will build the index and constantly monitor the filesystem 
for changes. The usual Files backend; all changes are picked up instantly.
3) On the client, run beagled as:
  $ beagled --disable-scheduler --allow-backend Files ...<other options>
     - without the scheduler, beagled will run in query-only mode.
4) Run "beagle-query searchterm" on the client to search!
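
To make step 1 concrete, here is a sketch of the exports and mounts, 
assuming the index lives under /home/user/.beagle on the server (paths and 
hostnames are made up):
on the server, in /etc/exports,
  /home/user/.beagle/Indexes    client.example.com(ro,sync)
  /home/user/.beagle/TextCache  client.example.com(rw,sync)
on the client,
  $ mount -t nfs server.example.com:/home/user/.beagle/Indexes   /home/user/.beagle/Indexes
  $ mount -t nfs server.example.com:/home/user/.beagle/TextCache /home/user/.beagle/TextCache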

If you are looking to index your home directory, then you might have the 
whole /home/username (or /home) exported. You can use the BEAGLE_STORAGE 
environment variable to control where beagle puts its data. So, you can do 
this:
on the server,
  $ BEAGLE_STORAGE=/tmp/.beagle beagled ...
export /tmp/.beagle/Indexes and /tmp/.beagle/TextCache to the client's 
(say) /tmp/.beagle/Indexes and /tmp/.beagle/TextCache;
on the client,
  $ BEAGLE_STORAGE=/tmp/.beagle beagled --disable-scheduler ...
for searching,
  $ BEAGLE_STORAGE=/tmp/.beagle beagle-query bla ...
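
Putting the client side together, it might look like this (server name is 
made up; create the mount points first):
  $ mkdir -p /tmp/.beagle/Indexes /tmp/.beagle/TextCache
  $ mount -t nfs server.example.com:/tmp/.beagle/Indexes   /tmp/.beagle/Indexes
  $ mount -t nfs server.example.com:/tmp/.beagle/TextCache /tmp/.beagle/TextCache
  $ BEAGLE_STORAGE=/tmp/.beagle beagled --disable-scheduler --allow-backend Files
  $ BEAGLE_STORAGE=/tmp/.beagle beagle-query searchterm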

Alex informed me on IRC that he managed to do this, so it works :). However, 
searching using beagle-search didn't work - possibly some locking issues 
with SQLite (snippets use an SQLite database).

I am not sure if there is any advantage to this over the static index. Give 
it a try if you are looking for an adventure ;). It might be useful if this 
proof-of-concept actually works under normal system load and filesystem 
changes. Keep us informed.

- dBera

-- 
--------------------------------------------------------
Debajyoti Bera @ http://dbera.blogspot.com
beagle / KDE fan
Mandriva / Inspiron-1100 user


