Re: Stemmed search configuration


> >>how to decide the language of the data/metadata for each document.
>  Probably not easy. But I guess Word/OO documents must have this information.

Ahh right. Office documents also have the language specified.

There is actually a large Textcat-based (automatic language detection)
patch in Bugzilla, but I have not found the time to look into it.

>  >>Beagle has the means to use a different stemmer for each document
>  Does this mean the stemmer is used while indexing data, and not while
>  searching data?

No, of course it uses the stemmer for searching. But there can be a
default stemmer and the query API can allow the user to override it by
an alternative language while querying. So that is not that big a problem.
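
To make the default-plus-override idea concrete, here is a toy sketch in
Python. It is not Beagle's query API: every name in it (strip_suffixes,
STEMMERS, stem_query) is hypothetical, and the "stemmers" are crude suffix
strippers standing in for real ones. It only shows a query falling back to
the default stemmer unless the caller names another language.

```python
# Toy sketch of a default stemmer with a per-query override.
# Hypothetical names throughout; this is not Beagle's API.

def strip_suffixes(suffixes):
    """Build a crude stemmer that removes the first matching suffix."""
    def stem(word):
        word = word.lower()
        for suffix in suffixes:
            # Keep at least three characters of the word as the stem.
            if word.endswith(suffix) and len(word) > len(suffix) + 2:
                return word[: -len(suffix)]
        return word
    return stem

# Stand-ins for real stemmers; the suffix lists are illustrative only.
STEMMERS = {
    "English": strip_suffixes(["ing", "es", "s"]),
    "Danish": strip_suffixes(["erne", "ene", "er", "en", "e"]),
}
DEFAULT_STEMMER = "English"

def stem_query(query, language=None):
    """Stem each query term with the requested or the default stemmer."""
    stem = STEMMERS[language or DEFAULT_STEMMER]
    return [stem(term) for term in query.split()]
```

With this sketch, stem_query("husene bilerne", language="Danish") reduces
both words to their stems, while the same query without the override is
left untouched by the English rules.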

>  >>For most documents, only some data/metadata  fields are in a different
>  >>language and the others are generally in English
>  I do not think this is a problem. My guess is that most searches in my
>  organization will be keywords from the contents of the documents, not the
>  technical metadata. 98% of the searches will be for Danish words.

Well, then you are much better off changing the source to make the
default stemmer Danish :)

>  >>If you are using 0.3.x ... change ... DEFAULT_STEMMER = "Danish";
>  Sorry, I have not been able to compile (configure) 0.3.2 on my Fedora FC6
>  (only 0.2.x).
>  I have installed:
>     ndesk-dbus-0.6.1a-2.fc9
>     ndesk-dbus-glib-devel-0.4.1-3.fc9
>     ndesk-dbus-devel-0.6.1a-2.fc9
>  But configuration fails:
>    configure: error: Package requirements (ndesk-dbus-glib-1.0 >= 0.3.0)

Are you sure you installed ndesk-dbus and ndesk-dbus-glib correctly?
Can you paste the contents of
/usr/lib/pkgconfig/ndesk-dbus*.pc ? There should be two files, one
ndesk-dbus* and the other ndesk-dbus-glib*.
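
If it helps, a small Python sketch like the one below could list those .pc
files and the version each one declares. The helper names (pc_version,
version_tuple, report) are made up for illustration, and the numeric
comparison is only a rough check, not the comparison pkg-config itself
performs.

```python
import glob
import re

def pc_version(path):
    """Return the Version: field of a pkg-config .pc file, or None."""
    with open(path) as handle:
        for line in handle:
            match = re.match(r"\s*Version:\s*(\S+)", line)
            if match:
                return match.group(1)
    return None

def version_tuple(version):
    """Turn a string like '0.6.1a' into (0, 6, 1) for a rough comparison."""
    return tuple(int(number) for number in re.findall(r"\d+", version))

def report(pattern="/usr/lib/pkgconfig/ndesk-dbus*.pc"):
    """Map each matching .pc file to the version it declares."""
    return {path: pc_version(path) for path in sorted(glob.glob(pattern))}

if __name__ == "__main__":
    # Prints nothing if no file matches, which is itself a useful hint.
    for path, version in report().items():
        print(path, version)
```

An empty result would suggest that configure cannot see the packages at
that location; a listed ndesk-dbus-glib file whose version_tuple is below
(0, 3, 0) would explain the failed requirement.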

- dBera

Debajyoti Bera @
beagle / KDE fan
Mandriva / Inspiron-1100 user
