Re: Beagle Scoring System

From: Kevin Kubasik <kevin kubasik net>
To: Debajyoti Bera <dbera web gmail com>, Dashboard mailing list <dashboard-hackers gnome org>
Cc:
Subject: Re: Beagle Scoring System
Date: Sat, 17 Dec 2005 14:36:49 -0500

Unfortunately, that leaves us holding the bag on how to fix it, and
I am at a loss for anything short of some hard-coded reduction
ratio/factor for all mail scores.
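
A rough sketch of what such a hard-coded factor could look like when hits
from several indexes are merged. Everything here is made up for
illustration and is not Beagle's actual code: the `Hit` tuple, the backend
names, and the 0.5 factor are all placeholders.

```python
from typing import NamedTuple

class Hit(NamedTuple):
    backend: str   # hypothetical backend name, e.g. "mail" or "files"
    uri: str
    score: float   # raw score reported by that backend's index

# Hypothetical per-backend reduction factors; 0.5 is an arbitrary placeholder.
BACKEND_FACTOR = {"mail": 0.5}

def scaled(hit: Hit) -> Hit:
    """Multiply the raw score by the backend's factor (1.0 if none is set)."""
    factor = BACKEND_FACTOR.get(hit.backend, 1.0)
    return hit._replace(score=hit.score * factor)

def merge(hits):
    """Scale every hit, then sort the combined list by descending score."""
    return sorted((scaled(h) for h in hits), key=lambda h: h.score, reverse=True)
```

The obvious weakness is that the factor is a guess: it has to be tuned by
hand and may not hold for every query.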

Perhaps it's just something we have to handle in the front ends. Mail
and chats are stored in separate indexes; maybe we should just stick
to that separation in the front ends as well.
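
As a minimal sketch of that idea, a front end could keep results grouped by
backend and rank only within each group, so scores from different indexes
are never compared. The tuple layout and backend names below are
assumptions for illustration, not Beagle's real data structures.

```python
from collections import defaultdict

def group_by_backend(hits):
    """Group (backend, uri, score) tuples by backend.

    Each group is sorted by descending score; scores are only ever
    compared with other scores from the same backend.
    """
    groups = defaultdict(list)
    for backend, uri, score in hits:
        groups[backend].append((uri, score))
    for results in groups.values():
        results.sort(key=lambda r: r[1], reverse=True)
    return dict(groups)
```

A front end would then show one section per backend (mail, chats, files,
and so on), each with its own ordering.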

-Kevin Kubasik
On 12/17/05, Debajyoti Bera <dbera web gmail com> wrote:
> > I have noticed that mail messages seem to get unusually high scores
> > from the indexer. While Holmes makes the problem much less of an issue
> > (since it separates the conversation results), it still seems like
> > something worth fixing. I can't seem to figure out exactly why the
> > scoring is so off, but an initial guess would be the ease with which
> > we can add hotwords for email (subject lines) as opposed to most other
> > backends.
>
> (from http://wiki.apache.org/jakarta-lucene/LuceneFAQ )
> Lucene automatically adds a weight inversely proportional to the length of the
> field i.e. terms in short fields (like sender name, email address, subject)
> will get a higher weight (known as 'boost') than terms in text. Same holds
> for document metadata - they have more weight than document data/text.
>
> (from my understanding)
> Beagle searches several Lucene indexes and merges the results based on their
> scores. Somewhere during the process, it recalculates the score based on the
> age of the document. However, the absolute values of Lucene scores are not
> directly comparable; only the ratios (and hence the ranking) between the
> scores are comparable. In that sense, I don't think scores across multiple
> indexes should be directly compared. Ranking within a particular backend is
> meaningful and, IMO, that is the correct way to do it.
>
> - dBera


--
Kevin Kubasik
240-838-6616

References:
- Beagle Scoring System
  - From: Kevin Kubasik
- Re: Beagle Scoring System
  - From: Debajyoti Bera
