Re: Interesting Post on Tracker

Hi Richard,

Thanks for your email.

On Thu, 2006-10-12 at 10:20 +0100, richard lemurconsulting com wrote:
> The license of Snowball isn't intended to be old-style BSD, it's new style.
> Could you point out where you got that idea from, so I can fix it?

The line "You also have to alert anyone to whom you give the Snowball
software to the fact that it is covered by the BSD license" is what
threw me off, because it sounded a bit like the old BSD advert clause.
After reading your email and re-reading the page, it occurs to me that
ensuring that license file is included is probably sufficient to meet

> While I'm here, a quick plug for Snowball: we don't just offer an English
> stemmer, we offer stemmers for Danish, Dutch, English, Finnish, French,
> German, Hungarian, Italian, Norwegian, Portuguese, Russian, Spanish,
> Swedish.  Oh, and Romanian is in the pipeline.  If Beagle wants to take
> advantage of these stemmers, I'd be happy to offer help and advice.  In
> particular, if Beagle requires a version of the algorithms in C#, the
> Snowball code generator could be modified to generate a C# version without
> too much difficulty (it already generates Java and C).  Alternatively, a C#
> interface to the C stemmers could be built.  I don't speak C# well
> currently, though, so I'd need help.

The ideal situation for us would be a managed implementation of the
Snowball stemmers.  I took a look at the Java versions and the class
structure is very straightforward, and because C# is so close to Java,
I suspect it'd be trivial to tweak the Java generator to output C#.

> Also, I should note that the Snowball English stemmer is not the same as
> the Porter stemmer - rather, it is an updated version of the Porter
> stemmer, with a few rules modified to produce more useful results in
> specific common cases.  For example, the Porter stemmer will stem "news" to
> "new", whereas the Snowball English stemmer will leave "news" as "news".

Ah, thanks for pointing this out, my mistake.  Any improvement on our
current system is welcome, so I'd love to switch over to the Snowball
stemmer when we can.
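
To make the "news" example concrete, here is a toy sketch in Python of
how I understand the difference.  It is not the actual Porter or
Snowball code: it models only a simplified plural-stripping rule, and
the function names and the KEEP_AS_IS set are made up for illustration.

```python
# Toy illustration only: NOT the real Porter or Snowball implementation.
# It models a simplified plural-stripping rule plus an assumed list of
# words to leave untouched, to show why "news" can come out differently.


def strip_plural(word):
    """Simplified plural stripping, loosely modelled on Porter's first step."""
    if word.endswith("sses"):
        return word[:-2]  # caresses -> caress
    if word.endswith("ies"):
        return word[:-2]  # ponies -> poni
    if word.endswith("ss"):
        return word       # caress stays caress
    if word.endswith("s"):
        return word[:-1]  # cats -> cat, but also news -> new
    return word


# Hypothetical exception list for this sketch; the real Snowball English
# stemmer has its own way of handling special cases.
KEEP_AS_IS = {"news"}


def strip_plural_with_exceptions(word):
    """Same rule, except that words in KEEP_AS_IS are returned unchanged."""
    if word in KEEP_AS_IS:
        return word
    return strip_plural(word)


if __name__ == "__main__":
    for w in ("cats", "news"):
        print(w, strip_plural(w), strip_plural_with_exceptions(w))
```

The point is only that a blanket "drop the trailing s" rule turns "news"
into "new", while a version that special-cases such words leaves it alone.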

