Re: [Tracker] Indexers comparison
- From: Michal Pryc <Michal Pryc Sun COM>
- To: Mikkel Kamstrup Erlandsen <mikkel kamstrup gmail com>
- Cc: tracker-list gnome org
- Subject: Re: [Tracker] Indexers comparison
- Date: Tue, 23 Jan 2007 22:04:33 +0100
Mikkel Kamstrup Erlandsen wrote:
Unfortunately this is rather hard for me to do, because in the data set
there were some documents that might be for internal use only, so it
would be very time consuming to select the "proper" ones :-(
But maybe it is time to create one good set of documents that people can
freely use for testing the indexers.
Maybe some Wikipedia dumps? Do they have an OAI target? Maybe we could
even take dumps of localized Wikipedias?
Cheers,
Mikkel
PS: I of course mean to strip all formatting from the harvested files.
Hello,
I've created a small Java application and posted it on my rarely updated
blog. It grabs some text from Wikipedia (MediaWiki), as you wished.
Please test it.
http://blogs.sun.com/migi/entry/wikipedia_for_indexers_testing
I hope it helps with testing.
--
best
Michal Pryc