Online/Offline and Wikipedia Re: beyond text Re: Attaching Meta-Data
- From: Srikant Jakilinki <sriks dcs gla ac uk>
- To: Julian Satchell <j satchell eris qinetiq com>
- Cc: darryl vandorp ca, dashboard-hackers <dashboard-hackers gnome org>
- Subject: Online/Offline and Wikipedia Re: beyond text Re: Attaching Meta-Data
- Date: Wed, 27 Oct 2004 06:13:41 +0100
And yes, one more thing. I think we should rely on online services as
little as possible. I come from India, and having to depend on Google
for spellchecking, or even on a Dict server for something else, would
work against the interests of most people. Beagle should be able to
detect when a source is "online", and in such cases the search results
could be ranked differently, but I guess we can do all of this offline
by installing the appropriate packages/daemons/services...

How hard is it to do such things offline? We are not worried about disk
space, are we?

And a question: is it possible to download the latest complete Wikipedia
snapshot at any point and install it on our own computer? It could run
as a server of sorts or allow complete offline browsing. Does anyone
know of such a project, and whether it could be useful to Dashboard in
some way? It would be cool to use this knowledge when we see an obscure
term like "existentialism" in our index, with obscurity determined by
the Zipf ranking of that word, which we should use as well.

I am just rambling now and will clarify this weekend. Any thrashings or
comments are very welcome...
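
A minimal sketch of the Zipf ranking idea above, assuming the index can
hand us a plain list of tokens. The names `zipf_ranks` and `is_obscure`
and the cutoff value are made up for illustration; nothing here is an
existing Beagle or Dashboard API.

```python
from collections import Counter

def zipf_ranks(tokens):
    """Map each word to its frequency rank (1 = most frequent)."""
    counts = Counter(t.lower() for t in tokens)
    # Sort by descending count, then alphabetically so ties are deterministic.
    ordered = sorted(counts, key=lambda w: (-counts[w], w))
    return {word: rank for rank, word in enumerate(ordered, start=1)}

def is_obscure(word, ranks, cutoff):
    """Treat a word as obscure if it ranks below the cutoff or is unseen.

    The cutoff is an assumption; a real index would need to tune it.
    """
    rank = ranks.get(word.lower())
    return rank is None or rank > cutoff

# Toy corpus standing in for the indexed text.
tokens = "the cat and the dog and the existentialism".split()
ranks = zipf_ranks(tokens)
```

With this toy corpus, `is_obscure("existentialism", ranks, 2)` flags the
word while `is_obscure("the", ranks, 2)` does not, so only the rare term
would trigger a lookup in an offline reference such as a local Wikipedia
snapshot.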
On Mon, 2004-10-25 at 13:05, Julian Satchell wrote:
> You are talking about synonym expansion.
>
> This needs more than a standard ispell/aspell type dictionary, as you
> need the semantic relationships of words. The only big, not-paid-for,
> dataset like this that I know of is WordNet, which is English; I think
> there are efforts to make similar sets for some European languages.
>
> It is only rarely useful; many search engines used it in the late 90s,
> but it is not now normally turned on. It mildly increases recall, but
> vastly reduces relevance; most searches return too many items anyway, so
> this is not often wanted.
>
> In many cases, a good thesaurus tool (a front end to WordNet?) will
> allow you to do the expansion yourself.
>
> Julian
>
>
> On Mon, 2004-10-25 at 16:16 -0700, Veerapuram Varadhan wrote:
> > On Thu, 2004-10-21 at 10:05 -0500, darryl vandorp wrote:
> > > >
> > > > It would rock if it used aspell or some other dictionary that wasn't
> > > > online. Setting up a query driver for the google spell checker would
> > > > be a good start though.
> > > >
> > > > -- joe g.
> > > >
> > > >
> > > There's also the dict protocol, though I don't know if there's a C#
> > > library for that somewhere.
> > >
> > > -darryl
> >
> > hmmm.. I think I was not clear in my previous post. Well.. what I meant
> > by "Dictionary search" was:
> >
> > * search the "beagle-backend" for the "synonyms" of the user entered
> > "query word".
> >
> > For example: there has been a serious debate/disagreement about a
> > particular feature being implemented in a tool, and a lot of mails,
> > docs and chat logs are available as a record. Now the user wants to
> > search for it, but he doesn't know the exact word; he only knows that
> > "there was a debate/dispute".
> >
> > In such scenarios, he can very well say "dispute" and select
> > "Dictionary search" and he gets a hit list that satisfies:
> >
> > * Docs, mails, chat-logs, web pages, etc., that contain the keyword
> > "dispute" or "debate" or "disagreement" or "quarrel" or "argumentation"
> > or "discussion" etc.
> >
> > Maybe the example I gave above is not a good one, but I just
> > wanted to explain what I meant by "Dictionary" search.
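
The "Dictionary search" described above could be sketched roughly as
follows. This is only an illustration under stated assumptions: the
`THESAURUS` table is hand-written here (a real one might be built from
WordNet), the documents are a plain dict of name to text, and
`expand_query` and `dictionary_search` are hypothetical names, not part
of beagle-backend.

```python
# Hypothetical hand-written synonym table; a real one might come from WordNet.
THESAURUS = {
    "dispute": {"debate", "disagreement", "quarrel",
                "argumentation", "discussion"},
}

def expand_query(word, thesaurus=THESAURUS):
    """Return the query word together with any known synonyms."""
    word = word.lower()
    return {word} | thesaurus.get(word, set())

def dictionary_search(word, documents, thesaurus=THESAURUS):
    """Return names of documents containing the word or one of its synonyms.

    Matching is a naive whitespace split, so punctuation attached to a
    word will prevent a match; a real index would tokenize properly.
    """
    terms = expand_query(word, thesaurus)
    hits = []
    for name, text in documents.items():
        if terms & set(text.lower().split()):
            hits.append(name)
    return sorted(hits)

# Toy stand-ins for mails, chat logs and docs.
docs = {
    "mail-1": "long debate about the new feature",
    "chat-2": "lunch plans for friday",
    "doc-3": "notes on the dispute",
}
```

Here `dictionary_search("dispute", docs)` picks up both "doc-3" and
"mail-1", even though "mail-1" never uses the word "dispute". The same
expansion is also what pulls in extra, less relevant hits on a larger
index.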
> >
> > Cheers,
> >
> > V. Varadhan.
> >
> > _______________________________________________
> > Dashboard-hackers mailing list
> > Dashboard-hackers gnome org
> > http://mail.gnome.org/mailman/listinfo/dashboard-hackers
> >
>
--
Cheers-Regards-Sincerely,
Srikant
" " - Sriksisms ~powered by~ TagZilla
http://sriks6711.blogspot.com