Re: Two questions about medusa



On Mon, 2003-05-05 at 11:35, David C Sterratt wrote:
> I've had more of a look at medusa, and I've got two questions:
> 
> Firstly, it looks as though to add a content indexer for
> application/pdf would require writing some c-code to make a PDF
> indexing module.  Is that right?  Other indexers (e.g. htdig IIRC)
> allow plugins specified in a configuration file to convert between
> mime types (e.g. pdftotext for application/pdf to text/plain).  Is
> this planned for medusa?

Yes.  My priorities are:
1. Upgrade the progress dialog to gtk2 in nautilus
2. Restore keyword/emblem indexing.
3. Distribute patches and notes to get back advice, criticism, and
flames.
4. OpenOffice indexer
5. MSOffice indexer
6. PS/PDF indexer

I believe 1, 2, and 3 will be done in the next 7 days (I got a lot done
in the last week).  The indexers will be added in the subsequent weeks;
each one only takes a few days to do.  Word on the street has it that
I'll be laid off from TimeLife.com in the next month, so I'll finally
have time to work on something interesting, barring the fact that I've
got to look for a job.

I've toyed with the plugin idea, as it might get some things done
quickly.  I'd like to bring some intelligence to what is indexed, and
the plain text indexer cannot handle that.  XML content like OpenOffice
is very rich, and it would lose some of its meaning and relevance if
it were crudely converted to plain text.  PDFs don't carry that kind of
structure, so they would be fine in your solution.  We need to weigh the
ability to add ad hoc indexers against the dependencies they would
bring in.
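
To make the plugin idea concrete, here is a rough sketch of a converter
table keyed by MIME type.  It is Python for illustration only, not medusa
code; the names CONVERTERS, build_command and convert_to_text are
hypothetical, and it assumes pdftotext is installed and writes to stdout
when given "-" as the output file.

```python
import subprocess

# Hypothetical converter table: MIME type -> argv template.
# "{src}" is replaced with the path of the file to convert.
CONVERTERS = {
    "application/pdf": ["pdftotext", "{src}", "-"],
}


def build_command(mime_type, path, converters=CONVERTERS):
    """Return the argv list for converting `path`, or None if no converter is registered."""
    template = converters.get(mime_type)
    if template is None:
        return None
    return [arg.replace("{src}", path) for arg in template]


def convert_to_text(mime_type, path, converters=CONVERTERS):
    """Run the registered converter and return its stdout as text, or None if none applies."""
    argv = build_command(mime_type, path, converters)
    if argv is None:
        return None
    # check=True raises CalledProcessError if the external tool fails.
    result = subprocess.run(argv, capture_output=True, check=True)
    return result.stdout.decode("utf-8", errors="replace")
```

The text that comes back would go to the plain text indexer.  Every entry
in such a table is an external dependency, which is the trade-off
mentioned above.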

> Secondly, it looks as though medusa can't search for phrases or
> words including globbing characters.  Is that right?

Yeah.  That is a weakness, and a difficult one to overcome.  I can imagine
how to add the phrase capability by storing some additional index
information.  The globbing (* and ?) could be done with some ungraceful
hacks, but I think we would need to get the OR functionality working first.
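
A minimal sketch of both ideas, assuming the "additional index
information" is the position of each word within a file, and that a glob
is expanded into an OR over every indexed word it matches.  This is
Python for illustration, not medusa's index format; build_index,
phrase_search and glob_search are hypothetical names.

```python
from collections import defaultdict
from fnmatch import fnmatchcase


def build_index(docs):
    """Map word -> {doc_id: [positions]} for a dict of doc_id -> text."""
    index = defaultdict(lambda: defaultdict(list))
    for doc_id, text in docs.items():
        for pos, word in enumerate(text.lower().split()):
            index[word][doc_id].append(pos)
    return index


def phrase_search(index, phrase):
    """Return doc ids where the words of `phrase` occur at consecutive positions."""
    words = phrase.lower().split()
    if not words:
        return set()
    hits = set()
    for doc_id, starts in index.get(words[0], {}).items():
        for start in starts:
            # Each following word must sit exactly one position further on.
            if all(start + i in index.get(w, {}).get(doc_id, ())
                   for i, w in enumerate(words[1:], 1)):
                hits.add(doc_id)
                break
    return hits


def glob_search(index, pattern):
    """Return doc ids containing any word matching the glob (an implicit OR)."""
    pattern = pattern.lower()
    hits = set()
    for word in index:
        if fnmatchcase(word, pattern):
            hits.update(index[word])
    return hits
```

The glob search walks the whole word list, which is the ungraceful part;
it only works at all because the matching words are combined with OR.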

-- 
__C U R T I S  C.  H O V E Y____________________
sinzui cox net
Guilty of stealing everything I am.



