Re: Thoughts about FilterChm



Hi,

On Sat, 2006-04-22 at 12:59 -0500, Miguel Cabrera wrote:
> In theory that would allow indexing all the Html files inside the Chm
> file (currently it only indexes the topics file and the default
> page, due to processor overhead). The code as written inherits the
> behaviour of the Html filter, but rather than being an Html filter, the
> Chm filter should use the Html filter to index the text. The last time I
> looked at the Beagle code this was not possible (I understood that FSQ
> did not support Child Indexables [1][2]).

It's true that the FSQ doesn't support child indexables yet.

I'm not sure they're necessary, though I admit that I don't know
much about CHM files.  How are they viewed?  Does it make sense to break
down the CHM files into multiple indexable objects that can be referred
to separately?  Take two contrasting examples:

        * Archive files, when they're fully supported, will allow you to
        search against files contained within the archive.  It will make
        sense to be able to extract and open individual files within
        them.
        
        * OpenOffice documents are actually zip files, which contain
        several files within.  But to the user this is one single
        document; the internal details are not relevant at all.  There
        is no reason why we'd ever want to refer to or retrieve a file from
        the archive.
        
Again, without much CHM knowledge, my belief is that CHM files are more
like the latter than the former.
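
To make the contrast concrete, here is a rough sketch in Python. It is not
Beagle code: Indexable, index_as_archive and index_as_document are made-up
names, and a zip container stands in for CHM (which is a different format
and would need its own reader). It only illustrates the two models: one
indexable per contained file, versus one indexable for the whole container.

```python
import io
import zipfile
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Indexable:
    # Hypothetical stand-in for an indexable object; not Beagle's class.
    uri: str
    text: str


def read_members(data: bytes) -> Dict[str, str]:
    """Return {member name: decoded text} for every file in a zip container."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        return {
            name: zf.read(name).decode("utf-8", errors="replace")
            for name in zf.namelist()
            if not name.endswith("/")  # skip directory entries
        }


def index_as_archive(uri: str, data: bytes) -> List[Indexable]:
    """Archive model: one child indexable per member, each with its own URI."""
    return [
        Indexable(f"{uri}#{name}", text)
        for name, text in read_members(data).items()
    ]


def index_as_document(uri: str, data: bytes) -> Indexable:
    """Document model: a single indexable whose text merges every member."""
    return Indexable(uri, "\n".join(read_members(data).values()))


def make_sample() -> bytes:
    """Build a tiny in-memory zip container with two members."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("intro.html", "Welcome")
        zf.writestr("usage.html", "How to search")
    return buf.getvalue()
```

In the archive model a search hit can point at an individual member; in the
document model a hit always points at the container itself, which is what I
suspect is the better fit for CHM.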

Joe




