Re: Indexing mail attachments & files inside Zip/tar.



Hi,
	I'd thought about indexing the contents of archives a while ago
and discussed it briefly on this list. I have since completely run out of time,
and would love to see this happen!

	The basic idea was to use existing code (SharpZipLib) to access each
of the documents contained in an archive as a stream for indexing, and then to
use a URI to reference each one (something like tar://the/tar/file.tar?entry=the_contained_file).
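
	To make the idea concrete, here is a rough sketch. It is in Python with
the standard tarfile module purely for illustration (it is not SharpZipLib and
not code from Beagle). The function names entry_uri and iter_tar_entries are
made up for this example, and percent-encoding the entry name is my own
assumption about how such a URI would need to be escaped.

```python
import tarfile
from urllib.parse import quote


def entry_uri(archive_path, entry_name):
    """Build a URI of the form tar://<archive path>?entry=<member name>.

    Hypothetical helper: the scheme follows the example in the text, the
    percent-encoding of the entry name is an assumption.
    """
    return "tar://" + quote(archive_path) + "?entry=" + quote(entry_name, safe="")


def iter_tar_entries(archive_path):
    """Yield (uri, stream) for every regular file stored in the archive.

    Each stream is a read-only file-like object over the member's bytes,
    so an indexer can consume it without unpacking anything to disk.
    The stream is only valid while the generator is being iterated.
    """
    with tarfile.open(archive_path, "r:*") as tar:
        for member in tar:
            if not member.isfile():
                continue  # skip directories, symlinks, devices, ...
            yield entry_uri(archive_path, member.name), tar.extractfile(member)
```

	An indexer would loop over iter_tar_entries("some.tar") and hand each
stream to whatever filter matches the entry, recording the URI as the hit
location.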

	The problems I ran across seem trivial but caused me to run out
of time. Basically the problem is this: it is easy to know the MIME type
of the archive (tar or zip files, for example), but it gets a little tougher
to know the MIME type of the contained documents. I ran across some memory
corruption issues when trying to use the existing MIME type-sniffing functions
from gnome-vfs. The problems were most likely due to my poor understanding of marshalling in .NET,
so others will probably have an easier go at it. Also, I noticed that the functions
behave poorly when attempting to get the MIME type of a non-existent file (i.e. a
file which is not actually on disk, but rather is an "entry" in an archive). I can try
to find the C code I wrote to test the functions if someone is interested.
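
	One way around the "file is not on disk" problem is to sniff the type
from the first bytes of the entry's stream and only fall back to the entry
name. The sketch below does that in Python. It does not use gnome-vfs at all;
sniff_entry_type and the small signature table are hypothetical, and a real
indexer would want a much larger table or a proper magic database.

```python
import mimetypes

# A few well-known file signatures (leading "magic" bytes).
_MAGIC = [
    (b"%PDF-", "application/pdf"),
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"PK\x03\x04", "application/zip"),
    (b"\x1f\x8b", "application/gzip"),
]


def sniff_entry_type(entry_name, head):
    """Guess a MIME type for an archive entry that never touches the disk.

    entry_name: the member's name inside the archive.
    head: the first bytes read from the member's stream.
    """
    # First try the content itself.
    for magic, mime in _MAGIC:
        if head.startswith(magic):
            return mime
    # Otherwise guess from the name's extension; no file access is needed.
    guessed, _encoding = mimetypes.guess_type(entry_name)
    return guessed or "application/octet-stream"
```

	For example, sniff_entry_type("report.bin", b"%PDF-1.4") is decided by
the content, while an entry named "notes.txt" with unrecognised leading bytes
is decided by its extension.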

	Indexing archives would be great, even for archives which are not mail attachments.

Keep up the great work!
Mike

P.S. I first looked into this while working on man/troff filters, in case you feel like searching the list archive.

On Mon, Oct 18, 2004 at 06:13:46PM -0700, Veerapuram Varadhan wrote:
> Hi,
> 
> Will it be nice if we "index" files inside an archive (zip, bz2, gz,
> tar) and let the archive get hit on meeting any of the "texts" indexed
> from the "files" inside the archive?
> 
> And also, I got a feedback from one of the HR's that if a mail client
> could "beagle" its mails, which essentially means the contents of the
> attachments, that would be lot more useful to them.
> 
> Any thoughts/suggestions on that front?
> 
> Cheers,
> 
> V. Varadhan.
> 
> 
> _______________________________________________
> Dashboard-hackers mailing list
> Dashboard-hackers gnome org
> http://mail.gnome.org/mailman/listinfo/dashboard-hackers


