Re: Parsing an XML file in an archive


On Wed, Mar 18, 2009 at 5:04 PM,  <filterdex runbox com> wrote:
> I want to create a filter for Beagle! (Actually, several filters for different formats that are built
> similarly….)  I have read the documentation, but I cannot wrap my head around it. I hope someone
> here will take the time to get me started!

If you haven't already, the first place to look is the filter
tutorial on the wiki.

That will hopefully give you an overview of the structure of the
Filter code, how to register it with Beagle, and how to test it.

> The format is simple: zipped files (with extensions other than .zip, but they are still just zips)
> Only one file of interest: meta.xml that contains strings that can be mapped to dc values.

Definitely take a look at the OpenOffice filter.  OpenOffice files
follow this exact model: a zip file (with a different extension)
containing a bunch of XML files inside of it.  The code is not the
easiest to follow, but it's a decent starting point.

Look at the core overridden Filter methods for a start: DoOpen(),
DoPullProperties(), DoPull(), and DoClose().
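
To make the zip-plus-meta.xml model above concrete, here is a minimal
sketch in Python rather than Beagle's C#.  It is not the OpenOffice
filter and uses none of Beagle's API.  The layout of meta.xml, the
sample values, and the function names build_sample_archive and
pull_properties are all assumptions made up for illustration.  It
walks every node, which is the approach the existing filter takes:

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Dublin Core element namespace.
DC_NS = "http://purl.org/dc/elements/1.1/"


def build_sample_archive() -> bytes:
    """Build an in-memory zip with a made-up meta.xml, standing in for a real file."""
    meta = (
        '<?xml version="1.0" encoding="UTF-8"?>'
        f'<metadata xmlns:dc="{DC_NS}">'
        "<dc:title>Sample document</dc:title>"
        "<dc:creator>Jane Doe</dc:creator>"
        "</metadata>"
    )
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as archive:
        archive.writestr("meta.xml", meta)
        archive.writestr("content.xml", "<content/>")  # ignored by the reader
    return buf.getvalue()


def pull_properties(data: bytes) -> dict:
    """Open the archive, read only meta.xml, and collect every dc:* element."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        with archive.open("meta.xml") as stream:
            root = ET.parse(stream).getroot()

    props = {}
    prefix = "{" + DC_NS + "}"
    # Walk every node and keep the ones in the Dublin Core namespace.
    for node in root.iter():
        if node.tag.startswith(prefix):
            name = node.tag[len(prefix):]
            props["dc:" + name] = (node.text or "").strip()
    return props


if __name__ == "__main__":
    print(pull_properties(build_sample_archive()))
```

The file extension never matters here: the archive is opened by
content, so a zip with any other extension is read the same way.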

If I were writing this code today I might use XPath instead of walking
every node in the document, and if I had C# 3.0 support I might even
use LINQ to XML on it.  We're not officially supporting C# 3.0 yet,
although since it's fully supported in Mono 2.2, and 2.4 is coming out
soon, there's no reason why we couldn't make that jump.
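
As a rough illustration of the XPath idea, again in Python rather than
C#, the sketch below asks for the elements it wants by path instead of
visiting every node.  Python's ElementTree supports only a limited
XPath subset, and the sample document, its nesting, and the name
query_properties are assumptions, not anything taken from a real
format:

```python
import xml.etree.ElementTree as ET

# Prefix-to-namespace map used when resolving the path expressions.
NS = {"dc": "http://purl.org/dc/elements/1.1/"}

# Made-up metadata document; real files will be laid out differently.
META_XML = """<metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
  <info>
    <dc:title>Sample document</dc:title>
    <dc:subject>filters</dc:subject>
    <dc:subject>indexing</dc:subject>
  </info>
</metadata>"""


def query_properties(xml_text: str) -> dict:
    """Select the wanted elements by path rather than walking the whole tree."""
    root = ET.fromstring(xml_text)
    # ".//" matches descendants at any depth below the root.
    title = root.findtext(".//dc:title", default="", namespaces=NS)
    subjects = [el.text for el in root.findall(".//dc:subject", NS)]
    return {"dc:title": title, "dc:subject": subjects}


if __name__ == "__main__":
    print(query_properties(META_XML))
```

The filter then names only the elements it cares about, and any other
nodes in the document are never touched by the filter code.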

