Re: Parsing an XML file in an archive

From: Joe Shaw <joe joeshaw org>
To: filterdex runbox com
Cc: dashboard-hackers gnome org
Subject: Re: Parsing an XML file in an archive
Date: Sat, 21 Mar 2009 11:20:21 -0400

Hi,

On Wed, Mar 18, 2009 at 5:04 PM,  <filterdex runbox com> wrote:
> I want to create a filter for Beagle! (Actually, several filters for different formats that are built
> similarly.)  I have read the documentation, but I cannot wrap my head around it. I hope someone
> here will take the time to get me started!

If you haven't already, the first place to look is the filter
tutorial on the wiki:

    http://beagle-project.org/Filter_Tutorial

That will hopefully give you an overview of the structure of the
Filter code, how to register it with Beagle, and how to test it.

> The format is simple: zipped files (with extensions other than .zip, but they are still just zips).
> Only one file is of interest: meta.xml, which contains strings that can be mapped to dc values.

Definitely take a look at the OpenOffice filter.  OpenOffice files
follow this exact model: a zip file (with a different extension)
containing a bunch of XML files inside of it.  The code is not the
easiest to follow, but it's a decent starting point:

http://svn.gnome.org/viewvc/beagle/trunk/beagle/Filters/FilterOpenOffice.cs?view=markup

Look at the core overridden Filter methods for a start: DoOpen(),
DoPullProperties(), DoPull(), and DoClose().
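
To make the zip-plus-meta.xml model concrete, here is a minimal standalone
sketch. It is not Beagle code, and it is written in Python rather than the
C# a real filter would use. The function name read_dc_properties and the
"dc:<name>" key format are made up for illustration, and I am assuming the
file uses the standard Dublin Core element namespace the way OpenDocument
meta.xml files do:

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Assumption: meta.xml uses the standard Dublin Core element namespace.
DC_NS = "http://purl.org/dc/elements/1.1/"


def read_dc_properties(archive, member="meta.xml"):
    """Return {"dc:<name>": text} for each Dublin Core element in `member`.

    `archive` may be a path or a file-like object. The file extension is
    irrelevant: zipfile inspects the content, not the name.
    """
    with zipfile.ZipFile(archive) as zf:
        with zf.open(member) as stream:
            root = ET.parse(stream).getroot()

    prefix = "{" + DC_NS + "}"
    props = {}
    for elem in root.iter():
        # ElementTree spells namespaced tags as "{namespace-uri}localname".
        if elem.tag.startswith(prefix) and elem.text and elem.text.strip():
            # If an element repeats, the last occurrence wins in this sketch.
            props["dc:" + elem.tag[len(prefix):]] = elem.text.strip()
    return props


if __name__ == "__main__":
    # Build a tiny in-memory archive so the sketch runs without any files.
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr(
            "meta.xml",
            '<meta xmlns:dc="' + DC_NS + '"><dc:title>Example</dc:title></meta>',
        )
    print(read_dc_properties(buf))
```

In a real filter, the archive would presumably be opened in DoOpen() and
the extracted values handed to Beagle in DoPullProperties(); check the
OpenOffice filter for how that is actually done.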

If I were writing this code today, I might use XPath instead of walking
every node in the document, and if I had C# 3.0 support I might even
use LINQ to XML on it.  We're not officially supporting C# 3.0 yet,
although since it's fully supported in Mono 2.2, and 2.4 is coming out
soon, there's no reason why we couldn't make that jump.
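
To show the difference between walking every node and asking for an
element by path, here is a small standalone sketch, again in Python and
not taken from the Beagle code. The sample document and the
"urn:example:office" namespace are invented for illustration. Note that
Python's ElementTree only supports a subset of XPath, so this shows the
idea rather than the full language:

```python
import xml.etree.ElementTree as ET

# Prefix-to-URI map used when resolving "dc:" in path expressions.
NS = {"dc": "http://purl.org/dc/elements/1.1/"}

# Hypothetical sample document; the "office" namespace URI is made up.
SAMPLE = """<office:document-meta
    xmlns:office="urn:example:office"
    xmlns:dc="http://purl.org/dc/elements/1.1/">
  <office:meta>
    <dc:title>Quarterly report</dc:title>
    <dc:creator>A. Author</dc:creator>
  </office:meta>
</office:document-meta>"""


def title_by_walking(root):
    """Visit every element and compare its tag by hand."""
    wanted = "{%s}title" % NS["dc"]
    for elem in root.iter():
        if elem.tag == wanted:
            return elem.text
    return None


def title_by_path(root):
    """Ask for the element directly with a path expression."""
    return root.findtext(".//dc:title", namespaces=NS)


if __name__ == "__main__":
    root = ET.fromstring(SAMPLE)
    print(title_by_walking(root))
    print(title_by_path(root))
```

Both functions return the same value; the path version just states what
is wanted and leaves the traversal to the library.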

Joe

