Re: XML libs (was Re: gconf backend)

On Mon, Sep 29, 2003 at 11:12:15PM -0400, Jody Goldberg wrote:
> On Sat, Sep 27, 2003 at 07:51:28PM -0400, Daniel Veillard wrote:
> >   No and that is a big mistake you're making. Example: you don't have to 
> > know about entities at the user level for those apps, you just let the
> > parser do the work for you. You won't even know that they have been there.
> > But if you ignore them at the lib level, you lose the data.
> As an example: in most instances applications do not need libxml to
> generate their XML.  The app is in control of what is being

  Hum, we were not talking about generation. 

> generated, and spewing a DOM tree is total overkill.  Gnumeric's
> default xml exporter gets hammered this way on large files.  Four
> times base representation memory usage is lethal for 100 Meg
> workbooks.  This is why I have the ultra simple printf style
> wrappers in libgsf.

  Sure, building a DOM to save a serialization is overkill, nothing
to argue about this.

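
  To make the point concrete, here is a minimal sketch of streaming
serialization with printf-style helpers, so no tree is ever built. The
names xml_put_escaped and xml_put_simple_element are hypothetical; this is
not the libgsf API, only an illustration of the idea under that assumption.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not libgsf): write s with the five predefined
 * XML entities escaped, copying unescaped runs in one call. */
static void xml_put_escaped(FILE *out, const char *s)
{
    assert(out != NULL && s != NULL);
    while (*s != '\0') {
        /* Length of the leading run that needs no escaping. */
        size_t run = strcspn(s, "&<>\"'");
        fwrite(s, 1, run, out);
        s += run;
        switch (*s) {
        case '&':  fputs("&amp;", out);  s++; break;
        case '<':  fputs("&lt;", out);   s++; break;
        case '>':  fputs("&gt;", out);   s++; break;
        case '"':  fputs("&quot;", out); s++; break;
        case '\'': fputs("&apos;", out); s++; break;
        default:   break;  /* reached the terminating NUL */
        }
    }
}

/* Hypothetical helper: emit <name attr="value">text</name> straight to
 * the stream. name and attr are assumed to be valid XML names. */
static void xml_put_simple_element(FILE *out, const char *name,
                                   const char *attr, const char *value,
                                   const char *text)
{
    fprintf(out, "<%s %s=\"", name, attr);
    xml_put_escaped(out, value);
    fputs("\">", out);
    xml_put_escaped(out, text);
    fprintf(out, "</%s>\n", name);
}
```

  Memory use stays constant per element written, which is the whole
attraction for very large documents. The price is that well-formedness
(matching tags, legal names) becomes the caller's responsibility.
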
> On import similar issues arise.  The overhead of DOM and even
> xmlreader are significant, which is why I had to whip up the trivial
> little namespace supporting SAX wrappers for gsf.  We'll see if
> conglomerate can adopt them, which would force the addition of
> entities.

  One customer reported a 5x speedup with the reader on the upcoming
2.6.0 compared to 2.5.11. As for namespaces and SAX, 2.6.0 will
have SAX2, where namespaces are fully resolved at the callback level.

> Another area where libxml could really assist its users would be a
> set of canned conversion routines.  Applications all end up
> replicating load/store operations on
>     bool
>     int
>     float
>     enum
>     binary blob
> Throwing in a set of those for content and attributes would go a
> long way to making things simpler out of the box.

  All those are nice but you need a precise definition when you
start going from a data serialization to a type system:
    1/ libxml2 implements W3C XML Schema Datatypes
    2/ the current routines are mostly used for validation, not
       for scanning, but they can be used for input

                xmlSchemaGetPredefinedType      (const xmlChar *name,
                                                 const xmlChar *ns);
                xmlSchemaValidatePredefinedType (xmlSchemaTypePtr type,
                                                 const xmlChar *value,
                                                 xmlSchemaValPtr *val);

  It's not the most trivial API, but it exists.
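
  To illustrate why a precise definition matters, here is a rough sketch
of two strict converters that follow the XML Schema lexical rules for
boolean ("true", "false", "1", "0") and int (optional sign, decimal
digits, 32-bit range). The names parse_xsd_boolean and parse_xsd_int are
hypothetical and not part of libxml2, and the sketch assumes the caller
has already collapsed surrounding whitespace.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not libxml2): returns 0 and stores 0/1 in *out
 * for a valid xsd:boolean literal, -1 otherwise. Case matters. */
static int parse_xsd_boolean(const char *s, int *out)
{
    if (strcmp(s, "true") == 0 || strcmp(s, "1") == 0) {
        *out = 1;
        return 0;
    }
    if (strcmp(s, "false") == 0 || strcmp(s, "0") == 0) {
        *out = 0;
        return 0;
    }
    return -1;
}

/* Hypothetical helper (not libxml2): returns 0 and stores the value in
 * *out for a valid xsd:int literal, -1 otherwise. */
static int parse_xsd_int(const char *s, int32_t *out)
{
    const char *digits = (*s == '+' || *s == '-') ? s + 1 : s;
    char *end;
    long v;

    /* Require a digit right after the optional sign; this rejects "",
     * "+", and leading whitespace that strtol would silently skip. */
    if (*digits < '0' || *digits > '9')
        return -1;

    errno = 0;
    v = strtol(s, &end, 10);
    if (errno == ERANGE || *end != '\0')
        return -1;                       /* overflow or trailing junk */
    if (v < -2147483647L - 1 || v > 2147483647L)
        return -1;                       /* outside the xsd:int range */

    *out = (int32_t)v;
    return 0;
}
```

  Note that "TRUE", "yes" and "12abc" are all rejected here, where ad hoc
application code often accepts them. Float, enum and binary blobs are
left out of this sketch since their lexical rules need more care.
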


Daniel Veillard      | Red Hat Network
veillard redhat com  | libxml GNOME XML XSLT toolkit | Rpmfind RPM search engine
