Re: [xml] libxml2 and default namespaces
- From: Kasimier Buchcik <K Buchcik 4commerce de>
- To: Paul Boddie <paul boddie org uk>
- Cc: ML-libxml2 <xml gnome org>
- Subject: Re: [xml] libxml2 and default namespaces
- Date: Tue, 13 Dec 2005 12:09:56 +0100
On Tue, 2005-12-13 at 02:17 +0100, Paul Boddie wrote:
> Hello!
>
> > Don't worry, I don't think this is trivia. I'm happy that this issue
> > has surfaced once again, since it still seems to be underestimated
> > by DOM users. I guess most people using DOM think that there is no
> > way the serialized representation of a DOM tree could break.
>
> It shocks me to see how complicated the standards make this, yet I feel
> somewhat embarrassed that I didn't know that createElementNS didn't guarantee
> the presence of namespace declarations in a serialised document. Seeing the
> thread on comp.lang.python, I suppose I'm not alone in that respect, however.
Warning: the following is wrong; an LSSerializer (DOM Level 3 Load and
Save module) will normalize namespaces by _default_.
> > A plain DOM serializer will just close its eyes and won't try to
> > change anything that's in the DOM tree. That's fine and wanted
> > for e.g. editing applications.
> >
> > If one wants a semantics-safe serialization, then one needs a
> > namespace-normalization mechanism; on the other hand, you then risk
> > breaking QNames in element/attribute content.
> >
> > The options here would then be:
> > 1) Close your eyes and serialize the tree
> >    a) if you know exactly that you didn't create mess in the tree then
> >       this is OK
> >    b) be aware that your serialized tree might be broken
> > 2) Normalize namespaces and then serialize
> >    a) the normalization will try to change prefixes,
> >       remove/add ns-declarations, in a way that a serialization is
> >       possible without altering the semantics of the DOM tree
> >    b) if the DOM is not serializable then the normalization should raise
> >       an error
> >    c) be aware that the normalization might break your QNames
> >
> > If we apply namespace-normalization to your example, then the outcome
> > would look like:
> > <href xmlns:ns1='DAV:'/>
> > i.e. the namespace declaration of 'DAV:' would get a different prefix,
> > in order to not interfere with the <href> element in no namespace.
>
> But the href element was created with a namespace specified, but with no
> prefix in its qualified name. A subsequent discussion touched upon default
> namespace pollution where href is created as follows...
>
> href = document.createElementNS("DAV:", "href")
>
> ...and where a child of href is created as follows...
>
> no_ns = document.createElementNS(None, "no_ns")
>
> ...where None is Python's equivalent of JavaScript's null. For this I proposed
> the following serialisation:
>
> <?xml version="1.0"?>
> <href xmlns="DAV:">
> <no_ns xmlns=""/>
> </href>
Looks OK; I would expect the same result.
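
For comparison, here is a minimal sketch of what a serializer that does
no namespace normalization produces for the same tree. It uses Python's
standard xml.dom.minidom purely as an illustration (an assumption on my
side; it is not the libxml2-based code under discussion). The node
namespaces exist in the tree, but no xmlns declarations are written, so
the namespace is lost after a round trip.

```python
from xml.dom import minidom

# Build the tree from the example: <href> in "DAV:", <no_ns> in no namespace.
doc = minidom.Document()
href = doc.createElementNS("DAV:", "href")
no_ns = doc.createElementNS(None, "no_ns")
href.appendChild(no_ns)
doc.appendChild(href)

# minidom writes only what is stored as attributes; it adds no declarations.
serialized = href.toxml()
print(serialized)

# Parsing the output back shows that the "DAV:" namespace did not survive.
reparsed = minidom.parseString(serialized).documentElement
print(repr(href.namespaceURI), repr(reparsed.namespaceURI))
```

This is option 1b) from the list above: the serialization succeeds, but
it no longer means the same thing as the tree it came from.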
Just to be sure we are talking about the same thing: your first example
didn't put the <href> element in the "DAV:" namespace. So you are
describing a different scenario here, right?
That is, ns = libxml2mod.xmlNewNs(element, "DAV:", None) only creates
an ns-declaration attribute on the element; it does not assign any
namespace to the element itself.
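
The distinction between declaring a namespace and assigning one to a
node can be sketched with Python's standard xml.dom.minidom. This is
only an analogue of the libxml2 situation, not the libxml2 bindings
themselves: the element is created in no namespace and then merely
carries an xmlns attribute, which is my assumption of the closest DOM
equivalent to calling xmlNewNs without assigning the result to the node.

```python
from xml.dom import minidom, XMLNS_NAMESPACE

doc = minidom.Document()

# The element itself is in no namespace...
href = doc.createElementNS(None, "href")
# ...but it carries a default-namespace declaration for "DAV:".
href.setAttributeNS(XMLNS_NAMESPACE, "xmlns", "DAV:")

serialized = href.toxml()
print(serialized)

# After a round trip the parser puts <href> into "DAV:", so the
# serialized form no longer matches the in-memory tree.
reparsed = minidom.parseString(serialized).documentElement
print(repr(href.namespaceURI), repr(reparsed.namespaceURI))
```

In the tree, <href> has no namespace; in the serialized form it has
picked up "DAV:" from the declaration it carries.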
> > On the one hand I use namespace-normalization for small DOM trees,
> > where the overhead of a normalization doesn't matter; on the other hand,
> > I just try to be careful and keep the serialized form in the back of my
> > head when working on a huge DOM tree, where I want to avoid
> > ns-normalization.
>
> My objectives include using libxml2's serialisation wherever possible -
> traversing the tree in Python is typically a very slow operation, and having
> to fix up the tree is also likely to incur substantial performance costs.
Maybe we should implement a namespace-normalization function in libxml2.
Have a look at xmlDOMWrapReconcileNamespaces (in tree.c); it does
something similar, but not exactly the same, since namespaces are
handled differently in libxml2 than in DOM: we cannot simply remove an
ns-declaration, since it could be referenced by node->ns fields. I no
longer remember whether it handles the xmlns="" case, so you might want
to test this.
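
To make the idea concrete, here is a deliberately reduced sketch of such
a fix-up pass, written against Python's standard xml.dom.minidom. It is
neither xmlDOMWrapReconcileNamespaces nor the full DOM Level 3 Namespace
Normalization algorithm: the helper name declare_default_namespaces is
hypothetical, and it handles only unprefixed elements, ignoring prefixed
names, attributes, and QNames in content.

```python
from xml.dom import minidom, XMLNS_NAMESPACE


def declare_default_namespaces(element, inherited=None):
    """Add or correct xmlns="..." on unprefixed elements (hypothetical helper).

    `inherited` is the default namespace in scope from the ancestors,
    with None meaning "no namespace".
    """
    if element.hasAttributeNS(XMLNS_NAMESPACE, "xmlns"):
        declared = element.getAttributeNS(XMLNS_NAMESPACE, "xmlns") or None
    else:
        declared = inherited
    wanted = element.namespaceURI or None
    if element.prefix is None and declared != wanted:
        # An empty value (xmlns="") undeclares the default namespace.
        element.setAttributeNS(XMLNS_NAMESPACE, "xmlns", wanted or "")
        declared = wanted
    for child in element.childNodes:
        if child.nodeType == child.ELEMENT_NODE:
            declare_default_namespaces(child, declared)


doc = minidom.Document()
href = doc.createElementNS("DAV:", "href")
href.appendChild(doc.createElementNS(None, "no_ns"))
doc.appendChild(href)

declare_default_namespaces(href)
serialized = href.toxml()
print(serialized)

# The round trip now preserves both namespaces.
reparsed = minidom.parseString(serialized).documentElement
print(repr(reparsed.namespaceURI), repr(reparsed.firstChild.namespaceURI))
```

Walking the tree in Python like this is exactly the cost Paul wants to
avoid, which is why doing the equivalent inside libxml2 would be the
more attractive route.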
> Since you're not the first person to suggest namespace normalisation (and the
> related DOM standards), I had a look at the pxdom module for Python which is
> much more standards-compliant than virtually any other Python DOM
> implementation, and it would appear that pxdom does "automagically" (as
> someone said) emit xmlns declarations at least in its default configuration,
> which I would assume has something to do with the normalisation process or
> some related aspect of DOM Level 3.
I now even see that I was wrong to tell you that a plain DOM serializer
won't try to normalize namespaces. It will normalize by default,
according to:
http://www.w3.org/TR/2004/REC-DOM-Level-3-Core-20040407/core.html#parameter-namespaces

  "namespaces"
    true
      [required] (default)
      Perform the namespace processing as defined in Namespace
      Normalization.
    false
      [optional]
      Do not perform the namespace processing.

So that's the reason why pxdom does it automagically.
> Anyway, I'd like to thank you for the kind words and helpful advice. It's a
> longer journey of enlightenment than I thought. ;-)
Yeah, obviously the same for me.
Regards,
Kasimier