[xml] namespace handling


Because I need to use the functions from "tree.c" to provide functionality
as described in the DOM 2 specification, and having learned something about
namespace handling in libxml2 and read the mail
"http://mail.gnome.org/archives/xml/2002-March/msg00111.html" from the
libxml2 archive, the following thoughts and questions occurred to me:

If a namespaced node is created, libxml2 wants it to reference a
namespace declaration (node->ns --> some-node->nsDef) that exists on the
ancestor-or-self axis.
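
To make this rule concrete, here is a minimal sketch in C. The Ns and Node
structs and ns_ref_in_scope() are hypothetical, simplified stand-ins and not
libxml2's real xmlNs/xmlNode; the only assumption is that each element has a
parent pointer, an nsDef list and an ns reference.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins, NOT libxml2's real xmlNs/xmlNode types. */
typedef struct Ns {
    struct Ns  *next;    /* next declaration on the same element */
    const char *href;    /* namespace URI */
    const char *prefix;  /* NULL for the default namespace */
} Ns;

typedef struct Node {
    struct Node *parent;
    Ns          *nsDef;  /* declarations carried by this element */
    Ns          *ns;     /* declaration this element is bound to */
} Node;

/* Returns 1 if node->ns points at a declaration found in the nsDef list
 * of the node itself or of one of its ancestors, 0 otherwise. */
int ns_ref_in_scope(const Node *node)
{
    assert(node != NULL);
    if (node->ns == NULL)
        return 1;                       /* no namespace: nothing to check */
    for (const Node *cur = node; cur != NULL; cur = cur->parent)
        for (const Ns *d = cur->nsDef; d != NULL; d = d->next)
            if (d == node->ns)
                return 1;
    return 0;
}
```

In this model an element whose parent pointer is cleared while its ns still
points at a declaration on the old parent fails the check, which is the
situation the cases below run into.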

   Now, let's take some DOM 2 methods:
     If I create an element like this I have to ensure that the namespace
     is available in libxml2 - even if I may never add this element to
     a tree.
     Possible solution: create a namespace and add it to the nsDef of
       this element.
       Adding this element to a tree would then mean:
         Leave the namespace as it is (this could create a lot of
         namespace declarations) OR try to substitute it with a
         suitable namespace that already exists on the ancestor axis.
     This all seems a bit laborious to me, but it can be done.
     Are there any other solutions?
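
     The substitution step could look roughly like the following sketch.
     It uses hypothetical stand-in structs (not libxml2's real types) and a
     made-up helper, attach_and_substitute(), and it assumes the element is
     freshly created and has no children that reference its declaration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-ins, NOT libxml2's real xmlNs/xmlNode types. */
typedef struct Ns { struct Ns *next; const char *href; const char *prefix; } Ns;
typedef struct Node { struct Node *parent; Ns *nsDef; Ns *ns; } Node;

/* String equality that treats NULL (the default prefix) as a value. */
static int same_str(const char *a, const char *b)
{
    if (a == NULL || b == NULL)
        return a == b;
    return strcmp(a, b) == 0;
}

/* Declaration in effect at start for want's prefix, if it has the same URI.
 * The nearest declaration of the prefix wins; if it maps to another URI the
 * wanted namespace is shadowed and NULL is returned. */
static Ns *find_in_scope(Node *start, const Ns *want)
{
    for (Node *cur = start; cur != NULL; cur = cur->parent)
        for (Ns *d = cur->nsDef; d != NULL; d = d->next)
            if (same_str(d->prefix, want->prefix))
                return same_str(d->href, want->href) ? d : NULL;
    return NULL;
}

/* Attach a freshly created, childless elem under parent. If elem is bound
 * to a declaration it carries itself and an equivalent one is in effect at
 * parent, rebind elem to that one and unlink the redundant declaration.
 * Returns the unlinked declaration (now owned by the caller) or NULL if
 * nothing was substituted. */
Ns *attach_and_substitute(Node *parent, Node *elem)
{
    assert(parent != NULL && elem != NULL);
    elem->parent = parent;

    Ns *own = elem->ns;
    if (own == NULL)
        return NULL;

    /* Locate own in elem's nsDef list, remembering the link to patch. */
    Ns **link = &elem->nsDef;
    while (*link != NULL && *link != own)
        link = &(*link)->next;
    if (*link == NULL)
        return NULL;                    /* not declared on elem itself */

    Ns *found = find_in_scope(parent, own);
    if (found == NULL)
        return NULL;                    /* keep elem's own declaration */

    *link = own->next;                  /* unlink the redundant declaration */
    own->next = NULL;
    elem->ns = found;
    return own;
}
```

     The sketch only covers the element itself; an element with children
     or namespaced attributes would need the same rebinding for every node
     that references the dropped declaration.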

     Here it gets a bit messy: where to declare the namespace?
     I cannot attach it to any nsDef, since there might be no tree yet
     (if building a tree from scratch).
     Does a solution exist?

     If I import a tree (deep=true) into another document using
     xmlDocCopyNode, this function leads to xmlStaticCopyNode and
     xmlCopyProp being called, both of which try to reconcile namespaces,
     so that all namespaces are recreated on the copied tree.
     Adding this tree to the desired document would imply:
       a. Leave the created namespaces - this could lead to a lot of
          namespace declarations.
       b. OR try to substitute the namespaces with suitable namespaces on
          the ancestor axis.

     If I detach a tree from the document, namespace references pointing
     "out of the tree" may still exist. What to do?
        a. Don't care. This won't work, since we cannot ensure that the
           referenced namespaces will live as long as the tree
           we cut out.
        b. Reconcile the tree, thus recreating "out of scope"
           namespace declarations at the top of the tree. This might lead
           to a lot of namespace declarations plus a loss of speed, since
           it would be advisable to substitute those namespaces again if
           we attach the tree to the document.
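
     Option b can be sketched as follows. Again the structs are
     hypothetical stand-ins rather than libxml2's real types, and
     reconcile_subtree() is an illustrative name. The sketch assumes that
     the top element does not already declare the same prefix for a
     different URI; prefix clashes are not handled.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Simplified stand-ins, NOT libxml2's real xmlNs/xmlNode types. */
typedef struct Ns { struct Ns *next; const char *href; const char *prefix; } Ns;
typedef struct Node {
    struct Node *parent, *children, *next;
    Ns *nsDef, *ns;
} Node;

static int same_str(const char *a, const char *b)
{
    if (a == NULL || b == NULL)
        return a == b;
    return strcmp(a, b) == 0;
}

static char *dup_str(const char *s)     /* NULL stays NULL */
{
    if (s == NULL)
        return NULL;
    size_t n = strlen(s) + 1;
    char *copy = malloc(n);
    if (copy != NULL)
        memcpy(copy, s, n);
    return copy;
}

/* 1 if decl is carried by node or by an ancestor up to and including top. */
static int declared_within(const Node *node, const Node *top, const Ns *decl)
{
    for (const Node *cur = node; cur != NULL; cur = cur->parent) {
        for (const Ns *d = cur->nsDef; d != NULL; d = d->next)
            if (d == decl)
                return 1;
        if (cur == top)
            break;
    }
    return 0;
}

/* Rebind every reference in the subtree at node that points outside the
 * subtree rooted at top to a private copy declared on top. Returns the
 * number of declarations created, or -1 on allocation failure. */
int reconcile_subtree(Node *node, Node *top)
{
    int created = 0;
    assert(node != NULL && top != NULL);

    if (node->ns != NULL && !declared_within(node, top, node->ns)) {
        const Ns *old = node->ns;
        Ns *copy = NULL;
        /* Reuse a copy that an earlier node already caused on top. */
        for (Ns *d = top->nsDef; d != NULL && copy == NULL; d = d->next)
            if (same_str(d->href, old->href) && same_str(d->prefix, old->prefix))
                copy = d;
        if (copy == NULL) {
            copy = calloc(1, sizeof *copy);
            if (copy == NULL)
                return -1;
            /* Own copies of the strings: the old declaration may be freed. */
            copy->href = dup_str(old->href);
            copy->prefix = dup_str(old->prefix);
            if ((old->href != NULL && copy->href == NULL) ||
                (old->prefix != NULL && copy->prefix == NULL)) {
                free((void *)copy->href);
                free((void *)copy->prefix);
                free(copy);
                return -1;
            }
            copy->next = top->nsDef;
            top->nsDef = copy;
            created++;
        }
        node->ns = copy;
    }
    for (Node *c = node->children; c != NULL; c = c->next) {
        int n = reconcile_subtree(c, top);
        if (n < 0)
            return -1;
        created += n;
    }
    return created;
}
```

     Calling reconcile_subtree(sub, sub) after unlinking sub gives the cut
     out tree its own declarations, at the price of one walk over the whole
     subtree for every detach.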

Considering these examples, I doubt whether libxml2's philosophy of
handling namespaces is adequate for working with DOM 2 (or DOM 3).
I wonder whether the document structure of libxml2 strictly has to follow
the canonical form of an XML document. AFAIC the W3C DOM 2 does not require
the namespaces of elements or attributes to be declared while working with
the DOM 2. Once a namespaced element or namespaced attribute is created,
its namespace (URI) cannot be changed, no matter what namespace
declaration context it is put in. In this light it makes sense
that a method like "normalizeDocument" was added to DOM 3. IMHO
namespace normalization should be done before the document is
serialized; while working with the DOM 2 the canonical form is not needed.

So you may want to consider the following proposals:

1. Keep a list of namespaces on the xmlDoc.
2. The parsed namespace declarations need not be changed.
3. Copy the namespace declarations found on elements to the list on xmlDoc.
4. Namespace references should point to namespaces in the list on
   xmlDoc and not to namespace declarations on elements (nsDefs).
5. Change the substitution code for namespaces in functions like
   xmlCopyProp, xmlStaticCopyNode, etc. to handle the namespaces on the
   list on xmlDoc.
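
As a rough illustration of points 1, 4 and 5, here is a sketch of a
document-level namespace list. Doc, nsList, doc_intern_ns() and
import_node_ns() are hypothetical names, not existing libxml2 API, and
keying the list on URI plus prefix is an assumption made here so that the
original prefix is preserved.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical model of the proposal, NOT libxml2's real types. */
typedef struct Ns { struct Ns *next; char *href; char *prefix; } Ns;
typedef struct Doc { Ns *nsList; } Doc;          /* proposal 1 */
typedef struct Node { struct Node *parent; Doc *doc; Ns *ns; } Node;

static int same_str(const char *a, const char *b)
{
    if (a == NULL || b == NULL)
        return a == b;
    return strcmp(a, b) == 0;
}

static char *dup_str(const char *s)     /* NULL stays NULL */
{
    if (s == NULL)
        return NULL;
    size_t n = strlen(s) + 1;
    char *copy = malloc(n);
    if (copy != NULL)
        memcpy(copy, s, n);
    return copy;
}

/* Return the document's entry for (href, prefix), creating it on first
 * use. Returns NULL on allocation failure. */
Ns *doc_intern_ns(Doc *doc, const char *href, const char *prefix)
{
    assert(doc != NULL && href != NULL);
    for (Ns *n = doc->nsList; n != NULL; n = n->next)
        if (strcmp(n->href, href) == 0 && same_str(n->prefix, prefix))
            return n;

    Ns *fresh = calloc(1, sizeof *fresh);
    if (fresh == NULL)
        return NULL;
    fresh->href = dup_str(href);
    fresh->prefix = dup_str(prefix);
    if (fresh->href == NULL || (prefix != NULL && fresh->prefix == NULL)) {
        free(fresh->href);
        free(fresh->prefix);
        free(fresh);
        return NULL;
    }
    fresh->next = doc->nsList;
    doc->nsList = fresh;
    return fresh;
}

/* Move a single node to another document: its namespace reference is
 * re-pointed at the target document's list (proposals 4 and 5), so no
 * element-level declaration is needed. Returns 0 on success, -1 on
 * allocation failure. */
int import_node_ns(Doc *target, Node *node)
{
    assert(target != NULL && node != NULL);
    if (node->ns != NULL) {
        Ns *n = doc_intern_ns(target, node->ns->href, node->ns->prefix);
        if (n == NULL)
            return -1;
        node->ns = n;
    }
    node->doc = target;
    return 0;
}
```

With such a list an orphan element stays valid as long as its document
lives, and importing reduces to one lookup per namespace reference.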

This would result in:
   1. Namespace declarations on elements if you want them.
   2. No mess with namespace references of orphan elements or attributes.
   3. Easy import of nodes to other documents.
   4. No mess with namespace reconciliation during the work with DOM 2.
   5. Gain of speed during the work with DOM 2.
   6. The need to reconcile or normalize the namespaces of the whole
      document before serialization, and thus a loss of speed if only
      minor work is done on the document.


Kasimier Buchcik
