'Re: "Re: [xml] xmlReconciliateNs fix proposal II"'

From: Kasimier Buchcik <kbuchcik 4commerce de>
To: <xml gnome org>
Subject: 'Re: "Re: [xml] xmlReconciliateNs fix proposal II"'
Date: Mon, 12 May 2003 19:02:01 +0200

Hi,

Daniel Veillard wrote:

On Mon, May 12, 2003 at 12:06:13PM +0200, Kasimier Buchcik wrote:

Hi,

I attached a new version of the fix proposal for xmlReconciliateNs.

It would be nice if someone could verify whether this one follows the
philosophy of libxml2's namespace handling, since I'm still new to this.



  I don't have the time right now to really go through it, but:

  please provide the changes as contextual patches (diff -c)


Ok. I will do that from now on.

/**
* This is just a memory allocation function for xmlReconciliateNs.
*/



  Please use the same documentation mechanisms as for other functions
in the library.


I see. There is one :-)

int



  make sure functions used only locally are declared static, so they do not
pollute the symbol space, and to avoid long-term ABI compatibility issues

Ok.

xmlReconciliateNsAllocMap(xmlNsPtr **oldNs, xmlNsPtr  **newNs, int **createdNs, int sizeCache, int reall) {



  I'm not 100% sure that separating the allocation functions really makes
the code cleaner, but it's not a big deal.


Well, it was already big enough, I thought ;-)

/**
* Returns 1 if @ns is declared on the ancestor-or-self axis of @node and
* @breakNs is not encountered before it; returns 0 otherwise.
*/

int
xmlReconciliateNsIsReachableNs(xmlNodePtr node, xmlNsPtr ns, xmlNsPtr breakNs) {
   xmlNodePtr curNode = node;
   xmlNsPtr curNs;
   if ((node == NULL) || (ns == NULL))
     return(0);
   while (curNode != NULL) {
     curNs = curNode->nsDef;
     while (curNs != NULL) {
         if ((breakNs == curNs) && (breakNs != ns))
             return(0);
         if (curNs == ns) 
             return(1);
         curNs = curNs->next;
     }
     curNode = curNode->parent;
   }
   return(0);
}



  code indentation and style use the following:

gnome:~ -> cat bin/cb
#!/bin/sh
indent -bad -bap -bbb -bli4 -br -ce -brs -cs -i4 -l75 -lc75 -nut -sbi4 -psl -saf -sai -saw -sbi4 -ss -sc -cdw -cli4 -npcs -nbc
gnome:~ ->


Ok, will be done.

int
xmlReconciliateNs(xmlDocPtr doc, xmlNodePtr tree) {
   // Namespace map.



  no // comments please, portability is key


Sorry... I'm a nasty Delphi chap...

   xmlNsPtr *oldNs = NULL;
   xmlNsPtr *newNs = NULL;
   int *createdNs = NULL;  
   int mapSizeCache = 0;
   int mapnbCache = 0;
   // Namespace bag.
   xmlNsPtr *nsBag = NULL;
   int bagSizeCache = 0;
   int bagnbCache = 0;

   xmlNsPtr n;   
   xmlNodePtr node = tree, tmpNode;
   xmlAttrPtr attr;
   int ret = 0, i, j;
   xmlNsPtr *ancNsList = NULL, *curAncNs = NULL;
   int found;
   xmlChar prefix[50];
   int counter;
  
   // Put namespaces of ancestors into the namespace bag.
   if (node->parent != NULL)
       ancNsList = xmlGetNsList(doc, node->parent);
   if (ancNsList != NULL) {
     // Initialize the ns bag cache if needed.           
     if (bagSizeCache == 0) {
         bagSizeCache = 10;
         if (! xmlReconciliateNsAllocBag(&nsBag, bagSizeCache, 0))
             return(-1);   
     }
     curAncNs = ancNsList;
     while (*curAncNs != NULL) {
         if (bagSizeCache <= bagnbCache) {
             bagSizeCache *= 2;
             if (! xmlReconciliateNsAllocBag(&nsBag, bagSizeCache, 1))
                 return(-1);
           }
           nsBag[bagnbCache++] = *curAncNs;   
         curAncNs++;
     }           
     xmlFree(ancNsList);
   }        
   // Walk the tree.
   while (node != NULL) {    
     if ((mapSizeCache == 0) && ((node->nsDef != NULL) || (node->ns != NULL))) {
           mapSizeCache = 10;
         if (! xmlReconciliateNsAllocMap(&oldNs, &newNs, &createdNs, mapSizeCache, 0))
             return(-1);
       }     
       // Put declared namespaces of the subtree into the namespace bag.
       if (node->nsDef != NULL) {
         // Initialize the ns bag cache if needed.       
         if (bagSizeCache == 0) {
             bagSizeCache = 10;
             if (! xmlReconciliateNsAllocBag(&nsBag, bagSizeCache, 0))
                 return(-1);   
         }
           n = node->nsDef;
           while (n != NULL) {
               // Check if we need to grow the bag cache buffer.
               if (bagSizeCache <= bagnbCache) {
                   bagSizeCache *= 2;
                   if (! xmlReconciliateNsAllocBag(&nsBag, bagSizeCache, 1))
                     return(-1);
               }
               nsBag[bagnbCache++] = n;                
               n = n->next;
           }
     }
     if (node->ns != NULL) {     
           // Try to remap namespaces.
         for (i = 0;i < mapnbCache;i++) {
             // Additionally we have to check if this one is in scope at all.
             if ( (oldNs[i] == node->ns) && xmlReconciliateNsIsReachableNs(node, newNs[i], node->ns) ) {   
                                      
                 node->ns = newNs[i];                                    
                 break;
             }
         }
         if (i == mapnbCache) {
             // Search in the parental axis for a namespace with an equal URI.
             n = xmlSearchNsByHref(doc, node, node->ns->href);



  By building the map of the in-scope namespaces at the insertion point
you should not have to call the xmlSearchNs... functions in the loop, or I
missed something...


Hmm, I cannot search by pointer in the in-scope namespaces of the given 
tree, if this is what you mean.

Example:
<a>
   <b>
     <c xmlns:x="http://X">
       <x:d/>
     </c>
   </b>
</a>

If we detach <x:d>, attach it to <b>, and reconcile on <b>, we get this:

<a>
   <b>
     <c xmlns:x="http://X"/>
     <x:d/>
   </b>
</a>

The namespace reference of <x:d> would then point to a declaration that is
out of scope (it is still on <c>).
This function is designed to let you decide where to begin the
reconciliation, and thus where any new namespace declarations that are
required get created. If I were only allowed to start reconciliation on
the attached node (here: <x:d>), your suggestion would hold.
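
The scope problem in the example above can be reproduced with a toy model. This is only a sketch: `Ns`, `Node`, `ns_in_scope` and `move_demo` are hypothetical stand-ins invented for illustration, not libxml2's xmlNs/xmlNode or any libxml2 function. It shows that after re-parenting <x:d> under <b>, a walk over the ancestor-or-self axis no longer finds the declaration that still lives on <c>.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for libxml2's xmlNs and xmlNode; only the
 * fields needed for the scope check are modeled. */
typedef struct Ns {
    const char *prefix;
    const char *href;
    struct Ns *next;          /* next declaration on the same element */
} Ns;

typedef struct Node {
    const char *name;
    struct Node *parent;
    Ns *nsDef;                /* declarations carried by this element */
    Ns *ns;                   /* declaration this element refers to */
} Node;

/* Returns 1 if the declaration `ns` sits on `node` or on one of its
 * ancestors, 0 otherwise. The comparison is by pointer. */
static int
ns_in_scope(const Node *node, const Ns *ns)
{
    const Node *cur;
    const Ns *decl;

    for (cur = node; cur != NULL; cur = cur->parent)
        for (decl = cur->nsDef; decl != NULL; decl = decl->next)
            if (decl == ns)
                return 1;
    return 0;
}

/* Builds <a><b><c xmlns:x="http://X"><x:d/></c></b></a>, then moves
 * <x:d> from <c> to <b>. Stores the scope check result taken before
 * and after the move. */
static void
move_demo(int *before, int *after)
{
    Ns x = { "x", "http://X", NULL };
    Node a = { "a", NULL, NULL, NULL };
    Node b = { "b", &a, NULL, NULL };
    Node c = { "c", &b, &x, NULL };
    Node d = { "d", &c, NULL, &x };

    assert(before != NULL && after != NULL);
    *before = ns_in_scope(&d, d.ns);
    d.parent = &b;            /* detach from <c>, attach to <b> */
    *after = ns_in_scope(&d, d.ns);
}
```

In this model the reference on <x:d> is in scope before the move and out of scope afterwards, which is why a search by pointer alone is not enough when reconciliation starts at <b>.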

             if (n == NULL) {
                 // Check if an already created namespace can be used.
                 for (i = 0;i < mapnbCache; i++) {   
                     if ( createdNs[i] && xmlStrEqual(newNs[i]->href, node->ns->href) ) { 
                         n = newNs[i];                           
                         break;                                          
                     }
                 }
             }
             if (mapSizeCache <= mapnbCache) {
                 mapSizeCache *= 2;
                 if (! xmlReconciliateNsAllocMap(&oldNs, &newNs, &createdNs, mapSizeCache, 1))
                     return(-1);
             }               
             oldNs[mapnbCache] = node->ns;
             if (n != NULL)              
                 createdNs[mapnbCache] = 0;              
             else {
                 // We need to create a new namespace.
                 // Note that we don't bother with duplicate prefixes yet; this will
                 // be done later. That's why we pass NULL as the node argument to xmlNewNs.
                 n = xmlNewNs(NULL, node->ns->href, node->ns->prefix);                                   
                 createdNs[mapnbCache] = 1;              
             }               
             newNs[mapnbCache++] = n;
             node->ns = n;      
         }
     }
     /*
      * Now check for namespaces held by the attributes on the node.
      */



  The big problem with that approach is that you try to minimize the number of
namespaces put on the new tree, but it may not work! For example, XML Schemas
allow data to be QNames, and these can appear in node or attribute *values*.
You will not catch them this way, leading to broken QNames in the resulting
tree. See the section of the XSLT 1.0 specification about namespace output;
basically, what you want is to make 100% sure that the namespaces in scope at
the extraction point are still available at the insertion point.


Wellowello, I'm not into XSLT... but if QNames in attribute or node
values have to be handled nowadays, then this asks for namespace
references to be implemented for those too, since one cannot ensure that
the prefix of a namespace will stay the same while working with libxml2.
It would then be trivial to include those namespace references in
xmlReconciliateNs. Otherwise we have no information about the
namespace of those QNames. Even if we put all the namespaces at the top of
the tree, this will still not be enough, since the namespace used by the
QName might be declared neither in the moved tree nor in its new home
document (if you think of importing nodes).

I do think it makes for a faster algorithm and also guarantees that you
won't miss a needed namespace. Your algorithm is certainly better than the
existing one, but I'm not sure it is actually the right way to proceed.


  Also, the resulting code you suggest is really huge and relatively complex;
I have a hard time committing to having it maintained in the source.
  Can you suggest a non-minimizing but correct namespace reconciliation
mechanism? Basically, it should just grab all the namespaces in scope at the
extraction point, insert them at the root of the pasted tree, then
walk the subtree to do the reconciliation.


Well, the minimizing is a feature; it does not break anything. Am I
still missing something here?

That should be much simpler and possibly as efficient, and if there are a
couple of extra unused namespaces, at least we know that there is no hole
in the process,


IMHO there need not be any unused namespaces.


  thanks,

Daniel


Thanks too!
You might also want to read my mail "namespace handling", since maybe
things could be done more easily.


Kasimier Buchcik

Follow-Ups:
- Re: 'Re: "Re: [xml] xmlReconciliateNs fix proposal II"'
  - From: Daniel Veillard

References:
- [xml] xmlReconciliateNs fix proposal II
  - From: Kasimier Buchcik
- Re: [xml] xmlReconciliateNs fix proposal II
  - From: Daniel Veillard
