Re: [xml] xmlReconciliateNs fix proposal II



On Mon, May 12, 2003 at 12:06:13PM +0200, Kasimier Buchcik wrote:
Hi,

I attached a new version of the fix proposal for xmlReconciliateNs.

Would be nice, if someone could verify if this one follows the 
philosophy of libxml2's namespace handling, since I'm still new to this.

  I don't have the time right now to really go through it, but:

  please provide the changes as contextual patches (diff -c)

/**
 * This is just a memory allocation function for xmlReconciliateNs.
 */

  Please use the same documentation mechanisms as for other functions
in the library.

int

  Make sure functions used only locally are declared static, so they do not
pollute the symbol space and to avoid long-term ABI compatibility issues.

xmlReconciliateNsAllocMap(xmlNsPtr **oldNs, xmlNsPtr  **newNs, int **createdNs, int sizeCache, int reall) {

  I'm not 100% sure that separating the allocation functions really makes
the code cleaner, but it's not a big deal.
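
  As an illustration of the two remarks above (documentation style and
static linkage), a local helper could look roughly like the sketch below.
The name exampleGrowArray and its parameters are made up for this example
and are not taken from the patch or from libxml2; the point is only the
comment layout (name, @arguments, description, Returns) and the static
qualifier.

```c
#include <stdlib.h>

/**
 * exampleGrowArray:
 * @items:  address of the array pointer to (re)allocate
 * @size:  the new number of slots
 *
 * Hypothetical local helper, shown only to illustrate the comment
 * format and the use of static for functions private to one file.
 *
 * Returns 1 in case of success, 0 in case of error.
 */
static int
exampleGrowArray(void ***items, int size)
{
    void **tmp;

    if ((items == NULL) || (size <= 0))
        return (0);
    /* realloc(NULL, n) behaves like malloc(n), so one path covers both */
    tmp = (void **) realloc(*items, (size_t) size * sizeof(void *));
    if (tmp == NULL)
        return (0);
    *items = tmp;
    return (1);
}
```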


/**
 * This one will return 1 if a namespace is declared in the ancestor-or-self axis of @node and 
 * @breakNs does not exist in between; returns 0 otherwise.
 */

int
xmlReconciliateNsIsReachableNs(xmlNodePtr node, xmlNsPtr ns, xmlNsPtr breakNs) {
    xmlNodePtr curNode = node;
    xmlNsPtr curNs;
    if ((node == NULL) || (ns == NULL))
      return(0);
    while (curNode != NULL) {
      curNs = curNode->nsDef;
      while (curNs != NULL) {
          if ((breakNs == curNs) && (breakNs != ns))
              return(0);
          if (curNs == ns) 
              return(1);
          curNs = curNs->next;
      }
      curNode = curNode->parent;
    }
    return(0);
}

  For code indentation and style, use the following:

gnome:~ -> cat bin/cb
#!/bin/sh
indent -bad -bap -bbb -bli4 -br -ce -brs -cs -i4 -l75 -lc75 -nut -sbi4 -psl -saf -sai -saw -sbi4 -ss -sc -cdw -cli4 -npcs -nbc
gnome:~ ->


int
xmlReconciliateNs(xmlDocPtr doc, xmlNodePtr tree) {
    // Namespace map.    

  no // comments please, portability is key

    xmlNsPtr *oldNs = NULL;
    xmlNsPtr *newNs = NULL;
    int *createdNs = NULL;  
    int mapSizeCache = 0;
    int mapnbCache = 0;
    // Namespace bag.
    xmlNsPtr *nsBag = NULL;
    int bagSizeCache = 0;
    int bagnbCache = 0;

    xmlNsPtr n;   
    xmlNodePtr node = tree, tmpNode;
    xmlAttrPtr attr;
    int ret = 0, i, j;
    xmlNsPtr *ancNsList = NULL, *curAncNs = NULL;
    int found;
    xmlChar prefix[50];
    int counter;
   
    // Put namespaces of ancestors into the namespace bag.
    if (node->parent != NULL)
        ancNsList = xmlGetNsList(doc, node->parent);
    if (ancNsList != NULL) {
      // Initialize the ns bag cache if needed.           
      if (bagSizeCache == 0) {
          bagSizeCache = 10;
          if (! xmlReconciliateNsAllocBag(&nsBag, bagSizeCache, 0))
              return(-1);   
      }
      curAncNs = ancNsList;
      while (*curAncNs != NULL) {
          if (bagSizeCache <= bagnbCache) {
              bagSizeCache *= 2;
              if (! xmlReconciliateNsAllocBag(&nsBag, bagSizeCache, 1))
                  return(-1);
            }
            nsBag[bagnbCache++] = *curAncNs;   
          curAncNs++;
      }           
      xmlFree(ancNsList);
    }        
    // Walk the tree.
    while (node != NULL) {    
      if ((mapSizeCache == 0) && ((node->nsDef != NULL) || (node->ns != NULL))) {
            mapSizeCache = 10;
          if (! xmlReconciliateNsAllocMap(&oldNs, &newNs, &createdNs, mapSizeCache, 0))
              return(-1);
        }     
        // Put declared namespaces of the subtree into the namespace bag.
        if (node->nsDef != NULL) {
          // Initialize the ns bag cache if needed.       
          if (bagSizeCache == 0) {
              bagSizeCache = 10;
              if (! xmlReconciliateNsAllocBag(&nsBag, bagSizeCache, 0))
                  return(-1);   
          }
            n = node->nsDef;
            while (n != NULL) {
                // Check if we need to grow the bag cache buffer.
                if (bagSizeCache <= bagnbCache) {
                    bagSizeCache *= 2;
                    if (! xmlReconciliateNsAllocBag(&nsBag, bagSizeCache, 1))
                      return(-1);
                }
                nsBag[bagnbCache++] = n;                
                n = n->next;
            }
      }
      if (node->ns != NULL) {     
            // Try to remap namespaces.
          for (i = 0;i < mapnbCache;i++) {
              // Additionally we have to check if this one is in scope at all.
              if ( (oldNs[i] == node->ns) && xmlReconciliateNsIsReachableNs(node, newNs[i], node->ns) ) {   
                                      
                  node->ns = newNs[i];                                    
                  break;
              }
          }
          if (i == mapnbCache) {
              // Search in the parental axis for a namespace with an equal URI.
              n = xmlSearchNsByHref(doc, node, node->ns->href);               

  By building the map of the in-scope namespaces at the insertion point,
you should not have to call the xmlSearchNs... functions in the loop, or I
missed something...

              if (n == NULL) {
                  // Check if an already created namespace can be used.
                  for (i = 0;i < mapnbCache; i++) {   
                      if ( createdNs[i] && xmlStrEqual(newNs[i]->href, node->ns->href) ) { 
                          n = newNs[i];                           
                          break;                                          
                      }
                  }
              }
              if (mapSizeCache <= mapnbCache) {
                  mapSizeCache *= 2;
                  if (! xmlReconciliateNsAllocMap(&oldNs, &newNs, &createdNs, mapSizeCache, 1))
                      return(-1);
              }               
              oldNs[mapnbCache] = node->ns;
              if (n != NULL)              
                  createdNs[mapnbCache] = 0;              
              else {
                  // We need to create a new namespace.
                  // Note that we don't bother yet with duplicate prefixes; this will
                  // be done later. That's why we pass NULL as the node argument to xmlNewNs.
                  n = xmlNewNs(NULL, node->ns->href, node->ns->prefix);                                   
                  createdNs[mapnbCache] = 1;              
              }               
              newNs[mapnbCache++] = n;
              node->ns = n;      
          }
      }
      /*
       * Now check for namespaces held by the attributes on the node.
       */

  The big problem with that approach is that you try to minimize the number of
namespaces put on the new tree, but it may not work! For example, XML Schemas
allow data to be QNames, and these can appear in node or attribute *values*.
You will not catch them this way, leading to broken QNames in the resulting
tree. See the section of the XSLT 1.0 specification about namespace output;
basically, what you want is to make 100% sure that the namespaces in scope at
the extraction point are still available at the insertion point. I do think it
makes for a faster algorithm and also guarantees that you won't miss a needed
namespace. Your algorithm is certainly better than the existing one, but
I'm not sure it is actually the right way to proceed.
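
  To make the QName concern concrete: in something like
<element type="xs:string"/>, the attribute itself has no namespace, so a walk
over node->ns and attr->ns never sees the "xs" prefix, yet the value breaks
if that binding is lost. The sketch below uses a made-up helper, qnamePrefix
(not a libxml2 function), only to show that the dependency lives in the
string content.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not part of libxml2: copy the prefix of a
 * QName-valued string such as "xs:string" into @buf.
 * Returns 1 if a non-empty prefix was found and fits, 0 otherwise.
 */
static int
qnamePrefix(const char *value, char *buf, size_t len)
{
    const char *colon;
    size_t n;

    if ((value == NULL) || (buf == NULL) || (len == 0))
        return (0);
    colon = strchr(value, ':');
    if ((colon == NULL) || (colon == value))
        return (0);             /* no prefix, or empty prefix */
    n = (size_t) (colon - value);
    if (n >= len)
        return (0);             /* prefix does not fit in @buf */
    memcpy(buf, value, n);
    buf[n] = '\0';
    return (1);
}
```

  A reconciliation pass that only follows ns pointers has no reason to call
anything like this on every text or attribute value, which is why keeping all
in-scope declarations is the safer route.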


  Also, the resulting code you suggest is really huge and relatively complex;
I have a hard time committing to have it maintained in the source.
  Can you suggest a non-minimizing but correct namespace reconciliation
mechanism? Basically, it should just grab all the namespaces in scope at the
extraction point, insert them at the root of the pasted tree, then
walk the subtree to do the reconciliation. That should be quite a bit simpler,
possibly as efficient, and if there are a couple of extra unused namespaces,
at least we know that there is no hole in the process.
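
  A minimal sketch of the non-minimizing scheme described above, written
against toy structures rather than the real xmlNode/xmlNs types so that it
stands alone. All names here (ToyNs, ToyNode, TOY_MAX_NS, toyCopyInScope,
toyReconcile) are hypothetical; attributes, allocation and error reporting
are left out. A real version would presumably collect the declarations with
xmlGetNsList() before the move and create them on the subtree root with
xmlNewNs().

```c
#include <stddef.h>
#include <string.h>

#define TOY_MAX_NS 16

/* Toy stand-ins for xmlNs and xmlNode; NOT the libxml2 structures. */
typedef struct ToyNs {
    const char *prefix;         /* NULL for the default namespace */
    const char *href;
} ToyNs;

typedef struct ToyNode {
    struct ToyNode *parent;
    struct ToyNode *children;   /* first child */
    struct ToyNode *next;       /* next sibling */
    const ToyNs *ns;            /* namespace the node is bound to */
    ToyNs nsDef[TOY_MAX_NS];    /* declarations carried by the node */
    int nbDef;
} ToyNode;

static int
toySamePrefix(const char *a, const char *b)
{
    if ((a == NULL) || (b == NULL))
        return (a == b);
    return (strcmp(a, b) == 0);
}

/* Nearest declaration of @prefix on @node or an ancestor, up to @root. */
static const ToyNs *
toyLookup(const ToyNode *node, const ToyNode *root, const char *prefix)
{
    int i;

    while (node != NULL) {
        for (i = 0; i < node->nbDef; i++)
            if (toySamePrefix(node->nsDef[i].prefix, prefix))
                return (&node->nsDef[i]);
        if (node == root)
            break;
        node = node->parent;
    }
    return (NULL);
}

/*
 * Step 1: copy onto @root every declaration in scope at @oldParent.
 * Walking from the nearest ancestor outwards and skipping prefixes
 * already declared on @root keeps shadowing correct.
 */
static int
toyCopyInScope(ToyNode *root, const ToyNode *oldParent)
{
    const ToyNode *cur;
    int i;

    if (root == NULL)
        return (-1);
    for (cur = oldParent; cur != NULL; cur = cur->parent) {
        for (i = 0; i < cur->nbDef; i++) {
            if (toyLookup(root, root, cur->nsDef[i].prefix) != NULL)
                continue;
            if (root->nbDef >= TOY_MAX_NS)
                return (-1);
            root->nsDef[root->nbDef++] = cur->nsDef[i];
        }
    }
    return (0);
}

/* Step 2: walk the subtree and rebind each node to a declaration in it. */
static int
toyReconcile(ToyNode *root)
{
    ToyNode *cur = root;
    const ToyNs *found;

    while (cur != NULL) {
        if (cur->ns != NULL) {
            found = toyLookup(cur, root, cur->ns->prefix);
            if ((found == NULL) ||
                (strcmp(found->href, cur->ns->href) != 0))
                return (-1);
            cur->ns = found;
        }
        if (cur->children != NULL) {
            cur = cur->children;
            continue;
        }
        while ((cur != root) && (cur->next == NULL))
            cur = cur->parent;
        if (cur == root)
            break;
        cur = cur->next;
    }
    return (0);
}
```

  Because step 1 never drops a declaration, step 2 only needs lookups
bounded by the subtree root, and prefixes that appear only inside text or
attribute values stay bound even though nothing in the walk inspects them.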

  thanks,

Daniel

-- 
Daniel Veillard      | Red Hat Network https://rhn.redhat.com/
veillard redhat com  | libxml GNOME XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/


