Re: [xml] Useless function calls in xmlSetProp()?



Daniel Veillard wrote:
On Fri, Feb 08, 2008 at 05:17:31PM +0100, Julien Charbon wrote:
Seems fine and clear. Attached to this email the "final" patch against current trunk.

  okidoc, rereviewed it, it looks fine, applied, tested still fine, so I
commited it, thanks a lot !

 Thanks to you, for reviewing and applying patch.

  In completely unscientific testings, runtest number of allocs are reduced
from 3,058,476 to 3,053,663 which is around 0.15% , it really depends on
the kind of documents used and what processing.

Right. I made a tiny prog [again], that show time made by a call to xmlSetProp() with an attribute value that double in size every iteration:

- With old xmlSetProp():

$ ./test-setprop-big
Size:   8       Time:   000:000014397
Size:   16      Time:   000:000003429
Size:   32      Time:   000:000003164
Size:   64      Time:   000:000004435
Size:   128     Time:   000:000033918
Size:   256     Time:   000:000017600
Size:   512     Time:   000:000042089
Size:   1024    Time:   000:000120426
Size:   2048    Time:   000:000400574
Size:   4096    Time:   000:001471683
Size:   8192    Time:   000:005275916
Size:   16384   Time:   000:020225833
Size:   32768   Time:   000:078549300
Size:   65536   Time:   000:304803839
Size:   131072  Time:   001:240064893
Size:   262144  Time:   004:963828528
Size:   524288  Time:   019:846718130
Size:   1048576 Time:   078:606054215

- With "new" [now current] xmlSetProp():

$ ./test-setprop-big
Size:   8       Time:   000:000004981
Size:   16      Time:   000:000001847
Size:   32      Time:   000:000000906
Size:   64      Time:   000:000000926
Size:   128     Time:   000:000001011
Size:   256     Time:   000:000001327
Size:   512     Time:   000:000001842
Size:   1024    Time:   000:000002458
Size:   2048    Time:   000:000004280
Size:   4096    Time:   000:000007559
Size:   8192    Time:   000:000022546
Size:   16384   Time:   000:000028533
Size:   32768   Time:   000:000054189
Size:   65536   Time:   000:000107761
Size:   131072  Time:   000:000218971
Size:   262144  Time:   000:000584088
Size:   524288  Time:   000:001057238
Size:   1048576 Time:   000:002148980

[Yes, attributes with value size of 1 MB are unrealistic, it is just to show how xmlSetProp() scaled before setprop.patch]

Knowing these results, maybe there is a new possible performance improvement:

According on a quick run of valgring/callgrind on test-setprop-big, old bad scaling is due to xmlStringGetNodeList() that call too much xmlStrLen() through call sequence:
xmlNodeAddContent{Len}() -> xmlStrnCat() -> xmlStrLen().

 Thus, I will check if some improvements can be done
on xmlStringGetNodeList() to make it more scalable...

P.S: Joined test-setprop-big.c was compiled with:
$ gcc -Wall test-setprop-big.c -o test-setprop-big $(xml2-config --cflags --libs) -lrt -g -O2

--
Julien
Index: include/libxml/xmlerror.h
===================================================================
--- include/libxml/xmlerror.h   (revision 3690)
+++ include/libxml/xmlerror.h   (working copy)
@@ -398,6 +398,7 @@
     XML_TREE_INVALID_HEX = 1300,
     XML_TREE_INVALID_DEC, /* 1301 */
     XML_TREE_UNTERMINATED_ENTITY, /* 1302 */
+    XML_TREE_NOT_UTF8, /* 1303 */
     XML_SAVE_NOT_UTF8 = 1400,
     XML_SAVE_CHAR_INVALID, /* 1401 */
     XML_SAVE_NO_DOCTYPE, /* 1402 */
Index: tree.c
===================================================================
--- tree.c      (revision 3690)
+++ tree.c      (working copy)
@@ -92,6 +92,9 @@
        case XML_TREE_UNTERMINATED_ENTITY:
            msg = "unterminated entity reference %15s\n";
            break;
+       case XML_TREE_NOT_UTF8:
+           msg = "string is not in UTF-8\n";
+           break;
        default:
            msg = "unexpected error number\n";
     }
@@ -1814,11 +1817,15 @@
         cur->name = name;
 
     if (value != NULL) {
-        xmlChar *buffer;
         xmlNodePtr tmp;
 
-        buffer = xmlEncodeEntitiesReentrant(doc, value);
-        cur->children = xmlStringGetNodeList(doc, buffer);
+        if(!xmlCheckUTF8(value)) {
+            xmlTreeErr(XML_TREE_NOT_UTF8, (xmlNodePtr) doc,
+                       NULL);
+            if (doc != NULL)
+                doc->encoding = xmlStrdup(BAD_CAST "ISO-8859-1");
+        }
+        cur->children = xmlNewDocText(doc, value);
         cur->last = NULL;
         tmp = cur->children;
         while (tmp != NULL) {
@@ -1827,7 +1834,6 @@
                 cur->last = tmp;
             tmp = tmp->next;
         }
-        xmlFree(buffer);
     }
 
     /*
@@ -6466,11 +6472,15 @@
        prop->last = NULL;
        prop->ns = ns;
        if (value != NULL) {
-           xmlChar *buffer;
            xmlNodePtr tmp;
            
-           buffer = xmlEncodeEntitiesReentrant(node->doc, value);
-           prop->children = xmlStringGetNodeList(node->doc, buffer);
+           if(!xmlCheckUTF8(value)) {
+               xmlTreeErr(XML_TREE_NOT_UTF8, (xmlNodePtr) node->doc,
+                          NULL);
+                if (node->doc != NULL)
+                    node->doc->encoding = xmlStrdup(BAD_CAST "ISO-8859-1");
+           }
+           prop->children = xmlNewDocText(node->doc, value);
            prop->last = NULL;
            tmp = prop->children;
            while (tmp != NULL) {
@@ -6479,7 +6489,6 @@
                    prop->last = tmp;
                tmp = tmp->next;
            }
-           xmlFree(buffer);
        }
        if (prop->atype == XML_ATTRIBUTE_ID)
            xmlAddID(NULL, node->doc, value, prop);


[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]