Re: [xml] libxml2 very slow on big data dump
- From: Alexandre Macard <amacard arkeia com>
- To: xml gnome org
- Subject: Re: [xml] libxml2 very slow on big data dump
- Date: Tue, 16 Dec 2008 17:54:26 +0100
Alexandre Macard wrote:
> Stefan Behnel wrote:
>
>> Alexandre Macard wrote:
>>
>>
>>> Stefan Behnel wrote:
>>>
>>>
>>>> Alexandre Macard wrote:
>>>>
>>>>
>>>>> I am trying to dump a node from a large XML file (about 7 MB), and
>>>>> libxml2 is very slow to respond.
>>>>>
>>>>> I tried to trace the problem, and it seems to spend all its time in
>>>>> the function xmlOutputBufferWriteEscape.
>>>>> I do not need the data escaped, because I use base64 encoding.
>>>>>
>>>>>
>>>>>
>>>> You didn't write which version of libxml2 you are using, but there was a
>>>> bug in an older version that could lead to horrible performance when
>>>> serialising character entities.
>>>>
>>>> Try upgrading your library.
>>>>
>>>>
>>> Sorry, I forgot to specify this. I am using the latest version,
>>> 2.7.2.
>>>
>>>
>> So maybe it's a similar bug, but for a different encoding (I think it was
>> related to the ASCII encoding at the time).
>>
>> Could you provide the code snippet that you use for serialisation? I.e.
>> what parameters you pass into what function?
>>
>> Stefan
>>
>>
>>
>>
> This small test program takes 15 seconds to exit.
> The journal.xml file is 7.1 MB.
>
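> #include <stdio.h>
> #include <libxml/parser.h>
> #include <libxml/tree.h>
>
> int main(void) {
>     xmlDocPtr doc;
>     xmlNodePtr cur;
>     xmlBufferPtr buf;
>
>     /* Parse the whole file into an in-memory tree. */
>     doc = xmlParseFile("./journal.xml");
>
>     if (doc == NULL) {
>         fprintf(stderr, "Document not parsed successfully.\n");
>         return 1;
>     }
>     cur = xmlDocGetRootElement(doc);
>
>     if (cur == NULL) {
>         fprintf(stderr, "empty document\n");
>         xmlFreeDoc(doc);
>         return 1;
>     }
>
>     buf = xmlBufferCreate();
>
>     /* Serialise the root node into buf (level 1, format 1). */
>     xmlNodeDump(buf, doc, cur, 1, 1);
>
>     /* Release the buffer with xmlBufferFree; xmlFree would free only
>        the struct and leak its content. */
>     xmlBufferFree(buf);
>     xmlFreeDoc(doc);
>
>     return 0;
> }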
>
> I will try to add a script later that generates similar XML.
>
> Thanks.
>
I forgot to mention that all of the time is spent in the function xmlNodeDump.
At the end of this message is a script that generates similar XML. I used
its output for testing and had to wait 22 seconds for my program to exit.

usage: script.sh > journal.xml
#!/bin/bash
#Header
echo -n '<SOAP-ENV:Envelope
xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/"
xmlns:xsi="http://www.w3.org/1999/XMLSchema-instance"
xmlns:xsd="http://www.w3.org/1999/XMLSchema"
SOAP-ENV:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/">
<SOAP-ENV:Header/> <SOAP-ENV:Body> <m:arkws_methodResponse
xmlns:m="urn:arkeia">'
echo -n '<m:list0 xsi:type="xsd:list"><m:last
xsi:type="xsd:integer">1</m:last><m:param0
xsi:type="xsd:integer">0</m:param0><m:base64_param1
xsi:type="xsd:string">MjAwOC8xMi8xNiAxNjo0NzoxMyBJMDAxMTAwMDAgMDFUUF9MSVNUX0FMTDogWW91IGhhdmUgc3VjY2Vzc2Z1bGx5IGxvYWRlZCB0aGUgbGlzdCBvZiB0YXBlcyE=</m:base64_param1><m:param2
xsi:type="xsd:list">'
i=0
while [ $i -lt 15000 ] ; do
echo -n '<m:item xsi:type="xsd:list"><m:base64_RDATE
xsi:type="xsd:string">MTIzMDkxMDAyNQ==</m:base64_RDATE><m:base64_NUM
xsi:type="xsd:string">MDAwMDE=</m:base64_NUM><m:base64_OWNER
xsi:type="xsd:string">cm9vdA==</m:base64_OWNER><m:base64_THREAD
xsi:type="xsd:string">MDAx</m:base64_THREAD><m:base64_PLID
xsi:type="xsd:string">NDczODVhMWY=</m:base64_PLID><m:base64_CID
xsi:type="xsd:string">NDkzNjlmZjA=</m:base64_CID><m:base64_TPID
xsi:type="xsd:string">NDc1NThlZjM=</m:base64_TPID><m:base64_VOLTAG
xsi:type="xsd:string">L2JhY2t1cHMvZmlsZQ==</m:base64_VOLTAG><m:base64_NAME
xsi:type="xsd:string">dGFwZV9maWxl</m:base64_NAME></m:item>'
i=$((i + 1))
done
echo -n '</m:param2></m:list0>'
#Footer
echo '</m:arkws_methodResponse> </SOAP-ENV:Body></SOAP-ENV:Envelope>'
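
To check where the time goes, the xmlNodeDump call in the test program can be bracketed with a timer. The following is only a minimal sketch in standard C: time_call and dummy_work are hypothetical names, not libxml2 functions, and dummy_work is a stand-in for a wrapper around xmlNodeDump so that the sketch builds without the library.

```c
#include <time.h>

/* Hypothetical helper, not part of libxml2: runs fn(arg) once and
 * returns the CPU time it consumed, in seconds. */
static double time_call(void (*fn)(void *), void *arg)
{
    clock_t start = clock();
    fn(arg);
    clock_t end = clock();
    return (double)(end - start) / CLOCKS_PER_SEC;
}

/* Stand-in workload (an assumption for illustration). In the test
 * program, a function with this signature would instead call
 * xmlNodeDump(buf, doc, cur, 1, 1) on values passed through arg. */
static void dummy_work(void *arg)
{
    volatile unsigned long long *sum = arg;
    for (unsigned long long i = 0; i < 1000000ULL; i++)
        *sum += i;
}
```

Calling time_call(dummy_work, &sum) with an unsigned long long sum returns the elapsed CPU seconds. Timing xmlParseFile the same way would show how the total splits between parsing and dumping. Note that clock() measures CPU time, not wall-clock time.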