Re: [xml] performance of parsing docbook with xincludes

On 05/16/2018 12:41 AM, Nick Wellnhofer wrote:
On May 15, 2018, at 21:56 , Stefan Sauer <ensonic hora-obscura de> wrote:
On 05/15/2018 08:40 PM, Stefan Sauer wrote:
On 05/15/2018 12:42 PM, Nick Wellnhofer wrote:
Can you try to change the line to

    xmlCtxtUseOptions(pctxt, ctxt->parseFlags);

and see if it helps?

It does not help. I'll experiment further. Thanks for the recommendations.
I think you also have to remove the line at

    pctxt->loadsubset |= XML_DETECT_IDS;

Looks like the idea is to make sure that ID attributes are detected for XIncludes with XPointers. IMO, it 
should be the application's responsibility to set the XML_PARSE_DTDLOAD flag in this case. But changing the 
behavior might break code that relies on this feature.
This helps!

LD_LIBRARY_PATH=~/debug/lib ~/debug/bin/xmllint --timing --xinclude
--nonet --noent --noout glib-docs.xml
Parsing took 0 ms
Xinclude processing took 179 ms
Freeing took 17 ms

So one solution could be another flag to enable this?
Is libxml2 doing that for each file over and over?
Actually easy to confirm using --load-trace:
Wouldn't it make sense to only load each dtd once?
This would make sense.

And where exactly is it loaded? (I can only
see xmlFreeDtd, but can't find an xmlLoadDtd or the like.)
Via xmlParseDocument -> xmlSAX2ExternalSubset -> xmlParseExternalSubset.
Thanks, reading the code. Need to figure out where we could cache external
subsets and what a suitable key is (ExternalID?).
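
A minimal sketch of what such a cache could look like, in plain C and
independent of libxml2 internals. It assumes the key is the ExternalID, with
the SystemID as a fallback when no ExternalID is given; the thread leaves the
key choice open. The names parsed_dtd, dtd_cache, dtd_cache_get and demo_load
are hypothetical and not libxml2 API.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a parsed external subset; not a libxml2 type. */
typedef struct {
    char *external_id; /* public identifier, may be NULL */
    char *system_id;   /* system identifier, may be NULL */
} parsed_dtd;

/* Callback that does the expensive work of loading and parsing a DTD. */
typedef parsed_dtd *(*dtd_loader)(const char *external_id, const char *system_id);

typedef struct cache_entry {
    char *key;
    parsed_dtd *dtd;
    struct cache_entry *next;
} cache_entry;

typedef struct {
    cache_entry *head; /* singly linked list; enough for a handful of DTDs */
    dtd_loader load;
    int loads;         /* number of times the loader actually ran */
} dtd_cache;

static char *dup_str(const char *s) {
    if (s == NULL) return NULL;
    size_t n = strlen(s) + 1;
    char *copy = malloc(n);
    if (copy != NULL) memcpy(copy, s, n);
    return copy;
}

static void free_dtd(parsed_dtd *dtd) {
    if (dtd == NULL) return;
    free(dtd->external_id);
    free(dtd->system_id);
    free(dtd);
}

/* Assumed key choice: ExternalID when present, otherwise SystemID. */
static const char *cache_key(const char *external_id, const char *system_id) {
    if (external_id != NULL && external_id[0] != '\0') return external_id;
    return system_id;
}

/* Return the cached subset for this key, loading it only on the first request. */
static parsed_dtd *dtd_cache_get(dtd_cache *cache, const char *external_id,
                                 const char *system_id) {
    assert(cache != NULL && cache->load != NULL);
    const char *key = cache_key(external_id, system_id);
    if (key == NULL) return NULL;
    for (cache_entry *e = cache->head; e != NULL; e = e->next)
        if (strcmp(e->key, key) == 0) return e->dtd; /* cache hit */

    cache_entry *entry = malloc(sizeof *entry);
    char *key_copy = dup_str(key);
    parsed_dtd *dtd = (entry != NULL && key_copy != NULL)
                          ? cache->load(external_id, system_id) : NULL;
    if (dtd == NULL) { free(entry); free(key_copy); return NULL; }
    cache->loads++;
    entry->key = key_copy;
    entry->dtd = dtd;
    entry->next = cache->head;
    cache->head = entry;
    return dtd;
}

/* The cache owns every subset it returned; release them all at once. */
static void dtd_cache_free(dtd_cache *cache) {
    while (cache->head != NULL) {
        cache_entry *next = cache->head->next;
        free(cache->head->key);
        free_dtd(cache->head->dtd);
        free(cache->head);
        cache->head = next;
    }
}

/* Demo loader: records the identifiers instead of parsing a real DTD. */
static parsed_dtd *demo_load(const char *external_id, const char *system_id) {
    parsed_dtd *dtd = malloc(sizeof *dtd);
    if (dtd == NULL) return NULL;
    dtd->external_id = dup_str(external_id);
    dtd->system_id = dup_str(system_id);
    return dtd;
}
```

With a cache initialised as { NULL, demo_load, 0 }, two dtd_cache_get calls
with the same ExternalID return the same pointer and leave loads at 1. In this
sketch the cache owns the subsets. A real implementation would also have to
settle ownership, since each included document currently frees its own DTD.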


