Re: [xml] performance of parsing docbook with xincludes
- From: Stefan Sauer <ensonic hora-obscura de>
- To: xml gnome org
- Subject: Re: [xml] performance of parsing docbook with xincludes
- Date: Thu, 7 Jun 2018 00:00:37 +0200
On 05/17/2018 06:01 PM, Stefan Sauer wrote:
On 05/17/2018 04:18 PM, Nick Wellnhofer wrote:
On 16/05/2018 21:51, Stefan Sauer wrote:
So one solution could be another flag to enable this?
Yes, but it would be rather ugly.
In which sense? I guess because it is something that no one should need
to know about or have to care about?
Thanks, I'm reading the code. I need to figure out where we could cache
external subsets and what a suitable key would be (ExternalID?).
Note that I'm currently not planning to review and integrate larger
patches from other developers. I only took over some libxml2
maintenance duties because no one else did. So even if you write a
high-quality patch, it might never get merged.
Thanks for making this clear upfront. This is how I ended up becoming
the gtkdoc maintainer :)
Caching external subsets for XIncludes certainly sounds like a nice
feature but I would prefer to find a simpler solution. For example,
can't you just omit the external DTD from included documents?
Yeah, right now, the benefit of having the DTD is that one can validate
fragments. I'll do some research (aka grepping over existing projects)
to see what the doctype headers in use today look like. If all that
people do is use an entity to inject the version, I'll write a
migration tool.
We have a test that validates the doc, but I think I can change this to
just resolve all xincludes and check through the top-level doctype.
Just to add to this, I am assuming a lot of people follow this book
http://www.sagehill.net/docbookxsl/ModularDoc.html#UsingXinclude
and using a DOCTYPE is part of the examples.
You wrote:
and gtk-doc will replicate this for the fragments (replacing 'book' with
e.g. 'refentry'). This way one can e.g. inject things like a version.
What do you mean by "inject things like a version"? Why exactly do
your included documents have to reference an external DTD?
The documentation consists of a handwritten master doc (type book) that
includes more handwritten parts (e.g. tutorials, guides) as well as
generated reference docs. When gtkdoc generates the reference docs, it
takes the doctype header of the master doc as a template and
uses that for the generated reference docs. If the master doc has
entities declared, those can be expanded in the reference fragments.
That's the part where I will check how widely it is actually used.
Stefan
Another idea is to stop loading external DTDs for XIncludes without an
XPointer expression. This would still change the behavior for some
users but it's much less likely to cause problems.
Change the behaviour, as in we would no longer catch validation errors?
Too bad that xmlXIncludeParseFile() does not get the parent parserCtx,
in that case we could apply the same flags.
Nick
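To make the idea above concrete, here is a minimal sketch of how the parse flags for an included document might be derived if the parent's flags were available. Everything in it is an assumption for illustration: include_parse_flags() is a hypothetical helper, not part of libxml2, and SKETCH_PARSE_DTDLOAD is a local stand-in that mirrors the bit value of XML_PARSE_DTDLOAD so the snippet compiles without the libxml2 headers.

```c
#include <stddef.h>

/* Local stand-in mirroring the bit used by libxml2's XML_PARSE_DTDLOAD.
 * Defined here only so the sketch is self-contained. */
enum { SKETCH_PARSE_DTDLOAD = 1 << 2 };

/* Hypothetical helper: start from the parent's parse flags and drop
 * external DTD loading when the include carries no XPointer expression.
 * An include with an XPointer keeps the parent's flags unchanged. */
int include_parse_flags(int parent_flags, const char *xpointer)
{
    if (xpointer == NULL || xpointer[0] == '\0')
        return parent_flags & ~SKETCH_PARSE_DTDLOAD;
    return parent_flags;
}
```

Whether validation errors in the fragments would still be caught depends on validating the merged document afterwards, which this sketch does not address.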
I definitely don't know enough about the implications here. I was mostly
thinking to see if we can stick a dictionary of <dtd-identifier,
xmlDtdPtr> into the parser context and, before actually loading a DTD,
check whether we already loaded it and reuse it. Somehow the dict needs
to be stored in the top-level doc once parsing is done (do we need the
DTDs once the doc has been parsed?). We would only free the DTDs with the
top-level doc. But I agree, it is not going to be a two-liner.
It seems that xmldict only handles strings as keys and values,
right? So we'd even need our own cache data structure. I'd say it
would need to live at the _xmlXIncludeCtxt level. A global is easier, but
then we can't ever free it :/
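A rough sketch of what such a cache could look like, written in plain C so it stands alone. All names here (DtdCache, dtd_cache_new, dtd_cache_lookup, dtd_cache_add, dtd_cache_free) are hypothetical, and the void pointer merely stands in for an xmlDtdPtr. The key is assumed to be the ExternalID or SystemID string. libxml2's own hash module (xmlHashCreate, xmlHashAddEntry, xmlHashLookup, xmlHashFree) stores void pointers under string keys and might serve the same purpose; the linked list below is only meant to show the ownership model.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical cache entry: DTD identifier -> opaque DTD pointer. */
typedef struct DtdCacheEntry {
    char *key;                    /* private copy of the identifier */
    void *dtd;                    /* stands in for an xmlDtdPtr */
    struct DtdCacheEntry *next;
} DtdCacheEntry;

typedef struct DtdCache {
    DtdCacheEntry *head;
    void (*free_dtd)(void *dtd);  /* called once per entry on teardown */
} DtdCache;

DtdCache *dtd_cache_new(void (*free_dtd)(void *dtd))
{
    DtdCache *cache = calloc(1, sizeof(*cache));
    if (cache != NULL)
        cache->free_dtd = free_dtd;
    return cache;
}

/* Returns the cached DTD for key, or NULL if it was not loaded yet. */
void *dtd_cache_lookup(const DtdCache *cache, const char *key)
{
    const DtdCacheEntry *e;
    for (e = cache->head; e != NULL; e = e->next)
        if (strcmp(e->key, key) == 0)
            return e->dtd;
    return NULL;
}

/* Returns 0 on success, -1 on duplicate key or allocation failure. */
int dtd_cache_add(DtdCache *cache, const char *key, void *dtd)
{
    DtdCacheEntry *e;
    size_t len = strlen(key) + 1;

    if (dtd_cache_lookup(cache, key) != NULL)
        return -1;
    e = malloc(sizeof(*e));
    if (e == NULL)
        return -1;
    e->key = malloc(len);
    if (e->key == NULL) {
        free(e);
        return -1;
    }
    memcpy(e->key, key, len);
    e->dtd = dtd;
    e->next = cache->head;
    cache->head = e;
    return 0;
}

/* Frees the cache and, via free_dtd, every DTD it owns. Intended to be
 * called together with the owning context, never per included file. */
void dtd_cache_free(DtdCache *cache)
{
    DtdCacheEntry *e;
    if (cache == NULL)
        return;
    e = cache->head;
    while (e != NULL) {
        DtdCacheEntry *next = e->next;
        if (cache->free_dtd != NULL)
            cache->free_dtd(e->dtd);
        free(e->key);
        free(e);
        e = next;
    }
    free(cache);
}
```

Tying the cache lifetime to the XInclude context (rather than a global) is what makes the final dtd_cache_free() call possible; the open question from above, whether the DTDs must outlive parsing, decides where that call has to go.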
Stefan