On 27/06/19 14:15 +0200, Nikolai Weibull via xml wrote:
The following RELAX NG schema requires exponential running time when matching attributes (15 attributes is where my computer begins to show the symptoms, this may vary with your set-up): a.rng: <?xml version="1.0" encoding="UTF-8"?> <grammar xmlns="http://relaxng.org/ns/structure/1.0"> <start> <element> <name>a</name> <group> <optional><attribute><name ns="">a</name><text/></attribute></optional> <optional><attribute><name ns="">b</name><text/></attribute></optional> <optional><attribute><name ns="">c</name><text/></attribute></optional> <optional><attribute><name ns="">d</name><text/></attribute></optional> <optional><attribute><name ns="">e</name><text/></attribute></optional> <optional><attribute><name ns="">f</name><text/></attribute></optional> <optional><attribute><name ns="">g</name><text/></attribute></optional> <optional><attribute><name ns="">h</name><text/></attribute></optional> <optional><attribute><name ns="">i</name><text/></attribute></optional> <optional><attribute><name ns="">j</name><text/></attribute></optional> <optional><attribute><name ns="">k</name><text/></attribute></optional> <optional><attribute><name ns="">l</name><text/></attribute></optional> <optional><attribute><name ns="">m</name><text/></attribute></optional> <optional><attribute><name ns="">n</name><text/></attribute></optional> <optional><attribute><name ns="">o</name><text/></attribute></optional> </group> </element> </start> </grammar> a.xml: <?xml version="1.0" encoding="UTF-8"?> <a a="1" b="2" c="3" d="4" e="5" f="6" g="7" h="8" i="9" j="10" k="11" l="12" m="13" n="14" o="15"/> % time xmllint --noout --relaxng a.rng a.xml real: 3.175, user: 3.147, system: 0.014 (99%) Changing the group to an interleave, that is, b.rng: <?xml version="1.0" encoding="UTF-8"?> <grammar xmlns="http://relaxng.org/ns/structure/1.0"> <start> <element> <name>a</name> <interleave> <optional><attribute><name ns="">a</name><text/></attribute></optional> <optional><attribute><name ns="">b</name><text/></attribute></optional> <optional><attribute><name ns="">c</name><text/></attribute></optional> <optional><attribute><name ns="">d</name><text/></attribute></optional> <optional><attribute><name ns="">e</name><text/></attribute></optional> <optional><attribute><name ns="">f</name><text/></attribute></optional> <optional><attribute><name ns="">g</name><text/></attribute></optional> <optional><attribute><name ns="">h</name><text/></attribute></optional> <optional><attribute><name ns="">i</name><text/></attribute></optional> <optional><attribute><name ns="">j</name><text/></attribute></optional> <optional><attribute><name ns="">k</name><text/></attribute></optional> <optional><attribute><name ns="">l</name><text/></attribute></optional> <optional><attribute><name ns="">m</name><text/></attribute></optional> <optional><attribute><name ns="">n</name><text/></attribute></optional> <optional><attribute><name ns="">o</name><text/></attribute></optional> </interleave> </element> </start> </grammar> gives % time xmllint --noout --relaxng b.rng a.xml real: 0.008, user: 0.003, system: 0.002 (54%) Making the attributes required instead of optional also fixes the issue: c.rng: <?xml version="1.0" encoding="UTF-8"?> <grammar xmlns="http://relaxng.org/ns/structure/1.0"> <start> <element> <name>a</name> <group> <attribute><name ns="">a</name><text/></attribute> <attribute><name ns="">b</name><text/></attribute> <attribute><name ns="">c</name><text/></attribute> <attribute><name ns="">d</name><text/></attribute> <attribute><name ns="">e</name><text/></attribute> <attribute><name ns="">f</name><text/></attribute> <attribute><name ns="">g</name><text/></attribute> <attribute><name ns="">h</name><text/></attribute> <attribute><name ns="">i</name><text/></attribute> <attribute><name ns="">j</name><text/></attribute> <attribute><name ns="">k</name><text/></attribute> <attribute><name ns="">l</name><text/></attribute> <attribute><name ns="">m</name><text/></attribute> <attribute><name ns="">n</name><text/></attribute> <attribute><name ns="">o</name><text/></attribute> </group> </element> </start> </grammar> % time xmllint --noout --relaxng c.rng a.xml real: 0.008, user: 0.003, system: 0.002 (55%) The problem seems to be that xmlRelaxNGValidateDefinition keeps adding states that I feel don’t need checking, but I can’t figure out how to avoid this from happening.
I can only nod to your analysis, ran into similar observations in the past, and they were likewise XML attributes (that parameters to fence device instances boil down to) related: https://bugzilla.redhat.com/show_bug.cgi?id=1224378#c7 Callgrind proved to be a useful tool to get better quantitative numbers.
Daniel, do you have any input on this? I’d gladly work on fixing this, but I feel that I need some guidance as to what’s going wrong and perhaps also how to fix it.
Would be nice to get libxml's RelaxNG validation support closer to parity with jing (the only other freely available validator I am aware of that can be directly used for a peer comparison), for sure. -- Jan (Poki)
Attachment:
pgpW57Mouuq2a.pgp
Description: PGP signature