Re: [xml] Exponential running time in RELAX NG matching of optional attributes



On 27/06/19 14:15 +0200, Nikolai Weibull via xml wrote:
The following RELAX NG schema requires exponential running time when
matching attributes (15 attributes is where my computer begins to
show the symptoms, this may vary with your set-up):

a.rng:

<?xml version="1.0" encoding="UTF-8"?>
<grammar xmlns="http://relaxng.org/ns/structure/1.0";>
  <start>
    <element>
      <name>a</name>
      <group>
        <optional><attribute><name ns="">a</name><text/></attribute></optional>
        <optional><attribute><name ns="">b</name><text/></attribute></optional>
        <optional><attribute><name ns="">c</name><text/></attribute></optional>
        <optional><attribute><name ns="">d</name><text/></attribute></optional>
        <optional><attribute><name ns="">e</name><text/></attribute></optional>
        <optional><attribute><name ns="">f</name><text/></attribute></optional>
        <optional><attribute><name ns="">g</name><text/></attribute></optional>
        <optional><attribute><name ns="">h</name><text/></attribute></optional>
        <optional><attribute><name ns="">i</name><text/></attribute></optional>
        <optional><attribute><name ns="">j</name><text/></attribute></optional>
        <optional><attribute><name ns="">k</name><text/></attribute></optional>
        <optional><attribute><name ns="">l</name><text/></attribute></optional>
        <optional><attribute><name ns="">m</name><text/></attribute></optional>
        <optional><attribute><name ns="">n</name><text/></attribute></optional>
        <optional><attribute><name ns="">o</name><text/></attribute></optional>
      </group>
    </element>
  </start>
</grammar>

a.xml:

<?xml version="1.0" encoding="UTF-8"?>
<a a="1" b="2" c="3" d="4" e="5" f="6" g="7" h="8" i="9" j="10" k="11" l="12" m="13" n="14" o="15"/>

% time xmllint --noout --relaxng a.rng a.xml
real: 3.175, user: 3.147, system: 0.014 (99%)

Changing the group to an interleave, that is,

b.rng:

<?xml version="1.0" encoding="UTF-8"?>
<grammar xmlns="http://relaxng.org/ns/structure/1.0";>
  <start>
    <element>
      <name>a</name>
      <interleave>
        <optional><attribute><name ns="">a</name><text/></attribute></optional>
        <optional><attribute><name ns="">b</name><text/></attribute></optional>
        <optional><attribute><name ns="">c</name><text/></attribute></optional>
        <optional><attribute><name ns="">d</name><text/></attribute></optional>
        <optional><attribute><name ns="">e</name><text/></attribute></optional>
        <optional><attribute><name ns="">f</name><text/></attribute></optional>
        <optional><attribute><name ns="">g</name><text/></attribute></optional>
        <optional><attribute><name ns="">h</name><text/></attribute></optional>
        <optional><attribute><name ns="">i</name><text/></attribute></optional>
        <optional><attribute><name ns="">j</name><text/></attribute></optional>
        <optional><attribute><name ns="">k</name><text/></attribute></optional>
        <optional><attribute><name ns="">l</name><text/></attribute></optional>
        <optional><attribute><name ns="">m</name><text/></attribute></optional>
        <optional><attribute><name ns="">n</name><text/></attribute></optional>
        <optional><attribute><name ns="">o</name><text/></attribute></optional>
      </interleave>
    </element>
  </start>
</grammar>

gives

% time xmllint --noout --relaxng b.rng a.xml
real: 0.008, user: 0.003, system: 0.002 (54%)

Making the attributes required instead of optional also fixes the issue:

c.rng:

<?xml version="1.0" encoding="UTF-8"?>
<grammar xmlns="http://relaxng.org/ns/structure/1.0";>
  <start>
    <element>
      <name>a</name>
      <group>
        <attribute><name ns="">a</name><text/></attribute>
        <attribute><name ns="">b</name><text/></attribute>
        <attribute><name ns="">c</name><text/></attribute>
        <attribute><name ns="">d</name><text/></attribute>
        <attribute><name ns="">e</name><text/></attribute>
        <attribute><name ns="">f</name><text/></attribute>
        <attribute><name ns="">g</name><text/></attribute>
        <attribute><name ns="">h</name><text/></attribute>
        <attribute><name ns="">i</name><text/></attribute>
        <attribute><name ns="">j</name><text/></attribute>
        <attribute><name ns="">k</name><text/></attribute>
        <attribute><name ns="">l</name><text/></attribute>
        <attribute><name ns="">m</name><text/></attribute>
        <attribute><name ns="">n</name><text/></attribute>
        <attribute><name ns="">o</name><text/></attribute>
      </group>
    </element>
  </start>
</grammar>

% time xmllint --noout --relaxng c.rng a.xml
real: 0.008, user: 0.003, system: 0.002 (55%)

The problem seems to be that xmlRelaxNGValidateDefinition keeps
adding states that I feel don’t need checking, but I can’t figure
out how to avoid this from happening.

I can only nod to your analysis, ran into similar observations
in the past, and they were likewise XML attributes (that parameters to
fence device instances boil down to) related:

https://bugzilla.redhat.com/show_bug.cgi?id=1224378#c7

Callgrind proved to be a useful tool to get better quantitative numbers.

Daniel, do you have any input on this?  I’d gladly work on fixing
this, but I feel that I need some guidance as to what’s going wrong
and perhaps also how to fix it.

Would be nice to get libxml's RelaxNG validation support closer
to parity with jing (the only other freely available validator I am
aware of that can be directly used for a peer comparison), for sure.

-- 
Jan (Poki)

Attachment: pgpW57Mouuq2a.pgp
Description: PGP signature



[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]