Re: [xml] Problem with data in interleave in RELAX NG validation



Nikolai - 
Glad to see someone attacking these. I've got some RNG schema that we've
had to use jing to validate, since libxml2 was giving similar issues to
what you're seeing, and I was even more daunted by the code than you
seem to be. If you could get this branch somewhere I can pull it down,
I'd love to see if your fixes help my schema.

Ross


On Sun, Oct 14, 2018 at 09:02:29PM +0200, Nikolai Weibull via xml wrote:
Hi!

OK, I managed to decode it somewhat.  The issue seems to be that we build
groups of what can be matched by the interleave, but that these groups don’t
include data, list, and value elements, only element and text elements.
This patch extends xmlRelaxNGGetElements so that it can return these
elements for us in xmlRelaxNGComputeInterleaves.  Then we make sure to
update xmlRelaxNGNodeMatchesList as well so that it accepts the correct
types.
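
To make the three eora modes concrete, here is a small stand-alone model of
the selection test in the patched loop. It is only a sketch: the Kind enum,
selects() and check_modes() are made-up names for illustration and are not
part of libxml2; only the mode numbers and the pattern kinds are taken from
the patch below.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the XML_RELAXNG_* types named in the patch. */
typedef enum {
    KIND_ELEMENT,
    KIND_TEXT,
    KIND_ATTRIBUTE,
    KIND_DATATYPE,
    KIND_LIST,
    KIND_VALUE,
    KIND_OTHER
} Kind;

/* Model of the condition inside the while loop:
 *   eora 0: element, text
 *   eora 1: attribute
 *   eora 2: element, text, data, list, value
 * Any other mode selects nothing. */
static bool selects(int eora, Kind kind)
{
    switch (eora) {
    case 0:
        return kind == KIND_ELEMENT || kind == KIND_TEXT;
    case 1:
        return kind == KIND_ATTRIBUTE;
    case 2:
        return kind == KIND_ELEMENT || kind == KIND_TEXT ||
               kind == KIND_DATATYPE || kind == KIND_LIST ||
               kind == KIND_VALUE;
    default:
        return false;
    }
}

/* Mode 0 never collects a data pattern, mode 2 does;
 * attributes stay in mode 1 only. */
static void check_modes(void)
{
    assert(!selects(0, KIND_DATATYPE));
    assert(selects(2, KIND_DATATYPE));
    assert(selects(1, KIND_ATTRIBUTE));
    assert(!selects(2, KIND_ATTRIBUTE));
}
```

Calling check_modes() from a main() runs the checks; in this model, switching
the interleave groups from mode 0 to mode 2 is what makes the data pattern
visible to the group at all.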

The testsuite passes and my test below does as well.

I’m a bit surprised that interleaves simply wouldn’t allow for data, list,
and value elements previously, so I’m wondering if there was a reason for
the code to be the way it was and that the fix should be placed somewhere
else or if it was simply an oversight.  Either way, this does seem to be the
correct solution. If someone could confirm that this solution is what we’re
looking for, I’ll add some proper test cases and submit another merge request
on git.gnome.org.

Best regards,
 Nikolai

---
relaxng.c | 16 ++++++++++++----
1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/relaxng.c b/relaxng.c
index 8306e546..3ed03ff4 100644
--- a/relaxng.c
+++ b/relaxng.c
@@ -3993,7 +3993,7 @@ xmlRelaxNGGenerateAttributes(xmlRelaxNGParserCtxtPtr ctxt,
 * xmlRelaxNGGetElements:
 * @ctxt:  a Relax-NG parser context
 * @def:  the definition definition
- * @eora:  gather elements (0) or attributes (1)
+ * @eora:  gather elements (0), attributes (1) or elements and text (2)
 *
 * Compute the list of top elements a definition can generate
 *
@@ -4019,7 +4019,12 @@ xmlRelaxNGGetElements(xmlRelaxNGParserCtxtPtr ctxt,
    while (cur != NULL) {
        if (((eora == 0) && ((cur->type == XML_RELAXNG_ELEMENT) ||
                             (cur->type == XML_RELAXNG_TEXT))) ||
-            ((eora == 1) && (cur->type == XML_RELAXNG_ATTRIBUTE))) {
+            ((eora == 1) && (cur->type == XML_RELAXNG_ATTRIBUTE)) ||
+            ((eora == 2) && ((cur->type == XML_RELAXNG_DATATYPE) ||
+                             (cur->type == XML_RELAXNG_ELEMENT) ||
+                             (cur->type == XML_RELAXNG_LIST) ||
+                             (cur->type == XML_RELAXNG_TEXT) ||
+                             (cur->type == XML_RELAXNG_VALUE)))) {
            if (ret == NULL) {
                max = 10;
                ret = (xmlRelaxNGDefinePtr *)
@@ -4374,7 +4379,7 @@ xmlRelaxNGComputeInterleaves(void *payload, void *data,
        if (cur->type == XML_RELAXNG_TEXT)
            is_mixed++;
        groups[nbgroups]->rule = cur;
-        groups[nbgroups]->defs = xmlRelaxNGGetElements(ctxt, cur, 0);
+        groups[nbgroups]->defs = xmlRelaxNGGetElements(ctxt, cur, 2);
        groups[nbgroups]->attrs = xmlRelaxNGGetElements(ctxt, cur, 1);
        nbgroups++;
        cur = cur->next;
@@ -9280,7 +9285,10 @@ xmlRelaxNGNodeMatchesList(xmlNodePtr node, xmlRelaxNGDefinePtr * list)
                return (1);
        } else if (((node->type == XML_TEXT_NODE) ||
                    (node->type == XML_CDATA_SECTION_NODE)) &&
-                   (cur->type == XML_RELAXNG_TEXT)) {
+                   ((cur->type == XML_RELAXNG_DATATYPE) ||
+                    (cur->type == XML_RELAXNG_LIST) ||
+                    (cur->type == XML_RELAXNG_TEXT) ||
+                    (cur->type == XML_RELAXNG_VALUE))) {
            return (1);
        }
        cur = list[i++];
-- 
2.19.1


Nikolai Weibull, 2018-10-13 00:23:

Hi!

This remains unfixed.  I have absolutely no idea what’s going on in
the interleave validation code.  Daniel, could you please put together
some minor documentation on how the interleave validation code works?
It’s very complicated.

Thank you,

Nikolai

Nikolai Weibull, 2018-09-09 21:26:

Hi!

Given the following input RELAX NG grammar:

<grammar xmlns="http://relaxng.org/ns/structure/1.0"
       datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes">
<start>
  <element name="a">
    <interleave>
      <attribute name="b"/>
      <data type="string"/>
    </interleave>
  </element>
</start>
</grammar>

and the following input document a.xml:

<a b="1">c</a>

xmllint reports:

a.xml:1: element a: Relax-NG validity error : Element a has extra content: text
a.xml fails to validate

Changing the interleave to a group solves the issue, so the problem is
with how interleaves are validated.
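
One way to picture the failure, going by the diagnosis and patch posted later
in this thread: a text child is only claimed by an interleave group if some
pattern in that group's list accepts text. The toy model below is an
assumption-laden sketch, not libxml2 code; PatKind, text_matches_group() and
check_example() are hypothetical names, and the strict flag simply switches
between the old rule (only a text pattern accepts a text node) and the
patched rule (data, list and value accept it too).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical pattern kinds; they mirror the kinds named in the patch. */
typedef enum { PAT_ELEMENT, PAT_TEXT, PAT_DATA, PAT_LIST, PAT_VALUE } PatKind;

/* Does a text node match any pattern in one group's list?
 * strict = true  models the old rule: only PAT_TEXT accepts text.
 * strict = false models the patched rule: data, list and value do too. */
static bool text_matches_group(const PatKind *pats, size_t n, bool strict)
{
    for (size_t i = 0; i < n; i++) {
        if (pats[i] == PAT_TEXT)
            return true;
        if (!strict && (pats[i] == PAT_DATA || pats[i] == PAT_LIST ||
                        pats[i] == PAT_VALUE))
            return true;
    }
    return false;
}

/* The grammar above: the only non-attribute branch of the interleave
 * is <data type="string"/>. */
static void check_example(void)
{
    const PatKind group[] = { PAT_DATA };

    /* Old rule: nothing claims the text "c", so it is left over
     * and reported as extra content. */
    assert(!text_matches_group(group, 1, true));

    /* Patched rule: the data pattern claims the text node. */
    assert(text_matches_group(group, 1, false));
}
```

In this model a group does not have that problem simply because no per-group
list is consulted for it, which would be consistent with the observation that
switching to a group makes the document validate.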

I looked at xmlRelaxNGValidateInterleave() and I sadly have no idea
what’s going on.  Please point me in the right direction and I’ll
gladly write a patch.

Nikolai

_______________________________________________
xml mailing list, project page  http://xmlsoft.org/
xml gnome org
https://mail.gnome.org/mailman/listinfo/xml

-- 
Ross Reedstrom, Ph.D.                                 reedstrm rice edu
Senior Developer         https://cnx.org            phone: 713-348-6166
OpenStax                 https://openstax.org         fax: 713-348-3665
Rice University MS-375, Houston, TX 77005
GPG Key fingerprint = F023 82C8 9B0E 2CC6 0D8E  F888 D3AE 810E 88F0 BEDE

