Re: [xml] Problem with data in interleave in RELAX NG validation



On Sun, Oct 14, 2018 at 09:02:29PM +0200, Nikolai Weibull via xml wrote:
Hi!

OK, I managed to decode it somewhat.  The issue seems to be that we build
groups of what can be matched by the interleave, but that these groups don’t
include data, list, and value elements, only element and text elements.

  This may be an oversight when going from the RNG data model to the
libxml2 internal structures reflecting it. Among the mismatches
are also TEXT vs CDATA, which are separate in the libxml2 model and unified as
simple text nodes in the RelaxNG one.

This patch extends xmlRelaxNGGetElements so that it can return these
elements for us in xmlRelaxNGComputeInterleaves.  Then we make sure to
update xmlRelaxNGNodeMatchesList as well so that it accepts the correct
types.

  That sounds reasonable

The testsuite passes and my test below does as well.

I’m a bit surprised that interleaves simply wouldn’t allow for data, list,
and value elements previously, so I’m wondering if there was a reason for
the code to be the way it was and that the fix should be placed somewhere
else or if it was simply an oversight.  Either way, this does seem to be the

  I honestly can't remember!

correct solution. If someone could confirm that this solution is what we’re
looking for, I’ll add some proper test cases and apply another merge request
on git.gnome.org.

  This indeed seems to be working. Would you mind sending patches to add
regression tests for this? That way I can incorporate them into an upcoming
release.

   thanks a lot !

Daniel

Best regards,
 Nikolai

---
relaxng.c | 16 ++++++++++++----
1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/relaxng.c b/relaxng.c
index 8306e546..3ed03ff4 100644
--- a/relaxng.c
+++ b/relaxng.c
@@ -3993,7 +3993,7 @@ xmlRelaxNGGenerateAttributes(xmlRelaxNGParserCtxtPtr ctxt,
 * xmlRelaxNGGetElements:
 * @ctxt:  a Relax-NG parser context
 * @def:  the definition definition
- * @eora:  gather elements (0) or attributes (1)
+ * @eora:  gather elements (0), attributes (1) or elements and text (2)
 *
 * Compute the list of top elements a definition can generate
 *
@@ -4019,7 +4019,12 @@ xmlRelaxNGGetElements(xmlRelaxNGParserCtxtPtr ctxt,
    while (cur != NULL) {
        if (((eora == 0) && ((cur->type == XML_RELAXNG_ELEMENT) ||
                             (cur->type == XML_RELAXNG_TEXT))) ||
-            ((eora == 1) && (cur->type == XML_RELAXNG_ATTRIBUTE))) {
+            ((eora == 1) && (cur->type == XML_RELAXNG_ATTRIBUTE)) ||
+            ((eora == 2) && ((cur->type == XML_RELAXNG_DATATYPE) ||
+                             (cur->type == XML_RELAXNG_ELEMENT) ||
+                             (cur->type == XML_RELAXNG_LIST) ||
+                             (cur->type == XML_RELAXNG_TEXT) ||
+                             (cur->type == XML_RELAXNG_VALUE)))) {
            if (ret == NULL) {
                max = 10;
                ret = (xmlRelaxNGDefinePtr *)
@@ -4374,7 +4379,7 @@ xmlRelaxNGComputeInterleaves(void *payload, void *data,
        if (cur->type == XML_RELAXNG_TEXT)
            is_mixed++;
        groups[nbgroups]->rule = cur;
-        groups[nbgroups]->defs = xmlRelaxNGGetElements(ctxt, cur, 0);
+        groups[nbgroups]->defs = xmlRelaxNGGetElements(ctxt, cur, 2);
        groups[nbgroups]->attrs = xmlRelaxNGGetElements(ctxt, cur, 1);
        nbgroups++;
        cur = cur->next;
@@ -9280,7 +9285,10 @@ xmlRelaxNGNodeMatchesList(xmlNodePtr node, xmlRelaxNGDefinePtr * list)
                return (1);
        } else if (((node->type == XML_TEXT_NODE) ||
                    (node->type == XML_CDATA_SECTION_NODE)) &&
-                   (cur->type == XML_RELAXNG_TEXT)) {
+                   ((cur->type == XML_RELAXNG_DATATYPE) ||
+                    (cur->type == XML_RELAXNG_LIST) ||
+                    (cur->type == XML_RELAXNG_TEXT) ||
+                    (cur->type == XML_RELAXNG_VALUE))) {
            return (1);
        }
        cur = list[i++];
-- 
2.19.1
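
For readers who do not know relaxng.c, the effect of the patch can be modelled in a few lines of standalone C. This is only a sketch under loud assumptions: every `sk_` name below is hypothetical and merely stands in for the libxml2 internals (the `XML_RELAXNG_*` definition types, the `eora` argument of xmlRelaxNGGetElements, and the text branch of xmlRelaxNGNodeMatchesList). It does not reproduce the real control flow of the interleave validation. Attributes are collected separately (mode 1), so only the `<data>` branch of the reported grammar is modelled.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for libxml2's internal XML_RELAXNG_* types. */
typedef enum {
    SK_ELEMENT, SK_TEXT, SK_DATATYPE, SK_LIST, SK_VALUE, SK_ATTRIBUTE
} sk_def_type;

/* Hypothetical stand-ins for the document node types involved. */
typedef enum { SK_NODE_ELEMENT, SK_NODE_TEXT, SK_NODE_CDATA } sk_node_type;

/* Which definition types a group gathers for a given eora mode:
 * 0 = elements and text, 1 = attributes,
 * 2 = elements, text, data, list and value (the mode added by the patch). */
int sk_collects(int eora, sk_def_type t)
{
    switch (eora) {
    case 0:
        return t == SK_ELEMENT || t == SK_TEXT;
    case 1:
        return t == SK_ATTRIBUTE;
    case 2:
        return t == SK_DATATYPE || t == SK_ELEMENT || t == SK_LIST ||
               t == SK_TEXT || t == SK_VALUE;
    default:
        return 0;
    }
}

/* Copy the definitions of one interleave branch that the mode gathers. */
size_t sk_collect(int eora, const sk_def_type *branch, size_t n,
                  sk_def_type *out)
{
    size_t i, kept = 0;

    assert(branch != NULL && out != NULL);
    for (i = 0; i < n; i++)
        if (sk_collects(eora, branch[i]))
            out[kept++] = branch[i];
    return kept;
}

/* Does a text or CDATA node match any definition in a group?
 * Without the patch only SK_TEXT matches; with it, data, list and
 * value definitions match as well. */
int sk_text_matches(sk_node_type node, const sk_def_type *defs, size_t n,
                    int patched)
{
    size_t i;

    if (node != SK_NODE_TEXT && node != SK_NODE_CDATA)
        return 0;
    for (i = 0; i < n; i++) {
        if (defs[i] == SK_TEXT)
            return 1;
        if (patched && (defs[i] == SK_DATATYPE || defs[i] == SK_LIST ||
                        defs[i] == SK_VALUE))
            return 1;
    }
    return 0;
}

/* Model of the reported case: the <data type="string"/> branch of the
 * interleave, checked against the text node "c". */
int sk_report_case_accepts_text(int patched)
{
    const sk_def_type data_branch[] = { SK_DATATYPE };
    sk_def_type defs[1] = { SK_ELEMENT };
    size_t n = sk_collect(patched ? 2 : 0, data_branch, 1, defs);

    return sk_text_matches(SK_NODE_TEXT, defs, n, patched);
}
```

In this model, `sk_report_case_accepts_text(0)` returns 0 because the group built for the data branch is empty, which corresponds to the "extra content: text" error in the report. `sk_report_case_accepts_text(1)` returns 1 because the group now contains the data definition and the text node is allowed to match it.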


Nikolai Weibull, 2018-10-13 00:23:

Hi!

This remains unfixed.  I have absolutely no idea what’s going on in
the interleave validation code.  Daniel, could you please put together
some minor documentation on how the interleave validation code works?
It’s very complicated.

Thank you,

 Nikolai

Nikolai Weibull, 2018-09-09 21:26:

Hi!

Given the following input RELAX NG grammar:

<grammar xmlns="http://relaxng.org/ns/structure/1.0"
        datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes">
 <start>
   <element name="a">
     <interleave>
       <attribute name="b"/>
       <data type="string"/>
     </interleave>
   </element>
 </start>
</grammar>

and the following input document a.xml:

<a b="1">c</a>

xmllint reports:

a.xml:1: element a: Relax-NG validity error : Element a has extra content: text
a.xml fails to validate

Changing the interleave to a group solves the issue, so the problem is
with how interleaves are validated.

I looked at xmlRelaxNGValidateInterleave() and I sadly have no idea
what’s going on.  Please point me in the right direction and I’ll
gladly write a patch.

 Nikolai


-- 
Daniel Veillard      | Red Hat Developers Tools http://developer.redhat.com/
veillard redhat com  | libxml Gnome XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | virtualization library  http://libvirt.org/

