Hi,
I think there is a bug in how the regular expression parser handles
character ranges: a character class such as [\]-a], whose range starts
with an escaped character (here \]), is not recognized by the libXML
regular expression parser.
For example, the following simple type does not work:
<xs:simpleType name="LimitedString">
<xs:restriction base="xs:string">
<xs:pattern value="[\]-a]*"/>
</xs:restriction>
</xs:simpleType>
but this one works:
<xs:simpleType name="LimitedString">
<xs:restriction base="xs:string">
<xs:pattern value="[Z-a]*"/>
</xs:restriction>
</xs:simpleType>
If one looks at the ASCII table:
!"#$%&'()*+,-./0123456789:;<=>?
@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_
`abcdefghijklmnopqrstuvwxyz{|}~
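one sees that Z is the last character before ] that does not have to be
escaped in a character range definition. The range [Z-a] therefore
covers every character in [\]-a], yet only [Z-a] is accepted. This
indicates that the bug is in the handling of the escaped character.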
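The ASCII argument above can be checked with a short Python sketch. This
is only an illustration: Python's re module is a different regex dialect
from the one used for XSD patterns and has nothing to do with libXML, and
the helper name chars_in_range is made up for this example.

```python
import re

# Code points of the characters involved in the two ranges.
for ch in "Z[\\]a":
    print(f"{ch!r}: U+{ord(ch):04X}")


def chars_in_range(first, last):
    """Return the set of characters from first to last, inclusive."""
    return {chr(cp) for cp in range(ord(first), ord(last) + 1)}


escaped_range = chars_in_range("]", "a")  # what [\]-a] is meant to cover
plain_range = chars_in_range("Z", "a")    # what [Z-a] covers

# Characters that only the wider range [Z-a] contains.
print(sorted(plain_range - escaped_range))

# Python's re (NOT the XSD dialect) accepts the escaped range start.
pattern = re.compile(r"[\]-a]*")
print(pattern.fullmatch("]^_`a") is not None)
print(pattern.fullmatch("Z") is not None)
```

The only characters that [Z-a] adds are Z, [ and \, so a string made of
characters from ] to a should be accepted by both patterns.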
I have prepared some example files to demonstrate the shortcoming:
test_not_validating.xsd uses the first simple type definition and
test_validating.xsd uses the second one.
You can try to validate test.xml with:
xmllint --noout --schema test_not_validating.xsd ./test.xml
and
xmllint --noout --schema test_validating.xsd ./test.xml
respectively.
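For anyone without the attachments, the comparison can be approximated
with the following Python sketch. It is a guess at a minimal setup, not a
copy of the attached files: the element name "value", the sample content
"aaa" and the helper names build_schema and xmllint_command are
assumptions made for this example. Only the two patterns and the xmllint
options come from the report above.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path

# Hypothetical minimal schema; only the pattern value differs per run.
SCHEMA_TEMPLATE = """<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="value" type="LimitedString"/>
  <xs:simpleType name="LimitedString">
    <xs:restriction base="xs:string">
      <xs:pattern value="{pattern}"/>
    </xs:restriction>
  </xs:simpleType>
</xs:schema>
"""

# Hypothetical instance document; "a" lies inside both ranges.
INSTANCE = '<?xml version="1.0"?>\n<value>aaa</value>\n'


def build_schema(pattern):
    """Return the schema text with the given pattern facet."""
    return SCHEMA_TEMPLATE.format(pattern=pattern)


def xmllint_command(schema_path, xml_path):
    """Build the same xmllint invocation as used in the report."""
    return ["xmllint", "--noout", "--schema", str(schema_path), str(xml_path)]


if __name__ == "__main__":
    if shutil.which("xmllint") is None:
        print("xmllint not found on PATH, skipping")
    else:
        with tempfile.TemporaryDirectory() as tmp:
            xml_path = Path(tmp) / "test.xml"
            xml_path.write_text(INSTANCE)
            for name, pattern in [
                ("test_not_validating.xsd", r"[\]-a]*"),
                ("test_validating.xsd", "[Z-a]*"),
            ]:
                schema_path = Path(tmp) / name
                schema_path.write_text(build_schema(pattern))
                result = subprocess.run(
                    xmllint_command(schema_path, xml_path),
                    capture_output=True,
                    text=True,
                )
                # Report the exit code and any diagnostics from xmllint.
                print(name, "exit code:", result.returncode)
                print(result.stderr.strip())
```

If the two runs report different exit codes for the same document, that
would match the behaviour described above.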
It would be nice if someone could confirm the bug and possibly fix it.
Regards,
Dominik
Attachment:
test.xml
Description: application/xml
Attachment:
test_not_validating.xsd
Description: application/xml
Attachment:
test_validating.xsd
Description: application/xml