Re: [xml] Schema validity failure for valid document

Hi (back from vacations),

Daniel Veillard wrote:
On Mon, Jan 03, 2005 at 05:46:38PM +0100, cazic gmx net wrote:
 well, at the time of the error, all libxml2 had left from the Schemas
content model is a compiled automaton. Sometimes that automaton can be
reserialized as a string making sense to a human, we don't have an API for

This would be great!

this (yet that could be added). It's also sometimes hard to find out what

  Reserializing the content model to a string (as DTD errors do now) is
not generally possible for all regexp and schemas content models. Instead
I propose an API to extract the last error value, and the list of potential
accepted values.
  Not yet added to the header, but I committed to xmlregexp.c the following:
/**
 * xmlRegExecErrInfo:
 * @exec: a regexp execution context generating an error
 * @string: return value for the error string
 * @nbval: pointer to the number of accepted values IN/OUT
 * @values: pointer to the array of acceptable values
 *
 * Extract error information from the regexp execution. The parameter
 * @string will be updated with the value pushed and not accepted,
 * the parameter @values must point to an array of @nbval string pointers;
 * on return nbval will contain the number of possible strings in that
 * state and the @values array will be updated with them. The string values
 * returned will be freed with the @exec context and don't need to be
 * deallocated.
 *
 * Returns: 0 in case of success or -1 in case of error.
 */
int
xmlRegExecErrInfo(xmlRegExecCtxtPtr exec, const xmlChar **string,
                  int *nbval, xmlChar **values)

For example suppose you validate

with test/schemas/all_1.xsd (where the doc content model is defined as an
<all> with <a><b><c>), then when calling xmlRegExecErrInfo on the associated
regexp after the error is noticed, you will get
   string = "b"
   nbval = 1
   values[0] = "a"
on return.
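
To illustrate the IN/OUT convention described for @nbval, here is a
minimal self-contained sketch. It is not libxml2 code: ToyExec,
toy_err_info and toy_demo are hypothetical stand-ins that only mirror the
shape of the proposed call, using plain C strings instead of xmlChar.
Treating a zero-sized array as an error is an assumption of the sketch,
not something the proposal states.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an execution context that has hit an error. */
typedef struct {
    const char *rejected;      /* the value that was pushed and refused */
    const char *accepted[8];   /* values the current state would accept */
    int nb_accepted;           /* number of valid entries in accepted[] */
} ToyExec;

/*
 * Mirrors the shape of the proposed call: on entry *nbval is the capacity
 * of values[], on return it is the number of entries actually filled.
 * The returned strings stay owned by the context.
 */
static int toy_err_info(const ToyExec *exec, const char **string,
                        int *nbval, const char **values)
{
    int i, max;

    if (exec == NULL || string == NULL || nbval == NULL ||
        values == NULL || *nbval <= 0)
        return -1;             /* assumption: empty array is an error */
    max = *nbval;
    *string = exec->rejected;
    for (i = 0; i < exec->nb_accepted && i < max; i++)
        values[i] = exec->accepted[i];
    *nbval = i;
    return 0;
}

/* Caller side, using the values from the example: "b" pushed, "a" expected. */
static void toy_demo(void)
{
    ToyExec exec = { "b", { "a" }, 1 };
    const char *string = NULL;
    const char *values[4];
    int nbval = 4;             /* IN: capacity of values[] */
    int rc = toy_err_info(&exec, &string, &nbval, values);

    assert(rc == 0);
    assert(strcmp(string, "b") == 0);
    assert(nbval == 1);        /* OUT: number of entries filled */
    assert(strcmp(values[0], "a") == 0);
}
```

The caller never frees the returned strings, which matches the ownership
rule in the comment above: they live as long as the context does.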
This should be the foundation for:
   - better error reporting
   - schemas driven editing capabilities
   - better DTD driven editing capabilities
For the last two examples, rather than an on-error report, one would probably
need the potential next values from the current state of the
xmlRegExecCtxtPtr. This is actually trivial to add based on the same
kind of code, something like

int xmlRegExecNextValues(xmlRegExecCtxtPtr exec, int *nbval, xmlChar **values)

would be the associated API.

  Thoughts? I will probably export xmlRegExecErrInfo and xmlRegExecNextValues
from the xmlregexp.h header soon. But special code to improve the Schemas
error reports would still be needed. Kasimier, does this fit your need?

A namespace aware version would be needed as well.
  xmlRegExecErrInfo2(xmlRegExecCtxtPtr exec,
    const xmlChar **string,
    const xmlChar **string2, <-- the namespace name
    int *nbval,
    xmlChar **values,
    xmlChar **values2) <-- the array of namespace names

In the case of the XML Schema engine, @string and @string2 would be
already known.


Content model: (a, b*, c)
input: <a/><d/>
would xmlRegExecErrInfo return "b" for values[0], or
rather "c", since the automaton should have passed the
b* ?

This leads to the question of if/how to report occurrence information;
at least whether the element is mandatory or not.

Content model: (a, b+, c)
input: <a/><b/><d/>
At the point of the erroneous input of "d" it seems crucial to
report that the allowed input at this point can be
an optional "b" or a "c". Don't know if this info can be provided by the
regex engine.
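
To make the two cases above concrete, here is a toy sketch of the kind of
answer I have in mind. It is not the libxml2 automaton: Particle, Expected,
find_next, collect_expected and first_error are hypothetical names, and the
model is restricted to a plain sequence of distinctly named elements with
min/max occurrence bounds. For each value that could come next, it also
says whether that value is optional at this point.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define UNBOUNDED (-1)

typedef struct { const char *name; int min; int max; } Particle;
typedef struct { const char *name; int optional; } Expected;

/* Index of the particle that accepts tok from state (pos, count), or -1. */
static int find_next(const Particle *model, int n, int pos, int count,
                     const char *tok)
{
    while (pos < n) {
        const Particle *p = &model[pos];
        if ((p->max == UNBOUNDED || count < p->max) &&
            strcmp(p->name, tok) == 0)
            return pos;
        if (count < p->min)
            break;             /* particle still required, cannot skip it */
        pos++;
        count = 0;
    }
    return -1;
}

/* List what may come next; optional = 1 if the particle's min is already met. */
static int collect_expected(const Particle *model, int n, int pos, int count,
                            Expected *out, int cap)
{
    int nb = 0;
    while (pos < n) {
        const Particle *p = &model[pos];
        int satisfied = count >= p->min;
        if ((p->max == UNBOUNDED || count < p->max) && nb < cap) {
            out[nb].name = p->name;
            out[nb].optional = satisfied;
            nb++;
        }
        if (!satisfied)
            break;
        pos++;
        count = 0;
    }
    return nb;
}

/*
 * Returns the index of the first rejected input value, or -1 if every value
 * was accepted (completeness at the end of input is not checked here).
 * On rejection, out[0..*nb-1] holds the values that were allowed instead.
 */
static int first_error(const Particle *model, int n, const char **input,
                       int len, Expected *out, int cap, int *nb)
{
    int pos = 0, count = 0, i;

    assert(model != NULL && input != NULL && out != NULL && nb != NULL);
    for (i = 0; i < len; i++) {
        int q = find_next(model, n, pos, count, input[i]);
        if (q < 0) {
            *nb = collect_expected(model, n, pos, count, out, cap);
            return i;
        }
        count = (q == pos) ? count + 1 : 1;
        pos = q;
    }
    *nb = 0;
    return -1;
}
```

With the model (a, b*, c) and input a, d this reports "b" as optional and
"c" as mandatory; with (a, b+, c) and input a, b, d it reports the same
pair, since one "b" has already been seen. Whether the real regexp engine
can hand back that optional/mandatory flag is exactly the open question.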

Any ideas?


