Re: [xml] Schema validity failure for valid document
- From: Kasimier Buchcik <kbuchcik 4commerce de>
- To: veillard redhat com
- Cc: xml gnome org
- Subject: Re: [xml] Schema validity failure for valid document
- Date: Mon, 10 Jan 2005 12:53:09 +0100
Hi (back from vacations),
Daniel Veillard wrote:
On Mon, Jan 03, 2005 at 05:46:38PM +0100, cazic gmx net wrote:
[...]
well, at the time of the error, all libxml2 has left of the Schemas
content model is a compiled automaton. Sometimes that automaton can be
reserialized as a string that makes sense to a human; we don't have an API for
This would be great!
this yet (though one could be added). It's also sometimes hard to find out what
Reserializing the content model to a string (as DTD errors do now) is
not generally possible for all regexp and schemas content model. Instead
I propose an API to extract the last error value, and the list of potential
accepted values.
Not yet added to the header, but I committed to xmlregexp.c the following
routine:
/**
* xmlRegExecErrInfo:
* @exec: a regexp execution context generating an error
* @string: return value for the error string
* @nbval: pointer to the number of accepted values IN/OUT
* @values: pointer to the array of acceptable values
*
* Extract error information from the regexp execution. The parameter
* @string will be updated with the value pushed and not accepted.
* The parameter @values must point to an array of @nbval string pointers;
* on return @nbval will contain the number of possible strings in that
* state and the @values array will be updated with them. The string values
* returned will be freed with the @exec context and don't need to be
* deallocated.
*
* Returns: 0 in case of success or -1 in case of error.
*/
int
xmlRegExecErrInfo(xmlRegExecCtxtPtr exec, const xmlChar **string,
int *nbval, xmlChar **values)
For example suppose you validate
<doc><b/><c/><d/></doc>
with test/schemas/all_1.xsd (where the doc content model is defined as an
<all> of <a>, <b> and <c>). When calling xmlRegExecErrInfo on the
associated regexp after the error is noticed, you will get
string = "b"
nbval = 1
values[0] = "a"
on return.
This should be the foundation for:
- better error reporting
- schemas driven editing capabilities
- better DTD driven editing capabilities
For the last two items, rather than an on-error report, one would
probably need the potential next values from the current state of the
xmlRegExecCtxtPtr. This is actually trivial to add based on the same
kind of code; something like
int xmlRegExecNextValues(xmlRegExecCtxtPtr exec, int *nbval, xmlChar **values)
would be the associated API.
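
To make the calling convention concrete, here is a minimal sketch of the
"potential next values" idea for an <all> group. It is a toy model and does
not use or reproduce libxml2's regexp engine: ToyAllState, toy_all_push and
toy_all_next_values are hypothetical names invented for this illustration.
Only the IN/OUT use of nbval (capacity on entry, count on return) follows
the description above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for a validation state; NOT libxml2's xmlRegExecCtxt. */
typedef struct {
    const char **names;  /* element names allowed by the <all> group */
    int nnames;          /* number of entries in names[] */
    unsigned seen;       /* bit i is set once names[i] has been accepted */
} ToyAllState;

/* Push one element name; returns 0 if accepted, -1 if rejected. */
static int toy_all_push(ToyAllState *st, const char *name)
{
    assert(st != NULL && st->nnames <= (int)(8 * sizeof st->seen));
    for (int i = 0; i < st->nnames; i++) {
        if (strcmp(st->names[i], name) == 0) {
            if (st->seen & (1u << i))
                return -1;          /* each name may appear only once */
            st->seen |= 1u << i;
            return 0;
        }
    }
    return -1;                      /* name is not part of the group */
}

/*
 * Same convention as the proposed xmlRegExecNextValues: on entry *nbval is
 * the capacity of values[], on return it is the number of entries filled.
 * The returned strings are owned by the state and must not be freed.
 * Returns 0 on success, -1 on bad arguments.
 */
static int toy_all_next_values(const ToyAllState *st, int *nbval,
                               const char **values)
{
    if (st == NULL || nbval == NULL || values == NULL || *nbval <= 0)
        return -1;
    int max = *nbval, n = 0;
    for (int i = 0; i < st->nnames && n < max; i++)
        if (!(st->seen & (1u << i)))
            values[n++] = st->names[i];  /* not seen yet, still acceptable */
    *nbval = n;
    return 0;
}
```

In this toy model, pushing "b" and "c" into a group of a, b and c and then
asking for the next values leaves nbval at 1 with values[0] pointing to "a",
which is the kind of answer an editor could offer as a completion.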
Thoughts? I will probably export xmlRegExecErrInfo and xmlRegExecNextValues
from the xmlregexp.h header soon. But special code to improve the Schemas
error reports would still be needed. Kasimier, does this fit your needs?
Cool!
A namespace aware version would be needed as well.
Proposal:
int
xmlRegExecErrInfo2(xmlRegExecCtxtPtr exec,
const xmlChar **string,
const xmlChar **string2, /* the namespace name */
int *nbval,
xmlChar **values,
xmlChar **values2) /* the array of namespace names */
In the case of the XML Schema engine, @string and @string2 would be
already known.
Occurrence:
Content model: (a, b*, c)
input: <a/><d/>
Would xmlRegExecErrInfo return "b" for values[0], or
rather "c", since the automaton should already have passed the
b*?
This leads to the question of whether and how to report occurrence
information; at the very least, whether the element is mandatory or not.
Content model: (a, b+, c)
input: <a/><b/><d/>
At the point of the erroneous input "d" it seems crucial to
report that the allowed input at this point can be
an optional "b" or a "c". I don't know if this info can be provided by the
regex engine.
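
To illustrate what such a report could look like, here is a small sketch
that walks a plain sequence model and, on the first rejected name, lists the
names acceptable at that point together with an "optional" flag. It is a
toy model only and says nothing about what libxml2's automaton actually
returns: ToyParticle, ToyExpected and toy_seq_check are hypothetical names,
and the model handles nothing beyond a flat sequence with min/max
occurrences. It also does not check that the input ends in a complete state.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One particle of a toy sequence model; NOT a libxml2 type. */
typedef struct {
    const char *name;
    int min;            /* minimum number of occurrences */
    int max;            /* maximum number of occurrences, -1 = unbounded */
} ToyParticle;

/* One acceptable name at the point of failure. */
typedef struct {
    const char *name;
    int optional;       /* 1 if the input may legally skip this name here */
} ToyExpected;

/*
 * Walk input[] against the sequence model[]. Returns the index of the first
 * rejected input name, or -1 if every name was accepted. On rejection,
 * out[] (room for nmodel entries) receives the names acceptable at that
 * point and *nout their count.
 */
static int toy_seq_check(const ToyParticle *model, int nmodel,
                         const char **input, int ninput,
                         ToyExpected *out, int *nout)
{
    int p = 0;          /* current particle */
    int count = 0;      /* occurrences of model[p] consumed so far */

    assert(model != NULL && input != NULL && out != NULL && nout != NULL);
    *nout = 0;
    for (int i = 0; i < ninput; i++) {
        int q = p, c = count, matched = 0;

        /* Try the current particle, then later ones while skipping is legal. */
        while (q < nmodel) {
            int room = model[q].max < 0 || c < model[q].max;
            if (room && strcmp(model[q].name, input[i]) == 0) {
                p = q; count = c + 1; matched = 1;
                break;
            }
            if (c < model[q].min)
                break;              /* still required, cannot be skipped */
            q++; c = 0;
        }
        if (matched)
            continue;

        /* Rejected: collect every name that could have been pushed here. */
        for (q = p, c = count; q < nmodel; q++, c = 0) {
            int room = model[q].max < 0 || c < model[q].max;
            int required = c < model[q].min;
            if (room) {
                out[*nout].name = model[q].name;
                out[*nout].optional = !required;
                (*nout)++;
            }
            if (required)
                break;              /* nothing past a required particle */
        }
        return i;
    }
    return -1;
}
```

Under these assumptions, both (a, b*, c) with input a, d and (a, b+, c) with
input a, b, d yield the same report: "b" flagged as optional, followed by
"c" flagged as mandatory.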
Any ideas?
Greetings,
Kasimier