Re: [xml] Schema validity failure for valid document


Daniel Veillard wrote:
On Wed, Jan 12, 2005 at 03:13:00PM +0100, Kasimier Buchcik wrote:

 nb = 3
 nbneg = 1
 values[0] = "b|http://FOO";
 values[1] = "*|*";
 values[2] = "c|http://FOO";
 values[3] = "*|http://FOO";
I think such an API would allow complete analysis and report without needing to provide extra implementation details or new strings in the regexp format.

Cool, adding a nbneg sounds simpler than anything else.

  okay, it's in CVS already :-)

But there is still a problem: the negated namespace wildcard is built
using two transitions: "*|http://FOO", leading to the sink state, and
"*|*", passing all other elements through. The "*|*" would not be
distinguishable from an <any namespace="##any"/> wildcard. Maybe the
construction of the automaton can be changed somehow to avoid this. Hmm,
is it possible to extend the sink state detection to handle this,
i.e. to suppress the report of such an "only-way-out-of-sink" transition?

  I'm not 100% sure what you would like to be reported.

  S0  ------ *|* -------> S1
  S0  -- *|http://FOO --> S2

Nice, I like such pictures.

S2 is the sink state. When the error is raised on S0 because {http://FOO}a is pushed you currently get
  nb = 1, nbneg = 1
  values[0] = "*|*";
  values[1] = "*|http://FOO";

What do you want instead?
  nb = 0
  nbneg = 1
  values[0] = "*|http://FOO";

Yes, please :-)

I'm not sure the low level xmlregexp code is better at interpreting
why that construct was used, since it is specific to XMLSchemas.

I'm not sure it's specific to XML Schemata; I'm not deep into automata,
and even less into their jargon, so I would naively call the picture above a
"filter" pattern, and filters should not be limited to XML Schemata.

Seems to me that this reduction is better done at the level which can make an interpretation of the transition values.

Given that every negated namespace comes with an additional "*|*",
remove, for every negated namespace, one "*|*".
All "*|*" left can then be interpreted as <any namespace="##any"/>.
Yes, this could be done on the schema side.
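
The reduction described above could be sketched roughly as follows on the
schema side. This is only an illustration: drop_filter_wildcards() is a
hypothetical helper, not an existing libxml2 function, and it assumes the
layout shown earlier in this thread, i.e. the first nb entries of values
are the accepted transitions and the next nbneg entries are the negated
ones.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical helper (not part of libxml2).
 * Assumed layout: values[0 .. nb-1] are accepted transitions,
 * values[nb .. nb+nbneg-1] are negated transitions.
 * For every negated entry, one accepted "*|*" is dropped; any "*|*"
 * still left afterwards stands for <any namespace="##any"/>.
 * The array is compacted in place and the new nb is returned;
 * nbneg is unchanged and the negated entries follow the kept ones.
 */
static int
drop_filter_wildcards(const char **values, int nb, int nbneg)
{
    int pending = nbneg;    /* number of "*|*" entries still to drop */
    int kept = 0;           /* number of accepted entries kept so far */
    int i;

    assert(values != NULL && nb >= 0 && nbneg >= 0);

    for (i = 0; i < nb; i++) {
        if (pending > 0 && strcmp(values[i], "*|*") == 0) {
            pending--;      /* this "*|*" belongs to a negated wildcard */
            continue;
        }
        values[kept++] = values[i];
    }

    /* Move the negated entries down behind the kept accepted ones. */
    for (i = 0; i < nbneg; i++)
        values[kept + i] = values[nb + i];

    return kept;
}
```

With the error info from the S0 example (nb = 1, nbneg = 1, values "*|*"
and "*|http://FOO") this returns nb = 0 and leaves values[0] =
"*|http://FOO", which is the report asked for above.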

Additionally I noticed that a combination of <any namespace="##any"/>
and a negated namespace should never be a result of the error info,
since it would be ambiguous; a quick test revealed that the regex
reports a non-deterministic content model when such a combination is used.

So indeed, it seems to be complete. Cool :-)


