Re: [xml] bug in xmllint parsing of content model

On Wed, Feb 12, 2003 at 10:49:23PM -0500, Syd Bauman wrote:
I have tried searching the FAQ, list archives, and list of open bugs,
and have not found this one, to my surprise. I don't think it's
anything I'm doing wrong, but I could be wrong (about that :-).

  Hum, that's the second time in 24 hours that a content model analysis
problem has been reported. Something is going on ...

xmllint seems to inappropriately remove parentheses from within
content models at times, thus creating problems. The following input
file is valid according to nsgmls, xmlparse, and rxp; a friend also
checked a more complicated version of the same problem in XMetal, and
reported no errors. xmllint generates an error (complete output
attached later):

| validity error: Content model of box is not determinist: ((box ,
|                 filler*)+ | ((gift , filler*)+ , (box , filler*)*)) 

(whitespace added). The content model shown is, as far as I can tell,
deterministic (or "non-ambiguous" for the SGMLers). However, looking

  Right, seems I have forgotten to add some epsilon transitions when
building the automata <grin/>
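
One way to check the determinism claim independently is the textbook
position (Glushkov) construction: number every element-name occurrence,
compute the first and follow sets, and call the model deterministic when no
set contains two positions carrying the same name. The following is a
minimal sketch of that check, not libxml2's actual code; the tuple-based AST
and the helper names (sym, item, analyse, is_deterministic) are assumptions
made up for illustration.

```python
from itertools import count

# Hypothetical AST: ("sym", name), ("seq", a, b), ("alt", a, b),
# ("star", a), ("plus", a)

def sym(name):
    return ("sym", name)

def item(name):
    # Encodes "(name , filler*)"
    return ("seq", sym(name), ("star", sym("filler")))

def analyse(node, ids, follow, symbol_of):
    """Return (nullable, first, last); fill follow and symbol_of per position."""
    kind = node[0]
    if kind == "sym":
        p = next(ids)
        symbol_of[p] = node[1]
        follow[p] = set()
        return False, {p}, {p}
    if kind in ("star", "plus"):
        nullable, first, last = analyse(node[1], ids, follow, symbol_of)
        # Repetition: every last position may be followed by a first position
        for p in last:
            follow[p] |= first
        return (kind == "star") or nullable, first, last
    n1, f1, l1 = analyse(node[1], ids, follow, symbol_of)
    n2, f2, l2 = analyse(node[2], ids, follow, symbol_of)
    if kind == "alt":
        return n1 or n2, f1 | f2, l1 | l2
    # Sequence: last positions of the left part lead into the right part
    for p in l1:
        follow[p] |= f2
    first = f1 | f2 if n1 else f1
    last = l1 | l2 if n2 else l2
    return n1 and n2, first, last

def is_deterministic(model):
    follow, symbol_of = {}, {}
    _, first, _ = analyse(model, count(), follow, symbol_of)
    for candidates in [first, *follow.values()]:
        names = [symbol_of[p] for p in candidates]
        if len(names) != len(set(names)):
            return False  # two competing positions with the same name
    return True

# ((box , filler*)+ | ((gift , filler*)+ , (box , filler*)*))
BOX_MODEL = ("alt",
             ("plus", item("box")),
             ("seq", ("plus", item("gift")), ("star", item("box"))))

# A deliberately ambiguous model for contrast: ((box , filler) | (box , gift))
AMBIGUOUS = ("alt",
             ("seq", sym("box"), sym("filler")),
             ("seq", sym("box"), sym("gift")))

print(is_deterministic(BOX_MODEL), is_deterministic(AMBIGUOUS))
```

Under this construction the reported model for box passes the check, which
agrees with what nsgmls, xmlparse, and rxp were said to accept.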

at the output of xmllint, it seems that the content model it is
barfing on is not the same one as is in the error message (which is
the same as the one in the input file, except for whitespace).

| <!ELEMENT box ((box , filler*)+ | ((gift , filler*)+ , box , filler**))>

Compare that to the (whitespace-similarly-altered) input

Input (and error message):
| <!ELEMENT box ((box , filler*)+ | ((gift , filler*)+ ,(box , filler*)*))>

and you can see that the parentheses around the last name token group
have disappeared. The resulting content model (without parentheses)

  Hum, okay, that looks like a serialization bug: annoying, but it should
be relatively easy to fix.
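
To see why the lost parentheses matter, the two content models can be
compared as ordinary regular expressions over the sequence of child
elements. This is only a rough sketch: the one-letter encoding (b = box,
f = filler, g = gift) and the reading of "filler**" as "(filler*)*" are
assumptions made for illustration, and nothing here calls libxml2.

```python
import re

# Hypothetical one-letter encoding: b = box, f = filler, g = gift.

# Input: ((box , filler*)+ | ((gift , filler*)+ , (box , filler*)*))
ORIGINAL = re.compile(r"(?:bf*)+|(?:gf*)+(?:bf*)*")

# Serialized: ((box , filler*)+ | ((gift , filler*)+ , box , filler**))
# "filler**" is read as (filler*)*, which matches the same strings as filler*.
SERIALIZED = re.compile(r"(?:bf*)+|(?:gf*)+bf*")

def accepts(pattern, children):
    """True if the whole child sequence matches the content model."""
    return pattern.fullmatch(children) is not None

for children in ("g", "gb", "gbb", "bfb"):
    print(children, accepts(ORIGINAL, children), accepts(SERIALIZED, children))
```

With this encoding, a box holding only a gift ("g") or a gift followed by
two boxes ("gbb") matches the input model but not the serialized one, so the
two declarations do not describe the same language.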

Full input file (for those interested in the choice of element names,
I've noticed that boxes of, say, valentine's gifts are often mostly
filler, and there are often multiple layers of nested packaging to
get through before you get to the goodies) is first, followed by the
copied-and-pasted command-line and resulting output, including
xmllint version information.

[ATTACHMENT ~/temp/test_xmllint.xml, text/xml]
[ATTACHMENT ~/temp/shell-output, text/plain]

  Hum, I didn't get the attachment. Could you send test_xmllint.xml
again?


Daniel Veillard      | Red Hat Network
veillard redhat com  | libxml GNOME XML XSLT toolkit | Rpmfind RPM search engine
