
Re: [xml] bug in xmllint parsing of content model



On Wed, Feb 12, 2003 at 10:49:23PM -0500, Syd Bauman wrote:
> I have tried searching the FAQ, list archives, and list of open bugs,
> and have not found this one, to my surprise. I don't think it's
> anything I'm doing wrong, but I could be wrong (about that :-).

  Hum, that's the second time in 24 hours that a content model analysis
problem has been reported. There is something going on ...

> xmllint seems to inappropriately remove parentheses from within
> content models at times, thus creating problems. The following input
> file is valid according to nsgmls, xmlparse, and rxp; a friend also
> checked a more complicated version of the same problem in XMetal, and
> reported no errors. xmllint generates an error (complete output
> attached later):
> 
> | validity error: Content model of box is not determinist: ((box ,
> |                 filler*)+ | ((gift , filler*)+ , (box , filler*)*)) 
> 
> (whitespace added). The content model shown is, as far as I can tell,
> deterministic (or "non-ambiguous" for the SGMLers). However, looking

  Right, it seems I forgot to add some epsilon transitions when
building the automaton <grin/>
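
  A minimal sketch of how the determinism claim can be checked by hand,
assuming the textbook position-automaton (first/follow sets) test for
content models. This is not libxml's implementation; the names analyse,
is_deterministic, group and BOX_MODEL are made up for illustration. A
model is reported as deterministic when no state can reach two different
positions that carry the same element name.

```python
from itertools import count

def sym(name):
    return ("sym", name)

def analyse(node, ids, labels, follow):
    """Return (nullable, first, last) for node and fill follow in place."""
    kind = node[0]
    if kind == "sym":
        pos = next(ids)              # every occurrence gets its own position
        labels[pos] = node[1]
        follow[pos] = set()
        return False, {pos}, {pos}
    if kind in ("star", "plus"):
        nullable, first, last = analyse(node[1], ids, labels, follow)
        for p in last:               # repetition: the end loops back to the start
            follow[p] |= first
        return nullable or kind == "star", first, last
    n1, f1, l1 = analyse(node[1], ids, labels, follow)
    n2, f2, l2 = analyse(node[2], ids, labels, follow)
    if kind == "alt":
        return n1 or n2, f1 | f2, l1 | l2
    for p in l1:                     # "seq": end of the left part feeds the right part
        follow[p] |= f2
    first = f1 | f2 if n1 else f1
    last = l1 | l2 if n2 else l2
    return n1 and n2, first, last

def is_deterministic(model):
    ids, labels, follow = count(), {}, {}
    _, first, _ = analyse(model, ids, labels, follow)
    for state in [first, *follow.values()]:
        names = [labels[p] for p in state]
        if len(names) != len(set(names)):
            return False             # two competing positions share a name
    return True

def group(name):
    # (name , filler*)
    return ("seq", sym(name), ("star", sym("filler")))

# ((box , filler*)+ | ((gift , filler*)+ , (box , filler*)*))
BOX_MODEL = ("alt",
             ("plus", group("box")),
             ("seq", ("plus", group("gift")), ("star", group("box"))))

# ((a , b) | (a , c)) is the classic ambiguous model, shown for contrast.
AMBIGUOUS = ("alt", ("seq", sym("a"), sym("b")), ("seq", sym("a"), sym("c")))

print(is_deterministic(BOX_MODEL))   # True
print(is_deterministic(AMBIGUOUS))   # False
```

  Under that test the content model from the report comes out as
deterministic, which agrees with what nsgmls, xmlparse and rxp said.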

> at the output of xmllint, it seems that the content model it is
> barfing on is not the same one as is in the error message (which is
> the same as the one in the input file, except for whitespace).
> 
> Output:
> | <!ELEMENT box ((box , filler*)+ | ((gift , filler*)+ , box , filler**))>
> 
> Compare that to the (whitespace-similarly-altered) input
> 
> Input (and error message):
> | <!ELEMENT box ((box , filler*)+ | ((gift , filler*)+ ,(box , filler*)*))>
> 
> and you can see that the parentheses around the last name token group
> have disappeared. The resulting content model (without parentheses)

  Hum, okay, that looks like a serialization bug: annoying, but it should
be relatively easy to fix.
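
  To illustrate why the missing parentheses matter, here is a small
sketch of a content-model serializer that always parenthesizes sequence
and choice groups before appending an occurrence operator. The tree
layout and the names serialize, occ, group and MODEL are assumptions made
for this example; they do not correspond to libxml's structures or code.

```python
def name(n):
    return ("name", n)

def occ(child, op):
    # op is one of "*", "+", "?"
    return ("occ", child, op)

def serialize(node):
    kind = node[0]
    if kind == "name":
        return node[1]
    if kind in ("seq", "alt"):
        sep = " , " if kind == "seq" else " | "
        # Groups always carry their own parentheses, so an operator
        # appended by the caller applies to the whole group.
        return "(" + sep.join(serialize(c) for c in node[1]) + ")"
    # "occ": the operator binds to whatever was just emitted
    return serialize(node[1]) + node[2]

def group(n):
    # (n , filler*)
    return ("seq", [name(n), occ(name("filler"), "*")])

MODEL = ("alt", [
    occ(group("box"), "+"),
    ("seq", [occ(group("gift"), "+"), occ(group("box"), "*")]),
])

print("<!ELEMENT box " + serialize(MODEL) + ">")
# <!ELEMENT box ((box , filler*)+ | ((gift , filler*)+ , (box , filler*)*))>
```

  Dropping the parentheses of the last group would turn
(box , filler*)* into box , filler**, which is a different content
model from the one in the input.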

> Full input file (for those interested in the choice of element names,
> I've noticed that boxes of, say, valentine's gifts are often mostly
> filler, and there are often multiple layers of nested packaging to
> get through before you get to the goodies) is first, followed by the
> copied-and-pasted command-line and resulting output, including
> xmllint version information.
> 
> [ATTACHMENT ~/temp/test_xmllint.xml, text/xml]
> [ATTACHMENT ~/temp/shell-output, text/plain]

  Hum, I didn't get the attachment; could you send test_xmllint.xml
again?

Daniel

-- 
Daniel Veillard      | Red Hat Network https://rhn.redhat.com/
veillard redhat com  | libxml GNOME XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/


