Re: [xml] DTD validation issue
- From: Daniel Veillard <veillard redhat com>
- To: Petr Pajas <pajas ufal ms mff cuni cz>
- Cc: libxml2 <xml gnome org>
- Subject: Re: [xml] DTD validation issue
- Date: Tue, 26 Feb 2008 04:15:28 -0500
On Mon, Feb 25, 2008 at 09:45:13PM +0100, Petr Pajas wrote:
> Hi Daniel, All,
>
> the following inconsistency in DTD validation, reproducible with xmllint, was
> reported to me by a user of XSH2, Jakub Neburka.
>
> He takes two files: decl.dtd and decl.xml and does basically the following:
>
> 1) xmllint --valid decl.xml
> xmllint --postvalid decl.xml
>
> both succeed.
>
> 2) xmllint --shell decl.xml
> /> validate
>
> this, however, fails with
>
> decl.xml:5: element root: validity error : Element root was declared EMPTY
> this one has content
>
> (Probably because the library calls are alike, XSH2 behaves similarly:
> parse-time validation is fine, validating the in-memory tree fails).
>
> The test cases follow.
>
> __decl.dtd__
> <!ENTITY % cond "IGNORE">
> <![%cond;[
> <!ENTITY % content "ANY">
> ]]>
> <!ENTITY % content "EMPTY">
> <!ELEMENT root %content;>
> __CUT__
>
> __decl.xml__
> <?xml version="1.0" encoding="UTF-8"?>
> <!DOCTYPE root SYSTEM "decl.dtd" [
> <!ENTITY % cond "INCLUDE">
> ]>
> <root>content</root>
> __CUT__
>
> Can you confirm this is a bug? Shall I bugzilla it?
Not a bug. When you do things like post-validation, you give the
validator a preparsed DTD. In that case the DTD was parsed without the
context of the document, whereas here the internal subset changes the
behaviour. Basically, xmlValidateDtd() or any validation using a DTD
parsed out of the context of the document can't exactly match the
behaviour of XML-1.0 validation, because XML-1.0 allows the document
to modify the DTD.
Actually, having a validation which depends only on the DTD/schemas,
and where the document can't modify the set of rules set by the
receiver, is in a lot of cases a good thing, if you consider that a
DTD/schema is a contract between a producer and a consumer of documents.
If you want exactly the DTD validation semantics described in the
XML-1.0 spec, reparsing the document is, I think, the only guaranteed
correct option.
Also note that the mismatch is documented in the libxml2 comment for
xmlValidateDtd():
/**
* xmlValidateDtd:
* @ctxt: the validation context
* @doc: a document instance
* @dtd: a dtd instance
*
* Try to validate the document against the dtd instance
*
* Basically it does check all the definitions in the DTD.
* Note that the internal subset (if present) is de-coupled
* (i.e. not used), which could give problems if ID or IDREF
* is present.
*
* returns 1 if valid or 0 otherwise
*/
Daniel
--
Red Hat Virtualization group http://redhat.com/virtualization/
Daniel Veillard | virtualization library http://libvirt.org/
veillard redhat com | libxml GNOME XML XSLT toolkit http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/