Re: [xml] parse-time validation against a user provided DTD

On Sat, May 09, 2009 at 09:32:54AM +0200, Stefan Behnel wrote:

> looking through the API docs, I can't really figure out a way to stick an
> external DTD into the parser, so that it validates against that rather than
> trying to load a DTD for the DOCTYPE (or also to do DTD validation if the
> document does not define a DOCTYPE at all).

> I can see that xmllint can validate against an externally provided DTD, but
> only after parsing, so that doesn't help.

  I don't understand the dependency here.
You don't want the default behaviour of validating as defined in the
spec. So don't do it then, and do it as a later stage.

> Has anyone done this before? Is there maybe even a preferred/obvious/RTFM
> way of doing that? I'm interested in a way to do this a) by providing a
> readily parsed xmlDtd, and maybe even b) by providing a public ID (in case
> there is a shortcut other than looking up the DTD manually).

  Why do you think it's better to put this in the parsing phase ?
IMHO that's more confusing than anything else. If you validate in
a way that's different from what the spec explains, then it really
should be separated as an isolated step to avoid confusion about
the 2 different validations.
  For example the mix between the internal subset overriding rules and
what you may provide in your own DTD is impossible to decipher, as
the normal rules can't apply.
  Validation is a trust checkpoint.
  Either you trust the document and that means including the external
subset, or you don't trust it.
  If you don't trust it and supply your own validation rules, why would
you allow its internal subset to override and modify your rules?

sounds to me the correct API if you don't trust the incoming document.


Daniel Veillard      | libxml Gnome XML XSLT toolkit
daniel veillard com  | Rpmfind RPM search engine | virtualization library
