Re: [xml] Schemas validator

On Thu, May 10, 2001 at 08:24:06PM +0200, Marc Emery wrote:

I am a Swiss student studying in Germany, and I want to implement, or
to help implement, a Schemas validator in libxml.

  First I would like to thank you in advance for working on this.
Second, I'm still waiting for some early patches on the subject.

Later I will integrate (link) it in the OpenOffice project.

I read the xmlsoft documentation and the DTD parser code.

I think we can divide the work in 3 parts:

- the schemas to memory representation parser

- a memory tree of the simple and complex types

- the validator

- the standard Schemas types validations

  So there are 4 parts, but yes, this sounds like the right module definition.
  1/ The first part takes the XML Schemas documents and parses them (there
can be multiple parts) to obtain a (set of) DOM tree(s), then compiles the
tree into an easier-to-manipulate form (using SAX there would probably add
complexity for little gain).

  2/ The second part would be a set of predefined types (the Schemas part 2).

  3/ The third part would work on the result of 1/ and reuse 2/ to build
the datatype trees specific to a given schema definition (adding facets,
subtyping, etc.).

  4/ The validator proper, plugging in at the SAX level or reusing a
preparsed (or dynamically built) tree.
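
The way parts 2/ and 3/ could feed the value-level side of 4/ can be sketched in plain C. This is only an illustration under loud assumptions: every name below (the demoSchemas prefix, the struct layout, the single maxInclusive facet) is hypothetical and is not libxml API, and strtol is only a rough stand-in for real lexical checking of an integer.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical names throughout: nothing here is libxml API. */

/* Part 2/: a built-in type is a name plus a lexical check. */
typedef int (*demoSchemasCheck)(const char *value);

typedef struct demoSchemasType {
    const char *name;
    const struct demoSchemasType *base; /* NULL for built-in types */
    demoSchemasCheck check;             /* set on built-in types only */
    int hasMaxInclusive;                /* one example facet */
    long maxInclusive;
} demoSchemasType;

/* Parses value as a whole decimal integer; returns 1 on success. */
static int demoParseLong(const char *value, long *out) {
    char *end;
    long v;
    if (value == NULL) return 0;
    errno = 0;
    v = strtol(value, &end, 10);
    if (end == value || *end != '\0' || errno == ERANGE) return 0;
    if (out != NULL) *out = v;
    return 1;
}

static int demoCheckInteger(const char *value) {
    return demoParseLong(value, NULL);
}

static int demoCheckString(const char *value) {
    return value != NULL;
}

static const demoSchemasType demoBuiltinInteger =
    { "integer", NULL, demoCheckInteger, 0, 0 };
static const demoSchemasType demoBuiltinString =
    { "string", NULL, demoCheckString, 0, 0 };

/* Part 3/: derive a schema-specific type by restricting a base type. */
static demoSchemasType demoRestrictMax(const char *name,
                                       const demoSchemasType *base,
                                       long max) {
    demoSchemasType t = { name, base, NULL, 1, max };
    return t;
}

/* Part 4/ (value level only): check the base chain, then the facet. */
static int demoSchemasValidateValue(const demoSchemasType *type,
                                    const char *value) {
    assert(type != NULL);
    if (value == NULL) return 0;
    if (type->base != NULL && !demoSchemasValidateValue(type->base, value))
        return 0;
    if (type->check != NULL && !type->check(value)) return 0;
    if (type->hasMaxInclusive) {
        long v;
        if (!demoParseLong(value, &v) || v > type->maxInclusive) return 0;
    }
    return 1;
}
```

For instance, a "percent" type built with demoRestrictMax("percent", &demoBuiltinInteger, 100) would accept "42" and reject both "101" and "abc". A derived type keeps a pointer to its base rather than copying it, so a chain of restrictions is checked from the built-in type outward.
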

Is it better to write the schemas validator from scratch or to modify the
valid.h file?

The validator will validate only in-memory (DOM-like) document trees. It's very
difficult to implement Schemas with a handler.

   Hmm, actually this is a trade-off; I understand that the other person
interested in this would prefer to work at the SAX level. Anyway, I think
points 1/ 2/ 3/ are independent of this choice. I think that a solution
implementing 4/ on a DOM tree would already be really great; from there,
reusing most of it to make it work on SAX should not be too difficult.
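
One possible way to keep a tree-based validator reusable at the SAX level is to write its core against start/end element events and let a small tree walker replay a tree as those events. The sketch below assumes a toy tree type and a toy rule (a maximum nesting depth); all names are hypothetical, none of them is libxml API, and a real validator would track content models rather than depth.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names; none of this is libxml API. */

/* A toy element tree standing in for a DOM-like document. */
typedef struct demoNode {
    const char *name;
    const struct demoNode *children; /* first child */
    const struct demoNode *next;     /* next sibling */
} demoNode;

/* Validator state that only ever sees start/end events. */
typedef struct demoValidator {
    int depth;
    int maxDepth; /* toy rule: nesting deeper than this is invalid */
    int errors;
} demoValidator;

static void demoStartElement(demoValidator *v, const char *name) {
    assert(v != NULL && name != NULL);
    v->depth++;
    if (v->depth > v->maxDepth) v->errors++;
}

static void demoEndElement(demoValidator *v, const char *name) {
    assert(v != NULL && name != NULL);
    v->depth--;
}

/* Tree driver: replays a tree as the events a SAX parser would send. */
static void demoWalk(demoValidator *v, const demoNode *node) {
    for (; node != NULL; node = node->next) {
        demoStartElement(v, node->name);
        demoWalk(v, node->children);
        demoEndElement(v, node->name);
    }
}

/* Returns 1 if the tree satisfies the toy rule, 0 otherwise. */
static int demoValidateTree(const demoNode *root, int maxDepth) {
    demoValidator v = { 0, maxDepth, 0 };
    demoWalk(&v, root);
    return v.errors == 0;
}
```

With this split, only demoWalk knows about the tree. A SAX-level front end would call demoStartElement and demoEndElement directly from its own callbacks and leave the rule-checking code untouched.
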
   Concerning the reuse of valid.h, well, it's probably better to rewrite
this part from scratch. The datatypes associated with DTD validation are
far simpler than the ones for Schemas, so I don't think much of them can
be reused.
   Reuse of the code from valid.c itself seems very limited too. Some
of the basics, like the memory allocation structure, can probably be reused
by cut/paste/change, but I'm afraid it is not reusable as is. As a side note,
it would be great if the new types/functions were anchored in a new
prefix (xmlSchemasXxxx).

  I will try to follow this project, but I currently have a lot of
pressure to finish libxslt. Still, I'm interested in Schemas *


* I looked at this, I think last summer, but didn't go very far; it's certainly
  useless too ...

Daniel Veillard      | Red Hat Network
veillard redhat com  | libxml Gnome XML XSLT toolkit | Rpmfind RPM search engine
