Re: [xml] Schemas validator

From: "Jon Smirl" <jonsmirl mediaone net>
To: <xml gnome org>
Subject: Re: [xml] Schemas validator
Date: Fri, 11 May 2001 11:17:05 -0400

SGML processors have shown that schemas can be used in more ways than just
validation.

One example is as a dictionary for incoming documents. First you would load
and compile the schema into memory. Then incoming documents would be parsed
and validated against this in-memory structure. Doing this allows the DOM to
be constructed more efficiently. Each element node in the DOM can consist of
a pointer to the schema description for the element. The schema description
for the element will contain an array of legal attribute names. The DOM
element node can then contain an array of pointers to attribute values that
corresponds to the attribute names in the schema structure. Linking the DOM
this way also makes it easy to check edit actions on the DOM against the
schema.
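
A minimal sketch of this layout in Python, where object references stand in
for pointers. The names ElementDecl, ElementNode, attr_names and child_names
are hypothetical and chosen only for illustration; they are not taken from
any existing parser.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ElementDecl:
    """Compiled schema description for one element type (hypothetical)."""
    name: str
    attr_names: List[str]                       # legal attribute names, fixed order
    child_names: List[str] = field(default_factory=list)

    def attr_index(self, attr: str) -> int:
        # list.index raises ValueError when the attribute is not declared.
        return self.attr_names.index(attr)


class ElementNode:
    """DOM element: a reference to its declaration plus one value slot
    per declared attribute. Attribute names are never stored per node."""
    __slots__ = ("decl", "attr_values", "children")

    def __init__(self, decl: ElementDecl):
        self.decl = decl
        self.attr_values = [None] * len(decl.attr_names)
        self.children = []

    def set_attr(self, name: str, value: str) -> None:
        try:
            slot = self.decl.attr_index(name)
        except ValueError:
            raise ValueError(
                f"attribute {name!r} is not allowed on <{self.decl.name}>")
        self.attr_values[slot] = value

    def get_attr(self, name: str):
        return self.attr_values[self.decl.attr_index(name)]

    def append_child(self, child: "ElementNode") -> None:
        # Edit actions are checked against the schema through the same link.
        if child.decl.name not in self.decl.child_names:
            raise ValueError(
                f"<{child.decl.name}> is not allowed inside <{self.decl.name}>")
        self.children.append(child)


# Usage: the declarations are built once, then shared by every node.
book_decl = ElementDecl("book", ["isbn", "lang"], ["chapter"])
chapter_decl = ElementDecl("chapter", ["id"])
book = ElementNode(book_decl)
book.set_attr("lang", "en")
book.append_child(ElementNode(chapter_decl))
```

The sketch only checks names. A real content model would also check order
and cardinality, which is left out here.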

In XSLT 2.0 it is proposed that the schema for the incoming document can be
optionally specified. This will allow the XSLT sheet to be compiled against
the incoming schema. Now it becomes possible to precompute which XSLT
templates are active on each schema node and reduce search time. A lot of
string compares can also be converted to pointer compares since XSLT, the
schema and the incoming document are all using a common string database.
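
A rough sketch of both ideas in Python, under heavy simplification: each
template is assumed to match on a bare element name only, which real XSLT
match patterns do not. NameTable and compile_templates are hypothetical
names, and Python's identity test "is" stands in for a pointer compare.

```python
class NameTable:
    """Shared string database: each distinct name is stored exactly once."""

    def __init__(self):
        self._names = {}

    def intern(self, name):
        # Returns the first object stored for this spelling, so two equal
        # names obtained from the table are always the same object.
        return self._names.setdefault(name, name)


def compile_templates(templates, schema_elements, names):
    """Precompute which templates are active on each schema element.

    templates:       list of (match_name, handler) pairs, in priority order
    schema_elements: element names declared by the schema
    Templates naming an element the schema does not declare are dropped
    in this sketch, on the assumption that they can never fire.
    """
    active = {names.intern(e): [] for e in schema_elements}
    for match_name, handler in templates:
        key = names.intern(match_name)
        if key in active:
            active[key].append(handler)
    return active


def same_name(a, b):
    # Valid only for names that came out of the same NameTable.
    return a is b


# Usage: stylesheet, schema and parser all intern through one table.
names = NameTable()
active = compile_templates(
    [("chapter", "t1"), ("title", "t2"), ("chapter", "t3"), ("footnote", "t4")],
    ["book", "title", "chapter"],
    names,
)
parsed_name = names.intern("".join(["chap", "ter"]))  # built at parse time
handlers = active[parsed_name]
```

At transform time the lookup for an element is a single table access
instead of a scan over every template in the sheet.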

Knowing the input schema also allows node dumping. Some XSLT processors are
trying to dynamically transform extremely large documents without holding
the entire DOM in memory. They do this by starting to build a DOM in memory
and simultaneously running the XSLT transform on it. The trick is knowing
when the stylesheet is done accessing the first few chapters of the book.
Getting these chapters out of memory is called node dumping. Specifying the
incoming schema allows you to compute when it is legal to node dump.
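
One way to picture that computation, as a sketch under strong assumptions:
the schema fixes the order of the top-level elements, the stylesheet
analysis has already produced, for each element type, the set of earlier
element types its templates read, and templates never read forward. The
function names and the "reads" table are hypothetical.

```python
def last_reader_index(sequence, reads):
    """For each element type, find the position of the last element type
    whose templates may still read it.

    sequence: top-level element names in the order the schema requires
    reads:    dict mapping an element name to the names its templates read
    """
    last = {}
    for i, name in enumerate(sequence):
        last[name] = i          # an element's own template always reads it
    for i, name in enumerate(sequence):
        for target in reads.get(name, ()):
            if target in last:
                last[target] = max(last[target], i)
    return last


def dump_schedule(sequence, reads):
    """Map each element type to the types that become legal to dump once
    processing of that element type has finished."""
    last = last_reader_index(sequence, reads)
    schedule = {name: [] for name in sequence}
    for target, i in last.items():
        schedule[sequence[i]].append(target)
    return schedule


# Usage: a book whose index template looks back at the title only.
sequence = ["title", "chapter", "index"]
reads = {"index": {"title"}}
schedule = dump_schedule(sequence, reads)
```

Here the chapters can be dumped as soon as their own templates finish,
while the title has to stay in memory until the index has been processed.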

Jon Smirl
jonsmirl mediaone net

References:
- [xml] Schemas validator
  - From: Marc Emery
- Re: [xml] Schemas validator
  - From: Daniel Veillard
