Re: [xml] Status of the ISO Schematron implementation



On Wed, Jun 07, 2006 at 07:55:35PM +0200, Buchcik, Kasimier wrote:
> Hi,
>
> > is for it. I don't know what's the best approach, and I don't have
> > that much time right now (though a slow but reliable implementation
> > should not be too hard based on the existing code).
>
> [...]
>
> Daniel, can you elaborate on the reason why it would be a slow
> implementation?

  Well, it's gonna be easy to explain because you know about the XPath
pattern streaming stuff :-)
  Suppose a schematron with 100 rules, each firing on a different set of
nodes. The naive implementation takes the document in memory (needed in
all cases) and for each rule a) first evaluates the target nodes, then b)
evaluates the test expression for each node. Since a) is usually
gonna be something like //foo/bar, you end up traversing the tree 100
times. The less naive implementation I attempted was to reuse the patterns
to traverse the tree only once, firing the b) tests as nodes get
found during the traversal.
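
  To make the difference concrete, here is a minimal sketch in Python. It is
not the libxml2 code: the rule representation, the function names and the
sample document are all made up for illustration, and matching a rule by
element tag is a stand-in for a real context pattern like //foo/bar.

```python
import xml.etree.ElementTree as ET

# Hypothetical rules: (context element tag, test predicate, failure message).
# Real Schematron contexts are XPath patterns; a bare tag is a simplification.
RULES = [
    ("bar", lambda n: n.get("id") is not None, "bar needs an id"),
    ("baz", lambda n: (n.text or "").strip() != "", "baz must not be empty"),
]

SAMPLE = '<doc><foo><bar id="1"/><bar/><baz>x</baz><baz/></foo></doc>'


def validate_naive(root, rules):
    """One full tree traversal per rule: a) select nodes, b) run the test."""
    failures, visited = [], 0
    for tag, test, message in rules:
        for node in root.iter():  # a) walks the whole tree again for each rule
            visited += 1
            if node.tag == tag and not test(node):  # b) test on selected nodes
                failures.append(message)
    return failures, visited


def validate_single_pass(root, rules):
    """One traversal in total: fire every applicable test as nodes are found."""
    by_tag = {}
    for tag, test, message in rules:
        by_tag.setdefault(tag, []).append((test, message))
    failures, visited = [], 0
    for node in root.iter():
        visited += 1
        for test, message in by_tag.get(node.tag, ()):
            if not test(node):
                failures.append(message)
    return failures, visited


if __name__ == "__main__":
    root = ET.fromstring(SAMPLE)
    print(validate_naive(root, RULES))
    print(validate_single_pass(root, RULES))
```

  Both functions report the same failures; the naive one visits the tree
once per rule, the single-pass one visits it once. The hard part that the
sketch skips is the dictionary lookup: with real patterns, deciding which
rules apply to a node during the traversal is exactly the streaming pattern
matching problem.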
  But the patterns are hard (aren't they ;-), the subset in schematron is not
the same as for XSD, and the way to map them to nodes isn't the same as
in XSLT patterns (which pattern applies, instead of the list of patterns
applying). That's probably why Eric hit the weird error about not being
able to parse the condition: the streaming patterns failed to parse the
expression (sounds familiar I bet :-).
  So the simplest way to fix the correctness of the implementation would be
to drop the streaming node selection of the current implementation,
go back to good old raw XPath evaluation for a), and then fix the remaining
edge bugs which are gonna pop up.
  Of course a prerequisite would be to have a final version of the ISO
spec if this is available, because IIRC it diverged a bit from
the previous version.

Daniel

-- 
Daniel Veillard      | Red Hat http://redhat.com/
veillard redhat com  | libxml GNOME XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/


