Re: [xml] dtd and relaxng performance



On Tue, Feb 17, 2004 at 04:09:38PM +0100, Petr Pajas wrote:
[...]
The files used are "dictionary-like" XML files taken from a real
application. It is research-related data, so I replaced most of the usable
values in it but preserved the structure. If you wish, you can get the
files and the DTD from http://pajas.matfyz.cz/dtdvalid-test.zip

  Okay, can you file this in bugzilla?

My question is whether you are aware of anything in the DTD/RelaxNG
validation code that could make it scale O(n^2) or so.

  For RelaxNG, in the face of indeterminism there could be
some non-linear behaviour, but for DTD or XSD schemas I can't
see where this would come from... You're building the tree in
those examples, so this will thrash memory. Retesting with the
--stream option might show a different behaviour; if it does, then
the problem is likely related to the tree building and not to the
validation itself.
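
  A minimal sketch of that comparison, assuming xmllint is on the PATH
and using only its documented --noout, --stream, --valid and --relaxng
options. The helper names and the file name test.xml are hypothetical,
not taken from the test archive:

```python
import subprocess
import time


def xmllint_cmd(xml_path, stream=False, relaxng=None):
    """Build an xmllint command line (hypothetical helper).

    Validates against the document's DTD (--valid) unless a RelaxNG
    schema path is given, in which case --relaxng is used instead.
    """
    cmd = ["xmllint", "--noout"]
    if stream:
        # Use the streaming reader instead of building the full tree.
        cmd.append("--stream")
    if relaxng is not None:
        cmd += ["--relaxng", relaxng]
    else:
        cmd.append("--valid")
    cmd.append(xml_path)
    return cmd


def time_run(cmd):
    """Return the wall-clock seconds taken by one run of cmd."""
    start = time.perf_counter()
    subprocess.run(cmd, check=False,
                   stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return time.perf_counter() - start


if __name__ == "__main__":
    xml = "test.xml"  # assumption: replace with one of the real test files
    for stream in (False, True):
        cmd = xmllint_cmd(xml, stream=stream)
        print(" ".join(cmd), "%.2fs" % time_run(cmd))
```

If the --stream run is markedly faster on the large files, that points
at tree building rather than at the validation code.
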
  I remember doing a --stream --relaxng validation of a 4.5 Gbyte
file last summer. I think there is one problem which needs fixing
before that test can be rerun right now, but nothing non-linear.
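
  One rough way to check whether timings really grow non-linearly is to
fit the slope of log(time) against log(size): a slope near 1 suggests
linear scaling, near 2 suggests O(n^2). The sketch below is only an
illustration; the function name is hypothetical, and any numbers fed
to it must come from your own measurements:

```python
import math


def growth_exponent(sizes, times):
    """Least-squares slope of log(time) versus log(size).

    sizes and times are equal-length sequences of positive numbers,
    with at least two distinct sizes.
    """
    xs = [math.log(s) for s in sizes]
    ys = [math.log(t) for t in times]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den
```

Timing the same validation on files of, say, 1x, 2x, 4x and 8x the
base size gives enough points for a first estimate.
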

  The best is probably to gprof the application and see what kind of
hot spot emerges. KCachegrind allows an even finer analysis, at the
expense of setting it up and far more CPU consumption.

Daniel

-- 
Daniel Veillard      | Red Hat Network https://rhn.redhat.com/
veillard redhat com  | libxml GNOME XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/


