Re: [xml] xmlSchemaParse function



Hi,

On Mon, 2005-09-26 at 10:30 -0400, Daniel Veillard wrote:
On Mon, Sep 26, 2005 at 10:00:27AM -0400, Daniel Veillard wrote:
On Mon, Sep 26, 2005 at 02:16:28AM +0200, Luka Por wrote:
Hello.

The library is great, but between version 2-2.6.20 and version 2-2.6.22 I
discovered some difference.

I am using xmlSchemaParse(...) to validate a document.
In version 2-2.6.20 xmlSchemaParse takes 4 seconds, but now in version
2-2.6.22 it takes 16 seconds (same in version 2-2.6.21).

  You should follow up with Kasimier, but I don't see how compilation of the
schemas itself could take 4 seconds of CPU time. If this is the case, that will
be easy to spot. What I expect is that there is some I/O going on which is
slowing things down (fetching from a slow HTTP server, for example), and in that
case simply copying all resources locally and making sure you don't have
remote references anymore should bring compilation of the schemas back to a
few milliseconds, not seconds.

  Okay, this is not a network latency problem, we are looking at it...

We pinpointed the problem to be related to the construction of the
content model for a specific complex type in
http://www.e-dokumenti.si/sheme/xsd/eSlog_1_4_PreprostiRacun.xsd

As a first workaround for you, and one that probably leads to a better
schema design, I recommend changing that type definition.
We have nested occurrence information here, which is probably
not intended:

<xs:element name="Racun">
  <xs:complexType>
    <xs:choice minOccurs="0" maxOccurs="unbounded">
      <xs:element ref="GlavaRacuna"/>
      <xs:element ref="DatumiRacuna" maxOccurs="unbounded"/>
      <xs:element ref="Lokacije" maxOccurs="unbounded"/>
      <xs:element ref="PoljubnoBesedilo" minOccurs="0" maxOccurs="unbounded"/>

      [...]
  
    </xs:choice>
  </xs:complexType>
</xs:element>

Since the surrounding xs:choice already has a maxOccurs of "unbounded",
any contained item is valid any number of times and in any order. So the
"unbounded" on the element references is redundant.
Additionally, element "GlavaRacuna" has a min/maxOccurs of 1, which
inside this choice should produce the same results as the occurrence
information for "PoljubnoBesedilo". The nested occurrence information
just makes the computation of the content model very expensive.
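
A minimal sketch of what the changed declaration could look like, assuming
only the maxOccurs="unbounded" attributes on the element references are
dropped and everything else (including the elided part) stays as it is:

<xs:element name="Racun">
  <xs:complexType>
    <xs:choice minOccurs="0" maxOccurs="unbounded">
      <xs:element ref="GlavaRacuna"/>
      <xs:element ref="DatumiRacuna"/>
      <xs:element ref="Lokacije"/>
      <xs:element ref="PoljubnoBesedilo" minOccurs="0"/>

      [...]

    </xs:choice>
  </xs:complexType>
</xs:element>

The set of accepted documents should stay the same, because the repetition
is already expressed by the unbounded xs:choice.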

Removing the "unbounded" occurrence from the contained items reduces
the processing time (with local files) to:

kbu librax:/data/home/kbuchcik/gnomecvs/libxml2$ time xmllint --noout --schema eSlog_1_4_PreprostiRacun.xsd eSlog1.xml
eSlog1.xml validates

real    0m0.028s
user    0m0.021s
sys     0m0.007s


Regards,

Kasimier



