RE: [xml] libxml2 v2.5.7 appears much slower than libxml2 v2.4.24



Hi Daniel

* libxml2 v2.5.7 seems dramatically (about 20x) slower than libxml2 v2.4.24
in my environment:

$ time $PD/libxml2/2.4.24/bin/xmllint --noout --postvalid --xinclude update_schema.dbk
$PD/libxml2/2.4.24/bin/xmllint --noout --postvalid --xinclude   2.75s user 0.71s system 70% cpu 4.895 total

$ time $PD/libxml2/2.5.7/bin/xmllint --noout --postvalid --xinclude update_schema.dbk
$PD/libxml2/2.5.7/bin/xmllint --noout --postvalid --xinclude update_schema.db  73.19s user 7.62s system 83% cpu 1:36.34 total

 Ouch, is that on the same box?

Yep.



The very first thing to do would be to run with the --timing flag to
try to identify if the problem is in the XInclude processing or the
DTD post validation or split between them.

Ok, here is the output of that:

$ $PD/libxml2/2.4.24/bin/xmllint --noout --postvalid --xinclude --timing update_schema.dbk 
Parsing took 123 ms
Xinclude processing took 2876 ms
Validating took 3 ms
Freeing took 42 ms

zsh  198  2  ruscoekm xldn0199dap  /view/eq_cmad_rit_ruscoekm_unix  
~/my_tree/links/SAFE/src/apps/utilities/docs
$ $PD/libxml2/2.5.7/bin/xmllint --noout --postvalid --xinclude --timing update_schema.dbk
Parsing took 125 ms
Xinclude processing took 83357 ms
Validating took 8 ms
Freeing took 44 ms

So, that is pretty clear :-)
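
For a phase-by-phase comparison, the two --timing outputs above can be fed
through a small script. This is only an illustrative Python sketch: the helper
names parse_timings and slowdown are made up here, and the numbers are copied
from the two runs shown above.

```python
import re

# Timing lines copied verbatim from the two xmllint --timing runs above.
OLD = """Parsing took 123 ms
Xinclude processing took 2876 ms
Validating took 3 ms
Freeing took 42 ms"""

NEW = """Parsing took 125 ms
Xinclude processing took 83357 ms
Validating took 8 ms
Freeing took 44 ms"""


def parse_timings(text):
    """Map each phase name to its duration in milliseconds."""
    pattern = re.compile(r"^(.+?) took (\d+) ms$", re.MULTILINE)
    return {phase: int(ms) for phase, ms in pattern.findall(text)}


def slowdown(old, new):
    """Per-phase ratio new/old, rounded to one decimal place."""
    return {phase: round(new[phase] / old[phase], 1) for phase in old}


old, new = parse_timings(OLD), parse_timings(NEW)
ratios = slowdown(old, new)
for phase, ratio in ratios.items():
    print(f"{phase}: {old[phase]} ms -> {new[phase]} ms ({ratio}x)")
```

On these figures the XInclude phase is roughly 29x slower, while parsing,
validating and freeing stay within a few milliseconds of each other.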



 Okay I have a guess:
   1/ your XInclude hierarchy is large (approx 30 files).
   2/ all of the Included parts reference the DocBook DOCTYPE

Correct on both counts.  (We make very heavy use of XInclude.)

XInclude requires IDs to be coalesced, which means each of the included
parts must be parsed with its DTD loaded. The older version of XInclude wasn't
doing this and hence wasn't compliant. The new one does it, is fully compliant,
but now loads the DocBook DTD in memory for each included instance.
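
To get a feel for how many include instances there are compared with distinct
included files, the xi:include elements can be counted per target document.
The Python sketch below is an assumption-laden illustration: include_targets
is a hypothetical helper, it uses a simple regular expression rather than a
real XML parser, and SAMPLE holds only a few lines taken from
update_schema.dbk (attached at the end of this mail).

```python
import re
from collections import Counter

# A handful of xi:include lines taken from update_schema.dbk.
SAMPLE = '''
<arg><xi:include href="synopses.dbk#xpointer(id('A')/child::*)"/></arg>
<arg><xi:include href="synopses.dbk#xpointer(id('b')/child::*)"/></arg>
<varlistentry><xi:include href="options.dbk#xpointer(id('Q--debug')/child::*)"/></varlistentry>
<xi:include href="exit_statuses.dbk#xpointer(/article/variablelist)"/>
'''


def include_targets(xml_text):
    """Count xi:include instances per target file (fragment stripped)."""
    hrefs = re.findall(r'<xi:include\s+href="([^"]+)"', xml_text)
    return Counter(href.split("#", 1)[0] for href in hrefs)


counts = include_targets(SAMPLE)
print("include instances:", sum(counts.values()))
print("distinct files:   ", len(counts))
for target, n in counts.most_common():
    print(f"  {n:3d}  {target}")
```

Running the same function over the whole file would show whether the number
of instances is much larger than the number of distinct files.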

So this may be normal, though more than one minute of processing for
a document sounds horribly long, even when loading the DocBook DTD 30 times.
I think I timed a single load at below 100 ms on my machine (some time ago),
and 30 x 100 ms is 3 seconds, not 60, so there is still something really strange.
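
The back-of-the-envelope estimate above can be written out against the
measured figure. This is only a worked-arithmetic sketch in Python: the 30-file
count and the 100 ms per DTD load are the rough estimates quoted in this
thread, not measurements, and the variable names are invented for the example.

```python
# Rough estimates quoted in this thread (not measurements).
INCLUDED_FILES = 30
ESTIMATED_DTD_LOAD_MS = 100

# Measured XInclude time from the 2.5.7 --timing run above.
MEASURED_XINCLUDE_MS = 83357

expected_ms = INCLUDED_FILES * ESTIMATED_DTD_LOAD_MS
unexplained_ms = MEASURED_XINCLUDE_MS - expected_ms
implied_per_file_ms = MEASURED_XINCLUDE_MS / INCLUDED_FILES

print(f"expected from DTD loading: {expected_ms} ms")
print(f"measured XInclude time:    {MEASURED_XINCLUDE_MS} ms")
print(f"not explained by estimate: {unexplained_ms} ms")
print(f"implied cost per file:     {implied_per_file_ms:.0f} ms")
```

If both estimates were right, DTD loading would explain only about 3 of the
83 seconds measured.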

I would probably need a (private/obfuscated if needed) copy of the data 
to make a cleaner analysis.

I have included at the end of this mail:
xinclude.dtd (our customised DTD)
update_schema.dbk
synopses.dbk (one of the XIncluded files)

One obvious way to try to speed this up is to reuse a DTD pool for the
set of XIncluded documents. This could bring performance back close
to its initial level. But I would need to confirm the problem first.
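
The idea of a DTD pool can be pictured as a cache keyed by the DTD's public
and system identifiers, so that an expensive load happens once per distinct
DTD rather than once per included document. The Python sketch below is purely
conceptual: DtdPool and fake_loader are hypothetical names, nothing here is
libxml2 API, and it does not describe how libxml2 does or will implement this.

```python
class DtdPool:
    """Cache loaded DTDs keyed by (public_id, system_id)."""

    def __init__(self, loader):
        self._loader = loader  # callable doing the expensive load
        self._cache = {}
        self.loads = 0         # number of real loads performed

    def get(self, public_id, system_id):
        key = (public_id, system_id)
        if key not in self._cache:
            self._cache[key] = self._loader(public_id, system_id)
            self.loads += 1
        return self._cache[key]


def fake_loader(public_id, system_id):
    # Stand-in for parsing a DTD; a real loader would read and parse the file.
    return {"public_id": public_id, "system_id": system_id}


DOCBOOK = ("-//OASIS//DTD DocBook XML V4.2//EN", "docbookx.dtd")

pool = DtdPool(fake_loader)
# Thirty included documents all referencing the same DOCTYPE.
dtds = [pool.get(*DOCBOOK) for _ in range(30)]
print("lookups:", len(dtds), "real loads:", pool.loads)
```

A cache keyed only on the external identifiers would presumably need extra
care for documents that also carry an internal subset, as update_schema.dbk
does.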

Ok, let me know what you think / if you need any more info.

Cheers
Kevin



* Here is the text of xinclude.dtd:

<!-- 
Implementation of the XInclude standard.

To use, reference this DTD as an entity and insert it into your doctype declaration.

Sagar R. Shah
-->

<!-- public identifier "-//UBS//CORE//DTD XInclude V1.0//EN" -->

<!ELEMENT xi:include (xi:fallback)?>
                  <!ATTLIST xi:include
                      xmlns:xi  CDATA #FIXED "http://www.w3.org/2001/XInclude"
                      href       CDATA                                   #REQUIRED
                      parse      (xml|text)                              "xml"
                      encoding   CDATA                                   #IMPLIED
                  >
<!ELEMENT xi:fallback ANY>
                  <!ATTLIST xi:fallback
                      xmlns:xi   CDATA #FIXED   "http://www.w3.org/2001/XInclude"
                  >

<!--
This permits the xml:base attribute in just about every DocBook element.
It is required to get libxml2 v2.4.24 and above to work.
It will need to be removed when the standard DTD is updated to support xml:base.
-->

<!ENTITY % local.common.attrib "xml:base  CDATA  #IMPLIED">



* Here is the text of update_schema.dbk:

<?xml version="1.0"?>
<!DOCTYPE refentry PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN" "docbookx.dtd"
[
<!ENTITY % xinclude PUBLIC "-//UBS//CORE//DTD XInclude V1.0//EN" "xinclude.dtd">
%xinclude;
]>


<refentry>
  <refnamediv>
    <refname>update_schema</refname>
    <refpurpose>brings an entire schema up to date</refpurpose>
  </refnamediv>
  
  <refsynopsisdiv>
    <cmdsynopsis>
      <command>update_schema</command><sbr/><sbr/>
      <arg><xi:include href="synopses.dbk#xpointer(id('A')/child::*)"/></arg><sbr/>
      <arg><xi:include href="synopses.dbk#xpointer(id('b')/child::*)"/></arg><sbr/>
      <arg><xi:include href="synopses.dbk#xpointer(id('c')/child::*)"/></arg><sbr/>
      <arg><xi:include href="synopses.dbk#xpointer(id('D')/child::*)"/></arg><sbr/>
      <arg><xi:include href="synopses.dbk#xpointer(id('d')/child::*)"/></arg><sbr/>
      <arg><xi:include href="synopses.dbk#xpointer(id('E')/child::*)"/></arg><sbr/>
      <arg><xi:include href="synopses.dbk#xpointer(id('e')/child::*)"/></arg><sbr/>
      <arg><xi:include href="synopses.dbk#xpointer(id('F')/child::*)"/></arg><sbr/>
      <arg><xi:include href="synopses.dbk#xpointer(id('f')/child::*)"/></arg><sbr/>
      <arg><xi:include href="synopses.dbk#xpointer(id('h')/child::*)"/></arg><sbr/>
      <arg><xi:include href="synopses.dbk#xpointer(id('l')/child::*)"/></arg><sbr/>
      <arg><xi:include href="synopses.dbk#xpointer(id('M')/child::*)"/></arg><sbr/>
      <arg><xi:include href="synopses.dbk#xpointer(id('P')/child::*)"/></arg><sbr/>
      <arg><xi:include href="synopses.dbk#xpointer(id('p')/child::*)"/></arg><sbr/>
      <arg><xi:include href="synopses.dbk#xpointer(id('r')/child::*)"/></arg><sbr/>
      <arg><xi:include href="synopses.dbk#xpointer(id('S')/child::*)"/></arg><sbr/>
      <arg><xi:include href="synopses.dbk#xpointer(id('s')/child::*)"/></arg><sbr/>
      <arg><xi:include href="synopses.dbk#xpointer(id('U')/child::*)"/></arg><sbr/>
      <arg><xi:include href="synopses.dbk#xpointer(id('v')/child::*)"/></arg><sbr/>
      <arg><xi:include href="synopses.dbk#xpointer(id('W')/child::*)"/></arg><sbr/>
      <arg><xi:include href="synopses.dbk#xpointer(id('w')/child::*)"/></arg><sbr/><sbr/>
      <arg><xi:include href="synopses.dbk#xpointer(id('schema_objects')/child::*)"/></arg>
    </cmdsynopsis>
  </refsynopsisdiv>
  
  <refsection>
    <title>Examples</title>

    <screen>
# 1. Use DSQUERY, USERNAME and PASSWORD environment variables
#    to connect to the database.
# 2. Read all repository files from current directory (and sub-directories).
# 3. Report on necessary changes, but do not implement them.

$ update_schema</screen>

    <screen>
# 1. Report on and implement necessary changes.
# 2. Create a log file of the invocation (highly recommended).

$ update_schema --exec_sql <co id="updateschema-examples-execsql-co" linkends="updateschema-examples-execsql"/>
                --log_file safe.log <co id="updateschema-examples-logfile-co" linkends="updateschema-examples-logfile"/></screen>

    <calloutlist>
      <callout id="updateschema-examples-execsql" arearefs="updateschema-examples-execsql-co">
        <para>Report on and implement necessary changes.</para>
      </callout>

      <callout id="updateschema-examples-logfile" arearefs="updateschema-examples-logfile-co">
        <para>Create a log file of the invocation (highly recommended).</para>
      </callout>
    </calloutlist>

    <screen>
# 1. Print out the required DDL to implement necessary changes.

$ update_schema --print_sql</screen>

    <screen>
# 1. Update all objects, whether they have changed or not.
# 2. Update only defaults, rules and table indexes.
# 3. Do not drop objects from the model or tempdb databases.

$ update_schema --components indexes
                --forced_update
                --shared_dbs model,tempdb
                defaults rules tables</screen>

    <screen>
# 1. Re-format the repository files.
# 2. Re-format only the stock tables.

$ update_schema --format_rps
                'tables,stock_*'</screen>
  </refsection>
  
  <refsection>
    <title>Description</title>
    
    <xi:include href="../../../website/htdocs/panels/panel_man_pages.dbk"/> 
  
    <xi:include href="../../../website/htdocs/manuals/user_guide.dbk#xpointer(id('ProgramDescriptions-UpdateSchema')/child::title/following-sibling::*)"/>
  </refsection>
  
  <refsection>
    <title>Options</title>
    
    <variablelist>
      <varlistentry><xi:include href="options.dbk#xpointer(id('Q--safe_dbs')/child::*)"/></varlistentry>
      <varlistentry><xi:include href="options.dbk#xpointer(id('Q--batch_mode')/child::*)"/></varlistentry>
      <varlistentry><xi:include href="options.dbk#xpointer(id('Q--cfg_file')/child::*)"/></varlistentry>
      <varlistentry><xi:include href="options.dbk#xpointer(id('Q--object_dbs')/child::*)"/></varlistentry>
      <varlistentry><xi:include href="options.dbk#xpointer(id('Q--debug')/child::*)"/></varlistentry>
      <varlistentry><xi:include href="options.dbk#xpointer(id('Q--ignore_all_errors')/child::*)"/></varlistentry>
      <varlistentry><xi:include href="options.dbk#xpointer(id('Q--exec_sql')/child::*)"/></varlistentry>
      <varlistentry><xi:include href="options.dbk#xpointer(id('Q--forced_update')/child::*)"/></varlistentry>
      <varlistentry><xi:include href="options.dbk#xpointer(id('Q--format_rps')/child::*)"/></varlistentry>
      <varlistentry><xi:include href="options.dbk#xpointer(id('Q--help')/child::*)"/></varlistentry>
      <varlistentry><xi:include href="options.dbk#xpointer(id('Q--log_file')/child::*)"/></varlistentry>
      <varlistentry><xi:include href="options.dbk#xpointer(id('Q--components')/child::*)"/></varlistentry>
      <varlistentry><xi:include href="options.dbk#xpointer(id('Q--password')/child::*)"/></varlistentry>
      <varlistentry><xi:include href="options.dbk#xpointer(id('Q--print_sql')/child::*)"/></varlistentry>
      <varlistentry><xi:include href="options.dbk#xpointer(id('Q--read_from')/child::*)"/></varlistentry>
      <varlistentry><xi:include href="options.dbk#xpointer(id('Q--server')/child::*)"/></varlistentry>
      <varlistentry><xi:include href="options.dbk#xpointer(id('Q--shared_dbs')/child::*)"/></varlistentry>
      <varlistentry><xi:include href="options.dbk#xpointer(id('Q--user')/child::*)"/></varlistentry>
      <varlistentry><xi:include href="options.dbk#xpointer(id('Q--version')/child::*)"/></varlistentry>
      <varlistentry><xi:include href="options.dbk#xpointer(id('Q--web')/child::*)"/></varlistentry>
      <varlistentry><xi:include href="options.dbk#xpointer(id('Q--write_to')/child::*)"/></varlistentry>
    </variablelist>
  </refsection>
  
  <refsection>
    <title>Exit Status</title>
    
    <xi:include href="exit_statuses.dbk#xpointer(/article/variablelist)"/>
  </refsection>
  
  <refsection>
    <title>Files</title>
    
    <para>
      Examples of relevant repository files are as follows:
      
      <variablelist>
        <varlistentry><xi:include href="rps_files.dbk#xpointer(id('COLUMNS.rps')/child::*)"/></varlistentry>
        <varlistentry><xi:include href="rps_files.dbk#xpointer(id('ORA_COLUMNS.rps')/child::*)"/></varlistentry>
        <varlistentry><xi:include href="rps_files.dbk#xpointer(id('DBMS.rps')/child::*)"/></varlistentry>
        <varlistentry><xi:include href="rps_files.dbk#xpointer(id('FILTERS.rps')/child::*)"/></varlistentry>
        <varlistentry><xi:include href="rps_files.dbk#xpointer(id('FUNCTIONS.rps')/child::*)"/></varlistentry>
        <varlistentry><xi:include href="rps_files.dbk#xpointer(id('PACKAGES.rps')/child::*)"/></varlistentry>
        <varlistentry><xi:include href="rps_files.dbk#xpointer(id('REP_COLUMNS.rps')/child::*)"/></varlistentry>
        <varlistentry><xi:include href="rps_files.dbk#xpointer(id('SAFE_DBS.rps')/child::*)"/></varlistentry>
        <varlistentry><xi:include href="rps_files.dbk#xpointer(id('STORED_PROCS.rps')/child::*)"/></varlistentry>
        <varlistentry><xi:include href="rps_files.dbk#xpointer(id('ORA_STORED_PROCS.rps')/child::*)"/></varlistentry>
        <varlistentry><xi:include href="rps_files.dbk#xpointer(id('TRIGGERS.rps')/child::*)"/></varlistentry>
        <varlistentry><xi:include href="rps_files.dbk#xpointer(id('ORA_TRIGGERS.rps')/child::*)"/></varlistentry>
        <varlistentry><xi:include href="rps_files.dbk#xpointer(id('VIEWS.rps')/child::*)"/></varlistentry>
        <varlistentry><xi:include href="rps_files.dbk#xpointer(id('ORA_VIEWS.rps')/child::*)"/></varlistentry>
        <varlistentry><xi:include href="rps_files.dbk#xpointer(id('ora_fn1.fn')/child::*)"/></varlistentry>
        <varlistentry><xi:include href="rps_files.dbk#xpointer(id('ora_pkg1.pks')/child::*)"/></varlistentry>
        <varlistentry><xi:include href="rps_files.dbk#xpointer(id('ora_pkg1.pkb')/child::*)"/></varlistentry>
        <varlistentry><xi:include href="rps_files.dbk#xpointer(id('proc1.proc')/child::*)"/></varlistentry>
        <varlistentry><xi:include href="rps_files.dbk#xpointer(id('ora_proc1.proc')/child::*)"/></varlistentry>
        <varlistentry><xi:include href="rps_files.dbk#xpointer(id('table1.table')/child::*)"/></varlistentry>
        <varlistentry><xi:include href="rps_files.dbk#xpointer(id('ora_table1.table')/child::*)"/></varlistentry>
        <varlistentry><xi:include href="rps_files.dbk#xpointer(id('file1.tpl')/child::*)"/></varlistentry>
        <varlistentry><xi:include href="rps_files.dbk#xpointer(id('ora_file1.tpl')/child::*)"/></varlistentry>
        <varlistentry><xi:include href="rps_files.dbk#xpointer(id('view1.view')/child::*)"/></varlistentry>
        <varlistentry><xi:include href="rps_files.dbk#xpointer(id('ora_view1.view')/child::*)"/></varlistentry>
      </variablelist>
    </para>
  </refsection>
  
  <refsection>
    <title>Authors</title>
    
    <para>
      Kevin Ruscoe, Colin Woodford
    </para>
  </refsection>
</refentry>



* Here is the text of synopses.dbk:

<?xml version="1.0"?>
<!DOCTYPE article PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN" "docbookx.dtd">

<article>
  <cmdsynopsis>
    <arg id="A">
      <group choice="req"><arg>--safe_dbs</arg><arg>-A</arg></group>
      <arg choice="plain"><replaceable>database</replaceable></arg>
      <arg choice="plain" rep="repeat"><arg>,<replaceable>database</replaceable></arg></arg>
    </arg>
    <sbr/>

    <arg id="b"><group choice="req"><arg>--batch_mode</arg><arg>-b</arg></group></arg><sbr/>
    
    <arg id="c">
      <group choice="req"><arg>--cfg_file</arg><arg>-c</arg></group>
      <arg choice="plain"><replaceable>configuration file</replaceable></arg>
    </arg>
    <sbr/>

    <arg id="D">
      <group choice="req"><arg>--object_dbs</arg><arg>-D</arg></group>
      <arg choice="plain"><replaceable>database</replaceable></arg>
      <arg choice="plain" rep="repeat"><arg>,<replaceable>database</replaceable></arg></arg>
    </arg>
    <sbr/>

    <arg id="d">
      <group choice="req"><arg>--debug</arg><arg>-d</arg></group>
      <arg choice="plain" rep="repeat">
        <group choice="req">
          <arg>bug_report</arg>
          <arg>dbi_trace</arg>
          <arg>diagnostics</arg>
          <arg>filters</arg>
          <arg>grammar</arg>
          <arg>print_sql</arg>
          <arg>stack</arg>
          <arg>sybase</arg>
          <arg>warnings</arg>
          <arg>all</arg>
        </group>
      </arg>
    </arg>
    <sbr/>

    <arg id="E"><group choice="req"><arg>--ignore_all_errors</arg><arg>-E</arg></group></arg><sbr/>
    <arg id="e"><group choice="req"><arg>--exec_sql</arg><arg>-e</arg></group></arg><sbr/>
    <arg id="F"><group choice="req"><arg>--forced_update</arg><arg>-F</arg></group></arg><sbr/>
    <arg id="f"><group choice="req"><arg>--format_rps</arg><arg>-f</arg></group></arg><sbr/>
    <arg id="h"><group choice="req"><arg>--help</arg><arg>-h</arg></group></arg><sbr/>
    
    <arg id="l">
      <group choice="req"><arg>--log_file</arg><arg>-l</arg></group>
      <arg choice="plain"><replaceable>pathname of log file</replaceable></arg>
    </arg>
    <sbr/>
    
    <arg id="M">
      <group choice="req"><arg>--components</arg><arg>-M</arg></group>
      <arg>^</arg>
      <arg choice="plain">
        <group choice="req">
          <arg>columns</arg>
          <arg>common_keys</arg>
          <arg>foreign_keys</arg>
          <arg>indexes</arg>
          <arg>permissions</arg>
          <arg>primary_keys</arg>
          <arg>shortnames</arg>
          <arg>triggers</arg>
        </group>
      </arg>
      <arg choice="plain" rep="repeat"><arg>,<replaceable>component</replaceable></arg></arg>
    </arg>
    <sbr/>
    
    <arg id="O">
      <group choice="req"><arg>--object_owner</arg><arg>-O</arg></group>
      <arg choice="plain"><replaceable>name</replaceable></arg>
    </arg>
    
    <arg id="P">
      <group choice="req"><arg>--password</arg><arg>-P</arg></group>
      <arg choice="plain"><replaceable>password</replaceable></arg>
    </arg>
    <sbr/>
    
    <arg id="p"><group choice="req"><arg>--print_sql</arg><arg>-p</arg></group></arg>
    
    <arg id="r">
      <group choice="req"><arg>--read_from</arg><arg>-r</arg></group>
      <arg choice="plain"><replaceable>alternate directory</replaceable></arg>
    </arg>
    <sbr/>
    
    <arg id="S">
      <group choice="req"><arg>--server</arg><arg>-S</arg></group>
      <arg choice="plain"><replaceable>server</replaceable></arg>
    </arg>
    <sbr/>
    
    <arg id="s">
      <group choice="req"><arg>--shared_dbs</arg><arg>-s</arg></group>
      <arg choice="plain"><replaceable>database</replaceable></arg>
      <arg choice="plain" rep="repeat"><arg>,<replaceable>database</replaceable></arg></arg>
    </arg>
    <sbr/>
    
    <arg id="U">
      <group choice="req"><arg>--user</arg><arg>-U</arg></group>
      <arg choice="plain"><replaceable>user</replaceable></arg>
    </arg>
    <sbr/>

    <arg id="v"><group choice="req"><arg>--version</arg><arg>-v</arg></group></arg><sbr/>
    <arg id="W"><group choice="req"><arg>--web</arg><arg>-W</arg></group></arg><sbr/>
    
    <arg id="w">
      <group choice="req"><arg>--write_to</arg><arg>-w</arg></group>
      <arg choice="plain"><replaceable>alternate directory</replaceable></arg>
    </arg>
    <sbr/>

    <arg id="schema_objects">
      <arg choice="plain" rep="repeat">
        <group choice="req">
          <arg>DEFAULTS</arg>
          <arg>MESSAGES</arg>
          <arg>RULES</arg>
          <arg>TYPES</arg>
          <arg>
            '
            <group choice="req"><arg>functions</arg><arg>packages</arg><arg>procs</arg><arg>tables</arg><arg>views</arg></group>
            <arg choice="plain" rep="repeat"><arg>,<replaceable>file glob</replaceable></arg></arg>
            '
          </arg>
        </group>
      </arg>
    </arg>
    <sbr/>
  </cmdsynopsis>
</article>
