[xml] Beta release of 2.6.0beta2



  Okay, this new beta release is available from ftp://xmlsoft.org/test/

This includes some serious changes to the parser: the SAX2 rewrite should
be mostly finished now, and the parser now uses string interning.
The xmlTextReader interface has been changed significantly internally:
the subtrees it builds reuse the pool of strings from the parser, and
nodes and attributes are recycled while scanning the document. As a
result the reader interface is more than twice as fast as before on a
number of tests I could run. It's still slower than the pure SAX2 API,
by a factor of approximately 2, but it's much better.
Oh, and I started adding xmlreader APIs which return constant strings
instead of ones which need to be deallocated.
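
For readers unfamiliar with string interning, here is a minimal sketch of
the general idea. It is a toy pool written for illustration, not the code
in dict.h: the names string_pool, pool_intern and friends are hypothetical.
Each distinct string is stored once, so callers get back a constant pointer
they never free, and equal names can be compared by pointer.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical toy pool for illustration; not libxml2's dict.h code. */
#define POOL_BUCKETS 64

typedef struct pool_entry {
    struct pool_entry *next;
    char *str;                      /* the single stored copy */
} pool_entry;

typedef struct {
    pool_entry *buckets[POOL_BUCKETS];
} string_pool;

static unsigned pool_hash(const char *s) {
    unsigned h = 5381;
    while (*s) h = h * 33u + (unsigned char)*s++;
    return h % POOL_BUCKETS;
}

string_pool *pool_new(void) {
    return calloc(1, sizeof(string_pool));
}

/* Return the pool's shared copy of s, adding it on first use.
 * The result is owned by the pool: callers must not free it. */
const char *pool_intern(string_pool *pool, const char *s) {
    assert(pool != NULL && s != NULL);
    unsigned h = pool_hash(s);
    for (pool_entry *e = pool->buckets[h]; e; e = e->next)
        if (strcmp(e->str, s) == 0) return e->str;

    pool_entry *e = malloc(sizeof(*e));
    if (!e) return NULL;
    size_t n = strlen(s) + 1;
    e->str = malloc(n);
    if (!e->str) { free(e); return NULL; }
    memcpy(e->str, s, n);
    e->next = pool->buckets[h];
    pool->buckets[h] = e;
    return e->str;
}

/* Free every stored string at once, along with the pool itself. */
void pool_free(string_pool *pool) {
    if (!pool) return;
    for (int i = 0; i < POOL_BUCKETS; i++) {
        pool_entry *e = pool->buckets[i];
        while (e) {
            pool_entry *next = e->next;
            free(e->str);
            free(e);
            e = next;
        }
    }
    free(pool);
}
```

Interning "title" twice through pool_intern() yields the same pointer both
times, which is why strings handed out this way can be declared const and
need no per-call deallocation: they live until pool_free().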

  Among the things I'm tempted to do for 2.6.0:
     - allow the standard interface of the parser to build documents
       using the pool of strings from the parser, then attach the dictionary
       to the resulting document, and modify the general tree handling
       API to handle this specifically when copying or freeing a document.
       The upside can be a significant speed and size boost for standard
       processing (parse/read/write/free), but it can introduce some
       hazards for editing, especially when migrating fragments from
       one document to another (which would not use the same name
       dictionaries)!
       Opinions are welcome about this; there are many ways to handle and
       tweak the per-document string dictionary.
     - allow reusing a parser context to parse multiple documents;
       this could bring a significant speedup when processing a large
       set of similar documents, especially if they are based on the
       same vocabularies.
     - update the Python bindings to use the new Const interfaces to
       the xmlReader instead of the old ones.
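
To make the fragment-migration hazard from the first item concrete, here is
a small sketch under stated assumptions. The types and functions (name_pool,
toy_node, names_owns, node_adopt) are hypothetical and are not libxml2
structures. A node name points into its document's pool, so moving the node
to another document has to re-intern the name in the destination pool, or
the pointer dangles once the source document is freed.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical toy types for illustration; not libxml2's structures. */
typedef struct name_entry {
    struct name_entry *next;
    char *str;
} name_entry;

typedef struct { name_entry *head; } name_pool;   /* one per document */

typedef struct {
    const char *name;   /* points into the owning document's pool */
} toy_node;

/* Return the pool's shared copy of s, adding it on first use. */
const char *names_intern(name_pool *pool, const char *s) {
    assert(pool != NULL && s != NULL);
    for (name_entry *e = pool->head; e; e = e->next)
        if (strcmp(e->str, s) == 0) return e->str;

    name_entry *e = malloc(sizeof(*e));
    if (!e) return NULL;
    size_t n = strlen(s) + 1;
    e->str = malloc(n);
    if (!e->str) { free(e); return NULL; }
    memcpy(e->str, s, n);
    e->next = pool->head;
    pool->head = e;
    return e->str;
}

/* True if ptr is one of the strings stored in pool (pointer identity). */
int names_owns(const name_pool *pool, const char *ptr) {
    for (const name_entry *e = pool->head; e; e = e->next)
        if (e->str == ptr) return 1;
    return 0;
}

/* Move a node into the document using dst: re-intern its name so that
 * it survives the source pool being freed. Returns 0 on success. */
int node_adopt(toy_node *node, name_pool *dst) {
    if (names_owns(dst, node->name)) return 0;
    const char *copy = names_intern(dst, node->name);
    if (!copy) return -1;
    node->name = copy;
    return 0;
}

/* Free every string in the pool; nodes still pointing here now dangle. */
void names_free(name_pool *pool) {
    name_entry *e = pool->head;
    while (e) {
        name_entry *next = e->next;
        free(e->str);
        free(e);
        e = next;
    }
    pool->head = NULL;
}
```

The ownership test is also what a free routine would need: a name owned by
the document's pool must not be freed individually, while a name allocated
outside the pool still must be.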

  Feedback and comments are invited, and it's better to stop me now
from doing something foolish than to complain once 2.6.0 is finally
released :-)
So if you do some specific processing, especially things like modifying
documents, merging them, or doing advanced editing, please think about
the impact of per-document string pools, or even of an application-wide
string pool. Looking at include/libxml/dict.h and xmlreader.h is
definitely a good idea.

Daniel

-- 
Daniel Veillard      | Red Hat Network https://rhn.redhat.com/
veillard redhat com  | libxml GNOME XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/
