Re: [xml] libxml2 review in windows::developer

From: Sean McGuire <stm particulate net>
To: xml gnome org
Subject: Re: [xml] libxml2 review in windows::developer
Date: Fri, 11 Apr 2003 04:24:13 -0700

This is a surprising result... and I think it's, um... what's the word... incorrect. :) I've been speed-testing Xerces and libxml for a few days so I've been doing a lot of benchmarking. I've done tests on parsing speed, speed of walking every node in the doc, speed of duplicating the doc by hand (including only those parts I'd care about), speed of cloning the doc via built-in copy methods, and speed of serializing to disk... I've found libxml to be twice as fast as Xerces for parsing, about the same for walking, between two and three times as fast for copying, and slightly faster for writing for a highly structured doc, and three times as fast at writing for a flatter doc. I should do tests on building a doc by hand, from scratch, but I've been too lazy and have just let the hand-copy serve as a proxy for that.
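
For anyone who wants to try the same kind of phase-by-phase measurement, here is a minimal sketch of a timing harness. It is only an illustration: it uses Python's standard-library xml.dom.minidom as a stand-in, not libxml2 or Xerces, and the names timed, walk and run_phases are made up for this example. Its numbers say nothing about the parsers discussed in this thread.

```python
import time
import xml.dom.minidom as minidom


def timed(fn, *args):
    """Call fn(*args) once and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


def walk(node):
    """Visit every node below (and including) node; return the node count."""
    count = 1
    for child in node.childNodes:
        count += walk(child)
    return count


def run_phases(xml_text):
    """Time parse, walk, clone and serialize for a single document."""
    timings = {}
    doc, timings["parse"] = timed(minidom.parseString, xml_text)
    nodes, timings["walk"] = timed(walk, doc)
    # Clone the root element subtree with the built-in deep copy.
    _, timings["clone"] = timed(doc.documentElement.cloneNode, True)
    # Serialize to a string; writing to disk would add I/O noise.
    _, timings["serialize"] = timed(doc.toxml)
    return nodes, timings


if __name__ == "__main__":
    # Hypothetical flat document, loosely modelled on the review's test data.
    entry = '<entry year="2003" month="4" day="1" who="x"/>'
    sample = "<agenda>" + entry * 1000 + "</agenda>"
    nodes, timings = run_phases(sample)
    print("nodes visited:", nodes)
    for phase, seconds in timings.items():
        print("%-10s %.6f s" % (phase, seconds))
```

A single run per phase is noisy; repeating each phase several times and taking the best or median time would give steadier figures.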

Unfortunately I can't give many details about the testing environment since I'm not allowed to say who I work for, lest my statements be taken as "official" when they're just the ramblings of some developer, or much about the test files. But I'll talk to my boss and see if I can get them to loosen up a bit in the interest of spreading some truth around.

Of course, it's entirely possible that my use of Xerces is bone-brained and that accounts for some of the difference... OTOH, my use of libxml is based on a bit of time reading the docs, and so is my use of Xerces, so at least, these numbers are a good example of what a skilled if clueless developer will get. And frankly, even if my use of Xerces is bone-brained, it's hard for me to imagine it is *so* bone-brained as to account for as much difference as I've seen.

Anyway, I know that that's not too useful without actual code and sample files to look at, but at least I thought it'd be nice to hear that some people are seeing radically different numbers (which are more favorable to libxml2).


Peter Jacobi wrote:

Hi!

In the April issue of the windows::developer magazine there was
a comparative review of five (?) XML parsers, including libxml2. Author is Matthew Wilson. An online version is at http://www.windevnet.com/documents/s=7868/win0304a/0304a.htm,
but requires registration at the web site.
I think Matthew is a rather competent C++ guru, but this doesn't help much in XML issues, so there is not much beef in the article. But for benefit or amusement, I'll try to summarize the main points:
All testing was done using C++, with libxml2 used via libxml++. The other parsers were MSXML, Xerces and XMLBooster. XMLBooster (www.xmlbooster.com) is in fact a parser generator: given the class definition in a proprietary XML format, it generates serializers and deserializers for this class.
The test data was rather 'flat', being 1000, 10000 and 100000 entries of the same structure:
<agenda>
<entry year = "2003" month = "4" day = "1" who="Windows Developer Magazine" />
<entry year = "2002" month = "12" day = "1" who="C/C++ User's Journal" />
<entry year = "2001" month = "5" day = "1" who="Windows Developer Magazine" />
. . .
</agenda>
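
The article's actual test files are not available here, but a flat file of the same shape is easy to generate. The sketch below is an assumption-laden stand-in: the function name make_agenda and the way the year, month and who values cycle are invented for illustration, and only the element and attribute names come from the sample above.

```python
from xml.sax.saxutils import quoteattr

# The two publication names appear in the sample above; cycling through
# them is an assumption made purely for illustration.
PUBLICATIONS = ["Windows Developer Magazine", "C/C++ User's Journal"]


def make_agenda(n_entries):
    """Build a flat <agenda> document with n_entries <entry> elements."""
    lines = ["<agenda>"]
    for i in range(n_entries):
        year = 2003 - (i % 3)    # hypothetical spread of years
        month = 1 + (i % 12)     # hypothetical spread of months
        who = PUBLICATIONS[i % len(PUBLICATIONS)]
        # quoteattr() adds the surrounding quotes and escapes the value.
        lines.append('<entry year="%d" month="%d" day="1" who=%s />'
                     % (year, month, quoteattr(who)))
    lines.append("</agenda>")
    return "\n".join(lines)


if __name__ == "__main__":
    # The three sizes mentioned in the review.
    for size in (1000, 10000, 100000):
        text = make_agenda(size)
        print(size, "entries,", len(text), "characters")
```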

So parsing these files gives only two performance classes (with little
intra-class differentiation):

'fast'
MSXML-SAX
Xerces
XMLBooster

'slow' (needs 3 times the 'fast' time)
MSXML-DOM
libxml2-tree

So what are these numbers telling us:
1. Allocating the (DOM) tree takes time, and doing SAX or a specialized parser is faster. libxml-SAX wasn't benchmarked.
2. Xerces being faster than libxml is a bit of a mystery, but given the XML above, it may be the 'attribute cost'.
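
To make the tree-versus-SAX point concrete, here is a small sketch that does the same job both ways: counting entries and attributes in the flat data. It uses Python's standard-library xml.sax and xml.dom.minidom, not any of the parsers in the review, so any timing it prints is illustrative only; EntryCounter, count_with_sax and count_with_dom are names invented for this example.

```python
import time
import xml.sax
import xml.dom.minidom as minidom


class EntryCounter(xml.sax.ContentHandler):
    """SAX handler: counts <entry> elements and their attributes.

    Nothing is stored per element, so no tree is ever allocated.
    """

    def __init__(self):
        super().__init__()
        self.entries = 0
        self.attributes = 0

    def startElement(self, name, attrs):
        if name == "entry":
            self.entries += 1
            self.attributes += len(attrs)


def count_with_sax(xml_bytes):
    """Stream the document through the handler; return the two counts."""
    handler = EntryCounter()
    xml.sax.parseString(xml_bytes, handler)
    return handler.entries, handler.attributes


def count_with_dom(xml_bytes):
    """Build the whole tree first, then count from it."""
    doc = minidom.parseString(xml_bytes)
    entries = doc.getElementsByTagName("entry")
    return len(entries), sum(e.attributes.length for e in entries)


if __name__ == "__main__":
    # Hypothetical flat input with four attributes per entry.
    entry = b'<entry year="2003" month="4" day="1" who="x" />'
    data = b"<agenda>" + entry * 10000 + b"</agenda>"
    for label, fn in (("SAX", count_with_sax), ("DOM", count_with_dom)):
        start = time.perf_counter()
        counts = fn(data)
        print(label, counts, "%.4f s" % (time.perf_counter() - start))
```

With four attributes on every element, a tree builder has to create an attribute node for each one, which is one plausible reading of the 'attribute cost' mentioned above.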
Matthew's comments on ease of use favor XMLBooster, but I find this rather pointless, as XMLBooster offers a layer above pure XML parsing, which can be added similarly to each parser.
Regards,
Peter Jacobi


