Re: [xml] libxml vs stl



On Tue, Nov 22, 2005 at 01:39:21PM -0500, Stefan Seefeld wrote:
I have been suggesting a new C++ API (wrapper around libxml2) on boost.org
(http://boost.org/) and a big part of the discussion is precisely about how
best to do that conversion (see
http://lists.boost.org/Archives/boost/2005/10/96129.php)


 Don't do conversion and keep UTF-8. Converting back and forth all the
 time, substring by substring, is gonna be quite costly, probably more than
 the cost of parsing the data.

We agree that this is the best option. However, users may not be in control
of all application layers, and so at some point a conversion may be
required. Ideally some conversion mechanism can be provided that only
allocates / copies data if absolutely necessary, and passes UTF-8 strings
through, if possible.
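
As a rough illustration of such a pass-through mechanism, here is a minimal
C++17 sketch. The names Utf8Text and Encoding are hypothetical and belong to
neither libxml2 nor the Boost proposal, and ISO-8859-1 only stands in for
whatever non-UTF-8 encoding another application layer might impose. The
wrapper borrows the caller's bytes when they are already UTF-8 and allocates
only when a real conversion is needed.

```cpp
#include <string>
#include <string_view>

// Hypothetical encoding tag for the caller's data (not a libxml2 type).
enum class Encoding { Utf8, Latin1 };

// Holds UTF-8 text, borrowing the caller's bytes when no conversion is needed.
// The input buffer must outlive this object in the pass-through case.
class Utf8Text {
public:
    Utf8Text(std::string_view input, Encoding enc) {
        if (enc == Encoding::Utf8) {
            view_ = input;  // pass through: no allocation, no copy
            return;
        }
        // ISO-8859-1 to UTF-8: each byte becomes one or two bytes.
        owned_.reserve(input.size() * 2);
        for (unsigned char c : input) {
            if (c < 0x80) {
                owned_.push_back(static_cast<char>(c));
            } else {
                owned_.push_back(static_cast<char>(0xC0 | (c >> 6)));
                owned_.push_back(static_cast<char>(0x80 | (c & 0x3F)));
            }
        }
        view_ = owned_;
        converted_ = true;
    }

    // Not copyable (and therefore not movable): view_ may point into owned_.
    Utf8Text(const Utf8Text&) = delete;
    Utf8Text& operator=(const Utf8Text&) = delete;

    std::string_view view() const { return view_; }
    bool converted() const { return converted_; }

private:
    std::string owned_;       // used only when a conversion took place
    std::string_view view_;   // always refers to valid UTF-8 bytes
    bool converted_ = false;
};
```

With this shape, a caller that already holds UTF-8 pays nothing beyond the
view itself, and only callers stuck with another encoding pay for a copy.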

  My reply was precisely about the boost framework. Keep the APIs UTF-8
if you provide wrappers for libxml2, otherwise you may force a lot of
unnecessary conversions.
  I tried to explain why UTF-8 was the one making the most sense.
  http://xmlsoft.org/encoding.html#internal
I would also add in retrospect that 99% of the instances you see around
use markup names in the ASCII range, and hence the APIs using markup names
are usually the cheapest possible. At the instance level, converting the
full document while streaming is less costly than converting back and
forth all tags, attributes and namespaces.
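
To make the ASCII point concrete: a name made only of 7-bit ASCII bytes is
byte-for-byte identical in UTF-8 and in ASCII-compatible encodings such as
ISO-8859-1, so it can be handed across an API boundary without any
conversion. A minimal C++17 sketch of that check follows; is_ascii is a
hypothetical helper, not a libxml2 function.

```cpp
#include <string_view>

// True when every byte is in the 7-bit ASCII range (0x00-0x7F).
// Such a string needs no conversion between UTF-8 and any
// ASCII-compatible single-byte encoding.
bool is_ascii(std::string_view s) {
    for (unsigned char c : s) {
        if (c >= 0x80) {
            return false;
        }
    }
    return true;
}
```

A wrapper could use such a test as a fast path for element, attribute and
namespace names, and fall back to a real conversion only for the rare
non-ASCII name.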

Daniel

-- 
Daniel Veillard      | Red Hat http://redhat.com/
veillard redhat com  | libxml GNOME XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/


