Re: [xml] libxml vs stl



Daniel Veillard wrote:
On Tue, Nov 22, 2005 at 01:17:49PM -0500, Stefan Seefeld wrote:

You probably want some unicode library in conjunction with libxml2.


  libxml2 uses UTF-8 which covers the full Unicode range, so this does not
make much sense. The API is already Unicode ready.

Sorry for being unclear. I didn't mean to suggest that libxml2 wouldn't
handle unicode data by itself. Rather, I was thinking from a user's point
of view, who wants to operate on some xml content.

The remaining question then is about how to pass data between that unicode
library and libxml2.


  Stick to UTF-8 !


I have been suggesting a new C++ API (wrapper around libxml2) on boost.org
(http://boost.org/) and a big part of the discussion is precisely about how
best to do that conversion (see http://lists.boost.org/Archives/boost/2005/10/96129.php).


  Don't do conversion and keep UTF-8, converting back and forth all the time
substring by substring is gonna be quite costly, probably more than the cost
of parsing the data.

We agree that this is the best option. However, users may not be in control
of all application layers, and so at some point a conversion may be required.
Ideally some conversion mechanism can be provided that only allocates / copies
data if absolutely necessary, and passes utf-8 strings through, if possible.
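
To make the pass-through idea concrete, here is a minimal C++ sketch. It is
only an illustration under stated assumptions, not the API proposed on the
boost list: Encoding, Utf8View and to_utf8 are hypothetical names, and Latin-1
stands in for whatever non-UTF-8 encoding another application layer might use.
The view borrows the caller's buffer when the bytes are already valid UTF-8
(UTF-8 input, or pure ASCII Latin-1) and allocates only when a real conversion
is needed.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical tag for the encoding of an application string
// (an assumption for this sketch, not a libxml2 type).
enum class Encoding { Utf8, Latin1 };

// UTF-8 bytes that are either borrowed from the caller or owned by the view.
class Utf8View {
public:
    // Borrow: no allocation; the caller's string must outlive the view.
    static Utf8View borrow(const std::string& s) {
        Utf8View v;
        v.borrowed_ = &s;
        return v;
    }
    // Own: the view keeps the converted copy alive.
    static Utf8View own(std::string s) {
        Utf8View v;
        v.owned_ = std::move(s);
        return v;
    }
    const std::string& str() const { return borrowed_ ? *borrowed_ : owned_; }
    bool is_borrowed() const { return borrowed_ != nullptr; }

private:
    Utf8View() = default;
    const std::string* borrowed_ = nullptr;
    std::string owned_;
};

// Returns UTF-8 for `in`, copying only if the bytes actually have to change.
inline Utf8View to_utf8(const std::string& in, Encoding enc) {
    // Already UTF-8: pass the caller's buffer through untouched.
    if (enc == Encoding::Utf8) return Utf8View::borrow(in);

    // Latin-1: count the bytes >= 0x80, which need two bytes in UTF-8.
    std::size_t high = 0;
    for (unsigned char c : in) {
        if (c >= 0x80) ++high;
    }
    // Pure ASCII is byte-identical in Latin-1 and UTF-8: pass through.
    if (high == 0) return Utf8View::borrow(in);

    // A real conversion is needed: allocate exactly once.
    std::string out;
    out.reserve(in.size() + high);
    for (unsigned char c : in) {
        if (c < 0x80) {
            out.push_back(static_cast<char>(c));
        } else {
            // U+0080..U+00FF encode as 110000xx 10xxxxxx.
            out.push_back(static_cast<char>(0xC0 | (c >> 6)));
            out.push_back(static_cast<char>(0x80 | (c & 0x3F)));
        }
    }
    assert(out.size() == in.size() + high);
    return Utf8View::own(std::move(out));
}

// Borrowing from a temporary would dangle, so reject rvalue arguments.
Utf8View to_utf8(std::string&&, Encoding) = delete;
```

Since libxml2's xmlChar is an unsigned char, a wrapper would presumably hand
view.str().c_str() to the library through a cast to const xmlChar*. The
borrowing case costs nothing beyond the scan, which is the behaviour asked for
above; the price is that the caller has to keep the source string alive for as
long as the view is in use.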

Regards,
                Stefan



