On Tue, Nov 22, 2005 at 01:17:49PM -0500, Stefan Seefeld wrote:
You probably want some unicode library in conjunction with libxml2.

  libxml2 uses UTF-8 which covers the full Unicode range, so this does not
make much sense. The API is already Unicode ready.
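To illustrate the point about range, here is a minimal sketch of how UTF-8 encodes any Unicode code point, including those beyond the Basic Multilingual Plane, in one to four bytes. `utf8_encode` is a hypothetical helper written for this example, not a libxml2 function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of libxml2): encode one Unicode code point
 * as UTF-8. Returns the number of bytes written (1 to 4), or 0 if cp is
 * not a valid Unicode scalar value (a surrogate, or above U+10FFFF). */
size_t utf8_encode(uint32_t cp, unsigned char out[4])
{
    assert(out != NULL);

    if (cp < 0x80) {                      /* ASCII: one byte */
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {                     /* two bytes */
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {                   /* three bytes, rest of the BMP */
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return 0;                     /* surrogates are not encodable */
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {                 /* four bytes, supplementary planes */
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;                             /* outside the Unicode range */
}
```

The highest code point, U+10FFFF, still fits in four bytes, so no second encoding is needed to cover the whole range.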

The remaining question then is about how to pass data between that unicode
library and libxml2.

  Stick to UTF-8 !

I have been suggesting a new C++ API (wrapper around libxml2) on
( and a big part of the discussion is precisely about how
best to do that conversion (see

  Don't convert; keep UTF-8. Converting back and forth all the time,
substring by substring, is going to be quite costly, probably more than the
cost of parsing the data.
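As a rough sketch of what working on UTF-8 in place can look like, the two helpers below count characters and find a substring without converting to another encoding. Both names are hypothetical and written for this example, and both assume the input is well-formed, NUL-terminated UTF-8.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: count code points in a well-formed UTF-8 string.
 * Continuation bytes have the bit pattern 10xxxxxx, so every other byte
 * starts a new code point. No conversion or allocation is needed. */
size_t utf8_count(const char *s)
{
    size_t n = 0;

    assert(s != NULL);
    for (; *s != '\0'; ++s) {
        if (((unsigned char)*s & 0xC0) != 0x80)
            ++n;
    }
    return n;
}

/* Hypothetical helper: find a UTF-8 substring. Because a lead byte can
 * never be mistaken for a continuation byte, a plain byte-wise search
 * only matches on code point boundaries when both inputs are well-formed
 * UTF-8 and the needle is non-empty. */
const char *utf8_find(const char *haystack, const char *needle)
{
    assert(haystack != NULL && needle != NULL);
    return strstr(haystack, needle);
}
```

Each call is a single pass over the bytes already in memory, which is the cost that repeated conversion would add on top of parsing.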


