Re: [libxml++] Character encoding, UTF-8 and such
- From: Jonathan Wakely <cow compsoc man ac uk>
- To: libxmlplusplus-general lists sourceforge net
- Subject: Re: [libxml++] Character encoding, UTF-8 and such
- Date: Thu, 12 Jun 2003 12:53:56 +0100
On Thu, Jun 12, 2003 at 01:27:15PM +0200, Murray Cumming Comneon com wrote:
> > std::string is really a typedef for std::basic_string<char>.
> > When a std::string is passed on from libxml++ to a function
> > in libxml, its char * representation is simply cast to
> > unsigned char *. Would it be terribly wrong to assume that
> > the input in the std::string is always ISO-8859,
>
> No, it's always UTF-8. Input and output. That's simple, and that should work.
> You can use Glib::ustring with that now if you want.
Morten, the problem isn't that libxml++ doesn't "support" UTF-8 (it uses
UTF-8 for all strings, just like libxml). The issue is that
std::string::size() returns the number of bytes, and when the string
contains UTF-8 data the number of bytes is not necessarily the same as
the number of characters, because every non-ASCII character is encoded
as more than one byte. Glib::ustring is UTF-8 aware and can tell you
the correct number of characters.
jon
--
"Some men are born mediocre, some men achieve mediocrity,
and some men have mediocrity thrust upon them."
- Joseph Heller