Re: [libxml++] Character encoding, UTF-8 and such



On Thu, Jun 12, 2003 at 01:27:15PM +0200, Murray Cumming Comneon com wrote:

> > std::string is really a typedef for std::basic_string<char>. 
> > When a std::string is passed on from libxml++ to a function 
> > in libxml, its char * representation is just cast to 
> > unsigned char *. Would it be terribly wrong to assume that 
> > the input in the std::string is always ISO-8859,
> 
> No, it's always UTF-8. Input and output. That's simple, and that should work.
> You can use Glib::ustring with that now if you want.

Morten, the problem isn't that libxml++ doesn't "support" UTF-8 (it uses
UTF-8 for all strings, just like libxml). The issue is that
std::string::size() returns the number of bytes, and when the string
contains UTF-8 data the number of bytes is not necessarily the same as
the number of characters. Glib::ustring is UTF-8 aware and can tell you
the correct number of chars.
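
To make the byte/character difference concrete, here is a minimal sketch
in plain C++. The helper name utf8_length is made up for illustration; it
is not part of libxml++, libxml or glibmm. It assumes the string already
holds valid UTF-8, and it counts code points, which is not always what a
user would call a "character" (combining marks, for instance).

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper, not a libxml++ or glibmm function.
// Counts Unicode code points in a std::string that is ASSUMED to contain
// valid UTF-8. No validation is performed.
std::size_t utf8_length(const std::string& s)
{
    std::size_t count = 0;
    for (unsigned char byte : s) {
        // UTF-8 continuation bytes have the bit pattern 10xxxxxx.
        // Every other byte begins a new code point, so count only those.
        if ((byte & 0xC0) != 0x80)
            ++count;
    }
    return count;
}
```

With std::string("caf\xC3\xA9") (the word "café" encoded as UTF-8),
size() reports 5 bytes while utf8_length() reports 4 code points.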

jon

-- 
"Some men are born mediocre, some men achieve mediocrity,
 and some men have mediocrity thrust upon them."
	- Joseph Heller



