Re: [xml] How do I read German Umlaute - entities from an XML-File using libxml?



At 11:11 AM 10/05/01 -0400, Daniel Veillard wrote:
The libxml callback gives UTF8 encoded strings. It seems that you expect
ISO-8859-1 encoded ones. You need to convert between both. There is a routine
called isolat1ToUTF8, use it or change your program to use UTF8 encoding.

BTW, is there a way to get UTF8Toisolat1 to replace characters that are *not*
in 8859-1 with something else (e.g. a space)?

Say I have a long UTF-8 string with a number of non-ASCII chars that *do*
convert to Latin-1, but one character that doesn't.  UTF8Toisolat1 returns
an error and I'm forced to use the UTF-8 string, which means I lose the
characters that would have been converted (and, worse, I end up using the
UTF-8 bytes as if they were Latin-1).
Most of the time the characters that don't convert come from an entity for
some symbol that I wouldn't care about anyway.
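
To make the request concrete, here is a rough sketch of the behaviour I have
in mind. It is not libxml code: utf8_to_latin1_lossy is a hypothetical helper
name, and the decoding is done by hand rather than through UTF8Toisolat1.
It assumes the caller provides an output buffer of at least inlen + 1 bytes,
and it substitutes a caller-chosen byte for anything outside Latin-1 and for
malformed sequences.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not part of libxml): convert UTF-8 to ISO-8859-1,
 * writing `subst` for every character that has no Latin-1 equivalent and
 * for every malformed or overlong sequence.
 * `out` must have room for inlen + 1 bytes; the output is never longer
 * than the input and is NUL-terminated.
 * Returns the number of substitutions that were made.
 */
size_t utf8_to_latin1_lossy(const unsigned char *in, size_t inlen,
                            unsigned char *out, unsigned char subst)
{
    size_t i = 0, o = 0, replaced = 0;

    assert(in != NULL && out != NULL);

    while (i < inlen) {
        unsigned char c = in[i];
        unsigned long cp;   /* decoded code point */
        size_t extra;       /* continuation bytes expected */
        size_t k;           /* continuation bytes actually read */
        int ok;

        if (c < 0x80) {
            cp = c; extra = 0;
        } else if ((c & 0xE0) == 0xC0) {
            cp = c & 0x1F; extra = 1;
        } else if ((c & 0xF0) == 0xE0) {
            cp = c & 0x0F; extra = 2;
        } else if ((c & 0xF8) == 0xF0) {
            cp = c & 0x07; extra = 3;
        } else {
            /* stray continuation byte or invalid lead byte */
            out[o++] = subst;
            replaced++;
            i++;
            continue;
        }
        i++;

        /* consume the continuation bytes that are really present */
        for (k = 0; k < extra && i < inlen && (in[i] & 0xC0) == 0x80; k++, i++)
            cp = (cp << 6) | (in[i] & 0x3F);

        /* Latin-1 is ASCII plus well-formed two-byte sequences up to U+00FF */
        ok = (k == extra) &&
             (extra == 0 || (extra == 1 && cp >= 0x80 && cp <= 0xFF));

        if (ok) {
            out[o++] = (unsigned char)cp;
        } else {
            out[o++] = subst;
            replaced++;
        }
    }
    out[o] = '\0';
    return replaced;
}
```

Called with ' ' as the substitute, a string such as "a", u-umlaut, the euro
sign, "b" would keep the u-umlaut as a single Latin-1 byte and turn only the
euro sign into a space, instead of failing the whole conversion.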

Does that make any sense?  Would that be helpful to anyone else?

Bill Moseley
mailto:moseley hank org



