Re: [xml] Converting ISO Latin 9 to UTF-8



On Fri, Sep 03, 2004 at 02:57:32PM +0000, Mark Itzcovitz wrote:
> I expect I'm missing the obvious, but I'm having a problem finding
> an easy way to convert from ISO Latin 9 (8859-15) to UTF-8. I have a
> program with libxml2 embedded (but not iconv, I haven't needed it yet
> and it's easier without it on multiple platforms, including VMS) with an
> xml document whose original encoding is probably UTF-16. I'm currently
> adding elements and attributes using xmlNodeSetContent and xmlSetProp,
> with the text assumed to be ISO Latin 1 (8859-1) and converted to UTF-8
> with a call to isolat1ToUTF8. If in fact the text to be added is actually
> ISO Latin 9 (8859-15), is there an easy way to specify this? I know there
> is a function ISO8859_15ToUTF8 but it seems that it's not intended that
> this should be called from outside libxml2.

  Right. It's registered internally as the encoding handler for
ISO-8859-15, assuming it got compiled in. Use xmlFindCharEncodingHandler
to look it up, then either call xmlCharEncInFunc to convert whole
buffers to UTF-8, or call the handler's xmlCharEncodingInputFunc input
function directly, which behaves like isolat1ToUTF8.

Daniel

-- 
Daniel Veillard      | Red Hat Desktop team http://redhat.com/
veillard redhat com  | libxml GNOME XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/


