RE: [xml] UTF8Toisolat1() usage




-----Original Message-----
From: Geert Kloosterman [mailto:geertk ai rug nl]
Sent: Tuesday, June 04, 2002 7:37 PM
To: Henke, Markus
Subject: RE: [xml] UTF8Toisolat1() usage


"Markus" == Henke, Markus <Markus_Henke ordat com> writes:

    > Hi Geert, all,
    >> -----Original Message-----
    >> From: Geert Kloosterman [mailto:geertk ai rug nl]
    >> Sent: Monday, June 03, 2002 4:09 PM
    >> To: xml gnome org
    >> Subject: [xml] UTF8Toisolat1() usage
    >> 
    >> 
    >> Hi all,
    >> 
    >> How can I convert the result from xmlNodeListGetString() to
    >> iso-8859-1?  
    >> 
    >> I think I have to use UTF8Toisolat1(), but it isn't exactly clear to
    >> me how to use the function with respect to memory allocation.  Do I
    >> need to supply a buffer that's large enough myself, or does the
    >> function allocate it for me?
    >> 
    >> Geert
    >> 

    > It's been a while since I've looked at the corresponding code,
    > but I'm quite sure that "UTF8Toisolat1()" won't handle the buffer
    > size for you.
    > You can use "xmlCharEncOutFunc()" (get the encoding handler using
    > "xmlGetCharEncodingHandler()") to convert from UTF-8 to ISO-8859-1;
    > it uses xmlBuffer for in/out parameters and handles the buffer size.

Thanks.  So the idiom to convert a UTF-8 xmlChar * to an iso-8859-1
char * is:

    - setup an encoding handler with xmlGetCharEncodingHandler()
    - create input and output buffers with xmlBufferCreate()
    - transfer the UTF8 xmlChar * to an xmlBufferPtr with xmlBufferCat()
    - translate to iso-8859-1 using xmlCharEncOutFunc()
    - put the iso-8859-1 results in an xmlChar * using xmlBufferContent()
    (- cast the xmlChar * to char * ????)

    --> for iso-8859-1 output use xmlBufferDump() on the buffer

Am I correct?

Well, that seems OK to me if you want to avoid allocating any buffers
yourself (which is also why it's more complex).
The last step (the cast to char *) of course depends on what you intend
to do with the buffer...  :)
It's also possible to use UTF8Toisolat1() directly if you provide an
adequately sized out-buffer. An out-buffer converted from UTF-8 to
ISO-8859-1 will never need more space than the corresponding in-buffer
(it is the reverse direction, ISO-8859-1 to UTF-8, that can need up to
twice the space).


Geert

Ciao, Markus


