[xml] internal character encoding



Hi list

I have XML files encoded in CP850. When I parse a file with libxml,
it converts the data to UTF-8. I then have to convert everything back
to CP850 using iconv(), since my application expects the data in
CP850.
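
For reference, the manual conversion step looks roughly like the
sketch below. It assumes the platform's iconv knows the name "CP850"
(glibc does); utf8_to_cp850 is a hypothetical helper name of mine, not
a libxml function, and the input is assumed to be a NUL-terminated
UTF-8 string such as node content returned by libxml.

```c
#include <iconv.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: convert a NUL-terminated UTF-8 string to CP850.
 * Returns a malloc'ed, NUL-terminated buffer that the caller must free,
 * or NULL on failure (including characters CP850 cannot represent). */
static char *utf8_to_cp850(const char *utf8)
{
    iconv_t cd = iconv_open("CP850", "UTF-8");
    if (cd == (iconv_t)-1)
        return NULL;

    size_t in_left = strlen(utf8);
    /* CP850 is a single-byte charset, so the output never needs more
     * bytes than the UTF-8 input; one extra byte holds the terminator. */
    size_t out_left = in_left;
    char *out = malloc(in_left + 1);
    if (out == NULL) {
        iconv_close(cd);
        return NULL;
    }

    char *in_ptr = (char *)utf8;   /* iconv() takes a non-const pointer */
    char *out_ptr = out;

    if (iconv(cd, &in_ptr, &in_left, &out_ptr, &out_left) == (size_t)-1) {
        free(out);
        iconv_close(cd);
        return NULL;
    }

    *out_ptr = '\0';
    iconv_close(cd);
    return out;
}
```

Doing this for every string coming out of the tree is the step I
would like to avoid.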

Is it possible to tell libxml to use a different charset for internal
data encoding?

doc/encoding.html#internal says that this is not possible. But in the
API documentation I can see the following fields:

struct _xmlDoc {
   ...
   int    charset    : encoding of the in-memory content actua
}

struct _xmlParserCtxt {
   ...
   int    charset    : encoding of the in-memory content actua
}

The xmlDoc structure seems to have a field that records how the
in-memory content is encoded, and xmlParserCtxt seems to have a field
that specifies which encoding the data should be represented in.

Is this implemented? Can I use it to have libxml convert the internal
data with iconv automatically?

Christoph
-- 
echo mailto: NOSPAM !#$.'<*>'|sed 's. ..'|tr "<*> !#:2" org fr33z3


