Re: [xml] XML Entities encoding question
- From: Michael Wood <esiotrot gmail com>
- To: Fred <fred fredex gmail com>
- Cc: xml gnome org
- Subject: Re: [xml] XML Entities encoding question
- Date: Tue, 6 Aug 2013 11:37:18 +0200
Hi
On 5 August 2013 18:50, Fred <fred fredex gmail com> wrote:
I have an app that emits XML as 8859-1 (or other encoding as needed), and the XML is sent to an Oracle database where the XML is unpacked and the contents used to update an existing schema.
I apparently fail to understand something about how char encodings work at the intersection of XML and Oracle.
If I send:
<?xml version="1.0" encoding="WINDOWS-1252"?>
<MSG>
...
<LAST_NAME>BOLA<C3><C9>OS</LAST_NAME>
...
</MSG>
The two accented characters are each transformed into 0xBF (with exactly the same result if the encoding is ISO-8859-1 instead of WINDOWS-1252).
However, if I send:
<LAST_NAME>BOLA&#xC3;&#xC9;OS</LAST_NAME>
I get the desired result.
While I'm working on figuring out what I'm doing wrong regarding Oracle, is there some way I can force libxml2 to emit the second form rather than the first?
The tree is output using:
xmlDocDumpFormatMemoryEnc (doc, xmlbufptr, &xmlbufptr_size, "WINDOWS-1252", 1);
What happens if you use ASCII instead of WINDOWS-1252? Windows-1252 and ISO-8859-1 can represent those characters directly, whereas in a document encoded as ASCII they have to be escaped as character references, so in theory libxml2 will escape them. I haven't tried it, though.
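To make "escaped" concrete, here is a minimal sketch of the transformation an ASCII serialization implies. It is not libxml2 code: latin1_to_ascii_charrefs is a hypothetical helper, it assumes the input is ISO-8859-1 text (where each byte value equals its Unicode code point), and it assumes markup characters such as & and < have already been handled. Whether libxml2 itself writes hexadecimal or decimal references when asked for ASCII is not something this sketch claims.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper, not part of libxml2.
 *
 * Copies the NUL-terminated ISO-8859-1 string `in` into `out`, replacing
 * every byte above 0x7F with a hexadecimal numeric character reference
 * such as "&#xC3;". In ISO-8859-1 the byte value is the code point, so no
 * lookup table is needed. Windows-1252 differs in the 0x80-0x9F range and
 * would need a mapping table for those bytes.
 *
 * Assumption: `in` is already valid XML text content, so '&' and '<' are
 * copied through unchanged.
 *
 * Returns the number of characters written (excluding the terminator),
 * or -1 if `out` is too small.
 */
int latin1_to_ascii_charrefs(const unsigned char *in, char *out, size_t out_size)
{
    size_t used = 0;

    assert(in != NULL && out != NULL);
    if (out_size == 0)
        return -1;

    for (; *in != '\0'; in++) {
        char piece[16];
        int n;

        if (*in < 0x80) {
            /* Plain ASCII byte: copy as is. */
            piece[0] = (char)*in;
            n = 1;
        } else {
            /* Non-ASCII byte: emit a character reference instead. */
            n = snprintf(piece, sizeof piece, "&#x%X;", (unsigned)*in);
        }

        /* Keep one byte free for the terminating NUL. */
        if (used + (size_t)n + 1 > out_size)
            return -1;

        memcpy(out + used, piece, (size_t)n);
        used += (size_t)n;
    }

    out[used] = '\0';
    return (int)used;
}
```

Called on the bytes of the LAST_NAME value above (B, O, L, A, 0xC3, 0xC9, O, S) with a large enough buffer, the helper produces pure ASCII text in which the two high bytes appear as character references and everything else is unchanged.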
--
Michael Wood <esiotrot gmail com>