Re: [xml] xmlSaveFormatFileEnc() creating invalid XML
- From: Murray Cumming <murrayc murrayc com>
- To: Jason Viers <lists beanalby net>
- Cc: xml gnome org
- Subject: Re: [xml] xmlSaveFormatFileEnc() creating invalid XML
- Date: Fri, 09 Sep 2011 16:30:45 +0200
On Fri, 2011-09-09 at 10:21 -0400, Jason Viers wrote:
> On 9/9/2011 05:37, Murray Cumming wrote:
> > Here is a simple test case that takes the text from an apparently-valid
> > UTF-8 file
>
> Not all valid UTF-8 is valid in XML. Only a subset, as defined in
> http://www.w3.org/TR/2008/REC-xml-20081126/#charsets
>
> Note that Form Feed (0xC) is not allowed. Your original input document
> contains a formfeed character, and this is what ends up being invalid.
> It's not a matter of escaping; form feed as a literal byte, numeric
> reference, etc., is not allowed.
>
> Stripping the form feed from the input allows it to serialize properly.

Ah, I didn't know that it couldn't be there even if escaped. Thanks.
Shouldn't libxml warn about that at the same time that it would escape
characters such as & and < rather than writing invalid XML?
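
A minimal sketch of the stripping workaround mentioned above, for callers who want to filter text before handing it to libxml. It checks each code point against the XML 1.0 Char production from the spec section linked above and drops anything outside it, form feed included. The names xml10_is_char() and xml10_strip_invalid() are hypothetical helpers, not libxml API, and the buffer is assumed to be NUL-terminated UTF-8; stray or truncated bytes are simply dropped.

```c
#include <stddef.h>
#include <string.h>

/* Returns 1 if code point c matches the XML 1.0 Char production:
 * #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF] */
static int xml10_is_char(unsigned long c)
{
    return c == 0x9 || c == 0xA || c == 0xD ||
           (c >= 0x20 && c <= 0xD7FF) ||
           (c >= 0xE000 && c <= 0xFFFD) ||
           (c >= 0x10000 && c <= 0x10FFFF);
}

/* Hypothetical helper (not part of libxml): removes, in place, every
 * character that XML 1.0 does not allow. Assumes s is NUL-terminated
 * UTF-8. Returns the new length in bytes. */
static size_t xml10_strip_invalid(char *s)
{
    unsigned char *in = (unsigned char *)s;
    unsigned char *out = in;

    while (*in) {
        unsigned long c;
        size_t len, i;

        /* Work out the sequence length from the lead byte. */
        if (*in < 0x80)                { c = *in;        len = 1; }
        else if ((*in & 0xE0) == 0xC0) { c = *in & 0x1F; len = 2; }
        else if ((*in & 0xF0) == 0xE0) { c = *in & 0x0F; len = 3; }
        else if ((*in & 0xF8) == 0xF0) { c = *in & 0x07; len = 4; }
        else { in++; continue; }       /* stray byte: drop it */

        /* Fold in the continuation bytes. */
        for (i = 1; i < len; i++) {
            if ((in[i] & 0xC0) != 0x80)
                break;
            c = (c << 6) | (in[i] & 0x3F);
        }
        if (i < len) { in++; continue; } /* truncated sequence: drop lead byte */

        /* Keep the sequence only if the code point is legal in XML 1.0. */
        if (xml10_is_char(c)) {
            memmove(out, in, len);
            out += len;
        }
        in += len;
    }
    *out = '\0';
    return (size_t)(out - (unsigned char *)s);
}
```

Calling xml10_strip_invalid() on the text before creating the text node should leave only characters that can be serialized as XML 1.0. Whether to drop such characters silently or report them to the user is a choice for the application.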
--
murrayc murrayc com
www.murrayc.com
www.openismus.com