Re: [xml] xmlEncodeSpecialChars and carriage return / CRLF / 0x0D 0x0A / \r\n / 13, 10
- From: Daniel Veillard <veillard redhat com>
- To: SABROG <sabrog inbox ru>
- Cc: xml gnome org
- Subject: Re: [xml] xmlEncodeSpecialChars and carriage return / CRLF / 0x0D 0x0A / \r\n / 13, 10
- Date: Thu, 9 Aug 2007 04:41:32 -0400
On Thu, Aug 09, 2007 at 11:54:21AM +0400, SABROG wrote:
> Converting a string that contains "\n" with isolat1ToUTF8(...) doesn't help; I still do not see "&#10;", just LF. How can I
> write a RAW string without checks, etc.?
I don't see the relationship with isolat1ToUTF8, which is an encoding
converter.
From an XML perspective, escaping of code point 10 is needed only in attribute
values, because of the rules I pointed to in the XML spec. You need this to avoid
the attribute normalization that the XML parser may do on attribute values. For
values in element content there is no need for that escaping, so libxml2
does not do it:
paphio:~/XML -> cat test.xml
<foo attr="a&#10;b">a&#10;b</foo>
paphio:~/XML -> xmllint test.xml
<?xml version="1.0"?>
<foo attr="a&#10;b">a
b</foo>
paphio:~/XML ->
Any application using a compliant XML parser *MUST* see exactly the same
input whether it receives:
<foo>a&#10;b</foo>
or
<foo>a
b</foo>
The two content strings must be indistinguishable after having gone through
an XML parser. If your application behaves differently, that means it's not
really XML compliant, and I'm afraid there is nothing libxml2 should do to
cope with this.
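As a rough illustration of the point about element content, here is a small sketch that uses Python's standard xml.etree.ElementTree instead of libxml2 (an assumption made only to keep the example self-contained; the helper name content_of is hypothetical). It parses both forms and compares the resulting content strings.

```python
import xml.etree.ElementTree as ET


def content_of(document: str) -> str:
    """Hypothetical helper: parse a document and return the root's text."""
    return ET.fromstring(document).text


# Character reference for code point 10 in element content.
escaped = content_of("<foo>a&#10;b</foo>")
# Literal line feed in element content.
literal = content_of("<foo>a\nb</foo>")

# After parsing, the application cannot tell the two inputs apart.
print(escaped == literal)
print(repr(escaped), repr(literal))
```

An application that needs to tell these two inputs apart is relying on something the parser is not supposed to expose.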
If the character code point 10 occurs in an attribute value, then that's
a completely different story, because processing of text strings there is
different and preserving it is needed when serializing. This is tested
in the libxml2 regression test test/att3 as part of the test suite. See
http://www.w3.org/TR/REC-xml/#AVNormalize
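The attribute case can be sketched the same way, again with Python's standard xml.etree.ElementTree standing in for libxml2 (an assumption; the helper name attr_of and the attribute name attr are just for illustration). A literal line feed in an attribute value is normalized to a space by the parser, while the character reference survives, which is why a serializer has to write it back as a reference.

```python
import xml.etree.ElementTree as ET


def attr_of(document: str) -> str:
    """Hypothetical helper: parse a document and return the root's 'attr'."""
    return ET.fromstring(document).get("attr")


# Character reference: the line feed survives attribute-value normalization.
escaped = attr_of('<foo attr="a&#10;b"/>')
# Literal line feed: the parser normalizes it to a single space.
literal = attr_of('<foo attr="a\nb"/>')
print(repr(escaped), repr(literal))

# Serializing a value that contains a line feed: the attribute is written
# with a character reference, the element content with a plain line feed.
elem = ET.Element("foo", {"attr": "a\nb"})
elem.text = "a\nb"
serialized = ET.tostring(elem, encoding="unicode")
print(serialized)

# Round trip: the attribute value comes back unchanged.
roundtrip = attr_of(serialized)
print(roundtrip == "a\nb")
```

Without the escaping on output, the round trip would turn the line feed in the attribute into a space.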
Daniel
--
Red Hat Virtualization group http://redhat.com/virtualization/
Daniel Veillard | virtualization library http://libvirt.org/
veillard redhat com | libxml GNOME XML XSLT toolkit http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/