Re: [xml] [xslt] HTML vs. XHTML: different output when including a file with \r\n [WAS: xmllint vs. xsltproc: different output when including a file with \r\n]



On Mon, Feb 01, 2010 at 08:28:56PM +0000, Martin (gzlist) wrote:
On 01/02/2010, Daniel Veillard <veillard redhat com> wrote:

  Not a bug.

I don't know the XInclude spec, but reading the bug report:

<https://bugzilla.gnome.org/show_bug.cgi?id=608333>

     When a text file is included with

    <include href="text.txt" parse="text"
xmlns="http://www.w3.org/2001/XInclude"/>

    and the text file uses \r\n end line markers...

Isn't the problem that the file should be opened in text-mode so the
\r characters get removed before they ever get as far as the XML
serialisation?

  Nothing in the spec suggests such processing:

http://www.w3.org/TR/xinclude/#text-included-items

------------------------------------------------------------------------
Byte sequences outside the range allowed by the encoding are a fatal
error. Characters that are not permitted in XML documents also are a
fatal error.

Each character obtained from the transformation of the resource is
represented in the top-level included items as a character information
item with the character code set to the character code in ISO 10646
encoding, and the element content whitespace set to false.
------------------------------------------------------------------------

  As long as the characters are in the acceptable range and are not a Byte
Order Mark, they are included as is (but in UTF-8).

  So if you include a text file containing \r\n, you get both characters in
the resulting text node, and libxml2 has to serialize them back.
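
  To make the difference concrete, here is a rough Python sketch of the two
readings of the file discussed above. It is not how libxml2 is implemented;
the helper names and the temporary file standing in for text.txt are made up
for illustration. The first reader keeps every character as is, the second
does the text-mode translation suggested in the question.

```python
import os
import tempfile


def read_as_is(path, encoding="utf-8"):
    # Decode the raw bytes and leave line endings untouched
    # (every character obtained from the resource is kept).
    with open(path, "rb") as fh:
        return fh.read().decode(encoding)


def read_text_mode(path, encoding="utf-8"):
    # Text mode with universal newlines: CRLF is translated to LF on read.
    with open(path, "r", encoding=encoding, newline=None) as fh:
        return fh.read()


def demo():
    # Hypothetical stand-in for the text.txt file from the bug report.
    fd, path = tempfile.mkstemp(suffix=".txt")
    try:
        with os.fdopen(fd, "wb") as fh:
            fh.write(b"first\r\nsecond\r\n")
        return read_as_is(path), read_text_mode(path)
    finally:
        os.remove(path)


as_is, translated = demo()
print(repr(as_is))
print(repr(translated))
```

  Only the first reader preserves the carriage returns, which is the case
where the serializer then has to write them back out.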

Daniel

-- 
Daniel Veillard      | libxml Gnome XML XSLT toolkit  http://xmlsoft.org/
daniel veillard com  | Rpmfind RPM search engine http://rpmfind.net/
http://veillard.com/ | virtualization library  http://libvirt.org/


