Re: [xslt] HTML vs. XHTML: different output when including a file with \r\n [WAS: xmllint vs. xsltproc: different output when including a file with \r\n]

On Mon, Feb 01, 2010 at 08:28:56PM +0000, Martin (gzlist) wrote:
> On 01/02/2010, Daniel Veillard <veillard redhat com> wrote:
> >
> >   Not a bug.
> I don't know the XInclude spec, but reading the bug report:
> <>
>      When a text file is included with
>     <include href="text.txt" parse="text" xmlns=""/>
>     and the text file uses \r\n end line markers...
> Isn't the problem that the file should be opened in text-mode so the
> \r characters get removed before they ever get as far as the XML
> serialisation?

  Nothing in the spec suggests such processing:

Byte sequences outside the range allowed by the encoding are a fatal
error. Characters that are not permitted in XML documents also are a
fatal error.

Each character obtained from the transformation of the resource is
represented in the top-level included items as a character information
item with the character code set to the character code in ISO 10646
encoding, and the element content whitespace set to false.

  As long as the characters are in the acceptable range and are not a
Byte Order Mark, they are included as is (but in UTF-8).
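
  A minimal sketch of that rule in Python, for illustration only. It is
not libxml2's code: the function name include_text, the choice to drop
only a leading Byte Order Mark, and the mapping of the two fatal errors
to Python exceptions are all assumptions. The point is that CR and LF
are permitted XML characters, so nothing removes them.

```python
import re

# Characters allowed by the XML 1.0 "Char" production (tab, LF, CR included).
_XML_CHARS = re.compile(
    r"[\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]*"
)


def include_text(raw: bytes, encoding: str = "utf-8") -> str:
    """Hypothetical helper: turn a parse="text" resource into text-node content."""
    # Bytes outside the encoding raise UnicodeDecodeError (the first fatal error).
    text = raw.decode(encoding)
    # Assumption: a leading Byte Order Mark is not part of the included text.
    if text.startswith("\ufeff"):
        text = text[1:]
    # Characters not permitted in XML are the second fatal error.
    if not _XML_CHARS.fullmatch(text):
        raise ValueError("character not permitted in XML documents")
    # No newline translation happens here: "\r\n" stays "\r\n".
    return text
```

  With this sketch, include_text(b"a\r\nb") returns the string unchanged,
carriage return included.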

  So if you include text containing \r\n, you get both characters in
the resulting text node, and libxml2 has to serialize them back.
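
  Why the serializer cannot simply write the CR as a raw byte can be
shown with any conforming XML parser. The sketch below uses Python's
standard xml.etree.ElementTree (expat underneath), not libxml2, so it
only illustrates the XML 1.0 end-of-line rule: a literal \r\n in a
document is normalized to \n on parsing, while a CR written as the
character reference &#13; survives.

```python
import xml.etree.ElementTree as ET

# A literal CR LF in the document is normalized to a single LF by the parser.
literal = ET.fromstring(b"<doc>a\r\nb</doc>").text

# A CR written as a character reference is kept in the text node.
escaped = ET.fromstring(b"<doc>a&#13;\nb</doc>").text

print(repr(literal))
print(repr(escaped))
```

  A serializer that wants the \r in a text node to round-trip therefore
has to emit it in escaped form.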


