Re: [xml] [xslt] HTML vs. XHTML: different output when including a file with \r\n [WAS: xmllint vs. xsltproc: different output when including a file with \r\n]
- From: Nick Wellnhofer <wellnhofer aevum de>
- To: xslt gnome org, xml gnome org
- Subject: Re: [xml] [xslt] HTML vs. XHTML: different output when including a file with \r\n [WAS: xmllint vs. xsltproc: different output when including a file with \r\n]
- Date: Sun, 24 Jan 2010 23:20:42 +0100
On 24/01/10 21:14, Boris Schaeling wrote:
On Sat, 23 Jan 2010 20:51:46 +0100, Boris Schaeling <boris highscore de>
wrote:
When I use "xmllint --xinclude" to include a text file with \r\n end
line characters xmllint inserts &#13; characters while "xsltproc
--xinclude" doesn't. I'm currently trying to find out (on the DocBook
mailing list; see
http://lists.oasis-open.org/archives/docbook/201001/msg00052.html) why
the output is different and what to do to make xmllint not generate
&#13; characters. Maybe someone here can tell me if this is a bug or
if there is a trick to change xmllint's output (I couldn't find a
command line option so far)?
As it turns out the problem is different. After some discussions on the
DocBook mailing list it is now clear that generating XHTML leads to
&#13; characters being inserted while no &#13; characters are inserted
when generating HTML. Thus it depends on the xsl:output setting of the
stylesheet used with xsltproc if &#13; characters are inserted or not
(please forget what I wrote about xmllint; xmllint can be ignored). Bob
Stayton explained this in his message to the DocBook mailing list:
http://lists.oasis-open.org/archives/docbook/201001/msg00065.html
The question is now why \r becomes &#13; when generating XHTML but not
when generating HTML? Are there any specifications which are different
for XHTML and HTML when it comes to xincluding simple text files with
\r\n end line markers?
(Cross-posting to xml gnome org)
It seems that the default behavior of libxml is to encode "\r" as
"&#13;". But there is an exception for HTML in
xmlEncodeEntitiesReentrant in entities.c. I haven't checked, but looking
at the source the XHTML serialization code seems to call
xmlEscapeContent in xmlIO.c. There's also xmlEscapeEntities in xmlsave.c
but that uses hex char refs. Those two functions don't make an exception
for XHTML content.
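
For context on why an XML serializer might escape "\r" in the first place: XML parsers normalize a literal \r\n (or a lone \r) to \n on input, so a carriage return only survives a round trip if it is written as a character reference. The sketch below illustrates this with the parser in Python's standard library rather than libxml2, so it shows the general XML rule and not libxml's own code path; the helper name parsed_text is made up for the example.

```python
import xml.etree.ElementTree as ET


def parsed_text(xml_source: str) -> str:
    """Return the text content of the root element after parsing."""
    return ET.fromstring(xml_source).text


# A literal CR LF pair in the source is normalized to a single LF.
literal = parsed_text("<p>a\r\nb</p>")

# A carriage return written as a character reference survives parsing.
escaped = parsed_text("<p>a&#13;\nb</p>")

print(repr(literal))  # 'a\nb'
print(repr(escaped))  # 'a\r\nb'
```

HTML parsers do not handle such references the same way as an XML parser does, which may be part of the reason the HTML output path is treated as an exception.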
Personally, I think libxml shouldn't escape "\r" at all.
Nick