Re: Re[7]: WG: [xslt] decimal char problem - possible Solution - Xalans behaviour
- From: Tony Graham <Tony Graham ireland sun com>
- To: xslt gnome org
- Subject: Re: Re[7]: WG: [xslt] decimal char problem - possible Solution - Xalans behaviour
- Date: Mon, 17 Sep 2001 10:55:37 +0100
Daniel Veillard wrote at 14 Sep 2001 07:54:42 -0400:
> On Fri, Sep 14, 2001 at 01:28:04PM +0200, Oliver Feige wrote:
> > It does not generate hexadecimal char references.
> >
> > When you do a transformation with Xalan/Xerces using
> > xsl:output method="html" encoding="ISO-8859-1"
> > it generates e.g. &auml; &szlig; but not &oelig;
> > For œ, Xalan/Xerces emits a decimal char reference (&#339;);
>
> Hum, except for a trial to reverse-engineer what got supported
> by common browsers, I don't really see a clear rule associated to this
> (allow only entities if their character is <= 255 ? that would not
> make much sense nor cover the laquo and raquo cases ...).
>
> I don't see any "right" way to handle this in general,
Section 16.2, HTML Output Method, of the XSLT 1.0 Recommendation [1]
states:

    The version attribute indicates the version of the HTML. The
    default value is 4.0, which specifies that the result should be
    output as HTML conforming to the HTML 4.0 Recommendation.

It also states:

    The html output method may output a character using a character
    entity reference, if one is defined for it in the version of HTML
    that the output method is using.

and:

    It is possible that the result tree will contain a character that
    cannot be represented in the encoding that the XSLT processor is
    using for output. In this case, if the character occurs in a
    context where HTML recognizes character references, then the
    character should be output as a character entity reference or
    decimal numeric character reference; otherwise (for example, in a
    script or style element or in a comment), the XSLT processor should
    signal an error.

It seems to me that an XSLT processor, in the absence of a "version"
attribute with a different value, has the option of emitting character
entity references for any or all of the entities defined in HTML 4.0
[2]. It also seems to me that when a character cannot be represented
directly and the XSLT processor does not emit an entity reference for
the character -- by choice or because there's no such entity defined
in HTML 4.0 -- then a decimal numeric character reference should be
used.
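
To make that reading concrete, here is a minimal sketch in Python of the
fallback order described above. It is not how libxslt or Xalan actually
implement serialization; the function names and the use_entities flag are
hypothetical, and Python's html.entities table (HTML 4.01 names) stands in
for the HTML 4.0 entity list [2]. It covers ordinary text content only, not
the script, style, or comment cases where the Recommendation calls for an
error.

```python
from html.entities import codepoint2name  # HTML 4.01 entity names keyed by code point


def serialize_char(ch, encoding="ISO-8859-1", use_entities=True):
    """Return ch as it could be written in HTML text content (hypothetical helper)."""
    # Markup-significant characters are always escaped in text content.
    if ch in "<>&":
        return "&" + codepoint2name[ord(ch)] + ";"
    try:
        # Representable in the output encoding: emit the character itself.
        ch.encode(encoding)
        return ch
    except UnicodeEncodeError:
        pass
    cp = ord(ch)
    # Not representable: use a character entity reference if one is defined
    # and the processor chooses to emit entities ...
    if use_entities and cp in codepoint2name:
        return "&" + codepoint2name[cp] + ";"
    # ... otherwise fall back to a decimal numeric character reference.
    return "&#%d;" % cp


def serialize_text(text, encoding="ISO-8859-1", use_entities=True):
    """Serialize a whole string of text content, character by character."""
    return "".join(serialize_char(c, encoding, use_entities) for c in text)
```

With ISO-8859-1 output, serialize_text("\u0153") gives "&oelig;", and the
same call with use_entities=False gives "&#339;", which matches the Xalan
output Oliver reported. A processor is also free to emit entity references
for characters the encoding can represent; this sketch simply writes those
characters directly.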
Regards,
Tony Graham
------------------------------------------------------------------------
XML Technology Center - Dublin mailto:tony.graham@ireland.sun.com
Sun Microsystems Ireland Ltd Phone: +353 1 8199708
Hamilton House, East Point Business Park, Dublin 3 x(70)19708
[1] http://www.w3.org/TR/REC-xslt-19991116.html#section-HTML-Output-Method
[2] http://www.w3.org/TR/REC-html40/sgml/entities.html