Re: Re[7]: WG: [xslt] decimal char problem - possible Solution - Xalans behaviour



Daniel Veillard wrote at 14 Sep 2001 07:54:42 -0400:
 > On Fri, Sep 14, 2001 at 01:28:04PM +0200, Oliver Feige wrote:
 > > It does not generate hexadecimal char references.
 > > 
 > > When you do a transformation with Xalan/Xerces using
 > > 	xsl:output method="html" encoding="ISO-8859-1"
 > > it generates e.g. &auml; and &szlig; but not &oelig;.
 > > For œ, Xalan/Xerces emits a decimal character reference (&#339;).
 > 
 >   Hmm, apart from an attempt to reverse-engineer what common
 > browsers support, I don't really see a clear rule behind this
 > (allow entities only if their character is <= 255? That would not
 >  make much sense, nor would it cover the laquo and raquo cases ...).
 > 
 >   I don't see any "right" way to handle this in general,

Section 16.2, HTML Output Method, of the XSLT 1.0 Recommendation [1]
states:

   The version attribute indicates the version of the HTML. The
   default value is 4.0, which specifies that the result should be
   output as HTML conforming to the HTML 4.0 Recommendation.

It also states:

   The html output method may output a character using a character
   entity reference, if one is defined for it in the version of HTML
   that the output method is using.

and:

   It is possible that the result tree will contain a character that
   cannot be represented in the encoding that the XSLT processor is
   using for output. In this case, if the character occurs in a
   context where HTML recognizes character references, then the
   character should be output as a character entity reference or
   decimal numeric character reference; otherwise (for example, in a
   script or style element or in a comment), the XSLT processor should
   signal an error.

It seems to me that an XSLT processor, in the absence of a "version"
attribute with a different value, has the option of emitting character
entity references for any or all of the entities defined in HTML 4.0
[2].  It also seems to me that when a character cannot be represented
directly and the XSLT processor does not emit an entity reference for
the character -- by choice or because there's no such entity defined
in HTML 4.0 -- then a decimal numeric character reference should be
used.
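
That reading can be sketched in a few lines of Python. This is only an
illustration of the rule as described above, not how Xalan or libxslt
actually implement it; serialize_char, serialize_text and the
prefer_entities switch are hypothetical names, and Python's
html.entities.codepoint2name table is assumed here as a stand-in for
the HTML 4.0 entity set [2].

```python
from html.entities import codepoint2name  # HTML 4 entity names keyed by code point


def serialize_char(ch, encoding="iso-8859-1", prefer_entities=True):
    """Return the HTML text for one character (hypothetical helper)."""
    # Markup-significant characters are always escaped in text content.
    if ch in "<>&":
        return "&" + codepoint2name[ord(ch)] + ";"
    try:
        # If the output encoding can represent the character, emit it as-is.
        ch.encode(encoding)
        return ch
    except UnicodeEncodeError:
        pass
    # Not representable: use a named entity if one exists and we choose to...
    name = codepoint2name.get(ord(ch)) if prefer_entities else None
    if name is not None:
        return "&" + name + ";"
    # ...otherwise fall back to a decimal numeric character reference.
    return "&#%d;" % ord(ch)


def serialize_text(text, encoding="iso-8859-1", prefer_entities=True):
    """Apply serialize_char to every character of a text node."""
    return "".join(serialize_char(c, encoding, prefer_entities) for c in text)
```

With prefer_entities=False the sketch gives the result reported for
Xalan/Xerces on œ (a decimal reference); with the default it would
emit &oelig; instead. Both outcomes appear to be allowed by the text
quoted above.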

Regards,


Tony Graham
------------------------------------------------------------------------
XML Technology Center - Dublin        mailto:tony.graham@ireland.sun.com
Sun Microsystems Ireland Ltd                       Phone: +353 1 8199708
Hamilton House, East Point Business Park, Dublin 3            x(70)19708


[1] http://www.w3.org/TR/REC-xslt-19991116.html#section-HTML-Output-Method
[2] http://www.w3.org/TR/REC-html40/sgml/entities.html



