Re: Re[3]: WG: [xslt] decimal char problem - possible Solution



On Thu, Sep 13, 2001 at 06:09:59PM +0200, Marco Stipek wrote:
> Hello Daniel,
> 
> as you can see we used the xsl:output directive. If you now ask
> us to use encoding="html", this is not possible. Other problems will
> occur.
> E.g. the OE ligature will not be displayed correctly. It will
> be transformed, in conformance with the standard, to &oelig;. But this
> is not supported by many browsers (including Netscape). So we need to
> have the decimal value.

  that conversion is put in place by using HTML as a special output
encoding converter, registered at the end of
  xmlInitCharEncodingHandlers() (encoding.c, line 1584)

  it registers UTF8ToHtml() (HTMLparser.c, line 1411) as the conversion
routine, which itself uses htmlEntityValueLookup() to find the reverse
match from the Unicode code point to the entity name.
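
  To make the reverse match concrete, here is a minimal sketch of a
code point to entity name lookup. It is not the libxml2 code: the table
is a tiny illustrative subset, and entity_name_for() and check_lookup()
are hypothetical names made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Illustrative subset only; a real HTML entity table is much larger. */
struct entity {
    unsigned int value;     /* Unicode code point */
    const char *name;       /* entity name without '&' and ';' */
};

/* Must stay sorted by code point for the binary search below. */
static const struct entity entities[] = {
    { 160, "nbsp"  },
    { 171, "laquo" },
    { 187, "raquo" },
    { 338, "OElig" },
    { 339, "oelig" },
};

/* Hypothetical stand-in for the reverse lookup: returns the entity
 * name for a code point, or NULL when the table has no entry. */
const char *entity_name_for(unsigned int value)
{
    size_t lo = 0;
    size_t hi = sizeof(entities) / sizeof(entities[0]);

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;

        if (entities[mid].value == value)
            return entities[mid].name;
        if (entities[mid].value < value)
            lo = mid + 1;
        else
            hi = mid;
    }
    return NULL;
}

/* Usage: U+0153 maps to "oelig", plain 'A' (65) has no entity. */
void check_lookup(void)
{
    assert(strcmp(entity_name_for(339), "oelig") == 0);
    assert(entity_name_for(65) == NULL);
}
```

  Whenever such a lookup finds a name, the serializer emits the named
form, which is exactly what the old browsers choke on.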

oelig has been broken in Netscape for ages, as well as laquo and raquo
(being French I have been quite annoyed by this).

  There are multiple ways to attack the problem:
    - use the XML serializer, i.e. do not use method="html".
      I think it opens the door to a number of more subtle violations;
      section 16.2 of the XSLT REC should be sufficient to convince
      people that it is a dead end.
    - use the HTML serializer without an encoding declaration.
      This means fixing the UTF8ToHtml() output to be compatible with
      Netscape 4; it's bad but just reflects that HTML implementations
      are bad.
    - use the HTML serializer without an encoding declaration,
      by registering your own "html encoder". This is a relatively
      simple and clean way to implement your own rules even if libxml
      is not changed the way you like.
    - use the HTML serializer and force ISO Latin 1 encoding; in
      that case the fallback to convert "unsupported chars" to
      a charref could be switched to decimal.
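
  As an illustration of the "own html encoder" option above, here is a
rough sketch of a converter that writes every non-ASCII character as a
decimal character reference. It is self-contained and does not call
libxml2; the name utf8_to_decimal_refs() is made up, and the parameter
layout and return codes are only assumed to mirror UTF8ToHtml().
Registering it under the "HTML" name would presumably go through
xmlNewCharEncodingHandler(), but that step is not shown and not tested
here.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical converter: UTF-8 in, ASCII out, with every non-ASCII
 * character written as a decimal reference (&#NNN;).
 * On entry *outlen and *inlen hold the buffer sizes; on exit they hold
 * the number of bytes written and consumed.
 * Returns 0 on success, -2 on malformed UTF-8 (assumed convention).
 */
int utf8_to_decimal_refs(unsigned char *out, int *outlen,
                         const unsigned char *in, int *inlen)
{
    int ipos = 0, opos = 0;

    if (in == NULL) {                   /* nothing to convert */
        *outlen = 0;
        *inlen = 0;
        return 0;
    }
    while (ipos < *inlen) {
        unsigned int c = in[ipos];
        int extra;                      /* continuation bytes expected */
        char ref[16];
        int n, i;

        if (c < 0x80) {
            extra = 0;
        } else if ((c & 0xE0) == 0xC0) {
            c &= 0x1F; extra = 1;
        } else if ((c & 0xF0) == 0xE0) {
            c &= 0x0F; extra = 2;
        } else if ((c & 0xF8) == 0xF0) {
            c &= 0x07; extra = 3;
        } else {                        /* invalid lead byte */
            *outlen = opos; *inlen = ipos;
            return -2;
        }
        if (ipos + extra >= *inlen)
            break;                      /* truncated sequence: stop here */
        for (i = 1; i <= extra; i++) {
            if ((in[ipos + i] & 0xC0) != 0x80) {
                *outlen = opos; *inlen = ipos;
                return -2;
            }
            c = (c << 6) | (in[ipos + i] & 0x3F);
        }
        if (extra == 0) {
            ref[0] = (char)c;           /* ASCII passes through as-is */
            n = 1;
        } else {
            n = snprintf(ref, sizeof(ref), "&#%u;", c);
        }
        if (opos + n > *outlen)
            break;                      /* output buffer full: stop here */
        memcpy(out + opos, ref, (size_t)n);
        opos += n;
        ipos += extra + 1;
    }
    *outlen = opos;
    *inlen = ipos;
    return 0;
}

/* Usage: U+0153 (bytes C5 93) comes out as the decimal reference. */
void demo_oelig(void)
{
    const unsigned char in[] = "c\xC5\x93ur";
    unsigned char out[32];
    int inlen = (int)sizeof(in) - 1;
    int outlen = (int)sizeof(out) - 1;

    assert(utf8_to_decimal_refs(out, &outlen, in, &inlen) == 0);
    out[outlen] = '\0';
    assert(strcmp((const char *)out, "c&#339;ur") == 0);
}
```

  Note that this sketch also turns characters like laquo and raquo into
decimal references, and it leaves markup characters such as '&' and '<'
untouched, on the assumption that the serializer has already escaped
them.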

  As you can see, in just 10 minutes I can come up with very different
solutions to solve this, including one which would work right now without
changing libxml2 ...

  I'm tempted to change the default behaviour as you initially suggested,
but it was not clear at all that it is the right way; there were a lot of
other constraints and possibilities not described.

Daniel

-- 
Daniel Veillard      | Red Hat Network http://redhat.com/products/network/
veillard@redhat.com  | libxml Gnome XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/



