Re: [xslt] Use character entities to represent non-ASCII characters



Hi Nick,

Thanks for the quick response.

Specifying HTML as the output type does not cause libxslt to generate ASCII with character entities for 
non-ASCII characters.

I am porting an existing XML-based app from JAXP to libxslt. The app's existing tests expect character 
entities because that is what JAXP produces for HTML output. I was hoping to avoid updating the tests for 
libxslt.

Regards,

Paul

-----Original Message-----
From: Nick Wellnhofer <wellnhofer aevum de> 
Sent: Sunday, August 21, 2022 9:08 AM
To: The Gnome XSLT library mailing-list <xslt gnome org>
Cc: Paul Kinnucan <paulk mathworks com>
Subject: Re: [xslt] Use character entities to represent non-ASCII characters

On 19/08/2022 19:41, Paul Kinnucan via xslt wrote:
I am trying to use libxslt to transform an XML file that contains 
non-ASCII characters to an HTML file. Other xslt processors, such as 
JAXP and Xalan, replace non-ASCII characters with their character 
entity equivalents, e.g., £ -> &pound;. However, libxslt simply outputs 
the UTF-8 rendition of the non-ASCII character.

Is there a way to get libxslt to output the equivalent character entity instead?

If the output encoding is UTF-8, there's no reason not to output non-ASCII characters as UTF-8 (unless you're 
talking about non-ASCII characters in URI attribute values). Setting the output encoding to "HTML" should do 
what you want:

    <xsl:output encoding="HTML" .../>

This is non-standard, though. You can also set the output encoding to "ASCII", but this will produce numeric 
character references like "&#163;".
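
For reference, a minimal stylesheet sketch of the "ASCII" variant described above. The input element name (price) and the template are hypothetical and only there to make the example self-contained; the relevant part is the xsl:output element. Per the description above, a character such as £ should then be serialized as a numeric character reference rather than as raw UTF-8 bytes; the exact surrounding markup may differ between libxslt versions.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- method="html" selects HTML serialization; encoding="ASCII" asks the
       serializer to escape characters that ASCII cannot represent. -->
  <xsl:output method="html" encoding="ASCII"/>

  <!-- Hypothetical input document: <price>£5</price> -->
  <xsl:template match="/price">
    <html>
      <body>
        <p><xsl:value-of select="."/></p>
      </body>
    </html>
  </xsl:template>

</xsl:stylesheet>
```

Replacing encoding="ASCII" with encoding="HTML" in the same xsl:output element gives the non-standard variant mentioned above.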

Nick


