RE: [xslt] Kannada fonts garbled in xsltproc



Hi Manoj,

My answers are below, but first some background and an apology to Daniel:

(Daniel, let's keep this on the list for a moment, please.
For those writing between U+0900 and U+0DFF, i.e. in South
Asian scripts, the situation is really unfortunate. National
standardization produced the ISCII character model, which is
hard to handle. In addition, users of the Dravidian languages
produced competing provincial standards, as they felt
misrepresented by ISCII. Most of these encodings are not
IETF registered, let alone implemented in current browsers,
so x-user-defined and masquerading fonts rule.
There is a migration towards the Unicode character model, but
it's very slow. Bear in mind that some characters in legacy
encodings translate into 3 or more Unicode codepoints, i.e.
9 or more UTF-8 bytes. Anyway, let's try to give Manoj a
start.)
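
To make the byte arithmetic concrete, here is a small Python sketch
(Python is only my choice for illustration; it is not part of Manoj's
setup). It uses the Kannada conjunct KA + VIRAMA + SSA as an example of
something a legacy font might plausibly store as one glyph code; that
last point is an assumption, as the real tables differ per font.

```python
# KA (U+0C95) + VIRAMA (U+0CCD) + SSA (U+0CB7): three codepoints
# that together form one Kannada conjunct.
conjunct = "\u0C95\u0CCD\u0CB7"

# Every codepoint in the Kannada block (U+0C80..U+0CFF) needs
# three bytes in UTF-8.
utf8 = conjunct.encode("utf-8")

codepoint_count = len(conjunct)
byte_count = len(utf8)
print(codepoint_count, "codepoints,", byte_count, "UTF-8 bytes")
```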

Manoj wrote:
> I changed the encoding from X-user-defined to iso-8859-1 and the characters
> are not changed. Thanks for your help, its been a week since I've been
> trying to make xsltproc not to change the characters.

Did you set the encoding in the right places?
Your XSL should start like this now:

<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="html" encoding="ISO-8859-1"/>

Given this xsl:output element, your Kannada characters masquerading
as Latin-1 Supplement should be left unchanged.
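
Why this works can be sketched in a few lines of Python (again only an
illustration language, xsltproc itself is not involved). Every byte
value maps to exactly one ISO-8859-1 character and back, so bytes that
a masquerading font repurposes survive a decode/encode round trip. For
contrast, the last lines show what a serializer has to do when the
output encoding cannot represent a character; whether that is what
happened in your earlier runs is only a guess on my part.

```python
# Bytes 0xA0..0xFF: the Latin-1 Supplement range that a masquerading
# font typically reuses for its own glyphs.
legacy_bytes = bytes(range(0xA0, 0x100))

# Decoding and re-encoding as ISO-8859-1 is lossless for any byte.
as_text = legacy_bytes.decode("iso-8859-1")
round_tripped = as_text.encode("iso-8859-1")
print(round_tripped == legacy_bytes)

# An encoding that lacks the character forces an escape instead.
escaped = "\u00e9".encode("ascii", "xmlcharrefreplace")
print(escaped)
```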

BUT: Manoj, you should divert some of your hacking time to 
researching your character encoding options. Are you sure 
you can't migrate authoring, transformation and deployment 
to Unicode? Do you need Unicode fonts? 

Even if you decide the final webpages must be delivered as
x-user-defined with a masquerading font, it would be a big
improvement if you authored and transformed using Unicode Kannada,
and changed the character encoding as a separate, final step.
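
Such a final step could look roughly like the Python sketch below. The
mapping table and the function name to_legacy are entirely
hypothetical: the byte values are made up, and a real table depends on
the specific masquerading font. The point is only that multi-codepoint
sequences must be matched before single codepoints.

```python
# HYPOTHETICAL table: the byte values are invented for illustration.
# A real table must come from the masquerading font in use.
UNICODE_TO_LEGACY = {
    "\u0C95\u0CCD\u0CB7": b"\xC7",  # conjunct KA+VIRAMA+SSA -> one glyph
    "\u0C95": b"\xC0",              # KA
    "\u0CB7": b"\xC1",              # SSA
}

def to_legacy(text):
    """Convert Unicode text to legacy bytes, longest match first."""
    keys = sorted(UNICODE_TO_LEGACY, key=len, reverse=True)
    out = bytearray()
    i = 0
    while i < len(text):
        for key in keys:
            if text.startswith(key, i):
                out += UNICODE_TO_LEGACY[key]
                i += len(key)
                break
        else:
            # No table entry: pass ASCII through, reject anything else.
            if ord(text[i]) >= 0x80:
                raise ValueError("no legacy mapping for U+%04X" % ord(text[i]))
            out.append(ord(text[i]))
            i += 1
    return bytes(out)

print(to_legacy("\u0C95\u0CCD\u0CB7 \u0C95"))
```

Run as the last stage of the pipeline, this keeps the XML and XSLT
themselves in plain Unicode.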

> Is there a better solution than libxml2? Will I run into problems later if I
> use libxml2 for this Content Management?

You are not fighting against anything libxml2-specific, but against 
the character encoding problems mentioned above.

Best Regards,
Peter Jacobi


