Re: Gnome 2 Screenshot - Tamil Xft rendering seems broken



At 09:53 AM 1/11/2002, Owen Taylor wrote:

Vikram Subramanian <glueview yahoo com> writes:

> Hi,
>    This is Vikram, who submitted the last patch to the
> Tamil Pango Module(from a different email ID). I
> stumbled across this GNOME 2 screenshot from
> developer.gnome.org.
> http://greebo.homeip.net/gnome2.png
>
> This has a nice screenshot showing Anti-Aliased
> Rendering of HELLO.utf8 under gedit, with Tamil
> rendering broken :-(
>
> Seems none of the characters in the ligature block got
> rendered. I wonder where the person got the TTF fonts
> from.

I'd guess the font is Microsoft's Arial Unicode, which presumably
encodes Tamil using the OpenType Indic set of features rather than the
block of characters in the PUA.

[See my comments below about PUA encodings.]


> Is there some change required to the Tamil Xft
> module?

Well, it would be nice to support the OpenType Indic stuff :-), I
don't have much of a feeling for how much work that would be ... all
the necessary OpenType parsing code should be there, but isn't
necessarily tested at all.


The module I'm converting from ICU can, in theory, handle the following Indic scripts: Devanagari, Bengali, Punjabi, Gujarati, Oriya, Tamil, Telugu, Kannada, and Malayalam, with the following caveats:

1) I've never seen any fonts w/ GSUB/GPOS tables for Bengali, Punjabi, Oriya, or Malayalam.

2) The MS Indic spec. differentiates between Traditional and Reformed Malayalam; but there's only one script code for the script. I guess we have to implement reformed, even though to me, the traditional form looks much more beautiful, and should be easy enough to do using OpenType.

3) The MS spec. says that the features should be applied in a fixed order, regardless of what the font says the order is. Neither my OT engine, nor the FT one in Pango can do this unless we make a separate pass over the whole text for each feature :-( (I think this is how UniScribe does it...)

4) In Tamil and Malayalam, left matras are placed to the left of the *base glyph* rather than to the left of the whole syllable. This needs to be done by a post-gsub process. I've implemented that for Tamil, which is relatively simple, but not for Malayalam. (Also, I haven't started to port this piece of code yet; my hunch is that it will be a bit tricky to port since it involves moving glyphs around and might change cluster boundaries...)
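
To make caveat 3 concrete, here is a minimal sketch of the "one pass per feature" approach. It is not the ICU or Pango code: the font model (a dict from feature tag to ligature-style rules), the glyph names, and the function names are all hypothetical, and the feature order in the usage note is only illustrative, not copied from the MS spec.

```python
# Hypothetical font model: feature tag -> list of (input glyph sequence, replacement).
# Real GSUB lookups are far richer; this only illustrates the ordering problem.

def apply_feature(glyphs, rules):
    """Make one left-to-right pass over the glyphs, applying one feature's rules."""
    out = []
    i = 0
    while i < len(glyphs):
        for seq, replacement in rules:
            n = len(seq)
            if tuple(glyphs[i:i + n]) == tuple(seq):
                out.append(replacement)
                i += n
                break
        else:
            # No rule matched at this position; copy the glyph through.
            out.append(glyphs[i])
            i += 1
    return out


def shape_in_fixed_order(glyphs, font_features, required_order):
    """Apply features in the caller's order, ignoring the order the font lists them in.

    Costs one full pass over the text per feature.
    """
    for tag in required_order:
        rules = font_features.get(tag)
        if rules:
            glyphs = apply_feature(glyphs, rules)
    return glyphs


# The font happens to list "half" before "akhn".
font_features = {
    "half": [(("ka", "virama"), "ka_half")],
    "akhn": [(("ka", "virama", "ssa"), "kssa")],
}
```

With this toy font, calling shape_in_fixed_order(["ka", "virama", "ssa"], font_features, ["akhn", "half"]) forms the conjunct, while following the font's own order would turn "ka" into a half form first and the conjunct rule would never match. The price is the extra pass per feature mentioned above.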

About PUA encodings: In ICU, I handle Arabic fonts without GSUB/GPOS by using a canned GSUB table generated from the Unicode character data base, and using Unicode code points, including Presentation Forms, instead of actual glyph ID's. There's some glue code that overrides the normal character to glyph conversion, and which runs after the GSUB table to convert the final Presentation Forms to glyph ID's. (This required a change in my OT ligature code - before it forms a ligature, it optionally calls a call-back routine which verifies that the font contains that particular Presentation Form)
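
A rough sketch of that idea, under loud assumptions: the canned table is reduced to a single lam + alef rule keyed on nominal code points (the real table works on contextual Presentation Forms), the font's character map is a plain dict, and every name below is made up for illustration rather than taken from ICU.

```python
LAM = 0x0644
ALEF = 0x0627
LAM_ALEF_ISOLATED = 0xFEFB  # ARABIC LIGATURE LAM WITH ALEF ISOLATED FORM

# Hypothetical "canned GSUB": pairs of code points -> Presentation Form code point.
CANNED_LIGATURES = {(LAM, ALEF): LAM_ALEF_ISOLATED}


def form_ligatures(code_points, ligatures, can_form=None):
    """Substitute ligatures, asking the optional call-back before forming each one."""
    out = []
    i = 0
    while i < len(code_points):
        pair = tuple(code_points[i:i + 2])
        lig = ligatures.get(pair)
        if lig is not None and (can_form is None or can_form(lig)):
            out.append(lig)
            i += 2
        else:
            out.append(code_points[i])
            i += 1
    return out


def shape_with_canned_table(code_points, cmap):
    """Run the canned table on code points, then map the results to glyph IDs.

    cmap is an assumed dict of code point -> glyph ID; missing entries map to 0.
    """
    substituted = form_ligatures(
        code_points,
        CANNED_LIGATURES,
        can_form=lambda cp: cp in cmap,  # only form ligatures the font contains
    )
    return [cmap.get(cp, 0) for cp in substituted]
```

The point of the call-back is that a font lacking the ligature's Presentation Form falls back to the separate letters instead of producing a missing glyph.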

I propose doing something like this for all the PUA encodings. This means that all of the code for handling Indic can be in one place. (Well, we'll have to write some code to generate the fake GSUB table, so some of the knowledge will be in that code too - but it still seems like a win to me.)

Eric Mader
IBM GCoC - San José
5600 Cottle Rd. MS 50-2/B11
San José, CA 95193



