Re: Industry Thai Cell-Clustering Rules

From: Pablo Saratxaga <pablo mandrakesoft com>
To: gtk-i18n-list gnome org
Subject: Re: Industry Thai Cell-Clustering Rules
Date: Fri, 3 Nov 2000 22:46:38 +0100

Kaixo!

On Fri, Nov 03, 2000 at 03:22:26PM -0500, Owen Taylor wrote:

>  - It is important in Pango that clustering (in the Pango sense -
>    I'm not sure I quite understand the sense that you mean), is
>    independent of the font.

Clustering is very important for cursor positioning, selection, deletion, etc.
Wtt2.0 also has rules for displaying correct and incorrect sequences of
glyphs differently.
The reason for this is, if I understood correctly, that TIS-620 is a
glyph-based encoding rather than a letter-based one.
The ISCII standard does not have that problem, as it is letter-based: if a word
is pronounced a-b-c-d it is typed a-b-c-d, even if it displays as c-X-d (with
"X" being a special conjunct). In Thai TIS-620, however, if a word is
pronounced a-b-c-d but displayed c-a-b-d, it must be typed c-a-b-d.
The problem arises because there are two kinds of zero-width "diacritics"
that can be used together, and they must be written in the right order;
however, as they are zero width, without a special shaping rule
the visual result is the same in both orders.
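
To make this concrete, here is a small Python sketch (not taken from Wtt2.0 or
Pango). It assumes Python's built-in "tis-620" codec, and the sample characters
are my own choice for illustration: a leading vowel is stored before the
consonant it follows in speech, and the two orders of a vowel plus a tone mark
are different byte sequences even though an unshaped renderer would stack them
on the same base.

```python
# Thai characters written as escapes so the stored order is unambiguous.
KO_KAI = "\u0e01"  # consonant: base glyph, non-zero width
MO_MA = "\u0e21"   # consonant: base glyph, non-zero width
SARA_E = "\u0e40"  # leading vowel: written (and typed) before its consonant
SARA_I = "\u0e34"  # above vowel: zero width
MAI_EK = "\u0e48"  # tone mark: zero width


def tis620_hex(text):
    """Return the TIS-620 bytes of text as space-separated hex."""
    return " ".join(f"{b:02x}" for b in text.encode("tis-620"))


# Visual order: the vowel sign comes first in the stored text,
# although it is pronounced after the consonant.
visual_order_word = SARA_E + KO_KAI + MO_MA

# Same base, same two zero-width marks, two different typing orders.
right_order = KO_KAI + SARA_I + MAI_EK  # vowel, then tone mark
wrong_order = KO_KAI + MAI_EK + SARA_I  # tone mark, then vowel

if __name__ == "__main__":
    print(tis620_hex(visual_order_word))
    print(tis620_hex(right_order))
    print(tis620_hex(wrong_order))
```

Both mark orders are valid as encoded text, so the encoding itself cannot
reject the wrong one; only a display or input rule can make the difference
visible.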

Look at http://www.inet.co.th/cyberclub/trin/thairef/ in
"III. Distinct Characteristics of Thai", you may better understand the
problem.

So Wtt2.0 apparently solves it by displaying differently what is different:
correct sequences are displayed in the normal way, while incorrect sequences
are broken into separate cells. At the point where the correct sequence is
broken, a new cell is started.

>    I can have a GtkTextBuffer object with multiple views of the same
>    text with different.

With different what?

> What would be the reasons why a user would want to choose one
> clustering rule or another?

I now think that very few users would want to choose, and among those,
the majority would prefer the official and more logical way of Wtt2.0.
So I vote for implementing only that.

> I don't think I still quite understand the concept of cell-clustering.
> 
> Can you maybe explain how the shapes and/or metrics of a font depend
> on the clustering rule for which it was designed?

A cluster is made of a base glyph (non-zero width) and various "diacritics"
(zero-width glyphs). When the sequence of those "diacritics" is wrong,
the visual result is as if a space (blank but of non-zero width) had
been inserted. A rough attempt in ASCII (A B C: base letters; d e f: vowels;
x y z: tone marks; d, e, f, x, y, z are zero width):

| good: A e x B y C f
| displayed as (3 clusters):
| x
| e  y  f
| A  B  C 

| wrong: A x e B f y z 
| displayed as (with Wtt2.0 clustering) (4 clusters):
|                     or maybe (bad typography):
|       y             x     y  z
| x  e  f  z             e  f 
| A     B             A     B
|
| instead of (without Wtt2.0 clustering) (2 clusters (or maybe 3, but not 4)):
| x  y    
| e  fz
| A  B
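
The cell counts above can be reproduced with a small Python sketch. It assumes
a toy rule inferred only from this example, far simpler than the real Wtt2.0
tables: a cell is one base letter, then at most one vowel, then at most one
tone mark, and any mark that breaks this order starts a cell of its own. The
function names are made up for illustration.

```python
BASES = set("ABC")   # base letters, non-zero width
VOWELS = set("def")  # zero-width vowel marks
TONES = set("xyz")   # zero-width tone marks


def split_cells(text):
    """Split text into cells of the form: base [vowel] [tone] (toy rule)."""
    cells = []
    # Rank of the last glyph in the open cell: 0 base, 1 vowel, 2 tone.
    # None means no cell is open for further marks.
    last_rank = None
    for ch in text:
        if ch in BASES:
            rank = 0
        elif ch in VOWELS:
            rank = 1
        elif ch in TONES:
            rank = 2
        else:
            raise ValueError(f"not in the toy alphabet: {ch!r}")
        if rank > 0 and last_rank is not None and rank > last_rank:
            # The mark arrives in the right order: it joins the open cell.
            cells[-1] += ch
            last_rank = rank
        else:
            # A base letter, or a mark out of order: start a new cell.
            cells.append(ch)
            # An orphan mark stands alone and accepts no further marks.
            last_rank = 0 if rank == 0 else None
    return cells


def attach_all(text):
    """No clustering rule: every mark stacks on the preceding base letter."""
    cells = []
    for ch in text:
        if ch in BASES or not cells:
            cells.append(ch)
        else:
            cells[-1] += ch
    return cells


if __name__ == "__main__":
    print(split_cells("AexByCf"))  # the "good" sequence
    print(split_cells("AxeBfyz"))  # the "wrong" sequence, with the toy rule
    print(attach_all("AxeBfyz"))   # the "wrong" sequence, without it
```

With the toy rule the wrong sequence yields four cells (the stray "e" and "z"
stand alone), and without it only two, which matches the drawings.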

-- 
Ki ça vos våye bén,
Pablo Saratxaga

http://www.srtxg.easynet.be/		PGP Key available, key ID: 0x8F0E4975

References:
- Re: Industry Thai Cell-Clustering Rules
  - From: Chookij Vanatham
- Re: Industry Thai Cell-Clustering Rules
  - From: Owen Taylor
