Re: Industry Thai Cell-Clustering Rules
- From: Pablo Saratxaga <pablo mandrakesoft com>
- To: gtk-i18n-list gnome org
- Subject: Re: Industry Thai Cell-Clustering Rules
- Date: Fri, 3 Nov 2000 16:07:52 +0100
Kaixo!
On Thu, Nov 02, 2000 at 05:18:44PM -0800, Chookij Vanatham wrote:
> According to our experience, there are three different practices of Thai
> fonts for rendering :
>
> 1. Plain tis620 : combining characters are placed at the safe positions to
> prevent collapsion. There are two practices of this kind :
> - negative-offset-zero-width diacritics (this makes the fonts apply to
> 2. MacThai extension : an extended tis620 code set, by using codes in the
> range 0x80-0x9f and in some free slots to keep the prepositioned
> combining characters. This needs a shaping algorithm to produce
> elegant rendering.
But do the glyphs at the normal tis-620 positions (that is, not precombined)
share the same properties as in 1. above? In other words, if we take an
extended Thai font and use it as if it were a simple tis-620-only one,
would it work in an acceptable fashion?
Until now I thought that the differences were in how the
negative-offset-zero-width diacritics were computed...
> 3. WindowsThai extension : similar to MacThai extension, but used in
> Windows Thai Editions.
>
> The last two code sets are mapped to their own private area of Unicode and
> cannot be used together.
Can't those be detected somehow? (It would be interesting to have the
list of combinations and the codepoints assigned to the precombined glyphs.)
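To illustrate what I mean by "detected": a minimal sketch in Python, assuming
only that the precombined glyphs sit somewhere in the Unicode BMP Private Use
Area (U+E000..U+F8FF). The function name is made up for this example, and the
actual MacThai and WindowsThai assignments are not given in this thread, so
this can only flag that a string contains such codepoints; it cannot tell the
two extensions apart.

```python
def private_use_codepoints(text):
    """Return (index, codepoint) pairs for characters of `text` that fall
    in the BMP Private Use Area (U+E000..U+F8FF)."""
    return [
        (i, ord(ch))
        for i, ch in enumerate(text)
        if 0xE000 <= ord(ch) <= 0xF8FF
    ]


# Hypothetical input: one plain Thai consonant followed by one
# private-use codepoint (the value 0xF700 is arbitrary, not a real mapping).
sample = "\u0E01\uF700"
hits = private_use_codepoints(sample)
```

With the real list of assigned codepoints, the range test could be replaced
by a lookup in one table per extension.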
> As far as I know, there is only 1 cell-clustering rule defined from Thai
> government (by NECTEC). This one is called Wtt2.0 and the detail is attached.
> We should add the word "wtt2.0" to any names if they are using Wtt2.0 cell
> clustering rule.
> ****
> If the cell-cluster is composed of "consonant", "vowel" and "tonemark",
> vowel character will always follow consonant and tonemark character
> will always follow vowel as shown below.
>
> Consonant + Vowel + Tonemark -----> One cell cluster
>
> If tonemark comes before vowel, the vowel character will be considered as
> another cell-cluster as shown below.
>
>
> [Consonant + Tonemark] [Vowel] ----> Two cell clusters
> ******
But that is not font-specific, is it?
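To make the quoted rule concrete, here is a minimal sketch in Python that
works on codepoints only and never looks at a font. It is not the full Wtt2.0
table: the character classes below are a simplified assumption (consonants
U+0E01..U+0E2E, above/below vowels U+0E31 and U+0E34..U+0E39, tone marks
U+0E48..U+0E4B), and I also assume that only a consonant can open a
multi-character cell. All names are made up for the example.

```python
CONSONANT, VOWEL, TONE, OTHER = "C", "V", "T", "X"


def char_class(ch):
    """Simplified classification (an assumption, not the Wtt2.0 table)."""
    cp = ord(ch)
    if 0x0E01 <= cp <= 0x0E2E:
        return CONSONANT
    if cp == 0x0E31 or 0x0E34 <= cp <= 0x0E39:
        return VOWEL          # above/below vowels only
    if 0x0E48 <= cp <= 0x0E4B:
        return TONE           # mai ek, mai tho, mai tri, mai chattawa
    return OTHER


# (previous class, current class) pairs that stay in the same cell:
# a vowel or tone mark after a consonant, a tone mark after a vowel.
# A vowel after a tone mark is deliberately absent.
ALLOWED = {(CONSONANT, VOWEL), (CONSONANT, TONE), (VOWEL, TONE)}


def cell_clusters(text):
    """Split `text` into cell clusters following the quoted rule."""
    clusters = []
    prev = None
    for ch in text:
        cls = char_class(ch)
        if clusters and (prev, cls) in ALLOWED:
            clusters[-1] += ch
            prev = cls
        else:
            clusters.append(ch)
            # Assumption: a stray mark opens a cell that nothing can join.
            prev = cls if cls == CONSONANT else OTHER
    return clusters


good = cell_clusters("\u0E01\u0E35\u0E48")   # consonant + vowel + tone mark
bad = cell_clusters("\u0E01\u0E48\u0E35")    # consonant + tone mark + vowel
```

Here `good` comes out as one cell and `bad` as two, [consonant + tone mark]
and [vowel], which matches the two cases in the quoted text.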
> In my opinion, then, we might have these 2 types of cell-clustering rules
> and one has the name "wtt2.0", the other I'm not sure if we are going to
> name it or not.
But those two rules are a user-defined preference about how the user prefers
to see wrongly typed Thai strings, not a property of fonts; the two behaviours
should be possible with any kind of Thai font.
(Or we could force the "wtt2.0" behaviour in all cases, as, IIC, it is a rule
that means "show wrong Thai sequences in a special way, so it is evident they
are wrong".)
> - How bad the problem is with legacy fonts without identified
> clustering rules.
>
> We won't be able to have Thai display correctly after we do text
> manipulation, like, insert, delete, copy-paste, selection, scrolling,
Is that because the string copied into the buffer uses the non-tis-620
codepoints of the precomposed glyphs, or because of a cursor positioning
problem?
--
Ki ça vos våye bén,
Pablo Saratxaga
http://www.srtxg.easynet.be/ PGP Key available, key ID: 0x8F0E4975