Re: Clustering rules (was Re: Adding tis620.2533-0...)



Chookij Vanatham <chookij vanatham eng sun com> writes:

> Hi Owen,
> 
> First of all, thanks so much for giving the direction to sort this out.
> 
> ] 
> ] I'm not very happy about any of the solutions that have been proposed
> ] so far, because they involve separating the font from the clustering
> ] rules that need to be used for the font.
> ] 
> ] This is going to present significant headaches for users, in having
> ] to create configuration files that describe how each font should
> ] be clustered and maintain them as they add new fonts.
> 
> I agree with this, as long as there are other ways to change/adjust
> the rules.
> 
> ] 
> ] I'm already concerned that the configuration for Pango with the
> ] pangox-aliases file is not easy enough. I don't want to make this
> ] worse.
> ] 
> ] 
> ] I think we can do better than this, though it may cause backwards
> ] compatibility problems with some existing fonts; we need to
> ] either:
> ] 
> ]  - Standardize on one set of clustering rules per encoding
> ] 
> ]  - Represent the clustering rules in the encoding field of the
> ]    XLFD:  (tis620-3.wtt20)
> 
> I agree with this, and I think it should be done within a scope that
> can be controlled for the short-term plan and that satisfies everybody
> with enough flexibility.
> 
> ] 
> ] Or:
> ] 
> ]  - Store the appropriate clustering rules as an additional 
> ]    property on the font.
> ] 
> ]    This could either be an atom representing one of a set 
> ]    of standard clustering rules or a more complex rule set.
> 
> This is an interesting idea as well, and I would like to know more.
> For now, I can't say anything more because I don't have much
> experience with fonts; if there is any other related information,
> please let me know.

X provides a mechanism where extra properties can be attached to
an X font and accessed by a client using the font. 

The data is basically a string, though it is transferred to the
client in a form that is rather inefficient for arbitrary string
data: an atom interned on the X server.

We will be using this for representing ligaturing rules for Indic
fonts in Pango, where:

 - There is no standard for Indic glyph encodings in common use
   currently.
 - The set of ligatures can depend quite a bit on the font.

In this case, since the set of ligature rules is not at all
fixed, we actually store the rules in the property.


I don't think that we really need that much flexibility for
Thai, since there is no reason for every font to have a different
clustering rule. So, if we wanted to use a font property, 
it would be more efficient to use a single atom as an enumeration.


However, there are problems with this approach.

 - It seems to be limited to BDF (bitmap) fonts, since I don't think
   there is any way to attach font properties to TrueType or Type1
   fonts in X.

 - Using it for existing fonts would require modifying the font 
   data, unlike the approach of representing it in the encoding
   field, which can be done with simple X font aliases.
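
As a rough illustration of the encoding-field approach, here is a
minimal Python sketch. It assumes the clustering rule is written as a
".rule" suffix on the last XLFD field (as in "tis620-3.wtt20"); that
convention, the helper names parse_encoding and alias_line, and the
sample font name are all hypothetical and only meant to show the idea.

```python
from typing import NamedTuple, Optional


class FontEncoding(NamedTuple):
    registry: str               # e.g. "tis620"
    encoding: str               # e.g. "3"
    clustering: Optional[str]   # e.g. "wtt20", or None if not stated


def parse_encoding(xlfd: str) -> FontEncoding:
    """Split the last two XLFD fields into registry, encoding, and rule name."""
    fields = xlfd.split("-")
    # A full XLFD starts with "-" and has 14 fields, so split() yields 15 items.
    if len(fields) != 15 or fields[0] != "":
        raise ValueError(f"not a fully specified XLFD: {xlfd!r}")
    registry, encoding = fields[13], fields[14]
    # Assumed convention: "<encoding>.<rule>" in the encoding field.
    encoding, _, rule = encoding.partition(".")
    return FontEncoding(registry, encoding, rule or None)


def alias_line(xlfd: str, rule: str) -> str:
    """Build a fonts.alias entry exposing an existing font under a rule-tagged name."""
    enc = parse_encoding(xlfd)
    prefix = xlfd.rsplit("-", 2)[0]   # everything before registry-encoding
    alias = f"{prefix}-{enc.registry}-{enc.encoding}.{rule}"
    return f'"{alias}" "{xlfd}"'
```

alias_line() produces an entry of the kind that would go into a
fonts.alias file, so an existing font could be tagged with a rule
without touching the font data itself.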

> ] I think it would be useful if someone could enumerate the clustering
> ] rules currently in use for Thai on X, so we can see:
> ] 
> ]  - How many different rules are in use
> ]  - Which ones we need to support
> ]  - How bad the problem is with legacy fonts without identified
> ]    clustering rules.
> 
> I'll be providing the cell-clustering rules for Wtt2.0, and I think
> K. Theppitak and I, along with any other Thai folks over there, should
> be able to work together so that we can at least standardize on one
> set of clustering rules per encoding for Thai
> (XLFD: tis620-3.wtt20).

OK, good.

Regards,
                                        Owen



