Re: Numeric shaping in Pango

On Tue, 2003-09-23, Roozbeh Pournader wrote:

>> "...information from other paragraphs in a document could be used to
>> conclude that the document was fundamentally Arabic, and that EN
>> should generally be converted to AN."

> That is of course "very bad" for stability of text unless the algorithm
> is carefully specified.

>> "Remap the number shapes to match those of another set.
>> For example, remap the Arabic number shapes to have the same appearance
>> as the European numbers." [or vice versa of course]

> Same here. How is one to determine which of the shapes to use? For
> example, consider a document talking about different digits in different
> writing systems. Do you want it to be displayed in different ways on
> different desktops?

> Anyway, those lines are there to be used by higher level "protocols".
> Where are those? Or are you suggesting a new protocol to be defined in
> gtk?

>> That's what should happen in the ideal world. In the real world, AFAIK
>> these characters are absent in most keyboard layouts (except maybe IBM
>> ones), and I don't expect them to be widely used in the absence of a
>> simple way to input them.

> There is some other problem: There are a lot of cases more than one set
> of digits is used in a document. For example, a certain document I am
> now writing in Persian, uses European digits to refer to different
> versions of the Unicode standard or certain pieces of software developed
> outside Iran, while it uses Extended Arabic-Indic digits for everything
> else. In a system with automatic digit shape replacement, I'd be in
> hell.

My intention was to demonstrate that the Unicode Standard leaves freedom
to select digit shapes in general. I agree with you that numeric shaping
should be done carefully, if at all. IMO user preferences at the system or
application level are good candidates for "higher level protocols". I think
a good example is the Windows preferences for numerals found in
"Regional Options -> Numbers -> Digit Substitution".

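As a rough illustration of such a preference-driven "higher level
protocol", here is a minimal Python sketch. The function name
substitute_digits, the preference values "none" and "national", and the
locale keys are made up for this example (loosely modelled on the Windows
setting); none of this is Pango or GTK API. The stored text is never
changed, only the string handed to the renderer.

```python
# Hypothetical display-time digit substitution driven by a user preference.
EUROPEAN = "0123456789"
ARABIC_INDIC = "".join(chr(0x0660 + i) for i in range(10))           # U+0660..U+0669
EXTENDED_ARABIC_INDIC = "".join(chr(0x06F0 + i) for i in range(10))  # U+06F0..U+06F9

# Assumed mapping from a locale key to its "national" digit set.
NATIONAL_TABLES = {
    "ar": str.maketrans(EUROPEAN, ARABIC_INDIC),
    "fa": str.maketrans(EUROPEAN, EXTENDED_ARABIC_INDIC),
}

def substitute_digits(text, preference="none", locale="ar"):
    """Return the text to display; the stored text is left untouched."""
    if preference == "none":
        return text
    if preference == "national":
        # Blindly remaps every European digit, wherever it occurs.
        return text.translate(NATIONAL_TABLES[locale])
    raise ValueError(f"unknown preference: {preference!r}")
```

Note that with "national" a string such as "Unicode 4.0" is converted as
well, which is exactly the mixed-digit problem you describe. That is why
"none" has to remain available.
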
Such preferences give some flexibility in displaying digits, for example
when the author's preferences don't match the end user's for cultural
reasons.

It would be possible to produce different glyphs by combining the
preferences (on the end-user side) with Arabic-Indic digits inserted
directly and non-deprecated control characters (on the author side).
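
To make the combination concrete, here is a sketch of one possible
contextual rule, under my own assumptions: a European digit is shown with
an Arabic-Indic shape only when the nearest preceding strong character is
Arabic (bidi class AL), and digits the author typed as Arabic-Indic are
never touched. The author can steer the result with LEFT-TO-RIGHT MARK
(U+200E), which is a strong L character and is not deprecated. The
function name contextual_digits and the rule itself are illustrative, not
what Pango or Windows actually implement.

```python
import unicodedata

LRM = "\u200e"  # LEFT-TO-RIGHT MARK: strong L, usable by the author as an override

def contextual_digits(text, preference="context"):
    """Hypothetical rule: ASCII digits follow the last strong character."""
    if preference != "context":
        return text
    out = []
    arabic_context = False
    for ch in text:
        bidi = unicodedata.bidirectional(ch)
        if bidi == "AL":            # strong Arabic letter
            arabic_context = True
        elif bidi in ("L", "R"):    # any other strong character resets the context
            arabic_context = False
        if arabic_context and "0" <= ch <= "9":
            ch = chr(0x0660 + ord(ch) - ord("0"))
        out.append(ch)              # author-typed Arabic-Indic digits pass through
    return "".join(out)
```

With this rule, digits after an Arabic word are remapped on the end-user
side, while the same digits preceded by LRM keep their European shape, so
the author retains the final say where it matters.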

