Re: GTK internationalization, right-to-left languages
- From: Owen Taylor <otaylor gtk org>
- To: Nimrod Zimerman <zimerman earthling net>
- Cc: gtk-list redhat com
- Subject: Re: GTK internationalization, right-to-left languages
- Date: 13 May 1998 19:42:40 -0400
Nimrod Zimerman <zimerman@earthling.net> writes:
>
> > > Text widgets should, in general, support *several* fonts, one for each
> > > language. I'm not certain how this can be implemented without requiring
> > > huge storage if many languages are added to gtk, however.
> >
> > Font and language are different things. In theory, using Unicode
> > you could have a single font for all languages. On the other
> > hand, a language like Arabic may be displayed using several
> > 8-bit fonts.
>
> What happens to the current font families? A modest Windows machine probably
> has about 30 font families, not including localized fonts.
> I assume this won't change in the future. In theory, a font can contain
> Unicode as a whole, but in practice, how many fonts like that would we have?
> Probably just a few (because there is no real use for this kind of thing).
The font family (face) is a separate axis from the
selection of the multiple fonts that may be necessary to
display a single string.
X uses the concept of a "Fontset", which is a group of fonts (in
different encodings) that are used to render a string. Usually
these would be related - so a fontset might include:
Helvetica-Roman (iso-8859-1)
Helvetica-Greek (iso-8859-?)
A "gothic" (sans-serif) Japanese font that matches Helvetica visually.
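To make the fontset idea concrete, here is a minimal sketch of how a string could be split into runs, one per member font. It is not how Xlib does it: Python codecs stand in for the encodings the fonts cover, the font names are hypothetical, and the choice of iso-8859-7 for Greek and euc_jp for Japanese is an assumption made only for this example.

```python
# Hypothetical fontset: (font name, Python codec standing in for the
# encoding that font covers). Order matters: earlier entries win.
FONTSET = [
    ("Helvetica-Roman", "iso-8859-1"),
    ("Helvetica-Greek", "iso-8859-7"),  # assumption: Greek part of ISO 8859
    ("Gothic-Japanese", "euc_jp"),      # assumption: any Japanese encoding would do
]

def pick_font(ch):
    """Return the first font in the set whose encoding can represent ch."""
    for name, codec in FONTSET:
        try:
            ch.encode(codec)
            return name
        except UnicodeEncodeError:
            continue
    return None  # no font in the set covers this character

def split_into_runs(text):
    """Group consecutive characters that are drawn with the same font."""
    runs = []
    for ch in text:
        font = pick_font(ch)
        if runs and runs[-1][0] == font:
            runs[-1] = (font, runs[-1][1] + ch)
        else:
            runs.append((font, ch))
    return runs
```

Each run would then be converted to its font's encoding and drawn with that font. The face (Helvetica) stays the same across the runs; only the encoding-specific member font changes.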
> Regarding Arabic - isn't it enough to assume one Unicode font could cover
> all required characters? As you probably know, exceptions are the one thing a
> programmer hates more than a memory corruption bug... (but, handled
> correctly, this probably shouldn't be treated as an exception at all. Huge
> tables covering the whole character are probably common enough).
Unicode is supposed to encode characters, not glyphs.
Let's see if I can explain the difference:
A character represents a linguistically distinct symbol.
A glyph is a visual symbol. A font may include several glyphs
corresponding to the same character(s).
Examples:
- The ligature combining f and l commonly found in Roman fonts
is a distinct glyph, but not a distinct character.
- The Arabic alphabet has 28 distinct characters, but to display
Arabic properly requires (almost) four times as many glyphs,
because of the variants for initial/middle/end/independent forms.
Because Unicode combines existing character sets, it includes
some glyphs as distinct characters (for instance, the
f-l ligature was part of earlier character sets, so it is part of Unicode).
I don't know if Unicode encodes all the necessary glyphs for
Arabic, but in theory, it shouldn't. It should just include
the characters of iso-8859-? (which is character-based
not glyph-based).
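The character/glyph distinction can be checked against the Unicode character database. The sketch below uses Python's standard unicodedata module; it relies on the fact that Unicode carries the f-l ligature (U+FB02) and Arabic positional forms as compatibility characters, and that compatibility normalization (NFKC) maps them back to the underlying characters. The helper name to_characters is made up for this example.

```python
import unicodedata

def to_characters(text):
    """Fold glyph-like compatibility code points back to plain characters."""
    return unicodedata.normalize("NFKC", text)

# The f-l ligature is one code point, but it stands for two characters.
FL_LIGATURE = "\ufb02"

# The Arabic letter BEH and its four positional presentation forms.
BEH = "\u0628"
BEH_FORMS = {
    "isolated": "\ufe8f",
    "final": "\ufe90",
    "initial": "\ufe91",
    "medial": "\ufe92",
}

def all_forms_are_one_character():
    """True if every positional form folds back to the single letter BEH."""
    return all(to_characters(form) == BEH for form in BEH_FORMS.values())
```

Here to_characters(FL_LIGATURE) gives the two-character string "fl", and all four BEH forms fold back to the one character U+0628. Text would normally be stored as the characters, with a renderer choosing the positional glyph.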
> > To the Japanese eye, one of these characters written in the Chinese
> > fashion, even though understandable, is incorrect. So
> > to correctly display a Unicode 6f22, it is not sufficient to
> > just know if it is Unicode 6f22, you also have to know whether
> > it is the Japanese 6f22 or the Chinese 6f22.
>
> Doesn't that somewhat defeat the point of Unicode? Oh, well.
>
> Does it really matter, or can it be ignored? (Differently put - what's the
> chance an angry Japanese would decide to bomb gtk's headquarters after using
> a utility that uses the Chinese version of the letter?).
Well, not likely. But if we didn't allow them to select the Japanese
versions, they might well decide against using GTK+.
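As a rough illustration of why the code point alone is not enough, here is a small sketch in which the font is chosen from the character plus a language tag. The font names and the font_for helper are hypothetical, and the range check only covers the main CJK Unified Ideographs block (U+4E00 to U+9FFF).

```python
# Hypothetical per-language fonts for unified Han characters.
HAN_FONT_BY_LANG = {
    "ja": "Gothic-Japanese",
    "zh": "Hei-Chinese",
}

def is_unified_han(ch):
    """True for characters in the main CJK Unified Ideographs block."""
    return 0x4E00 <= ord(ch) <= 0x9FFF

def font_for(ch, lang):
    """Pick a font using both the character and the language of the text."""
    if is_unified_han(ch):
        # Same code point, different preferred glyph per language.
        # Returns None if no font is registered for this language.
        return HAN_FONT_BY_LANG.get(lang)
    return "Helvetica-Roman"
```

With this, font_for("\u6f22", "ja") and font_for("\u6f22", "zh") return different fonts for the same Unicode 6f22, which is the selection the toolkit has to let the user or application make.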
Regards,
Owen