Re: GTK internationalization, right-to-left languages




Nimrod Zimerman <zimerman@earthling.net> writes:

> On Fri, May 08, 1998 at 01:14:55PM -0400, Owen Taylor wrote:
> 
> > > 1. Fonts rendering.
> > > 
> > > This should be, indeed, rather straightforward. 
> > > I feel that adding support to X itself is a complicated task not quite
> > > worth the time, especially now, when The Open Group is no longer open.
> > 
> > I think this is a dangerous way to think of things. The only
> > way X is going to be saved from the Open Group is if people continue
> > working on it and contribute the results to the XFree86 project.
> 
> Yes, this might be true, but I think that in this case, changing X instead
> of building on it is the wrong approach, due to two reasons.
> First, Open Group's version of X won't support it, which is not good.
> Second, not too many people install X manually (I didn't, for example).
> Obviously, binaries can be supplied, but it would take much longer to
> propagate and be tested.

Whether the Open Group's version supports it is a non-issue really,
IMO. Commercial vendors are, on average, shipping something
around X11R6.0, or maybe even older. The time until they ship
something

Anyways, RTL output methods _should_ be supported by the OpenGroup. At
least, it would just be implementing something that they have written
(rough) specifications for.

Legacy systems are more of an issue. There certainly will
be many systems running neither the OpenGroup's latest nor
XFree86's latest for quite a few years to come. That
is probably a good reason for supporting text rendering
without support from X.
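
To give a rough idea of what rendering without support from X means for
the toolkit, here is a minimal Python sketch of converting a string from
logical (typed) order to visual (on-screen) order. It is deliberately much
simpler than the full Unicode bidirectional algorithm: digits are treated
as left-to-right, neutrals are resolved by looking at their neighbours,
and there is no support for embeddings, mirroring, or Arabic shaping. The
names `char_direction` and `logical_to_visual` are made up for this
illustration and are not part of GTK or X.

```python
import unicodedata


def char_direction(ch):
    """Return 'R', 'L', or None (neutral) for a single character."""
    bidi = unicodedata.bidirectional(ch)
    if bidi in ("R", "AL"):          # Hebrew, Arabic letters
        return "R"
    if bidi in ("L", "EN", "AN"):    # simplification: digits count as LTR
        return "L"
    return None                      # spaces, punctuation, etc.


def logical_to_visual(text, base="L"):
    """Reorder `text` for display, given the paragraph base direction."""
    dirs = [char_direction(ch) for ch in text]

    # Resolve neutrals: they join the surrounding text if both strong
    # neighbours agree, otherwise they take the base direction.
    resolved = []
    for i, d in enumerate(dirs):
        if d is None:
            before = next((x for x in reversed(dirs[:i]) if x), base)
            after = next((x for x in dirs[i + 1:] if x), base)
            d = before if before == after else base
        resolved.append(d)

    # Group consecutive characters with the same direction into runs.
    runs = []
    for ch, d in zip(text, resolved):
        if runs and runs[-1][0] == d:
            runs[-1][1].append(ch)
        else:
            runs.append((d, [ch]))

    # In a right-to-left paragraph the runs themselves are laid out
    # from right to left.
    if base == "R":
        runs.reverse()

    # Characters inside a right-to-left run are drawn in reverse order.
    return "".join(
        "".join(reversed(chars)) if d == "R" else "".join(chars)
        for d, chars in runs
    )
```

The result is a string that can be drawn left to right with an ordinary
font, e.g. `logical_to_visual("abc \u05d0\u05d1\u05d2 def")` keeps the
Latin words in place and reverses only the Hebrew word in the middle.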
 
> >                                    The other alternative is
> > to have GTK do the conversion from the logical ordering to the
> > on-screen ordering, then pass the string on to X for the 
> > final rendering.
> 
> This appears like the right thing to do, assuming fonts are available.
> One thing that I've been trying to 'discover' for some time is how may I
> create X fonts. Do you have any pointers on the issue?

Well, Emacs-20 includes fonts for both Hebrew and Arabic. (The
Arabic one is in a mule-specific encoding - I don't think there
is an accepted encoding for the glyphs (as opposed to the characters)
of Arabic)

Also:
http://www.cs.ruu.nl/wais/html/na-dir/internationalization/font-faq.html

If you are really interested in creating your own font from scratch...
I think there is probably an editor for bitmap X fonts around,
though I don't have a reference.
 
> 
> >   ftp://ds.internic.net/rfc/rfc2070.txt
> 
> As far as I could tell when I read that (a few weeks ago), it doesn't say
> anything too concrete.
 
True. It is pretty abstract. But I found section 4.2.4 somewhat
instructive concerning the how 

> > The ability to enter multi-lingual text is a different matter.  From
> > my experience, most localized input methods allow a small amount of
> > multi-lingualism. Text can be entered in the localized script or in
> > English/ASCII.
> 
> In the multi-lingual version of Windows 95, a user can type in any of the
> installed languages, intermixed. I don't know how useful it is, but assuming
> various languages are supported anyhow, and in a manner that should be
> independent of the current interface language (see below), it might be
> implemented. 
> 
> In Windows 95, choice of language is done in a selection box at the bottom
> of the screen, which is an unacceptable solution for gtk. Possibly, a
> right-click on any text widget should by default open a widget
> of some sort that allows choosing a language. 
> In addition, there should be keyboard shortcuts used to switch the current
> language and English (in Hebrew Windows 3.11, this is done using 
> Right-Alt-Shift for Hebrew, and Left-Alt-Shift for English, but in
> Windows 95 they've ruined this completely by making both of these keystrokes
> *switch*, instead of *choose* a language. This might look okay in general,
> but it forces the user to keep track of the current language while typing
> quickly, which is a problem. Often users find themselves typing in the wrong
> language).
> 
> > My current thought is to stick to this degree of
> > multilingualism for Entries, but to allow, in the Text widget,
> > language as an extra attribute, as Color and Font are now.
> 
> This is less important, as I see it.
> Text widgets should, in general, support *several* fonts, one for each
> language. I'm not certain how this can be implemented without requiring
> huge storage if many languages are added to gtk, however.

Font and language are different things. In theory, using Unicode
you could have a single font for all languages. On the other
hand, a language like Arabic may be displayed using several
8-bit fonts. So I think the most important thing that is
being toggled is the input method - how the keystrokes are
interpreted. But there are also some issues about Unicode
characters that need to be displayed differently depending
on language (see below).
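
As a sketch of what "language as an extra attribute" could look like,
here is a small Python model of a text buffer made of attributed runs.
Nothing here is actual GTK API: the `Run` class, the `lang_at` helper,
and the font name are all hypothetical. The point is only that font and
language are stored separately, and that the language at the cursor is
what would select the input method.

```python
from dataclasses import dataclass


@dataclass
class Run:
    text: str
    font: str   # hypothetical font name; independent of language
    lang: str   # language tag, e.g. "en" or "he"


def lang_at(runs, offset):
    """Return the language of the run containing character `offset`.

    An offset at or past the end of the buffer inherits the language of
    the last run, so typing at the end continues in that language.
    """
    pos = 0
    for run in runs:
        if pos <= offset < pos + len(run.text):
            return run.lang
        pos += len(run.text)
    return runs[-1].lang if runs else None


# Two runs sharing one (hypothetical) Unicode font but differing in language.
buffer_runs = [
    Run("shalom ", "some-unicode-font", "en"),
    Run("\u05e9\u05dc\u05d5\u05dd", "some-unicode-font", "he"),
]
```

With this layout, `lang_at(buffer_runs, cursor)` tells the widget which
input method to use for the next keystroke, whatever font happens to be
in effect.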
 
> > This does mean that where different languages have different glyph
> > variants for the same Unicode character, the display of GTK widgets
> > without specific language markup may be incorrect.  For instance,
> > Chinese Labels in a program being viewed by someone with LANG=ja_JP,
> > will be slightly misdisplayed. But this doesn't seem to be a major
> > issue.
> 
> I didn't understand that paragraph.

Understandable. To be a bit more clear, during the creation
of Unicode, a process called "Han Unification" was done, where
identical characters from Japanese, Chinese, and Korean
were assigned to the same codepoint. Now, in the process
a number of characters that have the same origin, but are
written slightly differently in Japanese and Chinese, were
combined.

To the Japanese eye, one of these characters written in the Chinese
fashion, even though understandable, is incorrect. So
to correctly display a Unicode 6f22, it is not sufficient to
know that it is Unicode 6f22; you also have to know whether
it is the Japanese 6f22 or the Chinese 6f22.

(To make an analogy, the Japanese feel that using the wrong one
is a bit like if somebody decided that all w's should just
be written as lowercase omegas, because they look pretty much
the same)
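
To make the 6f22 case concrete, here is a minimal Python sketch. The
codepoint carries no language information, so the renderer has to be
told the language separately and pick a font accordingly. The font
names and the `pick_font` helper are invented for the example; only the
`unicodedata` call is a real library function.

```python
import unicodedata

HAN = "\u6f22"  # the same codepoint whether the text is Japanese or Chinese

# Unicode only knows this as a unified ideograph; no language is attached.
UNIFIED_NAME = unicodedata.name(HAN)

# Hypothetical font names, one per language-specific glyph style.
FONT_BY_LANG = {
    "ja": "japanese-style-font",
    "zh": "chinese-style-font",
    "ko": "korean-style-font",
}


def pick_font(lang_tag, default="zh"):
    """Choose a font from a locale-style tag such as 'ja_JP' or 'zh-TW'.

    Falls back to `default` for languages without a CJK-specific font.
    """
    primary = lang_tag.replace("-", "_").split("_")[0].lower()
    return FONT_BY_LANG.get(primary, FONT_BY_LANG[default])
```

A label with no language markup would fall back to whatever the user's
LANG implies, which is exactly the slight misdisplay described in the
quoted paragraph.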

Regards,
                                        Owen


