Re: Comprehensive East-Asian support

Steve Underwood <> writes:

> Hi all,
> Last summer I exchanged several private e-mails with Owen Taylor about
> gscript. There was nothing private about the discussion - I just failed
> to join this list, despite sending several subscribe requests. At that
> time Owen's goals for gscript were somewhat narrow - basically just to
> display text within the GTK display environment. I said that I thought
> it important for gscript to provide support for vertical and
> right-to-left text. Owen didn't see that as important for dialogs, and
> similar GTK related things, which may well have been the correct
> attitude in the context of July 1999's gscript. January 2000's Pango is
> intended as a system for the comprehensive handling of international
> high quality text output, so it's time for me to restate my point of
> view.

Actually, at that point, I was quite willing to consider eventually
dealing with vertical layout, but didn't consider it a top priority.

That is still the case. For the work I'm doing, GTK+ is the first
priority to get working for Pango-1.0 ...  simply because that's the
project most important to me. If other people want to help out in
getting Pango ready for DTP-type applications sooner, that will be
appreciated.

But it should be noted that Pango is _not_ a desktop-publishing system;
it is a system for handling the hard part of i18n layout. That means it
deals primarily with 1-D layout, the arrangement of text within a single
line. Whether those lines are horizontally or vertically oriented has
some effect on Pango - it may need to make different decisions based on
that. But the high-level arrangement of the lines is a matter for
drivers that sit on top of Pango.

> A system which can only handle the text direction features defined in
> the Unicode support tables, bi-directional algorithm, etc. is too
> limited for East Asian languages. Good Chinese and Japanese (I can't
> speak for Korean, as I don't see that much of it) support demands
> support for top-to-bottom, right-to-left text. Chinese also demands
> options for left-to-right and right-to-left (I'm not sure if Japanese is
> ever written right-to-left, now or in the past). The Unicode tables
> assume all East Asian languages are only ever written left-to-right.

This is just inaccurate, and I'm not sure that you actually understand
the purposes of the directional properties within Unicode. They are
meant to enable the automatic handling of mixed right-to-left and
left-to-right languages. There would be no point in putting properties
on a character saying:

 "this character can be written left-to-right, right-to-left or top-to-bottom"

What advantages could such a marking give?

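As a rough illustration of what the directional properties do encode,
here is a minimal Python sketch. It is not part of Pango or gscript; the
helper name bidi_classes and the sample characters are chosen purely for
illustration. It uses the standard-library unicodedata module to look up
the bidirectional class that the Unicode character database assigns to
each character.

```python
import unicodedata


def bidi_classes(text):
    """Return the Unicode bidirectional class of each character in text.

    Hypothetical helper for illustration only; it just wraps
    unicodedata.bidirectional().
    """
    return [unicodedata.bidirectional(ch) for ch in text]


if __name__ == "__main__":
    samples = [
        ("Latin capital A", "A"),
        ("Hebrew alef", "\u05d0"),
        ("Arabic alef", "\u0627"),
        ("Han ideograph (U+4E2D)", "\u4e2d"),
        ("Hiragana a", "\u3042"),
    ]
    for label, ch in samples:
        # "L" = strong left-to-right, "R" = strong right-to-left,
        # "AL" = right-to-left Arabic letter.
        print(f"{label}: {bidi_classes(ch)[0]}")
```

The Han and kana characters report the same strong left-to-right class
as Latin letters. The property exists so that the bidirectional
algorithm can order runs in mixed text; it carries no information about
whether a script may also be set vertically.
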
> As well as the need to properly present blocks of text, the behaviour of
> international systems when labelling the y-axis of a graph, and similar
> rotated text situations is important. Rendering according to the Unicode
> rules, and rotating the outcome gives the wrong result for East Asian
> languages. Kanji/Hanzi are _never_ tilted beyond about 45 degrees when
> labelling anything. Turning them on their side, along with some numbers
> or English words they may be mixed with would be totally wrong.
>
> Even for left-to-right Chinese and Japanese, the Unicode material, and
> the current gscript/pango, fail to implement things Asian people would
> consider important in high quality output. For example, if a short burst
> of Hanzi is dropped into a page of English it would be rendered just as
> gscript does now. If a short burst of English (say a company name) were
> dropped into a page of Chinese, space would be placed around the English
> so the Chinese characters all sit on a mono-spaced grid. Pango doesn't
> seem to provide for that, and it leads to very odd looking results.

It should be noted that frequently in Japanese usage this is not done,
though I believe Chinese usage may be different. If there are portions
within a line that are in Latin script, this simply disrupts the character
grid. Again, there is a wide variety of practice here that cannot be
solved in an automated way.

> If facilities for rotated text for labelling purposes, and alternative
> directions for text blocks are not put into the Pango API I think they
> are going to get hacked in later by some folk in Asia - possibly with
> me amongst them. It seems better to include the proper provision for
> comprehensive text handling from day one. The code that needs to lie
> behind the API could be missing from version 1 of Pango, but I strongly
> feel the API needs to make provision for these things, or an
> incompatible update will occur later.
>
> If folk are interested in providing this capability, I will be happy to
> document (which basically means gather illustrative examples) the
> detailed behaviour East Asian people have come to expect of mixed
> language handling in vertical and rotated text. I might also have some
> time to help in the implementation.
>
> Last summer Owen seemed convinced that people don't use vertical or
> right-to-left text any more. That puzzled me, since I know he can read
> Japanese. If people aren't convinced, I can post a few small scanned
> images of Hong Kong and Japanese newspapers and magazines to prove that
> vertical is still the order of the day in Asia. I can pick almost any
> newspaper and any page, and show vertically written text.

Just to confirm that I wasn't forgetting something, I looked back
through the emails we exchanged on the subject. And I certainly never
indicated that I believed that vertical writing was uncommon. I'm
quite familiar with the usage of vertical writing and the various
ways that vertical and horizontal writing are mixed.

(If you paraphrase somebody's comments, there is a certain obligation
to do it accurately.)

I perhaps was not familiar at that time with the extent to which
right-to-left writing is used in Hong Kong and Taiwan, but
in any case, I think the basic attitude that I expressed at that
point is accurate - there are a lot of different ways of handling
directionality with CJK text, and I don't think Pango can do
more than enable higher-level DTP programs to handle these cases.

