Re: [gtk-i18n-list] Unicode PUA supporting issue in gtk+/pango

On Tuesday 20 December 2005 12:21, mpsuzuki hiroshima-u ac jp wrote:
> On Tue, 20 Dec 2005 11:39:09 +0800
> "Arne Götje (高盛華)" <arne linux org tw> wrote:
> >And it will still be a long time until CJK Ext. C comes out. And
> >even after that the PUA and Plane 15/16 areas will still be used for
> >temporary storage of characters which are not yet or will never be in
> >Unicode...
> Please let me know more about the characters that
> "will never be in Unicode".

For example: Taiwanese POJ uses some Latin characters with (stacked)
diacritics which don't have a precomposed form in Unicode. The Unicode
folks stated that they won't include any additional precomposed Latin
combinations anymore. Those cases should use the GPOS features (mark and
mkmk) to put the diacritics in the correct position. There is nothing
wrong with that point of view. However, one important diacritic
(U+0358) has only recently been included, in Unicode 4.1. As many
programs and rendering engines still cannot handle that feature and
codepoint correctly, I have put precomposed Latin combinations in Plane
15. Further, I plan to use it for Han characters which are used in
the Taiwanese and Hakka languages and which are not even planned for CJK
Ext. C.
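
To illustrate the point about missing precomposed forms, here is a minimal
Python sketch. It assumes a Python 3 interpreter whose unicodedata tables
already know U+0358 (anything based on Unicode 4.1 or later); the helper
name composes_to_single_codepoint is made up for this example. It only
checks whether NFC normalization can fold a base letter plus combining mark
into one codepoint; it says nothing about how a renderer positions the mark.

```python
import unicodedata


def composes_to_single_codepoint(text: str) -> bool:
    """Return True if NFC normalization folds the sequence into one codepoint.

    Hypothetical helper for illustration only.
    """
    return len(unicodedata.normalize("NFC", text)) == 1


# o + COMBINING ACUTE ACCENT: a precomposed form (U+00F3) exists.
o_acute = "o\u0301"

# o + COMBINING DOT ABOVE RIGHT (U+0358), as used in POJ:
# no precomposed form exists, so the sequence stays decomposed.
o_dot_right = "o\u0358"

for label, seq in (("o + U+0301", o_acute), ("o + U+0358", o_dot_right)):
    normalized = unicodedata.normalize("NFC", seq)
    codepoints = " ".join(f"U+{ord(c):04X}" for c in normalized)
    print(label, "->", codepoints, composes_to_single_codepoint(seq))
```

Sequences of the second kind are the ones that depend on the font's mark and
mkmk positioning, or on a precomposed glyph at a private codepoint.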

> >>Is it the role of iconv?
> >
> >No. It would be a matter for the fonts to supply alias codepoints,
> > so that both old documents and new documents can be displayed.
> > However, if someone wants to convert Unicode documents from PUA
> > codepoints to new official codepoints, there should be a script
> > provided to do that manually (for example a plug-in in
> >
> I think you say "if gtk+/pango passes PUA charcode transparently
> to the font layer, it's enough, no code-conversion issue occurs."
> Is my understanding right? If so, how do we discriminate between
> the fonts which provide the expected Hanzi glyph for a PUA codepoint
> and the fonts which use PUA for other purposes?
> If there were an OpenType feature tag to declare that PUA codepoints are
> filled by HKSCS or CNS11643, we could discriminate - but there's
> no such thing at present...

There is no need to discriminate. Users who expect special characters in
PUA areas have to use special fonts which provide the correct glyphs.
And usually the users know which fonts provide the expected glyphs.
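
As a rough sketch of the conversion script mentioned earlier in the thread,
the following Python shows how text could be scanned for private-use
codepoints and remapped with a caller-supplied table. The three ranges are
the private-use areas defined by Unicode (BMP PUA, Plane 15, Plane 16). The
function names and the single table entry are hypothetical; a real table
would have to come from the font or encoding vendor, since the codepoints
themselves carry no meaning.

```python
# Private-use ranges defined by the Unicode Standard.
PUA_RANGES = (
    (0xE000, 0xF8FF),      # BMP Private Use Area
    (0xF0000, 0xFFFFD),    # Plane 15 (Supplementary Private Use Area-A)
    (0x100000, 0x10FFFD),  # Plane 16 (Supplementary Private Use Area-B)
)


def is_private_use(ch: str) -> bool:
    """Return True if the single character ch is a private-use codepoint."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in PUA_RANGES)


def find_private_use(text: str) -> list[tuple[int, str]]:
    """List (index, 'U+XXXX') for every private-use character in text."""
    return [(i, f"U+{ord(c):04X}") for i, c in enumerate(text) if is_private_use(c)]


def remap_pua(text: str, table: dict[int, str]) -> str:
    """Replace private-use codepoints found in table; leave all others as-is."""
    return text.translate(table)


# HYPOTHETICAL mapping, for illustration only: pretend a font placed a
# precomposed "o with dot above right" at U+F0000.
EXAMPLE_TABLE = {0xF0000: "o\u0358"}

sample = "a\U000F0000b\uE000"
print(find_private_use(sample))
print(remap_pua(sample, EXAMPLE_TABLE) == "ao\u0358b\uE000")
```

Codepoints missing from the table pass through untouched, which matches the
idea that the rendering engine never has to interpret them.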

For example: my CJK-Unifonts provide full HKSCS-2004 coverage, at both the
official Unicode codepoints according to Unicode 4.1 and the previously
used PUA codepoints. But this is a feature of the font and has nothing
to do with the rendering engine. A PUA codepoint stays a PUA codepoint. No 

> BTW, I don't have CJK unifont including Plane 15 & 16.
> How do I obtain that?


Arne Götje (高盛華) <arne linux org tw>
PGP/GnuPG key: 1024D/685D1E8C
Fingerprint: 2056 F6B7 DEA8 B478 311F  1C34 6E9F D06E 685D 1E8C
Key available at   Encrypted e-mail preferred.

