Re: [gtk-i18n-list] Unicode PUA supporting issue in gtk+/pango



> >This is another big problem, not related to fonts.  You can imagine that
> >after a system switches to Unicode 4.1, you will hear complaints such as
> >"I can see the characters (in a document or in a filename), but when I
> >search, they just don't match!"
> 
> >IMHO, we can have a script to convert all filenames at post-install
> >time on package upgrade.  As for documents, users should convert them
> >manually if they need searchability.
> 
> I think it's still a font-related issue. The use
> of PUA codepoints is induced by the font. Which PUA
> codepoints are "displayable" is determined by the font (or is it
> determined by some national standards? if so, please let me know),
No, not by the font.  There are indeed standards:
http://www.info.gov.hk/digital21/eng/hkscs/mapping_table.html

Some characters in HKSCS are mapped to the PUA.  The `P' here means HK.
Some characters in (Taiwan-)BIG5 are also mapped to the PUA.  The `P' here
means Taiwan.  So the same PUA codepoint may be assigned different glyphs
in different regions.

Therefore, even if HKSCS contains all characters defined by
(Taiwan-)BIG5, a font to be used in HK can _not_ be used in Taiwan.

(Well, it _can_ actually, but not under pango.)
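
To find out which filenames or documents are affected at all, one could
scan for PUA codepoints first.  A minimal sketch, assuming Python and its
standard unicodedata module (the function names are made up for
illustration; this only detects PUA characters and says nothing about
which region's standard assigned them):

```python
import unicodedata

def is_private_use(ch):
    # General category "Co" covers U+E000..U+F8FF and the
    # private-use ranges in planes 15 and 16.
    return unicodedata.category(ch) == "Co"

def find_pua(text):
    # Return (index, codepoint) pairs for every PUA character in text.
    return [(i, "U+%04X" % ord(c))
            for i, c in enumerate(text) if is_private_use(c)]
```

A name for which find_pua() returns an empty list does not need
conversion.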

> and, as I concluded in P3, "undisplayable" codepoints are out
> of scope; such codepoints should not be touched.
> Therefore, I think the code conversion script would have to be
> maintained and updated in sync with the referenced font.
> That's a bad idea (unless we can generate the script from the
> font file automatically).
The conversion is done by:

HKSCS in Unicode pre 4.1 --> HKSCS in BIG5 --> HKSCS in Unicode 4.1,

using the mapping table from the above URL.
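
The two-step chain above can be collapsed into a single lookup table.  A
rough sketch in Python, assuming the two mappings have already been
parsed from the table into dicts of integers (the function names and the
table values below are placeholders for illustration, not real HKSCS
entries):

```python
def build_pua_migration(old_to_big5, big5_to_new):
    # Compose "old Unicode -> BIG5" with "BIG5 -> Unicode 4.1".
    # Codepoints with no BIG5 counterpart in the second table are
    # left out, so they stay untouched during conversion.
    return {old_cp: big5_to_new[big5]
            for old_cp, big5 in old_to_big5.items()
            if big5 in big5_to_new}

def convert_text(text, migration):
    # str.translate accepts a dict keyed by codepoint ordinals;
    # characters not in the dict pass through unchanged.
    return text.translate(migration)

# Placeholder tables, NOT taken from the real mapping table.
OLD_TO_BIG5 = {0xE000: 0x0001}
BIG5_TO_NEW = {0x0001: 0x4E00}
MIGRATION = build_pua_migration(OLD_TO_BIG5, BIG5_TO_NEW)
```

With the placeholder tables, convert_text("a\ue000b", MIGRATION) replaces
only the PUA character and leaves the rest of the string alone.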

-- 
Regards,
olv


