Re: On CJK font selection (was Re: [Fwd: Re: Request for review and advice on wqy-bitmap-fonts fontconfig settings])



On Sunday 16 December 2007 at 18:22 -0500, Behdad Esfahbod wrote:
> On Thu, 2007-12-13 at 12:13 -0500, Qianqian Fang wrote:

> > Secondly, you said that "contextual font selection" is a "cool"
> > feature. I am wondering which languages benefit from this feature?
> > (I believe there are some, but I just want to know.)
> 
> Pretty much every non-Latin script.  In some situations even the Latin
> script.
> 
> Take the Unicode character U+002E FULL STOP, aka ASCII period.  It is
> used in more than just Latin, in Arabic for example, in Hebrew, possibly
> in Indic and many other scripts.  If it were not grouped with neighboring
> characters for font selection purposes, all those people would get
> their Arabic/Hebrew/... text assigned an Arabic/Hebrew/... font while
> the periods at the end of sentences were assigned a different (default
> Latin, for example) font.
> 
> The same happens for Latin under a document tagged as non-Latin.  It's
> not a luxury thing.  It's just how things are supposed to work.
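
The grouping Behdad describes can be sketched roughly as follows. This
is only an illustration, not how Pango or fontconfig actually do it: the
rough_script() table is a hypothetical, heavily simplified stand-in for
the Unicode Script property, and the rule "a COMMON character joins the
run before it, or the run after it if nothing precedes it" is an
assumption made for the example.

```python
def rough_script(ch):
    """Hypothetical, simplified script lookup (real code would use Scripts.txt)."""
    cp = ord(ch)
    if 0x0590 <= cp <= 0x05FF:
        return "Hebrew"
    if 0x0600 <= cp <= 0x06FF:
        return "Arabic"
    if 0x4E00 <= cp <= 0x9FFF:
        return "Han"
    if cp < 0x0250 and ch.isalpha():
        return "Latin"
    return "Common"  # punctuation, digits, spaces, anything not listed above


def script_runs(text):
    """Split text into (script, substring) runs for font selection.

    Common characters do not start a run of their own: they are merged
    into the neighboring run and take that run's script.
    """
    runs = []
    for ch in text:
        script = rough_script(ch)
        if runs and (script == "Common"
                     or runs[-1][0] == "Common"
                     or runs[-1][0] == script):
            prev_script, prev_text = runs[-1]
            # A leading Common run adopts the first real script that follows it.
            resolved = script if prev_script == "Common" else prev_script
            runs[-1] = (resolved, prev_text + ch)
        else:
            runs.append((script, ch))
    return runs
```

With this rule the full stop after a Hebrew word stays in the Hebrew
run, so it is shaped with the same font as the word before it instead of
falling back to a default Latin font.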

To be honest, this was mostly solved on the Latin side by creating
pan-European LGC (Latin, Greek, Cyrillic) fonts, which avoid triggering
substitutions altogether.

Creating coherent pan-Unicode fonts would solve it for other locales, but
that's a huge piece of work, and some bits, like OpenType BASE support,
are not there yet on the FLOSS side.

> > As I said in the previous email, this creates more trouble than
> > benefit for CJK languages. In particular, it ruins text alignment in
> > monospace environments (see attachment). I doubt anyone who sees it
> > would say "cool"; rather, they would feel annoyed.
> 
> That's not true.  If you have Chinese text and Latin text in the same
> line, and your Latin and Chinese monospace fonts have different widths,
> you are screwed no matter what.

That means that, for monospace, separate fonts with different metrics
are a dead end, right? :p

I wonder if something semi-monospaced, like using twice the base width
for complex scripts, would be worth it, or would just break apps horribly.
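
To make the semi-monospaced idea concrete, here is a minimal sketch of
counting columns when wide characters take two cells. It assumes the
Unicode East Asian Width property is an acceptable test for "takes
twice the base width", which only covers the CJK case and not complex
scripts in general; cell_width() is a name made up for this example.

```python
import unicodedata


def cell_width(text):
    """Number of monospace cells the text would occupy.

    Assumption for this sketch: East Asian Wide ("W") and Fullwidth ("F")
    characters take two cells, combining marks take none, and everything
    else takes one.
    """
    cells = 0
    for ch in text:
        if unicodedata.combining(ch):
            continue  # combining marks stack on the previous character
        cells += 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
    return cells
```

Alignment then only holds if the CJK font's advance width really is
exactly twice that of the Latin font, which is the constraint under
discussion.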

> > In addition, you seem to underestimate the difficulty of ripping out
> > part of a CJK font. This is not possible for commercial fonts. Even if
> > it is doable for open fonts (very few choices, though), the
> > incompatibility of the resulting fonts would make them totally
> > unusable on most platforms.
> 
> I've put three different ways in front of you.

Easy one: removing Latin from the FLOSS font. But that wouldn't help
with the proprietary fonts people use in the wild.

Complete one: enhancing fontconfig to blacklist parts of fonts.

I don't see much point in the TTC solution, except as a workaround
for the lack of OpenType BASE support.
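
For what the blacklisting option would amount to, here is a toy sketch.
It is not fontconfig's API or configuration syntax; the font names, the
coverage sets and the pick_font() helper are all hypothetical. The point
is only that masking a range in one font's coverage lets the next font
in the preference list win for those characters.

```python
# Hypothetical preference-ordered fonts with the code points they cover.
FONTS = [
    ("HypotheticalCJK", set(range(0x20, 0x7F)) | set(range(0x4E00, 0xA000))),
    ("HypotheticalLatin", set(range(0x20, 0x250))),
]

# Hypothetical blacklist: ignore the CJK font's Latin glyphs.
BLACKLIST = {"HypotheticalCJK": [(0x0020, 0x024F)]}


def pick_font(ch, fonts=FONTS, blacklist=BLACKLIST):
    """Return the first font that covers ch outside its blacklisted ranges."""
    cp = ord(ch)
    for name, coverage in fonts:
        masked = any(lo <= cp <= hi for lo, hi in blacklist.get(name, ()))
        if cp in coverage and not masked:
            return name
    return None  # no font covers this character
```

Without the blacklist entry, "a" would be taken from the CJK font simply
because that font comes first in the list; with it, Latin text goes to
the Latin font while Han characters still come from the CJK font.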

> The fontconfig one is
> not hard at all for anyone willing to put their fingers where their
> mouth is.  You on the other hand, seem to ignore the impossibility (not
> difficulty) of what you are asking for.
> 
> > I want to add that on Windows, CJK users have never had such a problem.
> > All known CJK fonts have their own Latin glyphs (some are crappy), but
> > text rendering is "normal" (nothing like in the attachment). How does
> > Windows structure the style propagation for COMMON characters?
> 
> Windows does no font fallback.

Windows, however, has an input chooser that explicitly specifies the
language in use, rather than just a keyboard layout switcher, and I
suspect some Windows apps use it to select the right font.

Unfortunately it seems Sergey Udaltsov was discouraged by lack of
positive feedback and stopped pushing something like
http://fedoraproject.org/wiki/SIGs/Fonts/Dev/LanguageAwarenessProblem

Qianqian: you need to realise that the low-hanging fruit was harvested
long ago. There is no easy solution left that has not already been
rejected for one reason or another. That's why you're hitting a wall
(and exasperating Behdad). The bits needed to support CJK and complex
scripts well are well known, but they're non-trivial, so they need
some concerted effort by the affected communities.

Regards,

-- 
Nicolas Mailhot
