Re: Font lookup ranges [was Re: Notes on Pango Xft backend]



--- Owen Taylor <otaylor redhat com> wrote:
> Keith Packard <keithp keithp com> writes:
> 
> > Around 14 o'clock on May 28, Owen Taylor wrote:
> > 
> > > The one place where a fix is urgently needed is for Japanese
> > > vs. Traditional Chinese vs. Simplified Chinese; things look pretty
> > > awful there at the moment; but what I want to do first in that area
> > > is make font selection locale/language-tag sensitive.
> > 
> > That means tagging sections of text with the language and exposing
> > the font language tags in the list of fonts I hand back.
> > Alternatively, you can create separate lists, one for each language
> > tag found for each logical font.
> 
> Pango already has the language tagging mechanism; the question is
> how to use this to influence character lookup.
> 
>  a) Call FcFontSetSort() once, get a list, and then when finding
>     a (language-tag, codepoint) pair, look first for a font with
>     the language tag and the codepoint, then if that fails,
>     look for a font without the language tag with the codepoint.
>
>     Problems:
>
>      - Takes two passes to look up a character with a missing
>        language tag.
>      - I don't think we should ever fall back to a font that wasn't
>        explicitly specified (*), just for want of a language
>        tag.
> 
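A rough sketch of the two-pass lookup in (a), to make the cost
concrete.  This is only an illustration in Python, not fontconfig
code: the Font class, its fields and the font names are all made up,
and sorted_fonts stands in for whatever list FcFontSetSort() hands
back.

```python
from dataclasses import dataclass, field


@dataclass
class Font:
    # Hypothetical stand-in for one entry of the sorted font list.
    family: str
    languages: set = field(default_factory=set)  # plays the role of FC_LANG
    coverage: set = field(default_factory=set)   # codepoints the font covers


def find_font(sorted_fonts, lang, codepoint):
    """Two-pass lookup over an already sorted font list."""
    # Pass 1: the font must carry the language tag and cover the codepoint.
    for font in sorted_fonts:
        if lang in font.languages and codepoint in font.coverage:
            return font
    # Pass 2: ignore the language tag, take the first font with the codepoint.
    for font in sorted_fonts:
        if codepoint in font.coverage:
            return font
    return None


fonts = [
    Font("Hypothetical Mincho", {"ja"}, {0x76F4}),
    Font("Hypothetical Song", {"zh-cn"}, {0x76F4}),
    Font("Hypothetical Latin", set(), {0x41}),
]
```

With this list, a zh-cn lookup of U+76F4 skips the Japanese font in
the first pass, while a ko lookup finds nothing in the first pass and
falls through to the first font that has the glyph at all.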
>  b) Call FcFontSetSort() separately for each language, and somehow
>     influence the sort order; what we'd like to do is make including
>     the specified language tag have a weight:
>
>      Less than whether the family is in the pattern
>      Greater than the ordering of family names listed in the pattern
>
>     Problems:
>
>      - I don't think we should ever fall back to a font that wasn't
>        explicitly specified (*), just for want of a language
>        tag.
> 
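The weighting in (b) can be pictured as a three-part sort key.  A
minimal sketch in Python, assuming fonts are plain (family, languages)
pairs and that pattern_families is the ordered family list from the
pattern; none of this is how fontconfig actually scores matches.

```python
def sort_for_language(fonts, pattern_families, lang):
    """Order fonts for one language tag.

    Key, most significant part first:
      1. is the family named in the pattern at all?
      2. does the font carry the requested language tag?
      3. position of the family within the pattern.
    """
    def key(font):
        family, languages = font
        in_pattern = family in pattern_families
        if in_pattern:
            position = pattern_families.index(family)
        else:
            position = len(pattern_families)
        # False sorts before True, so "in pattern" and "has tag" come first.
        return (not in_pattern, lang not in languages, position)

    return sorted(fonts, key=key)


# Hypothetical fonts and pattern, for illustration only.
fonts = [
    ("C Song", {"zh-cn"}),
    ("A Sans", set()),
    ("B Gothic", {"ja"}),
]
pattern = ["A Sans", "B Gothic"]
```

For "ja" this moves B Gothic ahead of A Sans, but for "zh-cn" the
font C Song still sorts last because its family is not in the pattern.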
> Problems with both:
>
>  - Type1 fonts don't have OS/2 tables, and thus don't have FC_LANG
>    entries; I think some TrueType fonts might miss them as well.
>
>  - The set of languages in the OS/2 table / FC_LANG is pitifully

Can't you use coverage to determine this?
Aren't there tables on unicode.org to show which
languages are supported by which scripts?  You can use
coverage to tell which scripts are supported and
from there, tell which languages are supported.
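
Something along these lines is what I have in mind, sketched in
Python.  The block ranges are the real Unicode block boundaries, but
the script-to-language table is a tiny made-up sample and the 80%
threshold is an arbitrary assumption, so treat it as an illustration
only.

```python
# Unicode block ranges used as a rough proxy for script coverage.
SCRIPT_BLOCKS = {
    "Hiragana": range(0x3040, 0x30A0),
    "Hangul": range(0xAC00, 0xD7B0),
    "Cyrillic": range(0x0400, 0x0500),
}

# Illustrative sample only; a real table would be far larger.
SCRIPT_LANGUAGES = {
    "Hiragana": {"ja"},
    "Hangul": {"ko"},
    "Cyrillic": {"ru", "uk", "bg"},
}


def supported_scripts(coverage, threshold=0.8):
    """Scripts whose block is covered at least `threshold` by the font."""
    scripts = set()
    for script, block in SCRIPT_BLOCKS.items():
        covered = sum(1 for cp in block if cp in coverage)
        if covered / len(block) >= threshold:
            scripts.add(script)
    return scripts


def supported_languages(coverage):
    """Languages implied by the scripts a font covers."""
    languages = set()
    for script in supported_scripts(coverage):
        languages |= SCRIPT_LANGUAGES[script]
    return languages
```

Han coverage alone would not separate Chinese from Japanese, since
both use the same script, so the sketch keys Japanese on kana and
Korean on hangul instead.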

>    small. On the other hand, all we really care about
>    is Simplified Chinese/Traditional Chinese/Japanese/Korean
>    so it may well be good enough.

For now yes.  Romanian uses a "comma below" on some letters,
which Unicode has mapped onto a cedilla.  This has also
been left as a font issue.  It's not as famous as the
Chinese vs. Japanese case yet.

I've also read about other cases but can't recall the
details.  Anyway the point is that it will be a more
general problem in the future.

Andrew Dunbar.

>    [
>    http://mail.gnome.org/archives/gtk-i18n-list/2001-June/msg00001.html
>    describes the form language tags take in Pango; some of the
>    followup discussion is interesting;
>
>    http://mail.gnome.org/archives/gtk-i18n-list/2001-June/msg00010.html
>    clarifies what Pango language tags are for in response to
>    comments from Peter Constable.
>    ]
> 
> You seem to imply a third possibility with:
>
> > That might be best as I expect font pattern editing to be
> > used to select preferred faces for each language tag.
>
>   c) Pango adds the language tag to the pattern it feeds to
>      FcConfigSubstitute, and fonts.conf does pattern matching magic
>      to provide a different "Sans-serif" alias for every language.
> 
> Can't say I like this too much:
>
>  - Requires lots of careful configuration (more than just
>    slapping extra fonts into "Sans-serif".) Configuration is bad.
>
>  - Would mean that the default config files wouldn't
>    use the simplified <alias> element, sort of ruining the point
>    of <alias>.
>      
> Still, it has the decided advantage that it frees the mechanism
> from relying on information in the font.
> 
> Regards,
>                                         Owen
> 
> 
> (*) Explicitly specified is a tricky concept:
>
>     Say, we have two things in our fonts.conf
>
>     a) Alias Sans to "Arial"
>     b) If no generic alias is found in the pattern, tack on
>        "Sans-serif"
>
>     User specifies  Pattern becomes
>     ==============  ====================
>     Verdana         Verdana(a), Arial(b)
>     Sans-serif      Arial(c)
>
>     (a) and (c) are explicitly specified, (b) isn't.
>
>     Basically, we want family names that were explicitly given
>     by the user, or family names where the expansion only involved
>     <prefer> elements.

=====
http://linguaphile.sourceforge.net http://www.abisource.com
