Re: Font selection algorithm?



On 01/30/2010 12:36 AM, Robert Siemer wrote:

> Assume that the application gives no hints: What is the algorithm to
> derive “languages” from a text? (Just the current locale language, or
> something more?)

Defaults to the language from the locale, yes.  The app should have called
setlocale (LC_ALL, "") first, though.
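
For illustration, a minimal Python sketch of that call using only the standard
locale module.  The helper name init_locale is made up, and the fallback on
failure is an assumption of this sketch, not something Pango requires.

```python
import locale


def init_locale():
    """Adopt the locale named by the environment (LC_ALL, LC_CTYPE, LANG, ...)."""
    try:
        # Same effect as the C call setlocale (LC_ALL, "").
        return locale.setlocale(locale.LC_ALL, "")
    except locale.Error:
        # The environment names a locale that is not installed;
        # keep whatever locale is currently active and report it.
        return locale.setlocale(locale.LC_ALL)


current = init_locale()
print(current)
```

This should run before any text is laid out, so that the locale-derived
default language is already in place.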


> What happens when out-of-language characters pop up?
> 
> Is it like the following:  ?
> 
> 1) Pango gets a (untagged, no language) text to layout
> 2) Pango launches a sans:lang=locale request and gets a list from
> fontconfig
> 3) every character covered in the first font from the result list is
> laid out
> 4a) if a character comes up that is i) not covered but ii) is considered
> the same language is tried to layout with the next font from the list
> (recursive)
> 4b) if a char is i) not covered and ii) is not considered the same
> language: that results in a new lookup like “sans:lang=new_language” and
> a covering font is searched in the new result list  (*)
> 5) Pango falls back to the very first font from the original result list
> as soon as a character from the initial (locale derived) language is hit
> ...and so on...

Well, something like that.  More precisely:

1) Pango breaks the text into runs, each having a script and a language.
These are determined as follows:

  - The script of the run is determined using the Unicode Character Database.

  - There is a language preference list, which is the user-set language,
followed by the contents of $PANGO_LANGUAGE or $LANGUAGE, followed by $LC_ALL,
$LC_CTYPE, and $LANG.  I am not sure about the exact order; check the source.

  - The first language in that list that uses the run's script is selected.
This is determined using pango_language_includes_script().

2) For each run, the language is used to get a fontset from fontconfig.

3) For each character, the first font from the fontset that supports the
character is used.
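
The three steps above can be sketched roughly as follows.  This is a toy model
in Python, not Pango's code: the script ranges, language-to-script table, font
names and fontsets are all invented for illustration, the preference order just
follows the list in step 1 (which may not match the source), and "xx" stands in
for the case where no language can be derived.

```python
import re

# Toy tables, invented for illustration only (very coarse ranges).
SCRIPT_RANGES = [(0x0041, 0x024F, "Latin"), (0x0400, 0x04FF, "Cyrillic")]
LANG_SCRIPTS = {"en": {"Latin"}, "ru": {"Cyrillic"}}
FONT_COVERAGE = {
    "SansBasic": [(0x20, 0x7E)],
    "SansLatin": [(0x20, 0x24F)],
    "SansCyr": [(0x20, 0x7E), (0x400, 0x4FF)],
}
# Hypothetical per-language fontsets, standing in for what fontconfig returns.
FONTSETS = {
    "en": ["SansBasic", "SansLatin", "SansCyr"],
    "ru": ["SansCyr", "SansLatin"],
    "xx": ["SansLatin", "SansCyr"],
}


def language_prefs(env, user_lang=None):
    """Build the preference list in the order described in step 1."""
    prefs = [user_lang] if user_lang else []
    listed = env.get("PANGO_LANGUAGE") or env.get("LANGUAGE") or ""
    prefs += [p for p in listed.split(":") if p]
    prefs += [env[v] for v in ("LC_ALL", "LC_CTYPE", "LANG") if env.get(v)]
    # Reduce "ru_RU.UTF-8" to "ru"; real language tags carry more detail.
    return [re.split(r"[_.@-]", p)[0].lower() for p in prefs]


def script_of(ch):
    cp = ord(ch)
    for lo, hi, script in SCRIPT_RANGES:
        if lo <= cp <= hi:
            return script
    return "Common"


def split_runs(text):
    """Step 1: group characters into runs; Common characters join a neighbour."""
    runs = []
    for ch in text:
        script = script_of(ch)
        if runs and (script == "Common" or runs[-1][0] in (script, "Common")):
            if runs[-1][0] == "Common":
                runs[-1][0] = script
            runs[-1][1] += ch
        else:
            runs.append([script, ch])
    return [(script, chars) for script, chars in runs]


def pick_language(script, prefs):
    """Step 1: first preferred language that uses the run's script, else 'xx'."""
    for lang in prefs:
        if script in LANG_SCRIPTS.get(lang, ()):
            return lang
    return "xx"


def covers(font, ch):
    return any(lo <= ord(ch) <= hi for lo, hi in FONT_COVERAGE[font])


def layout(text, prefs):
    """Steps 2 and 3: one fontset per run, first covering font per character."""
    out = []
    for script, run in split_runs(text):
        lang = pick_language(script, prefs)
        fontset = FONTSETS.get(lang, FONTSETS["xx"])
        for ch in run:
            font = next((f for f in fontset if covers(f, ch)), None)
            out.append((ch, lang, font))
    return out


prefs = language_prefs({"LANGUAGE": "en", "LANG": "ru_RU.UTF-8"})
for item in layout("é я", prefs):
    print(item)
```

Note that the font is chosen per character but the fontset is chosen per run,
so an uncovered character falls through to the next font in the same fontset
rather than triggering a new lookup.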


> (*) Does Pango still use the infamous lang=“xx” requests in case it
> can’t derive a language?

Yes, but that's a feature.


> Fontconfig has to be adapted to always return a “unicode-complete” font
> at the bottom of the list to avoid boxes to be drawn. (Pango does never
> ask for specific glyphs, but languages, right?)

Fontconfig always returns a fontset that provides fonts for every character
that at least one font supports, so no particular ordering is required; it
does the Right Thing.

behdad


> 
> Regards,
> Robert
> 
> 
>>
>>> So to
>>> avoid missing characters it requires a corresponding fontconfig
>>> configuration, and does not launch adapted requests for e.g.
>>> "sans:lang=someotherlang" or "sans:charset=..."?
>>
>> It does for lang, not charset.
>>
>>
>>> In the example above: is the "sans" the default in Pango if not
>>> overwritten in the app? Or is the default the empty pattern?
>>
>> I think sans is the default, yes.
>>
>>> In a standard gtk/gnome app, the ":lang=..." part comes from the current
>>> locale or is it text driven?
>>
>> Defaults to locale, but the app can override (eg, firefox passes the
>> web page
>> language).
>>
>> behdad
>>
>>>
>>> Thanks again,
>>> Robert
>>> _______________________________________________
>>> gtk-i18n-list mailing list
>>> gtk-i18n-list gnome org
>>> http://mail.gnome.org/mailman/listinfo/gtk-i18n-list
>>>
> 
> 

