Re: I think there is a bug in get_shaper_and_font() in pango-1.8.2



I've cut several million lines out of it, and distilled it down to this
small test case (attached).  The bug occurs on both pango-1.8.2 and

I've given the output of the test case below.  It seems my method of
writing out the character number is not quite right, but that doesn't
matter for this purpose.

Running it with the '1' argument shows what the output SHOULD look like.
Running it with the '2' argument shows, at the bottom of the output, that
some of the characters are now being rendered using the "Basic" shape
engine.  This causes them to be positioned incorrectly when drawn.

The Hebrew word is 1489 1468 1464 1512 1464 1443 1488.  The first,
middle (1512), and last characters are all Hebrew *letters*, while the
others are vowels and diacritical marks.  Pango considers the vowels and
diacritics to be "INHERIT" script type.

This test tricks pango into polluting its 'shaper font cache' (as
implemented by shaper_font_cache_get() in pango-context.c).

It works like this:

The string 32, 1468, 1464, 32, 1464, 1443, 32 contains no characters
that are of the HEBREW script type.  But, we tell it to use the Hebrew
language.  So, the vowels resolve to the "Basic" shape engine through
the inheritance rules. Their script type is considered to be perhaps
LATIN, but certainly not Hebrew.  But the "he" language setting causes
it to use the same cache as the good Hebrew text does.  It then pollutes
the cache by storing a mapping for the individual vowel characters to
the Basic shape engine instead of the Hebrew one.

From then on, Hebrew text containing the polluted vowels and diacritics
is rendered wrong.

It is difficult to avoid triggering this bug in a web browser, because
it throws all sorts of crud at the rendering engine.  I found the bug
through browser testing.


------ OUTPUT OF TEST CASE ------

aotearoa$ ./pango-test 1
Doing test 1 - Hebrew working properly

(pango-test:7215): Pango-WARNING **: Cannot open font file for font Ezra
SIL 12
1488 HebrewEngineFc
1443 HebrewEngineFc
1512 HebrewEngineFc
1512 HebrewEngineFc
1489 HebrewEngineFc
1489 HebrewEngineFc
1489 HebrewEngineFc
---
aotearoa$ ./pango-test 2
Doing test 2 - Hebrew getting corrupted through cache poisoning

(pango-test:7217): Pango-WARNING **: Cannot open font file for font Ezra
SIL 12
32 BasicEngineFc
32 BasicEngineFc
32 BasicEngineFc
32 BasicEngineFc
32 BasicEngineFc
1443 BasicEngineFc
32 BasicEngineFc
---
1488 HebrewEngineFc
1443 BasicEngineFc
1464 BasicEngineFc
1512 HebrewEngineFc
1468 BasicEngineFc
1468 BasicEngineFc
1489 HebrewEngineFc
---

Owen Taylor wrote:
> On Wed, 2005-08-10 at 16:04 +1200, Stephen Blackheath wrote:
>>Dear Pango developers,
>>Hello there!  I've been trying to work on getting Mozilla Firefox to
>>render Hebrew properly, and I have discovered what I think is a bug in
>>Pango 1.8.2 that (without a thorough check) seems to exist in
>>pango-1.9.1 as well.  The information below doesn't constitute a
>>"proper" bug report as such, but it should hopefully at least be enough
>>to point out that there is a problem.  Please let me know if you want
>>more information, or a proper test case.
> A test case that doesn't involve millions of lines of code would
> definitely be hugely appreciated.
> (You could start with the stuff in examples/ ... something there
> might demonstrate your bug with the right input text.)
> Thanks,
> 						Owen



#include <glib/gunicode.h>
#include <gtk/gtk.h>      /* gtk_init() */
#include <gdk/gdkpango.h>
#include <gdk/gdkrgb.h>
#include <pango/pango.h>
#include <stdlib.h>       /* atoi() */
#include <string.h>
#include <stdio.h>

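/* Lay out the UTF-16 string and print, for every glyph, the code point at
   the start of its cluster and the name of the shape engine chosen for it. */
void dump(PangoContext* pc, gunichar2* text16, int length16)
{
    gchar* text8;
    PangoLayoutLine* line;
    PangoLayout *layout;
    GSList *tmpList;

    text8 = g_utf16_to_utf8(text16, length16, NULL, NULL, NULL);

    layout = pango_layout_new(pc);

    pango_layout_set_text(layout, text8, strlen(text8));
    line = pango_layout_get_line(layout, 0);

    for (tmpList = line->runs; tmpList && tmpList->data;
         tmpList = tmpList->next) {
        gint i;
        PangoLayoutRun *layoutRun = (PangoLayoutRun *)tmpList->data;

        for (i=0; i < layoutRun->glyphs->num_glyphs; i++) {
            gint thisOffset = (gint)layoutRun->glyphs->log_clusters[i] + layoutRun->item->offset;
            /* The second argument was cut off in the archived copy.  The
               engine's GType name is ASSUMED here because it matches the
               "HebrewEngineFc" / "BasicEngineFc" strings in the output. */
            printf("%d %s\n", g_utf8_get_char(text8+thisOffset),
                   G_OBJECT_TYPE_NAME(layoutRun->item->analysis.shape_engine));
        }
    }
    /* Separator inferred from the "---" lines in the output above. */
    printf("---\n");

    g_object_unref(layout);
    g_free(text8);
}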

int usage(char* argv0)
{
    fprintf(stderr, "Usage:\n");
    fprintf(stderr, "  %s 1  Show Hebrew working properly\n", argv0);
    fprintf(stderr, "  %s 2  Show Hebrew getting corrupted through cache poisoning\n", argv0);
    return 1;
}

int main(int argc, char* argv[])
{
    PangoContext* pc;
    gint i;
    PangoFontDescription* fd;
    gint test_no;

    gtk_init(&argc, &argv);

    if (argc < 2)
        return usage(argv[0]);

    test_no = atoi(argv[1]);
    if (test_no < 1 || test_no > 2)
        return usage(argv[0]);

    printf("Doing test %d - %s\n", test_no,
      test_no == 1 ? "Hebrew working properly"
                   : "Hebrew getting corrupted through cache poisoning");

    pc = gdk_pango_context_get();
    pango_context_set_language(pc, pango_language_from_string("he"));

    if (test_no == 2) {
        /* Formatting this string causes pango-1.8.2 to subsequently render
           Hebrew text brokenly. */
        gunichar2 text16[] = {' ', 1468, 1464, ' ', 1464, 1443, ' '};
        dump(pc, text16, 7);
    }

    {
        /* The Hebrew word itself; its own block so that text16 can be
           declared a second time. */
        gunichar2 text16[] = {1489, 1468, 1464, 1512, 1464, 1443, 1488};
        dump(pc, text16, 7);
    }

    return 0;
}

CFLAGS = $(shell pkg-config --cflags gtk+-2.0)
LDFLAGS = $(shell pkg-config --libs gtk+-2.0 pango)

all: pango-test

pango-test: pango-test.c
	$(CC) -o pango-test pango-test.c $(CFLAGS) $(LDFLAGS)

clean:
	rm -f pango-test
