Pango glyph extents patch revisited


I decided to review Billy's glyph extents patch, which was the
last of the bits remaining from the major optimization work three
weeks ago.  In the process I came up with my own completely
different patch that I'm posting.  I think we can pick the good
stuff from the two and close this issue.

We are talking about pangocairo-fcfont.c.  The current
implementation creates a hash table per PangoCairoFcFont and
drops whatever glyph extents it computes into that table.  The
hash table operations show up in the profiles at about 2.5%.  So
note that we are talking about as little as 2.5%; if you feel
that's a waste of time, feel free to stop reading now :).

Billy did a minimal patch [1] that removes the g_hash_table, and
instead allocates a fixed-size 1024-bucket custom hash table,
each bucket a linked list.  That definitely has its merits, and
may work pretty fast as well.  But I didn't like allocating 4kb
of hash table, and mallocing an item (40 bytes) per glyph later.
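
For readers who want the shape of that approach in front of them,
here is a minimal sketch of a fixed 1024-bucket table with chained
entries.  It is not Billy's patch: every name in it (ChainedTable,
ChainNode, chained_lookup, ...) is made up for illustration, and the
node stores only an advance width rather than full extents.

```c
#include <stdint.h>
#include <stdlib.h>

#define N_BUCKETS 1024

/* Hypothetical node; one of these is malloc'ed per cached glyph. */
typedef struct ChainNode ChainNode;
struct ChainNode {
  ChainNode *next;
  uint32_t   glyph;
  int32_t    width;   /* stand-in for the full extents */
};

/* The bucket array is allocated up front: 1024 pointers, which is
   4kb with 4-byte pointers and 8kb with 8-byte pointers. */
typedef struct {
  ChainNode    *buckets[N_BUCKETS];
  unsigned long n_nodes;
} ChainedTable;

static ChainedTable *
chained_table_new (void)
{
  return calloc (1, sizeof (ChainedTable));
}

/* Walk the chain in the glyph's bucket; NULL if not cached yet. */
static ChainNode *
chained_lookup (ChainedTable *t, uint32_t glyph)
{
  ChainNode *n;

  for (n = t->buckets[glyph % N_BUCKETS]; n; n = n->next)
    if (n->glyph == glyph)
      return n;
  return NULL;
}

/* Prepend a new node; nothing is ever evicted, so the table grows
   with the number of distinct glyphs queried. */
static ChainNode *
chained_insert (ChainedTable *t, uint32_t glyph, int32_t width)
{
  ChainNode *n = malloc (sizeof *n);

  if (!n)
    return NULL;
  n->glyph = glyph;
  n->width = width;
  n->next = t->buckets[glyph % N_BUCKETS];
  t->buckets[glyph % N_BUCKETS] = n;
  t->n_nodes++;
  return n;
}

static void
chained_table_free (ChainedTable *t)
{
  unsigned int i;

  for (i = 0; i < N_BUCKETS; i++)
    while (t->buckets[i])
      {
        ChainNode *n = t->buckets[i];
        t->buckets[i] = n->next;
        free (n);
      }
  free (t);
}
```

The up-front bucket array and the per-glyph malloc in chained_insert
are the two costs mentioned above.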

What I went for instead is a copy of what Federico did for
gunichar->glyph lookup: a fixed-size last-only 256-bucket cache,
where each bucket remembers only the last glyph that mapped to
it.  In fact I even shared the cache-reporting facilities with
his cache.  I also reordered the code, so we make fewer cairo
calls.  I made the per-glyph item stored in the cache smaller,
bringing a 256-item cache down to 6kb, which is comparable to
Billy's array of NULLs in size.  The hit ratio in all my tests
has been >99%.  In fact, in any realistic scenario, the hit
ratio of this cache should be exactly the same as Federico's
cache, since if you convert a gunichar to a glyph, you will get
that glyph's extents sooner or later.
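
To illustrate the last-only idea, here is a minimal sketch, not the
posted patch.  The entry layout (24 bytes, so 256 entries come to
6kb), the hit/miss counters, and all names are assumptions, and
compute_extents_slow is a dummy standing in for the real cairo
query.

```c
#include <stdint.h>
#include <string.h>

#define CACHE_SIZE 256                  /* power of two, so masking works */
#define CACHE_MASK (CACHE_SIZE - 1)

/* Hypothetical 24-byte entry; not necessarily the patch's layout. */
typedef struct {
  uint32_t glyph;
  int32_t  width;                       /* logical advance */
  int32_t  ink_x, ink_y, ink_w, ink_h;  /* ink rectangle */
} ExtentsEntry;

typedef struct {
  ExtentsEntry  entries[CACHE_SIZE];
  unsigned long hits, misses;           /* for cache reporting */
} ExtentsCache;

static unsigned long slow_calls;

/* Dummy stand-in for the expensive backend query. */
static void
compute_extents_slow (uint32_t glyph, ExtentsEntry *out)
{
  slow_calls++;
  memset (out, 0, sizeof *out);
  out->glyph = glyph;
  out->width = (int32_t) (glyph & 0xFFFF) * 2;
}

static void
extents_cache_init (ExtentsCache *cache)
{
  memset (cache, 0, sizeof *cache);
  /* A zeroed slot claims to hold glyph 0, and glyph 0 only ever maps
     to slot 0.  Putting a glyph that maps elsewhere into slot 0
     therefore makes every slot invalid to begin with. */
  cache->entries[0].glyph = 1;
}

/* Return the extents for glyph.  On a miss, the slot's previous
   occupant is simply overwritten: no chains, no allocation. */
static const ExtentsEntry *
extents_cache_get (ExtentsCache *cache, uint32_t glyph)
{
  ExtentsEntry *e = &cache->entries[glyph & CACHE_MASK];

  if (e->glyph != glyph)
    {
      cache->misses++;
      compute_extents_slow (glyph, e);
    }
  else
    cache->hits++;
  return e;
}
```

Two glyphs whose low 8 bits match (say 65 and 321) keep evicting
each other in this scheme; that is the price of the bounded size.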

One problem with this approach is that unlike the original code
and vektor's, mine does not cache all glyph extents ever queried.
I would like to see that as a plus: the cache does not grow
unbounded.  Besides, cairo and FreeType have their own caches, so
we are just adding a small L1 cache on top of them.  Very
reasonable IMHO.

What do people think?  As for speed, in my measurement it
performed almost like vektor's, although I expect it to be a bit
slower, since the cache size is 256, not 1024.  It would be nice
if somebody else benchmarked too.



