Re: Faster UTF-8 decoding in GLib



Hi,

On Tuesday, 16.03.2010 at 17:20 +0200, Mikhail Zabaluev wrote:

> I've made a glib branch where I tried to optimize the UTF-8 decoding routines:
> http://git.collabora.co.uk/?p=user/zabaluev/glib.git;a=shortlog;h=refs/heads/fast-utf8
> 
> The new code uses a table of unrolled functions to decode byte
> sequences, dispatched by the first character. g_utf8_get_char() got an
> inlined implementation.

Ouch.  I'm not sure that's such a great idea -- indirect calls usually
completely kill branch prediction.  I would advise testing on
different CPU types.  Also, table lookups have their downsides: more
cache pressure, the GOT needs to be fetched, etc.
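
To make the concern concrete, here is a rough sketch of what a dispatch
scheme of this kind might look like.  This is not the code from the
fast-utf8 branch; all names (dec1, dispatch, table_get_char, ...) are
made up for illustration, and no validation of continuation bytes is
attempted.

```c
#include <stdint.h>

typedef uint32_t (*decode_fn) (const unsigned char *p);

/* One unrolled decoder per sequence length (hypothetical names). */
static uint32_t dec1 (const unsigned char *p)
{
  return p[0];
}

static uint32_t dec2 (const unsigned char *p)
{
  return ((uint32_t) (p[0] & 0x1F) << 6) | (p[1] & 0x3F);
}

static uint32_t dec3 (const unsigned char *p)
{
  return ((uint32_t) (p[0] & 0x0F) << 12)
       | ((uint32_t) (p[1] & 0x3F) << 6)
       | (p[2] & 0x3F);
}

static uint32_t dec4 (const unsigned char *p)
{
  return ((uint32_t) (p[0] & 0x07) << 18)
       | ((uint32_t) (p[1] & 0x3F) << 12)
       | ((uint32_t) (p[2] & 0x3F) << 6)
       | (p[3] & 0x3F);
}

/* Invalid lead byte: return U+FFFD (an arbitrary choice for this sketch). */
static uint32_t dec_bad (const unsigned char *p)
{
  (void) p;
  return 0xFFFD;
}

/* 256 function pointers, indexed by the first byte of the sequence. */
static decode_fn dispatch[256];

static void init_dispatch (void)
{
  for (int b = 0; b < 256; b++)
    {
      if (b < 0x80)      dispatch[b] = dec1;
      else if (b < 0xC0) dispatch[b] = dec_bad;  /* continuation byte */
      else if (b < 0xE0) dispatch[b] = dec2;
      else if (b < 0xF0) dispatch[b] = dec3;
      else if (b < 0xF8) dispatch[b] = dec4;
      else               dispatch[b] = dec_bad;
    }
}

static uint32_t table_get_char (const unsigned char *p)
{
  /* Table load followed by an indirect call whose target depends on data. */
  return dispatch[p[0]] (p);
}
```

The last line is the part I'm worried about: the call target changes
with the input, and the table itself has to be loaded before the call
can be made.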

If you are interested, you may check out the glibmm solution, which gets
away without using any tables whatsoever:

http://git.gnome.org/browse/glibmm/tree/glib/glibmm/ustring.cc#n267
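
As a rough illustration of what a table-free decoder can look like,
here is a minimal sketch.  It is not the glibmm code at the link above;
sketch_utf8_get_char is a made-up name, and the input is assumed to
point at the start of a well-formed UTF-8 sequence.

```c
#include <stdint.h>

/* Hypothetical helper, not a GLib or glibmm function.
 * Assumes p points at the first byte of a well-formed sequence;
 * continuation bytes are not validated. */
static uint32_t
sketch_utf8_get_char (const unsigned char *p)
{
  uint32_t c = p[0];
  int extra;

  if (c < 0x80)
    return c;                     /* ASCII fast path */

  if (c < 0xC0)
    return 0xFFFD;                /* stray continuation byte */
  else if (c < 0xE0)
    { extra = 1; c &= 0x1F; }     /* 110xxxxx: 2-byte sequence */
  else if (c < 0xF0)
    { extra = 2; c &= 0x0F; }     /* 1110xxxx: 3-byte sequence */
  else if (c < 0xF8)
    { extra = 3; c &= 0x07; }     /* 11110xxx: 4-byte sequence */
  else
    return 0xFFFD;                /* not a valid lead byte */

  /* Fold in six payload bits from each continuation byte. */
  for (int i = 1; i <= extra; i++)
    c = (c << 6) | (p[i] & 0x3F);

  return c;
}
```

The sequence length comes from plain comparisons on the lead byte, so
there is no data table to pull into the cache and no indirect call.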

I'd love to have numbers on how my implementation competes with
table-based solutions, in particular if you inline the function.  So,
if you have the time... :-)

--Daniel



