Re: Faster UTF-8 decoding in GLib

From: Daniel Elstner <daniel kitta googlemail com>
To: Mikhail Zabaluev <mikhail zabaluev gmail com>
Cc: gtk-devel-list gnome org
Subject: Re: Faster UTF-8 decoding in GLib
Date: Tue, 16 Mar 2010 18:47:52 +0200

Hi,

On Tuesday, 16.03.2010, at 17:20 +0200, Mikhail Zabaluev wrote:

> I've made a glib branch where I tried to optimize the UTF-8 decoding routines:
> http://git.collabora.co.uk/?p=user/zabaluev/glib.git;a=shortlog;h=refs/heads/fast-utf8
> 
> The new code uses a table of unrolled functions to decode byte
> sequences, dispatched by the first character. g_utf8_get_char() got an
> inlined implementation.

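Ouch.  I'm not sure that's such a great idea -- indirect calls through a
function table tend to defeat branch prediction, because the call target
changes with the input data.  I would advise benchmarking on different
CPU types.  Table lookups have their own downsides, too -- more cache
pressure, and in position-independent code the table address has to be
fetched through the GOT.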
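
To make the concern concrete, here is a rough sketch of what dispatching
through a table of per-length decoders might look like.  It is not the
code from the fast-utf8 branch: the 16-entry table indexed by the top
four bits of the lead byte, and all names (decode_fn, decode_table,
decode_utf8_table), are assumptions made for illustration.  It also
assumes well-formed input and does not validate continuation bytes.

```c
#include <stdint.h>

/* Hypothetical signature: decode one code point starting at p. */
typedef uint32_t (*decode_fn)(const unsigned char *p);

/* One unrolled decoder per sequence length; no validation is done. */
static uint32_t decode_1(const unsigned char *p)
{
    return p[0];
}

static uint32_t decode_2(const unsigned char *p)
{
    return ((uint32_t)(p[0] & 0x1F) << 6) | (uint32_t)(p[1] & 0x3F);
}

static uint32_t decode_3(const unsigned char *p)
{
    return ((uint32_t)(p[0] & 0x0F) << 12)
         | ((uint32_t)(p[1] & 0x3F) << 6)
         |  (uint32_t)(p[2] & 0x3F);
}

static uint32_t decode_4(const unsigned char *p)
{
    return ((uint32_t)(p[0] & 0x07) << 18)
         | ((uint32_t)(p[1] & 0x3F) << 12)
         | ((uint32_t)(p[2] & 0x3F) << 6)
         |  (uint32_t)(p[3] & 0x3F);
}

/* A continuation byte in lead position: return U+FFFD as a placeholder. */
static uint32_t decode_bad(const unsigned char *p)
{
    (void)p;
    return 0xFFFD;
}

/* Indexed by the top four bits of the lead byte. */
static const decode_fn decode_table[16] = {
    decode_1, decode_1, decode_1, decode_1,     /* 0x00..0x3F */
    decode_1, decode_1, decode_1, decode_1,     /* 0x40..0x7F */
    decode_bad, decode_bad, decode_bad, decode_bad, /* 0x80..0xBF */
    decode_2, decode_2,                         /* 0xC0..0xDF */
    decode_3,                                   /* 0xE0..0xEF */
    decode_4                                    /* 0xF0..0xFF, unchecked */
};

/* The indirect call below is the one whose target depends on the data. */
static uint32_t decode_utf8_table(const unsigned char *p)
{
    return decode_table[p[0] >> 4](p);
}
```

On mixed text the call target changes from character to character, which
is the case an indirect-branch predictor handles worst; on pure ASCII it
always hits decode_1 and the cost should be much lower.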

If you are interested, you may check out the glibmm solution, which gets
away without using any tables whatsoever:

http://git.gnome.org/browse/glibmm/tree/glib/glibmm/ustring.cc#n267

I'd love to have numbers on how my implementation competes with
table-based solutions, in particular when the function is inlined.  So,
if you have the time... :-)
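
For readers without the repository at hand, a table-free decoder can be
sketched roughly as follows.  This is an illustration of the general
idea in plain C, not a copy of the glibmm code linked above; the name
decode_utf8_no_table is made up, and the sketch assumes valid UTF-8
input and performs no error checking.

```c
#include <stdint.h>

/*
 * Hypothetical table-free decoder.  The lead byte's length marker bits
 * are shifted up together with the payload; after each continuation
 * byte, one marker bit is tested to decide whether another byte follows.
 */
static uint32_t decode_utf8_no_table(const unsigned char *p)
{
    uint32_t result = p[0];

    if (result & 0x80) {            /* multi-byte sequence */
        uint32_t mask = 0x40;

        do {
            result <<= 6;           /* make room for six payload bits */
            mask <<= 5;             /* next marker bit, relative to result */
            result += (uint32_t)(*++p) - 0x80;  /* strip the 10xxxxxx tag */
        } while (result & mask);

        result &= mask - 1;         /* drop the leftover marker bits */
    }

    return result;
}
```

For example, the bytes E2 82 AC decode to U+20AC: the loop runs twice
and the final mask keeps the low 16 bits.  There is no table and no
indirect call, only one data-dependent loop branch per continuation
byte, which makes the function small enough to inline.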

--Daniel
