Faster UTF-8 decoding in GLib
- From: Mikhail Zabaluev <mikhail zabaluev gmail com>
- To: gtk-devel-list gnome org
- Subject: Faster UTF-8 decoding in GLib
- Date: Tue, 16 Mar 2010 17:20:41 +0200
Hello,
I've made a glib branch where I tried to optimize the UTF-8 decoding routines:
http://git.collabora.co.uk/?p=user/zabaluev/glib.git;a=shortlog;h=refs/heads/fast-utf8
The new code uses a table of unrolled functions to decode byte
sequences, dispatched on the first byte of each sequence.
g_utf8_get_char() now has an inlined implementation.
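To illustrate the idea, here is a minimal sketch of table-dispatched decoding. It is not the code from the branch: all names (decode_fn, decode_table, sketch_utf8_get_char) are hypothetical, the table is indexed by the top five bits of the lead byte, and, like g_utf8_get_char(), it assumes the input is valid UTF-8 and does not check continuation bytes.

```c
#include <stdint.h>

/* Hypothetical sketch, not the branch's implementation.
 * Each decoder is fully unrolled for one sequence length. */
typedef uint32_t (*decode_fn)(const unsigned char *p);

static uint32_t decode_1(const unsigned char *p)
{
    return p[0];
}

static uint32_t decode_2(const unsigned char *p)
{
    return ((uint32_t)(p[0] & 0x1f) << 6) | (uint32_t)(p[1] & 0x3f);
}

static uint32_t decode_3(const unsigned char *p)
{
    return ((uint32_t)(p[0] & 0x0f) << 12)
         | ((uint32_t)(p[1] & 0x3f) << 6)
         |  (uint32_t)(p[2] & 0x3f);
}

static uint32_t decode_4(const unsigned char *p)
{
    return ((uint32_t)(p[0] & 0x07) << 18)
         | ((uint32_t)(p[1] & 0x3f) << 12)
         | ((uint32_t)(p[2] & 0x3f) << 6)
         |  (uint32_t)(p[3] & 0x3f);
}

/* Stray continuation bytes and 0xF8..0xFF lead bytes: return U+FFFD.
 * (An assumption of this sketch; the real API leaves this undefined.) */
static uint32_t decode_bad(const unsigned char *p)
{
    (void)p;
    return 0xFFFD;
}

/* Indexed by (lead byte >> 3), i.e. the top five bits. */
static const decode_fn decode_table[32] = {
    /* 0x00..0x7F: ASCII */
    decode_1, decode_1, decode_1, decode_1,
    decode_1, decode_1, decode_1, decode_1,
    decode_1, decode_1, decode_1, decode_1,
    decode_1, decode_1, decode_1, decode_1,
    /* 0x80..0xBF: continuation bytes, not valid as a lead byte */
    decode_bad, decode_bad, decode_bad, decode_bad,
    decode_bad, decode_bad, decode_bad, decode_bad,
    /* 0xC0..0xDF: two-byte sequences */
    decode_2, decode_2, decode_2, decode_2,
    /* 0xE0..0xEF: three-byte sequences */
    decode_3, decode_3,
    /* 0xF0..0xF7: four-byte sequences */
    decode_4,
    /* 0xF8..0xFF: not used by UTF-8 */
    decode_bad
};

/* One table lookup and one indirect call replace the chain of branches. */
static uint32_t sketch_utf8_get_char(const char *s)
{
    const unsigned char *p = (const unsigned char *)s;
    return decode_table[p[0] >> 3](p);
}
```

Whether this beats a branchy decoder depends on how predictable the lead bytes are and on the cost of the indirect call, which may be part of why the results differ between ASCII and Chinese text.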
I have added a performance test that can also be used against mainline.
Some performance observations on x86, with the code compiled by gcc 4.4.1
using optimization flags -O3 -march=core2 and run on a ThinkPad T61p:
- There is a 15-50% speedup on g_utf8_get_char(), depending on the text.
- g_utf8_to_ucs4_fast() got a similar boost for ASCII, but curiously,
performance has degraded for Chinese text.
- g_utf8_get_char_extended() and g_utf8_get_char_validated()
surprisingly perform better in the present branchy implementation,
compared to my attempted reimplementation using the function tables. I
left them untouched.
I can get measurements on ARM Cortex-A8 with a Nokia N900, if there is
enough interest.
Feel free to play and improve.
Mikhail