regarding g_utf8_offset_to_pointer optimizations

From: "Turner, David" <DTurner nds com>
To: <performance-list gnome org>
Subject: regarding g_utf8_offset_to_pointer optimizations
Date: Thu, 3 Nov 2005 15:22:58 +0100
Hello everyone,

I'm jumping on the discussion here, but I've written a small test
program that compares the performance of all variants I've collected
so far on planet and the performance list.

I'm using a different set of test texts, and it seems that the
relative performance of the various tricks being used depend a
*lot* from the input text.

I'm working on a 1.5 Ghz Pentium 4, with "-O3 -march=pentium4" by the
way, using "-O2" doesn't show really different profiles. What I'm
noticing is that:

  - for mostly ASCII text, or mostly Latin-1, the "32-bits at a time"
    algorithm beats anything else by a large factor. Surprisingly,
    G_LIKELY slows down things a bit, or has no effect on some of them.

  - for mixed text containing a lot of characters from various parts
    of the Unicode plane, the *original* algorithm beats down anything
    else by a _large_ margin.

  - for Bengali text, the 32-bits at a time algorithm is about
    *twice* slower than the original one. Again, the original one
    wins hands down.

  - on my machine (((guchar)*s++ >> 6) != 2) is slower than
    ((*s & 0xC0) != 0xC0), go figure... :-)

I've been puzzled because these results seem to contradict what has
been published previously, though I suppose this comes from the fact
that I'm not trying to parse the po-data files.

Thus is my question:

  - what kind of input text do we want to optimize the function for ?.

    I'm not convinced that either the po-data or the UTF-8 samplers
    listed below are good examples.

    I suppose that mostly-ASCII is the priority, at least for
    developers :-) But are we ready to pay a price for some
    users (e.g. Bengali, but probably others).

  - do you experience the same results with other kinds of CPUs ?

    I wouldn't be surprised to see large changes


I must say that I'm glad for all the wicked tricks that appeared here,
they're really fun to study. However, I'm still thinking that, unless
you know exactly what kind of text you're going to parse, it'd be
better to fix GtkTextView and others to avoid those quadratic algorithms.
the savings should be a lot more meaningful, and probably more consistent
as well.


Oh well, I don't know why I'm writing all of this. Keep up the great work !!

- David Turner
- The FreeType Project  (www.freetype.org)





PS:

I'm using the following:

   test file 1:  all text at the following page
      http://www.cl.cam.ac.uk/~mgk25/ucs/examples/UTF-8-demo.txt

   test file 2:  all text at the following page
      http://www.columbia.edu/kermit/utf8.html

   test file 3:  all text at the following page
      http://www.columbia.edu/~fdc/pace/

   test file 4: all text at the following (mostly Bengali)
      http://lekho.sourceforge.net/banglapage.html

   test file 5: all text at
      http://www.unics.uni-hannover.de/nhtcapri/multilingual1.html

   test file 6: all ASCII text at
      http://www.gnu.org


the text can be collected with FireFox and "Save As" -> "Text File"

the benchmark program doesn't depend on GLib or anything else, so
it's very portable, use it as:

  gcc -o bench   fastutf8.c  -DUNIX  -O2 <youroptions>
  ./bench  test*.txt

results for my machine (1.5 GHz P4) are below:

parsing 'test1.txt'
original                       : 32.667 us/op
sven neumann's                 : 54.732 us/op
test < 0xC0                    : 50.323 us/op
32-bits at a time              : 61.410 us/op
32-bits at a time + G_LIKELY   : 59.521 us/op
hi-bits only                   : 61.102 us/op
with if..else                  : 62.316 us/op
with conditionals              : 70.442 us/op
with bit tricks                : 64.888 us/op
parsing 'test2.txt'
original                       : 152.987 us/op
sven neumann's                 : 150.394 us/op
test < 0xC0                    : 121.571 us/op
32-bits at a time              : 92.953 us/op
32-bits at a time + G_LIKELY   : 92.537 us/op
hi-bits only                   : 301.572 us/op
with if..else                  : 143.304 us/op
with conditionals              : 338.475 us/op
with bit tricks                : 313.501 us/op
parsing 'test3.txt'
original                       : 109.395 us/op
sven neumann's                 : 92.921 us/op
test < 0xC0                    : 74.268 us/op
32-bits at a time              : 48.613 us/op
32-bits at a time + G_LIKELY   : 56.188 us/op
hi-bits only                   : 214.953 us/op
with if..else                  : 84.350 us/op
with conditionals              : 239.424 us/op
with bit tricks                : 222.241 us/op
parsing 'test4.txt'
original                       : 23.714 us/op
sven neumann's                 : 45.397 us/op
test < 0xC0                    : 42.483 us/op
32-bits at a time              : 46.623 us/op
32-bits at a time + G_LIKELY   : 46.469 us/op
hi-bits only                   : 43.218 us/op
with if..else                  : 44.223 us/op
with conditionals              : 50.071 us/op
with bit tricks                : 46.128 us/op
parsing 'test5.txt'
original                       : 37.755 us/op
sven neumann's                 : 39.056 us/op
test < 0xC0                    : 34.076 us/op
32-bits at a time              : 41.643 us/op
32-bits at a time + G_LIKELY   : 41.738 us/op
hi-bits only                   : 72.922 us/op
with if..else                  : 48.119 us/op
with conditionals              : 82.630 us/op
with bit tricks                : 76.864 us/op
parsing 'test6.txt'
original                       : 27.041 us/op
sven neumann's                 : 21.321 us/op
test < 0xC0                    : 16.363 us/op
32-bits at a time              : 7.189 us/op
32-bits at a time + G_LIKELY   : 7.542 us/op
hi-bits only                   : 55.838 us/op
with if..else                  : 16.424 us/op
with conditionals              : 61.249 us/op
with bit tricks                : 57.877 us/op


***********************************************************************************
Information contained in this email message is confidential and may be privileged, and is intended only for use of the individual or entity named above. If the reader of this message is not the intended recipient, or the employee or agent responsible to deliver it to the intended recipient, you are hereby notified that any dissemination, distribution or copying of this communication is strictly prohibited. If you have received this communication in error, please immediately notify the postmaster nds com and destroy the original message.
***********************************************************************************
Attachment: fastutf8.c
Description: fastutf8.c
Follow-Ups:
- Re: regarding g_utf8_offset_to_pointer optimizations
  - From: Benoît Dejean
[Date Prev][Date Next] [Thread Prev][Thread Next] [Thread Index] [Date Index] [Author Index]