Hello everyone, I'm jumping on the discussion here, but I've written a small test program that compares the performance of all variants I've collected so far on planet and the performance list. I'm using a different set of test texts, and it seems that the relative performance of the various tricks being used depend a *lot* from the input text. I'm working on a 1.5 Ghz Pentium 4, with "-O3 -march=pentium4" by the way, using "-O2" doesn't show really different profiles. What I'm noticing is that: - for mostly ASCII text, or mostly Latin-1, the "32-bits at a time" algorithm beats anything else by a large factor. Surprisingly, G_LIKELY slows down things a bit, or has no effect on some of them. - for mixed text containing a lot of characters from various parts of the Unicode plane, the *original* algorithm beats down anything else by a _large_ margin. - for Bengali text, the 32-bits at a time algorithm is about *twice* slower than the original one. Again, the original one wins hands down. - on my machine (((guchar)*s++ >> 6) != 2) is slower than ((*s & 0xC0) != 0xC0), go figure... :-) I've been puzzled because these results seem to contradict what has been published previously, though I suppose this comes from the fact that I'm not trying to parse the po-data files. Thus is my question: - what kind of input text do we want to optimize the function for ?. I'm not convinced that either the po-data or the UTF-8 samplers listed below are good examples. I suppose that mostly-ASCII is the priority, at least for developers :-) But are we ready to pay a price for some users (e.g. Bengali, but probably others). - do you experience the same results with other kinds of CPUs ? I wouldn't be surprised to see large changes I must say that I'm glad for all the wicked tricks that appeared here, they're really fun to study. However, I'm still thinking that, unless you know exactly what kind of text you're going to parse, it'd be better to fix GtkTextView and others to avoid those quadratic algorithms. the savings should be a lot more meaningful, and probably more consistent as well. Oh well, I don't know why I'm writing all of this. Keep up the great work !! - David Turner - The FreeType Project (www.freetype.org) PS: I'm using the following: test file 1: all text at the following page http://www.cl.cam.ac.uk/~mgk25/ucs/examples/UTF-8-demo.txt test file 2: all text at the following page http://www.columbia.edu/kermit/utf8.html test file 3: all text at the following page http://www.columbia.edu/~fdc/pace/ test file 4: all text at the following (mostly Bengali) http://lekho.sourceforge.net/banglapage.html test file 5: all text at http://www.unics.uni-hannover.de/nhtcapri/multilingual1.html test file 6: all ASCII text at http://www.gnu.org the text can be collected with FireFox and "Save As" -> "Text File" the benchmark program doesn't depend on GLib or anything else, so it's very portable, use it as: gcc -o bench fastutf8.c -DUNIX -O2 <youroptions> ./bench test*.txt results for my machine (1.5 GHz P4) are below: parsing 'test1.txt' original : 32.667 us/op sven neumann's : 54.732 us/op test < 0xC0 : 50.323 us/op 32-bits at a time : 61.410 us/op 32-bits at a time + G_LIKELY : 59.521 us/op hi-bits only : 61.102 us/op with if..else : 62.316 us/op with conditionals : 70.442 us/op with bit tricks : 64.888 us/op parsing 'test2.txt' original : 152.987 us/op sven neumann's : 150.394 us/op test < 0xC0 : 121.571 us/op 32-bits at a time : 92.953 us/op 32-bits at a time + G_LIKELY : 92.537 us/op hi-bits only : 301.572 us/op with if..else : 143.304 us/op with conditionals : 338.475 us/op with bit tricks : 313.501 us/op parsing 'test3.txt' original : 109.395 us/op sven neumann's : 92.921 us/op test < 0xC0 : 74.268 us/op 32-bits at a time : 48.613 us/op 32-bits at a time + G_LIKELY : 56.188 us/op hi-bits only : 214.953 us/op with if..else : 84.350 us/op with conditionals : 239.424 us/op with bit tricks : 222.241 us/op parsing 'test4.txt' original : 23.714 us/op sven neumann's : 45.397 us/op test < 0xC0 : 42.483 us/op 32-bits at a time : 46.623 us/op 32-bits at a time + G_LIKELY : 46.469 us/op hi-bits only : 43.218 us/op with if..else : 44.223 us/op with conditionals : 50.071 us/op with bit tricks : 46.128 us/op parsing 'test5.txt' original : 37.755 us/op sven neumann's : 39.056 us/op test < 0xC0 : 34.076 us/op 32-bits at a time : 41.643 us/op 32-bits at a time + G_LIKELY : 41.738 us/op hi-bits only : 72.922 us/op with if..else : 48.119 us/op with conditionals : 82.630 us/op with bit tricks : 76.864 us/op parsing 'test6.txt' original : 27.041 us/op sven neumann's : 21.321 us/op test < 0xC0 : 16.363 us/op 32-bits at a time : 7.189 us/op 32-bits at a time + G_LIKELY : 7.542 us/op hi-bits only : 55.838 us/op with if..else : 16.424 us/op with conditionals : 61.249 us/op with bit tricks : 57.877 us/op *********************************************************************************** Information contained in this email message is confidential and may be privileged, and is intended only for use of the individual or entity named above. If the reader of this message is not the intended recipient, or the employee or agent responsible to deliver it to the intended recipient, you are hereby notified that any dissemination, distribution or copying of this communication is strictly prohibited. If you have received this communication in error, please immediately notify the postmaster nds com and destroy the original message. ***********************************************************************************
Attachment:
fastutf8.c
Description: fastutf8.c