Re: g_utf8_offset_to_pointer() optimization
- From: Luis Menina <liberforce fr st>
- To: Behdad Esfahbod <behdad cs toronto edu>
- Cc: performance-list gnome org, Federico Mena Quintero <federico novell com>
- Subject: Re: g_utf8_offset_to_pointer() optimization
- Date: Thu, 03 Nov 2005 04:16:53 +0100
OK, I've checked my code, and I think you're wrong.
Because I pre-increment the pointer, the first byte is never checked (I
assume I'm not starting in the middle of a character). So in this case
(offset == 1) the loop keeps going until the first byte that doesn't
match the "10xx xxxx" continuation pattern... which is the null byte!
Only then is offset decremented, so the function returns str + 2 and
everything goes smoothly...
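The argument above can be checked mechanically. Here is a minimal sketch: it copies the loop of the function quoted further down, but uses plain char/long instead of gchar/glong so that it builds without GLib. The names offset_to_pointer1, skipped_bytes and self_check are made up for this illustration and are not part of GLib.

```c
#include <assert.h>
#include <stddef.h>

/* Same loop as the quoted g_utf8_offset_to_pointer1, with plain C types
   standing in for gchar/glong (an assumption, to avoid needing GLib). */
static char *offset_to_pointer1(const char *str, long offset)
{
    while (offset) {
        /* Pre-increment: the byte at the starting position is never
           tested. The counter drops only on a byte that is NOT a
           10xxxxxx continuation byte; the terminating NUL qualifies. */
        if ((*++str & 0xC0) != 0x80)
            --offset;
    }
    return (char *)str;
}

/* Hypothetical helper: how many bytes were skipped for `offset` characters. */
static ptrdiff_t skipped_bytes(const char *str, long offset)
{
    return offset_to_pointer1(str, offset) - str;
}

/* Hypothetical self-check covering the disputed case and a mixed string. */
static void self_check(void)
{
    /* U+00A0 NO-BREAK SPACE: two bytes, one character. */
    assert(skipped_bytes("\xC2\xA0", 1) == 2);

    /* 'a' (1 byte), U+00E9 (2 bytes), U+20AC (3 bytes). */
    const char *mixed = "a\xC3\xA9\xE2\x82\xAC";
    assert(skipped_bytes(mixed, 0) == 0);
    assert(skipped_bytes(mixed, 1) == 1);
    assert(skipped_bytes(mixed, 2) == 3);
    assert(skipped_bytes(mixed, 3) == 6);
}
```

Calling self_check() from a main() should pass silently if the reasoning holds. Note that the sketch assumes, like the original, that str starts on a character boundary and that offset does not exceed the number of characters in the string; otherwise the loop reads past the terminating null byte.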
BTW, I've tried to use Federico's pango benchmark tools (
http://primates.ximian.com/~federico/news-2005-10.html#25 ), but I'm
left with an error...
After the "import cairo" error (solved by installing pycairo), I get
this error, which I can't resolve as I'm no Python guru:
================
python ./plot-languages.py -o chart.png test1.xml
Traceback (most recent call last):
  File "./plot-languages.py", line 373, in ?
    main ()
  File "./plot-languages.py", line 367, in main
    rset = ResultSet (file)
  File "./plot-languages.py", line 47, in __init__
    self.parse (filename)
  File "./plot-languages.py", line 63, in parse
    self.parse_language_node (l)
  File "./plot-languages.py", line 78, in parse_language_node
    time = float_from_node (child)
  File "./plot-languages.py", line 32, in float_from_node
    return float (c.nodeValue)
ValueError: invalid literal for float(): 11,560000
=================
Thanks to anyone who can help me...
Behdad Esfahbod wrote:
On Wed, 2 Nov 2005, Luis Menina wrote:
Can you give me more info about what is wrong with my function?
I don't understand what you mean by "it doesn't pass over the tail of
the last characters"
Your code fails if the last character to be skipped is a multibyte
one. Suppose this is the input:
str = "\xC2\xA0"
offset = 1
which is the U+00A0 NO-BREAK SPACE. The output should be str +
2, but your code returns str + 1.
behdad
gchar * g_utf8_offset_to_pointer1 (const gchar *str,
                                   glong        offset)
{
  while (offset)
    {
      if ((*++str & 0xC0) != 0x80)
        --offset;
    }
  return (gchar *)str;
}
--behdad
http://behdad.org/
"Commandment Three says Do Not Kill, Amendment Two says Blood Will Spill"
-- Dan Bern, "New American Language"