Re: [Vala] how can I get the number of unicode points in a string?



On 04/03/2011 06:08 AM, ?????? wrote:
From: "??????"<pharaoh456 163 com>
Date: 2011-04-03 18:15:12
To: "Luca Bruno"<lethalman88 gmail com>
Subject: Re:Re: [Vala] how can I get the number of unicode points in a string?

At 2011-04-03 16:06:32, "Luca Bruno"<lethalman88 gmail com>  wrote:

On Sun, Apr 03, 2011 at 03:59:23PM +0800, ?????? wrote:
I see that since Vala 0.11.0, string.length returns the number of bytes rather than the number of Unicode characters,
and string[i] returns only one byte. I wonder how to deal with East Asian character strings.
There are other methods in string that deal with UTF-8, for example
char_count() and next_char().

Thank you.
I found char_count(), get_char() and next_char() in the GTK+ documentation.
It looks like these methods are not covered in the Vala tutorial or documentation.
Is there something like string[i] for indexed access to UTF-8 characters? I couldn't find it in the docs.

To get the i-th character, you could do this:

str.get_char(str.index_of_nth_char(i));
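
As a minimal sketch of that lookup, together with the byte count versus character count from the original question (the helper name nth_char and the sample string are only illustrative, not part of Vala's API; the source file is assumed to be saved as UTF-8):

```vala
// Hypothetical helper: returns the i-th Unicode character of a UTF-8 string.
// index_of_nth_char() walks from the start of the string, so each call
// costs time proportional to i.
unichar nth_char (string str, long i) {
    return str.get_char (str.index_of_nth_char (i));
}

void main () {
    string str = "日本語";   // 3 characters, 3 bytes each in UTF-8

    // length counts bytes; char_count() counts Unicode characters
    stdout.printf ("bytes: %d, characters: %d\n", str.length, str.char_count ());

    // Character at index 1 (the second character)
    stdout.printf ("second character: %s\n", nth_char (str, 1).to_string ());
}
```

This is fine for an occasional lookup, but not inside a loop over the whole string.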

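But the current string methods are designed for iteration by byte offsets, not character indices: index_of_nth_char(i) has to scan from the start of the string on every call, and char_count() scans the whole string. So you should *not* do this, which will be inefficient (quadratic in the length of the string):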

for (int i = 0 ; i < str.char_count() ; ++i) // don't do this
    str.get_char(str.index_of_nth_char(i));

Instead, you want to iterate over the string using get_char() and next_char(). This is slightly inconvenient since these functions use pointers rather than integer offsets. In Vala trunk, Jürg has just committed a new method string.get_next_char() which will make it easier to iterate over strings:

// in class string
public bool get_next_char (ref int index, out unichar c);

That isn't in any Vala release yet, though. (In the meantime, you might be able to copy and paste his implementation from glib-2.0.vapi in Vala trunk.)
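
A rough sketch of how such a loop might look, assuming a Vala version in which string.get_next_char() is available with the signature quoted above (the helper name to_chars and the sample string are made up for illustration):

```vala
// Hypothetical helper: decodes a UTF-8 string into an array of characters
// in a single pass. Assumes string.get_next_char (ref int, out unichar)
// exists as quoted above.
unichar[] to_chars (string str) {
    unichar[] result = {};
    int index = 0;      // byte offset, advanced by get_next_char()
    unichar c;
    while (str.get_next_char (ref index, out c)) {
        result += c;
    }
    return result;
}

void main () {
    unichar[] chars = to_chars ("日本語");

    // After the one-time decode, indexing by character is a plain array access
    for (int i = 0; i < chars.length; i++) {
        stdout.printf ("%d: %s\n", i, chars[i].to_string ());
    }
}
```

If you only need to visit each character once, the while loop on its own is enough. Decoding into an array is worth it only when you need repeated random access by character index.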

adam



