Re: Glib::ustring tradeoffs?



On Saturday 29 October 2005 09:49, Matthias Kaeppler wrote:

> I'd have another question though. First, what's the difference between
> these two conversions:
>
> std::string ascii = "text";
>
> // what's the difference between this ...
> Glib::ustring unicode = ascii;
>
> // ... and this?
> Glib::ustring unicode = Glib::locale_to_utf8(ascii);

In this case the result is the same, provided the locale charset (and the
charset in which you have written your source code) is ASCII-compatible (e.g.
one of the ISO 8859 or Microsoft charsets), because ASCII characters are
valid UTF-8, and the string "text" is valid ASCII and so valid UTF-8.  The
first form copies the bytes unchanged; the second converts them from the
locale charset, which leaves pure ASCII text as it was.

If you are using Unix or Windows you can assume your charset is
ASCII-compatible.  So Glib::locale_to_utf8() will return the string unchanged
in your example.
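
To illustrate why "text" passes through unchanged, here is a minimal sketch
in plain standard C++ (no glibmm).  The helper names is_ascii() and
looks_like_utf8() are made up for this example and are not part of any
library, and the UTF-8 check is deliberately simplified: it only looks at the
lead/continuation byte pattern and does not reject overlong forms.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: true if every byte is 7-bit ASCII (0x00-0x7F).
bool is_ascii(const std::string& s) {
    for (unsigned char c : s) {
        if (c > 0x7F) return false;
    }
    return true;
}

// Hypothetical helper: simplified structural UTF-8 check.  It verifies
// only that each lead byte is followed by the right number of
// continuation bytes (10xxxxxx).
bool looks_like_utf8(const std::string& s) {
    std::size_t i = 0;
    while (i < s.size()) {
        unsigned char c = static_cast<unsigned char>(s[i]);
        std::size_t extra;
        if (c <= 0x7F)               extra = 0;  // ASCII byte
        else if ((c & 0xE0) == 0xC0) extra = 1;  // 2-byte sequence
        else if ((c & 0xF0) == 0xE0) extra = 2;  // 3-byte sequence
        else if ((c & 0xF8) == 0xF0) extra = 3;  // 4-byte sequence
        else return false;                       // stray continuation byte
        if (i + extra >= s.size()) return false; // sequence is truncated
        for (std::size_t j = 1; j <= extra; ++j) {
            unsigned char cont = static_cast<unsigned char>(s[i + j]);
            if ((cont & 0xC0) != 0x80) return false;
        }
        i += extra + 1;
    }
    return true;
}
```

Any string for which is_ascii() is true also passes looks_like_utf8(), which
is the property relied on above.  A Latin-1 string such as "caf\xE9" fails
both checks.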

> Furthermore, do I have to wrap every line of text into a call to
> Glib::locale_to_utf8()? For example if I define a widget text somewhere,
> do I have to do it like this:
>
> #define WIDGET_LABEL Glib::locale_to_utf8("BUTTON")
> ...
> widget.set_text(WIDGET_LABEL);
>
> and what could happen if I don't make the conversion?

You need Glib::locale_to_utf8() if you want to put characters which are in 
your locale charset (where that might not be UTF-8) into something requiring 
UTF-8, such as a GTK+/gtkmm widget.  This is only likely to happen if you are 
reading from an external source, such as a file which contains data in the 
locale charset.  Since in your example you are hardwiring the text into the 
source code, the best thing is not to convert the text, but to write the 
hardwired string in UTF-8 in the first place.  (As it happens, "BUTTON" is 
valid ASCII and therefore valid UTF-8.)  If you pass text which is not valid 
UTF-8 to a widget without converting it, the non-ASCII characters will not be 
displayed correctly.
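
As a rough picture of what such a conversion does, here is a sketch in plain
standard C++ which assumes, purely for illustration, that the locale charset
is ISO 8859-1.  The function latin1_to_utf8() is a hypothetical stand-in and
not how Glib::locale_to_utf8() is implemented (the real call works with
whatever the locale charset happens to be).

```cpp
#include <string>

// Hypothetical stand-in for a locale-to-UTF-8 conversion, assuming the
// locale charset is ISO 8859-1 (Latin-1).
std::string latin1_to_utf8(const std::string& in) {
    std::string out;
    out.reserve(in.size() * 2);
    for (unsigned char c : in) {
        if (c < 0x80) {
            // ASCII bytes are already valid UTF-8: copy unchanged.
            out.push_back(static_cast<char>(c));
        } else {
            // Latin-1 0x80-0xFF maps to U+0080-U+00FF, which UTF-8
            // encodes as two bytes: 110000xx 10xxxxxx.
            out.push_back(static_cast<char>(0xC0 | (c >> 6)));
            out.push_back(static_cast<char>(0x80 | (c & 0x3F)));
        }
    }
    return out;
}
```

With this, "BUTTON" comes back byte for byte identical, whereas the Latin-1
string "caf\xE9" becomes "caf\xC3\xA9".  The unconverted byte 0xE9 on its own
is not valid UTF-8, which is why skipping the conversion only goes wrong for
non-ASCII text.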

Normally if you are hardwiring text into your source code, such as the text 
for a label or button, you would enable translation into different languages 
by using gettext() and intltool.  You can ensure that gettext() provides the 
translated text in valid UTF-8 format by calling 
bind_textdomain_codeset([prog_name], "UTF-8") after the call to 
bindtextdomain().  The text can then be put directly into a GTK+/gtkmm 
widget.  (In that case, calling Glib::locale_to_utf8() on the text returned 
by gettext() would be positively wrong: the text is already UTF-8, so an 
incorrect second conversion will be attempted if the locale charset is not 
UTF-8, and the call will do nothing if the locale charset is UTF-8.)
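
The call sequence described above might look something like the following
sketch.  It assumes a system where the gettext functions are declared in
<libintl.h> (as with glibc; other systems may need to link against libintl).
The wrapper names init_translations() and translated(), and whatever domain
name and locale directory you pass in, are placeholders for illustration.

```cpp
#include <clocale>
#include <libintl.h>
#include <string>

// Hypothetical wrapper: set up gettext so that translations for `domain`
// are looked up under `locale_dir` and always returned as UTF-8.
void init_translations(const char* domain, const char* locale_dir) {
    std::setlocale(LC_ALL, "");                // adopt the user's locale
    bindtextdomain(domain, locale_dir);        // where the .mo files live
    bind_textdomain_codeset(domain, "UTF-8");  // ask for UTF-8 output
    textdomain(domain);                        // make it the default domain
}

// Hypothetical wrapper: look up a message.  The result is already UTF-8,
// so it must NOT be passed through Glib::locale_to_utf8().
std::string translated(const char* msgid) {
    return gettext(msgid);
}
```

If no translation catalogue is installed for the domain, gettext() simply
returns the original string, so translated("BUTTON") gives "BUTTON", which
can go straight into widget.set_text().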

Similarly, if you are writing a program which needs to store data to file, it 
is best to store it in UTF-8.  You will then not need to do any code 
conversion when reading the file contents into a GTK+ widget, and the file 
will be entirely portable (it will work whatever locale the program user 
happens to have adopted on his/her system).

Practically all GTK+/gtkmm widgets from which text is obtained (such as 
Gtk::Entry) provide it in UTF-8 encoding, but there are exceptions.  
Gtk::FileChooser passes a file or folder name in the locale's filename 
encoding so it can be passed straight to Unix/Windows open() (but the 
filename text can be converted to UTF-8 with Glib::filename_to_utf8(), and 
vice-versa with Glib::filename_from_utf8(), if wanted).

Chris



