gettext() translations, encoding and Glib::locale_to_utf8()



As I was testing the French translations of GUI texts in one of our applications the other day, I noticed that the software has some encoding issues. Essentially, I was doing

   LANG=FR_fr myApplication

(on the command line), and got a lot of messages of the form

   (myApplication:28074): Pango-WARNING **: Invalid UTF-8 string passed
   to pango_layout_set_text()

The reason for this was that the program has a lot of calls like:

   Gtk::Label label;
   ...
   label.set_text(_("Data"));

- as well as some similar calls for text columns in a TreeView. The "_(String)" macro is defined "the usual way", i.e. as gettext(String). With the above LANG setting, this apparently means passing an ISO-8859-1 string to set_text(): ISO-8859-1 is the standard encoding for the "FR_fr" setting, and therefore also the encoding gettext() uses for the strings it returns (or so it seems). Gtk::Label::set_text(), on the other hand, expects UTF-8. If the translated text contains accented letters, their ISO-8859-1 bytes do not form valid UTF-8 sequences, which is what triggers the Pango warning.

If, on the other hand, I enter

   LANG=FR_fr.utf8 myApplication

everything is OK, because then the locale encoding is UTF-8, so this is what _() and gettext() (also) return.

I suppose you knew that already. Now the question(s):

What do you reckon is the best way to deal with this situation? Should I, for instance, wrap all _() calls in Glib::locale_to_utf8()? Or would it be better to define _(String) as Glib::locale_to_utf8(gettext(String))? Or is there another way I can call gettext(), so as to get it to return UTF-8 encoded strings? Or can I safely ignore the whole issue, based on the assumption that the encoding will be set to utf8 in a more "real" setup? What do other people do?
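For the third option, one candidate is GNU gettext's bind_textdomain_codeset(), which asks gettext() to convert the catalog strings to a fixed encoding regardless of the locale's own codeset. The following is only a sketch of how that might look, not something I have tried in the application: the domain name "myApplication" and the helper init_translations() are made up for illustration, and the locale directory is whatever the program already passes to bindtextdomain().

```cpp
#include <cassert>
#include <clocale>
#include <libintl.h>
#include <string>

// Hypothetical text domain; use the name the application's .mo files are installed under.
static const char* const kTextDomain = "myApplication";

// The usual definition of the translation macro.
#define _(String) gettext(String)

// Hypothetical startup helper: call once, before the first _() lookup.
inline void init_translations(const char* locale_dir)
{
    assert(locale_dir != nullptr);

    // Pick up LANG / LC_* from the environment, as before.
    std::setlocale(LC_ALL, "");

    // Tell gettext where the message catalogs live.
    bindtextdomain(kTextDomain, locale_dir);

    // Request UTF-8 output for this domain, independent of the locale's
    // codeset (e.g. ISO-8859-1 for a non-UTF-8 French locale).
    bind_textdomain_codeset(kTextDomain, "UTF-8");

    // Make this the default domain for plain gettext() calls.
    textdomain(kTextDomain);
}

// Convenience wrapper returning the translation as a std::string.
inline std::string translated(const char* msgid)
{
    return std::string(_(msgid));
}
```

With something like this in place, calls such as label.set_text(_("Data")) would stay unchanged. If no catalog is found for the domain, gettext() simply returns the original string.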

- Toralf


