RE: Just a few UTF8 questions...



Also, if I read in from a socket to a gchar buffer[1024] and I then 
proceed to print that information in the form 
  
  g_message("socket input: %*s", bytes, buffer);

Does the * represent how many characters or bytes are printed from the
buffer?

There was a thread about this in gtk-list in March:

http://mail.gnome.org/archives/gtk-list/2003-March/msg00007.html

The answers were:

a) The way GLib uses UTF-8 together with printf has the unfortunate effect
   that the precision operates on bytes rather than characters.

b) Glibc has a "feature" where %Ns actually checks for a whole 
   number of characters in the current encoding. So, unless you
   are sure you are always going to be in a UTF-8 locale, avoid
   using %Ns. (You are basically OK for iso-8859-1, but will
   have problems in, say, a Japanese locale.)
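
Point (a) can be checked with plain C, without GLib. The sketch below is
only an illustration: format_prefix is a hypothetical helper name, and it
uses the precision form %.*s with snprintf. For %s, the C standard defines
the precision as a maximum number of bytes, so asking for 3 bytes of a
string made of two-byte Cyrillic characters stops in the middle of the
second character.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper (not part of GLib): copy at most `bytes` bytes of
   `src` into `dst` through a printf precision, and return the number of
   bytes that were formatted. */
static int format_prefix(char *dst, size_t dst_size, const char *src, int bytes)
{
    assert(dst != NULL && src != NULL && bytes >= 0);
    /* The precision given through ".*" limits bytes, not characters. */
    return snprintf(dst, dst_size, "%.*s", bytes, src);
}
```

Calling format_prefix(out, sizeof out, "\xD0\x9F\xD1\x80", 3) on a
16-byte out buffer copies the two bytes of the first character plus only
the lead byte of the second one.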

If I receive information from a GLib IO channel, it should be UTF-8,
right?


If what Owen says is true, as I understand it, printf uses * for the number
of bytes and GLib's implementation uses it for the number of characters.

No. Owen is talking about glibc, and the precision is always the number of
bytes (unless you use wprintf and wide characters). The feature Owen means is
that glibc checks that the bytes to be printed form a valid sequence of
characters in the encoding of the selected locale (i.e. that the byte array
doesn't end in the middle of a multibyte character).
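
If a byte limit has to be kept, one way to avoid ending in the middle of a
multibyte character is to move the cut back to a character boundary first.
The following is a minimal sketch in plain C, assuming the buffer holds
valid UTF-8; utf8_prefix_len is a hypothetical name, not a GLib function.
It relies on the fact that UTF-8 continuation bytes always have the bit
pattern 10xxxxxx.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: return the largest length <= max_bytes at which
   `buf` (holding `len` bytes of valid UTF-8) can be cut without
   splitting a multibyte character. */
static size_t utf8_prefix_len(const char *buf, size_t len, size_t max_bytes)
{
    size_t n = max_bytes < len ? max_bytes : len;

    assert(buf != NULL);
    /* If the byte just after the cut is a continuation byte (10xxxxxx),
       the cut lies inside a character, so step back to its lead byte. */
    while (n > 0 && n < len && ((unsigned char)buf[n] & 0xC0) == 0x80)
        n--;
    return n;
}
```

The result can then be passed as the precision of %.*s. Note that this
does not help when the received data itself stops in the middle of a
character, which can happen with raw socket reads.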


So if I receive a buffer filled with Russian characters, then my
buffer[1024] is FULL of multibyte characters.  Using GLib's implementation
means that I would be attempting to print 1024 characters when in fact there
may only be 900.  This would be why it is causing a crash, but never when
the information is in English.  Do you agree?

IO channels do in fact return UTF-8. For the rest, see above.

So I can presume that printing WITHOUT the * would be the fix?

The simplest solution would certainly be to nul-terminate the byte array and
omit the precision.
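
A minimal sketch of that approach in plain C follows. terminate_input is a
hypothetical helper name, and the variable names buffer and bytes are taken
from the question. One more thing worth checking: in standard printf, %*s
sets a minimum field width and does not limit how much is read; the
precision form is %.*s. With only a width, printf keeps reading until it
finds a nul byte, which may by itself explain a crash on an unterminated
buffer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: nul-terminate the bytes received into `buf` so it
   can be printed with a plain "%s". `cap` is the total size of the
   buffer, `received` the byte count reported by the read call. */
static const char *terminate_input(char *buf, size_t cap, size_t received)
{
    assert(buf != NULL && cap > 0);
    if (received >= cap)
        received = cap - 1;  /* buffer was full: the last byte is dropped */
    buf[received] = '\0';
    return buf;
}
```

Reading at most sizeof buffer - 1 bytes leaves room for the terminator, so
nothing is dropped. The call from the question then becomes

  g_message("socket input: %s", terminate_input(buffer, sizeof buffer, bytes));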

Matthias





