RE: Just a few UTF8 questions...
- From: martyn 2 russell bt com
- To: maclas gmx de, gtk-app-devel-list gnome org
- Subject: RE: Just a few UTF8 questions...
- Date: Wed, 9 Jul 2003 09:45:47 +0100
> > Also, if I read in from a socket to a gchar buffer[1024] and I then
> > proceed to print that information in the form
> >
> > g_message("socket input: %*s", bytes, buffer);
> >
> > Does the * represent how many characters or bytes are printed from the
> > buffer?
>
> There was a thread about this in gtk-list in March:
>
> http://mail.gnome.org/archives/gtk-list/2003-March/msg00007.html
>
> The answers were:
>
> a) The way GLib uses UTF-8 together with printf has the unfortunate effect
> that the precision operates on bytes rather than characters.
>
> b) Glibc has a "feature" where %Ns actually checks for a whole
> number of characters in the current encoding. So, unless you
> are sure you are always going to be in an UTF-8 locale, avoid
> using %Ns. (You are basically OK for iso-8859-1, but will
> have problems in say, a Japanese locale.)
If I receive data from a GLib IO channel, it should be UTF-8, right?
If what Owen says is true, then as I understand it, printf treats the * as a
number of bytes and GLib's implementation treats it as a number of characters.
So if I receive a buffer filled with Russian characters, then my buffer[1024]
is full of multibyte characters. Using GLib's implementation means that I would
be attempting to print 1024 characters when in fact there may only be 900.
That would explain why it crashes, but never when the data is in English.
Do you agree?
So can I presume that printing without the * would be the fix?
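To make the byte versus character question concrete, here is a minimal sketch in plain C (no GLib), offered under a few assumptions rather than as a definitive answer. In standard printf, "%*s" only sets a minimum field width and keeps reading until a NUL byte, while "%.*s" sets a precision, that is, a maximum number of bytes to print. The helper names utf8_char_count and format_socket_input are hypothetical, and the sketch assumes the buffer holds valid UTF-8 and that the program runs in the default "C" locale; the multibyte-locale behaviour described in answer (b) is not covered.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: count the UTF-8 characters in the first n bytes of
 * buf by counting every byte that is not a continuation byte (10xxxxxx).
 * Assumes buf holds valid UTF-8. */
static size_t utf8_char_count(const char *buf, size_t n)
{
    size_t count = 0;

    assert(buf != NULL);
    for (size_t i = 0; i < n; i++) {
        if (((unsigned char)buf[i] & 0xC0) != 0x80)
            count++;
    }
    return count;
}

/* Hypothetical helper: format at most n bytes of buf into out.
 * "%.*s" passes n as the precision (a byte limit), so buf does not need to
 * be NUL-terminated. "%*s" would pass n as a minimum field width instead
 * and would read past the received data until it found a NUL byte. */
static int format_socket_input(char *out, size_t out_size,
                               const char *buf, size_t n)
{
    assert(out != NULL && buf != NULL);
    return snprintf(out, out_size, "socket input: %.*s", (int)n, buf);
}
```

With a 12-byte buffer holding six two-byte Russian letters, utf8_char_count(buf, 12) gives 6, while format_socket_input(out, sizeof out, buf, 12) copies exactly those 12 bytes and stops, whatever follows them in the buffer. Passing the byte count returned by the read call as the precision is the design choice here, since it keeps the limit in the same unit that printf uses.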
Regards,
Martyn