Re: Valid UTF-8 text mangled up in GtkLabel



Hi Behdad, Gaurav,
IMHO this is *the* classic example of a misapplied "bidi algorithm", that is, of the heuristic that determines the base direction from the first strong directional character. There is no reason ever to use this heuristic in any normal GUI application. In almost all Arabic, Farsi and Hebrew apps you either know the base direction from the outset or want to be able to set it explicitly. For example, email addresses are always LTR, and in text editing the direction must always be user-selectable. It would be good to implement the same directional model in Gtk that Sun implemented in Java Swing TextComponent, though I appreciate that this would require a lot of work.
My 2c,

 - yba



On Fri, 5 Aug 2005, Behdad Esfahbod wrote:

On Fri, 5 Aug 2005, Gaurav Jain wrote:

Hi,

I'm trying to set the text in a GtkLabel to a UTF-8 string, which
contains some Arabic characters first, followed by my email address in
angle brackets, followed by my name in round brackets.  For example, a
sample value is:

X <gaurav somewhere com> (Gaurav Jain)

In the above, 'X' represents a valid sequence of Arabic UTF-8
characters.  The problem that I see is that when I run this program
(appended to this mail), the output shown is something like this:

(gaurav somewhere com> (Gaurav Jain> X

Note that the angle and round brackets are all messed up, and that
the order of the Arabic and ASCII words is also wrong.

Apparently our mileages do vary ;).


Does anyone know WHY this is happening?

Yes, because Arabic is written from right to left, unlike Latin.
And this behavior is part of the Unicode standard.

Just for information, I'm using GTK 2.4.14.  Also,
I was surprised to discover that this works fine with an older version
of GTK (2.0.9).

Right.  Because /bidi/ was not implemented completely in that
version.  This specific part of bidi that is causing problems for
you is called automatic paragraph direction.
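
To make the heuristic concrete, here is a minimal sketch in C of picking
a paragraph's base direction from its first strong character.  This is
not Pango's code: the names base_dir, decode_utf8 and
first_strong_direction are made up for illustration, the input is
assumed to be valid NUL-terminated UTF-8, and only ASCII letters, the
basic Hebrew and Arabic letter ranges, and the LRM/RLM marks are treated
as strong.  The real Unicode algorithm consults the full bidi class
tables.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result type for this sketch. */
typedef enum { DIR_NEUTRAL, DIR_LTR, DIR_RTL } base_dir;

/* Decode one UTF-8 sequence at s into *cp and return its length in bytes.
 * Assumes the input is valid UTF-8; no error checking is done. */
static size_t decode_utf8(const unsigned char *s, uint32_t *cp)
{
    if (s[0] < 0x80) {
        *cp = s[0];
        return 1;
    }
    if ((s[0] & 0xE0) == 0xC0) {
        *cp = ((uint32_t)(s[0] & 0x1F) << 6) | (uint32_t)(s[1] & 0x3F);
        return 2;
    }
    if ((s[0] & 0xF0) == 0xE0) {
        *cp = ((uint32_t)(s[0] & 0x0F) << 12) |
              ((uint32_t)(s[1] & 0x3F) << 6) |
              (uint32_t)(s[2] & 0x3F);
        return 3;
    }
    *cp = ((uint32_t)(s[0] & 0x07) << 18) |
          ((uint32_t)(s[1] & 0x3F) << 12) |
          ((uint32_t)(s[2] & 0x3F) << 6) |
          (uint32_t)(s[3] & 0x3F);
    return 4;
}

/* Return the direction of the first strong character in text, or
 * DIR_NEUTRAL if none is found.  Simplified: only a few ranges count. */
base_dir first_strong_direction(const char *text)
{
    const unsigned char *p = (const unsigned char *)text;

    while (*p != '\0') {
        uint32_t cp;
        p += decode_utf8(p, &cp);

        if ((cp >= 'A' && cp <= 'Z') || (cp >= 'a' && cp <= 'z'))
            return DIR_LTR;                 /* ASCII letters */
        if (cp == 0x200E)
            return DIR_LTR;                 /* LEFT-TO-RIGHT MARK */
        if (cp == 0x200F)
            return DIR_RTL;                 /* RIGHT-TO-LEFT MARK */
        if (cp >= 0x05D0 && cp <= 0x05EA)
            return DIR_RTL;                 /* Hebrew letters */
        if (cp >= 0x0620 && cp <= 0x064A)
            return DIR_RTL;                 /* Arabic letters */
        /* Anything else (digits, brackets, spaces) is skipped here. */
    }
    return DIR_NEUTRAL;
}
```

A string that begins with an Arabic letter, like the sample above, comes
out as DIR_RTL under this rule, so the whole paragraph, brackets
included, is laid out right to left.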

Does something special need to be done so I can get
it to work with GTK >= 2.4?

It /is/ working.  If you would like behavior similar to the
old one, you need to insert a U+200E LEFT-TO-RIGHT MARK character
at the beginning of the buffer.
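
A minimal sketch of that workaround in plain C, assuming the label text
is a NUL-terminated UTF-8 string.  The helper name with_ltr_mark and the
macro LRM_UTF8 are made up for this example.  U+200E is encoded in UTF-8
as the three bytes E2 80 8E.

```c
#include <stdlib.h>
#include <string.h>

/* UTF-8 encoding of U+200E LEFT-TO-RIGHT MARK. */
#define LRM_UTF8 "\xE2\x80\x8E"

/* Return a newly allocated copy of text with U+200E prepended, or NULL
 * if allocation fails.  The caller must free() the result. */
char *with_ltr_mark(const char *text)
{
    size_t mark_len = strlen(LRM_UTF8);
    size_t text_len = strlen(text);
    char *out = malloc(mark_len + text_len + 1);

    if (out == NULL)
        return NULL;

    memcpy(out, LRM_UTF8, mark_len);
    /* Copy the text together with its terminating NUL byte. */
    memcpy(out + mark_len, text, text_len + 1);
    return out;
}
```

The result can then be handed to gtk_label_set_text(), which makes its
own copy of the string, so the buffer can be freed right afterwards.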


Thanks,
Gaurav

--behdad
http://behdad.org/
_______________________________________________
gtk-i18n-list mailing list
gtk-i18n-list gnome org
http://mail.gnome.org/mailman/listinfo/gtk-i18n-list


--
 EE 77 7F 30 4A 64 2E C5  83 5F E7 49 A6 82 29 BA    ~. .~   Tk Open Systems
=}------------------------------------------------ooO--U--Ooo------------{=
     - yba tkos co il - tel: +972.2.679.5364, http://www.tkos.co.il -


