Re: BUG FIX (was: Re: 0.99.7 Entry (editable?) bug)

From: Owen Taylor <owt1 cornell edu>
To: Tony Gale <gale daedalus dera gov uk>
Cc: gtk-list redhat com
Subject: Re: BUG FIX (was: Re: 0.99.7 Entry (editable?) bug)
Date: 16 Mar 1998 16:23:24 -0500

Tony Gale <gale@daedalus.dera.gov.uk> writes:

> It seems the complicated backward character calculations were to
> accommodate the possibility of the entry text having characters of
> different widths.
> 
> I can't see how this can happen, however, so I've put together a
> patch assuming that all the characters in the text entry are the
> same width, and this fixes the problem I reported.
> 
> I'm not saying that this is 100% correct, but I'd like to hear why it
> isn't. I have only changed the gtk_move_backward_character function
> as an example, but the other functions that use
> move_backward_character will also probably need some work (e.g.
> gtk_move_backward_word)

This is 100% incorrect, sorry. ;-)

For instance, in the common EUC encoding, a character
can be either a single byte < 0x80 or two bytes, each >= 0x80.
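
To make the variable width concrete, here is a minimal sketch that
counts characters in an EUC string. It assumes only the simplified
two-width model described above (real EUC variants have further cases),
and the names euc_char_len and euc_char_count are hypothetical, not
part of GTK.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: byte length of the EUC character starting at s.
 * Assumes the simplified model: a byte < 0x80 stands alone, and a
 * byte >= 0x80 starts a two-byte character. */
static size_t euc_char_len(const unsigned char *s)
{
    assert(s != NULL);
    return (s[0] < 0x80) ? 1 : 2;
}

/* Count characters (not bytes) in a NUL-terminated EUC string. */
static size_t euc_char_count(const unsigned char *s)
{
    size_t n = 0;

    assert(s != NULL);
    while (*s != '\0') {
        size_t len = euc_char_len(s);

        /* A truncated pair at the end is counted as one lone byte. */
        if (len == 2 && s[1] == '\0')
            len = 1;
        s += len;
        n++;
    }
    return n;
}
```

A four-byte string holding 'a', one two-byte character and 'b' counts
as three characters, so a byte offset and a character offset differ
as soon as any two-byte character appears.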

Other multibyte encodings (such as UTF-8) can be much more
complicated, and multibyte encodings can even include shift
sequences that switch between single-byte and double-byte
characters (for instance, ISO-2022-JP). To accommodate the
last possibility, the only correct thing to do is to
start from the beginning of the string.
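
A rough sketch of the "start from the beginning" approach, using only
standard C mblen(). The function name prev_char_start is hypothetical,
and the code assumes the caller has already selected the locale with
setlocale(LC_CTYPE, ""). Treating an invalid sequence as a one-byte
step is my own fallback, not something the C standard prescribes.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return the byte offset at which the character
 * preceding byte offset `pos` starts, found by walking forward from
 * the beginning of `text`.  Returns 0 when pos is 0. */
static size_t prev_char_start(const char *text, size_t pos)
{
    size_t cur = 0;   /* start of the character being examined */
    size_t prev = 0;  /* start of the character before `cur` */

    assert(text != NULL);
    assert(pos <= strlen(text));

    /* Reset mblen()'s internal shift state before scanning. */
    (void) mblen(NULL, 0);

    while (cur < pos) {
        /* Never look at bytes at or beyond `pos`. */
        int len = mblen(text + cur, pos - cur);

        /* Assumption: on an invalid or incomplete sequence (-1) or an
         * embedded NUL (0), advance by a single byte. */
        if (len <= 0)
            len = 1;

        prev = cur;
        cur += (size_t) len;
    }
    return prev;
}
```

The cost is linear in the cursor position for every backward move,
which is the price of being correct for stateful encodings.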

I think it might be correct, for all multibyte encodings
commonly used by X, to do:

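  while (position > 0)
    {
      position--;
      /* mblen() takes a pointer and returns -1 for an invalid
         sequence, so stop only where a valid character starts. */
      if (mblen(&entry->text[position], MB_CUR_MAX) > 0)
         break;
    }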

But that certainly isn't valid for all multibyte encodings.

Regards,
                                        Owen

References:
- BUG FIX (was: RE: [gtk-list] RE: 0.99.7 Entry (editable?) bug)
  - From: Tony Gale
