Why does gtk_text_buffer_backspace() normalize the cluster on backspace?

I was trying to figure out why backspace does not delete the last character (an accent) in the buffer when entering Hebrew text with accents, and I stumbled upon the reason in gtk+/gtk/gtktextbuffer.c:gtk_text_buffer_backspace():

      if (backspace_deletes_character)
        {
          gchar *normalized_text = g_utf8_normalize (cluster_text,
                                                     strlen (cluster_text),
                                                     G_NORMALIZE_NFD);
          glong len = g_utf8_strlen (normalized_text, -1);

          if (len > 1)
            gtk_text_buffer_insert_interactive (buffer,
                                                &start,
                                                normalized_text,
                                                g_utf8_offset_to_pointer (normalized_text, len - 1) - normalized_text,
                                                default_editable);

          g_free (normalized_text);
        }

And there's the crux. Why is the cluster normalized through the call to g_utf8_normalize()? If backspace should not simply delete the last character in the buffer, shouldn't its behavior be language dependent, perhaps as part of the Pango language module? In any case, for Hebrew the current behavior is not logical, as some accents imo bind more tightly to the base letter than others. E.g. when inserting:

   U+05D1 Hebrew Letter Bet
   U+05BC Hebrew Point Dagesh or Mapiq
   U+05B8 Hebrew Point Qamats

the dotting of the Bet (Mapiq) logically binds more tightly than the vowel mark Qamats (to such an extent that fonts often provide a special glyph for the Bet/Mapiq combination), but backspace currently erases the Mapiq first. The reason is probably that Mapiq has a higher Unicode code point than Qamats... This breaks, for example, the OpenType Bet/Mapiq ligature, as the two characters are no longer adjacent after normalization. Of course one may build more sophisticated OpenType tables, but this seems quite roundabout...
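
To make the reordering concrete, here is a minimal sketch of what I believe happens to the cluster. As far as I can tell, the order after NFD normalization comes from the canonical ordering step, which sorts combining marks by their canonical combining class (Qamats is 18, Dagesh/Mapiq is 21) rather than by code point value. This is not the GTK code: combining_class(), canonical_order() and simulate_backspace() are hypothetical helpers written for illustration, and the table covers only the three code points above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Canonical combining classes for the code points in this example only
 * (assumption: a real implementation consults the full Unicode database). */
int combining_class(uint32_t cp)
{
    switch (cp) {
    case 0x05B8: return 18;  /* HEBREW POINT QAMATS */
    case 0x05BC: return 21;  /* HEBREW POINT DAGESH OR MAPIQ */
    default:     return 0;   /* starters such as U+05D1 HEBREW LETTER BET */
    }
}

/* Canonical ordering step of NFD: a stable insertion sort that moves each
 * combining mark left past marks of a higher class, never past a starter. */
void canonical_order(uint32_t *cps, size_t len)
{
    for (size_t i = 1; i < len; i++) {
        uint32_t cp = cps[i];
        int cc = combining_class(cp);
        size_t j = i;

        if (cc == 0)
            continue;  /* starters stay where they are */

        while (j > 0 && combining_class(cps[j - 1]) > cc) {
            cps[j] = cps[j - 1];
            j--;
        }
        cps[j] = cp;
    }
}

/* Mimics the quoted backspace logic: reorder the cluster, then keep all
 * but the last code point. Returns the number of code points kept. */
size_t simulate_backspace(uint32_t *cluster, size_t len)
{
    assert(len > 0);
    canonical_order(cluster, len);
    return len - 1;
}
```

Calling simulate_backspace() on the typed sequence U+05D1, U+05BC, U+05B8 reorders it to U+05D1, U+05B8, U+05BC and keeps the first two code points, so the Mapiq is the one that disappears even though the Qamats was typed last.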

