Bug in Gtk+2 gtk_text_buffer_insert() of a '\n' following an existing '\r'?



All,

  I think I've found a bug in Gtk+2 gtk_text_buffer_insert() when inserting a
'\n' following an existing '\r' in a GtkTextBuffer (to convert the end-of-line
from `CR` (Mac pre-OSX) to `CRLF`). When I locate the existing '\r' end-of-line
with gtk_text_iter_forward_to_line_end() and gtk_text_iter_forward_char() and
check that the next char is NOT '\n', I simply want to insert a '\n' to make
the conversion. However, gtk_text_buffer_insert() inserts '\r\n' (0x0d 0x0a)
instead of '\n', making the line-end '\r\r\n'. I don't say "I think I've found
a bug" lightly, and I can already hear the "You screwed up buddy..." coming,
but bear with me.

  By simply changing the code to call gtk_text_buffer_backspace() to remove
the existing '\r' and then gtk_text_buffer_insert() to insert '\r\n', the
end-of-line is correctly converted from '\r' to '\r\n'. There is no reason I
should have to backspace over '\r' and then insert '\r\n' instead of just
inserting '\n'. Hence my suspicion of a bug in how gtk_text_buffer_insert()
handles inserting a single '\n' following an existing '\r'.

  This is a summary of the question (details follow afterwards):

    Why does the following fail?

        if (gtk_text_iter_get_char (&iter) != '\n') {
            gtk_text_buffer_insert (buffer, &iter, app->eolstr[LF], -1);
        }

    When the following works?

        if (gtk_text_iter_get_char (&iter) != '\n') {
            if (gtk_text_buffer_backspace (buffer, &iter, FALSE, TRUE)) {
                gtk_text_buffer_get_iter_at_mark (buffer, &iter, app->last_pos);
                gtk_text_buffer_insert (buffer, &iter, app->eolstr[CRLF], -1);
            }
        }

  This is either a bug, or Gtk+2 considers CRLF a single char and will NOT
allow manual creation of '\r\n' by inserting a '\n' following an existing '\r'
in a buffer (which is probably where the bug (or issue) is, but I don't know
where to begin to look in the Gtk+2 source). I can remove the existing '\r'
and insert '\r\n' and everything is fine, but I cannot place a single '\n'
after an existing '\r' in the buffer and have it work.

  Here are the details for two test cases whittled down to readable examples.
First, the boilerplate declarations and initialization of the EOL strings and
struct values:

#define EOL_LF     "\n"
#define EOL_CR     "\r"
#define EOL_CRLF   "\r\n"
#define EOL_NO     3
#define EOLNM_LF   "LF"
#define EOLNM_CR   "CR"
#define EOLNM_CRLF "CRLF"

enum eolorder { LF, CRLF, CR };     /* global constants for LF, CRLF and CR */

  Declaration of struct holding values

typedef struct {
...
    gint            eol;                /* end-of-line */
    gchar           *eolnm[EOL_NO];     /* ptrs to eol names */
    gchar           *eolstr[EOL_NO];    /* ptrs to eol strings */
...
    GtkTextMark     *last_pos;          /* position of last match in buf */
...
} kwinst;

  Initialization of the struct values passed through app (the struct instance
is named 'app'):

    kwinst *app = NULL;             /* replaced GtkWidget *window */
    app = g_slice_new (kwinst);     /* allocate mem for struct    */
    context_init (app);             /* initialize struct values   */

  Within context_init (app), you have:

#ifndef HAVEMSWIN
    app->eol            = LF;       /* default line end LF */
#else
    app->eol            = CRLF;     /* default line end CRLF */
#endif
    app->eolstr[0]      = EOL_LF;   /* eol ending strings */
    app->eolstr[1]      = EOL_CRLF;
    app->eolstr[2]      = EOL_CR;
    app->eolnm[0]       = EOLNM_LF; /* eol string names */
    app->eolnm[1]       = EOLNM_CRLF;
    app->eolnm[2]       = EOLNM_CR;
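
  The assignments above index the tables with the literals 0, 1 and 2, which
must stay in step with the enum order (LF, CRLF, CR), an order that differs
from the order of the #define lines. As a minimal sketch of one alternative,
and not the code used in the application, C99 designated initializers can tie
each string to its enum constant. The standalone tables and the accessor
`eol_string` are hypothetical names introduced only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

#define EOL_NO 3

enum eolorder { LF, CRLF, CR };     /* same order as in the post */

/* Each entry is keyed by its enum constant, so the table stays correct
 * even if the enum order is changed later. */
static const char *const eolstr[EOL_NO] = {
    [LF]   = "\n",
    [CRLF] = "\r\n",
    [CR]   = "\r",
};

static const char *const eolnm[EOL_NO] = {
    [LF]   = "LF",
    [CRLF] = "CRLF",
    [CR]   = "CR",
};

/* Hypothetical accessor: return the end-of-line string for `eol`. */
const char *eol_string (int eol)
{
    assert (eol >= 0 && eol < EOL_NO);
    return eolstr[eol];
}

/* Hypothetical accessor: return the display name for `eol`. */
const char *eol_name (int eol)
{
    assert (eol >= 0 && eol < EOL_NO);
    return eolnm[eol];
}
```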

  The test file for the conversion loaded into buffer is a 'CR' delimited
file, e.g.

$ hexdump -C eol_cr.txt
00000000  6d 79 0d 64 6f 67 0d 20  20 68 61 73 0d 20 20 66  |my.dog.  has.  f|
00000010  6c 65 61 73 0d 61 20 6c  6f 74 0d 20 20 20 20 6f  |leas.a lot.    o|
00000020  66 20 66 6c 65 61 73 0d                           |f fleas.|
00000028

  In human readable form:

$ cat eol_cr.txt
my
dog
  has
  fleas
a lot
    of fleas


  On file open the 'CR' end-of-line is properly detected and app->eol is set
to CR. From a menu the user can choose between CR, CRLF and LF line ends. The
relevant parts of the function called to change from CR to CRLF, which exposed
the problem, are:

void buffer_convert_eol (kwinst *app)
{
    GtkTextBuffer *buffer = GTK_TEXT_BUFFER(app->buffer);
    GtkTextIter iter;
    ...
    /* get iter at start of buffer */
    gtk_text_buffer_get_start_iter (buffer, &iter);

    /* set app->last_pos Mark to start, and move on each iteration */
    app->last_pos = gtk_text_buffer_create_mark (buffer, "last_pos", &iter,
                                                 FALSE);

    /* loop, moving to the end of each line, before the EOL chars */
    while (gtk_text_iter_forward_to_line_end (&iter)) {

        gunichar c = gtk_text_iter_get_char (&iter);
        gtk_text_buffer_move_mark (buffer, app->last_pos, &iter);

        if (c == '\n') {            /* if end-of-line begins with LF */
            ...
        }
        else if (c == '\r') {       /* if end-of-line begins with CR */
            if (app->eol == LF) {   /* handle change to LF */
                ...
            }
            else if (app->eol == CRLF) {    /* handle change to CRLF */
                gtk_text_iter_forward_char (&iter);
                if (gtk_text_iter_get_char (&iter) != '\n') { /* if not '\n' */
                    /* just insert '\n' */
                    /* CODE THAT PRODUCES THE BUG */
                    gtk_text_buffer_insert (buffer, &iter, app->eolstr[LF], -1);
                }
            }
            ...
        }
        gtk_text_buffer_get_iter_at_mark (buffer, &iter, app->last_pos);

    }
    ...
}

  The resulting file is:

$ hexdump -C eol_messcr.txt
00000000  6d 79 0d 0d 0a 64 6f 67  0d 0d 0a 20 20 68 61 73  |my...dog...  has|
00000010  0d 0d 0a 20 20 66 6c 65  61 73 0d 0d 0a 61 20 6c  |...  fleas...a l|
00000020  6f 74 0d 0d 0a 20 20 20  20 6f 66 20 66 6c 65 61  |ot...    of flea|
00000030  73 0d 0d 0a                                       |s...|
00000034

  That's just wrong. A '\r\n' is inserted following the existing '\r' instead
of '\n' alone. (It looks like gtk_text_buffer_insert() anticipates that a CRLF
should be created out of the '\r' and the inserted '\n', but leaves the
original '\r' unchanged and, in fact, inserts a CRLF of its own. Bizarre.)
Here is the same part of the function with the backspace over '\r' and insert
of '\r\n' that works as intended:

            else if (app->eol == CRLF) {    /* handle change to CRLF */
                gtk_text_iter_forward_char (&iter);
                if (gtk_text_iter_get_char (&iter) != '\n') { /* if not '\n' */
                    /* then backspace, reinit iter, and insert '\r\n' */
                    if (gtk_text_buffer_backspace (buffer, &iter, FALSE, TRUE)) {
                        gtk_text_buffer_get_iter_at_mark (buffer, &iter,
                                                          app->last_pos);
                        gtk_text_buffer_insert (buffer, &iter,
                                                app->eolstr[CRLF], -1);
                    }
                }
            }

  '\r' is removed and replaced by '\r\n' and the resulting file produced is:

$ hexdump -C eol_messcr2.txt
00000000  6d 79 0d 0a 64 6f 67 0d  0a 20 20 68 61 73 0d 0a  |my..dog..  has..|
00000010  20 20 66 6c 65 61 73 0d  0a 61 20 6c 6f 74 0d 0a  |  fleas..a lot..|
00000020  20 20 20 20 6f 66 20 66  6c 65 61 73 0d 0a        |    of fleas..|
0000002e
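
  As a cross-check that is independent of GTK, the intended CR to CRLF
conversion can be written directly against the raw bytes. The sketch below is
only an illustration, not code from the application: `cr_to_crlf` is a
hypothetical helper name, and it assumes the caller supplies an output buffer
of at least twice the input length.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: copy `in` to `out`, writing "\r\n" for every bare
 * '\r' (one that is not already followed by '\n').  Existing "\r\n" pairs
 * and lone '\n' characters are copied unchanged.
 * `out` must hold at least 2 * len bytes.  Returns the bytes written. */
size_t cr_to_crlf (const char *in, size_t len, char *out, size_t out_cap)
{
    size_t i, n = 0;

    assert (in != NULL && out != NULL);
    assert (out_cap >= 2 * len);    /* worst case: every byte is a bare CR */

    for (i = 0; i < len; i++) {
        out[n++] = in[i];
        /* bare CR: the next byte is missing or is not LF */
        if (in[i] == '\r' && (i + 1 == len || in[i + 1] != '\n'))
            out[n++] = '\n';
    }
    return n;
}
```

  Applied to the 40-byte (0x28) contents of eol_cr.txt, which contain six
bare CRs, this produces 46 (0x2e) bytes, the same length as the hexdump of
eol_messcr2.txt above. The unwanted result in eol_messcr.txt is 52 (0x34)
bytes, one extra 0x0d per line.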

  Why does the following fail?

    if (gtk_text_iter_get_char (&iter) != '\n') {
        gtk_text_buffer_insert (buffer, &iter, app->eolstr[LF], -1);
    }

  When the following works?

    if (gtk_text_iter_get_char (&iter) != '\n') {
        if (gtk_text_buffer_backspace (buffer, &iter, FALSE, TRUE)) {
            gtk_text_buffer_get_iter_at_mark (buffer, &iter, app->last_pos);
            gtk_text_buffer_insert (buffer, &iter, app->eolstr[CRLF], -1);
        }
    }

  Either I'm crazy and have botched something (which is always possible, but
unlikely here), or there is a bug in Gtk+2 (latest) gtk_text_buffer_insert()
that is triggered by attempting to insert a one-character string "\n" after an
existing '\r' in the buffer. Maybe gtk_text_iter_forward_to_line_end (&iter)
correctly moves to the end and gtk_text_iter_forward_char (&iter) correctly
moves to the next position following '\r', but gtk_text_buffer_insert (buffer,
&iter, "\n", -1) fails to insert a '\n' following the existing '\r' and
instead inserts '\r\n', which is just wrong.

  Is this a bug? If so, I'll report it. If not, then I'm sure there is a
deeper explanation for why the obvious won't work.

-- 
David C. Rankin, J.D.,P.E.

