Re: Invalid utf-8



On Sun, Oct 17, 2010 at 04:06:37PM -0300, John Coppens wrote:
Trying to load a text into a GtkTextBuffer, I ran into the well-known
warning that the text wasn't valid UTF-8.

Ok... There were two accented ó's in it, so I edited the file in UTF-8
mode, and changed those characters.

Still, I get the invalid utf-8 warning! Googling, I found a suggestion
to run:

iconv -f UTF-8 /tmp/fpc2_31399.lst -o /dev/null

as validation, but I had no complaints from iconv.

I then tried:

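   #!/usr/bin/python

   # Read the raw bytes and decode them strictly as UTF-8;
   # decode() raises UnicodeDecodeError on the first invalid sequence.
   f = open("/tmp/fpc2_31399.lst", "rb")
   fdata = f.read()
   fdata.decode('utf-8', 'strict')

   f.close()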

Again - no complaints.

The file contains:

00000080 55 6E 69 76 | 65 72 73 69 | 64 61 64 20 | 43 61 74 C3  Universidad Cat.
00000090 B3 6C 69 63 | 61 20 64 65 | 20 43 C3 B3 | 72 64 6F 62  .lica de C..rdob
000000A0 61 0A 0A 43 | 6F 6D 70 69 | 6C 65 72 20 | 6C 69 73 74  a..Compiler list

So, the UTF-8 sequences are 0xC3 0xB3, which seem valid enough (C3 B3 decodes to U+00F3, ó)

Finally, I gave up, and modified the generated text.
Why can't I read the text into the TextBuffer? (It's not a trailing \0,
I specify the length).

John

PS:
This is the code used to read the file, and set TextBuffer:

g_file_get_contents(fname, &bff, &len, &err);
gtk_text_buffer_set_text(list_bff, bff, len);

Are you sure that g_file_get_contents() returns no error?

Anyway, you show only excerpts of the text and of the code.  This exact
file called “test”

0000000: 556e 6976 6572 7369 6461 6420 4361 74c3  Universidad Cat.
0000010: b36c 6963 6120 6465 2043 c3b3 7264 6f62  .lica de C..rdob
0000020: 610a 0a43 6f6d 7069 6c65 7220 6c69 7374  a..Compiler list
0000030: 0a

loaded by this exact code

#include <gtk/gtk.h>

int
main(int argc, char *argv[])
{
    gtk_init(&argc, &argv);

    GtkWidget *window = gtk_window_new(GTK_WINDOW_TOPLEVEL);
    gtk_window_set_default_size(GTK_WINDOW(window), 400, 300);
    GtkWidget *textview = gtk_text_view_new();
    gtk_container_add(GTK_CONTAINER(window), textview);
    GtkTextBuffer *textbuffer = gtk_text_view_get_buffer(GTK_TEXT_VIEW(textview));

    {
        GError *error = NULL;
        gchar *buffer = NULL;
        gsize len = 0;
        if (!g_file_get_contents("test", &buffer, &len, &error)) {
            g_printerr("%s\n", error->message);
            g_clear_error(&error);
        }
        else {
            gtk_text_buffer_set_text(textbuffer, buffer, len);
            g_free(buffer);
        }
    }

    g_signal_connect(window, "destroy", G_CALLBACK(gtk_main_quit), NULL);
    gtk_widget_show_all(window);
    gtk_main();

    return 0;
}

works.  If it also works for you, then you have to find the ten
differences... or at least one.
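
One way to hunt for such a difference is to find the offset of the first
byte that fails validation in the buffer actually passed to
gtk_text_buffer_set_text().  The following is only a minimal sketch in plain
C, not GLib's implementation; the function name first_invalid_utf8() is
made up for illustration.  It treats an embedded NUL byte as invalid, on
the assumption (taken from the GLib documentation of g_utf8_validate(),
which says it returns FALSE when a positive length is given and any of
those bytes is NUL) that this matters here, so a file that iconv accepts
could still be rejected.

```c
#include <stddef.h>

/* Hypothetical helper: returns the offset of the first byte that does not
 * start a well-formed UTF-8 sequence, or -1 if the whole buffer is valid.
 * Assumption: NUL bytes inside the buffer are reported as invalid. */
long first_invalid_utf8(const unsigned char *buf, size_t len)
{
    size_t i = 0;

    while (i < len) {
        unsigned char c = buf[i];
        size_t need;            /* continuation bytes expected */
        unsigned long cp, min;  /* decoded code point, smallest legal value */
        size_t k;

        if (c == 0x00)
            return (long)i;     /* embedded NUL */
        if (c < 0x80) {         /* plain ASCII */
            i++;
            continue;
        }
        if ((c & 0xE0) == 0xC0) {
            need = 1; cp = c & 0x1F; min = 0x80;
        } else if ((c & 0xF0) == 0xE0) {
            need = 2; cp = c & 0x0F; min = 0x800;
        } else if ((c & 0xF8) == 0xF0) {
            need = 3; cp = c & 0x07; min = 0x10000;
        } else {
            return (long)i;     /* stray continuation byte or 0xF8..0xFF */
        }

        if (len - i <= need)
            return (long)i;     /* sequence truncated at end of buffer */

        for (k = 1; k <= need; k++) {
            if ((buf[i + k] & 0xC0) != 0x80)
                return (long)i; /* missing continuation byte */
            cp = (cp << 6) | (buf[i + k] & 0x3F);
        }

        /* Reject overlong forms, surrogates and values beyond U+10FFFF. */
        if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            return (long)i;

        i += need + 1;
    }
    return -1;
}
```

Called on the buffer and length returned by g_file_get_contents(), a
result other than -1 points at the byte to inspect in a hex dump.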

Regards,

Yeti



