Invalid utf-8



Hello all.

Trying to load a text file into a GtkTextBuffer, I ran into the well-known
warning that the text wasn't valid UTF-8.

OK... There were two accented ó's in it, so I edited the file in UTF-8
mode and changed those characters.

Still, I get the invalid utf-8 warning! Googling, I found a suggestion
to run:

iconv -f UTF-8 /tmp/fpc2_31399.lst -o /dev/null

as validation, but I had no complaints from iconv.

I then tried:

   #!/usr/bin/python

   # Read the raw bytes; a strict decode raises UnicodeDecodeError
   # if the data is not valid UTF-8.
   f = open("/tmp/fpc2_31399.lst", "rb")
   fdata = f.read()
   fdata.decode('utf-8', 'strict')

   f.close()

Again - no complaints.
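
A strict decode only says pass or fail. Below is a rough sketch of a more talkative check. The helper name `utf8_problems` is made up for illustration. The NUL check rests on my understanding that GLib's g_utf8_validate(), when given an explicit length, rejects embedded NUL bytes, even though Python and iconv accept them as valid UTF-8. Treat that as an assumption to verify, not as a diagnosis.

```python
def utf8_problems(data):
    """Return (offset, reason) pairs for bytes that may upset a strict UTF-8 consumer."""
    problems = []
    try:
        data.decode("utf-8", "strict")
    except UnicodeDecodeError as exc:
        # exc.start is the byte offset of the first undecodable byte
        problems.append((exc.start, exc.reason))
    # NUL bytes decode fine in Python. Assumption: GLib rejects them
    # when the length is passed explicitly.
    problems.extend(
        (i, "embedded NUL") for i, b in enumerate(bytearray(data)) if b == 0
    )
    return problems


if __name__ == "__main__":
    sample = b"Universidad Cat\xc3\xb3lica de C\xc3\xb3rdoba\n"
    print(utf8_problems(sample))            # the text from the dump
    print(utf8_problems(sample + b"\x00"))  # same text with a NUL appended
```

Running it on the real file would mean reading the file in "rb" mode and passing the bytes in; an empty list means neither kind of problem was found.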

The file contains:

00000080 55 6E 69 76 │ 65 72 73 69 │ 64 61 64 20 │ 43 61 74 C3  Universidad Cat.
00000090 B3 6C 69 63 │ 61 20 64 65 │ 20 43 C3 B3 │ 72 64 6F 62  .lica de C..rdob
000000A0 61 0A 0A 43 │ 6F 6D 70 69 │ 6C 65 72 20 │ 6C 69 73 74  a..Compiler list

So the UTF-8 sequences are 0xC3 0xB3, which seem valid enough (C3 B3 -> U+00F3, ó).
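
Spelling out that arithmetic as a small sketch (the function name `decode_two_byte` is just for illustration): a two-byte UTF-8 sequence has the form 110xxxxx 10yyyyyy, and the code point is the five x bits followed by the six y bits.

```python
def decode_two_byte(lead, cont):
    """Decode a two-byte UTF-8 sequence (110xxxxx 10yyyyyy) to a code point."""
    if lead & 0xE0 != 0xC0:
        raise ValueError("not a two-byte lead byte")
    if cont & 0xC0 != 0x80:
        raise ValueError("not a continuation byte")
    # Keep the low 5 bits of the lead byte and the low 6 bits of the continuation
    code_point = ((lead & 0x1F) << 6) | (cont & 0x3F)
    if code_point < 0x80:
        raise ValueError("overlong encoding")
    return code_point


cp = decode_two_byte(0xC3, 0xB3)
print(hex(cp))  # 0xf3
# Cross-check against Python's own decoder
assert b"\xc3\xb3".decode("utf-8") == chr(cp)
```

For C3 B3 this gives (0x03 << 6) | 0x33 = 0xF3, so the sequence itself is well-formed.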

Finally, I gave up and modified the generated text.
Why can't I read the text into the TextBuffer? (It's not a trailing \0;
I specify the length.)

John

PS:
This is the code used to read the file, and set TextBuffer:

g_file_get_contents(fname, &bff, &len, &err);
gtk_text_buffer_set_text(list_bff, bff, len);



