Re: UTF-8 problem (XLS)



On Tue, 2007-08-07 at 11:10 -0300, John Coppens wrote:
> Hi Nick.
> I don't usually use Windows, and I don't have Office. But my notebook
> came with W$ preinstalled, and I just left it enough space for
> experiments. I downloaded and installed Excel viewer.

Thanks for taking the time to do that.

> Yes, the characters are the same. Which confirms my theory that the
> characters were imported from DOS (OEM charset) without conversion.

OK, then there's probably not much we can sensibly do in Gnumeric. You
might be able to come up with an iconv incantation that "undoes" the
mistaken conversion, gets back the original bytes in the OEM code page,
and then converts from that correctly, but you'd need to know where the
mistake was made.
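
A minimal sketch of that kind of "undo", in Python rather than iconv.
It assumes the strings really were CP850 bytes that some filter read as
Windows-1252; both code page names and the helper name undo_oem_mixup
are assumptions for illustration, not something confirmed for this file.

```python
def undo_oem_mixup(text, wrong="cp1252", right="cp850"):
    """Re-encode text with the wrongly assumed code page to recover the
    original bytes, then decode those bytes with the code page that was
    (by assumption) really in use."""
    return text.encode(wrong).decode(right)


# CP850 stores 'ñ' as byte 0xA4; read as Windows-1252 that byte shows up as '¤'.
garbled = "ni\u00a4o"            # 'ni¤o'
fixed = undo_oem_mixup(garbled)
print(fixed)                     # niño

# Roughly the same idea with iconv (encoding names as spelled by glibc):
#   iconv -f UTF-8 -t WINDOWS-1252 | iconv -f CP850 -t UTF-8
```

The encode step raises UnicodeEncodeError if the text contains a
character that the wrongly assumed code page cannot represent, which is
a hint that the guess about the code pages is off.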

FWIW in undoing Windows latin encoding you probably want to specify
Windows-1252 rather than ISO-8859-1 as a parameter to iconv.
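
A small sketch of why the distinction matters, again assuming CP850 as
the original code page. ISO-8859-1 reserves bytes 0x80 to 0x9F for
control codes, while Windows-1252 puts printable characters there, so a
character that came from that range can only be mapped back to its byte
through Windows-1252.

```python
# CP850 stores 'é' as byte 0x82; read as Windows-1252 that byte is U+201A ('‚').
garbled = "caf\u201a"

# Going back through Windows-1252 recovers the byte, and CP850 decodes it.
raw = garbled.encode("cp1252")
fixed = raw.decode("cp850")
print(fixed)                     # café

# ISO-8859-1 has no byte for U+201A, so the same step fails there.
try:
    garbled.encode("latin-1")
    latin1_ok = True
except UnicodeEncodeError:
    latin1_ok = False
print(latin1_ok)                 # False
```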

A conversation about that would be rather off-topic for this list, but
feel free to reply to me in person.

> Can't do that here, but I'll try at the university this evening.

Thanks but it's probably not going to help after the other result.

> I just believe some export or import filter used to generate the xls file
> ignored the OEM_850 (or whatever) coding and considered them Windows
> charset (Latin).

Yes, this sounds about right. The devil is in the details.

> > Might you be able to make (a copy of) the file available to one of the
> > developers to look into further?
>
> Not a problem. Who should I send it to? I'll add comments with offsets of
> where the strings are.

If Excel can't get this right then we probably can't do much better so
there's not much to gain from looking at your file (if someone else has
a better idea I'm sure they'll speak up).

Nick.



