Re: UTF-8 problem (XLS)



On Tue, 07 Aug 2007 11:22:39 +0100
Nick Lamb <njl tlrmx org> wrote:

On Mon, 2007-08-06 at 11:04 -0300, John Coppens wrote:
The origin of the strange characters in the file is still unclear. These
xls files are sent to my wife's pharmacy monthly with updates.

It would be interesting to know some things if you have access to MS
Excel on Windows or Mac or to the (zero cost) Microsoft Excel viewer
program on Windows and time to experiment.

* Do these characters appear visually to be the same in Excel as in
Gnumeric when you load the file? I get the impression from previous
emails that the answer is "No", but I'm not quite clear.

Hi Nick.
I don't usually use Windows, and I don't have Office. But my notebook
came with W$ preinstalled, and I just left it enough space for
experiments. I downloaded and installed Excel viewer.

Yes, the characters are the same, which confirms my theory that the
characters were imported from DOS (OEM charset) without conversion.

* If you re-save the file from Excel (perhaps after slightly altering the
text in one cell, to make sure you're really saving a new file), does it
make a difference to the strange characters when imported into
Gnumeric?

Can't do that here, but I'll try at the university this evening.
 
The former would tell us whether Excel itself can figure out what's
going on here and get the right characters (in which case Gnumeric can
probably learn how to do the same) and the latter would tell us whether
perhaps the file sent to the pharmacy is non-standard in some way while
the same data when saved by Excel itself can be read correctly.

I believe some export or import filter used to generate the xls file
ignored the OEM_850 (or whatever) encoding and treated the bytes as
Windows charset (Latin).
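
To illustrate the kind of mix-up I suspect, here is a minimal Python sketch.
It assumes the DOS side used code page 850 and the Windows side cp1252; the
sample word and the function names are made up for the example, and the real
file may well use different code pages.

```python
# Sketch only: assumes text was written as DOS code page 850 bytes and
# later read as Windows-1252 without conversion. Names are hypothetical.

def misread_oem_as_windows(text: str) -> str:
    """Simulate the suspected bug: cp850 bytes interpreted as cp1252."""
    return text.encode("cp850").decode("cp1252")


def repair(garbled: str) -> str:
    """Undo the misreading by reversing the two steps."""
    return garbled.encode("cp1252").decode("cp850")


original = "Año"                      # made-up sample, not from the file
garbled = misread_oem_as_windows(original)
restored = repair(garbled)

print(garbled)    # the ñ (byte 0xA4 in cp850) shows up as ¤ in cp1252
print(restored)
```

If the strange characters in the file follow this pattern (for example ¤
where ñ is expected), that would support the theory. Note that a few byte
values have no mapping in Python's cp1252 codec, so on real data the decode
step may raise an error and need extra handling.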

Might you be able to make (a copy of) the file available to one of the
developers to look into further? 

Not a problem. Who should I send it to? I'll add comments with offsets of
where the strings are.

Thanks,
John


