Re: UTF-8 problem (XLS)
- From: Nick Lamb <njl tlrmx org>
- To: gnumeric-list gnome org
- Subject: Re: UTF-8 problem (XLS)
- Date: Mon, 06 Aug 2007 11:52:46 +0100
On Mon, 2007-08-06 at 02:19 -0300, John Coppens wrote:
> Hello people.
>
> Using version 1.6.3 of gnumeric, I tried to read an xls file and save it
> as csv. There was a problem with the resulting csv file, in that iconv
> didn't want to convert it into another coding. I'm not sure where the
> problem lies.
>
> The original .xls had the following string in it:
>
> 0068 0069 0070 [201A] 0072 ...
>
> I've marked the 201A code, apparently a valid utf-16 code (according to
> the xls specs).
Yes, this is U+201A, SINGLE LOW-9 QUOTATION MARK. Someone might
have used it (wrongly) instead of a comma, or it might be the usual
style of quotation mark for some non-English text. If there's just one
such character and you're sure it's a mistake (e.g. it should obviously
be a comma) you can fix it in the spreadsheet and ignore the rest of my
post.
> Gnumeric (or ssconvert) saved this in the csv as:
>
> 68 69 70 E2 80 9A 72 ...
>
> Again 201A, and it seems to be the shortest utf-8 code that can represent
> it. But iconv -f utf8 -t iso-8859-1 chokes on the sequence and aborts
> with:
>
> illegal input sequence at position xxxx
>
> I know -c can make iconv skip the error, but that doesn't seem elegant.
> Can anyone indicate where to look for a solution?
This is not a Gnumeric problem.
The ISO 8859-1 character set does not include U+201A, so iconv is
objecting because this transformation loses information. If you use the
iconv -c switch this quotation mark will just vanish from the output.
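The same behaviour can be reproduced outside iconv. What follows is only a sketch in Python, not what iconv does internally: the sample string is the five code units quoted from the .xls, and the variable names are made up for illustration. A strict conversion to ISO 8859-1 fails, and ignoring errors (roughly what -c does) silently drops the character.

```python
# Code units quoted from the .xls: 0068 0069 0070 201A 0072
text = "hip\u201ar"

# UTF-8 form, as written to the csv: 68 69 70 e2 80 9a 72
utf8 = text.encode("utf-8")
print(" ".join(f"{b:02x}" for b in utf8))

# Strict conversion to ISO 8859-1 fails, because U+201A has no
# code point in that character set.
try:
    utf8.decode("utf-8").encode("iso-8859-1")
    strict_ok = True
except UnicodeEncodeError as exc:
    strict_ok = False
    print("cannot encode:", exc)

# Ignoring errors (comparable to iconv -c) drops the character.
dropped = utf8.decode("utf-8").encode("iso-8859-1", errors="ignore")
print(dropped)
```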
There is no way to do what you're asking: the U+201A character simply
cannot be written in ISO-8859-1. Either choose a different encoding or
re-think the whole plan.
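As a rough illustration of those two options, here is a Python sketch. Both choices are assumptions about what the consumer of the file can accept: Windows-1252 (cp1252) is one encoding that does contain U+201A, at byte 0x82, and replacing the character with a comma is only right if it really was a typing mistake. The variable names are hypothetical.

```python
# Code units quoted from the .xls: 0068 0069 0070 201A 0072
text = "hip\u201ar"

# Option 1: choose a different target encoding. Windows-1252 maps
# U+201A to the single byte 0x82, so nothing is lost.
as_cp1252 = text.encode("cp1252")
print(as_cp1252)

# Option 2: decide the character was a mistake and substitute a
# plain comma before converting to ISO 8859-1.
replaced = text.replace("\u201a", ",").encode("iso-8859-1")
print(replaced)
```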
Nick.