[Fwd: Re: Can't read Ruby Spreadsheet generated XLS files]

Doh! This message went to Bill Lam instead of to the list.

--- Begin Message ---
On 16/02/2009 9:59 PM, bill lam wrote:
On Mon, 16 Feb 2009, Alexander Skwar wrote:

I have a VERY simple ruby program which uses spreadsheet on Windows
2000 to create an Excel file:

require 'spreadsheet'
Spreadsheet.client_encoding = 'UTF-8'
book = Spreadsheet::Workbook.new
sheet1 = book.create_worksheet

It should be clear if you dump and examine biff records of that xls
file.  I'm now not having time to actually look into it, but I suspect
the 'UTF-8' caused trouble because biff only support utf16-le. May be
the byte count of unicode string is incorrect.  What if you change
'UTF-8' to latin?

Hi Bill,

On Windows XP SP3 the spreadsheet can be opened quite happily in Excel 2003 and OOo Calc v2 but fails with Gnumeric 1.9.1 [with only an "Unsupported file format" box, no other messages like those reported by the OP].

Unicode doesn't appear to be the problem. The OP is writing plain old ASCII text. Changing the input ("client") encoding is rather unlikely to affect the outcome. Re "because biff only support utf16-le": it is the job of the software to convert the input from utf8 or latin1 or gagolithic or whatever to utf16le.
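To illustrate that point, here is a minimal sketch of the conversion step, assuming a modern Ruby (1.9 or later) with String#encode. The helper name to_utf16le is hypothetical and is not part of the spreadsheet gem; it only shows that the same text yields the same UTF-16LE bytes whichever client encoding it arrived in.

```ruby
# Hypothetical helper (not the spreadsheet gem's API): reinterpret the raw
# bytes in the declared client encoding, then transcode to UTF-16LE.
def to_utf16le(text, client_encoding = 'UTF-8')
  text.dup.force_encoding(client_encoding).encode('UTF-16LE')
end

ascii = to_utf16le('Hello')
puts ascii.bytesize                 # 10: two bytes per ASCII character
puts ascii.bytes.first(4).inspect   # [72, 0, 101, 0]

# The same word supplied as UTF-8 and as Latin-1 gives identical output bytes.
from_utf8   = to_utf16le("caf\u00E9", 'UTF-8')
from_latin1 = to_utf16le("caf\xE9", 'ISO-8859-1')
puts from_utf8 == from_latin1       # true
```

For pure ASCII input such as the OP's, the client encoding setting makes no difference to the bytes that come out of this step.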

I have had a quick look through it with xlrd, both at the BIFF-dump-char-hex level and by extracting the data with the verbosity level turned up ... nothing suspicious at all. The 7 text strings extracted match those in the Ruby script.

In any case, the Gnumeric problem appears to be happening at the OLE compound document level i.e. before it gets anywhere near BIFF records or unicode issues:

(gnumeric:20531): libgsf:msole-WARNING **: failure reading block 16

(gnumeric:20531): libgsf:msole-WARNING **: failure reading block 16
E Unable to open module file "/usr/lib/gnumeric/1.8.3/plugins/psiconv/psiconv".
E libpsiconv.so.6: cannot open shared object file: No such file or directory

Note that the Workbook stream is only 1032 bytes long; hence it is being put in the "Short Sector Container Storage" [OOo jargon] or the "Ministream" [MS jargon]. This is unusual: some writing software chickens out and pads short Workbook streams with zeroes so that they are 4096 bytes long and thus qualify for the normal storage, so reading software doesn't get much Ministream reading experience. I suspect that the problem is here.
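A rough sketch of that size rule follows, assuming the usual compound document header layout (sector shift at byte offset 30, mini sector shift at 32, ministream cutoff at 56, all little-endian). The method names cfb_header_info and stored_in_ministream? are made up for illustration, and the header used here is synthetic, not taken from the OP's file.

```ruby
# Magic bytes that open every OLE2 compound document.
CFB_SIGNATURE = "\xD0\xCF\x11\xE0\xA1\xB1\x1A\xE1".b

# Hypothetical helper: pull the sector sizes and the ministream cutoff
# out of the first 512 bytes of a compound document.
def cfb_header_info(header)
  header = header.b
  unless header.byteslice(0, 8) == CFB_SIGNATURE
    raise ArgumentError, 'not an OLE2 compound document'
  end
  sector_shift, mini_shift = header.byteslice(30, 4).unpack('vv')
  cutoff = header.byteslice(56, 4).unpack('V').first
  { sector_size: 1 << sector_shift,
    mini_sector_size: 1 << mini_shift,
    mini_stream_cutoff: cutoff }
end

# Streams shorter than the cutoff are stored in the ministream.
def stored_in_ministream?(stream_size, info)
  stream_size < info[:mini_stream_cutoff]
end

# Synthetic header with typical values: 512-byte sectors, 64-byte mini
# sectors, 4096-byte cutoff. A real check would read the file's first 512 bytes.
header = "\0".b * 512
header[0, 8]  = CFB_SIGNATURE
header[30, 4] = [9, 6].pack('vv')
header[56, 4] = [4096].pack('V')

info = cfb_header_info(header)
puts info.inspect
puts stored_in_ministream?(1032, info)   # true: a 1032-byte Workbook stream
puts stored_in_ministream?(4096, info)   # false: padded to 4096 bytes
```

With these typical values, a 1032-byte Workbook stream falls below the cutoff, while one padded to 4096 bytes does not.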

What is psiconv? Where do those two messages fit into the picture?


--- End Message ---
