Re: Excel Test Files
- From: John Machin <sjmachin lexicon net>
- To: "Martin Kotulla (SoftMaker)" <martin-k softmaker de>
- Cc: gnumeric-list gnome org
- Subject: Re: Excel Test Files
- Date: Fri, 09 Feb 2007 00:21:16 +1100
On 8/02/2007 10:43 PM, Martin Kotulla (SoftMaker) wrote:
> John Machin wrote:
>> Please let me introduce myself: As well as being a user of Gnumeric, I'm
>> also the author/maintainer of xlrd, a Python package for programmatically
>> extracting data from XLS files.
>> In addition to what Morten wrote: those files appear to have been
>> written by Softmaker, not by Excel. They are also a test of how tolerant
>> XLS readers are when faced with XLS files that *don't* quite match what
>> Excel would write. [...]
> John:
> Thank you for these specific reports. I have forwarded them to our
> developers. Let's see what they can do to make PlanMaker files more
> amenable to your routines.
Hi Martin,
Thanks for your concern and your kind offer, but I have already made
xlrd more tolerant of the strangenesses in your files :-)
> Of course, as you are most certainly aware, there is a wide range of
> file format variations that Excel has written over the years.
> Our files are within that spectrum,
Ah yes, but the idea is supposed to be that you pick on a version and
write a file that corresponds with that version; using the BIFF2 record
code and the BIFF8 layout for ARRAY records is stretching "within that
spectrum" just a little ;-) So is writing 51 or 32 colours in a PALETTE
record when Excel writes 16 or 56. What should a reader do about the
missing 5 or 24 colour indexes: hope they are not used in the file? map
them to the corresponding RGB values in the Excel default palette?
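To illustrate the second option, here is a minimal sketch of overlaying a short PALETTE record on a default palette. It is not xlrd's actual code: the function name `merge_palette` is made up for this example, and `DEFAULT_PALETTE` is filled with placeholder values rather than the real Excel default colours.

```python
# Sketch only: not xlrd's actual implementation.
# DEFAULT_PALETTE is a placeholder; a real reader would hold the 56 RGB
# triples of the Excel default palette for the BIFF version in use.
DEFAULT_PALETTE = [(i, i, i) for i in range(56)]  # hypothetical stand-in values


def merge_palette(record_rgb, default_rgb=DEFAULT_PALETTE):
    """Overlay the colours found in a PALETTE record on the defaults.

    Returns (palette, n_missing), where n_missing is how many entries
    had to be taken from the defaults because the record was short.
    """
    palette = list(default_rgb)
    n_given = min(len(record_rgb), len(palette))
    palette[:n_given] = record_rgb[:n_given]
    return palette, len(palette) - n_given
```

With a 51-colour record the last 5 entries come from the defaults; with a 32-colour record, the last 24.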
BTW, try opening the array_pm06.xls with Excel 2003 and saving it as
some other name. When I did that, Excel didn't write out a PALETTE
record, indicating that there were no used (colour index, RGB)
combinations that weren't in the default palette -- IOW, the PALETTE
record is redundant. Also, compare the contents of the PALETTE record
with the BIFF8 default palette -- I could be wrong, but it appeared to
me that most of the entries were just the standard palette entries
offset by two; this looks much more accidental than intentional i.e. I
suspect a bug.
> and Excel and OpenOffice.org have no problem
> opening them.
They have been at it for a longer time with a much greater volume. I
don't imagine that their version 0.1 was so tolerant. I suspect it's
just like my experience: some liberally-written file has caused a crash
or an assertion to fail, they've inspected the file and decided whether
they can ignore the non-conformance or must refuse to open it or maybe
they can open it, with some kind of warning.
> So, use them as yet another test case for how flexible
> your code must be.
I am, within reason. The antique record code is now accepted without a
murmur, a truncated PALETTE record generates a NOTE message, and the
file-structure inconsistency generates a WARNING message.
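The graded response described above could be sketched roughly as follows. This is an illustration, not xlrd's actual code: the issue names, the `POLICY` table, the `report` function and the message format are all invented for the example.

```python
# Sketch only: the issue names and message format are hypothetical.
POLICY = {
    "antique_record_code": "ACCEPT",       # tolerated silently
    "truncated_palette": "NOTE",           # harmless, but worth mentioning
    "structure_inconsistency": "WARNING",  # readable, but suspicious
}


def report(issue, detail, messages):
    """Apply the policy for one non-conformance found while reading a file.

    Appends a message to `messages` for NOTE/WARNING issues, stays quiet
    for accepted ones, and raises ValueError for anything not in POLICY.
    """
    level = POLICY.get(issue, "ERROR")
    if level == "ERROR":
        raise ValueError(f"{issue}: {detail}")
    if level != "ACCEPT":
        messages.append(f"{level}: {detail}")
```

Keeping the policy in one table makes it easy to tighten or relax the treatment of a particular oddity without touching the parsing code.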
> Be liberal in what you accept... :-)
Indeed, and this is necessary only because some folk act as though in
blissful ignorance of the second half of that quotation: ... "and
conservative in what you send" :-)
Regards,
John