If we want to add support for different spreadsheet formats, we can
add some filters using the Gnumeric support tools:
SSCONVERT
"It is a command line utility to convert spreadsheet files
between various spreadsheet file formats." (from man ssconvert)
Supported formats:
.as Applix
.oleo GNU Oleo
.wb? Quattro Pro
.dbf Xbase
Paradox Database or Index
SC/xspread
.wk? Lotus 123
.dif data interchange
.mps linear and integer programming
MultiPlan (SYLK)
.csv comma separated values
.tsv tab separated values
Plus, of course, the .gnumeric format and MS Excel.
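A conversion-based filter could be sketched roughly as follows. This is
only an illustration, not an existing filter: the function names are
hypothetical, it assumes `ssconvert` is on the PATH and picks the output
format from the target file's extension, and it assumes CSV export is
good enough for text extraction (only what ssconvert writes to the CSV
is seen).

```python
import csv
import subprocess
import tempfile
from pathlib import Path


def ssconvert_command(source, target):
    """Build the ssconvert call (hypothetical helper).

    ssconvert is given the input file and the output file; the output
    format is assumed to be inferred from the target's extension.
    """
    return ["ssconvert", str(source), str(target)]


def _is_number(text):
    """Return True if the cell text parses as a plain number."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def extract_text_cells(csv_path):
    """Return the non-empty, non-numeric cells of a CSV file, in order."""
    cells = []
    with open(csv_path, newline="", encoding="utf-8") as handle:
        for row in csv.reader(handle):
            for cell in row:
                cell = cell.strip()
                if cell and not _is_number(cell):
                    cells.append(cell)
    return cells


def text_from_spreadsheet(source):
    """Export to CSV in a temporary directory, extract text, clean up."""
    with tempfile.TemporaryDirectory() as tmp:
        target = Path(tmp) / "export.csv"
        subprocess.run(ssconvert_command(source, target), check=True)
        # The temporary directory (and the CSV) is removed on exit.
        return extract_text_cells(target)
```

Calling `text_from_spreadsheet("budget.wk1")` (a made-up file name) would
run the conversion and return the text cells; the temporary export is
deleted automatically.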
SSINDEX
"It is a command line utility to generate index data for various
spreadsheet file formats." (from man ssindex)
This is a really good tool for our purpose; Beagle uses it too.
ssconvert only converts a spreadsheet from one format to another
(so we would have to export it to a known format, extract the
relevant content from there, then delete the temporary files),
whereas ssindex does the hard work for us. Running `ssindex -i` on
the attached Gnumeric spreadsheet gives:
<?xml version="1.0" encoding="UTF-8"?>
<gnumeric>
<data>January</data>
<data>January</data>
<data>Travelling:</data>
<data>Total:</data>
<data>Buy new music:</data>
<data>Drinking:</data>
<data>Buy new RAM:</data>
<data>Eating</data>
<data>February</data>
<data>February</data>
<data>Travelling:</data>
<data>Total:</data>
<data>Buy new movies:</data>
<data>Drinking:</data>
<data>Buy printer ink:</data>
<data>Eating</data>
</gnumeric>
All the relevant data is in XML, ready to be indexed. Numbers and
formulas are purged. It may be able to extract metadata as well,
but I wasn't able to save any in Gnumeric...
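To give an idea of how a filter might consume this output, here is a
minimal sketch. It assumes the XML always looks like the sample above (a
`<gnumeric>` root with `<data>` children and no namespace), and the
function names are hypothetical. Repeated strings are dropped, which may
or may not be what an indexer wants.

```python
import subprocess
import xml.etree.ElementTree as ET


def parse_ssindex_xml(xml_text):
    """Return the unique, non-empty <data> strings in document order."""
    root = ET.fromstring(xml_text)
    seen = set()
    terms = []
    for node in root.iter("data"):
        text = (node.text or "").strip()
        if text and text not in seen:
            seen.add(text)
            terms.append(text)
    return terms


def index_spreadsheet(path):
    """Run `ssindex -i` on a file and parse what it prints to stdout."""
    result = subprocess.run(
        ["ssindex", "-i", str(path)],
        check=True,
        capture_output=True,
    )
    return parse_ssindex_xml(result.stdout)
```

With the attached file, `index_spreadsheet("cash-flow.gnumeric")` would
hand back strings such as "January" and "Travelling:" once each.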
The only limitation seems to be the supported formats: running
`ssindex -i` you get this list:
application/x-gnumeric
application/csv
application/tab-separated-values
text/comma-separated-values
text/csv
text/spreadsheet
text/tab-separated-values
text/x-comma-separated-values
Not many, but honestly I was able to run it successfully on an
Excel file too, so maybe it also works on old formats like Lotus
1-2-3 and Quattro Pro. I have to find some of those files on the Net.
So, anyone ready to work on it?
Attachment:
cash-flow.gnumeric
Description: application/gnumeric