If we want to add support for different spreadsheet formats, we can add some filters using the Gnumeric support tools:

SSCONVERT

"It is a command line utility to convert spreadsheet files between various spreadsheet file formats." (from man ssconvert)

Supported formats:

  .as     Applix
  .oleo   Oleo
  .wb?    Quattro Pro
  .dbf    Xbase
          Paradox Database or Index
          SC/xspread
  .wk?    Lotus 123
  .dif    data interchange
  .mps    linear and integer programming
          MultiPlan (SYLK)
  .csv    comma separated values
  .tsv    tab separated values

Plus, of course, the .gnumeric format and MS Excel.

SSINDEX

"It is a command line utility to generate index data for various spreadsheet file formats." (from man ssindex)

This is a really good tool for our purpose. It's used by Beagle too. With ssconvert we would have to export the spreadsheet to a known format, extract the relevant content from it, and then delete the temporary files; ssindex does the hard job for us. Running `ssindex -i` on the attached Gnumeric spreadsheet gives:

<?xml version="1.0" encoding="UTF-8"?>
<gnumeric>
  <data>January</data>
  <data>January</data>
  <data>Travelling:</data>
  <data>Total:</data>
  <data>Buy new music:</data>
  <data>Drinking:</data>
  <data>Buy new RAM:</data>
  <data>Eating</data>
  <data>February</data>
  <data>February</data>
  <data>Travelling:</data>
  <data>Total:</data>
  <data>Buy new movies:</data>
  <data>Drinking:</data>
  <data>Buy printer ink:</data>
  <data>Eating</data>
</gnumeric>

All the relevant data is in XML, ready to be indexed. Numbers and formulas are purged. Maybe it could extract metadata too, but I wasn't able to save any in Gnumeric...
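To give an idea of how small such a filter could be, here is a rough Python sketch that runs `ssindex -i` and collects the strings from its output. It is only an illustration, not a finished filter: the function names `extract_text` and `index_spreadsheet` are made up for this example, and it assumes that ssindex is on the PATH and that its output always looks like the sample above (a root element containing `<data>` elements).

```python
import subprocess
import xml.etree.ElementTree as ET


def extract_text(ssindex_xml):
    """Return the non-empty text of every <data> element.

    Assumes the XML has the shape shown in the sample output:
    a root element with <data> children holding plain strings.
    """
    root = ET.fromstring(ssindex_xml)
    return [
        element.text.strip()
        for element in root.iter("data")
        if element.text and element.text.strip()
    ]


def index_spreadsheet(path):
    """Run `ssindex -i` on one file and return the strings it reports.

    Hypothetical helper: assumes ssindex is installed and on the PATH.
    Raises subprocess.CalledProcessError if ssindex exits with an error.
    """
    result = subprocess.run(
        ["ssindex", "-i", path],
        capture_output=True,  # keep stdout as bytes for the XML parser
        check=True,
    )
    return extract_text(result.stdout)
```

Calling `index_spreadsheet("cash-flow.gnumeric")` should then return the strings from the spreadsheet as a list, which the indexer could consume directly, with no temporary files to clean up.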
The only limitation seems to be the list of supported formats. Running `ssindex -m` gives this list:

  application/x-gnumeric
  application/csv
  application/tab-separated-values
  text/comma-separated-values
  text/csv
  text/spreadsheet
  text/tab-separated-values
  text/x-comma-separated-values

That's not much, but honestly I was able to run it successfully on an Excel file too, so maybe it also works on old formats like Lotus 123 and Quattro Pro. I have to find some of those files on the Net.

So, anyone ready to work on it?
Attachment:
cash-flow.gnumeric
Description: application/gnumeric