Re: Strings in gnumeric: implementing a gawk pipeline



Oliver Burnett-Hall wrote:
> You can use the if() and iserror() functions to cope with cases where
> the search string isn't found. This can make the formulas long,
> repetitive and unwieldy, but it does work.

This is just one of the problems (and indeed a very ugly one).

The real problem is that the spreadsheets I regularly work with have 70 to more than 100 columns and 1000 to more than 5000 rows. They basically contain patient data, most of which are strings (symptoms, ICD10 codes, procedures, ...).

I regularly have ~1000 patients coded in the spreadsheets, with a mean of 5 days per patient (one sheet holds the basic patient data and one is a detailed day-by-day expansion). I often need complex searches that are impossible with the current functions: regexp matching, converting strings to numbers and doing numerical operations on them, finding only one value per patient (in the multi-day sheet), and searching for strings in a particular order.
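To illustrate the kind of search meant here, the following is a minimal sketch only, not one of the actual scripts. It assumes a hypothetical CSV export of the multi-day sheet with the columns patient_id,day,icd10,temperature, no quoted commas, and temperatures stored as strings such as "38.6C". The function name, the column layout, the ICD10 pattern and the threshold are all invented for the example. It uses only POSIX awk features, so it should also run under gawk.

```shell
# Sketch: report the first day on which a patient has a matching ICD10 code
# together with a fever. Assumed (hypothetical) columns:
#   patient_id,day,icd10,temperature
first_hit_per_patient() {
    awk -F, '
        NR == 1    { next }   # skip the header row
        $1 in seen { next }   # this patient already produced a hit
        # regexp on the code, and string-to-number conversion of "38.6C"
        $3 ~ /^J1[0-8]/ && ($4 + 0) >= 38.0 {
            seen[$1] = 1
            print $1 "," $2 "," $3 "," ($4 + 0)
        }
    '
}
```

It reads the exported CSV on standard input, e.g. `first_hit_per_patient < days.csv`, and prints at most one line per patient, which is the "only one value per patient" case that is awkward to express with spreadsheet functions.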

I have also written a number of gawk scripts for this purpose. So I usually have to export the sheet as a text file (CSV), run the script and then import the result back. Running the script within gnumeric would be ideal: something like a menu entry (as for Tools -> Statistical Analysis -> Anova -> ...), e.g. 'Tools -> Scripts -> Run Gawk Script'.

Well, while security might be a concern:
- the user MUST explicitly invoke the script
- there could be an option to disable this
- gnumeric could scan the script for the system() command and point the user to this particular problem
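The scan mentioned in the last point could be as simple as the following sketch. The helper name is hypothetical and this is a plain text match, not a parser: it will also flag system() inside comments or string literals. Note too that awk can start external commands through pipes (print | "cmd" and "cmd" | getline), so a check for system() alone would not catch everything.

```shell
# Sketch: list the lines of a gawk script that appear to call system().
# Prints "line-number:line" for each hit; exits non-zero if nothing is found.
find_system_calls() {
    # "system" must not be the tail of a longer identifier (e.g. filesystem)
    grep -nE '(^|[^[:alnum:]_])system[[:space:]]*\(' "$1"
}
```

For example, `find_system_calls myscript.awk` (file name hypothetical) would list the suspicious lines so that they can be shown to the user before the script is run.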

Overall, I definitely believe that the benefits of running gawk within gnumeric far outweigh the potential drawbacks. Unfortunately, current spreadsheets are NOT designed for working with strings, BUT there is NO replacement to date, so I am stuck with these spreadsheets. And my solution does NOT really need huge changes within gnumeric, so I hope that it can be implemented quickly.

- Leo


