Re: csv (comma separated value) file
- From: "Freddie Unpenstein" <fredderic excite com>
- To: gtk-list gnome org
- Cc: liam holoweb net
- Subject: Re: csv (comma separated value) file
- Date: Tue, 04 Aug 2009 07:03:51 -0400
From: "Liam R E Quin", Date: 04/08/2009 03:21, Wrote:
> On Mon, 2009-08-03 at 12:45 -0400, Tristan Van Berkom wrote:
>> Currently it's pretty easy using g_file_get_contents()/g_strsplit()
> CSV files are not just comma separated, and in some cases can have
> column headers and other metadata. There's also escaping.
>
> a,b,c\d,e
> a,b,"c,d",e
> a;b;c,d;e
This would have been worth mentioning in the original post... ;)
> You also have to deal with differing line ending conventions.
GIO should be able to handle that, I'd have thought... Even if it doesn't, read the file, and look for a column separator, or NL or CR. If you get an NL or CR, ignore any that follow it and send the line so far where it belongs.
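A minimal sketch of that line-ending handling in plain C, assuming the whole file is already in a writable NUL-terminated buffer. The name split_lines is made up for illustration and is not a GLib or GIO call. Any run of CR and LF counts as one line break, so blank lines are dropped. It splits blindly, so it only suits files with no quoted line breaks inside cells.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: splits buf in place into lines, treating any run
   of '\r' and '\n' as a single line break (blank lines are dropped).
   Stores up to max line pointers in lines[] and returns how many. */
static size_t split_lines(char *buf, char **lines, size_t max)
{
    size_t n = 0;
    char *p = buf;

    while (n < max) {
        p += strspn(p, "\r\n");     /* ignore any CR/LF that follow */
        if (*p == '\0')
            break;
        lines[n++] = p;             /* start of the next line */
        p += strcspn(p, "\r\n");    /* run to the end of that line */
        if (*p != '\0')
            *p++ = '\0';            /* terminate it and step past */
    }
    return n;
}
```

Each entry in lines[] can then be handed to whatever parses the cells.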
> It's enough of a mess that both MS Office and most other
> office programs today seem to use XML instead :-)
> Probably gnumeric has code for this, though.
If it's coming from OO or Excel or something, and/or you need to be able to read arbitrary CSV files, then one of the ones that were suggested would likely do.
But if you know what you've got, it's probably easier to do it yourself anyhow. CSV isn't generally a particularly complex grammar, so all it usually takes is a cell parser that returns the first character that's not part of the cell. (It'll be either the cell separator, or the end of the line.) And if the parser you're using doesn't quite match the grammar you've got to parse, then you can get through all that work and find you've got to go back to the start and do it all yourself anyhow, or spend three times as long trying to hack in special cases and pre-parsing and what-not.
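Something along these lines is what I mean by a cell parser. It is only a sketch: parse_cell is a hypothetical name, and it assumes double-quote quoting with "" standing for a literal quote, which may not be what your file uses (backslash escaping as in the a,b,c\d,e example is not handled).

```c
#include <stddef.h>

/* Hypothetical cell parser. Copies one cell from src into out (at most
   outsz-1 bytes, always NUL-terminated; outsz must be at least 1) and
   returns a pointer to the first character that is not part of the
   cell: the separator, '\r', '\n', or the terminating '\0'.
   Assumes "..." quoting with "" as the escape for a literal quote. */
static const char *parse_cell(const char *src, char sep,
                              char *out, size_t outsz)
{
    size_t n = 0;

    if (*src == '"') {
        src++;                          /* skip the opening quote */
        while (*src != '\0') {
            if (*src == '"') {
                if (src[1] == '"') {
                    src++;              /* "" becomes one literal quote */
                } else {
                    src++;              /* closing quote */
                    break;
                }
            }
            if (n + 1 < outsz)
                out[n++] = *src;
            src++;
        }
    }
    /* Unquoted cell, or stray text after a closing quote. */
    while (*src != '\0' && *src != sep && *src != '\r' && *src != '\n') {
        if (n + 1 < outsz)
            out[n++] = *src;
        src++;
    }
    out[n] = '\0';
    return src;
}
```

The caller looks at the returned character: if it is the separator, step over it and parse the next cell; anything else means the line is finished.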
If the columns have strict formatting (like number, string, etc.), then a bunch of small cell parsers called in sequence will do it. Otherwise a generic cell parser that checks for and handles quoting and escaping is all that's needed. I've usually found it easier than fiddling with generic parsers. Even if the file has cell headings at the top, and columns can be re-arranged, just build an array with the parser to use for a given column, the location to stick the data in, and off you go.
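As a rough illustration of the array-of-parsers idea, here is a sketch for a made-up three-column layout (number, string, number). The record struct, the column table and all function names are invented for the example, the separator is hard-wired to a comma, and the string parser skips quoting for brevity.

```c
#include <stddef.h>
#include <stdlib.h>

#define NAME_LEN 32

/* Each cell parser stores its result through dest and returns a pointer
   to the first character that is not part of the cell. */
typedef const char *(*cell_parser)(const char *src, void *dest);

static const char *parse_long(const char *src, void *dest)
{
    char *end;
    *(long *)dest = strtol(src, &end, 10);
    return end;
}

static const char *parse_name(const char *src, void *dest)
{
    char *out = dest;
    size_t n = 0;

    while (*src != '\0' && *src != ',' && *src != '\r' && *src != '\n') {
        if (n + 1 < NAME_LEN)
            out[n++] = *src;
        src++;
    }
    out[n] = '\0';
    return src;
}

/* Hypothetical record layout for the example. */
struct record {
    long id;
    char name[NAME_LEN];
    long qty;
};

/* One entry per column: which parser to use, and where the data goes. */
struct column {
    cell_parser parse;
    size_t offset;
};

static const struct column columns[] = {
    { parse_long, offsetof(struct record, id)   },
    { parse_name, offsetof(struct record, name) },
    { parse_long, offsetof(struct record, qty)  },
};

/* Parses one line into rec. Returns 0 on success, or -1 if a separator
   is missing or there is trailing junk after the last column. */
static int parse_record(const char *line, struct record *rec)
{
    size_t ncols = sizeof columns / sizeof columns[0];

    for (size_t i = 0; i < ncols; i++) {
        line = columns[i].parse(line, (char *)rec + columns[i].offset);
        if (i + 1 < ncols) {
            if (*line != ',')
                return -1;
            line++;                 /* step over the separator */
        }
    }
    return (*line == '\0' || *line == '\r' || *line == '\n') ? 0 : -1;
}
```

If the file has a header row and the columns can move around, the same table could be filled in at run time from the header names instead of being a static array.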
Fredderic