Re: new VFS directory listing parser



Leonard den Ottolander wrote:
Hi Roland,

On Tue, 2005-09-27 at 01:27 +0200, Roland Illig wrote:

The current directory listing parser (for ftpfs and extfs) has problems with file names starting with white-space or a four-digit sequence.


The latter is caused by the fact that several time formats are
accepted: "Mon DD YYYY", "Mon DD hh:mm", "Mon DD YYYY hh:mm" and "Mon DD
hh:mm YYYY". As a result, the code cannot tell whether it has parsed the
full date after parsing the first 3 fields.

If you just drop the assumption that there can be more than 3 fields and
accept only "Mon DD YYYY" and "Mon DD hh:mm" as valid formats, the
existing code can quite easily be patched instead of being rewritten
entirely. See utilvfs.h around line 635 ("This is a special case for
ctime"). Add a variable got_time and stop parsing the date after either
got_time or got_year.
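
The suggestion above could look roughly like the following sketch. It is not the code from utilvfs; every function name here is hypothetical, and it assumes the listing line has already been split into white-space separated fields.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helpers for illustration only; none of them exist in mc. */

static int
all_digits (const char *s, size_t len)
{
    size_t i;

    for (i = 0; i < len; i++)
        if (!isdigit ((unsigned char) s[i]))
            return 0;
    return len > 0;
}

static int
looks_like_month (const char *s)
{
    static const char *const months[] = {
        "Jan", "Feb", "Mar", "Apr", "May", "Jun",
        "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"
    };
    size_t i;

    for (i = 0; i < sizeof (months) / sizeof (months[0]); i++)
        if (strcmp (s, months[i]) == 0)
            return 1;
    return 0;
}

static int
looks_like_day (const char *s)
{
    size_t len = strlen (s);

    return (len == 1 || len == 2) && all_digits (s, len);
}

static int
looks_like_year (const char *s)
{
    return strlen (s) == 4 && all_digits (s, 4);
}

static int
looks_like_time (const char *s)
{
    return strlen (s) == 5 && s[2] == ':'
        && all_digits (s, 2) && all_digits (s + 3, 2);
}

/* Returns the number of fields that belong to the date (3 on success,
 * 0 if the fields do not form "Mon DD YYYY" or "Mon DD hh:mm").
 * Parsing stops after the third field, so a following four-digit
 * field is left alone and can be treated as part of the file name. */
static int
count_date_fields (const char *const fields[], int nfields)
{
    int got_year, got_time;

    if (nfields < 3
        || !looks_like_month (fields[0]) || !looks_like_day (fields[1]))
        return 0;

    got_year = looks_like_year (fields[2]);
    got_time = looks_like_time (fields[2]);

    return (got_year || got_time) ? 3 : 0;
}
```

With the fields "Sep", "27", "2005", "2004", this sketch reports 3 date fields, so "2004" remains available as the start of the file name.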

After I had rewritten vfs_parse_filetype and vfs_parse_fileperms (see utilvfs.c), I found the interface of these functions general enough to be extended to the parsing of whole directory listings. I think one benefit of my code is that all the parsing functions are used the same way: if you have used any of them, you know how to use all the others. This was not the case with the old code.

My code makes it easy to add support for other file listing types, as every component of a listing line (filemode, size_or_rdev, filedate, uid, gid) has its own parsing function. If you look at the code of vfs_parse_unix_line(), this should become very clear.
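
To illustrate the idea of one parsing function per component with a shared calling convention, here is a minimal sketch. It is not the interface from utilvfs.c; the names and the convention (a cursor that is advanced only on success, and a 1/0 return value) are assumptions made for this example.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical convention: each parser reads from *cursor, stores its
 * result through the output pointer, advances *cursor and returns 1 on
 * success. On failure it returns 0 and leaves *cursor untouched. */

static int
sketch_parse_filetype (const char **cursor, char *type)
{
    const char *p = *cursor;

    if (*p == '\0' || strchr ("-dlbcps", *p) == NULL)
        return 0;
    *type = *p;
    *cursor = p + 1;
    return 1;
}

/* Accepts only plain "rwx" and "-" flags; setuid, setgid and sticky
 * letters are deliberately left out of this sketch. */
static int
sketch_parse_perms (const char **cursor, unsigned *perms)
{
    static const char flags[] = "rwxrwxrwx";
    const char *p = *cursor;
    unsigned bits = 0;
    int i;

    for (i = 0; i < 9; i++)
    {
        bits <<= 1;
        if (p[i] == flags[i])
            bits |= 1;
        else if (p[i] != '-')
            return 0;
    }
    *perms = bits;
    *cursor = p + 9;
    return 1;
}

/* Component parsers compose because they all behave the same way. */
static int
sketch_parse_mode (const char **cursor, char *type, unsigned *perms)
{
    const char *p = *cursor;

    if (!sketch_parse_filetype (&p, type) || !sketch_parse_perms (&p, perms))
        return 0;
    *cursor = p;
    return 1;
}
```

Under this convention, supporting another listing format would mean writing a new line-level function that calls the same component parsers in a different order.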

I also didn't like the function names. I expect that a function called is_year(...) does not modify its parameters (see the Java Coding Conventions). Most functions also used sscanf() for parsing, which is inappropriate, as sscanf("123XXXXXXXXXXXXXXX", "%d", &n) will succeed and store 123 in n, although "123XXXXXXXXXXXXXXX" does not represent a number.
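
One way to reject such input is to use strtol() and check where the conversion stopped. The helper below is only a sketch; parse_strict_int is a hypothetical name and not a function from mc.

```c
#include <ctype.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: parse the whole string as a decimal int.
 * Returns 1 and stores the value in *result on success, 0 otherwise. */
static int
parse_strict_int (const char *s, int *result)
{
    char *end;
    long value;

    /* strtol() would silently skip leading white-space; reject it. */
    if (s == NULL || *s == '\0' || isspace ((unsigned char) *s))
        return 0;

    errno = 0;
    value = strtol (s, &end, 10);

    /* No digits at all, or trailing garbage such as "123XXX". */
    if (end == s || *end != '\0')
        return 0;

    /* Out of range for long or for int. */
    if (errno == ERANGE || value < INT_MIN || value > INT_MAX)
        return 0;

    *result = (int) value;
    return 1;
}
```

Unlike the sscanf() call, parse_strict_int ("123XXXXXXXXXXXXXXX", &n) fails, while parse_strict_int ("123", &n) succeeds and stores 123.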

And yes, I have dropped that assumption. I have started to collect example listings from some FTP servers, but they are still too few. I will also examine the Indy FTP code (see http://www.indyproject.org/), which has some more examples.

Roland


