Re: Filetype identification in gmc




>From: mawarkus@t-online.de (Matthias Warkus)

>The problem here is that it doesn't scale, if you've got a directory
>with 10,000 files, you need to do 10,000 calls to file, which spawns
>10,000 processes one after another and does 10,000 open() calls.
>
>Over a network, this could take hours.


This isn't a very good argument.  Why can't it just do a single "file *" in 
each directory, and then parse the output?  Make sure you don't overrun any 
buffers (in your 10,000 file case), but you get 1 call to file, spawning 1 
process, which returns quite quickly, even over a network.  The time scales 
with the number of files, and is not hugely longer than what we have now.  And 
it uses a system resource, rather than reinventing the wheel, and making it 
triangular, as we have now.

Note that, at least on Solaris, you have options to (a) use your own magic file 
(so users can extend the list without needing root access) and (b) give it a 
list of files, for file selections and in case the "*" option isn't quite right.
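
For what it's worth, here is a small Python sketch of driving those two options. I'm assuming the usual spellings, -m for the magic file and -f for the file holding the list of names. Check your own man page, since whether -m adds to the system magic or replaces it differs between implementations. The helper names and "my.magic" are made up for illustration.

```python
import os
import subprocess
import tempfile
from typing import List, Optional, Sequence


def write_name_list(paths: Sequence[str]) -> str:
    """Write one path per line to a temporary file and return its name."""
    fd, list_path = tempfile.mkstemp(prefix="names-", suffix=".txt", text=True)
    with os.fdopen(fd, "w") as handle:
        for path in paths:
            handle.write(path + "\n")
    return list_path


def build_file_command(list_path: str,
                       magic_path: Optional[str] = None) -> List[str]:
    """Build the argument vector for one `file` run over a list of names."""
    command = ["file"]
    if magic_path is not None:
        # Assumed flag: -m names a user-supplied magic file.
        command += ["-m", magic_path]
    # Assumed flag: -f names a file containing the paths to examine.
    command += ["-f", list_path]
    return command


if __name__ == "__main__":
    # Hypothetical selection: every regular file in the current directory.
    selection = [name for name in sorted(os.listdir("."))
                 if os.path.isfile(name)]
    names_file = write_name_list(selection)
    try:
        proc = subprocess.run(build_file_command(names_file),
                              capture_output=True, text=True, check=False)
        print(proc.stdout, end="")
    finally:
        os.remove(names_file)
```

Going through a list file also sidesteps the argument-length limit, since the names never appear on the command line.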

Have fun,

-------My opinion - Not sane, intelligent or necessarily useful-------
o o                                      mailto:Moredhel@earthling.net 
/v\ark R. Bowyer.  http://i.am/Moredhel  mailto:Mark.Bowyer@UK.Sun.COM
`-'         "Everything is true, for a given value of 'true'" - PTerry



