Re: Suggestion for file type detection approach
- From: Brian Nitz <Brian Nitz Sun COM>
- To: Arvind Narayanan <arvindn meenakshi cs iitm ernet in>
- Cc: gnome-devel-list gnome org
- Subject: Re: Suggestion for file type detection approach
- Date: Mon, 05 Jan 2004 18:05:08 +0000
I agree that extensions are much too crude for file association, though
they may be useful as a clue to speed up sniffing (e.g. if I have a
suspicion what the filetype is, at least I'll know where to look in the
content for proof). I also like Sean Middleditch's idea for an error
message offering to correct the file extension (to improve compatibility
with legacy OSes ;-)
I've been wondering about another possibility which could improve security:
1) When an application is installed, it registers a private key in the
application registry and enters itself as capable of opening certain
2) When the application creates a document, it signs the document with
its private key.
3) When gnome-vfs is looking for a document handler, it can determine
whether the file has a handler and it can look further (if security is
set to "high") to determine whether the file was signed by a trusted
4) If security is set to "high" and the document signature doesn't match
a trusted application, a popup can ask what to do with untrusted content.
This should eliminate the possibility of *.doc.scr and *.jpg.reg worms.
Obviously we would have to fall back on extension-clued content
sniffing for existing and cross-platform docs and warn the user that
this is untrusted.
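
A rough sketch of steps 2 to 4, just to make the idea concrete. Everything
here is an assumption for illustration: the registry is a plain dict, the
names (TRUSTED_KEYS, sign_document, decide) are made up, and HMAC with a
shared secret stands in for a real public-key signature, which the Python
standard library does not provide. It is not gnome-vfs code.

```python
import hashlib
import hmac

# Hypothetical application registry: application name -> signing key.
# HMAC is only a stand-in; the proposal calls for a private/public key pair.
TRUSTED_KEYS = {"example-editor": b"example-editor-secret"}


def sign_document(app, content):
    """Step 2: the creating application signs the document content."""
    return hmac.new(TRUSTED_KEYS[app], content, hashlib.sha256).hexdigest()


def signed_by_trusted_app(content, signature):
    """Step 3: return the trusted application whose key matches, or None."""
    for app, key in TRUSTED_KEYS.items():
        expected = hmac.new(key, content, hashlib.sha256).hexdigest()
        if hmac.compare_digest(expected, signature):
            return app
    return None


def decide(content, signature, security="high"):
    """Steps 3-4: open directly, or ask the user about untrusted content."""
    if security != "high":
        return "open"
    if signature is not None and signed_by_trusted_app(content, signature):
        return "open"
    # Unsigned, tampered, or signed by an unknown application.
    return "ask-user"
```

An unsigned legacy or cross-platform document comes out as "ask-user" at
high security, which is where the extension-clued sniffing and the
untrusted warning would kick in.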
Also AFAIK, OSX has two levels of file association:
- The default view/open application for a file type
- The creator of a file.
This means you can have 2 JPEG images which open with different
applications: one will open with Photoshop (because it was created with
Photoshop), the other will open with iview (or whatever) because it has
no creator. You could argue about how intuitive this behavior is; it's
probably more useful to content creators than content viewers.
Arvind Narayanan wrote:
My 2 cents:
* It is fairly common to get misnamed files. For instance, a webserver
has a cgi script that generates a pdf file on the fly and the browser
prompts the user to save it as .cgi. I have seen users become totally
confused by this. Had they been using nautilus they would have opened
the file without any problem.
* I'm not sure if it is "natural" for users to associate file type with
file extension. Even if it is, it's just not feasible on Unix. On
MS-Windows it is OK because the OS has an idea about file types, but
on Unix it would definitely be an ugly hack.
* The speed depends greatly on the type of files. If they are mostly
folders, it is very fast but if they are executables or mp3s it is very
slow. The result is that performance is acceptable most of the time
but is very bad some of the time.
* Nothing is displayed before file type is completely determined for all
the files in the directory. IMHO changing this would greatly alleviate
the speed problem. Do file type detection only for files in the current
* Caching would also lead to a big speed boost. At a minimum, sniffed
types for files in the current navigation stack should be cached, so
that back and forward are instantaneous.
* Another suggestion for speed: Use file type as a "clue" for sniffing.
What I mean is:
# If the file ends in .tar:
- First check if it has a tar header. If yes declare it as a tar file
- If no then do a full sniffing to check for each of the other types
# If the file is in /somepath/bin/
- First check if it is an ELF executable.
- If not do a full check
This would make sniffing instantaneous for *most* files.