Re: Why file content sniffing sucks
- From: "W. Michael Petullo" <mike flyn org>
- To: nautilus-list gnome org
- Subject: Re: Why file content sniffing sucks
- Date: Wed, 24 Dec 2003 11:50:24 -0600
> - The information generated by it is proven to not be accurate enough to
> be used by a program to determine its actions (the above example and
> the dozens of related bugs in bugzilla are sufficient). This data
> should be used merely for informational purposes to the user in the
> Properties dialog, for example.
The point I have been trying to make for a while is that systems should
have ONE magic database. This database should be shared by file, GNOME,
KDE, etc. Maintaining a database that correctly identifies types, given
the diverse range of files out there, is a big problem. One database
should be maintained by all so that it is as accurate as possible.
Note that there are GNOME bugzilla bugs that (as a sub-point) ask,
"I've gotten GNOME to identify file X as type Y, now how do I do the
same for the file command?"
> - It reduces performance of Nautilus to a dead turtle:
This is probably due to bad design rather than the idea itself.
The file command, for example, compiles its magic database into a format
that can be read more quickly. Doing the same could speed things up, and
so could some kind of intelligent cache system.
For what it's worth, here is what the author of the file command (the
version usually found in Linux distributions at least) said about
modifying file to use the freedesktop.org magic/mime database
specification:
Yes, I've seen it. I don't know if parsing xml is really worth it.
I'd rather have something compile xml into a format that can be
parsed quickly. file even goes so far as to compile the magic entries so
that it does not have to do any work reading them. The other
issue I have with it is the priority code. I think that there
should be something in file computing the strength of each
magic number depending on the length and a frequency map,
and auto-sorting magic entries. I am planning this for the
next version of file. I don't like depending on the extensions
of files. What I do like is the ability to utilize the file
database to produce different kinds of output (mime, text, etc)
which the xml stuff gives you and file kludges horribly. So yes,
I am not happy about the format of file entries, but I am also
not happy about the way the xml stuff was done. I mentioned this
years ago to the shared-mime-info folks, but I don't think they
understood what I was saying with respect to generalizing file
to handle things such as tiff or jpg files properly and sorting
magic according to strength.
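
One possible reading of "strength depending on the length and a frequency map" can be sketched in Python as below. This is my own guess at the idea, not the file author's algorithm: the scoring formula (sum of negative log probabilities of each byte), the add-one smoothing, and all function names are assumptions for illustration.

```python
import math
from collections import Counter


def byte_frequencies(samples):
    """Build a relative-frequency map of byte values from sample data."""
    counts = Counter()
    total = 0
    for data in samples:
        counts.update(data)  # iterating bytes yields integers 0..255
        total += len(data)
    # Add-one smoothing so unseen bytes do not get probability zero.
    return {b: (counts[b] + 1) / (total + 256) for b in range(256)}


def strength(magic, freq):
    """Score a magic string as the sum of -log2 p(byte).

    Longer strings and rarer bytes both raise the score.
    """
    return sum(-math.log2(freq[b]) for b in magic)


def sort_by_strength(entries, freq):
    """Return (magic, label) entries ordered strongest first."""
    return sorted(entries, key=lambda e: strength(e[0], freq), reverse=True)
```

Under this scoring, a long signature made of uncommon bytes is tried before a short one made of common bytes, so the ordering falls out of the data instead of hand-assigned priorities.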
--
Mike
:wq