Re: GLib file magics



On 28 Jul 2000, Maciej Stachowiak wrote:

> Tim Janik <timj@gtk.org> writes:
> 
> > On 27 Jul 2000, Maciej Stachowiak wrote:
> > 
> > ok, first, when i pondered a suitable solution for BSE, gnome-mime didn't
> > handle magics (and was sitting in gnome-libs, iirc).
> 
> Yes, it used to be in gnome-libs, but it has handled magic-style
> matching for an extremely long time, a time which I am quite certain
> predates BSE. You must not have looked very hard.

maybe i didn't look hard enough, but for what it's worth, BSE predates GNOME.

> > so now, i had a look at gnome-vfs-mime-magic.c, and to not require the
> > rest of the gtk-devel audience to compare literal code as well, i've
> > compiled a rough pro/con list for both approaches:
> > 
> > gnome-vfs-mime-magic.c:
> > -       closely tied to file io (even so for the mime specification)
> 
> This is lame, and we are planning to add versions of the interfaces
> that can work on in-memory buffers (and tell you how much of a prefix you
> need to do all the magic tests, or have some kind of callback setup).

good.

> > -       misses 'u' prefix for types
> > -       doesn't handle any of the >, <, x, &, ^ test checks
> 
> I have no idea what these are, I assume this is for additional kinds
> of tests. We have not missed these in defining any types so far, as
> far as I know, so I am not sure why that's especially bad.

take a quick look at your /usr/share/[misc/]magic file, e.g.:

0       string          !<arch>\ndebian
>8      string          debian-split    part of multipart Debian package
>8      string          debian-binary   Debian binary package
>68     string          >\n             (format %s)
0       string          \351,\001JAM\           JAM archive,
>7      string          >\0                     version %.4s
20      lelong          0xfdc4a7dc      Zoo archive data
>4      byte            >48             \b, v%c.
>>6     byte            >47             \b%c
>>>7    byte            >47             \b%c

just to name a few.
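
for anyone not wanting to dig through magic(5): in the third column, a
leading >, <, & or ^ turns a numeric test into a comparison or bit test
against the value read from the file, and x matches anything (the > and >>
in the first column are continuation levels, not operators). a rough sketch
of those checks in C, with made-up names that are neither GMagic nor
gnome-vfs API, and treating every value as unsigned for brevity:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names for illustration; not GMagic or gnome-vfs API.
 * The enum values are the operator characters used in magic(5). */
typedef enum {
  TEST_EQUAL   = '=',  /* file value must equal the test value */
  TEST_LESS    = '<',  /* file value must be less than the test value */
  TEST_GREATER = '>',  /* file value must be greater than the test value */
  TEST_ALL_SET = '&',  /* all bits of the test value are set in the file value */
  TEST_ANY_CLR = '^',  /* at least one bit of the test value is clear */
  TEST_ANY     = 'x'   /* any value matches */
} MagicTestOp;

/* Compare a value read from the file against the value from the magic
 * entry. Assumption: unsigned comparison only (the 'u'-prefixed types). */
int
magic_test_value (MagicTestOp op, uint32_t file_value, uint32_t test_value)
{
  switch (op)
    {
    case TEST_EQUAL:   return file_value == test_value;
    case TEST_LESS:    return file_value <  test_value;
    case TEST_GREATER: return file_value >  test_value;
    case TEST_ALL_SET: return (file_value & test_value) == test_value;
    case TEST_ANY_CLR: return (file_value & test_value) != test_value;
    case TEST_ANY:     return 1;
    }
  return 0;
}

/* Byte test at a fixed offset, e.g. the ">4 byte >48" line above.
 * Returns 0 if the offset lies beyond the buffer. */
int
magic_test_byte (const unsigned char *buf, size_t len, size_t offset,
                 MagicTestOp op, uint32_t test_value)
{
  assert (buf != NULL);
  if (offset >= len)
    return 0;
  return magic_test_value (op, buf[offset], test_value);
}
```

with that, the "byte >48" check is magic_test_byte (buf, len, 4,
TEST_GREATER, 48), which succeeds for any byte above ASCII '0'.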

> > +       features offset ranges (non magic(5)), with NUM:NUM
> > +       can dump mime table
> > +       handles date, ledate types
> > 
> > GMagic:
> > +       implements the important subset of mime(5) including
> >         comprehensive numerical tests
> 
> When you say mime(5), do you mean magic(4) or do I have a wildly
> different set of man pages from you?

sorry, screwed that one, s/mime(5)/magic(5)/ ;)
the mime man page is mime(1), the magic man page is magic(5) here.

> > +       provides size type extension (required by gimp)
> > +       accounts for match collisions with priorities
> 
> The gnome-vfs-mime interface does too, I think, although priority is
> implicit in the ordering of the list of magic rules.

yeah, so for gimp or bse that means that ordering depends on the
order in which plugins are loaded and therefore magics get registered,
that's not suitable.
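
to illustrate: if every magic carries an explicit priority, the winner of
a collision no longer depends on who registered first. a minimal sketch,
assuming prefix-only magics and a hypothetical MagicEntry struct (this is
not the actual GMagic interface); the data member is whatever the caller
wants handed back, e.g. a loader id:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical entry layout for illustration; not the real GMagic struct. */
typedef struct {
  const char *magic;      /* bytes expected at offset 0 */
  size_t      magic_len;  /* number of bytes in magic */
  int         priority;   /* higher value wins when several entries match */
  void       *data;       /* user-defined result, opaque to the matcher */
} MagicEntry;

/* Return the data pointer of the highest-priority matching entry, or NULL
 * if nothing matches. Entries with equal priority fall back to table
 * order, so colliding magics should be given distinct priorities. */
void *
magic_match_best (const MagicEntry *entries, size_t n_entries,
                  const unsigned char *buf, size_t len)
{
  const MagicEntry *best = NULL;
  size_t i;

  assert (entries != NULL);
  assert (buf != NULL);

  for (i = 0; i < n_entries; i++)
    {
      const MagicEntry *e = &entries[i];

      /* skip entries that need more bytes than we have, or do not match */
      if (e->magic_len > len || memcmp (buf, e->magic, e->magic_len) != 0)
        continue;
      if (best == NULL || e->priority > best->priority)
        best = e;
    }
  return best ? best->data : NULL;
}
```

the same buffer then yields the same result whether the more specific
entry was registered first or last.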

> > +       provides a generic interface (no global list, return
> >         values are user-defined)
> 
> I claimed earlier that this is not particularly an advantage. Maybe I
> am wrong, but I'm not sure why.

i tried to outline that in my last mail.

> > > The current code does assume a
> > > central list of magic info and filename/extension patterns, and always
> > > returns a mime type rather than user-defined data, but I see no
> > > particular reason to allow either the match set or the return value to
> > > be user-supplied.
> > 
> > all in all, i'll certainly not move that code into glib, it'd
> > not be suitable for gimp (doesn't feature normal mime(5) numeric
> > matches), pixbuf (and can't operate on byte streams) or bse (doesn't
> > provide the magic table in a file, gnome-mime relies on ordered
> > magic registration since it doesn't account for collisions).
> > it looks more like you'd get a huge win out of basing the
> > mime magic matching backend on GMagic and simply use GMagic's
> > data pointer for mime type specs. that is, once GMagic
> > features ranges, though i'll probably not add those if i don't
> > see backend reuse intent from the gnome-mime side.
> 
> We could only do this if it were possible to implement all the
> gnome-mime features based on it. 

as i said, if there's a clear intent to base gnome-mime magics on
GMagic, i'll add string matches within specified ranges.

> > as for why a user-supplied gpointer data; member is useful,
> > well, gnome-mime can stuff its mime-type string/struct there,
> > BSE can store a procedure type of a loader function there,
> > gimp can use that for a PDB proc identifier, and gmagictest
> > can use it for storing messages, i do see a use there ;)
> 
> It seems more logical to me to map a mime type to a loader after
> determining mime type, than to require everything to construct its
> own magic table mapping to loaders. One benefit of this is that even
> if you don't know how to load a file type, the centralized file type
> database knows what it is, and you can use that info to give a nice
> error message, like "The GIMP cannot load files of type
> image/x-proprietary-patented-microsoft-format" instead of just bombing
> generically. In fact, gnome-vfs also provides a layer for translating
> the mime type into a human-readable string.

that'd either mean that plugins can only load files whose mime
types are already known, and in some cases there'd have to be
different mime types for different loaders that just happen to
implement loading facilities for different versions of the
"same" file format, or mime types have to be dynamically addable
by the plugins, as well as magics that just map to those new
mime types, introducing an extra layer of indirection that's
completely bogus from the point of the plugin implementation.
not to mention that all gimp saving/loading plugins would have
to be converted to provide mime types in addition to the magics
they currently come with.

> Anyway, I was hoping to reduce the amount of wheel reinvention going
> on here but I guess I have failed. I'm looking forward to the "which
> API should I use to determine file types" questions from developers.
> :-)

sure, the intent was good. and at least you pointed my nose at the
gnome-mime magic implementation and had me compare both approaches.
in the end GMagic might benefit from that by featuring ranges, and
gnome-mime might benefit from that by featuring the standard mime
match set as well (and hopefully use a speedier backend for the
match attempts).

for developers that are unsure about how to retrieve file types,
well, there's gnome-mime with a rich set of standard mime types,
go ahead, match your files! ;)
if they are masochistic enough to provide all magics on their own,
well, then we at least provide a decent matching backend for that
in glib, i think the distinction is clear enough ;)

> 
>  - Maciej
> 

---
ciaoTJ




