Re: Questions about MIME type detection



I wrote:

> I have some questions regarding how MIME type detection in GNOME works
> in detail.

I have now written up what I found in a Wikibooks chapter (sorry, it is
in German) about MIME type detection in GNOME:
http://de.wikibooks.org/wiki/Linux-Kompendium:_Dateitypen_%E2%80%93_die_MIME-Datenbank_in_GNOME


Some additional notes about the text type detection:

IMO the function should be streamlined:

* look for zero bytes -> binary
* if the current locale's encoding isn't multibyte -> text, since
  everything except zeros is allowed in text files
* if the encoding isn't UTF-8, try to decode the buffer using mbrtowc();
  if successful -> text
* try to decode the buffer using
  g_utf8_validate()/g_utf8_get_char_validated(); if successful -> text;
  else -> binary
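
The steps above could be sketched roughly as follows. This is only an illustration, not the existing GNOME code: looks_like_text(), decodes_in_locale() and is_valid_utf8() are made-up names, and the hand-written is_valid_utf8() merely stands in for g_utf8_validate() so that the example compiles without GLib. The "encoding isn't UTF-8" test is left out, on the assumption that it only saves work and should not change the result.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: true if the whole buffer decodes in the current
 * locale's multibyte encoding. A character cut off at the end of the
 * buffer counts as a failure here; a real sniffer might tolerate it. */
static bool decodes_in_locale(const char *buf, size_t len)
{
    mbstate_t state;
    memset(&state, 0, sizeof state);
    while (len > 0) {
        size_t n = mbrtowc(NULL, buf, len, &state);
        if (n == (size_t)-1 || n == (size_t)-2)
            return false;          /* invalid or incomplete sequence */
        if (n == 0)
            n = 1;                 /* NUL byte; excluded earlier anyway */
        buf += n;
        len -= n;
    }
    return true;
}

/* Stand-in for g_utf8_validate(): strict UTF-8 check that rejects
 * overlong forms, surrogates and code points above U+10FFFF. */
static bool is_valid_utf8(const unsigned char *s, size_t len)
{
    size_t i = 0;
    while (i < len) {
        unsigned char c = s[i];
        size_t extra;
        unsigned long cp, min;
        if (c < 0x80) { i++; continue; }
        else if ((c & 0xE0) == 0xC0) { extra = 1; cp = c & 0x1F; min = 0x80; }
        else if ((c & 0xF0) == 0xE0) { extra = 2; cp = c & 0x0F; min = 0x800; }
        else if ((c & 0xF8) == 0xF0) { extra = 3; cp = c & 0x07; min = 0x10000; }
        else return false;         /* stray continuation or invalid lead byte */
        if (len - i <= extra)
            return false;          /* sequence runs past the buffer */
        for (size_t k = 1; k <= extra; k++) {
            if ((s[i + k] & 0xC0) != 0x80)
                return false;
            cp = (cp << 6) | (s[i + k] & 0x3F);
        }
        if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            return false;
        i += extra + 1;
    }
    return true;
}

/* Sketch of the proposed order of checks. */
bool looks_like_text(const char *buf, size_t len)
{
    if (memchr(buf, '\0', len) != NULL)
        return false;              /* zero byte -> binary */
    if (MB_CUR_MAX == 1)
        return true;               /* single-byte locale -> text */
    if (decodes_in_locale(buf, len))
        return true;               /* decodes in the locale's encoding -> text */
    return is_valid_utf8((const unsigned char *)buf, len);
}
```

The result depends on the locale the program has selected with setlocale(); without such a call the program runs in the "C" locale, which is single-byte, so only the zero-byte check applies.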


As an alternative, the function could stick with the recommendation from XDG:

* look for values 0x00 to 0x1f or 0x7f, except TAB/LF/CR; if found -> binary

I think this would make things simpler and faster, and multibyte
encodings don't use these byte values except to encode the control
characters themselves. Don't bother checking whether the buffer decodes
as a multibyte encoding, since it might be ISO 8859-1 or something like
that.
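
A minimal sketch of this simpler control-character check, again only as an illustration: looks_like_text_xdg() is a made-up name, and how many bytes of the file to inspect is left to the caller.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical name. Returns false (binary) as soon as a control byte
 * other than TAB, LF or CR is found; bytes >= 0x80 are accepted without
 * any attempt to decode them. */
bool looks_like_text_xdg(const unsigned char *buf, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        unsigned char c = buf[i];
        if (c == '\t' || c == '\n' || c == '\r')
            continue;              /* allowed control characters */
        if (c < 0x20 || c == 0x7f)
            return false;          /* other control byte -> binary */
    }
    return true;
}
```

This is a single pass over the buffer and does not depend on the locale, which is where the gain in simplicity and speed would come from.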


Either way, the function doesn't do much more than guess, and it is not
user-configurable, so the result should be treated as unreliable. Make
globbing rules override it in every case, not just for subtypes of
text/plain. Don't let Nautilus complain about security risks, since it
is never a security risk if a binary file "looks like text".

Thank you for listening,

Redoute


