Re: Questions about MIME type detection



Alexander Larsson wrote:

>This was a bug, fixed in cvs head.

Thank you very much. I will wait until this shows up in Ubuntu.

Please let me repeat my first question: How does GNOME decide if a given file is
a text file?

Is the following description correct? (Please point me to any documentation
other than the source.)

- It analyses a buffer with the first 256 bytes of the file

- If the file is empty -> text

- (new) If the buffer contains any zero byte -> not text

- If the buffer passes the test in g_utf8_validate() -> text
  - I assume g_utf8_validate() checks for valid byte sequences that are 
    decodable as UTF-8
  - Does it additionally check if the resulting codepoints are valid/printable 
    Unicode codepoints? Which codepoints are allowed?

- The second part of _gnome_vfs_sniff_buffer_looks_like_text is compiled 
  conditionally:
  #if defined(HAVE_WCTYPE_H) && defined (HAVE_MBRTOWC)
  - WC = wide char? MB = multibyte? Which character encodings are covered by 
    this part? Is it a CJK issue? Is this part included in an ordinary 
    European Ubuntu system?
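
To make clear what I think I am describing, here is my reading of the steps
above as a small C sketch. It is not the gnome-vfs source. The names
looks_like_text, utf8_valid and locale_text are made up for illustration, and
utf8_valid is a simplified hand-written stand-in for g_utf8_validate(). Treating
a multibyte character that is cut off at the end of the buffer as text is my
assumption, as is the use of iswprint()/iswspace() in the last step. The caller
is assumed to pass at most the first 256 bytes of the file.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>
#include <wchar.h>
#include <wctype.h>

/* Simplified stand-in for g_utf8_validate(): accepts well-formed UTF-8 only
 * (no overlong forms, no surrogates, nothing above U+10FFFF).
 * It does not look at whether a code point is printable. */
static bool utf8_valid(const unsigned char *p, size_t len)
{
    size_t i = 0;
    while (i < len) {
        unsigned char c = p[i];
        size_t extra;
        unsigned long cp, min;

        if (c < 0x80) { i++; continue; }                  /* plain ASCII */
        else if ((c & 0xE0) == 0xC0) { extra = 1; cp = c & 0x1F; min = 0x80; }
        else if ((c & 0xF0) == 0xE0) { extra = 2; cp = c & 0x0F; min = 0x800; }
        else if ((c & 0xF8) == 0xF0) { extra = 3; cp = c & 0x07; min = 0x10000; }
        else return false;                                /* bad lead byte */

        if (len - i - 1 < extra) return false;            /* sequence cut off */
        for (size_t k = 1; k <= extra; k++) {
            if ((p[i + k] & 0xC0) != 0x80) return false;  /* bad continuation */
            cp = (cp << 6) | (p[i + k] & 0x3F);
        }
        if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            return false;
        i += extra + 1;
    }
    return true;
}

/* Assumed shape of the conditional part: decode the buffer in the current
 * locale's multibyte encoding and require printable or whitespace chars. */
static bool locale_text(const unsigned char *buf, size_t len)
{
    mbstate_t st;
    size_t i = 0;

    memset(&st, 0, sizeof st);
    while (i < len) {
        wchar_t wc;
        size_t n = mbrtowc(&wc, (const char *)buf + i, len - i, &st);

        if (n == (size_t)-1) return false;  /* invalid in this locale */
        if (n == (size_t)-2) return true;   /* cut off at buffer end (assumption) */
        if (n == 0) return false;           /* decoded a NUL character */
        if (!iswprint((wint_t)wc) && !iswspace((wint_t)wc)) return false;
        i += n;
    }
    return true;
}

/* buf holds the sniffed prefix of the file, len its length in bytes. */
bool looks_like_text(const unsigned char *buf, size_t len)
{
    if (len == 0) return true;                   /* empty file -> text */
    if (memchr(buf, 0, len) != NULL) return false; /* zero byte -> not text */
    if (utf8_valid(buf, len)) return true;       /* valid UTF-8 -> text */
    return locale_text(buf, len);                /* locale-dependent fallback */
}
```

In this sketch the last step depends on the locale the program runs in, so the
same non-UTF-8 file could be classified differently on differently configured
systems. That is the part I am least sure about.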

Again: Thank you very much!

Redoute


