Re: Questions about MIME type detection



On Tue, 2006-08-22 at 13:43 +0200, Redoute wrote:
> Alexander Larsson wrote:
> 
> >This was a bug, fixed in cvs head.
> 
> Thank you very much. I will wait until this shows up in Ubuntu.
> 
> Please let me repeat my first question: How does GNOME decide if a given file is
> a text file?
> 
> Is this description correct (please guide me if there is other documentation
> than the source):
> 
> - It analyses a buffer with the first 256 bytes of the file
> 
> - If the file is empty -> text
> 
> - (new) If the buffer contains any zero byte -> no text
> 
> - If the buffer passes the test in g_utf8_validate() -> text
>   - I assume g_utf8_validate() checks for valid byte sequences that are 
>     decodable as UTF-8
Yes
>   - Does it additionally check if the resulting codepoints are valid/printable 
>     Unicode codepoints? Which codepoints are allowed?
No
> 
> - the second part of _gnome_vfs_sniff_buffer_looks_like_text is compiled 
>   conditionally:
>   #if defined(HAVE_WCTYPE_H) && defined (HAVE_MBRTOWC)
>   - WC = wide char? MB = multibyte? Which character encodings are covered by 
>     this part? Is it a CJK issue? Is this part included in an ordinary 
>     European Ubuntu system?

This part tests whether the text is valid in the current locale's encoding.
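
Putting the steps above together, the check can be sketched roughly as follows. This is a minimal illustration in plain C, not the actual gnome-vfs code: looks_like_text() and utf8_is_valid() are hypothetical names, and utf8_is_valid() is only a stand-in for g_utf8_validate(), so the sketch builds without GLib. The caller is assumed to pass in the first bytes of the file (256 in the description above).

```c
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical stand-in for g_utf8_validate(): accepts only well-formed
 * UTF-8 (no overlong forms, no surrogates, nothing above U+10FFFF).
 * It does not check whether the code points are printable. */
static int utf8_is_valid(const unsigned char *p, size_t len)
{
    size_t i = 0;

    while (i < len) {
        unsigned char c = p[i];
        unsigned long cp, min;
        size_t n, k;                     /* n = continuation bytes expected */

        if (c < 0x80) { i++; continue; } /* plain ASCII */
        else if ((c & 0xE0) == 0xC0) { n = 1; cp = c & 0x1F; min = 0x80; }
        else if ((c & 0xF0) == 0xE0) { n = 2; cp = c & 0x0F; min = 0x800; }
        else if ((c & 0xF8) == 0xF0) { n = 3; cp = c & 0x07; min = 0x10000; }
        else return 0;                   /* invalid lead byte */

        if (len - i <= n)
            return 0;                    /* sequence cut off by end of buffer */
        for (k = 1; k <= n; k++) {
            if ((p[i + k] & 0xC0) != 0x80)
                return 0;                /* not a continuation byte */
            cp = (cp << 6) | (p[i + k] & 0x3F);
        }
        if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            return 0;
        i += n + 1;
    }
    return 1;
}

/* Sketch of the heuristic: returns 1 for "text", 0 for "not text". */
static int looks_like_text(const unsigned char *buf, size_t len)
{
    const char *p = (const char *)buf;
    const char *end = p + len;
    mbstate_t state;

    if (len == 0)
        return 1;                        /* empty file -> text */
    if (memchr(buf, 0, len) != NULL)
        return 0;                        /* any zero byte -> not text */
    if (utf8_is_valid(buf, len))
        return 1;                        /* valid UTF-8 -> text */

    /* Fallback: is the buffer valid in the current locale's encoding? */
    memset(&state, 0, sizeof state);
    while (p < end) {
        wchar_t wc;
        size_t n = mbrtowc(&wc, p, (size_t)(end - p), &state);

        if (n == (size_t)-1)
            return 0;                    /* invalid multibyte sequence */
        if (n == (size_t)-2)
            break;                       /* incomplete at end: tolerated here
                                            (a simplification of this sketch) */
        if (n == 0)
            return 0;                    /* NUL; already excluded above */
        p += n;
    }
    return 1;
}
```

Note that mbrtowc() only uses the user's encoding if the program has called setlocale(LC_CTYPE, "") beforehand; otherwise the fallback runs in the "C" locale.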


=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
 Alexander Larsson                                            Red Hat, Inc 
                   alexl redhat com    alla lysator liu se 
He's a gun-slinging drug-addicted dwarf haunted by memories of 'Nam. She's a 
cold-hearted French-Canadian wrestler fleeing from a Satanic cult. They fight 
crime! 



