Re: UTF-8



Sven Neumann <sven@gimp.org> writes:

> The standard 'file' utility seems to do a decent job at detecting
> UTF-8 encoded files. It fails to distinguish some other encodings
> correctly but some quick tests I did showed no false positives or
> negatives for UTF-8.

It will fail if a typical UTF-8 sequence does not occur early enough
(within the first 16 KB?).  But I admit that, at least in German, you are
likely to encounter an 8-bit character within the first 16 KB.
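
To illustrate the failure mode, here is a minimal Python sketch of a
detector that only looks at the start of the data.  It is not how 'file'
is actually implemented; the function name sniff_encoding, the labels it
returns, and the 16 KB window (taken from my guess above) are all
assumptions made for the example.

```python
import codecs

# Assumed sniffing window, taken from the "16 KB?" guess above;
# the real limit used by 'file' is not confirmed here.
SNIFF_BYTES = 16 * 1024


def sniff_encoding(data: bytes, window: int = SNIFF_BYTES) -> str:
    """Classify the first `window` bytes as 'ascii', 'utf-8' or 'unknown-8bit'.

    Hypothetical helper for illustration only.
    """
    head = data[:window]
    if all(b < 0x80 for b in head):
        # No 8-bit byte inside the window: the sample cannot tell
        # plain ASCII apart from UTF-8 with late non-ASCII characters.
        return "ascii"
    decoder = codecs.getincrementaldecoder("utf-8")()
    try:
        # final=False tolerates a multibyte sequence that is cut off
        # at the edge of the window.
        decoder.decode(head, final=False)
    except UnicodeDecodeError:
        return "unknown-8bit"
    return "utf-8"
```

With this sketch, a file whose first umlaut appears after the window is
reported as plain ASCII, which is the case I mean.  German text encoded
as Latin-1 is rejected as soon as an 8-bit byte shows up inside the
window.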

This does not mean I want to offer resistance to Damien's proposal.

-- 
Linux frechet 2.4.18-4GB #1 Fri Apr 5 15:14:39 UTC 2002 i686 unknown
  9:34am  up 98 days, 18:54, 14 users,  load average: 1.62, 1.20, 0.95
                                             work    :      ke@suse.de
Karl Eichwalder                              home    : keichwa@gmx.net


