Re: [Rhythmbox-devel] UTF-8 issues still present
- From: gabor <gabor z10n net>
- To: rhythmbox-devel gnome org
- Subject: Re: [Rhythmbox-devel] UTF-8 issues still present
- Date: Sun, 11 Jan 2004 22:20:06 +0100
On Sun, 2004-01-11 at 02:57, Chris Petersen wrote:
> > This is characteristic of an UTF-8 which was considered as being
> > iso8859-1 encoded (in UTF-8, most 8 bit characters are coded on 2 bytes,
> > and in iso8859-1, 1 character is always 1 byte long).
>
> Yeah, I realized this much. Just used to applications being smart
> enough to detect "bad" utf8 characters and convert them from latin1, or
> "good" utf8 characters and not doing anything with them.
sorry, but imho there is no such thing as a "bad" latin1 character ... latin1
means iso-8859-1, and it defines 256 characters, one for every possible byte
value... so basically every byte array is a valid latin1 encoded string =>
it's impossible to detect with 100% certainty that something is NOT latin1.
there can be some heuristics (e.g. try strict utf-8 first and fall back to
latin1 only if that fails), but that's all.
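
to illustrate, here is a rough sketch of such a heuristic in python. this is
not what rhythmbox does, and the name guess_decode is made up for the example.
it assumes python's "iso-8859-1" codec, which maps all 256 byte values to
characters, so the fallback branch can never fail:

```python
def guess_decode(raw):
    """Heuristic only: try strict UTF-8, fall back to ISO-8859-1.

    Returns (text, encoding_guess). The guess can be wrong, because
    valid UTF-8 bytes are also valid Latin-1 bytes.
    """
    try:
        return raw.decode("utf-8"), "utf-8"
    except UnicodeDecodeError:
        # Every byte value is a Latin-1 character, so this always succeeds.
        return raw.decode("iso-8859-1"), "iso-8859-1"


# Any byte array decodes as Latin-1 without error.
all_bytes = bytes(range(256))
assert len(all_bytes.decode("iso-8859-1")) == 256

# UTF-8 bytes misread as Latin-1: one character becomes two.
utf8_bytes = "é".encode("utf-8")            # two bytes: 0xC3 0xA9
mojibake = utf8_bytes.decode("iso-8859-1")  # two Latin-1 characters
assert len(mojibake) == 2

# The heuristic picks UTF-8 here, but it cannot know that the author
# did not really mean those two Latin-1 characters.
text, guess = guess_decode(utf8_bytes)
assert (text, guess) == ("é", "utf-8")
```

the last case is the point: the byte pair 0xC3 0xA9 is both a valid utf-8 "é"
and a valid two-character latin1 string, so the function can only guess.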
bye,
gabor