Re: [Banshee-List] utf8 validation
- From: Raphael Slinckx <raphael slinckx net>
- To: banshee-list gnome org
- Subject: Re: [Banshee-List] utf8 validation
- Date: Sat, 19 Nov 2005 14:31:27 +0100
On Sat, 2005-11-19 at 13:33 +0100, Martin Probst wrote:
> > I won't comment about the patch, but i think it's not a great idea to
> > support a broken mode for id3V1 tags. They were not meant to hold utf8
> > chars, only iso-8859-1, why not stick to the rule ?
>
> Was that standardised? Anyway ASCII is a proper subset of UTF-8, and
> it's AFAIK very unlikely to mistake an ASCII based single byte charset
> for UTF-8 when it isn't. So checking if it is valid UTF-8 and using that
> if it fits will work for 90% of the tags as they are ASCII anyways, and
> I personally have not seen a single German ISO-8859-1(5) string that
> would also be valid UTF-8 if it's not ASCII anyways.
>
> Meaning I think that's a very valid choice, it will benefit UTF-8 users
> and probably only give a very minor performance hit for ISO-8859-1
> users. I like it :-)
OK, I see the following blurb in the ID3v1 test suite:
> extra
> Tests that test additional capabilities in the ID3 reader. The
> charset of ID3 isn't formally defined, so both ISO-8859-1
> capability as well as UTF-8 capability is tested. Also, some
> readers detect URL:s in the comment field, so this is also tested.
However, most people and tagging libraries use ISO-8859-1, which is thus
a de facto standard.
Will people who use ISO-8859-1 with characters above the ASCII range
(bytes of the form 1xxxxxxx, i.e. 0x80-0xFF) be affected if their
ISO-8859-1 text is interpreted as UTF-8?
See also http://www.htmlhelp.com/reference/charset/latin1.gif for the
chars in latin1
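
To make the question concrete, here is a minimal sketch of the heuristic Martin describes: try a strict UTF-8 decode first and fall back to ISO-8859-1 when it fails. It is written in Python purely for illustration and is not the patch under discussion; the function name decode_id3v1_text is hypothetical, and stripping NUL/space padding is an assumption about how the fixed-width ID3v1 fields are filled.

```python
def decode_id3v1_text(raw: bytes) -> str:
    """Hypothetical helper: prefer UTF-8, fall back to ISO-8859-1."""
    # Assumption: the fixed-width field is padded with NUL bytes or spaces.
    data = raw.split(b"\x00", 1)[0].rstrip(b" ")
    try:
        # bytes.decode is strict by default, so any byte sequence that is
        # not well-formed UTF-8 raises UnicodeDecodeError.
        return data.decode("utf-8")
    except UnicodeDecodeError:
        # Every byte value maps to a character in ISO-8859-1, so this
        # fallback cannot fail.
        return data.decode("iso-8859-1")
```

With this sketch, a typical Latin-1 tag such as b"Caf\xe9" is not valid UTF-8 (0xE9 announces a three-byte sequence that never arrives), so it falls back and decodes to "Café" as intended. The only ISO-8859-1 users affected would be those whose high bytes happen to form well-formed UTF-8 sequences: for example b"\xc3\xa9" reads as "Ã©" in ISO-8859-1 but would be returned as "é".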
Raf