Re: UTF-8



German Poo Caaman~o wrote:

> On Wed, 10-07-2002 at 12:28, Damien Donlon - Sun Ireland - Solaris
> Software - Localisation Engineer wrote:
> > [...]
> > What to do ?
> > _________________________
> >
> > I think the following things ought to be done :
> >
> > [1] Identify the limitations that some translation teams face in using
> >     UTF-8, and identify how they can be eliminated ( the limitations, not
> >     the translation teams ;-) )
>
> Agree.
>
> > [2] Create a tool that can check whether a file is UTF-8 encoded.
> >     The tool should not simply rely on reading a charset field within
> >     the file to see whether it says UTF-8, but should analyse the
> >     byte stream. Does such a tool exist already within the community?
>
> file(1)

Hmmm, Linux 'file' appears to be a bit more versatile than 'file' on
Solaris ;-) My 'file' doesn't have the 'magic' to identify UTF-8 by
default.
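
Where 'file' lacks the magic, a byte-stream check of the kind described
in [2] could be sketched roughly as follows. This is only a minimal
sketch in Python, not an existing community tool: the names is_utf8 and
file_is_utf8 are hypothetical, and it leans on Python's strict UTF-8
decoder to do the byte analysis instead of reading any charset field.

```python
def is_utf8(data: bytes) -> bool:
    """Return True if the byte string is well-formed UTF-8.

    The strict decoder rejects stray continuation bytes, truncated
    sequences, overlong forms and encoded surrogates.
    """
    try:
        data.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return False
    return True


def file_is_utf8(path: str) -> bool:
    """Hypothetical helper: apply the same check to a file's raw bytes."""
    with open(path, "rb") as fh:
        return is_utf8(fh.read())
```

A translator could call file_is_utf8() on each .po file before the cvs
commit. Note that a pure ASCII file also passes, since ASCII is a
subset of UTF-8.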

Regards,
Damien

>
> >     I think it may be impossible to distinguish between UTF-8 and 8859-1
> >     if no character is outside the 0-127 range. Can anyone confirm? Is
> >     this a big problem in identifying UTF-8 encoded files?
>
> 8859-1 uses the 0-255 range, AFAIK.
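
On the question above: when every byte is in the 0-127 range, UTF-8 and
8859-1 produce exactly the same bytes, so no byte-level check can tell
them apart, and the file is equally valid in both. The difference only
shows once a byte above 127 appears. A small sketch in Python to
illustrate, where the function name decodes_identically is hypothetical:

```python
def decodes_identically(data: bytes) -> bool:
    """True if data is valid UTF-8 and reads the same when decoded as ISO-8859-1."""
    try:
        as_utf8 = data.decode("utf-8")
    except UnicodeDecodeError:
        # Not valid UTF-8 at all, so the two readings cannot agree.
        return False
    # ISO-8859-1 maps every byte value to a character, so this never raises.
    return as_utf8 == data.decode("iso-8859-1")


ascii_only = b'msgid "File"'                 # all bytes in 0-127
accented = "Caama\u00f1o".encode("utf-8")    # contains bytes above 127

print(decodes_identically(ascii_only))
print(decodes_identically(accented))
```

The first call reports the ambiguous case (both encodings agree), the
second the case where the encodings diverge.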
>
> >     The tool would be provided to translation teams to check their files
> >     prior to the cvs commit.
>
> file(1) and the header check script, anything else, I guess.
>
> --
> German Poo Caaman~o
> mailto:gpoo@ubiobio.cl
> http://www.ubiobio.cl/~gpoo/chilelindo.html
> «There are 10 kinds of people: those who understand binary and those who don't.»
>
> _______________________________________________
> gnome-i18n mailing list
> gnome-i18n@gnome.org
> http://mail.gnome.org/mailman/listinfo/gnome-i18n



