Re: [gnome-cy] po file tools: msgconv
- From: Kevin Donnelly <kevin dotmon com>
- To: Gnome Welsh List <gnome-cy www linux org uk>
- Subject: Re: [gnome-cy] po file tools: msgconv
- Date: Fri, 4 Apr 2003 17:06:24 +0000
On Friday 04 April 2003 12:26 pm, Alan Cox wrote:
> That conversion is lossy. For non original C locale source you cannot do
> this. You must retain full encoding properties (ie UTF-8 is about the
> only choice). At the point you meet a non C or 8859-1 encoded file you
> can only safely convert it to/from unicode space, not into iso 8bit
> encodings.
>
> For the Gnome case UTF-8 is mandatory. GNOME doesn't support not utf-8
> encoded files.
So presumably the revised files I sent you earlier were lossy, then? You mean
this in terms of "although they were in UTF-8 format when I got them, prior
to that they went through some hoops which may have thrown away UTF-8 type
info which is now non-recoverable"?
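To make the "lossy" point concrete, here is a minimal sketch, assuming Python 3 and a few sample Welsh words chosen purely for illustration (the function name is hypothetical). ISO-8859-1 has a slot for î but none for ŵ or ŷ, so anything that passes through an 8-bit Latin-1 stage loses those characters for good:

```python
def through_latin1(text):
    """Simulate passing text through an ISO-8859-1-only stage.

    errors="replace" substitutes "?" for every character the
    8-bit encoding cannot represent, which is where data is lost.
    """
    return text.encode("iso-8859-1", errors="replace").decode("iso-8859-1")


original = "sgrîn, dŵr, tŷ"        # sample strings, not taken from any real po file
converted = through_latin1(original)

# Characters that did not survive the trip
lost = [ch for ch in original if ch not in converted]

print(converted)
print(lost)
```

Once the "?" has been written out there is nothing left to convert back from, which is why a later conversion to UTF-8 cannot recover the original letters.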
OK - there is a process train currently of the form:
(1) cvs down the relevant pot files
(2) read each file through Kartouche to give a MySQL table
(3) upload that to the Web and present it in a browser interface
(4) user inputs suggestions to the table via a browser
(5) user gets confirmation of suggestions added
(6) on completion read each table through Kartouche to give a po file
(7) run msgconv on the file to convert it to UTF-8
Are you saying that at points (2) and (6) the conversion will be lossy because
I personally am using an ISO-8859-1 charset on my PC and it should be UTF-8?
And that the solution to this would be to use the UTF-8 charset on my PC?
Also at (2), the table currently stores msgids and msgstrs in a text field,
but this can be changed to a BLOB format easily, which is the only way MySQL
can currently store UTF-8. This would then ensure no loss in the db store.
At (3/4/5), does anything else need to be done to avoid people seeing
gibberish in a browser (eg sgrÃ?n instead of sgrîn)? Or, perhaps worse,
putting in what looks OK to them (eg using Alt+numpad entries, or using a
font with w^) and having it returned as what looks like gibberish?
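The sort of gibberish described above is what appears when UTF-8 bytes are displayed as if they were ISO-8859-1. A minimal sketch, assuming Python 3 (the function name is hypothetical, for illustration only):

```python
def misread_as_latin1(text):
    """Encode text as UTF-8, then decode the bytes as ISO-8859-1.

    This mimics a browser that receives UTF-8 but has been told,
    or has guessed, that the page is ISO-8859-1.
    """
    return text.encode("utf-8").decode("iso-8859-1")


garbled = misread_as_latin1("sgrîn")

# Unlike a lossy conversion, this misreading is reversible as long
# as the garbled text has not been altered or re-saved in between.
repaired = garbled.encode("iso-8859-1").decode("utf-8")

print(garbled)
print(repaired)
```

In this case nothing has been thrown away: the bytes are intact and only the label is wrong, which is why declaring the charset correctly (for example `Content-Type: text/html; charset=utf-8`) matters at the browser stage.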
You'd mentioned in an earlier email that reporting a UTF-8 charset in the
browser headers should enable most browsers to render it OK, but will this
also apply to older Win PCs, which, as I understand it, were not UTF-8
compliant?
Presumably (7) can then still be used to convert any lingering 8859 encodings
in the file (eg input from a browser on a PC using the 8859 encoding) into
the proper UTF-8 ones.
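A rough sketch of that kind of clean-up, assuming Python 3 and assuming the stray input is ISO-8859-1 rather than some other 8-bit encoding. This is not how msgconv itself works; the function name is hypothetical and the check is only a heuristic applied to one string at a time:

```python
def to_utf8(raw: bytes) -> bytes:
    """Return raw unchanged if it is valid UTF-8, else treat it as ISO-8859-1.

    Assumption: anything that fails to decode as UTF-8 came from a
    browser submitting ISO-8859-1. A few Latin-1 byte sequences are
    also valid UTF-8, so this cannot be fully reliable.
    """
    try:
        raw.decode("utf-8")
        return raw
    except UnicodeDecodeError:
        return raw.decode("iso-8859-1").encode("utf-8")


from_old_browser = "sgrîn".encode("iso-8859-1")   # one byte for î
from_new_browser = "dŵr".encode("utf-8")          # already UTF-8

print(to_utf8(from_old_browser))
print(to_utf8(from_new_browser))
```

This only helps with characters that ISO-8859-1 can hold in the first place; a ŵ typed into an ISO-8859-1 form has already been lost or mangled before this step runs.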
There are no answers - only questions :-) But it would be nice to be able to
get this definitively sorted before I go any farther.
Best wishes
Kevin