Translation problems and bad strings
- From: Telsa Gwynne <hobbit aloss ukuu org uk>
- To: gnome-hackers gnome org, gnome-doc-list gnome org, gnome-i18n gnome org
- Subject: Translation problems and bad strings
- Date: Sun, 4 Jan 2004 11:10:44 +0000
Really sorry about cross-post: gnome-i18n know most of this
but perhaps not the explanations for a couple of the strings
below; gnome-doc-list need to know what chaos "Control" caused;
and gnome-hackers is my attempt to catch the hackers.
Not at all sure where replies should go. Use your common sense :)
Some time ago I asked gnome-i18n for what they thought the worst
strings to translate in Gnome were. And got a pile of answers
which I then didn't summarise on-list. Reading the threads about
"what problems do translators face", I am reminded that I should.
I know a load of hackers are familiar with this, but for
those who aren't, here's a quick description of the translation
process. There are loads of ways of translating: editing po files
by hand, using a web interface, using gtranslator or KBabel; but
they all revolve around a list of strings which go like this:
#: /module/path/path/filename
msgid "Original string here with occasional <b> or \n marks"
msgstr ""
...and the aim of the game is to fill in msgstr. You do not
necessarily have any more context than that, and you have
to guess what some strings mean. (For example, if you don't
have a CD drive, you can't start rhythmbox to see whether
a particular message appears when you stick a CD in..)
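For anyone who would rather see that structure as code, here is a
minimal sketch in Python which walks po-style text and lists the
msgids whose msgstr is still empty. It assumes the simplest case only:
every msgid and msgstr sits on a single line, with no plural forms and
no multi-line strings. The function name and the sample entries are
made up for illustration; real tools such as gtranslator cope with far
more than this.

```python
def untranslated(po_text):
    """Return the msgids whose msgstr is empty.

    Assumption: each msgid and msgstr is on one line, as in the
    example above. Multi-line strings and plurals are not handled.
    """
    missing = []
    msgid = None
    for line in po_text.splitlines():
        line = line.strip()
        if line.startswith('msgid "') and line.endswith('"'):
            # Keep the text between the quotes.
            msgid = line[len('msgid "'):-1]
        elif line.startswith('msgstr "') and line.endswith('"'):
            msgstr = line[len('msgstr "'):-1]
            # An empty msgid is the po header, so skip it.
            if msgid and not msgstr:
                missing.append(msgid)
            msgid = None
    return missing


# Hypothetical sample: one entry still to do, one already translated.
SAMPLE = '''
#: /module/path/path/filename
msgid "Control"
msgstr ""

#: /module/path/path/otherfile
msgid "Open"
msgstr "Agor"
'''

print(untranslated(SAMPLE))
```

Note that the only thing the function can report is the bare string:
that is exactly as much context as the translator gets.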
I begin with this one, which started the whole thing. A remark
from one of the Arabic translators on IRC:
<olimar> joke of the month in a party: "Model column to search
through when searching through code"
I'm still not sure what this means..
There are lots of things programmers can do to help here:
particularly they can put comments by the strings in the code
saying things like "Translators: this is seen by.." or "This
refers to..". For some apps there is such specialised vocabulary
that this can really help. Unfortunately, nearly everyone
decides to translate gtk+ early on (it makes up half of the
strings in developer-libs), and it's full of such strings as above
and there is not really a lot you can do about it without a
gigantic split into "messages for users" and "messages for
developers". Having said that, anyone who finishes gtk is
well set up to finish most of Gnome :)
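As a rough illustration of such a comment, here is a sketch in Python
(most of Gnome is C, where the same idea is a /* Translators: ... */
comment just before the marked string). The wording of the comment is
my own, and the code assumes no compiled catalogue is installed, so
the English string comes straight back. xgettext can be asked to copy
comments like this into the po file, for instance with its
--add-comments option.

```python
import gettext

# Assumption: no translation catalogue is loaded, so NullTranslations
# simply hands back the original English string.
_ = gettext.NullTranslations().gettext

# Translators: "Control" is a noun here, meaning a widget such as a
# button or a slider. It is not the Ctrl key and it is not a verb.
label = _("Control")

print(label)
```

The comment costs the programmer one line and saves every translation
team a guess.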
So here are the sort of strings translators were dealing
with in the summer. The aisleriot examples have gone, I think
(yay Callum!): but the rest were all around in the summer.
Historical (ie, gone, to the best of my knowledge):
---------
* gnome-games/aisleriot:
msgid "borp"
Read it backwards: "prob"...
(Abel Cheung -- who later figured it out, decided
it was cute, and put "melborp" in somewhere else :))
* gnome-games/aisleriot,
* nautilus/somewhere I forget:
msgid " of "
This is things like "king of hearts" or "file 1 of
8". This flatly won't translate in some languages.
Malayalam (.ml) needs to see "n of m" for numbers
and change it to say "of m, n". Different words
are used for "of" in Welsh (cy) for "1 of 10" and
"the king of hearts". There's also a further
complication for cy which I don't think I can
explain in a single line, so you are spared.
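One way round the " of " problem is to mark the whole phrase for
translation, with a named placeholder for each number, so that the
translator can move them around. The following is only a sketch in
Python under stated assumptions: the function name is hypothetical, no
catalogue is loaded (so the English comes back unchanged), and the
"reordered" template is invented English, not a real translation, to
show that the order can change.

```python
import gettext

# Assumption: no catalogue is loaded, so the English template is returned.
_ = gettext.NullTranslations().gettext


def position_label(current, total):
    # Translators: shown while stepping through files, e.g. "file 1 of 8".
    # Both numbers are named, so put them in whichever order you need.
    template = _("file %(current)d of %(total)d")
    return template % {"current": current, "total": total}


print(position_label(1, 8))

# A made-up template with the numbers swapped, standing in for what a
# translator might write in msgstr.
reordered = "of %(total)d, file %(current)d"
print(reordered % {"current": 1, "total": 8})
```

This does not solve the "king of hearts" case, which wants its own
separate string, but it does stop one " of " having to serve both.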
What do these mean in _English_?
-------------------------------
#: aisleriot/golf.scm.h:3
msgid "bdc\n"
msgstr ""
Debugging message referring to "button-double-clicked"
subroutine.
(Abel Cheung)
#: several places, apparently
msgid "Control"
msgstr ""
"That's just gorgeous - Is it a verb? A noun? What
kind of noun? Where can I find it in the app?"
(Stanislav Visnovsky)
"Control" crops up in strings all over the place.
Months later I discovered that this is the Gnome
Docs Project official word for "widget", because
"widget" is thought not a good word to give to
end-users. This is probably true. But translators
do at least know "widget"; and it doesn't have another
eight possible meanings. And at least some
translators didn't know "control" was in the docs
team's word list of Good Words.
#: gtk+
msgid "IM Preedit style"
msgstr ""
from Ole (dk), who noted that you can figure out that IM is
input method from other entries (if you're working from the
po file and not from a web interface), but preedit?
Some months later, Dave Malcolm explained on IRC, and I think I
shall share for anyone who didn't know:
<DaveMalcolm> olimar: GTK has small "plugins" that handle text
input in different ways; they take keyboard input and convert
the keypresses into text being typed. If you right-click in an
entry box or gedit you can select the method.
<DaveMalcolm> "pre-edit" is where a preview of your edit appears
in the control; so for Japanese you might type the romanised form,
and have that appear in grey as the preedit string, which might
later get converted into hiragana/katakana/kanji characters
depending on further input.
So now we know! Thanks, Dave.
Error messages:
--------------
gcalctool is a great example for this. There are over _forty_
strings which are error messages referring to the inner workings
of the code. For example:
msgid ""
"*** B = %d ILLEGAL IN CALL TO MPCHK.\n"
"PERHAPS NOT SET BEFORE CALL TO AN MP ROUTINE ***\n"
msgid ""
"*** ERROR OCCURRED IN MPROOT, NEWTON ITERATION NOT CONVERGING PROPERLY ***\n"
msgid "*** ABS(X) NOT LESS THAN 1 IN CALL TO MPEXP1 ***\n"
Words that cause problems for different languages:
-------------------------------------------------
* Multiple languages distinguish between "key" as in "thing that turns
in lock" and "thing on a keyboard". It's easy to guess when you're
translating gconf itself or acme itself which it should be. In other
files though, it's not so easy.
* "Package" and "packet" seem to be the same word in more than one
language (Welsh, French)
* "Render":
Render is very hard to translate, at least for Danish. Sometimes it
means "draw", sometimes "generate" or "create", sometimes "copy to
screen". Usually it is some sort of combination.
-- Ole Laursen
Abel said it was the same for Chinese.
* "Antialiasing":
Perhaps it's just in Welsh, but we in the cy team had endless trouble
with this. Translating the parts of the word made no sense. Trying to
make up a word which explained what the technique meant made no sense.
* "Meta"
Anything involving the Greek prefix "Meta" makes olimar unhappy:
trying to find an Arabic equivalent is apparently hard. Metafile,
metadata.. okay, so it's a file about files, and data about data.
Great: so metacity is... erm. No. Ow. There's also the meta key,
but I don't actually remember seeing that in strings.
Long strings of "Noun noun noun noun":
-------------------------------------
Particularly horrible when one or more of the nouns can also be
used as a verb, and common in tooltips and menus. Examples:
# gnome-terminal: NB: totally undocumented feature which jrb
explained to me recently.
msgid "S/Key Challenge Response"
# libbonobo:
msgid "generic factory 'new' moniker"
-- Ole again. Abel suggested "all CORBA keywords"
as well :)
# libbonobo:
msgid "ORB IOR handling moniker"
from Andraz
# nautilus:
msgid "Image Properties content view component"
from Reinout van Schouwen (.nl)
#: Evolution-groupwise:
msgid "Evolution Calendar Groupwise backend"
I can't remember where it's from, but the record is five nouns (some
of which might be verbs or instructions) in a row. Other "noun? verb?
what?" words can even be "End" and "Finish". "-ing" words have similar
problems. I think the technical term for the variety that isn't a
verb is "gerund", but I can't think of a good example in the po files
offhand (but they are there!)
Miscellany:
----------
Strings that arrived without comment or which don't fit elsewhere.
* "Resident memory set"
* "Minimum Shared Memory Size"
* "Minimum Resident Memory Size"
* "Request obsoletes service's data"
* msgid "Error checking error; no exception"
* "MInternal Error: Weird value (%ld) in do_test\n"
* "Model column to search through when searching through code"
* "FALSE displays the \"invisible char\" instead of the actual text (password mode)"
* "Just because a crosswalk looks like a hopscotch board doesn't mean it is one"
Incidentally, I showed Alan the "FALSE displays.." one and he said
"Makes perfect sense to me." Because he knows what it's talking
about. Non-hacker translators don't.
Some translators make a point of filing every string with a
problem in bugzilla. This takes _ages_ but it helps. But the
problem then is that there is a very limited period when you
can change them.
Most translation teams use the translation status tables which
are at http://developer.gnome.org/projects/gtp/status/ to keep
on top of things. The current 2.5 stuff for each language is at
http://developer.gnome.org/projects/gtp/status/gnome-2.6/XX/developer-libs/index.html
http://developer.gnome.org/projects/gtp/status/gnome-2.6/XX/desktop/index.html
(put any language code in XX: sr, cy, de..)
When strings are changed at all, it upsets all the statistics. So
you have to find a time when you _can_ change them, because those
statistics do matter and do help you keep on top of things. It is
really really disheartening to see your 100% app has suddenly gone
to 81% because someone has altered all the tabs inside the strings;
and even worse when it's a much more substantial change which
requires you to do a lot more than just remove the fuzzy marker.
And towards the end of a release cycle is not the time to do it.
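To make the arithmetic concrete, here is a small sketch in Python of
how a completion figure like that can fall. It assumes that only fully
translated strings count towards the percentage, and that fuzzy ones
do not; I have not checked how the status pages actually calculate it,
and the function name and the numbers are invented for the example.

```python
def percent_translated(translated, fuzzy, untranslated):
    """Share of strings that are fully translated, as a percentage.

    Assumption: fuzzy strings do not count as translated.
    """
    total = translated + fuzzy + untranslated
    if total == 0:
        return 100.0
    return 100.0 * translated / total


# A hypothetical app with 100 strings, all translated.
before = percent_translated(100, 0, 0)

# Someone alters the tabs in 19 of them: those 19 are now fuzzy.
after = percent_translated(81, 19, 0)

print(before, after)
```

Not one of the 19 has changed its meaning, but each has to be looked
at again by every team.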
But some of these strings really do have to be changed, or explanations
added in comments next to the code that contains the strings.
For example, Epiphany goes to the appropriate language page on Google
because of this comment:
#. Translators you should change these links to respect your locale.
#. * For instance in .nl these should be
#. * "http://www.google.nl" and "http://www.google.nl/search?q=%s"
Others found with a quick grep:
evolution/po/cy.po:#. This is a filename. Translators take note.
gnome-applets/po/cy.po:#. Translators - The + and - refer to increasing and decreasing the volume.
I don't have a complete checkout of all CVS, but I have quite a
few modules out. But that's about all there is. A few more of
"Translators: this 'plane' is not the sort that flies but instead
a term used by Unicode" would be really nice. (Actually, that's
a bad example, because apparently the place that appears is not
a place you can put such a comment: but it's a good example of
the sort of word that might need clarifying.)
There used to be a string review period in the release cycle.
It concentrated on the English as far as I know. Making the
English clearer certainly helps translators, but even then
there can be problems. Most translators do the gnome-glossary
early on as a sort of standardising terminology exercise, but
even so we (cy) didn't realise that "control" was the approved
term for "widget" when we met it later on in po files.
So there you are. I don't really know what the solution is, but
I do know that in between 2.4 (which we had at 100% in Welsh)
and now, we have acquired 1000 fuzzy strings and 750 untranslated
in apps which we had done completely; and another 6000 strings
to do from the list of "proposed" so far. That's on top of
16,000 strings which remained constant. That's a lot of strings,
and I dread the changing of them in order to make them more
intelligible to other teams.
But at some stage, some of these have to be fixed in the originals,
which means they become "untranslated" or "fuzzy" in the files
of every team which has done them already. It will make it easier
for new teams. But I'm not looking forward to the process!
Telsa