Re: [Translation-i18n] Proposal for declinations in gettext



>
>Subject: Re: [Translation-i18n] Proposal for declinations in gettext
>   From: Danilo Segan <dsegan@gmx.net>
>   Date: Fri, 13 Jun 2003 23:00:34 +0200
>     To: linux-utf8@nl.linux.org
>     Cc: translation@iro.umontreal.ca,  gnome-i18n@gnome.org, 
> translation-i18n@lists.sourceforge.net
>
>Veronica Loell wrote:
>
>>Are you talking about machine translation here? From my perspective as a computational linguist this is not something that should be part of gettext, rather in the tools that use gettext or the tools you use to work with gettext material.
>>  
>>
>Nope, I'm talking about manual translation. The current problem, as I 
>see it, is that words and expressions take on different forms in 
>different contexts.
>
>The usual practice among English-speaking programmers is to "compose" 
>strings out of smaller parts.

Unfortunately, this is a problem at the internationalization stage, and it
makes proper localization impossible. It works for languages that are
structured like English, but if you really want to be able to translate
into other languages, you just can't do this.

Your approach will, as you say, definitely work for several languages, but 
my point is that the problem lies with the original strings. If you compose 
strings in this way, you will only be able to translate them into languages 
that are built the same way as the original.

>
>It's therefore usual to have:
>msgid "Workspace %d"
>msgstr "radna površina %d"
>
>msgid "Desktop %d"
>msgstr "desktop %d"
>
># Here is %s replaced with the translation of "Workspace %d" or "Desktop 
>%d" or ... (something user generated)
>msgid "Switch to %s"
>msgstr "Premesti na %s"
>
>In Serbian, at least, the words "radna površina" would need to 
>change form to "radnu površinu" (the translator can easily deduce from 
>the "switch *to*" what causes the word to change). In 
>German, it might need to change from "der Workspace" to "den Workspace" 
>(I'm not completely certain of the articles here; correct me if I'm wrong).
>
>The current "solution" for these and similar problems is to put the 
>work on programmers, and force them to create "complete" strings, so 
>instead of the former three, one would have four strings:
>msgid "Workspace %d"
>msgid "Desktop %d"
>msgid "Switch to workspace %d"
>msgid "Switch to desktop %d"
>

As I said before, complete strings are the only way to translate into
languages that do not share the structure of the original language.
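
To make the difference concrete, here is a minimal sketch in Python. The
dictionary is only a hypothetical stand-in for a compiled catalog, and the
function names are made up for illustration; the Serbian strings are the
ones from your message.

```python
# Hypothetical in-memory catalog standing in for a compiled message catalog.
CATALOG_SR = {
    "Workspace %d": "radna površina %d",
    "Switch to %s": "Premesti na %s",
    "Switch to workspace %d": "Premesti na radnu površinu %d",
}


def translate(msgid):
    """Look up msgid, falling back to the untranslated string."""
    return CATALOG_SR.get(msgid, msgid)


def switch_label_composed(n):
    # Composition: the inner noun phrase is translated on its own,
    # so it stays in the nominative whatever sentence it lands in.
    inner = translate("Workspace %d") % n
    return translate("Switch to %s") % inner


def switch_label_complete(n):
    # Complete string: the translator sees the whole sentence and
    # chooses the right case for it.
    return translate("Switch to workspace %d") % n


print(switch_label_composed(5))
print(switch_label_complete(5))
```

The composed version yields "Premesti na radna površina 5", with the noun
left in the wrong case, while the complete string yields "Premesti na radnu
površinu 5".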

>
>Please don't mind my ignorance of German, in trying to explain the concept.
>German has 4 declinations (nominative, genitive, dative, accusative, 
>right?). In case of "switch to", I believe "accusative" should be used. 
>So, the German translation for "Switch to %s" would be
>msgstr "Switch to %<3>s"
>
>The translation for "Workspace %d" would look like:
>msgid "Workspace %d"
>msgstr<0> "der Workspace %d"
>msgstr<1> "das Workspace %d"
>msgstr<2> "dem Workspace %d"
>msgstr<3> "den Workspace %d"
>
>So, the title of "Workspace 5" would be "der Workspace 5", while the 
>menu which allows switching to that workspace would read "Switch to den 
>Workspace 5".
>
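
For reference, this is how I read the proposed mechanism, as a rough Python
simulation. None of this is existing gettext syntax or API: the `msgstr<N>`
lists, the `%<N>s` slot, and the helper names are all assumptions made for
illustration, and the German forms are copied from the message as written.

```python
import re

# Hypothetical catalog: each msgid maps to a list of forms, where list
# index N plays the role of msgstr<N> in the proposal.
FORMS_DE = {
    "Workspace %d": [
        "der Workspace %d",
        "das Workspace %d",
        "dem Workspace %d",
        "den Workspace %d",
    ],
    "Switch to %s": ["Switch to %<3>s"],
}

# Matches the proposed "%<N>s" slot and captures the form index N.
FORM_SLOT = re.compile(r"%<(\d+)>s")


def form(msgid, index=0):
    """Return form `index` of msgid, falling back to form 0 or the msgid."""
    forms = FORMS_DE.get(msgid)
    if not forms:
        return msgid
    return forms[index] if index < len(forms) else forms[0]


def switch_label(inner_msgid, n):
    """Fill the outer template with the form of the inner string it asks for."""
    template = form("Switch to %s")

    def fill(match):
        return form(inner_msgid, int(match.group(1))) % n

    return FORM_SLOT.sub(fill, template)


print(form("Workspace %d") % 5)
print(switch_label("Workspace %d", 5))
```

Note that in this sketch the caller has to pass the msgid of the inner
string, not its already translated text, because the form can only be
chosen once the outer template has been looked up.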
>>Correct me if I'm wrong, but the "autotranslation" in gettext merge? is only for complete clauses, not for individual words, yes? So this should not be an issue for gettext. For your translation memory and your translation tools that you use to work with gettext, now that is where this thought and effort should go in. I would like very much to see some serious translation tools getting generated in the free software community, but that is a separate issue.
>>  
>>
>I guess you misunderstood me. The translator provides all the forms a 
>word takes, if it's a finite number (no matter how large, but finite). 
>It's along the same lines as the Plural-forms feature of gettext, just for 
>a different purpose, and entirely controlled by the translator: (s)he uses 
>it when appropriate, with no intervention from the programmer (who, in the 
>case of plural-forms, had to insert ngettext calls instead of gettext).
>
>>Just a note, for Swedish, one would have to deal with compound words, for Finnish and other agglutinating languages you would have to deal with the agglutination etc. etc..  
>>  
>>
>Could you perhaps give a complete example, and why the proposed 
>mechanism couldn't handle it? I certainly have no experience in many 
>languages, but if this would prove usable for at least 20 languages, 
>I'd find it quite desirable in such a widespread package as gettext. I 
>guess I can count in so far at least Serbian, Russian, Ukrainian, 
>Croatian, Slovenian, German, Macedonian, Bulgarian, Belarusian, Polish, 
>Czech, Slovak, Lithuanian, ... If this wouldn't work for any of these, 
>please inform me; if it *would* work for any language that is not in 
>this list, then inform me even faster :-)
>

The main difference between English and Swedish here is that English
writes the parts of a compound as separate words, while Swedish joins
them into a single word.

email program = epostprogram
world health organization = världshälsoorganisationen

English is a very isolating language, which means that, for example, the
definite form of a word is expressed by means of an article. In
Swedish, a suffix is used instead.

the house = huset

In a truly agglutinating language such as Finnish, the entire phrase 
"in my house" would be one word (talossani).

>Such feature would not get in the way to anyone who doesn't use it, but 
>it would solve at least some problems. I guess that should be a reason 
>enough, though if anything better can be imagined, I'd love to hear it.

Your feature would encourage programmers to keep constructing the sort
of strings that you present at the beginning. My point is that this will 
make programs impossible to translate into many target languages.

It is a hassle to translate 10 strings instead of 5, but the solution is
to create better tools, not to find shortcuts, imo. As is usual with
shortcuts, this one will make the job harder for any language that does
not fit the pattern.

- Veronica




