Translation Technology at Sun [ was Re: (no subject) ]

From: Tim Foster <Tim Foster Sun COM>
To: Glynn Foster <Glynn Foster Sun COM>
Cc: gnome-i18n gnome org
Subject: Translation Technology at Sun [ was Re: (no subject) ]
Date: Thu, 22 May 2003 18:16:00 +0100

On Thu, 2003-05-22 at 01:58, Glynn Foster wrote: 
> I seem to remember some of the Sun people saying that translations were
> pretty inefficient in GNOME - maybe it's time to rethink? [1]

Hi Glynn & All,

A quick introduction : I'm both Glynn's brother and an engineer working
on translation technology here at Sun (though I wasn't responsible for
suggesting that GNOME translation was inefficient, nor was I in the pub
with Glynn at the time :-)

To start with, GNOME translation continually amazes me : it's totally
astounding the number of languages GNOME has been localised into, and
I'm very impressed by the progress that has been made - nice one guys !

On the other hand, it's possible that the translation process could be
improved, as witnessed by the build/release engineering discussions
elsewhere on this thread that relate to CVS and pot files, but that's
not my area of expertise.

Regarding the use of technology to make life easier, Gtranslator looked
like it was going in the right direction. FWIW, here's some of my
experience with translation technology.

At Sun, we use translation memory (TM) technology across all of our
translations. That's mostly documentation at the moment, since docs
account for the highest volume of translation - re-using existing
translations from HTML and SGML was the first thing we went for. (There
are other existing tools for software TM, which we'll roll into the
current system that handles SGML, HTML and XML at a later date, giving
us a little more consistency between docs and software - work in
progress.)

We have a large TM, growing all the time, that holds several hundred
thousand segments (usually sentences) and their translations.

All these segments reside in a single Oracle database. We have a fast
lookup program that searches this database for exact and fuzzy matches
for every string being presented for translation. Any matches found are
presented to translators in a standard editor, where they can quickly
modify the suggested text to provide the final translation.
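
A rough sketch of what such a fuzzy lookup might look like, in Python.
This is not Sun's actual lookup program: the in-memory TM dictionary,
the lookup() function and the 0.75 threshold are all made up for
illustration, and the similarity score comes from the standard
library's difflib rather than whatever the real system uses.

```python
from difflib import SequenceMatcher

# Hypothetical in-memory TM: source segment -> stored translation.
# A real TM would live in a database, not a dict.
TM = {
    "Click the OK button to save your changes.":
        "Cliquez sur le bouton OK pour enregistrer vos modifications.",
    "Select a file to open.":
        "Sélectionnez un fichier à ouvrir.",
}


def lookup(segment, tm, threshold=0.75):
    """Return (score, source, translation) tuples, best match first.

    A score of 1.0 is an exact match (ignoring case); anything between
    the threshold and 1.0 is a fuzzy match the translator must review.
    """
    matches = []
    for source, translation in tm.items():
        score = SequenceMatcher(None, segment.lower(), source.lower()).ratio()
        if score >= threshold:
            matches.append((score, source, translation))
    matches.sort(key=lambda m: m[0], reverse=True)
    return matches


if __name__ == "__main__":
    for score, source, translation in lookup(
        "Click the OK button to save your settings.", TM
    ):
        print(f"{score:.2f}  {source}  ->  {translation}")
```

A linear scan like this would be far too slow for several hundred
thousand segments; it only shows the exact/fuzzy distinction.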

We're also using XLIFF, a standard managed by OASIS and designed for the
translation process, which allows us to take the format-specific stuff
out of the translation loop and lets translators concentrate on the
text being translated.
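
To give a feel for the idea, here is a minimal sketch in Python that
wraps extracted segments in an XLIFF-style skeleton. The element names
(xliff, file, body, trans-unit, source, target) follow XLIFF 1.x, but
the build_xliff() helper, the version value, the sample strings and the
file name are assumptions for illustration; the namespace declaration
and inline markup handling are left out to keep it short.

```python
import xml.etree.ElementTree as ET


def build_xliff(units, original, source_lang="en", target_lang="fr"):
    """Build a simplified XLIFF-style document as a string.

    units is a list of (source_text, suggested_translation_or_None).
    Segments with a TM suggestion get a <target>; the rest are left
    for the translator to fill in.
    """
    xliff = ET.Element("xliff", version="1.1")
    file_el = ET.SubElement(
        xliff,
        "file",
        {
            "original": original,
            "source-language": source_lang,
            "target-language": target_lang,
            "datatype": "plaintext",
        },
    )
    body = ET.SubElement(file_el, "body")
    for number, (source, suggestion) in enumerate(units, start=1):
        unit = ET.SubElement(body, "trans-unit", id=str(number))
        ET.SubElement(unit, "source").text = source
        if suggestion is not None:
            ET.SubElement(unit, "target").text = suggestion
    return ET.tostring(xliff, encoding="unicode")


if __name__ == "__main__":
    print(
        build_xliff(
            [
                ("Select a file to open.", "Sélectionnez un fichier à ouvrir."),
                ("Close the window.", None),
            ],
            original="help.html",
        )
    )
```

The translator's editor then only has to understand this one format,
whether the text originally came from HTML, SGML or a po file.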

There's an article up on Sun's Global Developers Application Corner that
may be of interest :

http://www.sun.com/developers/gadc/technicalpublications/articles/xliff.html

(there's a screenshot of our translation editor there as well)

Gtranslator seems to use TMs okay, but just not at the same level - the
databases appear to be quite small and it only does po files.

The problem with trying to apply this approach to GNOME translations,
across the community, is logistical.

XLIFF files containing partially translated text can be handed to
translators to complete offline in a translation editor, so the real
question is : can you have a single database server that gets looked up
whenever you want to release new strings for translation (probably
integrated into the release engineering process somewhere) ?

Once you've got a centralised database of strings across the GNOME docs
and software, you get translation consistency across tools for free (new
translations re-use existing translations) and you spend less time doing
translation : needless to say, this would be a very Good Thing.
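
As a sketch of how that re-use could work at release time, the Python
below keeps translations in one shared table and pre-fills any new
string that has already been translated elsewhere. Everything here is
hypothetical: the table layout, the function names and the use of
SQLite (standing in for a central server) are assumptions, and only
exact matches are handled.

```python
import sqlite3


def open_tm():
    """Create an in-memory stand-in for a central TM database."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE tm ("
        " source TEXT, lang TEXT, target TEXT,"
        " PRIMARY KEY (source, lang))"
    )
    return db


def store(db, source, lang, target):
    """Record a finished translation so later projects can re-use it."""
    db.execute(
        "INSERT OR REPLACE INTO tm VALUES (?, ?, ?)", (source, lang, target)
    )


def pretranslate(db, strings, lang):
    """Map each new string to its stored translation, or None if unseen."""
    result = {}
    for text in strings:
        row = db.execute(
            "SELECT target FROM tm WHERE source = ? AND lang = ?",
            (text, lang),
        ).fetchone()
        result[text] = row[0] if row else None
    return result


if __name__ == "__main__":
    db = open_tm()
    # A translation done once for one module...
    store(db, "Open", "de", "Öffnen")
    # ...is picked up when another module releases the same string.
    print(pretranslate(db, ["Open", "Close"], "de"))
```

Strings that come back as None are the only ones that need fresh
translation work.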

[ for Sun's software translation effort (Solaris, Java, etc.) we get the
bonus of not having to spend as much cash on doing new translations,
since we can re-use translations across product lines ]

Anyway - this mail was just a "here's what we do" - rather than a "my
tool's bigger than yours", which could really be taken out of context
(in two ways, apparently) and that's not what I meant at all :-)

	Hope this is of interest ?

		cheers,

			tim
