Re: Handling Translations

From: Joakim Ziegler <joakim ximian com>
To: Christian Rose <menthos menthos com>
Cc: gnome-web-list gnome org
Subject: Re: Handling Translations
Date: 31 Aug 2001 15:08:55 -0500

On Fri, 2001-08-31 at 15:22, Christian Rose wrote:
> Joakim Ziegler wrote:

>> Please do not make this out to be me trying to make life difficult for
>> the translators. Why would you think I want to do that?

> <RANT>
> I'm not saying that you are purposely trying to make life difficult for
> translators. It's just that every time we discuss translations, your
> attitude that you know translation better than translators annoys me a
> lot. I keep spending my time on explaining why po format is needed for
> translators over and over, and I'm getting sick of it.
> </RANT>

I don't say I know how to translate better than translators (although I
am multilingual and have done some translation work myself, but not for
GNOME). However, I do consider myself to be somewhat knowledgeable in the
area of information management systems, which includes translation
systems. Being a good translator does not make you an expert on the
technology behind a translation system. As you have pointed out
yourself, translators are usually not hackers. So even though the
translators are the end users of a translation system, they might not be
the best people to design it. This holds true for all areas of software
design. But I digress. On to the points.

>> Notification of changes could be done in many ways. Trivially, the page
>> is changed if the datestamp of the master page (most likely the English
>> one) is newer than the translated pages. There are also more advanced
>> ways of tracking it, if we want more complexity.
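
A minimal sketch of the trivial datestamp check described above, assuming
pages follow a name.lang.html naming scheme and that English is the
master language. The function names and the MASTER_LANG constant are
hypothetical, chosen only for illustration:

```python
from pathlib import Path
from typing import Iterable, List

MASTER_LANG = "en"  # assumption: the English page is the master copy


def is_stale(master: Path, translation: Path) -> bool:
    """True if the translation is missing or older than the master page."""
    if not translation.exists():
        return True
    return master.stat().st_mtime > translation.stat().st_mtime


def stale_translations(root: Path, langs: Iterable[str]) -> List[Path]:
    """List translated pages under root that need another look."""
    suffix = f".{MASTER_LANG}.html"
    stale = []
    for master in sorted(root.glob(f"*{suffix}")):
        base = master.name[: -len(suffix)]
        for lang in langs:
            translation = master.with_name(f"{base}.{lang}.html")
            if is_stale(master, translation):
                stale.append(translation)
    return stale
```

This only says that a page changed, not what changed inside it.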

> And how do you mark what has changed? How do you merge translations, so
> that the same original word occurring twice, three times, four times,
> and so on, only needs to be translated once? How do you partially re-use
> existing translations (and mark them for inspection) when a new text,
> that is similar to an old already translated one, is added to the pages?

> Any solution extracting the page strings to po format, which allows use
> of gettext tools on top of that, solves all those problems. These tools
> are essential to translators.
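
To make the merging questions above concrete, here is a rough sketch of
the two behaviours being asked for: a repeated source string gets a
single entry, and a new string that resembles an old one reuses the old
translation but is flagged for inspection. It uses Python's difflib as a
stand-in and is not the algorithm the gettext tools actually use; the
merge_catalog name, the status labels and the 0.6 cutoff are assumptions
made for illustration.

```python
import difflib


def merge_catalog(old, new_strings, cutoff=0.6):
    """Merge extracted strings with an existing catalog.

    old maps source string -> translation. The result maps each unique
    source string to a (translation, status) pair.
    """
    merged = {}
    for source in new_strings:
        if source in merged:
            continue  # repeated string: one entry, translated once
        if source in old:
            merged[source] = (old[source], "translated")
            continue
        # Look for a similar, already translated string to reuse.
        close = difflib.get_close_matches(source, list(old), n=1, cutoff=cutoff)
        if close:
            merged[source] = (old[close[0]], "fuzzy")  # needs inspection
        else:
            merged[source] = ("", "untranslated")
    return merged
```

A translator would then only need to review the "fuzzy" entries and fill
in the "untranslated" ones.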

> Gettext support is already built into PHP, so it's really your
> replacement technology that adds complexity.

I've also not seen any good examples of sites actually *using* the
gettext support in PHP (except for that Italian site that was cited
somewhere in this discussion). That makes the pragmatist in me skeptical
about how proven this is for use in site translation, both from a
performance and complexity point of view.

At the very least, I'd like to see pregenerated pages instead of using
gettext dynamically. gettext can add a lot of overhead in interpreted
languages. I know this from experience: the XST backends slowed down by
an order of magnitude when we started using Perl gettext, and it took a
lot of hacks to get them back to a reasonable speed. I'd like to hear
from Joshua how difficult pregeneration would be with the template
system he's suggesting.
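
As a rough illustration of what pregeneration could look like, the
sketch below renders one static page per language from a single
template, so that no catalog lookup happens at request time. It is not
Joshua's template system: the {_ ... _} marker syntax, the function
names and the plain-dict catalogs are all assumptions. In practice the
catalogs would presumably be loaded from compiled po files.

```python
import re
from pathlib import Path

# Hypothetical marker syntax for translatable text: {_ Some text _}
MARKER = re.compile(r"\{_\s*(.*?)\s*_\}")


def render(template: str, catalog: dict) -> str:
    """Replace each marked string with its translation from the catalog."""
    def lookup(match):
        source = match.group(1)
        # Fall back to the source string when no translation exists.
        return catalog.get(source, source)
    return MARKER.sub(lookup, template)


def pregenerate(template_path: Path, catalogs: dict, out_dir: Path) -> list:
    """Write one static page per language, e.g. index.sv.html."""
    template = template_path.read_text(encoding="utf-8")
    stem = template_path.name.split(".")[0]
    written = []
    for lang, catalog in catalogs.items():
        out = out_dir / f"{stem}.{lang}.html"
        out.write_text(render(template, catalog), encoding="utf-8")
        written.append(out)
    return written
```

The generation step would run whenever a template or catalog changes,
and the web server would only ever serve the resulting static files.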

>>>> I believe it's pretty common to use foobar.en.html and so on.

>>> And that doesn't mean that we have to do an inferior and very much
>>> broken solution like that when doing a brand new site. Listen to
>>> translators for once, and help them help you, instead of ignoring them
>>> and their plea for the proper translation interface on purpose.

>> Please stop erecting strawmen like this. Why do you think I'm "ignoring
>> them and their plea for the proper translation interface on purpose"?

> <RANT NUMBER="2">
> It's just that I get the feeling that you know better every time we
> discuss what is needed for translators.
> </RANT>

I don't. What I see is that this is how most sites that do translations
do it, while very few sites use gettext. That makes me think that maybe
all those other sites have gone through this discussion too, and maybe
they found problems with using gettext. It's worth noting that the GNU
project, the creators of gettext, doesn't use gettext to translate its
own site, as far as I know (if they do, they pre-generate, but from my
involvement with the GNU site, I don't think it is pre-generated).

You're a translator. You think about what's easy to translate with.
That's a valid concern, but it's not the only concern that should be
considered in making decisions about this. Other concerns are
performance of the website, maintainability of the underlying system
used to create and keep the site up to date, and so on.

>> The main problem can be summarized as such: *The nature of text in
>> software is very different from the nature of text on webpages*.

>> Text in software consists of short, relatively independent strings. This
>> is exactly what gettext is created to manage. When a string changes,
>> you'll know, and you can translate that string again.

>> On the other hand, text on webpages is prose. It's long passages of
>> text, and it's *highly interdependent*.

> No, the weak spot of gettext is very, very short strings (lack of
> context). It's ideal for short paragraphs (usually more than enough
> context).
> I also translate documentation (not for GNOME though), and I have yet to
> see any documentation that has too long paragraphs for translation.
> Documentators know that long paragraphs are problematic for readers, so
> the paragraphs happen to be just the right size for translators too.
> Also, the interdependency is not a problem, it is not a problem in
> documentation, so I fail to see why it would be for web pages. See
> below.

As far as I know, GNOME documentation is not translated using gettext.
Someone did say that KDE does that, though, and that's definitely
interesting: it provides a data point for someone using gettext to
translate something more like the text on web pages.

Thank you for addressing my other points on chunking of text for
translations, etc. I'm still not convinced that gettext is ideal for the
job, but you've alleviated some of my worries.

I think gettext might work, but it's imperative that we find some way to
pregenerate static pages in that case. Otherwise, there's going to be an
unacceptable performance hit.

-- 
    Joakim Ziegler - Ximian Engineer - joakim ximian com - Radagast IRC
 FIX sysop - Free Software Coder - Writer - FIDEL & Conglomerate developer
http://www.avmaria.com/ - http://www.ximian.com/ - http://www.sinthetic.org/
