Re: PO-based Documentation Translation
- From: Tim Foster <Tim Foster Sun COM>
- To: gnome-i18n gnome org
- Subject: Re: PO-based Documentation Translation
- Date: Wed, 01 Oct 2003 16:16:34 +0100
On Wed, 2003-10-01 at 15:33, Karl Eichwalder wrote:
> Tim Foster <Tim.Foster@Sun.COM> writes:
> > msgid "a,b,c,d"
> > msgstr "a,b,c,d"
> > However, doing it this way means that if a single sentence changes, the
> > entire paragraph gets marked as a "new" message (probably a strong fuzzy
> > match) in an automatic translation system.
> My proposal to solve this problem was: keep track of the previous msgid
> and add it as a comment (using a marker like '#|') while doing the
> msgmerge step:
Yeah, that would work: you're getting the differences displayed, but
not taking advantage of the matches you'd get across other projects if
you could segment at the sentence level.
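
To make the '#|' idea concrete, here is a minimal sketch in Python of what
a merged entry might look like. The helper names (po_quote, fuzzy_entry)
and the exact layout of the comment line are assumptions for illustration,
not the output of any particular tool:

```python
def po_quote(text):
    """Quote a string for a PO file (escapes backslash, double quote, newline)."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    return '"' + escaped + '"'


def fuzzy_entry(previous_msgid, new_msgid, old_msgstr):
    """Render a fuzzy PO entry that keeps the previous msgid in a '#|' comment.

    Hypothetical helper: the layout is an assumption based on the proposal
    quoted above.
    """
    lines = [
        "#, fuzzy",
        "#| msgid " + po_quote(previous_msgid),
        "msgid " + po_quote(new_msgid),
        "msgstr " + po_quote(old_msgstr),
    ]
    return "\n".join(lines)


# The translator sees the old and new source text side by side.
entry = fuzzy_entry("a,b,c,d", "a,b,x,d", "a,b,c,d")
print(entry)
```

This shows the translator what changed, but the paragraph is still the
unit of reuse.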
> Splitting at the sentence level can cause other problems. What's a
> sentence is different from language to language.
You're right: we have different segmenters for different languages. It
just turns out that 99% of our translation uses English as a source.
> > In fact, there's another bonus too, since if the sentence "A launcher
> > can reside in a panel or a menu" appears in another document, but not in
> > exactly the same paragraph context, you still have the translation in
> > your database and can reuse that.
> There is no guarantee it will fit this way.
Of course, context is important, but in the absence of anything else, we
can suggest the match - a reviewer doesn't have to use an exact match we
suggest, but usually it's acceptable.
In terms of saving time and money, sentence segmentation is fantastic
and works for us.
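
As a rough illustration of sentence-level reuse, here is a minimal sketch
in Python. The splitter is a deliberately naive English-only rule, and the
names (segment, lookup, memory) are assumptions for the example, not our
actual tools; the stored translation is a placeholder:

```python
import re

# Naive English-only rule: a sentence ends at '.', '!' or '?' followed by
# whitespace. Real segmenters also handle abbreviations, numbers and so on.
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def segment(paragraph):
    """Split a paragraph into sentences using the naive rule above."""
    return [s for s in _SENTENCE_END.split(paragraph.strip()) if s]


def lookup(paragraph, memory):
    """Pair each sentence with its stored translation, or None if unseen."""
    return [(s, memory.get(s)) for s in segment(paragraph)]


# Hypothetical translation memory keyed by exact source sentence.
memory = {
    "A launcher can reside in a panel or a menu.": "<stored translation>",
}

paragraph = "A launcher can reside in a panel or a menu. Click it to start."
for sentence, match in lookup(paragraph, memory):
    print(sentence, "->", match if match is not None else "(needs translation)")
```

Only the second sentence needs work here; the first is offered to the
reviewer as a suggestion, whatever paragraph it turns up in.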
> In case of <b> (= bold) this will work; it may or may not work for
> other elements
For inline tags, we've found this to be just peachy and haven't broken an
HTML or SGML document yet based on this approach.
> thus you are better off converting or escaping the
> inline "tags" (and convert it back once the translation is done).
> Depending on your data something as follows may work:
> This is a piece of !!b>bold text.
> This is a new sentence that!!/b> isn't in bold any more.
Yes, that would be another way of doing it, but it requires more work
post-translation, which was something we were trying to avoid.
Our rule for SGML, HTML and XML was to never put invalid (not
well-formed) text into our database: if we had a segment with a
missing open or close tag that was not allowed in the particular DTD
involved, we'd always make it well-formed before storing the text.
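
A minimal sketch in Python of that kind of repair, using the bold example
from above. The function name and the simple stack-based approach are
assumptions for illustration; it ignores the DTD entirely and does not
recover attributes of an opening tag that sits in an earlier segment:

```python
import re

# Matches an opening, closing or self-closing tag and captures its name.
_TAG = re.compile(r"<(/?)([A-Za-z][A-Za-z0-9]*)[^<>]*?(/?)>")


def balance_inline_tags(segment):
    """Add the open/close tags a segment is missing so it is well-formed."""
    open_stack = []     # tags opened in this segment and not yet closed
    missing_opens = []  # close tags whose opener lies in an earlier segment
    for match in _TAG.finditer(segment):
        closing, name, self_closing = match.groups()
        if self_closing:
            continue  # e.g. <br/> needs no partner
        if not closing:
            open_stack.append(name)
        elif open_stack and open_stack[-1] == name:
            open_stack.pop()
        else:
            missing_opens.append(name)
    # The first unmatched close belongs to the innermost missing opener,
    # so the openers are emitted in reverse order.
    prefix = "".join("<%s>" % name for name in reversed(missing_opens))
    suffix = "".join("</%s>" % name for name in reversed(open_stack))
    return prefix + segment + suffix


print(balance_inline_tags("This is a piece of <b>bold text."))
print(balance_inline_tags("This is a new sentence that</b> isn't in bold any more."))
```

Each stored segment then stands on its own, so nothing has to be
converted back after translation.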