Re: About translating documents (.xml/.sgml) in GNOME
- From: Sander Vesik <sander_traveling yahoo co uk>
- To: Malcolm Tredinnick <malcolm commsecure com au>,Simos Xenitellis <simos74 gmx net>
- Cc: GNOME Documentation List <gnome-doc-list gnome org>,gnome-i18n gnome org
- Subject: Re: About translating documents (.xml/.sgml) in GNOME
- Date: Thu, 30 Jan 2003 18:34:42 +0000 (GMT)
--- Malcolm Tredinnick <email@example.com> wrote:
> I have code for some of the following (mostly extracting the strings for
> translation and constructing something that is pretty close to a valid
> .po file). If people cannot shoot too many holes in my method I will
> continue down this path, although that does not help Simos with his
> immediate problem of translating _now_.
For now, it's largely on the level of using xemacs/psgml or similar.
> Probably I should point out that the main problem I see in Simos'
> approach is that you will get a lot of unnecessary stuff in the .po
> file, which looks like it will interfere with smooth translations. Also,
> I could not see how some of the "issues to be resolved" were addressed
> by that code (to be fair, the authors mentioned at the time that it was
> a prototype of an idea).
> > A documentation translator -- design document
> Much GNOME documentation exists in the form of DocBook-SGML, DocBook-XML
> and (X)HTML documents. Translators are most comfortable working with GNU
> gettext-style po files. The aim of this program is to provide an efficient
> means of converting documentation source into po files and then merging the
> resulting translations back into a document for distribution.
> DESIGN IDEAS:
> There are two halves to this program. The first part is extracting all of the
> translatable strings into the po files, ready for translation. The second part
> is creating translated documents from the po files at build time.
> (1) Extracting the strings
> When run, the program is given a list of tags which are considered
> "block elements" (by analogy with the concept in HTML). These tags are
> the ones which do not have significant bearing on the ability of their
> contents to be translated. So they are dropped in the conversion to po
> format. By way of example, in HTML we would consider the following tags
> to be amongst those which are block elements: p, h1, h2, br, hr, table,
> and so on.
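A minimal sketch of the extraction step described above, assuming the input is well-formed XML and that a chunk is the content of any block element that contains no further block elements. The tag list, the function names and the sample document are illustrative assumptions, not Malcolm's actual code.

```python
import xml.etree.ElementTree as ET

# Hypothetical list of block elements; a real run would take it as input.
BLOCK_TAGS = {"article", "sect1", "title", "para"}

SAMPLE = (
    "<article><title>Opening files</title>"
    "<para>Choose <guimenu>File</guimenu> to start.</para></article>"
)


def inner_markup(elem):
    """Return the element's contents, inline markup included, without its own tag."""
    parts = [elem.text or ""]
    for child in elem:
        # ET.tostring() also emits the child's tail text.
        parts.append(ET.tostring(child, encoding="unicode"))
    # Collapse runs of whitespace so line wrapping does not change the msgid.
    return " ".join("".join(parts).split())


def extract_chunks(root, block_tags):
    """Collect one chunk per innermost block element, in document order."""
    chunks = []
    for elem in root.iter():
        if elem.tag not in block_tags:
            continue
        # A block that holds other blocks is only a container; skip it.
        if any(d.tag in block_tags for d in elem.iter() if d is not elem):
            continue
        text = inner_markup(elem)
        if text:
            chunks.append(text)
    return chunks


def po_entry(msgid):
    """Format one untranslated po entry, escaping backslashes and quotes."""
    escaped = msgid.replace("\\", "\\\\").replace('"', '\\"')
    return f'msgid "{escaped}"\nmsgstr ""\n'


if __name__ == "__main__":
    for chunk in extract_chunks(ET.fromstring(SAMPLE), BLOCK_TAGS):
        print(po_entry(chunk))
```

The block tags themselves never reach the po file, while inline tags such as <guimenu> stay inside the msgid. Text that sits directly inside a container block, next to child blocks, is ignored by this sketch.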
A big problem with this block-level approach is that you do not get
any reuse. Having translated <menuchoice> <guimenu>File</guimenu>
<guimenuitem>Open</guimenuitem></menuchoice> once doesn't help you at all
with the next (possibly hundreds of) occurrences. There are a
lot of different occurrences of <guimenu>, <guilabel>, <guimenuitem> and
<guibutton> with largely the same contents. This may have to be a separate
pretranslation step though.
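One way such a pretranslation step could look, as a rough sketch only: gather the distinct GUI labels so each is translated once, then substitute the known translations before the paragraphs are extracted. The function names, the sample document and the German strings are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# The four DocBook GUI elements named in the paragraph above.
GUI_TAGS = ("guimenu", "guilabel", "guimenuitem", "guibutton")

SAMPLE = (
    "<sect1>"
    "<para>Use <menuchoice><guimenu>File</guimenu>"
    "<guimenuitem>Open</guimenuitem></menuchoice>.</para>"
    "<para>Again, <menuchoice><guimenu>File</guimenu>"
    "<guimenuitem>Open</guimenuitem></menuchoice> works.</para>"
    "</sect1>"
)


def collect_gui_strings(root):
    """Count how often each (tag, label) pair occurs in the document."""
    counts = Counter()
    for elem in root.iter():
        if elem.tag in GUI_TAGS and elem.text and elem.text.strip():
            counts[(elem.tag, elem.text.strip())] += 1
    return counts


def pretranslate(root, translations):
    """Replace known GUI labels in place; return how many were replaced."""
    replaced = 0
    for elem in root.iter():
        if elem.tag in GUI_TAGS and elem.text:
            label = elem.text.strip()
            if label in translations:
                elem.text = translations[label]
                replaced += 1
    return replaced


if __name__ == "__main__":
    doc = ET.fromstring(SAMPLE)
    for (tag, label), n in collect_gui_strings(doc).items():
        print(f"{tag}: {label!r} occurs {n} time(s)")
    # Illustrative translations; each label is supplied once and reused.
    print(pretranslate(doc, {"File": "Datei", "Open": "Öffnen"}), "labels replaced")
```

Keying the translation table on the label alone assumes a label reads the same wherever it appears; keying on (tag, label) instead would be the more cautious choice.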
> [NOTE: Typically, a chunk for translation will be a paragraph. This
> seems like a sensible division, since it may lead to a better
> translation to reorganise the sentence structure, but keeping the
> paragraph structure the same should not be too much of a burden, from
> my limited experience of other languages.]
> In a normal program internationalisation effort, all strings from all
> files are put into a single po file for each language. However, when
> translating documentation, this approach does not seem efficient.
> Firstly, it is not unreasonable to expect that only a fraction of the
> documentation in any package will be initially translated. Secondly,
> the po files will be much larger than for all but the largest programs,
> since user and developer documentation is often quite lengthy. Typical
> use of this program will therefore place the po files for each document
> in their own directory (probably under the document's source directory,
> or its immediate parent).
> [NOTE: It has also been floated that, since the source is usually
> under a C/ directory, alternative translations can go in directories
> labelled by their locale name -- so es/, no/, and so forth. The files
> in these directories would still be .po format files so that translators
> can use their current familiar techniques.]
This is by now the standard - all localised docs are in their own subdirectories.
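As a small illustration of that layout, the sketch below maps a source file under C/ to the matching .po file in a locale directory. The helper name and the example path are hypothetical, and it assumes the source path has a directory component named exactly "C".

```python
from pathlib import PurePosixPath


def localized_path(source, locale, suffix=".po"):
    """Swap the last "C" directory for the locale and change the file suffix."""
    parts = list(PurePosixPath(source).parts)
    if "C" not in parts[:-1]:
        raise ValueError(f"no C/ directory in {source!r}")
    # Find the last directory component named "C" (the file name is excluded).
    idx = len(parts) - 2 - parts[:-1][::-1].index("C")
    parts[idx] = locale
    return PurePosixPath(*parts).with_suffix(suffix)


if __name__ == "__main__":
    # Hypothetical document path, used only to show the mapping.
    for loc in ("es", "no"):
        print(localized_path("help/C/manual.xml", loc))
```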