Re: Mistakes in doc translations



Hi all,

As Chusslove says, PO files are harder to keep under version control
than normal source files.
But two options make PO files much easier to manage in a VCS:
1. --no-wrap
2. --no-location

Both are msgmerge options. With --no-wrap, msgmerge does not wrap
message text, which gets rid of the diff noise from rewraps.
--no-location is also useful: it omits the source location comments
("#: filename:line"), which otherwise cause conflicts.
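
For anyone who scripts their merges, here is a minimal sketch of how the two options could be passed to msgmerge from Python. The function names and the file paths are my own placeholders, not part of any GNOME tooling, and I am assuming GNU gettext's msgmerge is on PATH. Only --update, --no-wrap and --no-location are real msgmerge options here.

```python
import subprocess
from typing import List


def msgmerge_command(po_path: str, pot_path: str) -> List[str]:
    """Build a msgmerge invocation that updates po_path from pot_path in place."""
    return [
        "msgmerge",
        "--update",       # rewrite po_path in place instead of printing to stdout
        "--no-wrap",      # do not re-break long message lines
        "--no-location",  # do not write '#: filename:line' comments
        po_path,
        pot_path,
    ]


def merge(po_path: str, pot_path: str) -> None:
    """Run msgmerge; requires GNU gettext to be installed."""
    subprocess.run(msgmerge_command(po_path, pot_path), check=True)


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    print(" ".join(msgmerge_command("ja/ja.po", "gnome-help.pot")))
```

Building the argument list separately from running it keeps the command easy to inspect or log before anything touches the PO file.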

See the following PO diff for an example:
http://git.gnome.org/browse/gnome-user-docs/commit/gnome-help/ja/ja.po?id=dc7c0d447e510dd25d15b73cc7462bf79f623d69

That PO file was merged with msgmerge --no-wrap --no-location. Its
commits show none of the inessential, noisy diffs that make version
control of PO files difficult.
In short, it is important to strip automatically generated
information and formatting from PO files kept under version control.
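
To illustrate the point about generated information, here is a small sketch that counts the changed lines between two versions of a PO fragment, with and without the "#:" comments. The PO fragment, the file name in it, and the helper names are made up for the example, and strip_locations only approximates what --no-location does.

```python
import difflib
from typing import List

# Two hypothetical versions of the same PO fragment: only the source line
# numbers in the '#:' comments differ; the translations are identical.
OLD = """\
#: C/intro.page:12
msgid "Welcome"
msgstr "Yokoso"

#: C/intro.page:20
msgid "Help"
msgstr "Herupu"
"""

NEW = OLD.replace(":12", ":14").replace(":20", ":22")


def strip_locations(po_text: str) -> List[str]:
    """Drop '#:' source-reference comments (approximates --no-location)."""
    return [line for line in po_text.splitlines() if not line.startswith("#:")]


def changed_lines(old: List[str], new: List[str]) -> int:
    """Count added or removed lines in a unified diff, skipping file headers."""
    diff = difflib.unified_diff(old, new, lineterm="", n=0)
    return sum(
        1
        for line in diff
        if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
    )


with_locations = changed_lines(OLD.splitlines(), NEW.splitlines())
without_locations = changed_lines(strip_locations(OLD), strip_locations(NEW))
print(with_locations, without_locations)  # prints "4 0"
```

Nothing in the translation changed, yet the version with location comments shows four changed lines while the stripped version shows none.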

Thanks,

-- 
Jiro Matsuzawa
E-mail:
 jmatsuzawa gnome org
 jmatsuzawa src gnome org
 matsuzawa jr gmail com
GPG Key ID: 0xECC442E9
GPG Key Fingerprint: E086 C14A 869F BB0E 3541 19EB E370 B08B ECC4 42E9



On Wed, Apr 18, 2012 at 5:40 PM, Chusslove Illich <caslav ilic gmx net> wrote:
>> [: Shaun McCance :]
>> The answer is plainly yes, if you use version control correctly. PO files
>> might have some characteristics that make some things harder, but they're
>> not so special that they're outside the realm of git.
>
> But PO files are the furthest outwards in the realm of Git (version control
> in general). I'm looking for ways to close them in.
>
>> PO files are more line-oriented than XML files. Will you get diff noise
>> from rewraps? Sure.
>
> Documentation XML files may be slightly more special than program code, but
> for the single reason you mention, text wrapping. And I've heard that there
> are powerful diff tools that can work around it (Emacs, I think). Also, I
> personally never word-wrap text in XML files, so in my uses XML files are
> exactly the same as source code.
>
> Wrapping in PO files causes much more noise because most translators use
> dedicated PO editors, which usually rewrap all messages when saving a PO
> file; there can be almost total line-level diff for one actual message
> changed. Then, there are unfuzzied messages, where half a message becomes a
> diff, even if only one word was changed. There are source reference
> comments, which change in all subsequent messages when source lines in front
> are moved. There is ordering of messages, which can change either due to
> source perturbations or messages being obsoleted and shifted to end.
>
> Here is a typical scenario. Translator works for some time on a PO file
> obtained from somewhere (from repository incl. intltool-update, from DL),
> and completes the translation. Some time afterwards, that PO file is
> received by the committer (through email, through DL). The received PO file
> is now arbitrarily different from the PO file in the repository, with the
> baseline unknown. What is the committer supposed to do? If he doesn't want
> to review the translation, he will just copy the received PO over the
> current repository PO, run intltool-update and msgfmt -c, and commit. Here
> maintainer's fix will be lost outright. If the committer does want to
> review, he may run intltool-update over repository PO and over received PO
> and diff that (or something to that effect, e. g. rely on DL). Here, given a
> lot of garbage in line-level diff, it will require good concentration not to
> miss maintainer's fix -- how many committers do this regularly?
>
> With code (or documentation XML) the diff is much more meaningful and the
> baseline is normally known, so the version control system (or a standalone
> tool) can perform an effective 3-way merge and automatically bring up the
> real conflicts. Something in the spirit of this would be needed for truly
> non-locking PO workflow. But it would not be sufficient on its own:
>
>> About a dozen people regularly commit to the same Mallard page files in
>> gnome-user-docs. Not a single one of the files belongs to only one person.
>> I regularly commit to files written by someone else. It does work, as long
>> as you use version control correctly.
>
> What is the difference between what is done for Mallard page files and what
> programmers do with the code? By looking through Git log, I don't see any.
> For PO files, it goes like this.
>
> By far the most frequent modification to PO files is translation update
> after merging. This update will usually happen sometime near to release. For
> n PO files and m active translators, most of the n * m file-translator
> combinations are viable. If two translators update the same file at the same
> time, there will be a lot of conflicts. These conflicts will be such that
> one translator's work will simply have to be discarded. The net result is
> that translators practically never rely on version control for work
> synchronization, but almost always establish some sort of locking workflow
> on the organizational level. This can be informal, e.g. through "who will
> now update what" on a mailing list, or more formal, e.g. through web
> assignment interfaces (like DL's reservations).
>
> With code it is much rarer that two people will work on the same code
> at the same time. They may work on the same file, but at different parts of
> it. For n source files and m active programmers, only a small subset of
> n * m file-programmer combinations is viable. The result is that clean
> merges are possible most of the time, very little work is lost due to
> overlapping, and hence version control can be relied upon for work
> synchronization. Organizational locking is extremely rare.
>
> --
> Chusslove Illich (Часлав Илић)
>
> _______________________________________________
> gnome-i18n mailing list
> gnome-i18n gnome org
> http://mail.gnome.org/mailman/listinfo/gnome-i18n
>

