Re: Mistakes in doc translations



Hi

On Wed, 18 Apr 2012, Shaun McCance wrote:

On Wed, 2012-04-18 at 10:40 +0200, Chusslove Illich wrote:
[: Shaun McCance :]
The answer is plainly yes, if you use version control correctly. PO files
might have some characteristics that make some things harder, but they're
not so special that they're outside the realm of git.

But PO files are the furthest outwards in the realm of Git (version control
in general). I'm looking for ways to close them in.

I'm pretty sure that distinction belongs to PNG files. :)

PO files are more line-oriented than XML files. Will you get diff noise
from rewraps? Sure.

Documentation XML files may be slightly more special than program code, but
for the single reason you mention: text wrapping. And I've heard of
powerful diff tools that can work around it (Emacs, I think). Also, I
personally never word-wrap text in XML files, so in my use XML files are
exactly the same as source code.

There are amazing XML diff tools out there. And if there aren't amazing
PO diff tools out there, somebody needs to write one. Heck, I've even
seen image diff tools, which is incredible. But when pulling and merging
in git, all that matters is the line diff.


pyg3t contains "podiff", which ignores wrapping. However, it is unlikely to make it into git, so it will not really solve the problem.
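
To make "ignores wrapping" concrete, here is a minimal sketch of a wrapping-insensitive comparison. It is not how podiff is implemented; parse_po and po_changes are hypothetical names, the parser only handles plain msgid/msgstr pairs (no msgctxt, no plurals), and it assumes PO string escapes can be read as Python string literals.

```python
import ast
import re


def parse_po(text):
    """Parse simple PO text into {msgid: msgstr}, joining wrapped strings."""
    entries = {}
    for block in re.split(r"\n\s*\n", text.strip()):
        fields = {"msgid": "", "msgstr": ""}
        current = None
        for line in block.splitlines():
            line = line.strip()
            if not line or line.startswith("#"):
                continue  # comments carry no translatable text
            keyword, _, rest = line.partition(" ")
            if keyword in fields:
                current, line = keyword, rest.strip()
            elif not line.startswith('"'):
                current = None  # unsupported keyword (msgctxt, plurals, ...)
                continue
            if current is not None:
                # Each physical line holds one quoted chunk; concatenating
                # the chunks is what makes the result wrapping-insensitive.
                fields[current] += ast.literal_eval(line)
        if fields["msgid"] or fields["msgstr"]:
            entries[fields["msgid"]] = fields["msgstr"]
    return entries


def po_changes(old_text, new_text):
    """Return {msgid: (old, new)} for messages whose translation differs."""
    old, new = parse_po(old_text), parse_po(new_text)
    return {
        msgid: (old.get(msgid), new.get(msgid))
        for msgid in sorted(old.keys() | new.keys())
        if old.get(msgid) != new.get(msgid)
    }


OLD = '''msgid "Open the file"
msgstr ""
"Abrir "
"el archivo"

msgid "Save"
msgstr "Guardar"
'''

# Same content, but rewrapped, reordered, and with one real change.
NEW = '''msgid "Save"
msgstr "Grabar"

msgid "Open the file"
msgstr "Abrir el archivo"
'''

print(po_changes(OLD, NEW))
```

Because the comparison is keyed on msgid, it also ignores message order, so only the changed translation is reported.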

I do wrap lines in XML at somewhere around 80-100 characters, and I
encourage all the doc team contributors to do the same. The reason is
that, if you have an 800-character line, and you change a letter, it's
really hard to figure out what's going on in the diff.

For the same reason, I encourage people not to rewrap all the lines
in a paragraph when adding a word or two. Yes, it leads to more ragged
right edges. But it makes the diffs so much cleaner. Line wrapping is
good. Automatic rewrapping is a pain.
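
As a rough illustration of the difference, the following sketch counts how many lines a line-based diff reports when a single word is added, once with the paragraph rewrapped and once with only the affected line touched. The paragraph content, the width of 20, and the helper name changed_lines are made up for the example.

```python
import difflib
import textwrap


def changed_lines(old, new):
    """Count added plus removed lines in a line-based diff of two line lists."""
    diff = list(difflib.unified_diff(old, new, lineterm="", n=0))
    # The first two lines are the '---' and '+++' file headers.
    return sum(1 for line in diff[2:] if line.startswith(("+", "-")))


# A paragraph of 30 distinct short words, wrapped at 20 columns.
words = [f"w{i:02d}" for i in range(1, 31)]
original = textwrap.wrap(" ".join(words), width=20)

# Edit 1: add one word near the start and rewrap the whole paragraph.
edited_words = words[:1] + ["NEW"] + words[1:]
rewrapped = textwrap.wrap(" ".join(edited_words), width=20)

# Edit 2: add the same word but leave all other line breaks alone.
minimal = [original[0].replace("w01", "w01 NEW")] + original[1:]

print("lines touched, minimal edit:", changed_lines(original, minimal))
print("lines touched, rewrapped:   ", changed_lines(original, rewrapped))
```

The minimal edit leaves one overlong line, but the diff shows exactly where the change is. The rewrapped version touches every line of the paragraph.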


So here's a summary of the things that cause problems, as I understand
it. Correct me if I'm wrong.

1) Automatic rewrapping creates lots of noise and can confuse merges.
2) Multiple people merging the POT file creates conflicts.
3) Messages get reordered, which creates complete noise in diffs.
4) Due to workflow, we don't have a baseline commit to reference.

(4) is a serious problem. git is really smart, and has a number of
merge strategies that I can only describe as "magic". But they don't
work if you bypass version control.
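
To show why the baseline matters, here is a toy three-way merge over per-message translations. This is only an illustration of the general idea, not git's actual merge algorithm; the function name three_way_merge and the dictionaries are hypothetical.

```python
def three_way_merge(base, ours, theirs):
    """Merge {msgid: msgstr} dicts using a common ancestor.

    Returns (merged, conflicts). On conflict, 'ours' is kept as a placeholder.
    """
    merged, conflicts = {}, []
    for msgid in sorted(base.keys() | ours.keys() | theirs.keys()):
        b, o, t = base.get(msgid), ours.get(msgid), theirs.get(msgid)
        if o == t:
            result = o          # both sides agree
        elif o == b:
            result = t          # only theirs changed
        elif t == b:
            result = o          # only ours changed
        else:
            conflicts.append(msgid)  # both changed, differently
            result = o
        if result is not None:
            merged[msgid] = result
    return merged, conflicts


base = {"Open": "Abrir", "Save": "Guardar"}
ours = {"Open": "Abrir archivo", "Save": "Guardar"}
theirs = {"Open": "Abrir", "Save": "Grabar"}

# With the baseline, each side's change is recognized and both are kept.
print(three_way_merge(base, ours, theirs))

# Without a baseline (empty ancestor), every difference looks like a conflict.
print(three_way_merge({}, ours, theirs))
```

With the common ancestor available, the two independent edits merge cleanly. With an empty ancestor, the same inputs yield a conflict for every message that differs.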

(1) and (3) totally suck, and I think reflect problems in the tools
translators use. It's like the tools are actively trying to prevent
you from having meaningful version control.

I'd be surprised if git couldn't manage to deal with (2), given that
it ought to be the exact same changes introduced. For my part, I can
guarantee that I'll never commit those kinds of changes.

Do we really not have any tools to help translators merge two sets
of changes? That seems like it should be a solved problem.

Which criterion would one use to merge them? Presumably one would prefer the strings from one changeset over the other, and then include as many translations as possible. Would that be the idea?
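
The criterion described above could be sketched roughly as follows, assuming each changeset has already been reduced to a {msgid: msgstr} mapping in which an empty string means "untranslated". The name merge_translations is hypothetical, and this is one possible policy rather than what any existing tool does.

```python
def merge_translations(preferred, other):
    """Prefer one changeset, but fill its gaps from the other.

    Returns (merged, conflicts), where conflicts lists the msgids for
    which both sides had different non-empty translations.
    """
    merged, conflicts = {}, []
    for msgid in sorted(preferred.keys() | other.keys()):
        a = preferred.get(msgid, "")
        b = other.get(msgid, "")
        if a and b and a != b:
            conflicts.append(msgid)  # preferred side wins, but report it
        merged[msgid] = a or b       # take whichever translation exists
    return merged, conflicts


preferred = {"Open": "Abrir", "Save": "", "Close": "Cerrar"}
other = {"Open": "Abre", "Save": "Guardar", "Quit": "Salir"}

merged, conflicts = merge_translations(preferred, other)
print(merged)
print(conflicts)
```

Reporting the overridden messages separately lets a translator review only the strings where the two changesets genuinely disagree.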

Anyway, if there are problems that break the build (less common with
recent itstool improvements), then my choices are to fix them or to
disable the translation. I'm not in the habit of building daily, so
if I have to disable a translation, it probably means it's disabled
in the release. And that would be a real shame. You guys do too much
awesome work for it to be wasted like that.

--
Shaun

In my opinion it is fine to fix critical syntax errors by hand. Translators should be able to deal with changes in the po/pot-file at any time, as it is updated at arbitrary times already. This seems like simply a practical issue, as the one who finds the syntax error can probably fix it faster than it would take to communicate its location to the translators.

Regards
Ask

