Re: Doc Translations



Hi Karl,

Today at 18:20, Karl Eichwalder wrote:

> To avoid a wrong impression I must emphasize again that you are on a
> very right train.  I appreciate your work a lot, I can only judge from a
> translator's POV and as one who plays with SGML and XML from time to
> time.

No offense taken, you need not worry ;-)  I'm also hoping to get the
best possible results, so I appreciate all the comments I can get.
Feel free to point out whatever I may be missing.

> Before I forget another issue: Don't drop surrounding tags entirely;
> esp. keep short tags like <title> or <caption> as comments:
>
>     #. tag: title
>     msgid "Installing Gnome"
>     msgstr ""
>
>     #. tag: caption
>     msgid "Gnome Desktop"
>     msgstr ""

Good one!  I have thought about it as well, but simply didn't get
there yet.
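
Roughly what I have in mind, as a minimal sketch only: this is not
xml2po's actual code, it uses Python's xml.etree instead of libxml2,
and the names po_escape and tagged_entries are made up for the
example.

```python
import xml.etree.ElementTree as ET

def po_escape(text):
    """Escape a string for use inside a quoted PO msgid."""
    return text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")

def tagged_entries(xml_source, short_tags=("title", "caption")):
    """Yield one PO entry per short tag, keeping the tag name as a comment."""
    root = ET.fromstring(xml_source)
    for elem in root.iter():
        if elem.tag in short_tags and elem.text and elem.text.strip():
            text = " ".join(elem.text.split())  # normalize whitespace
            yield '#. tag: %s\nmsgid "%s"\nmsgstr ""\n' % (elem.tag, po_escape(text))

doc = ("<article><title>Installing Gnome</title>"
       "<figure><caption>Gnome Desktop</caption></figure></article>")
print("\n".join(tagged_entries(doc)))
```

This only looks at the direct text of each element; mixed content
(inline tags inside a title) would need more care.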

>> It's very hard to determine the sort of text contained in the
>> entity.  It can be anything from simple text, well-formed XML,
>> to non-well-formed XML.  It's a pain.  So, we'd probably need to
>> assume it's at least well-formed XML.
>
> Yes.  Reminds me...  We need a validator that can check the XML syntax
> as encapsulated in the PO format.  If this is impossible (I'm inclined
> to assume it), after conversion back into the XML format the validator
> must provide a reference to the PO file as well.

Are you talking about a real validity test (something like
xmllint --valid), or simply a test of whether it is well-formed?

>> The simplest way to achieve this is to simply let translators add
>> DTD extensions they wish themselves.  Eg. have something like:
>>
>> #. Translators: define any other entities you wish to use here
>> msgid "<!ENTITY app \"Bug Buddy\">\n"
>> msgstr ""
>
> Yes, something along these lines I have had in mind.  And yes, msgfmt
> should learn to do some basic XML checking; it can derive the
> surrounding tag from the comment I proposed above.

Ok, this is not too difficult (provided I don't special case EXTERNAL
entities), but I wouldn't output it by default.
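
The basic check you describe could look something like the sketch
below.  It is only an illustration under my own assumptions: it uses
Python's xml.etree rather than anything in msgfmt, is_well_formed is a
made-up name, and it tests well-formedness only, not validity.  It
would also reject a msgstr that uses an entity which isn't declared
anywhere.

```python
import re
import xml.etree.ElementTree as ET

# Matches the "#. tag: NAME" comment proposed earlier in this thread.
TAG_COMMENT = re.compile(r"^#\. tag: (\w+)$", re.MULTILINE)

def is_well_formed(comment, msgstr):
    """Wrap msgstr in the tag named by the comment and try to parse it."""
    match = TAG_COMMENT.search(comment)
    tag = match.group(1) if match else "wrapper"  # fallback name is arbitrary
    try:
        ET.fromstring("<%s>%s</%s>" % (tag, msgstr, tag))
    except ET.ParseError:
        return False
    return True

print(is_well_formed("#. tag: title", "Installing <app>Gnome</app>"))
print(is_well_formed("#. tag: title", "Installing <app>Gnome"))
```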

>> This was the other way around: we have multiple XML files joined
>> together using SYSTEM entities.  We want translations from single PO
>> file with translations to go into multiple XML files which don't have
>> DTD's and stuff.
>
> It is not difficult to wrap the DTD stuff around the separate XML files
> temporarily.  IIRC, using osx (from OpenSP) it is possible to expand
> only some types of entities.  Maybe, it is possible to use a tool like
> osx as a preprocessor.

It's not a problem of wrapping DTD around, it's the problem of usage
thereafter.  We'd have to let xml2po know about the document which
extends DTD via a parameter.  Also, in the code, I'd have to
post-process output to remove DTD from produced XML in such cases,
because libxml2 insists on working with well-formed documents.
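
To illustrate the wrap-then-strip dance, here is a small sketch.  It
is not what xml2po does today: it uses Python's xml.etree in place of
libxml2, the function names are invented, and it assumes entity values
contain neither double quotes nor the sequence "]>".

```python
import xml.etree.ElementTree as ET

def wrap_with_entities(fragment, root_tag, entities):
    """Prepend a DOCTYPE whose internal subset declares the given entities."""
    decls = "".join('<!ENTITY %s "%s">' % (name, value)
                    for name, value in sorted(entities.items()))
    return "<!DOCTYPE %s [%s]>%s" % (root_tag, decls, fragment)

def strip_doctype(document):
    """Remove the DOCTYPE that wrap_with_entities added, if present."""
    if document.startswith("<!DOCTYPE"):
        return document[document.index("]>") + 2:]
    return document

fragment = "<para>Start &app; now.</para>"
wrapped = wrap_with_entities(fragment, "para", {"app": "Bug Buddy"})

# With the temporary DTD in place the parser can resolve &app;.
print(ET.fromstring(wrapped).text)

# Afterwards the DTD is removed again, leaving the original fragment.
print(strip_doctype(wrapped))
```

The awkward part is exactly the extra parameter: something has to
tell the tool where the entity declarations come from.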

>> So, what would be the best for *translators*? Allow them to define
>> DTD themselves?
>
> Yes, and it might be the best to keep it separate from the PO file.

Well, that's an option, but we lose some of the advantages like
having it all in the one file.  I have so far worked on putting even
stuff like translator-credits into PO files, so translators wouldn't
have to leave their familiar PO editor at all (except for images,
though we could base64-encode them into messages as well; I'm not going
to do that, for those who may think I was ;-).

The basic idea was to use xml2po during *build* time, so it shouldn't
even be used by the translators: something like intltool does for
other stuff.  We'd put only our own sr.pos in CVS, and have
documentation translation regenerated on the fly.  (This has one
problem of documentation thus likely being partly translated, and
partly in English; we have lived with that in UI, can we live with
that in docs?)

So, anything above I say that I dislike, I dislike only in the sense
of this automated usage.  Letting any other project choose their own
defaults is a good thing, but I'd like to keep it simple and nice for
Gnome.

> From the translator's POV that's just a macro definition file; happily,
> the syntax is not that clumsy:
>
> <!ENTITY app "Bug Buddy">

Agreed that it's quite simple, but it still requires updating more
than one place in order to update your translation.  I believe the
success of the PO format among translators comes from exactly this:
they have needed to care about only one file.

> Another important issue is to show the user a diff in case a fuzzy entry
> was produced while merging.  I always thought it could not be that
> difficult to add either the previous (old) string to a fuzzy entry in
> the PO file or a (w)diff output (using #| as a comment marker for the
> old content):

That's a nice idea, but I'm currently using msgmerge in order to
merge translations, so it's the one responsible for marking messages
"fuzzy".  It's the only one which knows which message the translation
was based on (its "fuzzy pair"), so perhaps that's where we
should ask for this update.  Or, I could go on and reimplement
msgmerge (eventually, I'd like to do that to add different sorts of
fuzzy matching, based on structure of the text), but I don't plan to
work on that anytime soon.

If this behaviour already exists, I can use this option when I call
msgmerge, but I don't know about it, so I'm not using it yet. ;-)
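
Just to picture what such an entry might look like, here is a tiny
sketch.  The layout after "#|" is my guess at what you mean, the
function names are invented, and "TRANSLATION" stands in for whatever
the old msgstr was; nothing here calls msgmerge.

```python
def po_quote(text):
    """Return text as a quoted, escaped PO string."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    return '"%s"' % escaped

def fuzzy_entry(old_msgid, new_msgid, old_msgstr):
    """Format a fuzzy PO entry that keeps the previous msgid in a #| comment."""
    lines = [
        "#, fuzzy",
        "#| msgid " + po_quote(old_msgid),  # the string the translation was based on
        "msgid " + po_quote(new_msgid),
        "msgstr " + po_quote(old_msgstr),
    ]
    return "\n".join(lines) + "\n"

print(fuzzy_entry("Installing Gnome", "Installing the Gnome Desktop", "TRANSLATION"))
```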

> The next step would be enhancing our PO file editors to display this
> nicely.

Well, there's always room for improvement in that area. :-)

FWIW, I've added the items I found interesting to
gnome-doc-utils/xml2po/TODO.  I've also added a "-k" option, but it
currently doesn't work as it should (I get some mysterious
"Segmentation faults" from libxml2.parseMemory, and I don't have time
to look into it right now; there's only a work-around, but spaces are
not correctly stripped from such messages with entities).

Cheers,
Danilo

