Re: Diffing POT files



Hi

On Thu, Mar 22, 2012 at 4:20 PM, Chris Leonard <cjlhomeaddress gmail com> wrote:
> I frequently encounter situations where I am interested in comparing
> POT files that are closely related, but not 100% identical.
>
> Does anyone know of a tool or scriptable series of actions using
> Translate Toolkit modules (or other common text manipulation tools)
> where it is simple to determine the differences between two POT files
> (e.g. two versions of the same project at different points in time,
> etc.).
>
> I imagine something like a podiff or a pounique operation where there
> are two inputs (file1.pot and file2.pot) and the output might ideally
> be three files that represent the textual equivalent of a Venn diagram
> of these two files.
>
> file1-unique.pot
>        msgids (still in a nice POT format) that are unique to file1
>
> file2-unique.pot
>        msgids (still in a nice POT format) that are unique to file2
>
> file1-file2 common.pot
>        msgids (still in a nice POT format) that represent the completely
> identical msgid overlap between file1.pot and file2.pot.
>
> This process should not permit fuzzy matching, which could lead to confusion.
>
> Does anyone know of such a tool?  It would ideally be aware of PO file
> structure to treat string subunits of a PO file as a single "chunk" as
> opposed to a simple *nix diff which would be line-by-line.
>
> Alternatively, does anyone have an "algorithm" employing Translate
> Toolkit modules to achieve the same or similar result that could be
> turned into a shell script that involves minimal manual manipulation
> of the input or output files to achieve this sort of POT comparison
> result?
>
> TIA for any ideas or suggestions.
>
> cjl
> Sugar Labs Translation Team Coordinator
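
The three-way split described in the quoted message could be sketched in plain Python roughly as below. This is only an illustration under simplifying assumptions, not an existing tool: entries are assumed to be separated by blank lines, an entry is identified by exact (msgctxt, msgid) equality with no fuzzy matching, the header entry (empty msgid) is dropped, and the function names are made up for this example.

```python
import re

# Matches the keyword lines of a PO entry, e.g.  msgid "Open"
FIELD = re.compile(r'(msgctxt|msgid|msgid_plural|msgstr(?:\[\d+\])?)\s+"(.*)"$')

def split_entries(text):
    """Split PO/POT text into chunks, assuming blank lines separate entries."""
    return [c.strip("\n") for c in re.split(r"\n\s*\n", text) if c.strip()]

def entry_key(chunk):
    """Return (msgctxt, msgid) for one chunk, joining continuation lines."""
    fields = {"msgctxt": None, "msgid": None}
    current = None
    for line in chunk.splitlines():
        line = line.strip()
        if line.startswith("#"):  # comments, including obsolete "#~" entries
            continue
        m = FIELD.match(line)
        if m:
            current = m.group(1)
            if current in fields:
                fields[current] = m.group(2)
        elif line.startswith('"') and line.endswith('"') and current in fields:
            fields[current] += line[1:-1]  # continuation of msgctxt/msgid
    return (fields["msgctxt"], fields["msgid"])

def index(text):
    """Map (msgctxt, msgid) -> whole entry chunk, skipping the header."""
    out = {}
    for chunk in split_entries(text):
        key = entry_key(chunk)
        if key[1]:  # drops the header (msgid "") and comment-only chunks
            out.setdefault(key, chunk)
    return out

def venn(text1, text2):
    """Return (only in file1, only in file2, common) as lists of chunks."""
    a, b = index(text1), index(text2)
    only1 = [a[k] for k in a if k not in b]
    only2 = [b[k] for k in b if k not in a]
    common = [a[k] for k in a if k in b]
    return only1, only2, common
```

Joining each returned list with blank lines and prepending a header entry would give three files along the lines of file1-unique.pot, file2-unique.pot and the common file. The common list keeps the entry text from the first file, so differing comments or source references in the second file are not shown.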

Not quite, but pyg3t contains a tool called gtcompare which will give
a qualitative overview of the differences between two files (useful
e.g. if there are conflicting changes and you want to see roughly how
bad things are).

For example here's some output comparing before and after a recent
translation of mine:

----------------------------------
askhl@mime:~/Downloads$ gtcompare old/gnome-disk-utility.master.da.po gnome-disk-utility.master.da.po
Each file contains 425 msgids, and they are all identical.

0 messages remain untranslated.
0 untranslated messages changed to fuzzy.
5 untranslated messages changed to translated.
0 fuzzy messages changed to untranslated.
0 messages remain fuzzy.
28 fuzzy messages changed to translated.
0 translated messages changed to untranslated.
0 translated messages changed to fuzzy.
392 messages remain translated.

There are no conflicts among translated messages.
-----------------------------------

Right now the script has no options.  Maybe we should implement an
option to print the actual messages in specified categories.  This
would be exceedingly easy I think.  What would be a good syntax?
Something like:

gtcompare --fuzzy2translated --untranslated2fuzzy file1 file2

(a bit long)

Or:
gtcompare --print f2t,u2f file1 file2

(rather ugly)
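
To make the second variant concrete, here is a rough argparse sketch of how a comma-separated --print option might be parsed. This is not gtcompare's actual code (the script has no options right now); the category names such as f2t and u2f, the dest name and the helper functions are all assumptions made up for illustration.

```python
import argparse

# Hypothetical short names for the three message states.
STATES = {"u": "untranslated", "f": "fuzzy", "t": "translated"}
# All nine transitions, e.g. "f2t" = fuzzy changed to translated,
# "t2t" = remains translated.
CATEGORIES = {f"{a}2{b}" for a in STATES for b in STATES}

def category_list(value):
    """Parse a value like 'f2t,u2f' into a validated list of categories."""
    names = [n.strip() for n in value.split(",") if n.strip()]
    unknown = [n for n in names if n not in CATEGORIES]
    if unknown:
        raise argparse.ArgumentTypeError(
            "unknown categories: " + ", ".join(unknown))
    return names

def build_parser():
    parser = argparse.ArgumentParser(prog="gtcompare")
    parser.add_argument(
        "--print", dest="print_categories", type=category_list,
        default=[], metavar="CATS",
        help="comma-separated categories whose messages should be printed")
    parser.add_argument("file1")
    parser.add_argument("file2")
    return parser

if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args.print_categories, args.file1, args.file2)
```

One advantage of the comma-separated form is that a single option covers all nine categories, whereas the long form would need one flag per category.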

Any ideas?

Regards
Ask

