Statistics for each GNOME translator's work



Hi All,

A few months ago there was a discussion on gnome-i18n about having
translation statistics, or some way to easily see how much work
each translator is doing.

Such a thing would be useful if, for example, you want to make an announcement
of the localisation of GNOME 3.0 to your language and you want to show how
much work each translator did.

I am working on such a tool and it is available at
http://github.com/simos/gnome-l10n-translator-stats

Here is a sample run for 'yelp' only (/tmp/GIT/ contains only 'yelp'):

> ./gnome-l10n-translator-stats stats --language el --startdate "2009/01/01" --enddate "2010/06/30" --release gnome-2-30 --repositories /tmp/GIT/
Release      : gnome-2-30      retrieved release: gnome-2-30
Language     : el              retrieved language: el
Repositories : /tmp/GIT/
Start date   : Thu Jan  1 00:00:00 2009
End date     : Wed Jun 30 00:00:00 2010

             Thanos Lefteris <xxx gmail com>        165         16
                 Kostas Papadimas <xxx gnome org>        257         45
                Simos Xenitellis <xxx gnome org>          0          2
> _

The first number column is the count of translated words and the second
is the count of 'changed messages' (translation fixes/updates).

What is still missing is a better algorithm for counting the work done
when a translation is 'updated'. When a translation is added for the
first time, it is simple to count the translated words.

The current algorithm is:
1. Obtain the before and after versions of a PO file.
2. Use 'pocount' to count the translated strings in both, and note down
the difference.
3. Use 'podiff' to count the 'changed' messages (message updates).
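
To make the three steps above concrete, here is a minimal Python sketch.
It is not the tool's code and it does not call 'pocount' or 'podiff'; it
uses a deliberately simplified PO parser as a stand-in for both, and it
assumes that "work" means the source words of messages that have a
non-fuzzy translation. The function names (parse_po, translated_words,
changed_messages, compare_revisions) are hypothetical, and plural forms
and msgctxt are ignored.

```python
import re


def unquote(token):
    """Strip the surrounding quotes of a PO string and undo basic escapes."""
    token = token.strip()
    if len(token) >= 2 and token[0] == '"' and token[-1] == '"':
        token = token[1:-1]
    return token.replace('\\n', '\n').replace('\\"', '"')


def parse_po(text):
    """Return {msgid: (msgstr, is_fuzzy)} for the text of a PO file.

    Simplified stand-in: ignores msgctxt, plural forms and obsolete entries.
    """
    entries = {}
    for block in re.split(r"\n\s*\n", text.strip()):
        fields = {"msgid": "", "msgstr": ""}
        fuzzy = False
        current = None
        for line in block.splitlines():
            line = line.strip()
            if line.startswith("#,") and "fuzzy" in line:
                fuzzy = True
            elif line.startswith("#"):
                continue  # comments and obsolete (#~) entries
            elif line.startswith("msgid "):
                current = "msgid"
                fields[current] = unquote(line[len("msgid "):])
            elif line.startswith("msgstr "):
                current = "msgstr"
                fields[current] = unquote(line[len("msgstr "):])
            elif line.startswith('"') and current:
                fields[current] += unquote(line)  # continuation line
            else:
                current = None  # unsupported keyword, e.g. msgid_plural
        if fields["msgid"]:  # an empty msgid is the PO header, skip it
            entries[fields["msgid"]] = (fields["msgstr"], fuzzy)
    return entries


def translated_words(entries):
    """Count source words of messages that have a non-fuzzy translation."""
    return sum(len(msgid.split())
               for msgid, (msgstr, fuzzy) in entries.items()
               if msgstr and not fuzzy)


def changed_messages(before, after):
    """Count messages translated in both versions whose msgstr differs."""
    changed = 0
    for msgid, (new_str, new_fuzzy) in after.items():
        if msgid not in before:
            continue
        old_str, _old_fuzzy = before[msgid]
        if old_str and new_str and not new_fuzzy and old_str != new_str:
            changed += 1
    return changed


def compare_revisions(before_text, after_text):
    """Return (newly translated words, changed messages) for one commit."""
    before, after = parse_po(before_text), parse_po(after_text)
    # Negative differences are not counted, so clamp them to zero.
    new_words = max(0, translated_words(after) - translated_words(before))
    return new_words, changed_messages(before, after)
```

Calling compare_revisions() with the text of a PO file before and after a
commit gives one pair of numbers per commit; summing those pairs per
author would give two columns like the ones in the sample run.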

Disadvantages
a. 'pocount' sometimes shows fewer messages in the newer PO, so the
difference is negative. Currently we do not count these numbers in.
b. I have not established the significance of the 'podiff' changed
messages.
c. The figures are rough statistics. It should take several revisions
before the statistics are exact.

Features/Advantages
a. The stats for a full GNOME release (gnome-2-30) and a language (el)
take about 20 minutes to complete.
b. You can specify which release to use or even a specific module.
c. I wrote my own 'python-git' class, because 'python-git' was erratic.
Many commit messages are messy: for example, non-UTF8 text, e-mails
written as 'me at gmail dot com', or a 'Merge:' field.
d. Can show text colors.
e. The stats can extend to before April 2009, when GNOME was still
using SVN. Back then the actual translator's name was in the comment,
so with this tool you can correctly attribute the work to the actual
translator.
f. The tool works by creating a temporary branch and then removing
commits from it, so that it can find the different versions of the PO
file during the specified time period. Once the stats have been
calculated, the temporary branch is deleted and the repository switches
back to 'master'.
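
As an illustration of the cleanup mentioned in (c), here is a minimal
Python sketch for normalising a messy author field. It is an assumption
about what such cleanup could look like, not the tool's own code; the
function name normalize_author is hypothetical, and it only handles
undecodable bytes and the 'at'/'dot' spelling of e-mail addresses.

```python
import re


def normalize_author(raw):
    """Return (name, email) from a raw 'Name <address>' author field."""
    if isinstance(raw, bytes):
        # Some commits are not valid UTF-8; replace undecodable bytes
        # instead of failing on them.
        raw = raw.decode("utf-8", errors="replace")
    match = re.match(r"\s*(.*?)\s*<([^>]*)>\s*$", raw)
    if not match:
        # No <address> part at all: keep the text as the name.
        return raw.strip(), ""
    name, address = match.groups()
    # Turn 'me at gmail dot com' into 'me@gmail.com'.
    address = re.sub(r"\s+at\s+", "@", address, flags=re.IGNORECASE)
    address = re.sub(r"\s+dot\s+", ".", address, flags=re.IGNORECASE)
    return name, address.strip().lower()
```

Normalising the address first means that the same translator is not
counted twice when one commit says 'jane at example dot org' and
another says 'jane@example.org'.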

Cheers,
Simos

