Re: Format specifier (re)ordering in Python (was: String change for orca)



Hi Wouter:

In my experience with the Q_ vs. C_ change, for example, translators were sometimes overly aggressive, translating text that the comments explicitly told them not to translate. Hence, I definitely supported the change to C_ because it helped prevent this.

Since translators are more likely to get "%s level %d" right and "%(role)s level %(level)d" wrong, however, I think I'd prefer to reserve named format specifiers for cases where disambiguation is necessary.

But, if the gnome-i18n team makes a decree that thou shalt always use format specifier reordering, perhaps a rule of thumb might be to use cryptic parameter names, like:

_("%(p1)s blah %(p2)d") % {'p1' : 'foo', 'p2' : 3 }

Will

Wouter Bolsterlee wrote:
Hi all,

The "format specifiers should be in a different order" problem is not new,
and not specific to Python. Let's look into this specific case once more:

On 2009-02-11 at 20:58, Willie Walker wrote:
At the request of our Hungarian translator, we've changed a string in Orca. The old string was this:

#. Translators: this is in reference to a heading level
#. in HTML (e.g., For <h3>, the level is 3).
#.
#: ../src/orca/scripts/toolkits/Gecko/speech_generator.py:90
#, python-format
msgid "level %d"

The new string is this:

#. Translators: the %d is in reference to a heading
#. level in HTML (e.g., For <h3>, the level is 3)
#. and the %s is in reference to a previously
#. translated rolename for the heading.  If you
#. change the order of the %s and %d in the string
#. (as needed for Hungarian, for example), Orca will
#. detect it and do the right thing.
#.
#: ../src/orca/scripts/toolkits/Gecko/speech_generator.py:93
#, python-format
msgid "%s level %d"

While this may work in this specific case, this is by no means a generally
acceptable solution to this kind of problem. What happens if both arguments
are strings? Or both are numbers? Or if you had 4 format specifiers? Well,
it would break horribly. So we need something more reliable :)
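
To make the failure modes concrete, here is a minimal sketch. The role name, the level, and the "cell"/"table" strings are made-up values for illustration, not strings taken from Orca:

```python
# Positional arguments are bound by position, so a translator who reorders
# the specifiers changes which value each specifier receives.
args = ("heading", 3)  # hypothetical role name and heading level

original = "%s level %d" % args  # "heading level 3"

# Different types: %d now receives the string, which raises TypeError.
try:
    mixed = "level %d %s" % args
except TypeError:
    mixed = None

# Same types: nothing fails, the values just land in the wrong slots.
# The translator wanted "in table: cell" but the arguments keep their order.
reordered = "in %s: %s" % ("cell", "table")  # "in cell: table"
```

The second case is the nastier one, since the program keeps running and simply presents the wrong text.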

In C, sprintf() supports reordering format specifiers using "%n$f", where n
is the position of the argument (this is a POSIX extension). Python does not
support the "%1$d" syntax in its built-in sprintf-like string formatting
operator, %, but it does support named parameter substitution when a mapping
(e.g. a dictionary instance) is passed to the % operator.

This may sound cryptic, so let's illustrate this with a few examples:

  >>> 'the role is %(role)s and the level is %(level)d' % {'role': 'test', 'level': 3}
  'the role is test and the level is 3'

If translators decide the order has to be changed, this will result in
something like this:

  >>> 'the level is %(level)d and the role is %(role)s' % {'role': 'test', 'level': 3}
  'the level is 3 and the role is test'

This approach works in all cases, even in the case that the two format
specifiers are the same (e.g. both are numbers).
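
As a quick sketch of the "both are numbers" case, with made-up key names and values (index and total are not from Orca):

```python
# Two integer specifiers, looked up by name rather than by position.
values = {"index": 2, "total": 7}  # hypothetical values for illustration

template = "item %(index)d of %(total)d"
reordered = "of %(total)d items, number %(index)d"  # a translator's reordering

first = template % values    # "item 2 of 7"
second = reordered % values  # "of 7 items, number 2"
```

Each value still ends up next to the right words, because the lookup is by key and not by position.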

There is one downside though: translators MUST NOT translate the keywords
("level" and "role" in my case), since those are the keys used to look up the
value in the mapping passed to the % operator. But then, translators can
just as well mess up format specifiers that are used in the traditional,
positional way, e.g. by translating '%.3f' to '%.0f', which will result in
the fraction being removed upon display. So... a short translator notice
should be fine to avoid this problem.
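
For what it's worth, a translated keyword fails loudly (KeyError) rather than silently, and it is easy to check for. A rough sketch follows; mapping_keys and keys_preserved are hypothetical helper names, not part of gettext or Orca, and the "bad" translation is invented for the example:

```python
import re

def mapping_keys(template):
    """Return the set of %(name) keys used in a %-style template."""
    # Drop literal "%%" first so an escaped percent sign is not read as a key.
    return set(re.findall(r"%\(([^)]*)\)", template.replace("%%", "")))

def keys_preserved(msgid, msgstr):
    """True if the translation uses exactly the same keys as the original."""
    return mapping_keys(msgid) == mapping_keys(msgstr)

msgid = "%(role)s level %(level)d"
good = "level %(level)d %(role)s"  # reordered, keys untouched
bad = "%(rol)s nivel %(nivel)d"    # keys were translated by mistake

values = {"role": "heading", "level": 3}
try:
    bad % values
    bad_raises = False
except KeyError:
    bad_raises = True
```

Here keys_preserved(msgid, good) is true and keys_preserved(msgid, bad) is false, so a check like this could flag the problem before the string ever reaches a user.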

Hope this helps others as well.

    — Wouter


