Format specifier (re)ordering in Python (was: String change for orca)



Hi all,

The "format specifiers should be in a different order" problem is not new,
and not specific to Python. Let's look into this specific case once more:

On 2009-02-11 at 20:58, Willie Walker wrote:
> At the request of our Hungarian translator, we've changed a string in  
> Orca.  The old string was this:
>
> #. Translators: this is in reference to a heading level
> #. in HTML (e.g., For <h3>, the level is 3).
> #.
> #: ../src/orca/scripts/toolkits/Gecko/speech_generator.py:90
> #, python-format
> msgid "level %d"
>
> The new string is this:
>
> #. Translators: the %d is in reference to a heading
> #. level in HTML (e.g., For <h3>, the level is 3)
> #. and the %s is in reference to a previously
> #. translated rolename for the heading.  If you
> #. change the order of the %s and %d in the string
> #. (as needed for Hungarian, for example), Orca will
> #. detect it and do the right thing.
> #.
> #: ../src/orca/scripts/toolkits/Gecko/speech_generator.py:93
> #, python-format
> msgid "%s level %d"

While this may work in this specific case, this is by no means a generally
acceptable solution to this kind of problem. What happens if both arguments
are strings? Or both are numbers? Or if you had 4 format specifiers? Well,
it would break horribly. So we need something more reliable :)
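
To make the failure modes concrete, here is a minimal sketch. It assumes
plain positional formatting with no reorder detection on the application
side; the reordered strings are made-up stand-ins for a translation, not
real Orca strings.

```python
# Positional formatting with a hypothetically reordered translation.
msgid = "%s level %d"
msgstr = "level %d %s"          # hypothetical translation, order swapped
args = ("heading", 3)

original = msgid % args         # works: arguments match the specifiers

# Different types: the swapped order fails loudly, because %d gets a string.
try:
    msgstr % args
    reorder_failed = False
except TypeError:
    reorder_failed = True

# Same types: nothing fails, the values just end up in the wrong slots.
# The hypothetical translator wanted the folder first, then the file.
msgstr2 = "in %s: %s"
silent = msgstr2 % ("file", "folder")

print(original)
print(reorder_failed)
print(silent)
```

The second case is the nasty one: with two specifiers of the same type
there is no error to catch, and nothing in the types that code could use
to guess the intended order.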

In C, printf()-style functions on POSIX systems support reordering
arguments using "%n$" (e.g. "%1$d"), where n is the 1-based position of the
argument. Python does not support the "%1$d" syntax in its built-in
sprintf-like string formatting operator, %, but it does support named
parameter substitution when a mapping (e.g. a dictionary instance) is passed
to the % operator.

This may sound cryptic, so let's illustrate this with a few examples:

  >>> 'the role is %(role)s and the level is %(level)d' % {'role': 'test', 'level': 3}
  'the role is test and the level is 3'

If translators decide the order has to be changed, this will result in
something like this:

  >>> 'the level is %(level)d and the role is %(role)s' % {'role': 'test', 'level': 3}
  'the level is 3 and the role is test'

This approach works in all cases, even when the two format specifiers have
the same type (e.g. both are numbers).
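
As an illustration of the same-type case, here is a small sketch with two
%d specifiers. The message text and the key names "done" and "total" are
made up for this example.

```python
# Both values are integers, so only the names can tell them apart.
values = {"done": 3, "total": 10}

msgid = "%(done)d of %(total)d files copied"
# Hypothetical translation that needs the total first.
msgstr = "%(total)d files in total, %(done)d copied"

original = msgid % values
reordered = msgstr % values

print(original)
print(reordered)
```

The calling code is identical for both strings; only the translated
string decides where each value goes.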

There is one downside though: translators MUST NOT translate the keywords
("level" and "role" in my case), since those are the keys used to look up
the value in the mapping passed to the % operator. But then, translators
can just as well mess up format specifiers that are used in the
traditional, positional way, e.g. by translating '%.3f' to '%.0f', which
will result in the fraction being removed upon display. So... a short
translator notice should be fine to avoid this problem.
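
Besides a translator notice, a translated keyword is also easy to catch
mechanically. The following is only a rough sketch: named_keys() and
same_keys() are hypothetical helpers written for this mail, not part of
gettext or Orca, and the regular expression only looks at the "%(name)"
part of each specifier.

```python
import re

# Matches the key in "%(key)..." specifiers.
_KEY_RE = re.compile(r"%\((\w+)\)")


def named_keys(fmt):
    """Return the set of mapping keys used in a %-style format string."""
    # Drop escaped percent signs first so "%%(x)" is not seen as a key.
    return set(_KEY_RE.findall(fmt.replace("%%", "")))


def same_keys(msgid, msgstr):
    """True if the translation uses exactly the keys of the original."""
    return named_keys(msgid) == named_keys(msgstr)


msgid = "%(role)s level %(level)d"
good = "level %(level)d, %(role)s"      # reordered, keys untouched
bad = "%(role)s niveau %(niveau)d"      # keyword was translated

print(same_keys(msgid, good))
print(same_keys(msgid, bad))

# Without such a check, the translated keyword fails at runtime.
try:
    bad % {"role": "test", "level": 3}
    key_error = False
except KeyError:
    key_error = True
print(key_error)
```

A check like this could run over a message catalog before release, so a
translated keyword shows up as a report instead of a KeyError for users.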

Hope this helps others as well.

    — Wouter
