Translation in Qt5 (placeholders)

Thu Jul 21 06:35:09 BST 2011

>>> [: Oswald Buddenhagen :]
>>> even more wrong when you cut characters while you have a pixel width and
>>> a proportional font. you wouldn't believe how hard it is to convince
>>> certain nokians of this rather obvious reality ... :} anyway, more valid
>>> use cases [for in-string formatting] are:
>>>   - choosing a less/more verbose date format
>>>   - using less padding, thus trading truncation for risk of misalignment
>>>   - choosing a digit style depending on context (think arabian 6)
>
> [: Chusslove Illich :]
> As you would say, I am not convinced :)
> [...]
> But fine. I could agree to supporting :* formatting extension, provided
> that its use is clearly discouraged in normal circumstances. [...]

Staying with the desktop, I figured that Gnome translations, with their
printf-style strings, would be a good corpus to check what translators
actually do with formatting directives. So I ran some statistics on Gnome
3.0, and here are the results.

2,141,000 translated messages (nominally from 173 languages) had 265,000
formatting directives in them. Out of those, 151 formatting directives were
modified in translation. That is about 0.06%, or roughly 1 in 1700. The
interesting part is the breakdown of those modified directives.
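
How the messages were mined is not described in this message. As a rough sketch of one way to flag a translation whose directives differ from the original, the following compares the directives found in msgid and msgstr, ignoring order and positional prefixes such as 2$. The function names and the regular expression are assumptions made for this illustration, and the pattern covers only the common printf directives.

```cpp
#include <algorithm>
#include <cassert>
#include <regex>
#include <string>
#include <vector>

// Collects the printf-style directives in a message, sorted, with any
// positional prefix ("2$") removed so that legitimate reordering is ignored.
// Illustrative only: the pattern handles common flags, width, precision,
// length modifiers and conversions, not every libc extension.
std::vector<std::string> directives(const std::string &text)
{
    static const std::regex re(
        R"re(%(\d+\$)?([-+ #0']*(?:\d+|\*)?(?:\.(?:\d+|\*)?)?(?:hh|h|ll|l|L|q|j|z|t)?[diouxXeEfFgGaAcspn%]))re");
    std::vector<std::string> found;
    for (std::sregex_iterator it(text.begin(), text.end(), re), end; it != end; ++it) {
        const std::smatch &m = *it;
        assert(m.size() == 3); // whole match plus the two capture groups
        const std::string body = m[2].str();
        if (body == "%")
            continue; // "%%" is a literal percent sign, not a directive
        found.push_back("%" + body);
    }
    std::sort(found.begin(), found.end());
    return found;
}

// True when the translation uses exactly the same directives as the original.
bool sameDirectives(const std::string &msgid, const std::string &msgstr)
{
    return directives(msgid) == directives(msgstr);
}
```

With this check, "Foo %s bar %s" against "Bar %2$s fu %1$s" counts as unmodified, while "%.1f" against "%1.f" or "%.2f" is flagged.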

34 modifications were reordering errors. Reordering is when the original
message is "Foo %s bar %s" and the translation needs "Bar %2$s fu %1$s"
(the typical error is a missing $, e.g. "Bar %2s fu %1s"). 94 more
modifications were errors of other kinds (e.g. "%.1f" modified to "%1.f").
That adds up to 128 errors; out of those, 31 lead to loss of information or
comprehensibility.
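
To make the reordering error concrete, here is a small sketch of what the three strings above do at run time. It assumes a POSIX C library such as glibc, since %n$ positional arguments are not part of ISO C or C++; format2 is a name made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Formats two string arguments with a format chosen at run time, the way a
// translated msgstr would be used. Assumes POSIX-style %n$ support.
std::string format2(const char *fmt, const char *a, const char *b)
{
    char buf[128];
    const int n = std::snprintf(buf, sizeof buf, fmt, a, b);
    assert(n >= 0 && static_cast<std::size_t>(n) < sizeof buf); // no error, no truncation
    return buf;
}
```

Called with ("A", "B"), the correct translation "Bar %2$s fu %1$s" swaps the arguments, whereas in "Bar %2s fu %1s" the numbers are read as minimum field widths: the arguments stay in their original order and the first one is padded to two characters.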

2 modifications were due to abuse of formatting directives, in this message:

  gnome-3-0-sv-mod/gnome-user-share.po:237(#45)
  #. Translators: The %s will get filled in with the user name
  #. of the user, to form a genitive. If this is difficult to
  #. translate correctly so that it will work correctly in your
  #. language, you may use something equivalent to
  #. "Public files of %s", or leave out the %s altogether.
  #. In the latter case, please put "%.0s" somewhere in the string,
  #. which will match the user name string passed by the C code,
  #. but not put the user name in the final string. This is to
  #. avoid the warning that msgfmt might otherwise generate.
  #: ../src/http.c:134
  #, c-format
  msgid "%s's public files"
  msgstr "%.0sPublika filer"
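
The trick the comment asks for works because a precision of zero on %s consumes the argument but prints none of its characters. A minimal sketch follows; shareTitle is a hypothetical stand-in for the C code at src/http.c, which is not shown in this message.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical stand-in for the calling code: it always passes the user
// name, whether or not the translated format actually prints it.
std::string shareTitle(const char *translatedFormat, const char *userName)
{
    char buf[128];
    const int n = std::snprintf(buf, sizeof buf, translatedFormat, userName);
    assert(n >= 0 && static_cast<std::size_t>(n) < sizeof buf); // no error, no truncation
    return buf;
}
```

With the Swedish msgstr the user name is matched, so msgfmt does not complain, but it never reaches the final string.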

18 modifications could in theory be counted as intentional, but were more
likely typos or unfuzzying glitches, since there is, in my judgement, no
justification for them (e.g. "%.1f mmHg" changed to "%.2f мм рт. ст." --
do Russian users really expect an atmospheric pressure readout of 756.23
while others are fine with 756.2? Actually, everyone is fine with 756).

That leaves exactly 3 out of 151 modifications which were likely intentional
rather than erroneous:

  gnome-3-0-de-mod/gnome-power-manager.po:776(#129)
  #. Translators: This is %2i minutes %02i seconds
  #: ../src/gpm-graph-widget.c:449
  #, c-format
  msgid "%2im%02i"
  msgstr "%02i min %02i"

  gnome-3-0-eo-mod/gnome-power-manager.po:687(#129)
  #. Translators: This is %2i minutes %02i seconds
  #: ../src/gpm-graph-widget.c:449
  #, c-format
  msgid "%2im%02i"
  msgstr "%02i min %02i"

  gnome-3-0-et-mod/gnome-power-manager.po:687(#125)
  #. Translators: This is %i days %02i hours
  #: ../src/gpm-graph-widget.c:433
  #, c-format
  msgid "%id%02ih"
  msgstr "%ip%ih"

All of them are time formats. (In current KDE code there are somewhat more
than 3 time/date messages, and yet no one has ever called for adding
formatting functionality to placeholders to handle that.)

To recap: in Gnome 3.0 translations, out of 265,000 formatting directives in
total, there were 3 intentionally modified, 128 erroneously modified, and 20
unnecessarily modified. (The list of all mined messages with modified
formatting directives in translation is attached for reference.)

Here's another interesting result:

These same numbers in Gnome 2.30 were: out of 242,000 formatting directives
in total, there were 2 intentionally modified, 67 erroneously modified, and
27 unnecessarily modified. The jump in erroneous messages from 2.30 to 3.0
comes primarily from one gnome-power-manager.po:

  gnome-3-0-el-mod/gnome-power-manager.po:965(#166)
  #. TRANSLATORS: tell the user how much time they have got
  #: ../src/gpm-manager.c:1505
  #, c-format
  msgid "%s of battery power remaining (%.0f%%)"
  msgstr "Απομένουν %s λειτουργίας της μπαταρίας (%.1f%%)"

  gnome-3-0-eu-mod/gnome-power-manager.po:999(#177)
  #. TRANSLATORS: tell user more details
  #: ../src/gpm-manager.c:1644
  #, c-format
  msgid "Wireless keyboard is low in power (%.0f%%)"
  msgstr "Haririk gabeko teklatuak energia baxua du (%% %.1f)"

  gnome-3-0-lv-mod/gnome-power-manager.po:985(#175)
  #. TRANSLATORS: tell user more details
  #: ../src/gpm-manager.c:1637
  #, c-format
  msgid "Wireless mouse is low in power (%.0f%%)"
  msgstr "Bezvadu peles baterija ir gandrīz tukša (%.1f%%)"

and so on, ~60 in total. What happened here is that the programmer decided
to switch from %.1f to %.0f for displaying battery charges, and translators
sometimes failed to notice that when unfuzzying. Had the code been something
like:

  const char *fmt = "%.1f";
  ki18n("Wireless mouse is low in power ({charge}%)").subs(mouseCharge, fmt)
  ki18n("Wireless keyboard is low in power ({charge}%)").subs(kbdCharge, fmt)
  ...

the switch would have been invisible to translators -- no fuzzies, no
errors -- *and* quicker for the programmer to make.
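
As a rough, self-contained illustration of that separation (not the actual ki18n API; the substitute helper and its signature are invented for this sketch), the number format can live in code while the translatable text carries only a named placeholder:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper: formats `value` with a printf-style format owned by
// the code, then replaces the first "{name}" placeholder in the message.
std::string substitute(std::string message, const std::string &name,
                       double value, const char *fmt)
{
    char buf[64];
    const int n = std::snprintf(buf, sizeof buf, fmt, value);
    assert(n >= 0 && static_cast<std::size_t>(n) < sizeof buf); // no error, no truncation
    const std::string placeholder = "{" + name + "}";
    const std::string::size_type pos = message.find(placeholder);
    if (pos != std::string::npos)
        message.replace(pos, placeholder.size(), buf);
    return message;
}
```

Changing the format from "%.1f" to "%.0f" then touches a single line of code, and the message text, translated or not, stays byte-for-byte the same.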

-- 
Chusslove Illich (Часлав Илић)
-------------- next part --------------
A non-text attachment was scrubbed...
Name: fmtdirs-gnome-3-0.txt.gz
Type: application/x-gzip
Size: 10366 bytes
Desc: not available
URL: <http://mail.kde.org/pipermail/kde-core-devel/attachments/20110721/bb630ce4/attachment.bin>