Translation in Qt5

Tue Jul 5 16:09:38 BST 2011

> [: Oswald Buddenhagen :]
> right, now i remember that we talked about that. auto-escaping is the only
> reasonable option. suppression should be done on a per-placeholder basis
> (which brings us to rich formats again). though i wonder how often you'd
> actually want to substitute a pre-quoted string? i'd expect this to happen
> in cases where regular concatenation is actually an option. counter-
> examples?

Me confused (but see below).

> one could also use a different syntax for the markup, think gmake macros:
>     "foo $(filename %1)"
> or maybe qmake?
>     "foo $$filename(%1)"
> of course either would require some escaping stunts, etc.

I thought about a markup less verbose than XML, but I would rather not risk
it. XML has a defined syntax for attributes and escapes, it is well known to
both programmers and translators, parsers are available on every corner (e.g.
when writing translation checker tools), there is syntax highlighting, etc.

> if one defines that KUIT is a superset of qt rich text, there is not even
> a theoretical problem with it.

But KUIT is supposed to be transformable into different target formats, of
which Qt rich text is one? (The other currently is plain text, and I've
reserved "a slot" for shell color sequences.)

> what i'd expect now would be an actual use case for this stuff. you
> indicated yourself that this is no high priority for you. why? did it turn
> out less useful than anticipated? or too problematic (despite the above
> answers)?

The use case was simply that some people got fed up with thinking every time
whether to wrap, say, a file name with single quotes, double quotes, <i>,
<b>, etc. So they wanted a system by which they can just say "this is a
filename" and then it gets formatted according to that and to the output
destination. Really, the standard argument for semantic markup. At first,
the idea was to make such indications from the outside (e.g. a .subs()
method overloaded on QFile, QUrl, etc.), but that quickly spiraled out of
control (you can check the thread "KDE 4 proposal: Paths in i18n strings",
July 2006). As an alternative, a bit later, I came up with semantic markup
directly in text. This had the benefit of giving extra information to
translators, and enabling translators to use the markup themselves.
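
To make the idea concrete, here is a minimal Python sketch of one semantic tag
being resolved differently per output destination. It is not the KUIT
implementation: the format table, the chosen visual forms, and the render()
helper are all assumptions made up for illustration.

```python
import re

# Hypothetical visual forms of the <filename> tag for each destination.
FILENAME_FORMATS = {
    "plain": "'{}'",
    "rich": "<tt>{}</tt>",
}

def render(text, target):
    """Replace <filename>...</filename> with the target's visual form."""
    pattern = FILENAME_FORMATS[target]
    return re.sub(r"<filename>(.*?)</filename>",
                  lambda m: pattern.format(m.group(1)), text)

message = "<filename>notes.txt</filename> got deleted."
plain = render(message, "plain")  # quoted file name
rich = render(message, "rich")    # monospace file name
```

The programmer (or translator) only states "this is a filename"; quoting or
font choices live in one table per destination.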

Three problems cropped up.

First, some people really didn't like that semantic markup was forced onto
everyone, without the possibility to disable it. This was mostly due to the
need to care about escaping. So they wanted it the other way around, enabled
only by those who want it -- but this is not easy to provide (there is a
wider fundamental problem in the background).

The second problem is escaping and substitution. I tried briefly with
auto-escaping, got regression reports in about a month, and had to disable
it. But substitution also has issues related to how the output format is
chosen:

  QString problem = i18nc("@item:intext", "<filename>%1</filename> got deleted.", filename);
      // ...this formats as plain text because that is the default for @item:intext...
  i18nc("@info", "A problem happened: %1", problem);
      // ...while this is rich text because that is the default for @info,
      // but due to the above the filename still ends up with plain text format.
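
The same effect can be reproduced in a few lines of Python. This is only a
sketch under stated assumptions: i18nc_sketch() is a made-up stand-in that
resolves markup immediately, and the two tables are hypothetical defaults,
not the real KDE ones.

```python
import re

# Hypothetical defaults: the output format each context marker selects.
DEFAULT_FORMAT = {"@item:intext": "plain", "@info": "rich"}
# Hypothetical visual forms of <filename> in each format.
FILENAME_FORMATS = {"plain": "'{}'", "rich": "<tt>{}</tt>"}

def i18nc_sketch(context, text, *args):
    """Substitute %1, %2, ... and resolve <filename> markup right away."""
    for index, arg in enumerate(args, start=1):
        text = text.replace(f"%{index}", arg)
    pattern = FILENAME_FORMATS[DEFAULT_FORMAT[context]]
    return re.sub(r"<filename>(.*?)</filename>",
                  lambda m: pattern.format(m.group(1)), text)

# The inner message is resolved as plain text first...
problem = i18nc_sketch("@item:intext",
                       "<filename>%1</filename> got deleted.", "notes.txt")
# ...so the outer rich-text message no longer sees any markup to resolve.
report = i18nc_sketch("@info", "A problem happened: %1", problem)
```

By the time the outer call runs, the inner string has already lost its tag, so
the file name keeps its plain-text quoting inside what is otherwise a rich-text
message.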

The third problem I was aware of from the beginning, but I thought, let's
give it a try. When you have fixed semantic markup, there is the problem of
choosing the set of tags. If it is a small set, then people will miss special
tags for the special things in their applications. If it is a large set,
people will still miss some, and worse, they will be stuck deciding what to
use from that large set (this is sapping the life out of DocBook, for
example). So I went with a small set, and even I am now not very happy with
the choices I made.

In summary, whether due to these problems or simply because not many people
care about it, semantic markup didn't really take off. So I hoped it would
not be a big deal to drop it. (Note that this does not include @-context
markers; they have no problems of their own and are much more used than
markup.)

But then there comes along an i18n bug like this:
https://bugs.kde.org/show_bug.cgi?id=267439 . In the code, it looks like:

    importantHighlight(i18n("Uses of") + " ") + nodes[1]
  + importantHighlight(" " + i18n("from") + " ") + nodes[0] + "<hr>";

Here the programmer did some semantic markup on his own, not thinking of
i18n rules. But if he had thought of i18n rules, what could he have done?
Nothing -- except drop the markup. (This markup resolves into colors, and
there are other *Highlight() wrapper functions, all neatly defined in a
standalone file: typeHighlight, propertyHighlight, commentHighlight...)
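
For contrast, a rough Python sketch of what in-text markup would make possible
here: the whole sentence stays one translatable string, and the highlight is
resolved per destination afterwards. The <important> tag, the format table, and
format_message() are hypothetical, as are the argument values; this is not the
code from the bug report nor an existing API.

```python
import re

# Hypothetical renderings of one made-up <important> tag per destination.
IMPORTANT_FORMATS = {
    "plain": "{}",
    "html": "<b>{}</b>",
    "shell": "\033[1m{}\033[0m",
}

def format_message(template, args, target):
    """Resolve <important> for the target, then fill in %1, %2, ..."""
    pattern = IMPORTANT_FORMATS[target]
    rendered = re.sub(r"<important>(.*?)</important>",
                      lambda m: pattern.format(m.group(1)), template)
    # Arguments are inserted as-is; escaping is left out of this sketch.
    for index, arg in enumerate(args, start=1):
        rendered = rendered.replace(f"%{index}", arg)
    return rendered

# The whole sentence is a single translatable unit...
message = "<important>Uses of</important> %1 <important>from</important> %2"
# ...so a translation may reorder it (an English stand-in is used here).
reordered = "<important>In</important> %2<important>, uses of</important> %1"

args = ("foo()", "bar.cpp")  # stand-ins for nodes[1] and nodes[0]
html_out = format_message(message, args, "html")
plain_reordered = format_message(reordered, args, "plain")
```

The translator sees one complete sentence with placeholders, rather than the
fragments "Uses of" and "from" glued together in a fixed order.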

I too was in a similar position as this guy, in some Python code I maintain;
it is shell-only, but needs output coloring dependent on the destination
(shell sequences, HTML, none). Therefore I badly needed some sort of markup,
and having in mind the problems listed above, I thought about (and partly
implemented) an uprated markup system. Erm, in the next message...

-- 
Chusslove Illich (Часлав Илић)