Review process for translations

Karl Ove Hufthammer karl at huftis.org
Sun Apr 24 11:31:33 BST 2022


Frederik Schwarzer wrote on 23.04.2022 22:20:
> My question is: do other teams have some sort of review process and
> how does it look like?

Typically, we do the following:

  * I receive PO files by e-mail *or* the translator commits them
    directly to SVN.
  * If I receive the PO file by e-mail, I take a quick look at it to
    eliminate any obvious mistakes (e.g., syntax errors, found using
    ‘msgfmt -c’) and commit the file to SVN. We use the summit
    workflow, and I commit to the summit folder, so the translations
    won’t yet be automatically included in future releases.
  * I use ‘poediff’ to examine the changes and fix any issues I find. I
    commit the result to SVN, as one or more revisions.
  * I tell the translator by e-mail or IM about any general issues, and
    about how to see the changes I made. Sometimes I can supply a link
    to websvn.kde.org, but it’s usually best to use poediff and its word
    diff feature. That also eliminates any word wrapping issues. Here’s
    an example command:
    poediff -c svn -sRr 1519114:1519115 | less -FXR

We also make extensive use of the Pology validation feature 
(http://pology.nedohodnik.net//doc/user/en_US/ch-lingo.html#sec-lgrules) 
to catch many common mistakes (in terminology, word choice and grammar). 
I run ‘posieve check-rules’ whenever I have made changes to a PO 
file. It usually finds some mistakes (or false positives, which I mark 
as exceptions).

For checking the spelling, I’ve found a nice solution. The spell checker 
for our language is very incomplete, especially for compound words, so 
it typically finds *many* false positives (though few false negatives). 
But most spelling mistakes aren’t made because the person doesn’t know 
*how* to spell a word; they are *typing* mistakes. And if you type a 
word incorrectly, you will probably do it only once, not every time you 
type the word.

So using a simple shell script, I make a list of *every* word in *every* 
PO file and sort it by frequency. I then take the list of potential 
spelling mistakes (generated by ‘posieve check-spell-ec --skip-obsolete 
-slist’) and compare it with the frequency list. I get a file that looks 
like this:

speling
worrd
wrongg 1
aubergine 1
balancing 2
…
browsers 9
…
file 106

The first two words (that don’t have a number) appear in this PO file 
but not in any other PO files, so they are either spelling mistakes or 
terminology that’s very specific to this PO file. They are also *new* 
words; i.e., they were not in the PO file when the frequency list was 
generated (I only update the frequency list every month or so). The 
words with the number 1 appear *once* in the collection of PO files, so 
they are also suspected spelling mistakes. The words with the number 2 
appear twice (they are quite rare words *or* spelling mistakes), and so 
on. As one goes further down the list, it becomes less and less probable 
that the word is a spelling mistake.
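
The comparison step described above can be sketched roughly as follows. This is a minimal illustration in Python, not the shell script mentioned in this message (which isn't shown here). It assumes the translated strings have already been extracted from the PO files and that the list of suspected words is available as plain strings; the function names and the tokenizer are made up for the example.

```python
import re
from collections import Counter

# Hypothetical tokenizer: runs of letters (any script), optionally joined
# by an apostrophe or hyphen. The real script may split words differently.
WORD_RE = re.compile(r"[^\W\d_]+(?:['’-][^\W\d_]+)*")


def word_frequencies(texts):
    """Count how often each (lowercased) word occurs in the given strings."""
    counts = Counter()
    for text in texts:
        counts.update(word.lower() for word in WORD_RE.findall(text))
    return counts


def rank_suspects(suspects, frequencies):
    """Return (word, count) pairs, rarest first.

    Words missing from the frequency list get count 0 and sort to the top.
    Ties are broken alphabetically (sorted() is stable).
    """
    unique = sorted(set(suspects))
    ranked = sorted(unique, key=lambda w: frequencies.get(w.lower(), 0))
    return [(w, frequencies.get(w.lower(), 0)) for w in ranked]


def format_report(ranked):
    """Format like the listing above: bare word if unseen, else 'word count'."""
    return "\n".join(
        word if count == 0 else f"{word} {count}" for word, count in ranked
    )


if __name__ == "__main__":
    # Toy data standing in for the text of all PO files.
    corpus = ["Open the file in the browsers", "Save the file", "wrongg file"]
    freq = word_frequencies(corpus)
    # Toy stand-in for the spell checker's list of suspected words.
    suspects = ["file", "speling", "wrongg", "browsers"]
    print(format_report(rank_suspects(suspects, freq)))
```

With the toy data, ‘speling’ comes first without a number, followed by the words seen once, and ‘file’ comes last with its count of 3.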

Of course, I package all this into a script, so I only have to type 
‘spell file.po’ or ‘spell folder’ to get a sorted list of potential 
spelling mistakes for the file/folder. I try to run it whenever I have 
edited a PO file, and it usually finds a small spelling/typing mistake, 
even when I think I’ve been very careful to spell everything 
correctly. :)


-- 
Karl Ove Hufthammer


