Review process for translations
Karl Ove Hufthammer
karl at huftis.org
Sun Apr 24 11:31:33 BST 2022
Frederik Schwarzer wrote on 23.04.2022 22:20:
> My question is: do other teams have some sort of review process and
> how does it look like?
Typically, we do the following:
* I receive PO files by e-mail *or* the translator commits them
directly to SVN.
* If I receive the PO file by e-mail, I take a quick look at it to
eliminate any obvious mistakes (e.g., syntax errors, found using
‘msgfmt -c’) and commit the file to SVN. We use the summit workflow,
and I commit to the summit folder, so the translations won’t yet be
automatically included in future releases.
* I use ‘poediff’ to examine the changes and fix any issues I find. I
commit the result to SVN, as one or more revisions.
* I tell the translator by e-mail or IM about any general issues, and
about how to see the changes I made. Sometimes I can supply a link
to websvn.kde.org, but it’s usually best to use poediff and its word
diff feature. That also eliminates any word wrapping issues. Here’s
an example command:
    poediff -c svn -sRr 1519114:1519115 | less -FXR
We also make extensive use of the Pology validation feature
(http://pology.nedohodnik.net//doc/user/en_US/ch-lingo.html#sec-lgrules)
to catch many common mistakes (in terminology, word choice and grammar).
I run ‘posieve check-rules’ whenever I have made changes to a PO
file. It usually finds some mistakes (or false positives, which I mark
as exceptions).
For checking the spelling, I’ve found a nice solution. The spell checker
for our language is very incomplete, especially for compound words, so
it typically finds *many* false positives (though few false negatives).
But most spelling mistakes aren’t made because the person doesn’t know
*how* to spell a word; they are *typing* mistakes. And if you type a
word incorrectly, you will probably do it only once, not every time you
type the word.
So using a simple shell script, I make a list of *every* word in *every*
PO file and sort it by frequency. I then take the list of potential
spelling mistakes (generated by ‘posieve check-spell-ec --skip-obsolete
-slist’) and compare it with the frequency list. I get a file that looks
like this:
speling
worrd
wrongg 1
aubergine 1
balancing 2
…
browsers 9
…
file 106
The first two words (that don’t have a number) appear in this PO file
but not in any other PO files, so they are either spelling mistakes or
terminology that’s very specific to this PO file. They are also *new*
words; i.e., they were not in the PO file when the frequency list was
generated (I only update the frequency list every month or so). The
words with the number 1 appear *once* in the collection of PO files, so
they are also suspected spelling mistakes. The words with the number 2
appear twice (they are quite rare words *or* spelling mistakes), and so
on. As one goes further down the list, it becomes less and less probable
that the word is a spelling mistake.
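The comparison described above could be sketched roughly as follows. This is not the actual script (which isn't shown in this message), only a minimal illustration under some assumptions: the function names are made up, the input is treated as plain text files rather than parsed PO files, words are compared case-sensitively, and the list of suspect words is assumed to arrive as one word per line, whatever tool produced it.

```shell
#!/bin/sh
# Illustrative sketch only; all function names here are hypothetical.

# Print every word from the given text files, one per line.
# Assumption: a "word" is a run of letters in the current locale.
words() {
    cat -- "$@" | grep -oE '[[:alpha:]]+'
}

# Build a frequency list with lines of the form "word count".
make_freq_list() {
    words "$@" | sort | uniq -c | awk '{ print $2, $1 }'
}

# Read suspect words (one per line) on stdin and annotate them with the
# counts from the frequency list file given as $1. Output is sorted
# rarest first; words missing from the list come first with no number.
rank_suspects() {
    awk 'NR == FNR { count[$1] = $2; next }
         { print (($1 in count) ? count[$1] : 0), $1 }' "$1" - |
    sort -k1,1n -k2,2 |
    awk '{ if ($1 == 0) print $2; else print $2, $1 }'
}

# Example usage (file names are placeholders):
#   make_freq_list all-translations.txt > freq.txt
#   rank_suspects freq.txt < suspects.txt | less
```

Keeping the frequency list in a separate file matches the idea of regenerating it only occasionally, while the ranking step can be run after every edit. For real PO files one would probably want to extract only the translated text before counting, which this sketch leaves out.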
Of course, I package all this into a script, so I only have to type
‘spell file.po’ or ‘spell folder’ to get a sorted list of potential
spelling mistakes for the file/folder. I try to run it whenever I have
edited a PO file, and it usually finds a small spelling/typing mistake,
even when I think I’ve been very careful in spelling everything
correctly. :)
--
Karl Ove Hufthammer