[Kbabel] [Fwd: [l10n-dev] [Fwd: [Freecats-Dev] The other side: "Commercial" TM tools (cont.)]]

Stanislav Visnovsky visnovsky@nenya.ms.mff.cuni.cz
Fri, 7 Feb 2003 09:32:59 +0100 (CET)


Hi!

On Wed, 5 Feb 2003, Gudmund Areskoug wrote:

> Hi,
> 
> > I see. But does that need to be that complicated? How hard is it to manage 
> > all these databases? You only need to set up reasonable thresholds etc.
> > to identify the particular purpose of the text to be stored in the correct 
> > database?
> 
> Not sure I understand what you mean. It is very doable, if that's 
> what you mean.
> 
> If you meant "why make it that complicated": the point is of course 
> that the three data sets are used in different ways and for 
> different purposes, not that the data has to be stored in different 
> files, which is rather academic - but perhaps easier to handle 
> mentally for the user.

Yes, that was my question.

> 
> The approach has proven very flexible and efficient, with some 
> unexpected results in how the system is actually used, thanks to its 
> toolbox nature.
> 
> DV uses the (proprietary) Dewey system for subject classification. 
> Been looking around for an open source classification system, so far 
> haven't found any I thought were usable enough.
> 
> The thing about the Dewey system is its hierarchical nature, that 
> allows for a priority order, so that in a project that was assigned, 
> say, subject 1234.5678, perfect matches would be selected like this:
> 
> match1	1234.5678 = first choice for suggestion
> match2	1234.567x = second choice for suggestion
> match3	1234.56xx = third choice for suggestion
> 
> ...and so on.
> 
> Something similar, but open source, could easily be set up for KDE, 
> Gnome, almost anything GNU. As long as it stays more or less purely 
> within the software domain, it doesn't have to be too hard to set 
> up. I've started asking at the institute for archive and library 
> science (or the like in Swedish) for some general non-proprietary 
> system to use, and will go on to check with the computer 
> linguistics department and a few others.
> 
> Such a system could be used for keeping track of when and where 
> global terms (e. g. KDE-wide "Yes" -> "Ja", "No" -> "Nej", "File" -> 
> "Arkiv", "Save as..." -> "Spara som...", "Accept" -> "Verkställ" 
> etc.) should be used, and where local terms (e. g. "Accept" -> "Ja, 
> starta").

I see. Something for FreeCATs IMHO.
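
The priority order described above (exact subject first, then progressively
shorter shared prefixes) can be sketched as follows. This is only an
illustration: the function names and the tuple layout are made up here, and
it assumes subject codes are plain strings compared character by character,
which may not be how DV does it.

```python
def subject_rank(project_subject: str, entry_subject: str) -> int:
    """Count the leading characters the two subject codes share.

    A higher value means the entry is closer to the project's subject.
    """
    shared = 0
    for a, b in zip(project_subject, entry_subject):
        if a != b:
            break
        shared += 1
    return shared


def order_by_subject(project_subject, matches):
    """Sort (subject_code, suggestion) tuples, closest subject first.

    The tuple layout is an assumption made for this sketch.
    """
    return sorted(
        matches,
        key=lambda match: subject_rank(project_subject, match[0]),
        reverse=True,
    )


candidates = [
    ("1234.5600", "third choice"),
    ("1234.5678", "first choice"),
    ("1234.5670", "second choice"),
]
for code, suggestion in order_by_subject("1234.5678", candidates):
    print(code, suggestion)
```

Any classification scheme with hierarchical codes would work the same way, so
an open replacement for Dewey would not change the matching logic.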

> 
> This could probably be extended to prevent shortcut and menu 
> conflicts and the like.
> 
> I started setting up a dummy for KDE and Gnome terminology 
> hierarchy, along with a shortcut/fastkey list, but "All work and no 
> play"...

Seems interesting. I'd be glad to see your results so far!

> Yes, it has to be whole snippets.
> 
> >>Add a terminology and figures check feature, and you've got the 
> >>overview, although lots more could be said.
> > 
> > What is a terminology check? We do not work with figures, so this one is 
> > out of the question.
> 
> You have the program go through the file you're working on or the 
> whole project, to see if there's any row (or string pair) where the 
> target terms that the database gives for the source terms aren't 
> present. IMHO, it should be complemented by a reverse 
> check, to see if the same target term has been used for different 
> source terms.

But it's hard to determine which term corresponds to which one. Or
do you set it up using one of the databases you've mentioned?
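
If the term pairs do come from a terminology database, the forward check
quoted above could look roughly like this. It is a minimal sketch, not DV's
implementation: the names are hypothetical, the term list is a plain dict,
and matching is a case-insensitive substring test, which ignores inflection
and word boundaries. The second example pair is invented for the sketch.

```python
def terminology_check(pairs, termbase):
    """Report string pairs whose target lacks the expected target term.

    pairs    -- list of (source_string, target_string)
    termbase -- dict mapping source term -> expected target term
                (a stand-in for the terminology database)
    Returns a list of (row_index, source_term, expected_target_term).
    """
    problems = []
    for index, (source, target) in enumerate(pairs):
        for src_term, tgt_term in termbase.items():
            term_in_source = src_term.lower() in source.lower()
            term_in_target = tgt_term.lower() in target.lower()
            if term_in_source and not term_in_target:
                problems.append((index, src_term, tgt_term))
    return problems


pairs = [
    ("Save the File", "Spara arkivet"),
    ("Open File", "Öppna filen"),
]
print(terminology_check(pairs, {"File": "Arkiv"}))
```

The reverse check (one target term used for several source terms) is not
covered here.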

> 
> The figures checking finds pairs where figures in the target string 
> are different from the ones in the source string.

Figures = pictures? Something in the spirit of the argument check in KBabel?
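
Reading "figures" as numerals rather than pictures (an assumption; the
description of comparing figures in the source and target strings suggests
it), the check could be sketched like this. The regular expression and the
function name are made up for illustration.

```python
import re
from collections import Counter

# Runs of digits, optionally grouped by "." or "," (e.g. 10, 1.5, 1,000).
NUMBER = re.compile(r"\d+(?:[.,]\d+)*")


def figures_differ(source: str, target: str) -> bool:
    """Return True when source and target do not contain the same numerals.

    Numerals are compared as text, with multiplicity but without order.
    """
    return Counter(NUMBER.findall(source)) != Counter(NUMBER.findall(target))


print(figures_differ("Copy 3 files", "Kopiera 3 filer"))
print(figures_differ("Page 2 of 10", "Sida 2 av 100"))
```

Because numerals are compared as text, a legitimate locale change such as
"1.5" becoming "1,5" would be flagged too; a real check would need to allow
for that.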

> > Thank you for the information. Still, I'm missing the big picture of how these 
> > tools really work.
> 
> Download and try it, if you have access to a Windoze machine?

I have, but I can't find access to an evaluation version or something 
:-(

> 
> Haven't used Trados, DV's biggest competitor (M$ owns a large part 
> of Trados, it's completely dependent on M$ Word...), but this is what 
> the workflow can look like in DV (cutting out some to keep it short):
> 
> - You set up a project, assigning it languages (you typically work 
> with one pair at a time, as a single user), the databases that 
> should be used for this project, where the translatable files are 
> found and where to export the finished result, subject and client 
> settings etc.
> 
> - You (batch) import the translatable files into the project. You 
> can either work on all files at once, or one at a time.

Dwayne Bailey asked for this some time ago.

> 
> - You (optionally, if you like to set a terminology beforehand) 
> build or import a lexicon and resolve it against the database(s).
> 
> - You (optionally) run a pretranslate, which is configurable (use 
> newer matches over older, only allow perfect matches, etc. etc.).
> 
> - You go through the rows (translatable segments) and translate, 
> optionally with DV autoassembling and inserting stuff as 
> suggestions, optionally with DV showing any matches it finds in any 
> of the DBs along with all additional info on the side.
> 
> - As soon as you leave a row, it is saved. Good for the evercrashing 
>   Windows environment.

:-)
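
The configurable pretranslate step mentioned above ("use newer matches over
older, only allow perfect matches") might be sketched as below. Everything
here is an assumption for illustration: the memory is a list of tuples, the
similarity score comes from Python's difflib rather than whatever DV uses,
and `min_score` stands in for a fuzziness setting.

```python
from difflib import SequenceMatcher


def pretranslate(segment, memory, min_score=1.0):
    """Pick a stored translation for a segment, or None if nothing qualifies.

    memory    -- list of (source, target, saved_at); saved_at is any
                 sortable stamp (hypothetical layout)
    min_score -- 1.0 allows perfect matches only; lower values allow fuzzy ones
    Higher scores win; among equal scores the newer entry wins.
    """
    best_key = None
    best_target = None
    for source, target, saved_at in memory:
        score = SequenceMatcher(None, segment, source).ratio()
        if score < min_score:
            continue
        key = (score, saved_at)
        if best_key is None or key > best_key:
            best_key, best_target = key, target
    return best_target


memory = [
    ("Save as...", "Spara som", 2001),
    ("Save as...", "Spara som...", 2002),
    ("Save all", "Spara alla", 2003),
]
print(pretranslate("Save as...", memory))
print(pretranslate("Save", memory))
```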

> 
> - You can set different status for rows, like pending, finished, 
> locked, don't send to database etc. These and other things can be 
> used as filtering criteria to produce different project views to work in.
> 
> On top of this, there are some additional general settings like 
> fuzziness level, sentence delimiters, etc.
> 
> - You can run QA checks on them, like the terminology check I 
> mentioned.
> 
> - When a file/all files are ready, you export the finished translation.
> 
> Things can be filtered and exported for distributing work, for 
> sending only segments you're unsure of for QA etc.
> 
> All DB searches are fairly quick.
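
The row statuses and filtered project views described in the workflow above
could be modelled along these lines. The class and field names are invented
for this sketch; the mail only names the statuses themselves.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    PENDING = "pending"
    FINISHED = "finished"
    LOCKED = "locked"


@dataclass
class Row:
    source: str
    target: str
    status: Status
    send_to_database: bool = True  # False models "don't send to database"


def project_view(rows, statuses):
    """Return only the rows whose status is in the given collection."""
    return [row for row in rows if row.status in statuses]


rows = [
    Row("Yes", "Ja", Status.FINISHED),
    Row("Accept", "", Status.PENDING),
    Row("File", "Arkiv", Status.LOCKED, send_to_database=False),
]
for row in project_view(rows, {Status.PENDING}):
    print(row.source)
```

The same filter could drive the export of only the unsure segments for QA.
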
> 
> > Looks like a cultural mismatch. :-(
> 
> Yes, definitely. When I first stumbled onto the KDE i18n list with 
> CAT suggestions, it wasn't very popular. Going in the other 
> direction to get Windozed people to understand the benefits of 
> things free (GNU, KDE, ...) is at least as difficult.
> 
> Most 'dozers aren't as technically oriented, whereas Linux people 
> mostly are, and often have their own pet solutions to things :).
> 
> And I'm trying to help bridge the gap.

I would be happy to help, but I need to get a feel for what users need.

Thanks!

Stanislav