[KDE-Sonnet] [Mountain Goat Programmer] New comment on Queen and Country.

Jacob Rideout jacob at jacobrideout.net
Fri Jan 19 04:38:41 CET 2007


Henrique,

I'm CCing this to the kde-sonnet list. Please direct any reply to the list.

>  The grammar is pretty much the same for both languages, but vocabulary is
> much different, radically different in some areas. As a native pt_BR
> speaker, it is difficult, but not impossible, for me to understand pt_PT. I
> don't know if the differences are enough for you to need a model of each of
> them, but surely the difference is orders of magnitude greater than the
> difference between dialects of English. I can do a comparison of the pt_BR
> and pt_PT translations of KDE for you if that would be of any help. You can
> contact me via the e-mail address "XXXXXX at kdemail.net".

It now appears to me that Portuguese is a special case, and a more
general solution isn't acceptable. Tcatng first detects Portuguese
using a model generated from a combined pt_PT and pt_BR corpus, then
uses specialized models to differentiate between the two.
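
To make the two-stage idea concrete, here is a rough sketch. It is not
Tcatng's actual code: the character-trigram scoring, the toy corpora, and
every name in it (ngram_profile, detect, LANGUAGES, DIALECTS) are my own
assumptions for illustration. A real model would be trained on far more
text.

```python
from collections import Counter

def ngram_profile(text, n=3):
    """Relative frequencies of character n-grams in text."""
    padded = " " + " ".join(text.lower().split()) + " "
    grams = Counter(padded[i:i + n] for i in range(len(padded) - n + 1))
    total = sum(grams.values())
    return {g: c / total for g, c in grams.items()}

def similarity(a, b):
    """Overlap of two profiles: the smaller frequency of each shared n-gram."""
    return sum(min(f, b[g]) for g, f in a.items() if g in b)

def best_match(text, profiles):
    """Name of the profile most similar to text."""
    sample = ngram_profile(text)
    return max(profiles, key=lambda name: similarity(sample, profiles[name]))

def detect(text, languages, dialects):
    """Stage 1: pick a language. Stage 2: refine to a dialect if one is modelled."""
    language = best_match(text, languages)
    if language not in dialects:
        return language
    return best_match(text, dialects[language])

# Toy corpora (hypothetical; tiny on purpose).
PT_BR = "o usuário salva o arquivo na tela com o mouse"
PT_PT = "o utilizador grava o ficheiro no ecrã com o rato"
EN = "the user saves the file on the screen with the mouse"

# Stage 1 uses one combined Portuguese model; stage 2 uses one per dialect.
LANGUAGES = {"pt": ngram_profile(PT_BR + " " + PT_PT), "en": ngram_profile(EN)}
DIALECTS = {"pt": {"pt_BR": ngram_profile(PT_BR), "pt_PT": ngram_profile(PT_PT)}}

if __name__ == "__main__":
    for sample in ("abrir o arquivo do usuário",
                   "abrir o ficheiro do utilizador",
                   "open the file of the user"):
        print(sample, "->", detect(sample, LANGUAGES, DIALECTS))
```

The point of the combined first stage is that neither dialect model has to
compete with unrelated languages; the specialized models only ever see text
already judged to be Portuguese.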

Take a look at the .corpus files at this site:
http://tcatng.cvs.sourceforge.net/tcatng/tcatng/language-profiles/pt-br/

Are those words characteristic of their respective dialects?

Jacob

