[Kde-pim] Please upload articles for automatic Language/Layout Switching

Martin Sandsmark martin.sandsmark at kde.org
Wed Nov 27 14:23:01 GMT 2013


On Tue, Nov 26, 2013 at 11:12:37PM +0530, Shivam Makkar wrote:
> Implementation: https://github.com/amourphious/Language-Detection

The corpus you already have is not up to par. For example,
https://github.com/amourphious/Language-Detection/blob/master/LanguageDetection/langdata/norwegian
is 110 years old today.
The Danish corpus seems to be part of an outdated Bible translation.

So I would recommend (as others have done) using either Wikipedia or,
alternatively, a proper corpus like this one for Norwegian:
http://www.tekstlab.uio.no/norsk/bokmaal/english.html

Or, if you just want a simple n-gram algorithm, there are several ready-made
alternatives, such as this one from Chromium:
https://code.google.com/p/chromium-compact-language-detector/
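
To give an idea of what "a simple n-gram algorithm" can look like, here is a
minimal Python sketch. It is not how the Chromium detector works; it just
compares character trigram counts using cosine similarity. The function names
and the tiny training sentences are made up for illustration, and a real
detector would need a much larger and more modern corpus per language.

```python
from collections import Counter
import math


def trigram_profile(text, n=3):
    """Count overlapping character n-grams in lowercased, space-normalised text."""
    padded = " " + " ".join(text.lower().split()) + " "
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))


def cosine_similarity(a, b):
    """Cosine similarity between two n-gram count profiles (0.0 if either is empty)."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def detect_language(text, profiles):
    """Return the language whose profile is most similar to the text."""
    sample = trigram_profile(text)
    return max(profiles, key=lambda lang: cosine_similarity(sample, profiles[lang]))


# Hypothetical toy training data; far too small for real use.
TRAINING = {
    "english": "the quick brown fox jumps over the lazy dog "
               "and then the dog sleeps in the sun",
    "norwegian": "den raske brune reven hopper over den late hunden "
                 "og så sover hunden i solen",
}
PROFILES = {lang: trigram_profile(text) for lang, text in TRAINING.items()}

print(detect_language("the dog jumps over the fox", PROFILES))
print(detect_language("hunden hopper over reven", PROFILES))
```

With profiles this small the result is only as good as the training text,
which is why the quality of the corpus matters so much.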


-- 
Martin Sandsmark
_______________________________________________
KDE PIM mailing list kde-pim at kde.org
https://mail.kde.org/mailman/listinfo/kde-pim
KDE PIM home page at http://pim.kde.org/


