Natural language processing tech for the desktop!

Alexander Dymo dymo at ukrpost.ua
Wed Oct 22 23:26:05 BST 2008


> > I don't even think that spell and grammar checking should be separated
> > very much, since a spell checker should ideally know about the sentence
> > structure, too.
>
> Well, AFAIK (though checking is not my specialty), you can build a really
> good and fast spell checker with simple statistical techniques and a simple
> edit distance. For a grammar checker, you need a full natural-language
> syntax parser and other techniques to find the errors and suggest
> fixes. But it's true that a good grammar checker must also be robust to
> spelling errors.
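
For reference, here is a minimal sketch of the "statistics plus edit distance"
spell checker described above. The word-frequency table and the names
`WORD_COUNTS` and `suggest` are made up for illustration; a real checker would
load counts from a large corpus.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed row by row with dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute (or match)
        prev = curr
    return prev[-1]


# Hypothetical toy frequency table (assumption, not real corpus data).
WORD_COUNTS = {"grammar": 40, "checker": 25, "sentence": 30,
               "spelling": 20, "parser": 15}


def suggest(word: str, max_distance: int = 2) -> list[str]:
    """Return known words near `word`, closest first, then most frequent."""
    if word in WORD_COUNTS:
        return [word]
    scored = [(edit_distance(word, known), -count, known)
              for known, count in WORD_COUNTS.items()]
    return [known for dist, _, known in sorted(scored) if dist <= max_distance]


print(suggest("grammer"))
```

Ranking by distance first and frequency second is only one possible choice;
it knows nothing about the surrounding sentence, which is exactly the
limitation being discussed.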

That's why I like the idea of using the Link Grammar parser:
http://www.abisource.com/projects/link-grammar/
http://www.link.cs.cmu.edu/link/

It's quite convenient to use. It has both word dictionaries and grammar
rules, and when you try to build the graph of links between words, it figures
out the morphology and syntax simultaneously.

If the sentence is not correct, the parser will either leave words
morphologically unrecognized (a spelling error) or leave words outside the
sentence's link graph (a syntax error). The algorithm for this is, IIRC,
O(n^3) in the number of words, which is great.
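
To make the two error classes concrete, here is a rough sketch of how a
checker could report them once it has a parse result. It does not call the
Link Grammar library: the dictionary and the list of links are hand-written
stand-ins for what a parser would return, and `classify_errors` is a
hypothetical helper name.

```python
def classify_errors(words, dictionary, links):
    """Split problem words into spelling and syntax errors.

    words      -- list of tokens in the sentence
    dictionary -- set of words the (hypothetical) parser recognizes
    links      -- pairs (i, j) of word indices joined by a link
    """
    linked = {index for pair in links for index in pair}
    report = []
    for index, word in enumerate(words):
        if word.lower() not in dictionary:
            # Not recognized morphologically: treat as a spelling error.
            report.append((word, "spelling"))
        elif index not in linked:
            # Known word left outside the link graph: treat as a syntax error.
            report.append((word, "syntax"))
    return report


# Hand-made example data (assumption: not produced by a real parser).
words = ["the", "cat", "chased", "the", "mouze", "quickly", "of"]
dictionary = {"the", "cat", "chased", "mouse", "quickly", "of"}
links = [(0, 1), (1, 2), (2, 4), (3, 4), (2, 5)]

for word, kind in classify_errors(words, dictionary, links):
    print(f"{word}: {kind} error")
```

Checking the dictionary before the link graph means an unknown word is always
reported as a spelling error, even if the parser managed to guess a role for
it.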

I know AbiWord uses it now, but I don't know how they report syntax errors
(which is quite an interesting question in itself).

The only problem is that only the English grammar is complete at the moment.
There are Italian and German grammars, but they don't look mature yet. There's
also a quite good Russian grammar, but unfortunately it's proprietary.





More information about the kde-core-devel mailing list