Natural language processing tech for the desktop!
Alexander Dymo
dymo at ukrpost.ua
Wed Oct 22 23:26:05 BST 2008
> > I don't even think that spell and grammar checking should be separated
> > very much, since a spell checker should ideally know about the sentence
> > structure, too.
>
> Well, AFAIK (but checking is not my speciality), you can build a really good
> and fast spell checker with simple statistical techniques and a simple
> edit distance. For a grammar checker, you need a full natural-language
> syntax parser and other techniques to find the errors and suggest
> corrections. But it's true that a good grammar checker must also be robust
> in the face of spelling errors.
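For reference, the "simple statistics plus edit distance" kind of spell
checker described in the quote can be sketched roughly like this. It is only
an illustration: the toy corpus, the names edit_distance/suggest/FREQ and the
max_distance cutoff are my own assumptions, not taken from any existing
checker.

```python
from collections import Counter


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via the classic dynamic-programming recurrence."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute (or match)
        prev = curr
    return prev[-1]


def suggest(word, freq, max_distance=2):
    """Return the known word closest to `word`, preferring frequent words.

    `freq` maps known words to corpus counts (the "statistics" part).
    Returns None if nothing is within `max_distance` edits.
    """
    if word in freq:
        return word
    candidates = [(edit_distance(word, known), -count, known)
                  for known, count in freq.items()]
    candidates = [c for c in candidates if c[0] <= max_distance]
    return min(candidates)[2] if candidates else None


# Hypothetical toy corpus; a real checker would use a large word list.
FREQ = Counter("the grammar checker checks the grammar of the sentence".split())

print(suggest("grammer", FREQ))
print(suggest("teh", FREQ))
```

This scans the whole word list for every unknown word, which is fine for a
sketch; a real implementation would index the dictionary instead.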
That's why I like the idea of using the Link Grammar parser:
http://www.abisource.com/projects/link-grammar/
http://www.link.cs.cmu.edu/link/
It's quite convenient to use. It has both word dictionaries and grammar rules,
and when you try to build the graph (of links between words), it figures
out the morphology and syntax simultaneously.
If the sentence is not correct, it will either leave words unrecognized
morphologically (spelling error) or leave words outside the sentence
link graph (syntax error). The algorithm to do that is IIRC O(n^3) in the
number of words, which is great.
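To make the two failure modes concrete, here is a toy sketch. It is NOT the
Link Grammar API: the function classify_words, the dictionary set and the
hand-written links are all hypothetical, and the links merely stand in for
whatever linkage a real parser would produce.

```python
def classify_words(words, dictionary, links):
    """Label problem words given a (hypothetical) parser result.

    words:      list of tokens in the sentence
    dictionary: set of known word forms
    links:      set of (i, j) index pairs for words the parser linked
    Returns {index: "spelling" | "syntax"} for the problem words only.
    """
    linked = {index for pair in links for index in pair}
    report = {}
    for index, word in enumerate(words):
        if word.lower() not in dictionary:
            # Not recognized morphologically: treat as a spelling error.
            report[index] = "spelling"
        elif index not in linked:
            # Known word left outside the link graph: treat as a syntax error.
            report[index] = "syntax"
    return report


# Hand-made example; a real parser would compute the links itself.
words = ["the", "dog", "chased", "teh", "cat", "of"]
dictionary = {"the", "dog", "chased", "cat", "of"}
links = {(0, 1), (1, 2), (2, 4)}

print(classify_words(words, dictionary, links))
```

Here "teh" would be reported as a spelling error and the dangling "of" as a
syntax error.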
I know AbiWord uses that now, but I don't know how they do error reporting for
syntax errors (which is quite an interesting question in itself).
The only problem is that only the English grammar is complete atm. There are
Italian and German grammars, but they don't look mature yet. There's also
quite a good Russian grammar, but it's unfortunately proprietary.
More information about the kde-core-devel
mailing list