[Kde-accessibility] Fwd: Re: paraphlegic KDE support
Willie Walker
William.Walker at Sun.COM
Thu Feb 23 17:57:34 CET 2006
Hi All:
I just want to jump in on the speech recognition stuff. Having
participated in several standards efforts in this area (e.g., JSAPI,
VoiceXML/SSML/SGML), developed a number of speech recognition
applications, seen the trials and tribulations of inconsistent SAPI
implementations, and led the Sphinx-4 effort, I'd like to offer my
unsolicited opinion :-).
In my opinion, there are enough differences among the various speech
recognition systems and their APIs that I'm not sure effort is best
spent tilting at the "one API for all" windmill. IMO, one could
spend years trying to come up with a new standard API in this space
and end up with yet another standard but not very useful API, with
perhaps one buggy implementation on one speech engine. Plus, it would
just be repeating work that has already been done and making the same
mistakes that have already been made time and time again.
As an alternative, I'd offer the approach of centering on an available
recognition engine and designing the assistive technology first. Get
your feet wet with that and use it as a vehicle to better understand
the problems you will face with any speech recognition task for the
desktop. Examples include:
o how to dynamically build a grammar based upon stuff you can get
from the AT-SPI
o how to deal with confusable words (or discover that recognition for
a particular grammar is just plain failing and you need to tweak it
dynamically)
o how to deal with unspeakable words
o how to deal with deictic references
o how to deal with compound utterances
o how to handle dictation vs. command and control
o how to deal with tapering/restructuring of prompts based upon
recognition success/failure
o how to allow the user to recover from misrecognitions
o how to handle custom profiles per user
o (MOST IMPORTANTLY) just what is a compelling speech interaction
experience for the desktop?
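To make the first item concrete, here is a minimal sketch of building a command grammar from widget names. It assumes the names have already been collected (for example, from the AT-SPI) and are passed in as plain strings; the function names speakable and build_jsgf, the "click | press" command prefix, and the choice of JSGF as the output format are all illustrative assumptions, not part of any existing tool. Names with nothing pronounceable in them are simply dropped, which is only the crudest possible answer to the "unspeakable words" problem.

```python
import re


def speakable(name):
    """Return a lowercase, letters-only phrase for a widget name, or None."""
    # Drop mnemonic underscores and ellipses, as in "_Open..." menu labels.
    cleaned = name.replace("_", "").replace("...", " ")
    words = re.findall(r"[A-Za-z]+", cleaned)
    return " ".join(word.lower() for word in words) or None


def build_jsgf(grammar_name, widget_names):
    """Build a JSGF command grammar with one alternative per speakable name."""
    phrases = []
    for name in widget_names:
        phrase = speakable(name)
        # Skip unspeakable names and duplicates (e.g. "Open" and "_Open...").
        if phrase and phrase not in phrases:
            phrases.append(phrase)
    if not phrases:
        raise ValueError("no speakable widget names")
    alternatives = " | ".join(phrases)
    return (
        "#JSGF V1.0;\n"
        f"grammar {grammar_name};\n"
        f"public <command> = (click | press) ({alternatives});\n"
    )


if __name__ == "__main__":
    # Hypothetical widget names; a real tool would query the desktop for them.
    print(build_jsgf("desktop", ["_Open...", "Save _As", ">>", "open"]))
```

Rebuilding the grammar whenever focus or window contents change is where the harder questions in the list above start to show up.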
Once you have a better understanding of the real problems and have
developed a working assistive technology, then take a look at perhaps
generalizing a useful layer across multiple engines. The end result is
that you will probably end up with a useful assistive technology
sooner. In addition, you will also end up with an API that is known
to work for at least one assistive technology.
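As a rough illustration of what such an extracted layer might look like, the sketch below defines only the two operations a hypothetical working assistive technology turned out to need. Everything here is an assumption made for the example: RecognizerBackend, ScriptedBackend, and run_commands are invented names, and the scripted backend just replays canned utterances in place of a real engine adapter.

```python
from typing import Callable, Dict, List, Protocol


class RecognizerBackend(Protocol):
    """Hypothetical minimal surface the assistive technology actually uses."""

    def load_grammar(self, grammar: str) -> None: ...

    def listen(self) -> str: ...


class ScriptedBackend:
    """Stand-in engine that replays canned utterances, useful for testing."""

    def __init__(self, utterances: List[str]) -> None:
        self._utterances = list(utterances)
        self.grammar = None

    def load_grammar(self, grammar: str) -> None:
        self.grammar = grammar

    def listen(self) -> str:
        # An empty string signals that there is nothing more to recognize.
        return self._utterances.pop(0) if self._utterances else ""


def run_commands(
    backend: RecognizerBackend,
    grammar: str,
    handlers: Dict[str, Callable[[], str]],
) -> List[str]:
    """Load the grammar, then dispatch each recognized utterance to a handler."""
    backend.load_grammar(grammar)
    results = []
    while True:
        utterance = backend.listen()
        if not utterance:
            break
        handler = handlers.get(utterance)
        if handler is not None:  # Unknown utterances are ignored in this sketch.
            results.append(handler())
    return results
```

The interface stays small because it is derived from one working tool rather than designed up front; a second engine would need only its own class with the same two methods.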
Will