[Kde-accessibility] Proklam and KMouth

Bill Haneman bill.haneman@sun.com
23 Sep 2002 20:57:54 +0100


Pupeno said:

> I don't want to see what happens if Proklam is configured as a gnome-speech
> backend AND gnome-speech is configured as a Proklam backend.

It's not as crazy as it may sound.  If the two have interfaces that map
onto one another, they can interoperate freely, which was my point. 
Though of course you wouldn't use the two "plugins" in a circular
fashion, if the services interoperated fully then all Proklam clients
could access all gnome-speech engines, and all gnome-speech clients
could access all the Proklam drivers.  That would be the goal.
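
To illustrate the idea of two interfaces that map onto one another, here
is a minimal Python sketch.  It is not the real Proklam or gnome-speech
API: the class names and the methods say_text() and speak() are
hypothetical stand-ins, assumed only for the example.  Each adapter wraps
an object from one side so that it can be used by clients of the other.

```python
class ProklamStyleDriver:
    """Hypothetical stand-in for a Proklam-side driver interface."""

    def say_text(self, text):
        raise NotImplementedError


class GnomeSpeechStyleEngine:
    """Hypothetical stand-in for a gnome-speech-side engine interface."""

    def speak(self, text):
        raise NotImplementedError


class EngineAsDriver(ProklamStyleDriver):
    """Lets Proklam-style clients use a gnome-speech-style engine."""

    def __init__(self, engine):
        self.engine = engine

    def say_text(self, text):
        return self.engine.speak(text)


class DriverAsEngine(GnomeSpeechStyleEngine):
    """Lets gnome-speech-style clients use a Proklam-style driver."""

    def __init__(self, driver):
        self.driver = driver

    def speak(self, text):
        return self.driver.say_text(text)


class RecordingEngine(GnomeSpeechStyleEngine):
    """Toy engine that records text instead of producing audio."""

    def __init__(self):
        self.log = []

    def speak(self, text):
        self.log.append(text)


class RecordingDriver(ProklamStyleDriver):
    """Toy driver that records text instead of producing audio."""

    def __init__(self):
        self.log = []

    def say_text(self, text):
        self.log.append(text)
```

The closer the two interfaces are, the thinner each adapter becomes; in
this toy case each one is a single forwarding call.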
 
...

> As I said in another mail... I expect as little as possible from the TTS,
> that's why Proklam does a lot of parsing and preprocessing. I think this will
> allow me to use as many TTSs as possible.

What we found in discussions with assistive technology engineers was
that good accessibility requires some fairly advanced features, or at
least better than lowest-common-denominator.  But by handling all the
complexity in the front end, as you suggest, you lose the ability to use
powerful TTS backend features, and you probably increase latency (which
is a big issue for accessibility).

On the other hand, if your API is rich and full-featured, you can choose
to expose advanced features from the TTS driver when they are available,
and otherwise wrap/emulate them in the frontend, as you suggest for
stopping speech.
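
A minimal Python sketch of that pattern follows.  All names here
(SimpleDriver, RichDriver, Speaker, say, stop) are hypothetical and are
not taken from Proklam or gnome-speech.  It assumes the front end splits
text into chunks and hands them to the driver one at a time, so a stop
request can be emulated by dropping the chunks not yet spoken, while a
driver that has its own stop() gets it called directly.

```python
from collections import deque


class SimpleDriver:
    """Hypothetical minimal driver: it can only speak one chunk of text."""

    def __init__(self):
        self.spoken = []

    def say(self, text):
        self.spoken.append(text)


class RichDriver(SimpleDriver):
    """Hypothetical driver that also has a native stop."""

    def __init__(self):
        super().__init__()
        self.stopped = False

    def stop(self):
        self.stopped = True


class Speaker:
    """Front end offering say/stop whatever the driver supports."""

    def __init__(self, driver):
        self.driver = driver
        self.pending = deque()

    def queue(self, chunks):
        self.pending.extend(chunks)

    def speak_next(self):
        """Send the next pending chunk to the driver; False if none left."""
        if not self.pending:
            return False
        self.driver.say(self.pending.popleft())
        return True

    def stop(self):
        # Always drop unspoken chunks; this is the emulated part.
        self.pending.clear()
        # Use the driver's own stop when it has one.
        native_stop = getattr(self.driver, "stop", None)
        if callable(native_stop):
            native_stop()
            return "native"
        return "emulated"
```

With the simple driver, a stop only takes effect at the next chunk
boundary, which is where the extra latency comes from; with the rich
driver the engine itself is told to stop.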

I think the big difference here is that you have decided to fix both the
front and back ends of your architecture, which is very limiting if you
are trying to make flexible plugins.  The gnome-speech architecture is
not really suitable as a lowest-common-denominator back end; it is a
rich-featured "front end".

In gnome-speech we do not have a single back-end API, we use the APIs
available to us from the TTS engines.  If you did likewise with Proklam
you would be able to access the extended features you mention much more
readily.  My suggestion is that the front ends of the two architectures
should be made as similar as possible, for a number of reasons.

> I know this won't allow me to use the extended features of feature-rich
> TTSs... I may think of some workaround solution for some features, like the
> ability to stop on demand... when a stop message is received by Proklam... it
> should mark the text to be stopped and run a stop function for the plug in...
> it may stop in the moment or just ignore it. It also depends if the say
> function is blocking or not.
> 
> > So if our APIs are close enough, the possibilities for how the two
> > architectures plug together are much more flexible, and much less work
> > is involved in writing either kind of plugin.  This means of course that
> > we need to try to find IDL that meets the needs that we've both
> > identified, i.e. we should make sure we haven't "left anything out" of
> > our IDL which would be important to Proklam and vice-versa.

> As gnome-speech would be a plug in from Proklam, Proklam's interface doesn't
> have to be similar to gnome-speech's but Proklam's plug in interface should
> be similar to gnome-speech's and I think the best would happen to
> gnome-speech (aren't the interfaces from the application and to the TTS very
> different in gnome-speech? well, in Proklam they are, from the application,
> there's a rich interface with lots of features and capabilities and more are
> to be developed, and to the TTS there's a simple interface so any TTS could
> be used).

gnome-speech does not have a single back-end API, for the reasons you
allude to yourself.  gnome-speech could be a plugin for Proklam, but
only if Proklam had a more flexible back-end that allowed for both
simple and complex TTS drivers would you be able to use gnome-speech's
advanced features effectively.

-Bill