[Kde-accessibility] Proklam and KMouth

Bill Haneman bill.haneman@sun.com
23 Sep 2002 19:32:46 +0100


On Mon, 2002-09-23 at 18:57, Pupeno wrote:
...
> > I think the phrases that are spoken via KMouth should be classified as
> > messages? I do not need to navigate through them, but a feature to stop
> > speaking a phrase would be nice. (Is it possible that KMouth gets informed
> > when Proklam finishes speaking a message?)
> Stopping the phrase won't be possible, as the speaking function, at least in
> Festival, is atomic from this side of the street. Once I call the Festival
> function, the program is blocked until it returns, and even though Proklam is
> multithreaded there's nothing I can do to interrupt the function.

We have been working on the same sorts of issues in gnome-speech.  Marc
Mulcahy (not on this list) is the best resource for details.  The new IDL,
which should become the default in a few days, provides not only the
ability to stop an utterance but also the ability to get "callback"
notifications from the speech process at the finest granularity that the
underlying speech engine can support.  Unfortunately this IDL isn't in
CVS yet, but it should be in a couple of days, maybe a week.  It's possible
that we could converge on a common API in the next few days.

I believe that if you use the festival internal APIs instead of just the
festival-server API, you can get better control of utterances.  It is
certainly possible to stop a phrase that is partway spoken, by issuing
this string to the festival server:

"(audio_mode 'shutup)\n"

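A minimal sketch of sending that string from a client, in Python.  It
assumes a Festival server is already running and reachable (1314 is
Festival's default server port); the helper names encode_command and
stop_speaking are made up for illustration and are not part of Festival
or gnome-speech.  Whether the command silences audio that was started
over a different connection depends on the server setup and is not
verified here.

```python
import socket

# The Scheme expression quoted above, without the trailing newline.
SHUTUP_EXPRESSION = "(audio_mode 'shutup)"


def encode_command(expression: str) -> bytes:
    """Return the expression as newline-terminated ASCII bytes."""
    if not expression.endswith("\n"):
        expression += "\n"
    return expression.encode("ascii")


def stop_speaking(conn: socket.socket) -> bytes:
    """Send the 'shutup' command over an already connected socket.

    Returns the bytes that were written, which is handy for logging.
    """
    data = encode_command(SHUTUP_EXPRESSION)
    conn.sendall(data)
    return data
```

Against a local server this could be called as
`with socket.create_connection(("localhost", 1314), timeout=2.0) as conn: stop_speaking(conn)`.
Taking a connected socket rather than a host and port keeps the helper
usable from whichever thread already owns the connection.
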
Best regards,

Bill


> You'll have a unique number that identifies each message or warning, and
> you could use it to find out whether a warning or message has been spoken
> or not. We could add a feature that lets you 'register' with Proklam so that
> it sends you a DCOP message when a message or warning is spoken, but
> that won't be covered by the KSpeech API (mixing everything together, what
> could be done is to make sayWarning and sayMessage blocking calls in KSpeech).
> I think KMouth should normally say messages, but with the possibility of
> saying warnings, because someone may be listening to a long text and may
> have something to say immediately... that would then be a warning.
> 
> > > > 4th idea: Make it possible to load several plug-ins, and implement a
> > > > parameter forwarding mechanism. When the program asks for special
> > > > parameters ("language" = "en" and "sex" = "female"), then Proklam calls
> > > > a parameter function in the default plugin. If the function returns
> > > > with "false", then the next plugin is tried. If no suitable active
> > > > plugin is found, then the default configuration is used.
> > >
> > > [...]
> > > My idea is this... as you say, load more than one plug in at a time. In
> > > the KControl module, you'll choose to add a language... so, you add the
> > > language English and a tab or something like that for English is added,
> > > then, inside that tab, you choose which plug in you want and which voice
> > > you want (Festival, for example, doesn't truly separate languages, only
> > > voices, so you could end up assigning a Spanish voice to English), then
> > > if you add Spanish, another tab for Spanish is added.
> > > [...]
> >
> > With KMouth in mind I could work with both ideas. However, I need a list of
> > languages that are supported by Proklam (I assume that Proklam has a
> > function that can be used to enquire this list). The former idea needs to
> Ok, it'll be there.
> 
> > be extended for that (each plug in needs to contain a function that returns
> > a list of languages).
> >
> > The former idea is more work to implement:
> > -There needs to be a larger interface between Proklam and the plug ins.
> > -Proklam needs to contain some more complex code in order to determine
> > which plug in is used.
> > -Each multilingual plug in needs to have a multilingual configuration,
> > which is more complex both to implement and to use than a single-lingual
> > configuration (selecting multiple voices for multiple languages for
> > example).
> >
> > As both ideas contain the fact that Proklam can load multiple plug ins at
> > the same time, I do not see why we need to implement this overhead.
> > Therefore I would prefer the latter idea.
> 
> The problem with allowing multilingual plug ins to manage multiple languages
> is that it turns the whole system into something much more complex, from my
> point of view (developer), from the user's point of view (configuring the
> system), and maybe even for you (the developers using Proklam).
> It would have to let you load any number of plug ins and configure any
> number of languages inside them. Then, if you load Festival, it would load
> by default one voice with one language, which may interfere with another
> TTS that is already working, and that requires a lot of
> checking-if-the-configuration-will-work. Another problem, as you said, is
> that plug ins and Proklam would need more communication so that plug ins
> could register the languages they support. And I think this could have yet
> another drawback... how much time does Festival take to switch from one
> voice (language) to another? Suppose that you triggered the reading of a
> long text and something triggers a bunch of messages and warnings. Although
> you speak English, your desktop is in English and those messages and
> warnings are in English, the text could be in German, and it would be a hell
> of loading voices, which would consume a lot of resources.
> 
> So, reformulating my previous idea, it would work this way:
> In the configuration, you choose a language and press [ADD], so a tab with
> the language name on it is added at the bottom of the configuration... in
> that tab, you choose the plug in you want to use for that language, and then
> you configure it (of course, you have to be careful to choose an English
> voice for English and a Spanish voice for Spanish in the case of Festival,
> for instance). If you want another language, you select it and press
> [ADD], and another tab is added... then you can select Festival or whatever
> plug in you have installed... so each language will have its own plug
> in/configuration. Do you understand? Do you like it this way? The main
> drawback of this is having a lot of plug ins loaded at the same time... but
> who will need more than 4 languages at the same time? An important feature
> is that with advanced configurations I'll be able to tune the system. For
> example, Festival has a heap size... suppose that you load Spanish
> because from time to time you want to say/hear something in Spanish, but your
> main language is English and that's what you'll use all the time; then you
> may configure Festival in English with a big heap size and Festival in
> Spanish with a smaller heap size.
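The one-plug-in-per-language layout described above could be modelled
roughly as follows.  This is only a sketch: the class names, field names,
voice names and heap sizes are illustrative assumptions, not Proklam's
actual configuration format or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LanguageConfig:
    """Settings behind one language tab (all fields hypothetical)."""
    plugin: str      # which TTS plug in serves this language
    voice: str       # voice chosen inside that plug in
    heap_size: int   # per-instance tuning knob, e.g. Festival's heap


class SpeechConfig:
    """Maps each configured language to exactly one plug in setup."""

    def __init__(self) -> None:
        self._by_language = {}

    def add_language(self, language: str, config: LanguageConfig) -> None:
        # Corresponds to pressing [ADD]: one tab, one plug in per language.
        self._by_language[language] = config

    def languages(self) -> list:
        # The list a client such as KMouth could ask for.
        return sorted(self._by_language)

    def config_for(self, language: str) -> LanguageConfig:
        return self._by_language[language]


# Example values only: a large heap for the main language, a small one
# for the language that is used occasionally.
config = SpeechConfig()
config.add_language("en", LanguageConfig("festival", "kal_diphone", 2_000_000))
config.add_language("es", LanguageConfig("festival", "el_diphone", 500_000))
```

Because the language list comes from the configuration rather than from the
plug ins, no plug in has to report which languages it supports.
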
> 
> Another thing... there's no easy way to say which languages Festival
> supports, since they're just voices. To know which languages it supports
> I'd have to dig through the available voices to build the list of
> languages, and that doesn't seem a nice practice... with other TTS engines
> it may not be possible at all (I'm not sure that it's possible with Festival).
> 
> I'll take a look at the gnome-speech API and architecture to polish this, and
> I'll try to send a mail here ASAP with the next API/architecture of Proklam so
> we can discuss it, OK?
> 
> Thank you.
> - -- 
> Pupeno: pupeno@pupeno.com
> http://www.pupeno.com
> - ---
> Help the hungry children of Argentina, 
> please go to (and make it your homepage):
> http://www.porloschicos.com/servlet/PorLosChicos?comando=donar