[Kde-accessibility] Another speech engine for KTTS?

Wed Feb 8 01:59:53 CET 2006

On Tuesday 07 February 2006 10:11, Jonathan Duddington wrote:
> Here is another speech synthesis engine which can be used with KTTS.
>    http://home.clara.net/jsd/linux/linux_speak.zip   (200 kbytes)

I downloaded the zip file and took a look.  No source code and the license is 
not stated.  Also, never heard of portaudio.

>
> It sounds quite different from the other speech engines such as
> Festival. Although it's not as smooth or natural, it's perhaps clearer
> and more energetic.  I prefer it, but that may just be because I'm used
> to it.
>
> Would it be worthwhile making it freely available for use as an
> alternative with KTTS?
>
> It speaks British English, although there are also superficial attempts
> at Esperanto and German in order to illustrate language switching.
>
> It seems to work quite well as a "Command" talker with KTTSMgr.
>
> I note that the KTTS handbook states:
>   "Ideally, you should use a command that synthesizes to a temporary
>   audio (wav) file, rather than send the speech directly to the audio
>   device"
>
> ... but I didn't see details on how to do this.  Or why it's
> preferable. How do KTTS and the speech engine both know where the wav
> file is located?  Or does the speech engine play the wav file itself?
> Currently the speech is output using the portaudio library, which seems
> to work well here (using Linux kernel 2.6.12).

It is preferable for three reasons.  1. Fewer audio device collisions, because 
kttsd plays the wav file itself and can do so in 4 possible ways (aRts, ALSA, 
GStreamer, or aKode).  2. It improves throughput by allowing kttsd to play a 
wav file while the synth works on the next utterance (sentence).  3. You can 
pause and stop the speech in mid-utterance.
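
The throughput point (reason 2) can be illustrated with a small producer/consumer 
sketch.  This is not kttsd's code, only an illustration in Python; the names 
run_pipeline, synthesize, and play are hypothetical stand-ins for "run the synth 
to produce a wav file" and "play a wav file".

```python
import queue
import threading


def run_pipeline(utterances, synthesize, play):
    """Play each utterance while the next one is being synthesized.

    synthesize(text) -> wav path and play(wav_path) are hypothetical
    callables supplied by the caller; they are not part of any KTTS API.
    """
    ready = queue.Queue(maxsize=1)  # at most one finished wav waits for playback

    def producer():
        try:
            for text in utterances:
                ready.put(synthesize(text))
        finally:
            ready.put(None)  # sentinel: nothing more to play

    worker = threading.Thread(target=producer)
    worker.start()

    played = []
    while True:
        wav_path = ready.get()
        if wav_path is None:
            break
        play(wav_path)  # meanwhile the producer synthesizes the next utterance
        played.append(wav_path)

    worker.join()
    return played
```

Because playback happens in the consumer loop rather than inside the synth, the 
loop is also the natural place to pause or stop between or during wav files.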

Based on the README and "speak --help" it does not appear that linux_speak 
will create a wav file.

For the sake of argument, let's say someone adds that capability to 
linux_speak, so that it can be told on the command line which wav file to 
write, using a -w option.  Use the %w parameter in the command.  kttsd 
will generate a temporary filename and substitute it for %w in the command.  
The command might look like this:

cat %f | speak --stdin -w %w
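
As a rough illustration of the substitution step, the following Python sketch 
expands the placeholders in a command like the one above.  It is not kttsd's 
implementation.  It assumes %f stands for the file holding the text to speak, 
and the helper names expand_command and make_temp_wav_name are hypothetical.  
The shell quoting is an extra precaution added here, not something the post 
describes.

```python
import os
import re
import shlex
import tempfile


def make_temp_wav_name():
    """Create an empty temporary .wav file and return its path."""
    fd, path = tempfile.mkstemp(suffix=".wav")
    os.close(fd)
    return path


def expand_command(template, text_file, wav_file):
    """Substitute %f and %w in a command template in a single pass.

    Assumption: %f is the text file to speak, %w is the wav file to write.
    Paths are shell-quoted so that spaces in them do not split the command.
    """
    values = {"%f": shlex.quote(text_file), "%w": shlex.quote(wav_file)}
    return re.sub(r"%[fw]", lambda m: values[m.group(0)], template)
```

For example, expand_command("cat %f | speak --stdin -w %w", "/tmp/in.txt", 
make_temp_wav_name()) yields a command line with both placeholders filled in, 
and the caller knows where the wav file will appear because it chose the name.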

I will email the author to see what his license terms are, whether source code 
is available, and if output to wav can be added.

-- 
Gary Cramblitt (aka PhantomsDad)
KDE Text-to-Speech Maintainer
http://accessibility.kde.org/developer/kttsd/index.php