[Kde-accessibility] [Announce] Revisions to KDE Text-to-Speech System (KTTS)

Gary Cramblitt garycramblitt at comcast.net
Sat Nov 13 06:13:26 CET 2004


A number of changes have been committed to CVS for KTTSD and the KDE 
Text-to-Speech system.  Chief among them is support for Festival 2.0 and 
MultiSyn voices.

If you have not tried Festival 2.0 (currently 1.95 beta) and the new MultiSyn 
voices, you are in for a treat.  The voices are very natural sounding, 
sometimes indistinguishable from a human voice.  Congrats to the Festival 
team for a job very well done indeed.

In addition, Festival is now free for any purpose, commercial or 
non-commercial alike.  See the Festival website for details.

That's the good news.  Now the bad news.  The MultiSyn voices are huge and 
typically require 5 to 15 seconds to load.  Furthermore, if the MultiSyn  
voices are the only voices you have installed, you pay this penalty each time 
Festival is started.  To deal with this, KTTS now offers the option to start 
Festival and load voices when KTTSD is started, rather than waiting until the 
first use of the synth.

Synthesis of sentences also takes slightly longer than for other voices.  
Because KTTSD endeavors to keep the synth busy while simultaneously playing 
already synthesized sentences, this is not too bad, but it does cause a 
noticeable delay for the first sentence.
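
For the curious, the overlap between synthesis and playback can be sketched 
as a small producer/consumer queue.  This is only an illustration in Python, 
not KTTSD's actual code (KTTSD is written in C++).  The synthesize and play 
callables, the speak function, and the lookahead parameter are all 
hypothetical stand-ins for the synth plugin and the audio player.

```python
import queue
import threading


def speak(sentences, synthesize, play, lookahead=2):
    """Play sentences in order while synthesizing ahead in a worker thread.

    synthesize(text) -> audio and play(audio) are hypothetical callables
    supplied by the caller; they stand in for the synth plugin and the
    audio player.
    """
    ready = queue.Queue(maxsize=lookahead)  # bounded, so synth stays a little ahead
    done = object()  # sentinel marking the end of the job

    def producer():
        try:
            for text in sentences:
                ready.put(synthesize(text))  # blocks while the queue is full
        finally:
            ready.put(done)  # always wake the consumer, even after an error

    worker = threading.Thread(target=producer, daemon=True)
    worker.start()

    played = []
    while True:
        audio = ready.get()  # the first get() waits for the first sentence
        if audio is done:
            break
        play(audio)
        played.append(audio)
    worker.join()
    return played
```

Only the first get() has nothing to overlap with, which is why the delay 
shows up on the first sentence and is mostly hidden afterwards.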

The datadir parameter is no longer supported in Festival 2.0, and the MultiSyn 
voices are not expected to be in the voices/ directory.  Instead, they are in 
the voices-multisyn/ directory, while the old voices remain in the voices/ 
directory.  Accordingly, the Festival Interactive configuration dialog no 
longer asks for the path to the voices directory.  Instead, it asks for the 
path to the Festival executable, and queries Festival itself for the 
available voice files.  As a side benefit, this permits you to install both 
the old and new Festival and configure one or more Talkers for each one.  
Because the query can take a long time (up to 15 seconds, but usually 5 to 6 
seconds), a Cancel button is offered for users who do not wish to wait.  (Tip: 
If you want to avoid long query times, install one of the diphone voices in 
addition to the MultiSyn voices, such as kal_diphone.)
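
A rough sketch of what "querying Festival itself" can look like, in Python 
rather than the C++ the plugin uses.  It assumes the Festival executable 
accepts -b (batch mode) followed by a Scheme expression and that it defines 
(voice.list); I have not checked that this is the exact query the 
configuration dialog sends.  The function names and the timeout default are 
made up for the example, with the timeout playing the role of the Cancel 
button.

```python
import re
import subprocess


def parse_voice_list(output):
    """Extract voice names from a printed Scheme list such as
    "(kal_diphone ked_diphone)".  Returns [] for "nil" or empty output."""
    match = re.search(r"\(([^()]*)\)", output)
    if match is None:
        return []
    return match.group(1).split()


def query_voices(festival_exe="festival", timeout=15.0):
    """Ask the given Festival executable which voices it knows about.

    Assumption: the executable accepts -b followed by a Scheme expression
    and defines (voice.list).  Raises subprocess.TimeoutExpired if Festival
    takes longer than `timeout` seconds.
    """
    result = subprocess.run(
        [festival_exe, "-b", "(print (voice.list))"],
        capture_output=True, text=True, timeout=timeout, check=True,
    )
    return parse_voice_list(result.stdout)
```

Because the executable path is a parameter, the same function can be pointed 
at an old and a new Festival install in turn, one Talker per install.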

At present, only two MultiSyn voices are available -- Canadian English and 
Scottish English -- but IMHO, this new technology is so impressive that 
additional languages are sure to come soon.

The MultiSyn voices seem to ignore the Duration_Stretch parameter, so for 
these voices, the Speed settings in the Festival Interactive configuration 
dialog are disabled.
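
To make the Speed behavior concrete, here is a minimal Python sketch of 
turning a speed setting into a Festival command.  (Parameter.set 
'Duration_Stretch ...) is the Festival Scheme call for this parameter; the 
percent scale, the inverse mapping, and the function name are assumptions of 
mine for illustration and are not taken from the plugin source.

```python
def duration_stretch_command(speed_percent, is_multisyn):
    """Return the Festival Scheme command for a speed setting, or None
    when the setting should be ignored (MultiSyn voices).

    Assumed mapping: 100 percent is normal speed and Duration_Stretch is
    the inverse ratio, so 200 percent (twice as fast) gives 0.5.
    """
    if is_multisyn:
        return None  # speed control is disabled for these voices
    if speed_percent <= 0:
        raise ValueError("speed_percent must be positive")
    stretch = 100.0 / speed_percent
    return "(Parameter.set 'Duration_Stretch {:.2f})".format(stretch)
```

Larger Duration_Stretch values slow speech down, which is why the mapping is 
inverted.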

Several other enhancements and bug fixes have been made.  See the ChangeLog 
below.

KTTS is currently available in KDE CVS, kdenonbeta module, directory kttsd.   
See

  http://developer.kde.org/source/anoncvs.html

Nightly tarballs of the kdenonbeta module (large) are available at

  ftp://ftp.kde.org/pub/kde/unstable/latest/kdenonbeta.tar.bz2

Festival 1.95beta is available at

  http://www.cstr.ed.ac.uk/projects/festival/download.html

If the Talker configuration dialogs seem to be misbehaving for you, please do 
the following:

1.  Exit KTTSMgr and run killall kttsd.
2.  Run clean_obsolete.sh in the kttsd root directory.
3.  Reinstall KTTS, i.e., make install.
4.  Delete $HOME/.kde/share/config/kttsdrc
5.  Start KTTSMgr and reconfigure your Talkers.  Be sure to click Apply!
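
If you end up doing this more than once, steps 1 through 4 can be collected 
into a small helper.  The Python sketch below only builds the list of 
commands and executes nothing.  It assumes that clean_obsolete.sh can be run 
with sh and that make install is run from the kttsd root directory; the 
function name and the kttsd_src argument (the path to your own checkout) are 
invented for the example.

```python
import os


def reset_plan(kttsd_src, home=None):
    """Return steps 1-4 as a list of (working_directory, argv) pairs.

    Nothing is executed here.  kttsd_src is the kttsd root directory of
    your checkout (a caller-supplied path, not something KTTS defines).
    """
    home = home or os.path.expanduser("~")
    config = os.path.join(home, ".kde", "share", "config", "kttsdrc")
    return [
        (None, ["killall", "kttsd"]),              # step 1, after exiting KTTSMgr
        (kttsd_src, ["sh", "clean_obsolete.sh"]),  # step 2
        (kttsd_src, ["make", "install"]),          # step 3 (directory is assumed)
        (None, ["rm", "-f", config]),              # step 4
    ]
```

Each pair can be passed to subprocess.run(argv, cwd=working_directory) once 
you have looked the plan over.  Step 5 still has to be done by hand.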

Since these changes required code changes throughout KTTS, I'm sure there are 
a few bugs lurking.  Feedback is appreciated.

ChangeLog
---------------

2004-11-11  Gary Cramblitt (PhantomsDad) <garycramblitt at comcast.net>
        * Support for Festival 2.0 and Festival MultiSyn voices in
          FestivalInt plugin.
        * Query Festival for available voices, rather than scanning for
          directories.
        * Support for multiple versions of Festival executable.  Now asks
          for EXE path rather than voices path.
        * Allow preload of Festival voices that take a long time to load.
          If set, Festival is started when KTTSD starts and the voice is
          loaded.
        * When stopText() is called and FestivalInt plugin is synthing (not
          saying) using a pre-loaded voice, instead of killing Festival,
          which would cost hugely in re-startup time, Festival is allowed
          to finish synthing and result is discarded.
          This improves performance when rewinding/fastforwarding.
        * Corrected FestivalInt voices file as to voice descriptions and
          languages.  Added MultiSyn voices.
        * Added accelerators and WhatsThis help to FestivalInt, Command,
          Epos, Flite, and Hadifix configuration dialogs.
        * Added modal, cancelable, progress dialog while Testing in
          FestivalInt, Command, Epos, Flite, FreeTTS, and Hadifix
          configuration dialogs.
          This prevents a crash when user clicks OK or Cancel before test
          has completed.
        * Command plugin always displays configuration dialog when added,
          i.e., never autoconfigs.
        * Speed adjustment disabled when using MultiSyn Festival voices.
        * Allow KTTSMgr screen to be resized to minimum size.  Allow
          splitter to resize jobs ListView to minimum vertical size.
        * No longer attempt to build Festival plugin (static linking to
          Festival/Speech Tools libraries).  User must explicitly request
          via ./configure --enable-kttsd-festival.
          Code is woefully behind, I cannot get it to work, and probably
          wouldn't work anymore even if I could get it to link and get
          past crash on first call to library.
          FestivalInt seems to work just fine..grc.

2004-11-10  Paul Giannaros (Cerulean)
        * Fixed getTalkerCodes() returning corrupted talker codes.

-- 
Gary Cramblitt (aka PhantomsDad)
KDE Text-to-Speech Maintainer
http://accessibility.kde.org/developer/kttsd/index.php

