[Kde-accessibility] Re: Fwd: kttsd

Tue Mar 23 23:41:06 CET 2004

On Tuesday 23 March 2004 12:09 pm, Olaf Jan Schmidt wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Hi Gary!
>
> Thanks for your mail and for your work on kttsd. I was away for a couple
> of days, which is why I am replying a bit late.
>
> I am forwarding your e-mail to my brother Gunnar, who agreed to become
> kttsd maintainer a few months ago but who hasn't written any kttsd code yet.
>
> There are several reasons why we haven't written much new kttsd code yet.
> Both Gunnar and I have very limited time. I am therefore happy that you
> have joined us and that you are working on kttsd, and I think my brother
> thinks similarly. (Well, he didn't have a chance to read your mail yet, I
> only told him about it on the phone.)
>
> Another reason is that Gunnar and I have been discussing a lot of changes
> for kttsd, which might require rewriting great parts of it. If you are
> willing to work together with us on these big changes, then it would be
> ideal for us to have you as the new kttsd maintainer.
>
> This is a short list of the changes I have in mind: (Of course we are
> always open to be convinced otherwise!)
>
> * At FOSDEM, we met some developers of other speech synthesis related
> projects, and we had the idea to use a common speech device driver
> format. Then we could reuse a number of existing speech device drivers
> and would offer hardware speech device manufacturers a common API to
> support.
> This could be implemented as a kttsd plug-in, so it would fit in with
> existing kttsd code, but the kttsd plug-in API would need to be enhanced
> to include a lot of additional functions like stopping a text or setting
> text markers.

Do you have any specs for this yet?

>
> * As you already mentioned, the DCOP API needs more functionality.
>
> * kttsd currently has quite some lag time when speaking sentences from
> KMouth. This is partly because it is mainly designed to read out long
> texts, which is a functionality not needed by KMouth. I would prefer
> kttsd to be designed for very different applications, including possible
> screen readers, which are completely unsupported in the current kttsd.

Most of the lag time seems to be loading the tts engine.  I've been playing 
with Festival using the Festival Interactive plugin I wrote.  The first 
sentence takes a second or two, because Festival has to load and read the 
voice files etc.  Since it remains running, however, the lag thereafter is 
not bad.  We could perhaps enhance the plugin api to permit pre-loading the 
engine when kttsd is started.
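
To illustrate the pre-loading idea, here is a minimal sketch in Python rather than in the actual kttsd plugin API. It assumes the engine can be driven line by line over pipes, as the Festival Interactive plugin does. The names PreloadedEngine and FAKE_ENGINE are hypothetical, and the "engine" is only a stand-in child process that echoes its input, not Festival itself. The point is that the start-up cost is paid once, in the constructor, and not on the first sentence.

```python
import subprocess
import sys

# Stand-in for a real engine such as Festival (assumption: no real TTS here).
# The child reads one line at a time and reports it as "spoken".
FAKE_ENGINE = [
    sys.executable, "-u", "-c",
    "import sys\n"
    "for line in sys.stdin:\n"
    "    print('spoke: ' + line.strip(), flush=True)\n",
]


class PreloadedEngine:
    """Starts the engine process once and keeps it running between sentences."""

    def __init__(self, command=FAKE_ENGINE):
        # The slow part (process start, voice loading) happens here, once.
        self.proc = subprocess.Popen(
            command,
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            text=True,
            bufsize=1,
        )

    def say(self, sentence):
        # Each sentence only costs a write to the already running process.
        self.proc.stdin.write(sentence + "\n")
        self.proc.stdin.flush()
        return self.proc.stdout.readline().strip()

    def close(self):
        # Closing stdin lets the child see end-of-file and exit.
        self.proc.stdin.close()
        self.proc.wait()
        self.proc.stdout.close()
```

A daemon built this way would create the engine object at start-up and call say() for each sentence, so only the very first construction pays the loading delay.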

I've also been playing with Festival Lite using the Command plugin.  Flite 
needs only about 45ms to load and so it begins speaking more quickly.  
However, it has to be loaded for each sentence.  (BTW, one thing I'm 
wondering about is why the Command plugin runs the command in a shell?)  The 
bad news about Flite is there are only one or two voices available.

>
> * If we wish to integrate kttsd with the KDE message service, then it
> might make sense to write a very small kttsd daemon for kdebase and have
> all extra functionality like the splitting of texts, the GUI and all of
> the plug-ins (apart from the very small command plug-in) in
> kdeaccessibility.

I'm not sure what you mean by "message service".  Do you mean KNotify?

At the very minimum, you want to expose Stop, Pause, and Resume buttons in all 
cases, so I don't see the GUI part of kttsd disappearing entirely.  I agree 
that file reading, sentence parsing, etc., should be moved out of kttsd 
eventually.

>
> We are not sure yet how to define the new APIs. I will post a suggestion
> to the kde-accessibility mailing list in a few days. I hope you can join
> in with other good ideas, so we can turn kttsd into a small but very
> flexible tool both for reading long texts and for reading out short
> messages without delay.

I'm looking forward to it.  Just so you know, I am not much interested in 
"accessibility".  My main goal is to get a robust text-to-speech capability 
working in KDE.  I like to have my computer read long stories and articles to 
me.  I don't want to wait for ATK, so I'm anxious to get started, even if the 
kttsd API has to change later.

>
> Olaf
>
> PS: Regarding your festival questions, I will try to investigate some of
> the issues in the following days, but it would be best if you could
> re-post your questions on the kde-accessibility mailing list, so that
> other people have a chance of responding.
> - --
> Olaf Jan Schmidt, KDE Accessibility Project
> KDEAP co-maintainer, maintainer of http://accessibility.kde.org
>
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.0.6 (GNU/Linux)
> Comment: For info see http://www.gnupg.org
>
> iEYEARECAAYFAkBgbzwACgkQoLYC8AehV8eQaQCbBHB6u5QSV2eQIlcQRaQbWqGD
> UmgAn38NVAt95TJf0IZ6tjil1Jd8uu5i
> =f58Y
> -----END PGP SIGNATURE-----

> Hello,
> 
> I've made a number of enhancements and a few minor fixes to KTTSD.  I
> noticed you've been working on it in the last few months.  I put myself
> down as a maintainer in the README file and About screen and wondered if
> you wanted to add your name as well.
> 
> I haven't been able to get KTTSD to work with the Festival plugin.  It 
> immediately crashes on the call to festival_initialize.  Since libFestival.a  
> is statically linked, my theory is that libFestival.a was compiled with an 
> older, incompatible version of gcc and the gcc libraries.  Can you confirm?  
> Does it work for you?  I have KDE 3.2.1, Festival 1.42, and gcc 3.2.2.
> 
> I tried downloading and building the latest Festival (1.43).  While it 
> compiles just fine on my machine, I can't get it installed.  The make files, 
> by default, assume that the source and install directories are the same.  
> When I tried to build it with /prefix=/usr, it would not build.
> 
> So to avoid this problem altogether, I wrote a new plugin for kttsd called
> Festival (Interactive).  It's in the plugins/festivalint directory.  It runs
> Festival as a KProcIO and communicates with it via pipes.  Seems to work
> pretty well (at least for me), except that I don't have the Force Arts
> option working yet.
> 
> I saw your cvs commit from about 7 weeks ago.  I think, for now, the gui
> stuff should stay in.  What is needed is a richer dcop interface so that
> applications using kttsd can get more feedback.  For instance, apps might
> want to know if kttsd is currently speaking, how many sentences are in the
> queue, and which sentence it is on.
> 
> I'm also wondering if kttsd would make sense as a KIOslave, but I know next
> to nothing about that.  (Got some reading to do. :)
> 
> BTW, I am aware of ATK and Qt 4, but got tired of waiting for them.
> Eventually, kttsd will become obsolete, I suppose.
> 
> -- 
> Gary Cramblitt (aka PhantomsDad)
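
The status queries asked for in the original message (is kttsd speaking, how many sentences are queued, which sentence is it on) could be sketched roughly as follows. This is plain Python, not DCOP, and the class and method names (SpeechQueue, is_speaking, sentence_count, current_sentence) are hypothetical, not part of any existing kttsd interface. The sentence splitting is deliberately naive.

```python
import re


class SpeechQueue:
    """Toy model of the state a richer kttsd interface might expose."""

    def __init__(self):
        self._sentences = []
        self._next = 0        # index of the next sentence to speak
        self._current = None  # index being spoken now, or None when idle

    def set_text(self, text):
        # Naive split after '.', '!' or '?' followed by whitespace.
        parts = re.split(r"(?<=[.!?])\s+", text.strip())
        self._sentences = [p for p in parts if p]
        self._next = 0
        self._current = None

    def speak_next(self):
        # Advance to the next sentence; return None once the queue is done.
        if self._next >= len(self._sentences):
            self._current = None
            return None
        self._current = self._next
        self._next += 1
        return self._sentences[self._current]

    def is_speaking(self):
        return self._current is not None

    def sentence_count(self):
        return len(self._sentences)

    def current_sentence(self):
        return self._current
```

An application would call the three query methods to drive a progress display or to decide whether to enable Stop, Pause, and Resume buttons.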