[Kde-accessibility] kttsd and KSayIt etc...

Gunnar Schmi Dt gunnar at schmi-dt.de
Tue Feb 10 18:27:37 CET 2004


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hello,

As the current maintainer of kttsd I want to give my opinion on kttsd and 
KSayIt:

I would like kttsd to become a general text-to-speech interface for KDE 
applications. This TTS interface should be used by applications such as 
KSayIt (which speaks clipboard contents and other long texts), KMouth 
(which speaks single sentences typed by the user) and possibly 
screen readers.

However, the requirements of these applications differ from each other, 
so we will have to write a speech system that can handle all of those 
requirements.

On Tuesday 10 February 2004 17:22, Gary Cramblitt wrote:
> [...]
> The ideal client would permit the user to: 
>
> 1.  Open a text file and begin speaking.
>
> 2.  Pause or Stop speaking.
>
> 3.  Remember (bookmark) current place in a file for restart at a later
> time.
>
> 4.  Adjust speed and volume while speaking.
>
> 5.  Highlight text on-screen as it is spoken.
>
> 6.  Backup by word, sentence, or paragraph.
>
> 7.  Filter architecture for converting files in other formats to text.
> For example, HTMLtoTXT, PDBtoTXT, etc.
>
> 8.  KPart or Service so that it is available from Konqueror.
> [...]
Well, these features can be implemented using two different approaches:

i. KSayIt passes the complete text to kttsd and kttsd divides the text into 
paragraphs, sentences etc.

In this case kttsd would need to fulfill some requirements:

a) kttsd needs to know how to split the text into pieces. It currently 
contains a simple implementation for that, but I am not sure how stable 
that piece of code is. Additionally, different languages could require 
different rules for splitting the text. This leads to some complexity that 
is only needed by KSayIt, not by KMouth or a screen reader.

b) kttsd needs to give feedback about the position to KSayIt. This feedback 
cannot be as simple as the number of the piece as KSayIt does not know 
anything about the pieces.

c) kttsd needs to be able to start a text at a given position. Again, the 
position cannot simply be the number of the piece as KSayIt does not know 
about the pieces.

ii. KSayIt cuts the text into pieces and sends them one by one (or as a 
list of pieces) to kttsd. In this case requirement a) is placed onto 
KSayIt. The feedback for b) is as simple as the number of the piece, and 
c) can be implemented by sending the pieces starting at the given 
position. (Backing up can be implemented by invalidating all sent pieces 
and sending new pieces.)


Overall I think that ii. is the better architecture. 
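
A minimal sketch of approach ii., written in Python only for 
illustration. All names here (split_into_pieces, FakeSpeechService, 
Reader) are hypothetical and are not part of kttsd or KSayIt, and the 
sentence splitter is deliberately naive. It only shows that, once the 
client owns the pieces, position feedback, restarting and backing up all 
reduce to piece numbers:

```python
import re


def split_into_pieces(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    pieces = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in pieces if p]


class FakeSpeechService:
    """Hypothetical stand-in for the speech daemon: it queues numbered pieces."""

    def __init__(self):
        self.queue = []

    def enqueue(self, number, piece):
        self.queue.append((number, piece))

    def invalidate(self):
        # Drop every piece that was sent but has not been spoken yet.
        self.queue.clear()

    def speak_next(self):
        # Pretend to speak one piece; return its number, or None when idle.
        if not self.queue:
            return None
        number, _piece = self.queue.pop(0)
        return number


class Reader:
    """Hypothetical client: it owns the pieces and the current position."""

    def __init__(self, text, service):
        self.pieces = split_into_pieces(text)
        self.service = service
        self.position = 0  # number of the next piece to speak (the bookmark)

    def start_at(self, number):
        # Requirement c): restart by sending pieces from the given number on.
        self.service.invalidate()
        self.position = max(0, min(number, len(self.pieces)))
        for n in range(self.position, len(self.pieces)):
            self.service.enqueue(n, self.pieces[n])

    def on_piece_spoken(self, number):
        # Requirement b): the feedback is just the number of the piece.
        self.position = number + 1

    def back_up(self, count=1):
        # Backing up: invalidate what was sent and resend from earlier.
        self.start_at(self.position - count)
```

The bookmark from the feature list is then simply the value of 
position, which the client can store and later pass to start_at().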

> (Note: It is possible to get Festival to convert a text file to a wav
> file and then play it in any media player.  The trouble with this
> approach is 1) it takes a long time and a lot of disk space to produce
> the wav file from a large text file, and 2) it is not possible to
> bookmark play and come back later for restart.)
> [...]
If you cut the text into pieces (e.g., sentences) you can convert the text 
piece by piece. While one piece is being played you convert the next one. 
This way you do not hear a gap between the pieces, and the space needed for 
the sound files depends only on the size of the pieces (not on the 
number of pieces).
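
The overlap of converting and playing can be sketched as a small 
producer/consumer pipeline. This is only an illustration in Python: 
speak_pipelined, synthesize, play and lookahead are hypothetical names, 
and the real conversion (for example by Festival) and playback are 
passed in as plain functions:

```python
import queue
import threading


def speak_pipelined(pieces, synthesize, play, lookahead=1):
    """Play pieces in order while converting up to `lookahead` pieces ahead.

    `synthesize(piece)` and `play(audio)` are supplied by the caller; they
    are placeholders for the real conversion and playback steps.
    """
    ready = queue.Queue(maxsize=lookahead)  # bounded buffer of converted audio
    done = object()                         # marker: no more pieces

    def producer():
        try:
            for number, piece in enumerate(pieces):
                # put() blocks while the buffer is full, so only a small,
                # fixed number of converted pieces exists at any time.
                ready.put((number, synthesize(piece)))
        finally:
            # Always unblock the consumer, even if synthesize() failed.
            ready.put(done)

    worker = threading.Thread(target=producer, daemon=True)
    worker.start()

    played = []
    while True:
        item = ready.get()
        if item is done:
            break
        number, audio = item
        play(audio)            # the producer converts the next piece meanwhile
        played.append(number)  # numbers of the pieces played so far
    worker.join()
    return played
```

Because the buffer is bounded, the storage needed is a small constant 
number of pieces regardless of how long the whole text is.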

Gunnar Schmi Dt
- -- 
Co-maintainer of the KDE Accessibility Project
Maintainer of the kdeaccessibility package
http://accessibility.kde.org/
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.2 (GNU/Linux)

iD8DBQFAKRSPsxZ93p+gHn4RAsW6AKCRo8pCgsC1Kx5J5lm4FBFU9jg6vQCZAWUS
sqYxfoTM6U4J6IN+5QPjgoI=
=FR56
-----END PGP SIGNATURE-----

