[Kde-accessibility] kttsd and KSayIt etc...
Gunnar Schmi Dt
gunnar at schmi-dt.de
Tue Feb 10 18:27:37 CET 2004
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Hello,
As the current maintainer of kttsd I want to give my opinion on kttsd and
KSayIt:
I would like to see kttsd become a general text-to-speech interface
for KDE applications. This TTS interface should be used by applications
such as KSayIt (which speaks clipboard contents and other long texts),
KMouth (which speaks single sentences typed by the user) and possibly
screen readers.
However, the requirements of these applications differ, so we will have
to write a speech system that can handle all of them.
On Tuesday 10 February 2004 17:22, Gary Cramblitt wrote:
> [...]
> The ideal client would permit the user to:
>
> 1. Open a text file and begin speaking.
>
> 2. Pause or Stop speaking.
>
> 3. Remember (bookmark) current place in a file for restart at a later
> time.
>
> 4. Adjust speed and volume while speaking.
>
> 5. Highlight text on-screen as it is spoken.
>
> 6. Back up by word, sentence, or paragraph.
>
> 7. Filter architecture for converting files in other formats to text.
> For example, HTMLtoTXT, PDBtoTXT, etc.
>
> 8. KPart or Service so that it is available from Konqueror.
> [...]
Well, these features can be implemented using two different approaches:
i. KSayIt passes the complete text to kttsd and kttsd divides the text into
paragraphs, sentences etc.
In this case kttsd would need to fulfill some requirements:
a) kttsd needs to know how to split the text into pieces. It currently
contains a simple implementation for that, but I am not sure how stable
that piece of code is. Additionally, different languages could require
different rules for splitting the text. This adds complexity that is
needed only by KSayIt, not by KMouth or a screen reader.
b) kttsd needs to give feedback about the position to KSayIt. This feedback
cannot be as simple as the number of the piece, since KSayIt does not know
anything about the pieces.
c) kttsd needs to be able to start a text at a given position. Again, the
position cannot simply be the number of the piece, since KSayIt does not
know about the pieces.
ii. KSayIt cuts the text into pieces and sends them one by one (or as a
list of pieces) to kttsd. In this case requirement a) is placed onto KSayIt.
The feedback for b) is as simple as the number of the piece, and c) can be
implemented by sending the pieces starting at the given position. (Backing
up can be implemented by invalidating all sent pieces and sending new
pieces.)
Overall I think that ii. is the better architecture.
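
A minimal Python sketch of approach ii., for illustration only. The names
split_into_pieces, FakeSpeechService and Reader are hypothetical and are
not part of kttsd or KSayIt; the splitting rule is a naive stand-in that
would need language-specific rules in practice. The client owns the pieces,
the service only sees numbered pieces, and backing up is done by
invalidating the queue and resending.

```python
import re


def split_into_pieces(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    pieces = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in pieces if p]


class FakeSpeechService:
    """Hypothetical stand-in for kttsd: it only queues numbered pieces."""

    def __init__(self):
        self.queue = []

    def enqueue(self, number, piece):
        self.queue.append((number, piece))

    def invalidate(self):
        """Drop every piece that was sent but not yet spoken."""
        self.queue.clear()

    def speak_next(self):
        """Pretend to speak one piece; return its number as feedback (b)."""
        if not self.queue:
            return None
        number, _piece = self.queue.pop(0)
        return number


class Reader:
    """Client side (the role KSayIt would play): owns pieces and position."""

    def __init__(self, text, service):
        self.pieces = split_into_pieces(text)  # requirement a) lives here
        self.service = service

    def start_at(self, position):
        """Requirement c): invalidate, then send pieces from `position` on."""
        self.service.invalidate()
        for number in range(position, len(self.pieces)):
            self.service.enqueue(number, self.pieces[number])
```

With this split of responsibilities, a bookmark is just a piece number
stored by the client, and "back up one sentence" is start_at(current - 1).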
> (Note: It is possible to get Festival to convert a text file to a wav
> file and then play it in any media player. The trouble with this
> approach is 1) it takes a long time and a lot of disk space to produce
> the wav file from a large text file, and 2) it is not possible to
> bookmark play and come back later for restart.)
> [...]
If you cut the text into pieces (e.g., sentences) you can convert the text
piece by piece. While one piece is being played you convert the next one.
This way you do not hear a gap between the pieces, and the space needed for
the sound files depends only on the size of the pieces (not on the number
of pieces).
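
The overlap described above can be sketched in Python as a small
producer/consumer pipeline. This is only an illustration under assumptions:
speak_pipelined, synthesize and play are hypothetical names, and the two
callables stand in for whatever the synthesizer (e.g. Festival) and the
audio player would actually provide. A bounded queue keeps the number of
synthesized pieces held at any one time constant, regardless of how long
the text is.

```python
import queue
import threading


def speak_pipelined(pieces, synthesize, play, lookahead=1):
    """Play pieces in order while a worker thread synthesizes ahead.

    synthesize(piece) -> audio and play(audio) are supplied by the caller;
    they are placeholders for a real synthesizer and a real player.
    """
    ready = queue.Queue(maxsize=lookahead)
    done = object()  # sentinel marking the end of the text

    def producer():
        for piece in pieces:
            # put() blocks while the look-ahead buffer is full, so the
            # worker never runs more than a few pieces ahead of playback.
            ready.put(synthesize(piece))
        ready.put(done)

    worker = threading.Thread(target=producer, daemon=True)
    worker.start()

    while True:
        audio = ready.get()
        if audio is done:
            break
        play(audio)  # the worker synthesizes the next piece meanwhile
    worker.join()
```

A larger lookahead trades a little more temporary storage for more
tolerance of pieces that take unusually long to synthesize.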
Gunnar Schmi Dt
- --
Co-maintainer of the KDE Accessibility Project
Maintainer of the kdeaccessibility package
http://accessibility.kde.org/
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.2 (GNU/Linux)
iD8DBQFAKRSPsxZ93p+gHn4RAsW6AKCRo8pCgsC1Kx5J5lm4FBFU9jg6vQCZAWUS
sqYxfoTM6U4J6IN+5QPjgoI=
=FR56
-----END PGP SIGNATURE-----