[Kde-accessibility] KSpeech

Thu Mar 6 16:13:19 UTC 2014

On Thu, Mar 6, 2014 at 6:43 AM, Frederik Gladhorn <gladhorn at kde.org> wrote:
> On Wednesday 5 March 2014 23.04.12, Jeremy Whiting wrote:
>> Took a quick read through that just now and it looks pretty promising
>> from what I saw. I guess I don't know my way around gerrit very well
>> because I couldn't see a place to comment on the code like
>> reviewboard.
>> Really the only difference between jovie and that class are the following:
>> 1. jovie has some old code and ui to control jobs at a fine grain that
>> spd doesn't expose really well, so I left it out when I ported ktts to
>> spd.
>
> I would like to expose "voices" and "languages" in a sensible fashion. This is
> tricky to get right cross-platform. I started with something on Linux but
> decided to implement other backends first before attempting to implement voice
> selection.
> For language/locale I think qtspeech should default to the system locale and
> let the user select a different one.

Using the system locale as default makes sense. By "voices", do you
mean something like spd's voice type (male1, male2, female1, etc.)?
Ktts had a complex system of specifying a voice with xml with
language, voice type, speed, pitch, etc. attributes, and if an
attribute was empty it meant any voice with the other attributes was
acceptable. I think that's a bit too fine-grained for most cases,
though. Most uses I can think of just want to choose the voice type,
or even just the gender, and let the user/defaults choose the rest.
If a more complex specification is wanted, applications could always
use ssml to change the voice as part of the text they send to qtspeech.
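
To make the "empty attribute means any voice is acceptable" rule concrete,
here is a minimal sketch in plain C++. It is not the ktts code and not a
qtspeech API; VoiceSpec, fieldMatches, matches and findVoice are names
made up for illustration, and only two attributes are modeled.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical voice description (an assumption for this sketch).
// An empty field in a request means "any value is acceptable".
struct VoiceSpec {
    std::string language;   // e.g. "en"
    std::string voiceType;  // e.g. "female1"
};

// A requested field matches if it is empty or equal to the actual value.
bool fieldMatches(const std::string &wanted, const std::string &actual) {
    return wanted.empty() || wanted == actual;
}

// A voice satisfies a request when every requested field matches.
bool matches(const VoiceSpec &wanted, const VoiceSpec &voice) {
    return fieldMatches(wanted.language, voice.language)
        && fieldMatches(wanted.voiceType, voice.voiceType);
}

// Returns the index of the first acceptable voice, or -1 if none fits.
int findVoice(const VoiceSpec &wanted, const std::vector<VoiceSpec> &available) {
    for (std::size_t i = 0; i < available.size(); ++i) {
        if (matches(wanted, available[i]))
            return static_cast<int>(i);
    }
    return -1;
}
```

An application that only cares about the voice type would leave the
language empty and take whatever the first matching voice is.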

>
>> 2. user defined filters with some sane/useful defaults (if we were to
>> use QtSpeech for kde notifications, set konvi to speak all messages,
>> there's not a way to let the user say change "<jpwhiting> fregl: you
>> rock" into "jpwhiting says fregl you rock")
>
> Maybe. I'd rather keep qtspeech very simple. My goals were to make it a tiny
> library that is lean, fast and async by using signals and slots.
> I want it to be good enough to be used in apps that use voice navigation, but
> also when writing a screen reader. Some level of configuration is required in
> any case. Let's come up with a good api that makes sense across platforms,
> then I'm in.

Right, simple is definitely good. I'm just wondering if it could
accept plugins that implement some filtering method to filter the
text. Then filters could be as simple as a regex to convert
xml/html/etc. text into something that makes sense audibly like that
example from irc, or a complex filter plugin to change the voice could
inject ssml into the text. Maybe something like

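class QAbstractSpeechFilter {
public:
    virtual ~QAbstractSpeechFilter() {}
    // Returns the text to speak after this filter has been applied.
    virtual QString filterText(const QString &text) = 0;
};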

Then a simple filter manager (or even part of the existing class) loads
the plugins, and when say() is called it passes the text through each
plugin's filterText() method in turn.
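
A rough sketch of that chain, just to show the shape of the idea. It
uses std::string and std::regex instead of QString so it stands alone;
SpeechFilter, RegexFilter, FilterManager and makeDefaultFilters are all
hypothetical names, and real plugin loading is left out entirely.

```cpp
#include <memory>
#include <regex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical filter interface (std::string stands in for QString).
class SpeechFilter {
public:
    virtual ~SpeechFilter() = default;
    virtual std::string filterText(const std::string &text) const = 0;
};

// Simplest kind of filter: one regex search-and-replace.
class RegexFilter : public SpeechFilter {
public:
    RegexFilter(const std::string &pattern, std::string replacement)
        : m_pattern(pattern), m_replacement(std::move(replacement)) {}

    std::string filterText(const std::string &text) const override {
        return std::regex_replace(text, m_pattern, m_replacement);
    }

private:
    std::regex m_pattern;
    std::string m_replacement;
};

// Runs the text through every registered filter, in registration order.
class FilterManager {
public:
    void addFilter(std::unique_ptr<SpeechFilter> filter) {
        m_filters.push_back(std::move(filter));
    }

    std::string apply(std::string text) const {
        for (const auto &filter : m_filters)
            text = filter->filterText(text);
        return text;
    }

private:
    std::vector<std::unique_ptr<SpeechFilter>> m_filters;
};

// Example defaults (assumptions): an irc rewrite, then a markup stripper.
FilterManager makeDefaultFilters() {
    FilterManager manager;
    // "<nick> target: message" becomes "nick says target message".
    manager.addFilter(std::unique_ptr<SpeechFilter>(
        new RegexFilter(R"(^<([^>]+)>\s*([^:\s]+):\s*)", "$1 says $2 ")));
    // Drop any remaining <...> tags.
    manager.addFilter(std::unique_ptr<SpeechFilter>(
        new RegexFilter(R"(<[^>]*>)", "")));
    return manager;
}
```

Order matters here: the irc filter has to run before the tag stripper,
otherwise "<jpwhiting>" would be removed as if it were markup. A filter
that injects ssml would just be another SpeechFilter subclass in the
same chain.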

Is there some other Qt library or class that takes plugins for
specific functionality we could use as inspiration for making this
work and look clean also?

>
>> 3. user configurability (As a user I can't set up which voice I would
>> like all speech-using applications to use)
>
> As with other Qt libs, this is more for the platform to set up. Currently
> qtspeech uses whatever voice is selected system wide (aka the default). I
> think that is the right approach - follow what we get from the platform.
> For KDE I'd thus suggest creating a configuration module which lets the user
> choose the platform defaults.

Yeah, each platform could have its own configuration of the defaults,
sure. The only part missing is a real-time configuration change. For
example, if Jovie is reduced to a kcm to configure speech-dispatcher's
default voice and I start listening to a pdf from okular or something
and decide I need the pitch to be lower, changing the default voice
won't change the voice that speech-dispatcher is already using to read
the pdf. Maybe that could be fixed with a patch to speech-dispatcher
to accept immediate default changes though, I'll have to think about
that.
>
>> 4. dbus, though this isn't as important if each application that uses
>> speech links to the library and speech-dispatcher or the system apis
>> do the async for us already anyway as you said.
> I don't see a point in adding dbus into the mix indeed. One thing that is
> interesting though is what kind of effect you get when opening the speech
> backend from two apps at the same time.
>
>> Items 1 and 4 will be irrelevant in a KF5 world but I'm wondering how
>> 2 and 3 could be added either to qtspeech itself or as a kspeech
>> library that wraps qtspeech for kde applications to use.
>>
>> Any thoughts on that? I would be pretty interested in helping with
>> qtspeech if it greatly simplifies or even deprecates jovie, as it looks
>> like it possibly could.
>
> I'd be more than happy to get contributions of course. I cannot promise much
> from my side, of course I'd like to continue working on this project as time
> permits (so far it really is a spare time thing).

Yep, that's completely understandable, np.

thanks,
Jeremy

>
> Greetings,
> Frederik
>
>
>> Jeremy
>>
>> On Wed, Mar 5, 2014 at 12:29 PM, Frederik Gladhorn <gladhorn at kde.org> wrote:
>> > On Tuesday 4. March 2014 16.43.10 Jeremy Whiting wrote:
>> >> Hello all, I've realized a bit ago that kspeech was not included in
>> >> the kdelibs split (probably because it was in staging at the time and
>> >> didn't conform to the other framework policies yet). I've cleaned it
>> >> up a bit and put it in my scratch space, but have some architectural
>> >> questions about it before I make it a proper framework.
>> >>
>> >> 1. The KSpeech dbus interface is old and showing its age. Many of the
>> >> methods are no longer implemented in the application itself since it
>> >> was ported to speech-dispatcher. One thing I would definitely like to
>> >> do is clean up/remove methods that aren't implemented currently (and
>> >> possibly re-add some later on if speech-dispatcher gets better/more
>> >> support for job control, etc.). So the question about this is: is KF5
>> >> time a good time to drop/clean up the dbus interface?
>> >>
>> >> 2. The KSpeech interface that was in kdelibs/interfaces is just that: a
>> >> dbus interface only. I would like to make it a proper
>> >> library/framework with a QObject based class for talking to Jovie (the
>> >> application that implements the KSpeech dbus interface) and wonder if
>> >> other things such as what's currently in jovie/libkttsd should be in
>> >> the kspeech library also. If I move code from jovie into libkspeech
>> >> (or merge kspeech interface into libkttsd and make libkttsd a
>> >> framework likely renamed to libkspeech since libkttsd isn't a public
>> >> library anyway and has the old ktts name) what's the best way to
>> >> preserve the history of both the kspeech interface and libkttsd
>> >> sources? Didn't the plasma or kde-workspaces split do something fancy
>> >> with git where old history pointed to the old git repo somehow?
>> >> Along with this, if libkspeech is defining the kspeech dbus interface
>> >> and has a class to talk to that interface, does the interface still
>> >> need to be in servicetypes like the dbustexttospeech.desktop file that
>> >> was installed in /usr/share/kde4/servicetypes in kde4 times?
>> >
>> > In case you are interested, I started a cross platform library for tts
>> > here:
>> > https://codereview.qt-project.org/#admin,project,qt/qtspeech,info
>> > It's a regular Qt module providing a library that currently consists of
>> > one class.
>> >
>> > It is currently quite incomplete because it lacks voice/language
>> > configuration.
>> > On the up side, I implemented basic backends for win/mac/android/linux.
>> > Linux is using speech-dispatcher, but I was quite dissatisfied with spd's
>> > API. For example it lacks proper free functions for the structs it
>> > allocates - so one has to basically leak them.
>> > I didn't dare looking at Jovie/kttsd since I used the Qt license.
>> >
>> > Greetings,
>> > Frederik
>> >
>> >> thanks,
>> >> Jeremy
>> >>
>> >> _______________________________________________
>> >> Kde-frameworks-devel mailing list
>> >> Kde-frameworks-devel at kde.org
>> >> https://mail.kde.org/mailman/listinfo/kde-frameworks-devel
>> >
>