[kde-community] The Future of Speech Recognition in KDE: Proposal

Carl Symons carlsymons at gmail.com
Sat Aug 31 15:11:52 UTC 2013

On Sat, Aug 31, 2013 at 2:59 AM, Peter Grasch <peter at grasch.net> wrote:
> Hello,
> for those of you that do not yet know me, my name is Peter Grasch and I
> currently maintain the Simon project (http://simon.kde.org), a speech
> recognition project in KDE's extragear.
> Over the course of the summer, I have been working on bringing dictation
> capabilities to Simon (more info & demo video: http://grasch.net/node/22).
> Now, I'm trying to build up a network of developers and researchers that
> work together on building high accuracy, large vocabulary speech
> recognition systems for a variety of domains (desktop dictation being
> one of them).
> Building such systems using free software and free resources requires a
> lot of work in many different areas (software development, signal
> processing, linguistics, etc.).
> In order to facilitate collaboration and to establish a sustainable
> community between volunteers of such diverse backgrounds, I am convinced
> that the right organizational structure is crucial to ensuring continued
> long-term success.
> Naturally, as a KDE contributor, I would like to launch this project as
> part of KDE. I talked to quite a number of the people who expressed
> interest in taking up an active role in this effort, and this is what we
> would like to propose:
> * A new category in KDE's extragear called "Speech" (putting it on the
> same level as e.g., "Network"). Rationale: Not all speech recognition
> applications are necessarily related to accessibility (e.g., lecture
> transcription) and splitting up the projects into different categories
> would hinder collaboration.
> * Creating the "open speech group" (name still a work in progress) and
> setting up a project page for it. This would serve as little else than a
> common label for all projects that are part of the initiative -
> basically the equivalent of "KDE Multimedia Team" but for speech instead
> of multimedia. Rationale: A common brand makes it easier to market and
> represent the collective effort of all sub-projects.
> I've obviously read the KDE manifesto carefully and I think that such a
> group would be in line with the overall spirit, even though there are
> some details that I feel the need to point out explicitly:
> Some of the sub-projects may not necessarily be about end-user software
> or even software at all (e.g., speech modeling). However, please keep in
> mind that this is a sub-project of a larger initiative that is very much
> about end-user software; splitting the speech modeling into a separate
> project just makes sense because it's an ambitious project in its own
> right.
> Some of the sub-projects may appear to diverge from "established practices"
> (by not using C++, for example) but that is mostly because there won't
> be any similar KDE projects (for example, somebody is already working on
> a web-based transcriber system based on Ruby on Rails) or "special
> considerations" (e.g., an application for Mac OS X may use the native
> toolkit because the KDE infrastructure for OS X is not sufficiently mature).
> I'm posting this here on the community list because I want to hear your
> thoughts on the proposal. Do you think that the 'open speech group'
> would fit within KDE?
> Best regards,
> Peter


"Accessibility" is an important aspect of the Simon project, but it can
also be limiting, as you explain.

Tablets and smartphones are mostly for content consumption rather than
creation. Adding speech recognition to Plasma Active would be nifty.

