Randa Meeting: Notes on Voice Control in KDE
Thomas Pfeiffer
thomas.pfeiffer at kde.org
Mon Sep 18 13:00:40 BST 2017
> On 16. Sep 2017, at 00:08, Aditya Mehra <aix.m at outlook.com> wrote:
>
> Hi Everyone :),
>
> First, I would like to start off by introducing myself: I am Aditya, and I have been working on the Mycroft-Plasma integration project for some time. This includes front-end work such as a plasmoid as well as back-end integration with various Plasma desktop features (KRunner, Activities, KDE Connect, wallpapers, etc.).
>
> I have carefully read through the email and would like to add some points to this discussion. (P.S. Please don't consider me partial to the Mycroft project in any way. I am not employed by them, but I am contributing full time out of my love for Linux as a platform and the wish to have voice control over my own Plasma desktop environment.) Apologies in advance for the long email, but here are some thoughts and points I would like to add to the discussion:
>
> a) Mycroft AI is an open source digital assistant that tries to bring what proprietary operating systems offer with their AI assistants and voice control platforms, such as Google Now, Siri, Cortana, and Bixby, to an open source environment.
>
> b) The Mycroft project is based on the same principles of having a conversational interface with your computer, but it maintains privacy and independence based on the user's own choice (explained below).
>
> c) The basics of how Mycroft works:
> Mycroft AI is written in Python and mainly runs four services:
> i) The first service is a websocket server, more commonly referred to as the messagebus, which is responsible for creating the websocket server and accepting connections so that clients (for example a plasmoid, mobile, hardware, etc.) can talk to each other.
> ii) The second service is the "Adapt" intent parser, which acts as a platform to understand the user's intent, for example "open firefox", "create a new tab", or "dict mode", with multi-language support, and performs the action that the user states.
> iii) The third service is the STT (speech-to-text) service. It is responsible for converting speech to text, which is then sent over to the Adapt interface to perform the specified intent.
> iv) The fourth service is called "Mimic". Much like the "espeak TTS engine", it converts text to speech, except that Mimic does it with customized voices and supports various formats.
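
[Illustration of point c), not part of Aditya's email.] The flow between the four services can be sketched as a toy, in-process message bus. This is only a rough sketch under stated assumptions: ToyBus, run_demo, and the message types "utterance", "intent.open", and "speak" are hypothetical names chosen for the example and are not Mycroft's actual API. The real messagebus is a websocket server, and the real intent parser is Adapt.

```python
import json
from collections import defaultdict


class ToyBus:
    """In-process stand-in for the messagebus: routes JSON messages by type."""

    def __init__(self):
        self.handlers = defaultdict(list)

    def on(self, msg_type, handler):
        # Register a handler for one message type.
        self.handlers[msg_type].append(handler)

    def emit(self, msg_type, data):
        # Serialize and parse again to mimic a message crossing a socket.
        wire = json.dumps({"type": msg_type, "data": data})
        message = json.loads(wire)
        for handler in self.handlers[message["type"]]:
            handler(message["data"])


def run_demo(text):
    """Push one already-transcribed utterance through the toy pipeline."""
    bus = ToyBus()
    spoken = []

    def parse_intent(data):
        # Toy intent parser: the first word is the verb, the rest is the target.
        verb, _, target = data["utterance"].partition(" ")
        if verb == "open" and target:
            bus.emit("intent.open", {"application": target})
        else:
            bus.emit("speak", {"text": "Sorry, I did not understand."})

    def handle_open(data):
        # A real skill would launch the application here.
        bus.emit("speak", {"text": "Opening " + data["application"]})

    bus.on("utterance", parse_intent)    # text as an STT service would deliver it
    bus.on("intent.open", handle_open)   # a skill subscribing to an intent
    bus.on("speak", lambda data: spoken.append(data["text"]))  # TTS stand-in
    bus.emit("utterance", {"utterance": text})
    return spoken


if __name__ == "__main__":
    print(run_demo("open firefox"))
```

The point of the sketch is only the decoupling: the speech-to-text side, the intent parser, the skill, and the text-to-speech side never call each other directly; they only exchange typed messages over the bus.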
>
> d) The Mycroft project is released under the Apache license, which means it is completely open and customizable. Every interested party can fork it into their own customized environment, or even drastically rewrite parts of the back end to suit their own use case, and can host their own instance if they feel that upstream mycroft-core is not able to reach the level of functionality they need. Additionally, Mycroft can also be configured to run headless.
>
> e) With regard to privacy concerns and the use of Google STT, the upstream Mycroft community is already working towards moving to Mozilla DeepSpeech as their main STT engine as it matures (one of their top-ranked goals). On the sidelines there are already forks that use completely offline STT interfaces, for example the "jarbas ai fork", and everyone in the community is trying to integrate with more open source trained voice models such as CMU Sphinx. Sadly, I would call this a battle of data availability and community voice contributions against an existing Google-trained engine with the advantages of proprietary multi-language support and highly trained voice models.
>
> f) The upstream Mycroft community is still very new compared to larger open source projects, but it is very open to interacting with everyone from the KDE community and its developers to extend their platform to the Plasma desktop environment, and it is committed to supporting this effort in every way. That includes me: I am constantly looking forward to integrating even more with Plasma and KDE applications and projects on all fronts, including cool functionality, accessibility, dictation mode, etc.
>
> g) Some goodies about Mycroft I would like to add: the "hey mycroft" wake word is completely customizable, and you can change it to whatever suits your taste (whatever phonetic names PocketSphinx accepts). Additionally, as a community you can decide not to use Mycroft's servers or services at all and define your own API settings for things like Wolfram Alpha, wake words, and other API calls, including telemetry data and STT. Even today there is no requirement to use Google STT or the default Mycroft Home API services.
>
> h) As the project is based on Python, the best way I have found to interact with Plasma services is through D-Bus interfaces, and the more applications open up their functionality over D-Bus, the faster we can integrate voice control on the desktop. On the technical side this approach is not limited to D-Bus: developers who prefer not to use D-Bus can choose to expose functionality directly by using C types in the functions they would like to make available to voice interaction.
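
[Illustration of point h), not part of Aditya's email.] A minimal sketch of how a Python skill could reach a desktop feature over D-Bus by building a command line for the standard dbus-send tool. The helper names build_dbus_call and krunner_query are hypothetical, and the KRunner service name, object path, and interface used here (org.kde.krunner, /App, org.kde.krunner.App.query) are assumptions that should be checked against the running session, for example with qdbus, before relying on them.

```python
def build_dbus_call(service, path, interface, method, *string_args):
    """Return a dbus-send command line for a session-bus method call."""
    cmd = [
        "dbus-send",
        "--session",             # talk to the user's session bus
        "--type=method_call",
        "--print-reply",
        "--dest=" + service,
        path,
        interface + "." + method,
    ]
    # dbus-send expects typed arguments, e.g. string:firefox
    cmd.extend("string:" + arg for arg in string_args)
    return cmd


def krunner_query(text):
    # Assumed names, not taken from the email: KRunner's service,
    # object path, and interface as exposed on a Plasma 5 session.
    return build_dbus_call(
        "org.kde.krunner", "/App", "org.kde.krunner.App", "query", text
    )


if __name__ == "__main__":
    # Only print the command; nothing is sent to the bus here.
    print(" ".join(krunner_query("firefox")))
```

On a machine with a running session bus, the returned list could be handed to subprocess.run. A skill that makes many calls would more likely use a D-Bus binding directly rather than spawning a process per call.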
>
> i) There are already awesome Mycroft skills being developed by the open source community, including interaction with the Plasma desktop and things like Home Assistant, Mopidy, Amarok, Wikipedia (migrating to Wikidata), OpenWeather, other desktop applications, and many cloud services such as image recognition, and more at: https://github.com/MycroftAI/mycroft-skills
>
> j) Personally and on behalf of upstream, I would like to invite everyone interested in taking voice control and interaction with digital assistants forward on the Plasma desktop and Plasma Mobile platforms to join the Mycroft Mattermost chat at https://chat.mycroft.ai, where we can create our own KDE channel and talk directly to the upstream Mycroft team. They are more than happy to interact one-to-one with everyone from KDE about queries and concerns, and to take voice control and digital assistance to the next level. Alternatively, we could use an IRC channel where everyone, including myself and upstream, can interact to take this forward.
>
> Regards,
> Aditya
Thank you for taking the time to provide more context about Mycroft and the integration with our software, Aditya!
I have a feeling that there wasn’t enough knowledge about that in KDE up until now, but now at least for me it’s much clearer.
Thanks,
Thomas