Date for Simon IRC meeting: 10th of January at 10pm (GMT+1)

Δημήτρης Παπαντωνίου papantoniou.dimitris at gmail.com
Mon Jan 9 23:40:59 UTC 2017


Hi Mario!

This is an excellent idea, to arrange a chat to discuss the future of
Simon. I had been afraid that the project was actually dead...

My name is Dimitris and I've been playing around with Simon since its very
early days, trying to build a Greek language model now and again in my
spare time (I just discovered my first recorded samples, dated 2009, on my
dad’s ancient laptop this Christmas!). I guess I am also the least
technical of those coming tomorrow, so I thought I would write down a
non-technical guy’s thoughts separately.

My experience with Simon all these years has basically been one of
continuous stress: it has been the main reason I learnt what a terminal and
compiling are :-) And although these things may be of interest to many of
you, it has not been fun for me. Typical scenario: I was in the mood to
work on my Greek model a couple of times per year. I would install Simon
(easy). Then I would try to compile the HTK. It would take anything between
five afternoons and a couple of weeks. By the time I had managed to google
the incomprehensible errors, I was no longer in the mood to record samples
or work on my phonetic dictionary, so I would quit for a few months. I
ended up keeping a separate, otherwise unused, Mint partition on my laptop
for several years, preserved for the sole reason that it had Simon, HTK and
sphinxtrain prealpha all successfully installed, and I was afraid of losing
the installation and having to start over again.

The last time I tried to install it, this year, it went a lot easier, I
must admit (either I have got better, searching the internet for errors has
become easier, or I was lucky). On the other hand, I have tried to install
Peter’s Lera on at least three occasions, and all of them failed on some
KDE environment variables which I have no idea what to do with...

Moreover, a few years ago I spent a long time creating a Greek dictionary
with a couple of hundred thousand words, annotating each as article, verb,
noun etc. I was lured by Peter’s use of the word "grammar" in Simon at that
time... It was later, in 2012 or 2013, when he explained the notion of
n-grams in an email, that I felt extremely stupid for wasting my time...

So, in short: from the point of view of someone who is neither technical
nor a speech researcher, just someone who has played a lot with Simon and
read a few papers on speech recognition as a hobby, I would like to suggest
that we focus on simplicity and clear instructions in the following cases:

1) Simon and all dependencies should be installable with a click. I know
HTK is restrictively licensed, but sphinxtrain could likely be included
(Kaldi??). Let’s create a snap, flatpak or anything that just works out of
the box. Five years ago I discussed with a friend who was teaching a speech
therapy class the idea of having all the students contribute samples... We
would have had a full acoustic model with several tens of hours within a
week, but we failed because the students were unable to install it...

2) Simon should be a language-agnostic program, and it should be easy for
non-savvy users to build user models from scratch in non-English languages.
We should clearly document, step by step, a suggested process: a how-to for
building acoustic models, language models, dictionaries etc. Anyone should
be able to get started without wasting time (see my story above of losing
tens of hours building my dictionary). Also which programs to use, and how
to run them: I am still having a nightmare trying to process (and have for
the time being abandoned) a corpus of a few gigs of text for n-gram
creation; running Python stuff in a console is just too complicated.
Anyway, I have started documenting my steps for my Greek model, and I can
help with this, as I guess I am not a lot of use with the more technical
stuff.
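
To make the n-gram step a bit more concrete, here is a minimal sketch of
what "counting n-grams in a text corpus" means. It is only an illustration,
not the procedure Simon or any particular toolkit uses: the function names,
the whitespace tokenizer and the `<s>` / `</s>` padding markers are
assumptions made for this example, and a real language model would also
need smoothing, which a dedicated language-modelling toolkit takes care of.
The file is read line by line, so the text itself never has to fit in
memory, although the table of counts for a multi-gigabyte corpus may still
grow large.

```python
import sys
from collections import Counter
from typing import Iterable, Iterator, List, Tuple


def tokens(line: str) -> List[str]:
    # Naive tokenizer (an assumption): lowercase and split on whitespace.
    # str.lower() handles Greek and other non-ASCII text in Python 3.
    return line.lower().split()


def ngrams(words: List[str], n: int) -> Iterator[Tuple[str, ...]]:
    # Pad the sentence so that its start and end are counted as context.
    padded = ["<s>"] * (n - 1) + words + ["</s>"]
    for i in range(len(padded) - n + 1):
        yield tuple(padded[i:i + n])


def count_ngrams(lines: Iterable[str], n: int = 3) -> Counter:
    # Treat every non-empty line as one sentence and tally its n-grams.
    counts: Counter = Counter()
    for line in lines:
        words = tokens(line)
        if words:
            counts.update(ngrams(words, n))
    return counts


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: python count_ngrams.py corpus.txt
    with open(sys.argv[1], encoding="utf-8") as corpus:
        for gram, freq in count_ngrams(corpus, n=3).most_common(20):
            print(freq, " ".join(gram))
```

Run on a plain UTF-8 text file with one sentence per line, this prints the
twenty most frequent trigrams, which is a quick sanity check that a corpus
is clean enough before feeding it to a proper toolkit.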

3) I've always heard that we should collect more training data. I am not
sure if this is really the bottleneck for accuracy, but look at the
VoxForge project, stuck at half its goal after so many years for English
and at 3% for my native language, Greek... Is it impossible to even set up
a Simon “booth” (“please give us your speech samples”) at a few FOSS
meetings? Or just ask every developer here to contribute an hour of speech?
The power training feature makes collection so easy and fast, if everything
is set up in a snap (maybe even including a dictionary and training
texts?) so that one can start directly!

4) Please, we should make Lera work out of the box. I wanted to showcase my
Greek model to some cousins this Christmas to persuade them to contribute,
but I have never managed to compile Lera, and dictating single words in
Simon quite simply isn’t impressive enough. Same as above: let’s do a
snap!!!

5) Is it possible to build a system where simond keeps the recognition
samples (this already exists) and simultaneously feeds them to the Google
API (or another one?), gets back the recognition results, and uses them to
train the model?
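
A rough sketch of how such a loop could look, purely as an illustration of
the idea in point 5. Nothing here reflects how simond actually stores its
samples or how any external speech API is called: `external_transcribe` is
a hypothetical function supplied by the caller, and the list of (audio
file, local recognition result) pairs is an assumed input format. The
sketch also makes one cautious design choice that is not part of the
original idea: it only keeps a sample for training when the local and the
external result agree, so that errors from either recognizer are less
likely to end up in the model.

```python
from pathlib import Path
from typing import Callable, Iterable, List, Optional, Tuple


def normalise(text: str) -> str:
    # Compare transcripts case-insensitively and ignore extra whitespace.
    return " ".join(text.lower().split())


def select_training_pairs(
    samples: Iterable[Tuple[Path, str]],
    external_transcribe: Callable[[Path], Optional[str]],
) -> List[Tuple[Path, str]]:
    """Return (audio file, transcript) pairs on which both recognizers agree.

    `samples` holds the stored audio files with the local recognition
    result; `external_transcribe` is a hypothetical stand-in for whatever
    external service is used and returns None when it has no result.
    """
    accepted: List[Tuple[Path, str]] = []
    for wav_path, local_text in samples:
        remote_text = external_transcribe(wav_path)
        if remote_text is None:
            continue  # no external result, so the sample stays unlabelled
        if normalise(remote_text) == normalise(local_text):
            accepted.append((wav_path, normalise(local_text)))
    return accepted
```

A less conservative variant would simply trust the external result and use
it as the transcript for every sample, at the price of training on whatever
mistakes the external recognizer makes.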

/Dimitris

2017-01-08 22:19 GMT+01:00 Mario Fux <ml at unormal.org>:

> Good morning
>
> As Doodle decided we'll meet on Tuesday, 10th of January at 10pm (Timezone:
> GMT+1).
>
> You'll find more information here:
> https://notes.kde.org/public/simon-meeting-201701
>
> See you in the IRC channel #kde-accessibility on freenode.net
>
> Best regards and happy new year
> Mario
>

