Date for Simon IRC meeting: 10th of January at 10pm (GMT+1)

Mario Fux ml at unormal.org
Tue Jan 10 11:05:43 UTC 2017


On Tuesday, 10 January 2017, 00:40:59 CET, Δημήτρης Παπαντωνίου wrote:
> Hi Mario!

Good morning Dimitris

> This is an excellent idea, to arrange a chat to discuss the future of
> Simon. I had been afraid that the project was actually dead...

No, it just smelled that way ;-).

But thanks a lot for your email (see some notes below), and I'm looking forward to 
meeting you today!

> My name is Dimitris and I've been playing around with Simon since its very
> early days, trying to build a Greek language model now and again in my
> spare time (I just discovered my first recorded samples, dated 2009, on my
> dad's ancient laptop this Christmas!). I guess I am also the least
> technical of those coming tomorrow, so I thought I would write down a
> non-technical guy's thoughts separately.

Perfect. Thanks a lot for writing this up and for staying so long with Simon.

> My experience with Simon all these years has basically been one of
> continuous stress: it is the main reason I learnt what a terminal and
> compiling are :-) And although those may be of interest to many of you, it
> has not been fun for me. Typical scenario: I would be in the mood to work
> on my Greek model a couple of times per year. I would install Simon (easy).
> Then I would try to compile the HTK. That would take anything between five
> afternoons and a couple of weeks. By the time I had managed to google the
> incomprehensible errors, I was no longer in the mood to record samples or
> work on my phonetic dictionary, so I would quit for a few months. I ended
> up keeping a separate, otherwise unused Mint partition on my laptop for
> several years, preserved solely because it had Simon, HTK and sphinxtrain
> prealpha all successfully installed, and I was afraid of losing the
> installation and starting over again.
> 
> The last time I tried to install it, this year, it went a lot easier, I must
> admit (either I have got better, searching the internet for errors is easier,
> or I was lucky). On the other hand, I have tried to install Peter's Lera on
> at least three occasions; all failed on some KDE environment variables which
> I have no idea what to do with...

I'm sure your skills improved quite a lot over the years, and I'm sure you know 
Simon better than I do in some areas. My main goal in taking over 
maintainership was to keep Simon from dying, and honestly I did a lousy job there. 
I hope to improve... but I, and we, need and can use all the help and support we can get.

> Moreover, a few years ago I spent a long time creating a Greek dictionary
> with a couple of hundred thousand words, annotating each as article,
> verb, noun etc. I was lured by Peter's use of the word "grammar" in Simon at
> that time... It was later, in 2012 or 2013, that he explained the notion of
> n-grams in a mail, and I felt extremely stupid for wasting my time...

Never feel stupid about errors and problems as long as you learn and improve. 
It might look like lost time but you learn something and so it was of value.

> So, in short: from a non-technical, non-speech-researcher's point of
> view, that of someone who has just played a lot with Simon and read a few
> papers on speech recognition as a hobby, I would like to suggest that we
> focus on simplicity and clear instructions in the following cases:
> 
> 1) Simon and all dependencies should be installable with a click. I know HTK
> is restrictively licensed, but sphinxtrain could likely be included.
> (Kaldi??). Let's create a snap, flatpak or anything that just works out of
> the box. Five years ago I discussed with a friend who was teaching a
> speech therapy class having all the students contribute samples... We would
> have had a full acoustic model with several tens of hours within a week,
> but we failed because the students were unable to install it...

Sounds like a good idea indeed. Make the software easily accessible for 
everyone, so that it can be installed and started within a minute.

> 2) Simon should be a language-agnostic program that non-savvy users can use
> to easily build user models from scratch in non-English languages.
> We should clearly document, step by step, a suggested process: a how-to for
> building acoustic models, language models, dictionaries etc. Anyone should
> be able to get started without losing time (see my story above of losing
> tens of hours building my dictionary). Also which programs to use, and how
> to run them: I am still having a nightmare with (and have for the time being
> abandoned) trying to process a corpus of a few gigabytes of text for n-gram
> creation; running Python stuff in a console is just too complicated. Anyway,
> I have started documenting my steps for my Greek model, and I can help with
> this, as I guess I am not of much use for more technical stuff.

Not sure about the different languages. I want to work more with German myself 
but Peter always tried to get it ready with English and then continue from 
there.
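
As an illustration of the n-gram step mentioned in point 2, here is a minimal sketch in Python, assuming a plain-text corpus with one sentence per line. The function name `count_ngrams`, the `<s>`/`</s>` boundary markers and the file name `corpus.txt` are assumptions made for this example, not part of Simon or of any toolkit discussed in this thread.

```python
from collections import Counter


def count_ngrams(lines, n=3):
    """Count word n-grams over an iterable of text lines.

    Lines are consumed one at a time, so a file object can be passed
    directly and the corpus text itself is never loaded all at once.
    """
    counts = Counter()
    for line in lines:
        # Hypothetical convention: mark sentence boundaries with <s> and </s>.
        tokens = ["<s>"] + line.lower().split() + ["</s>"]
        # Zipping n shifted views of the token list yields every n-gram;
        # lines with fewer than n tokens simply contribute nothing.
        counts.update(zip(*(tokens[i:] for i in range(n))))
    return counts


if __name__ == "__main__":
    # Tiny in-memory demo; for a real corpus one would pass an open file,
    # e.g. open("corpus.txt", encoding="utf-8") (hypothetical file name).
    demo = ["the cat sat", "the cat ran"]
    for ngram, freq in count_ngrams(demo, n=2).most_common(3):
        print(" ".join(ngram), freq)
```

The counter still keeps every distinct n-gram in memory, so for a corpus of several gigabytes this is only a starting point; it shows the idea rather than a production pipeline.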

> 3) I've always heard that we should collect more training data. I am not
> sure if this is really the bottleneck for accuracy, but looking at the
> VoxForge project, stuck at half its goal after so many years for English and
> at 3% for my native language, Greek... Is it impossible to even set up a
> Simon "booth" ("please give us your speech samples") at a few FOSS meetings?
> Or just ask every developer here to contribute an hour of speech? The power
> training feature makes collection so easy and fast, if everything is set up
> (maybe even including a dictionary and training texts?) in a snap so that
> one can start directly!

Definitely worth thinking about, and no. 1 above would help here too, I think.

> 4) Please, we should make Lera work out of the box. I wanted to showcase my
> Greek model to some cousins this Christmas to persuade them to contribute,
> but I have never managed to compile it, and dictating simple words in Simon
> quite simply isn't impressive enough. Same as above:
> Let's do a snap!!!

Yes. I hope to make a last release of the qt4/kdelibs4-based Simon and then focus 
on kf5-based and Lera-based ideas.

> 5) Is it possible to build a system where simond keeps recognition samples
> (this already exists) and simultaneously feeds the samples to the Google API
> (or another?), gets back recognition results, and uses them to train the model?

Not sure if and how we could use Google's recognition engine here and how this 
could work license-wise.
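
Leaving the licensing question open, the data flow described in point 5 could be sketched as follows. Everything here is hypothetical: `external_recognize` stands in for whatever outside recognition service might be used and is assumed to return a transcript together with a confidence score; it is not a real Google or simond API, and `select_training_pairs` is a name invented for this example.

```python
from typing import Callable, Iterable, List, Tuple


def select_training_pairs(
    samples: Iterable[str],
    external_recognize: Callable[[str], Tuple[str, float]],
    min_confidence: float = 0.9,
) -> List[Tuple[str, str]]:
    """Pair stored sample paths with transcripts from an external recognizer.

    Only results whose confidence reaches min_confidence are kept, so that
    doubtful transcripts do not end up in the training set.
    """
    pairs = []
    for path in samples:
        text, confidence = external_recognize(path)  # hypothetical service call
        if confidence >= min_confidence and text.strip():
            pairs.append((path, text.strip()))
    return pairs


if __name__ == "__main__":
    # Stand-in recognizer with canned answers, for demonstration only.
    canned = {"a.wav": ("open window", 0.95), "b.wav": ("clothes window", 0.40)}
    print(select_training_pairs(canned, lambda p: canned[p]))
```

The confidence threshold is the main design choice in this sketch: a higher value gives less but cleaner automatically labelled training data.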

CU and thanks
Mario

> /Dimitris
> 
> 2017-01-08 22:19 GMT+01:00 Mario Fux <ml at unormal.org>:
> > Good morning
> > 
> > As Doodle decided we'll meet on Tuesday, 10th of January at 10pm
> > (Timezone:
> > GMT+1).
> > 
> > You'll find more information here:
> > https://notes.kde.org/public/simon-meeting-201701
> > 
> > See you in the IRC channel #kde-accessibility on freenode.net
> > 
> > Best regards and happy new year
> > Mario
