[Kde-accessibility] Simon: Second Meeting: Minutes

Peter Grasch peter at grasch.net
Mon Aug 5 19:21:18 UTC 2013


Hi,

we just concluded the second meeting. Here are the minutes:
1. Status of currently ongoing tasks:
1.a. David Greaves has started work on the packaging for Mer but hasn't
yet gotten very far due to other work interfering.
1.b. Jon looked into echo cancellation, which is indeed already available
as a plugin in PulseAudio. Results appear to be good but not perfect.
1.c. Simon was busy building and improving the LiveTranscriber system: A
web application based on Ruby on Rails for semi-automatic transcription
of large audio files. The current development branch requires the user
to upload a transcription and ASR-created label file to the site. From
that, the rich UI allows quick, guided corrections.
Simon also analyzed the LibriVox and YouTube (CC-SA) content
collections. There is a lot of usable material there.
1.d. I, meanwhile, have started on incorporating the TED LIUM audio data
into its own model, which compared very favorably to the old Voxforge
one. Currently, I'm using this model to improve the Voxforge
transcriptions and am fine-tuning model parameters for optimal performance.
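
As a rough illustration of item 1.b: PulseAudio ships its echo
cancellation as the module-echo-cancel module, which can be loaded
through pactl. The sketch below only builds and (optionally) runs that
command. It is not necessarily the command line Jon posted, the helper
names are made up for this example, and whether the speex method is
available depends on how PulseAudio was built.

```python
import shlex
import subprocess
from typing import List


def echo_cancel_command(aec_method: str = "speex") -> List[str]:
    """Build the pactl call that loads PulseAudio's echo-cancel module.

    Assumption: the module is named module-echo-cancel and accepts an
    aec_method argument, as described in the PulseAudio module docs.
    """
    return ["pactl", "load-module", "module-echo-cancel",
            f"aec_method={aec_method}"]


def load_echo_cancel(aec_method: str = "speex") -> None:
    """Run the command; needs a running PulseAudio daemon for this user."""
    subprocess.run(echo_cancel_command(aec_method), check=True)


if __name__ == "__main__":
    # Print the command instead of running it, so this is safe to try.
    print(shlex.join(echo_cancel_command()))
```

Whether an application can request the filter for its own stream only,
rather than for the whole device, is still open (see 4.b).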

2. Project outline: We quickly reviewed the plan of building an end-user
dictation system. To do that, we will first concentrate on building a
great speech model. In order to facilitate this, we're building
transcription software to aid acquisition of good training data. In
parallel we should be working on the language model and the dialog
manager. Both areas need volunteers at the moment.
While we wait for real world data from our first prototype, we will
evaluate recognizer performance against the TED LIUM test sets.
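
For reference, a minimal sketch of the metric behind that evaluation:
word error rate (WER) is the word-level edit distance between reference
and hypothesis, divided by the number of reference words. The function
names are made up for this example, and the code scores a single
utterance; a real evaluation would sum errors and reference words over
the whole test set. The two figures in the usage note are the ones
quoted in the attached log.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        cur = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing in hypothesis
            insertion = cur[j - 1] + 1   # extra word in hypothesis
            cur.append(min(substitution, deletion, insertion))
        prev = cur
    return prev[-1] / len(ref)


def relative_improvement(old_wer: float, new_wer: float) -> float:
    """Fraction by which the new WER is lower than the old one."""
    return (old_wer - new_wer) / old_wer
```

With the numbers from the log (46.34 % for the old Voxforge model,
roughly 30 % for the TED LIUM model), relative_improvement(46.34, 30.0)
comes out at about 0.35, i.e. roughly a third fewer word errors.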

3. Organizational: We discussed possible organizational structures
(separate foundation or consortium, a project under the KDE
umbrella,...) for our working group that would cater best to our needs.
Due to the delicate subject, we decided to have this discussion on the
mailing list. I'll send out my proposal after this email.

4. Tasks:
4.a. I'll keep working on the acoustic models (improving the Voxforge
model, incorporating VCTK data, merging corpora together).
4.b. Jon will explore integrating noise reduction algorithms into
PulseAudio and how both echo and noise cancellation can be controlled
from a client application.
4.c. Simon keeps working on the LiveTranscriber together with Abinash
(who was ill last week). Moreover, Simon will look into aligning
LibriVox recordings using his tool.
4.d. David Greaves will keep working on the packages for Mer as time
permits.


I've attached the complete IRC log to this mail as well.

Best regards,
Peter
-------------- next part --------------
[17:01:06] <bedahr> hi guys, 5 minutes
[17:05:19] <bedahr> alright, missing at least Madhu and Andrew but let's start anyway, maybe they'll join later on
[17:05:31] <skpvox> hi
[17:05:48] <abinash> hi
[17:05:55] <bedahr> hi everybody
[17:06:20] <bedahr> abinash: Jon___ lbt unormal__ ^
[17:06:40] <lbt> o/
[17:06:50] <bedahr> so let's start right away with the status of the tasks claimed in the last meeting
[17:06:55] -*- lbt got trapped in bug but will listen
[17:07:01] <bedahr> :)
[17:07:02] --> fewcha (~sanjiban at 117.211.86.109) has joined #simon
[17:07:14] <bedahr> lbt: bug related to the packaging?
[17:07:26] <lbt> no - work bug
[17:07:28] <bedahr> ah
[17:07:39] <bedahr> alright, np. So not much progress to report, I guess
[17:08:48] <bedahr> skpvox was kind enough to donate a server in Amazons cloud since the last meeting and that server has been chugging away nicely, building bigger and better (tm) acoustic models
[17:09:22] <lbt> I've pulled the source and preliminary packaging. Started an OBS area and not much more.
[17:09:36] <bedahr> lbt: alright, thanks
[17:10:04] <bedahr> everybody: you should all (given that you're hopefully subscribed to kde-accessibility at kde.org) have received skpvox invitation email. If you want to work on the models and need access to a powerful machine, please request ssh credentials
[17:10:19] <skpvox> yes, please hit me up for SSH access
[17:10:28] <bedahr> skpvox: thanks for the generous donation!
[17:10:45] <Jon___> yes
[17:11:16] <bedahr> Jon___: you have had a look at noise and echo cancellation in pulseaudio
[17:11:23] <Jon___> yes
[17:11:31] <bedahr> I saw the comment on trello, but could you please summarize what you did and where you're at?
[17:11:45] <Jon___> pulseaudio does come with a noise cancellation module out of the box
[17:11:53] --> fregl (quassel at kde/gladhorn) has joined #simon
[17:12:03] <Jon___> it supports a number of algorithms including speex, which i have used personally
[17:12:23] <Jon___> you can also write your own custom modules in pulseaudio and thereby plug in your own solution 
[17:12:23] <bedahr> noise cancellation or echo cancellation?
[17:12:43] <unormal__> bedahr: Present. Sorry. Was afk for a moment. Reading the backlog.
[17:12:50] <Jon___> i do have some strong relationships with a number of high end dsp companies both in silicon valley and israel that i could potentially tap
[17:13:18] <bedahr> unormal__: np, hi!
[17:13:21] <bedahr> Jon___: regarding..?
[17:13:28] <Jon___> i am sorry. . . i meant echo cancellation
[17:13:38] <Jon___> does not come with a noise cancellation algorithm out of the box
[17:13:40] <bedahr> right, I figured
[17:14:11] <Jon___> it supports speex echo cancellation, which i've used.  it's ok, but not super robust
[17:14:18] <bedahr> still, interface-wise echo cancellation should include noise cancellation with the right plugin
[17:14:29] <bedahr> okay
[17:14:47] <bedahr> you posted a command line to initiate it (as a user) - can an application request these filters as well?
[17:15:21] <Jon___> i believe so, via commands to the daemon
[17:15:57] <bedahr> do they act on the device stream or the application stream?
[17:16:03] --> drdru (a689d1a1 at gateway/web/freenode/ip.166.137.209.161) has joined #simon
[17:17:16] <drdru> Hi
[17:17:23] <Jon___> can you clarify your question
[17:17:31] <bedahr> hi, drdru, please hang on
[17:17:59] <drdru> Sorry I'm late
[17:18:04] <bedahr> Jon___: well, just going from e.g., muting: you can mute the device (in pulseaudio) or individual streams to applications
[17:18:33] <bedahr> if we were to activate agressive noise cancellation on the device level this may certainly be annoying - not so on the level of the application stream
[17:18:42] <drdru> Currently commuting on a very crowded muni bus :(
[17:18:45] <Jon___> as far as i know streams to applications.  however, i would need to look into that further.
[17:20:03] <bedahr> Jon___: okay. but you haven't seen any noise cancellation algorithms in pulseaudio, correct?
[17:20:44] <Jon___> i have not based on their documents.  however, as i noted, you can write your own modules and presumably plug in a solution from somewhere else.
[17:21:03] <bedahr> alright, okay, thanks
[17:21:26] <bedahr> skpvox: abinash: you both claimed the audio transcription task last time
[17:21:37] <skpvox> yes, correct
[17:21:43] <skpvox> i've been busy hacking LiveTranscriber together
[17:21:51] <skpvox> a semi automatic transcription platform with reviewing capabilities
[17:22:00] <skpvox> https://github.com/skpvox/LiveTranscriber/tree/experimental
[17:22:07] <skpvox> here is the source code plus a demo video
[17:22:10] <abinash> i was sick, so unable to work, but would catch up
[17:22:53] <bedahr> skpvox: could you please talk a little bit about the planned capabilities and the current state?
[17:23:41] <skpvox> the initial release was basically a live transcribing having pocketsphinx running in the background and sending the recognition results
[17:23:45] <skpvox> to the client via HTML5 event streams
[17:24:00] <skpvox> i have since then moved on to pre-processing the recognition
[17:24:07] <skpvox> i split the recording into segments
[17:24:16] <skpvox> each segment has n- words
[17:24:28] <skpvox> each word can have alternatives assigned (if the ASR provides for that)
[17:24:37] <skpvox> so as a user you just go through each segment and correct the mistakes
[17:24:58] <skpvox> i've also added a reviewer interface
[17:25:09] <skpvox> where the user marks mistakes in the transcription
[17:25:29] <skpvox> every X transcriptions has a mine (purposely wrong word)
[17:25:36] <skpvox> in order to verify whether the user is actually completing the task
[17:26:13] <skpvox> this week i will test the system "in the real world" to see what else might be necessary
[17:26:31] <skpvox> i have transcribed a TED task very fast (approx 1.5x realtime)
[17:26:44] <bedahr> just shortly: involved technologies (and what they do in the project)
[17:26:54] <bedahr> s/shortly/quickly/
[17:27:01] <-- unormal__ (~fmario at adsl-89-217-251-21.adslplus.ch) has quit (Remote host closed the connection)
[17:27:08] <skpvox> there are 2 branches of it right now
[17:27:26] <skpvox> the original branch is based on rails 4 with a modified pocketsphinx version and redis for client-to-server communication
[17:27:27] --> unormal__ (~fmario at adsl-89-217-251-21.adslplus.ch) has joined #simon
[17:27:36] <skpvox> it was using HTML5 event sources
[17:27:43] <unormal__> re
[17:28:04] <-> unormal__ is now known as unormal
[17:28:34] <skpvox> the current branch is less complex. it's rails4 based, allows any ASR to be plugged in (via label files)
[17:28:46] <skpvox> if anybody wants to see a live demo, plz hit me up and i will setup an account
[17:28:57] <bedahr> alright
[17:29:23] <Jon___> yes i would
[17:29:24] <bedahr> abinash: worked with ruby before?
[17:29:46] <abinash> no, but I would learn 
[17:30:26] <bedahr> okay, you should coordinate with skpvox, then.
[17:30:37] <bedahr> there's of course also the second part
[17:31:00] <bedahr> right now, the current branch relies on externally specified label files (i.e. the recognition is done outside of the system)
[17:32:01] <skpvox> correct
[17:32:03] <bedahr> you could work on this part, re-introducing integrated (though pluggable) ASR
[17:32:32] <bedahr> just a suggestion of course, you and skpvox should probably talk about the best way to proceed afterwards if that's okay for both of you?
[17:32:46] <skpvox> yea, i'm already talking with abinash 
[17:32:52] <bedahr> great
[17:33:15] <bedahr> okay, unormal, you sent out a letter to the University regarding getting students involved, right?
[17:35:00] <bedahr> k, apparently afk right now
[17:35:13] <unormal> Back. And right.
[17:35:18] <bedahr> ah
[17:35:28] <bedahr> anything else to report?
[17:35:33] <skpvox> yes
[17:35:38] <bedahr> no, I meant unormal
[17:35:42] <unormal> Not from my side.
[17:35:42] <skpvox> :(
[17:35:44] <bedahr> k
[17:35:48] <bedahr> sorry, skpvox, go right ahead
[17:35:50] <unormal> skpvox: Your turn ;-)
[17:35:53] <skpvox> lol
[17:36:06] <skpvox> i've just finished analyzing the librivox catalogue
[17:36:12] <skpvox> i think this is a real goldmine for training
[17:36:16] <skpvox> over 4 years of total recordings
[17:36:21] <skpvox> http://wiki.babelbase.org/index.php/LibriVox_Corpus
[17:36:41] <bedahr> yes, it was already on our todo list
[17:36:50] <skpvox> i will start with that this week
[17:36:54] <bedahr> it's a great corpus of read speech - not perfect for everything but certainly helpful
[17:36:58] <bedahr> cool
[17:37:05] <bedahr> ah, that actually brings me to something else
[17:37:10] <-- fewcha (~sanjiban at 117.211.86.109) has quit (Ping timeout: 245 seconds)
[17:37:16] <skpvox> i've also finished my YouTube creative commons analysis today
[17:37:21] <bedahr> could you talk about how forced alignment / correction of an already existing transcript is integrated in LiveTranscriber?
[17:37:51] <skpvox> do you mean an unaligned transcript?
[17:38:02] <bedahr> yes
[17:38:18] <bedahr> for LibriVox, for example
[17:38:20] <skpvox> i am still performing tests with sphinx3_align 
[17:39:01] <skpvox> if you have an .srt it can be converted and imported as a label file
[17:39:18] <bedahr> (for everyone else: ultimately, we need time codes for words (even phonemes); for relatively short samples, this can be done automatically - an audio book is too long though; The act of adding timed transcriptions given a transcript is called alignment)
[17:39:36] <skpvox> i have not yet verified the exact terms of youtube
[17:39:43] <-- drdru (a689d1a1 at gateway/web/freenode/ip.166.137.209.161) has quit (Ping timeout: 250 seconds)
[17:39:45] <skpvox> but i have a method to use their ASR to do force alignment with good accuracy
[17:40:10] <skpvox> i will probably work on a hybrid mode this week
[17:40:27] <bedahr> you won't use youtube to align the transcripts for Librivox, though, right?
[17:41:12] <skpvox> well, we must verify how their terms deal with that
[17:41:28] <skpvox> i assume the automatic transcript they produce simple falls under copyright
[17:41:39] <skpvox> *simply
[17:41:57] <bedahr> I wouldn't use it regardless
[17:42:14] <skpvox> my plan is to use sphinx3 as it gives me the most information
[17:42:35] <bedahr> you're talking about at least hundreds of hours of data - that violates at least the spirit of their tos even if maybe not the letter
[17:42:44] <bedahr> yes, that'd make more sense
[17:42:59] <skpvox> well, all transcriptions are done via their API
[17:43:03] <bedahr> yes
[17:43:05] <skpvox> so within their enforced limits
[17:43:15] <skpvox> which is 5mm requests / day
[17:43:17] <bedahr> just because it works doesn't mean we should do it
[17:43:23] <bedahr> 5mm = ?
[17:43:26] <skpvox> yea
[17:43:36] <bedahr> what's "mm"? milimeter?
[17:43:41] <skpvox> millions
[17:43:45] <bedahr> ah
[17:43:49] <bedahr> okay
[17:44:11] <bedahr> well, I still don't think it's a great idea and ruins the reproducability without involving commercial / closed systems
[17:44:23] <skpvox> i think ultimatelly i will silence segment the recording
[17:44:35] <skpvox> in let's say 5 minute pieces
[17:44:42] <skpvox> let the user confirm start and end of transcript
[17:44:49] <skpvox> and then sphinx3 can do the rest
[17:45:05] <bedahr> we really don't need to rely on youtube - I did some forced alignment for German a while back with SPHINX and even though the models I used were way worse than what we have no, the results were almost perfect
[17:45:16] <skpvox> good
[17:45:23] <skpvox> i prefer sphinx3 every day of the week
[17:45:26] <bedahr> you don't even need to do the 5 minute segmentation
[17:45:31] <bedahr> we can talk about this after the meeting as well
[17:45:36] <bedahr> but, I'll keep this as "todo"
[17:45:40] <skpvox> good
[17:45:56] <bedahr> alright, next up are all the tasks that were assigned after I sent out the minutes
[17:45:57] <skpvox> i've also finished my YouTube creative commons analysis today
[17:46:03] <skpvox> http://wiki.babelbase.org/index.php/YouTube_Creative_Commons
[17:46:04] <bedahr> ah, alright
[17:46:10] <skpvox> i think it's less than ideal
[17:46:15] <skpvox> but it would give us 12+ years of material ..
[17:46:47] <bedahr> yeah, let's keep this in our back pocket for now
[17:46:54] <bedahr> it's very difficult material to parse
[17:47:16] <skpvox> yea
[17:47:23] <skpvox> i'll stick to librivox for now
[17:47:26] <bedahr> yep
[17:47:28] <bedahr> okay, moving on
[17:47:39] <bedahr> we're gonna be late again by this rate :)
[17:48:09] <bedahr> okay, I moved the infrastructure task to done just now
[17:48:16] <bedahr> that leaves the various corpora
[17:49:11] <bedahr> so: I parsed the TED LIUM corpus and built SPHINX transcription from their format. I then built an acoustic model from that data (with a combination of their and our current dictionary, extended to cover all words in the transcripts)
[17:49:43] <skpvox> was it the fullCased dict?
[17:50:10] <bedahr> no
[17:50:27] <bedahr> the fullCased is actually for dictation and limited to the words in the LM - not suitable for AM training
[17:50:42] <skpvox> you mean essential-sane-65k.fullCased ?
[17:50:53] <bedahr> yes
[17:50:59] <bedahr> anyway, against the TED dev set, the so created model (using the same LM as the dictation prototype), achieved a WER of ~ 30 %
[17:51:24] <bedahr> that doesn't sound like much but keep in mind that we're now talking about spontaneous speech including stuff like "uhm", etc. that's much harder to recognize
[17:51:50] <bedahr> our old voxforge model manages 46.34 % WER on this test set
[17:52:02] <unormal> Nice.
[17:52:37] <bedahr> anyway, my plan now is to build an updated voxforge model with transcriptions that are force aligned using the TED model to include tags like {NOISE} or {UM}
[17:53:02] <bedahr> I expect a very small bump in WER at this point
[17:53:30] <bedahr> I would also like to experiment with some parameters; ultimately, I'm looking to merging the different corpora, including the VCTK corpus that was also already prepared
[17:53:43] <bedahr> alright, that's it from my side
[17:54:05] <skpvox> do you plan to merge the TED + Voxforge model?
[17:54:17] <bedahr> yes
[17:54:23] --> drdru (~Adium at 50-78-103-105-static.hfc.comcastbusiness.net) has joined #simon
[17:54:25] <bedahr> but we'll see how that goes
[17:54:35] --> fewcha (~sanjiban at 14.139.221.18) has joined #simon
[17:54:40] <bedahr> okay, now let's quickly move to point 2: Planning the next steps
[17:55:06] <bedahr> before we dive right into that (and I think most of you already have their todos cut out for them), I'd like to re-focus on our goal here
[17:55:13] <drdru> sorry, my iphone disconnected
[17:55:18] <bedahr> drdru: np
[17:55:48] <drdru> is there a log for our chat available?
[17:55:59] <bedahr> drdru: I can paste it to you after the meeting
[17:56:14] <drdru> thanks!
[17:56:26] <bedahr> what I was trying to say: Our main goal for this project is to build an open source dictation system
[17:56:44] <bedahr> to get there, we need great acoustic- and language models and a dialog manager
[17:57:01] <bedahr> the creation of the acoustic models need a lot of data, which has inspired the transcriber aspect of the project
[17:57:47] <bedahr> for the LM aspect, we still don't have a good concept
[17:58:12] <-- unormal (~fmario at adsl-89-217-251-21.adslplus.ch) has quit (Quit: Konversation terminated!)
[17:58:24] --> unormal (~fmario at adsl-89-217-251-21.adslplus.ch) has joined #simon
[17:59:16] <bedahr> so to sum up, the next steps, as I see them are: create the best possible acoustic and language models from what data we already have, built an open system for transcriptions both as a means in itself but also to source more data in the future for the next generation of acoustic models
[18:00:34] <bedahr> for the LM, we still have no concrete plan, although I'm working on that (and hopefully some of you will also start to get involved there)
[18:00:58] <bedahr> the dialog manager is better defined but pretty much up for grabs
[18:01:32] <bedahr> any comments / feedback on this "plan"?
[18:02:14] <unormal> Sounds good and reasonable.
[18:02:20] <skpvox> yea, i agree
[18:02:49] <bedahr> is everybody on board with the "end-user dictation application" end goal?
[18:03:09] <unormal> For the moment, yes ;-).
[18:03:48] <bedahr> yes, this should not be interpreted as "we won't do anything else (as can be seen by the transcriber aspect)" but we need one application that we will ultimately focus on
[18:03:59] <bedahr> alright, no pitchforks, I'm taking this as a yes
[18:04:17] <unormal> Btw what about your blog post?
[18:04:29] <bedahr> yes, wait for it :)
[18:04:46] <skpvox> what i wanted to clarify is how we keep each other updated about progress?
[18:04:47] <bedahr> then, before we revisit the task assignment: drdru joined us today
[18:05:03] <bedahr> skpvox: good point, adding it to the agenda
[18:05:29] <drdru> what you guys are doing is really cool
[18:05:31] <drdru> I want to help
[18:06:17] <bedahr> could you please (in as few sentences as possible) say who you are, what you're doing currently and what you would like to do in this project?
[18:06:51] <drdru> yeah
[18:07:08] <drdru> My goal is to help build acoustic and language models that can be used with offline pocketsphinx
[18:07:42] <drdru> My hobby is building machine translation systems, but these systems are useless if you need to be connected to a network
[18:07:54] <drdru> I've already done some work with MT, but now I need ASR
[18:08:07] <drdru> pocketsphinx works well on iPhone and Android, but it lacks good models
[18:08:30] <drdru> I loaded Peter's language model + voxforge acoustic model into pocketsphinx, and it worked reasonably well
[18:08:44] <drdru> I haven't done adaptation yet, I'm hoping to get time to do that this week
[18:09:09] <drdru> I have access to a huge amount of speech data, but it is under a private license
[18:09:24] <drdru> my goal is to be able to re-license it for use with FOSS
[18:09:47] <drdru> right now it's difficult, since my users haven't consented to donating their speech
[18:09:57] <drdru> they have consented to my collection, but not my donation
[18:10:15] <skpvox> how are you collecting the speech?
[18:10:18] <drdru> bedahr: does that cover it?
[18:10:37] <unormal> Haven't yet or are against it?
[18:10:46] <drdru> through mobile apps that my company has created
[18:10:54] <drdru> I'm not against it at all
[18:11:05] <drdru> just have to figure out how to ask users "do you agree to donate your speech"
[18:11:09] <drdru> in a way that doesn't scare them
[18:12:47] <drdru> right now we are using Nuance
[18:12:58] <drdru> but it is cost prohibitive to continue
[18:13:27] <-- unormal (~fmario at adsl-89-217-251-21.adslplus.ch) has quit (Remote host closed the connection)
[18:13:51] --> unormal (~fmario at adsl-89-217-251-21.adslplus.ch) has joined #simon
[18:14:08] <unormal> Haven't yet or are against it? (resend)
[18:14:20] <unormal> About the users agreement.
[18:14:26] <bedahr> unormal: haven't yet had the option to do so
[18:14:35] <drdru> correct
[18:14:45] <drdru> and it needs to be done in a way that users are ok with it
[18:14:49] <drdru> otherwise it will kill the business
[18:14:58] <bedahr> sure
[18:15:02] <drdru> so it needs to be done carefully
[18:16:01] <skpvox> your current data might serve as a benchmark though
[18:16:13] <drdru> exactly
[18:16:16] <skpvox> do you have some of the recordings transcribed or just the nuance results?
[18:16:25] <drdru> so long as it doesn't leak out into open source data sets and models
[18:16:50] <drdru> I have 2-3000 hours of speech data + transcriptions from Nuance
[18:17:05] <drdru> I can't use their transcriptions for training, but I can certainly use them for comparing results
[18:17:20] <drdru> actually, my contract says I can collect the audio for QA purposes
[18:17:43] <drdru> my goal is to stop using them
[18:17:48] <drdru> but right now there's no viable option
[18:18:05] <drdru> if we can come up with an open source solution, I will be very happy
[18:18:11] <bedahr> another important point here as that right now we're solely focussing on English (which makes sense as a start)
[18:18:12] <skpvox> bedahr: on that topic i wanted to clarify how we plan to evaluate our models for dictation purpose?
[18:18:27] <bedahr> skpvox: very good point. We need a good test set
[18:18:45] <drdru> yes, we are focusing on English, but wouldn't the same approaches be useful for any language?
[18:19:08] <drdru> ie, data collection, evaluation, etc
[18:19:28] <bedahr> skpvox: I haven't given too much thought about this tbh; I don't think we'll get actual dictation data before our first usable prototype is released to the public
[18:19:44] <bedahr> right now, talks should serve as a sufficient approximation (also spontaneous but deliberate speech)
[18:20:20] <bedahr> I think using the TED test sets should be okay - at least when using at relative WER improvements instead of absolute numbers
[18:20:27] <bedahr> drdru: yes, absolutely
[18:20:56] <bedahr> there may be several techniques that should be employed for certain languages but the gist of it will remain largely the samew
[18:20:58] <bedahr> *same
[18:21:51] <bedahr> okay, the next point on my impromptu agenda is organizational
[18:22:42] <bedahr> some of you are affiliated to cooperations, from skpvox I see content on "babelbase.org" and email from "polyglotfoundation.org"
[18:24:21] <unormal> Need to leave in 5 minutes. Will read the backlog. Cu.
[18:24:25] <bedahr> unormal: bye
[18:24:39] <bedahr> I have absolutely no problem with people's affiliations to companies and / or organization (I think it's great that we have reached the point where we appeal to a larger audience) but I think that to enable cooperation on equal footing, we should all identify ourselves as contributers to a given, common project.
[18:25:21] <bedahr> that project should have a common name, identity, way to publish updates (regarding the "how to keep others informed" part from earlier), etc.
[18:26:58] <bedahr> this identity was and still is "Simon", a KDE project
[18:27:37] <bedahr> now let the flamewar begin: is there anyone who has a different opinion about the organizational structure?
[18:28:39] <bedahr> skpvox: ?
[18:29:25] <skpvox> I think we achieve the most if we consider it a collaborative effort
[18:29:36] <bedahr> it absolutely is
[18:29:45] <-- drdru (~Adium at 50-78-103-105-static.hfc.comcastbusiness.net) has left #simon
[18:29:58] --> drdru (~Adium at 50-78-103-105-static.hfc.comcastbusiness.net) has joined #simon
[18:30:24] <bedahr> that doesn't mean that the project itself shouldn't have it's own identity
[18:31:04] <bedahr> example: Digia (the company) works on Qt (the project). People from Digia working on Qt are pretty much just like any other contributers in that they contribute to the project
[18:32:24] <bedahr> in the same spirit, company X could get involved in working on Simon, without needing to slap a different name on the resulting cooperation
[18:33:50] <skpvox> well, for me Simon is a software and our collaborative project (the acoustic and language models) is distinct
[18:34:14] <skpvox> i would consider it a joint-effort
[18:34:40] <bedahr> yes, Simon is a program as well but it's also a project that strives to built open source speech recognition systems
[18:35:21] <bedahr> the Simon project has quite a few applications and sub-projects over the last couple of years
[18:35:34] <bedahr> it's already an established name
[18:36:15] <bedahr> and of course it's an collaborative effort - pretty much all open source projects are.
[18:37:00] <skpvox> drdru: since you come from a cooperate background, how do you see the situation?
[18:37:06] <skpvox> *corporate
[18:37:47] <drdru> good question
[18:37:58] <drdru> my primary interest is in models
[18:38:29] <drdru> I believe they are the missing piece before FOSS ASR is as good as commercial ASR
[18:38:39] <drdru> commercial ASR is cost prohibitive for companies like mine
[18:38:58] <drdru> so we both gain from working together
[18:39:04] <bedahr> skpvox: how do you actually see the situation? what would be your ideal outcome?
[18:39:31] <drdru> as far as organization, I am not familiar with structuring of FOSS orgs
[18:39:37] <drdru> but licensing is key
[18:39:52] <drdru> however the models are released, they need to have an appropriate license
[18:40:08] <drdru> that's how we merge
[18:40:36] <bedahr> drdru: sure, all the content will be published under a free license; this is not about closing development off to anybody (it really isn't) but about having a coherent representation to interested contributers, distributions, media, etc.
[18:40:48] <skpvox> i think that we all benefit from making contributions to achieve a common goal
[18:40:56] <skpvox> namly open source speech recognition becoming better
[18:41:32] <skpvox> the ideal form is the one that results in the most contributions
[18:42:15] <bedahr> that doesn't really answer my question
[18:42:59] <skpvox> i think Simon is a good cornerstone
[18:43:25] <skpvox> but you might limit the amount of contributions if you try to channel all contributions under it's wings
[18:43:52] <-- unormal (~fmario at adsl-89-217-251-21.adslplus.ch) has quit (Remote host closed the connection)
[18:44:08] <bedahr> so what (as in: actionable item) do you actually suggest?
[18:44:15] --> unormal (~fmario at adsl-89-217-251-21.adslplus.ch) has joined #simon
[18:44:18] <drdru> one thing I would like to say - Simon is a great application, but it is limited to people who use a linux desktop
[18:44:34] <skpvox> yea, it would be great to have a MacOS port
[18:44:42] <drdru> either we need to port Simon to more platforms, or come up with a way of making it accessible to more platforms
[18:44:48] <drdru> or web
[18:44:59] <bedahr> okay, you guys are now talking about the application, not the project
[18:45:12] <drdru> after seeing what you built (skpvox), why not build a web-based version of Simon
[18:45:26] <drdru> well, maybe you can frame what the project is better?
[18:45:34] <bedahr> sure
[18:45:40] <skpvox> it was actually my original plan to integrate HTML5 recording capabilities
[18:45:48] <skpvox> but the models aren't strong enough for that yet
[18:46:17] <bedahr> I already mentioned this earlier but basically Simon is an effort to build an open-source speech recognition system.
[18:46:34] <skpvox> for linux?
[18:46:39] <bedahr> no, in general
[18:46:46] <drdru> ok
[18:46:46] <bedahr> Windows is a supported platform
[18:47:13] <bedahr> and we have done projects that went way beyond the Simon application - this is not a new distinction
[18:47:14] <drdru> to get more users, I think making it platform agnostic is huge
[18:47:24] <drdru> ie, making it support ios, android, web
[18:47:34] <bedahr> yes, but really, I want to talk about the organizational parameters before talking about the technical stuff
[18:47:45] <skpvox> but that's crucial
[18:47:52] <skpvox> i am focusing mostly on web side
[18:47:57] <skpvox> drdru on embedded devices
[18:48:01] <bedahr> yes
[18:48:11] <bedahr> so?
[18:48:23] <drdru> I think what skpvox has created shows that it's possible to build something that is accessible to the masses
[18:48:29] <bedahr> of course
[18:48:32] <drdru> I'm not saying KDE development should stop
[18:48:53] <drdru> but maybe there's a way to build a native wrapper for a web app
[18:48:59] <bedahr> okay, guys, let me rephrase the topic under discussion - I may not have been clear
[18:49:13] <drdru> then the actual application is HTML5, and the wrappers are the only native code
[18:49:27] <drdru> I have familiarity with Sencha Touch
[18:49:38] <bedahr> if I were to talk to somebody about you people (i.e. the people in this meeting) - how would I refer to you? Would I list your names? or would I talk about the "new Simon team"? Or the "xy team"? what would I call you?
[18:50:42] <drdru> sure
[18:50:43] <skpvox> good question. as i said i consider simon as a cornerstone. however, i would be against forcing all contributions under it's name
[18:51:08] <drdru> to me, Simon is a KDE application
[18:51:13] <skpvox> yea, exactly
[18:51:14] <drdru> it doesn't yet support dictation
[18:51:28] <drdru> and I thought the goal of this project was to build FOSS dictation?
[18:51:28] <-- fewcha (~sanjiban at 14.139.221.18) has quit (Ping timeout: 245 seconds)
[18:51:36] <bedahr> yes
[18:51:50] <drdru> there is a major differentiator
[18:52:01] <drdru> Simon is only available on windows and linux
[18:52:18] <bedahr> okay
[18:52:20] <drdru> whereas this would be available to anyone who could integrate pocketsphinx on any platform
[18:52:20] <bedahr> if you look at http://docs.kde.org/development/en/extragear-accessibility/simon/simon.pdf
[18:52:40] <bedahr> the first sentence of the introduction is: Simon is the main front end for the Simon open source speech recognition solution
[18:52:53] <bedahr> Simon (software) != Simon (project)
[18:53:20] <bedahr> but this is branding problem. and we're talking about branding here so maybe we really should consider a different name
[18:53:29] <drdru> yes
[18:53:39] <bedahr> what would you suggest?
[18:53:53] <drdru> OpenDictation
[18:54:01] <bedahr> too limiting, imho
[18:54:19] <drdru> how so?
[18:54:22] <Jon___> guys i have a hard stop in 5 minutes so need to decide on tasks for next week
[18:54:42] <bedahr> yes, this will take more time, obviously
[18:54:51] <bedahr> let's interrupt the discussion for assigning tasks
[18:54:55] <drdru> k
[18:55:31] <bedahr> can everybody (in alphabetical order of their nicks) say what they're planning to do (or ask what would be required) until the next meeting?
[18:55:33] <bedahr> abinash: go
[18:56:30] <bedahr> ...
[18:56:33] <bedahr> okay, or not
[18:56:37] <drdru> lol
[18:56:52] <bedahr> me, I'll keep working on the models (pretty much extending what I've been doing so far)
[18:56:57] <bedahr> drdru:
[18:57:07] <drdru> I would like to learn how to build models, and help with that
[18:57:14] <drdru> I can also test them against my data
[18:57:46] <bedahr> great, let skpvox give you a login to the server and mail me a few time slots where you'd be willing to get a short introduction from me
[18:57:51] <bedahr> Jon___: ?
[18:57:55] <skpvox> yea, i would be interested in that too
[18:58:00] <bedahr> alright
[18:58:15] <-- unormal (~fmario at adsl-89-217-251-21.adslplus.ch) has quit (Remote host closed the connection)
[18:58:28] <Jon___> I will look at how to integrate a noise reduction algorithm into pulseaudio.  I will look at open source solutions for this and how we would construct a module to do so.
[18:58:40] --> unormal (~fmario at adsl-89-217-251-21.adslplus.ch) has joined #simon
[18:58:50] <bedahr> great; also try to find more information about how to activate this from an application level
[18:58:59] <bedahr> lbt: 
[18:59:03] <Jon___> ok
[18:59:31] <bedahr> or skpvox?
[18:59:44] <skpvox> i will focus on LibriVox forced alignment
[18:59:48] <bedahr> great
[18:59:53] <skpvox> improving the transcriber, etc ..
[19:00:13] <bedahr> if you want we can talk afterwards about what I already did in this area
[19:00:20] <skpvox> yep
[19:00:40] <bedahr> unormal: back? any tasks you'll be working on?
[19:01:19] <bedahr> okay nevermind
[19:01:51] <bedahr> the blog post about the new team doesn't make sense until we agree on a common banner so let's get back to that discussion
[19:02:17] <skpvox> "The Open Speech Consortium"
[19:02:39] <bedahr> I propose that everyone that has a strong opinion on this should state their position, goals and "ideal situation"
[19:03:02] <bedahr> ultimately, this will probably not get solved today but we should have a good starting point by the end of the meeting to have a productive discussion on the mailing list afterwards
[19:03:38] <bedahr> I think I already outlined my position; skpvox: what would be your ideal organizational setup?
[19:04:22] <skpvox> i think we should focus on collaborative efforts. i see the advantages of a common banner, as such i would propose "The Open Speech Consortium":
[19:04:43] <drdru> I like it
[19:04:51] <bedahr> what would that actually be?
[19:04:53] <skpvox> "A consortium is an association of two or more individuals, companies, organizations or governments with the objective of participating in a common activity or pooling their resources for achieving a common goal."
[19:05:01] <skpvox> that's exactly what we want
[19:05:23] <bedahr> so who'd be a member of that consortium?
[19:05:40] <skpvox> individuals, corporations, foundations, etc ..
[19:06:01] <-- unormal (~fmario at adsl-89-217-251-21.adslplus.ch) has quit (Remote host closed the connection)
[19:06:12] <bedahr> who, given our current situation, would be a member?
[19:06:23] <skpvox> i think there are MANY more people/entities that are interested in Open source speech recognition
[19:06:27] --> unormal (~fmario at adsl-89-217-251-21.adslplus.ch) has joined #simon
[19:06:38] <skpvox> i basically just by accident stumbled upon the blog entry @ cmusphinx
[19:10:24] <bedahr> so you are proposing to form a consortium (the organizational structure with all the paperwork that entails). let me ask again: who would be a member of this consortium?
[19:10:38] <bedahr> for example: how would you be a member? as an individual, part of another organization, etc?
[19:10:49] <skpvox> i would be member first and foremost as an individual
[19:10:54] <skpvox> and once things materialize as a foundation
[19:11:12] <skpvox> and maybe even as a corporation (no concrete plans on that yet)
[19:11:50] -*- unormal back. Reading backlog...
[19:12:14] <skpvox> i have no idea how consortiums are structured from a legal point of view
[19:12:41] <bedahr> skpvox: http://de.wikipedia.org/wiki/Konsortium [German]
[19:12:54] <skpvox> however, it gives us the ability to expand this project beyond its current scope
[19:13:15] <bedahr> the move to establish a consortium would be highly unusual - to say the least
[19:13:34] <bedahr> traditionally these things are done as foundation if a separate organizational structure is required
[19:14:25] <skpvox> i want to found a foundation with the goal to improve open source speech recognition (and the promotion of polyglotism)
[19:14:39] <skpvox> however, i think our common goal extends beyond that
[19:15:51] <skpvox> if needed for PR i think an informal "consortium" would be fine
[19:15:52] <skpvox> or
[19:15:55] <skpvox> "The Open Speech Project"
[19:16:46] <skpvox> to be frank: i want myself or my foundation to just be *part* of a greater goal, as i think anything else would be limiting ..
[19:17:11] <bedahr> well this is what I'm afraid of as well
[19:17:12] <-- abinash (01261458 at gateway/web/freenode/ip.1.38.20.88) has quit (Ping timeout: 250 seconds)
[19:17:34] <unormal> Don't know if you guys talked about it. But being a KDE application doesn't mean it's just a GUI for a linux app. Bodega is e.g. a web application (framework) (content store) and is a KDE project/application as well.
[19:17:44] <bedahr> I don't know enough about the legal intricacies behind a consortium
[19:18:13] <bedahr> unormal: yes we talked about that - I've been trying to push KDE on at least skpvox for a while but it doesn't stick :)
[19:19:34] <unormal> bedahr: Ok, I'm sure you mentioned that KDE manifesto ( http://manifesto.kde.org ) as well?
[19:19:39] <bedahr> yes
[19:19:44] <bedahr> of course
[19:19:56] <unormal> Ok, sorry then, too flaky internet connection atm.
[19:20:06] <bedahr> nah, you couldn't have known, it was a private chat
[19:20:15] -*- unormal still reading backlog (or what's left of it).
[19:20:17] <unormal> Ah.
[19:21:01] <skpvox> i think it's limiting to the common goal if we make *any* organizational requirements / affiliations
[19:21:34] <bedahr> why would that be?
[19:21:56] <unormal> skpvox: Not limiting of course. Facilitating.
[19:22:13] <skpvox> because i think ultimately everybody does his own thing, but the result of that (shared) will advance the common goal
[19:22:27] <unormal> Why invent something new if we can use the KDE infrastructure (not just technical) or any other (like Apache, etc....)
[19:22:39] <bedahr> if everybody "does his own thing" without cooperation, then it's not a project but also not a consortium - that's nothing
[19:22:50] <-- Jon___ (171e7429 at gateway/web/freenode/ip.23.30.116.41) has quit (Ping timeout: 250 seconds)
[19:23:04] <unormal> There are other companies in KDE that can profit as well (which is not bad at all I think!!!)
[19:23:08] <skpvox> what i mean is that everybody has their own purpose for getting involved
[19:23:14] <bedahr> absolutely
[19:23:31] <bedahr> but in what way would e.g. KDE impede this?
[19:24:09] <skpvox> well, if we have let's say "The Open Speech Project" and it's a KDE initiated project
[19:24:20] <skpvox> that doesn't sound too bad
[19:24:39] <skpvox> but i saw some of the KDE requirements
[19:24:41] <bedahr> I really think much of the hesitation is based on misunderstanding what the "KDE umbrella" actually does
[19:24:53] <skpvox> and if i can't host my project on github for example
[19:25:14] <skpvox> that would be limiting (just a very trivial example i know)
[19:25:37] <bedahr> okay, I'm pretty sure that even that can be accommodated
[19:25:39] <unormal> Can you explain why github and not KDE git infrastructure? Really interested in the reason.
[19:26:18] <skpvox> it's a matter of taste
[19:26:18] -*- unormal wants to understand what github/gitorious or Co provides that KDE git doesn't...
[19:26:58] <bedahr> even taking the manifesto by the letter, you could host on github provided that you provide access for the KDE sysadmin team
[19:26:58] <unormal> "Just" taste? You meant you e.g. like the webfrontend better or so?
[19:27:36] <skpvox> the whole approach on github is different
[19:27:43] --> scummos (~sven at pD9E54B59.dip0.t-ipconnect.de) has joined #simon
[19:27:43] <skpvox> but the discussion is like vim vs. emacs
[19:28:04] <unormal> Probably ;-). /me should probably take a closer look at github.
[19:28:05] <drdru> from my (uninformed) perspective, KDE is a very specific project, linux desktop. unfortunately, linux desktop is not interesting to me
[19:28:06] <bedahr> unormal: github has a nicely integrated workflow for merge requests
[19:28:08] <skpvox> and i believe ANY limitation (whatsoever) to any potential contributor to our common goal will be detrimental
[19:28:13] <bedahr> drdru: it isn't
[19:28:15] <drdru> i think its great
[19:28:24] <unormal> drdru: KDE is not at all just about linux desktop.
[19:28:29] <drdru> but i am stuck on osx
[19:28:31] <skpvox> drdru: same for me. it might be more than that but the avg. person thinks of it as a linux desktop interface
[19:28:40] <drdru> exactly
[19:28:44] <drdru> perception
[19:28:45] <bedahr> drdru: this is a misconception, pure and simple. As has been pointed out before (even today), KDE is not about "linux desktop applications"
[19:28:58] <skpvox> well, that's the point
[19:29:07] <bedahr> drdru: this is what KDE is - not more, not less: http://manifesto.kde.org
[19:29:30] <drdru> i believe we need to be completely separate
[19:29:36] <drdru> simon can use the models
[19:29:41] <unormal> I'd like to see this project (speech thing/simon) reach people who never heard of KDE btw.
[19:29:47] <drdru> but so can every other supporting software
[19:29:52] <skpvox> yes
[19:30:00] <bedahr> yes, that is a given
[19:30:13] <skpvox> and also can every individual/organization/whatever contribute
[19:30:20] <drdru> so marketing should be general asr for foss
[19:30:47] <bedahr> I agree with both
[19:30:58] <bedahr> I don't see how KDE would do anything but aid these two goals
[19:31:06] <unormal> +1
[19:31:11] <drdru> its confusing
[19:31:19] <skpvox> well
[19:31:23] <drdru> i dont know what kde does besides linux desktop
[19:31:25] <skpvox> i'm for building "The Open Speech Project"
[19:31:30] <skpvox> it would just be a hub
[19:31:36] <skpvox> for whoever is interested to get together
[19:31:49] <skpvox> i do NOT want to be the owner/father/whatsoever of this project though
[19:31:59] <unormal> drdru: Applications for Windows, Mac, framework for Qt platforms, web content store, icons, etc.
[19:32:33] <unormal> skpvox: What would this open speech project need?
[19:32:34] <drdru> Open Speech Foundation?
[19:32:49] <drdru> like the Linux Foundation?
[19:33:03] <unormal> statutes, website, mailing list, legal body for trademarks?
[19:33:04] <drdru> the name of the project is incredibly important to people finding it
[19:33:15] <bedahr> at this point: having something where "everybody interested can get together" sounds nicer than it is. Without a governing body that dictates some kind of direction you're basically describing the whole internet.
[19:33:47] <bedahr> without purpose, any association is meaningless
[19:34:07] <skpvox> if we form such a project under KDE what restrictions would that entail?
[19:34:19] --> fewcha (~sanjiban at 14.139.221.18) has joined #simon
[19:34:37] <unormal> skpvox: What's written in the manifesto.
[19:34:53] <unormal> I.e. more benefits than restrictions IMHO.
[19:35:24] <bedahr> honestly: pretty much none. We'd have to support the code of conduct (be nice), provide access to KDE contributors and license our stuff under open licenses
[19:36:02] <unormal> Which includes lgpl!
[19:36:13] <bedahr> contributors must be "KDE contributors" (have an account on identity.kde.org with the developer flag set)
[19:36:29] <unormal> And we'd get a legal body, technical infrastructure (mailing list, wikis, translators, etc.)
[19:37:00] <bedahr> on the plus side we get: Marketing, many potential contributors, a stable legal body, infrastructure, and access to e.g. Google Summer of Code participation
[19:37:44] <bedahr> as someone coming from my own foundation before moving Simon to KDE: trust me, these are all good things
[19:37:56] <unormal> ;-)
[19:38:12] <skpvox> you guys sound like salesmen ;)
[19:38:27] <bedahr> that's because we know what we're talking about
[19:38:33] <skpvox> how is LibriVox organized?
[19:39:31] <unormal> bedahr and me are e.V. members who sometimes need to work/decide on the less cody and more bureaucratic stuff.
[19:39:55] <skpvox> what are some "projects" (not applications) under KDE?
[19:40:19] <bedahr> e.g. the vivaldi tablet and pretty much it's whole ecosystem
[19:40:21] <bedahr> *its
[19:40:45] <unormal> Thus the aforementioned Bodega content store thing.
[19:41:06] <unormal> skpvox: Oxygen is another KDE project.
[19:41:11] <bedahr> good point
[19:42:03] <unormal> Owncloud was another project that started as a KDE project (but with a yet not so good departure: another long story).
[19:42:08] <bedahr> okay, proposal: We create the "Open Speech Recognition Project" under the KDE e.V. umbrella.
[19:42:43] <skpvox> which would publish speech models?
[19:42:47] <unormal> But the manifesto was written exactly for that case: to show what KDE projects can be, and they shall not be "just" "linux desktop apps" or "linux desktops"
[19:42:48] <bedahr> the purpose of this project would be broadly defined to "Build and enhance open source speech recognition systems"
[19:43:11] <unormal> Sounds good.
[19:43:26] <skpvox> how can other organizations contribute in that project?
[19:43:31] <bedahr> The project would be about pretty much everything we defined as our earlier goal (create a dictation system and the steps to get there) using whatever means necessary. If that means using an app hosted on github so be it
[19:43:55] <skpvox> e.g. i am planning to create a wiki for open speech
[19:44:01] <unormal> skpvox: Get a KDE identity account and start coding.
[19:44:10] <bedahr> just like everyone else: Either fork the repo and put patches on reviewboard (similar to the github workflow) or by asking for a contributor account
[19:44:26] <skpvox> but what code?
[19:44:31] <unormal> skpvox: Use one of KDE's wikis. There you even may get translations...
[19:44:41] <bedahr> skpvox: I don't understand the question?
[19:45:16] <unormal> skpvox: The simon code is all in KDE's infrastructure.
[19:45:18] <skpvox> i thought the idea of the project was to orchestrate various efforts by different individuals/entities
[19:45:23] <bedahr> yes
[19:45:28] <bedahr> it is
[19:45:39] <skpvox> so what code are we talking about?
[19:45:43] <bedahr> but these individuals / entities should still work towards a common goal
[19:45:48] <bedahr> this goal is what I'm talking about
[19:46:03] <bedahr> if there's software to write towards this goal: this is the code I'm talking about
[19:46:16] <bedahr> if there are models to build: these data files will also be part of "the project"
[19:46:21] <bedahr> etc.
[19:46:25] <skpvox> what if i want to build a platform, or even a software that acts on its own
[19:46:31] <skpvox> but will contribute regardless?
[19:46:34] <bedahr> so be it
[19:46:44] <bedahr> what exactly does "acts on its own" mean?
[19:47:13] <skpvox> for example i build a wiki for user contributions
[19:47:20] <skpvox> i build a site for speech collection
[19:47:29] <skpvox> i build a platform for transcribing
[19:47:54] <skpvox> they would all operate outside of KDE
[19:47:59] <unormal> Use what you can and want to use from KDE.
[19:48:02] <skpvox> but they would form part of our "alliance" so to say
[19:48:49] <unormal> skpvox: See e.g. http://community.kde.org/User:Unormal
[19:48:54] <bedahr> well unless you built it *for* the common project (or allow the project to use it), we have no relation to it and it would be misleading to call it "part of the project".  that is of course regardless of KDE or not
[19:49:12] --> ovidiu-florin (~ovidiu-fl at 79.113.200.156) has joined #simon
[19:49:34] <ovidiu-florin> hello world :D
[19:49:41] <ovidiu-florin> so, I've missed the meeting?
[19:49:49] <bedahr> just because you are part of a project (or alliance) doesn't make everything you do part of that community. If you put it there then of course it's part of "The Open Speech Project"
[19:50:07] <bedahr> ovidiu-florin: not a great time, we're still having it (talking about the proposed organizational structure)
[19:50:58] <skpvox> drdru: what do you think about "The Open Speech Project" under KDE?
[19:51:12] <ovidiu-florin> just finished catching up with my emails, and I feel very sorry that I haven't been here since the beginning.
[19:51:15] <drdru> sorry catching up
[19:51:52] <-- scummos (~sven at pD9E54B59.dip0.t-ipconnect.de) has quit (Ping timeout: 256 seconds)
[19:51:54] <bedahr> btw. some slides about "KDE": http://www.slideshare.net/lydiapintscher/what-makes-kde-tick
[19:56:39] <skpvox> ok
[19:56:53] <bedahr> oh and another btw.: Necessitas, the Qt port to Android was a KDE project before being integrated in Qt proper: http://necessitas.kde.org
[19:56:53] <skpvox> if i look at the majority of KDE projects
[19:57:11] <skpvox> and "Open Source Speech Recognition"
[19:57:40] <skpvox> not sure if it's on the same level
[19:57:59] <skpvox> what would be some alternatives to KDE as an established organization?
[20:00:04] <bedahr> I don't think you'll find an org that looks like a perfect "fit" for speech recognition - it's just a very specialized project spanning several different layers (models, algorithms, end-user applications, etc.)
[20:00:29] <bedahr> the Apache foundation is another popular umbrella org - but I'd very much prefer KDE for obvious reasons
[20:01:09] <-- unormal (~fmario at adsl-89-217-251-21.adslplus.ch) has quit (Remote host closed the connection)
[20:01:32] --> unormal (~fmario at adsl-89-217-251-21.adslplus.ch) has joined #simon
[20:02:32] <unormal> Damn internetconnection.
[20:02:41] <bedahr> didn't miss anything
[20:03:03] <unormal> My last line: [19:59] <skpvox> and "Open Source Speech Recognition"
[20:03:05] <skpvox> so any contribution to the KDE project would just be through a "KDE developer" account?
[20:03:26] <bedahr> skpvox: the "KDE developer account" is really just a login
[20:03:34] <unormal> Yes. The same as on github. You need an account.
[20:03:58] <unormal> But in KDE you've access to all the code (almost, websites are a bit special).
[20:04:25] <skpvox> well, i wouldn't consider github suitable for our common goal
[20:04:37] <skpvox> I'm just thinking ahead, if bigger organizations want to contribute in the project
[20:04:45] <skpvox> and they are reduced to a "developer" account
[20:05:08] <bedahr> unormal: http://paste.kde.org/pd95c8aef/
[20:06:06] <bedahr> skpvox: these are two different things. The developer account is person specific (e.g. a programmer at company X will have an account to push code to the common repository). Some kind of authentication will always be required. that doesn't mean that we can't have formal partnerships with e.g. companies
[20:07:29] <bedahr> quite the contrary, actually. The established KDE e.V. is funded partly by Google, Suse and bluesystems
[20:07:33] <-- unormal (~fmario at adsl-89-217-251-21.adslplus.ch) has quit (Remote host closed the connection)
[20:08:04] --> unormal (~fmario at adsl-89-217-251-21.adslplus.ch) has joined #simon
[20:10:36] <bedahr> okay, does anyone have anything substantial to add at this point? otherwise I'd recommend I post the proposal made here to the mailing list, everybody thinks about it and comments there
[20:12:32] <skpvox> ok
[20:12:39] <skpvox> i'm all for building sth like "The Open Speech Project"
[20:12:49] <drdru> I like that
[20:13:02] <bedahr> alright
[20:13:06] <skpvox> and i think KDE might be ok for it as an umbrella organization
[20:13:16] <bedahr> well, I do apologize for the insanely long meeting
[20:13:20] <drdru> sure
[20:13:39] <bedahr> in any case, thanks to everybody here, I think it was again very productive
[20:13:58] <drdru> bedahr: who do I talk to about models?
[20:14:02] <bedahr> I'll send out meeting minutes and the proposal to kde-accessibility at kde.org
[20:14:06] <bedahr> drdru: me
[20:14:15] <skpvox> it would be great if we can organize sth
[20:14:25] <skpvox> regarding models ..
[20:14:29] <bedahr> I think I promised a couple of people to "talk after the meeting"
[20:14:37] <bedahr> sure
[20:15:02] <skpvox> we can organize it by email
[20:16:00] <bedahr> well it's just us three, right
[20:16:08] <-- unormal (~fmario at adsl-89-217-251-21.adslplus.ch) has quit (Remote host closed the connection)
[20:16:10] <bedahr> drdru: skpvox: can we just fix a time right now?
[20:16:34] <bedahr> How about this Friday?
[20:16:38] <bedahr> 2 pm UTC?
[20:16:39] --> unormal (~fmario at adsl-89-217-251-21.adslplus.ch) has joined #simon
[20:17:14] <drdru> bedahr: ideally I'd like to start asap
[20:17:31] <drdru> how long will it take to teach me?
[20:17:34] <bedahr> I understand but my schedule is pretty packed until at least thursday
[20:17:56] <skpvox> friday wouldnt work for me
[20:18:04] <skpvox> before friday, yes
[20:18:17] <bedahr> okay, first of all: what are you two expecting from this?
[20:19:06] <drdru> learning how to build models and how to evaluate them
[20:19:22] <skpvox> and how to adapt them to myself
[20:19:30] <drdru> I would like to play with the same training data you used in the models you created
[20:19:49] <drdru> if you already have the data and scripts available, I would love to access them
[20:19:56] <drdru> even if it's 1TB
[20:19:58] <bedahr> okay sure
[20:20:05] <bedahr> no, it's very much less than that
[20:20:09] <bedahr> and none of it is secret
[20:21:03] <bedahr> okay well the principal SPHINX training procedure basically boils down to preparing the data and executing a script. Adaptation is pretty similar. Most of the problems will arise once you start actually doing this yourself, afterwards :)
[20:21:16] <bedahr> so I'd say we can get through the most important things in ~ 2 hours
[20:22:05] <drdru> ok
[20:22:45] <bedahr> we can do it on Thursday afternoon
[20:22:57] <bedahr> something like 3 pm UTC?
[20:23:47] <skpvox> that works for me
[20:24:03] <bedahr> drdru: ?
[20:24:29] <drdru> 8am pacific?
[20:24:50] <bedahr> yes, sounds right
[20:25:03] <bedahr> maybe a bit early?
[20:25:17] <bedahr> 4 pm utc still okay for you, skpvox? that's the latest i can offer
[20:25:56] <skpvox> ok
[20:27:02] <bedahr> drdru: ?
[20:27:12] <drdru> I think it'll work
[20:27:58] <bedahr> okay, great
[20:28:14] <bedahr> then after just 3 and a half hours: meeting adjourned :)

