
Peter Grasch peter at
Fri Feb 21 07:24:43 UTC 2014


On 02/18/2014 11:11 AM, Peter Bouda wrote:
> One thing came to my mind, about the alignment. Another use case for
> that would be language documentation. Normally a researcher will do
> audio/video recording and then transcribe the data with software like
> Elan. The alignment is done manually
> in this case. Most researchers work together with a native speaker
> and/or students in this case; those users need to be trained in Elan,
> which costs a lot of time. So there are cases where MS Word or something
> similar is used for transcription. It would be a *big* help if
> researchers and their team members could just use any editor to write
> down the transcriptions and then align them later (reliably). Most
> languages are very different from English, so I think the system needs
> to support alternative phoneme inventories in this case. Btw, what kind
> of data does the aligner need for this? Just phoneme inventories or full
> phonetic dictionaries?
Phonetic dictionaries and a matching acoustic model are required.
Because you're operating within a very narrow domain (you already know
what is being said, after all), the models don't actually need to be
very good, though.
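
To illustrate why the dictionary matters: before aligning, every word in
the transcript has to map to a phoneme sequence, and every phoneme has to
be covered by the acoustic model. Below is a minimal sketch of that
coverage check in Python. The dictionary format (one "word ph1 ph2 ..."
entry per line) and all function names are assumptions made for
illustration; they are not taken from any particular aligner.

```python
import re


def load_dictionary(lines):
    """Parse 'word ph1 ph2 ...' lines into {word: [phonemes]}.

    The one-entry-per-line layout is an assumed, hypothetical format.
    """
    entries = {}
    for line in lines:
        parts = line.split()
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        entries[parts[0].lower()] = parts[1:]
    return entries


def missing_words(transcript, dictionary):
    """Return transcript words that have no pronunciation, in first-seen order."""
    seen = []
    # \w+ is a simplification; real orthographies may need a custom tokenizer.
    for word in re.findall(r"\w+", transcript.lower()):
        if word not in dictionary and word not in seen:
            seen.append(word)
    return seen


def phoneme_inventory(dictionary):
    """Collect the set of phonemes the acoustic model would have to cover."""
    return {ph for phones in dictionary.values() for ph in phones}
```

Any word reported by missing_words() would need a dictionary entry before
alignment, and phoneme_inventory() shows which phonemes the acoustic model
has to know about.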

> Just a thought. I could maybe get some data (word+audio in several
> languages) and test users for such a system.
Thanks, it's something to keep in mind.

Best regards,

More information about the Kde-speech mailing list