[Kde-accessibility] Lip Reader Demo

Peter Grasch grasch at simon-listens.org
Wed Mar 21 20:37:59 UTC 2012


Hi Yash,

On Wednesday, 21 March 2012, 04:17:33, Yash Shah wrote:
> I just wrote demo code of how to track out lip movements to know whether a
> person is speaking or not.
Again: really, really impressive.

As Julius is quite robust against background noise, this might just be a game
changer: the problem with background noise is often not the actual
recognition but the sound segmentation. Using vision for that might
really make the system much more robust.
Whether it's reliable in real-world situations is another matter, though :)
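
To make the segmentation idea concrete, here is a minimal sketch, assuming
the vision side already delivers one "lips moving" flag per video frame. The
function name, the parameters and their defaults are hypothetical and are not
part of Simon or Julius; the idea is only that the visual signal, rather than
the audio level, decides where an utterance starts and ends.

```python
from typing import List, Tuple


def speech_segments(lip_moving: List[bool],
                    min_len: int = 3,
                    max_gap: int = 2) -> List[Tuple[int, int]]:
    """Turn per-frame lip-movement flags into (start, end) frame ranges.

    `end` is exclusive. Pauses of up to `max_gap` still frames are bridged,
    and segments shorter than `min_len` frames are discarded as noise.
    Both thresholds are illustrative assumptions, not tuned values.
    """
    segments: List[Tuple[int, int]] = []
    start = None  # index of the first frame of the open segment, if any
    gap = 0       # number of consecutive still frames inside that segment

    for i, moving in enumerate(lip_moving):
        if moving:
            if start is None:
                start = i
            gap = 0
        elif start is not None:
            gap += 1
            if gap > max_gap:
                # Close the segment right after the last moving frame.
                end = i - gap + 1
                if end - start >= min_len:
                    segments.append((start, end))
                start = None
                gap = 0

    # Close a segment that is still open when the input ends.
    if start is not None:
        end = len(lip_moving) - gap
        if end - start >= min_len:
            segments.append((start, end))

    return segments
```

The resulting frame ranges could then be mapped to audio sample offsets and
only those slices handed to the recognizer.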

> In the video, note how it automatically 'locks on' to the new face if a new
> person comes in front of the camera and starts reading his lips.
Yep, noticed that on the first round already. I also noticed how careful you 
were to minimize the time where two people are in the same frame :P

> The actual face detection takes place only once every two seconds to save
> CPU time. The rest of the time, it only needs to 'track' the face object
> using the CamShift algorithm, which is super fast and lightweight and works
> by tracking difference between consecutive frames.
I already asked this last time, but could OpenCL be used to further reduce
the performance footprint?
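
For reference, the "detect rarely, track in between" scheme described above
can be sketched as follows. This is not the demo's actual code: `follow_face`,
`detect` and `track` are hypothetical names, and the two callables stand in
for whatever face detector and CamShift step the demo really uses.

```python
from typing import Any, Callable, Iterable, List, Optional, Tuple

Box = Tuple[int, int, int, int]  # x, y, width, height


def follow_face(frames: Iterable[Any],
                timestamps: Iterable[float],
                detect: Callable[[Any], Optional[Box]],
                track: Callable[[Any, Box], Optional[Box]],
                detect_interval: float = 2.0) -> List[Optional[Box]]:
    """Run the expensive detector at most every `detect_interval` seconds.

    On all other frames the cheap tracker updates the previous box. If no
    box is available (detection or tracking failed), detection is retried
    on the next frame instead of waiting for the interval to elapse.
    """
    box: Optional[Box] = None
    last_detect: Optional[float] = None
    boxes: List[Optional[Box]] = []

    for frame, t in zip(frames, timestamps):
        due = last_detect is None or t - last_detect >= detect_interval
        if due or box is None:
            box = detect(frame)       # expensive: full face detection
            last_detect = t
        else:
            box = track(frame, box)   # cheap: follow the known face
        boxes.append(box)

    return boxes
```

Any offloading to the GPU would mostly pay off inside `detect`, since that is
the step the interval is meant to keep rare.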

> This is with respect to the project about face detection for Simon.
It might be better to keep further updates in the same thread to make tracking 
the project easier.

Best regards,
Peter

