<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40"><head><META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=us-ascii"><meta name=Generator content="Microsoft Word 14 (filtered medium)"><style><!--
/* Font Definitions */
@font-face
{font-family:"Cambria Math";
panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
{font-family:Calibri;
panose-1:2 15 5 2 2 2 4 3 2 4;}
@font-face
{font-family:Tahoma;
panose-1:2 11 6 4 3 5 4 4 2 4;}
@font-face
{font-family:Consolas;
panose-1:2 11 6 9 2 2 4 3 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0cm;
margin-bottom:.0001pt;
font-size:12.0pt;
font-family:"Times New Roman","serif";}
a:link, span.MsoHyperlink
{mso-style-priority:99;
color:blue;
text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
{mso-style-priority:99;
color:purple;
text-decoration:underline;}
p
{mso-style-priority:99;
mso-margin-top-alt:auto;
margin-right:0cm;
mso-margin-bottom-alt:auto;
margin-left:0cm;
font-size:12.0pt;
font-family:"Times New Roman","serif";}
pre
{mso-style-priority:99;
mso-style-link:"HTML Preformatted Char";
margin:0cm;
margin-bottom:.0001pt;
font-size:10.0pt;
font-family:"Courier New";}
span.HTMLPreformattedChar
{mso-style-name:"HTML Preformatted Char";
mso-style-priority:99;
mso-style-link:"HTML Preformatted";
font-family:Consolas;}
span.EmailStyle20
{mso-style-type:personal-reply;
font-family:"Calibri","sans-serif";
color:#1F497D;}
.MsoChpDefault
{mso-style-type:export-only;
font-family:"Calibri","sans-serif";}
@page WordSection1
{size:612.0pt 792.0pt;
margin:72.0pt 72.0pt 72.0pt 72.0pt;}
div.WordSection1
{page:WordSection1;}
--></style><!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="1026" />
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
<o:idmap v:ext="edit" data="1" />
</o:shapelayout></xml><![endif]--></head><body lang=EN-US link=blue vlink=purple><div class=WordSection1><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'>Sounds good. I’d be interested in contributing more as that happens.<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'><o:p> </o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'>The only remaining suggestion from me would be to lower the threshold from 2 seconds to 1 for better perceived responsiveness.<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'><o:p> </o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'>Also, does the system record the past 500 to 1,000 ms of audio in a temporary buffer so that recognition can start from there?<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'><o:p> </o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'>This would give the user an impression of “smoothness”, as they won’t encounter the possible frustration of speaking before the system is ready, even though, as you point out, you inform them.<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'><o:p> </o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'>Btw, let’s keep all conversations on list.<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'><o:p> </o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'>Take 
care,<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'>Sina<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'><o:p> </o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'>Website: www.SinaBahram.com<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'>Twitter: @SinaBahram<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'><o:p> </o:p></span></p><p class=MsoNormal><b><span style='font-size:10.0pt;font-family:"Tahoma","sans-serif"'>From:</span></b><span style='font-size:10.0pt;font-family:"Tahoma","sans-serif"'> Yash Shah [mailto:blazonware@gmail.com] <br><b>Sent:</b> Tuesday, March 20, 2012 4:50 PM<br><b>To:</b> Sina Bahram<br><b>Subject:</b> Re: [Kde-accessibility] Working Demonstration for Simon[GSoC]<o:p></o:p></span></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal style='margin-bottom:12.0pt'>Hey Sina,<br>Thanks for your input.<br>Yes, it is restrictive; I was only sketching the idea. Some people will want to use blinking to activate/deactivate the microphone, so it will be an optional feature for them. In a survey I conducted on my campus, people really appreciated the blink feature. My main focus will be on mouth movement detection. 
<br>For blind users, we can use finger/hand gesture detection.<o:p></o:p></p><div><p class=MsoNormal>On Wed, Mar 21, 2012 at 2:06 AM, Sina Bahram &lt;<a href="mailto:sbahram@nc.rr.com">sbahram@nc.rr.com</a>&gt; wrote:<o:p></o:p></p><div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'>Some thoughts from universal design:</span><o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'> </span><o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'>I would suggest that perhaps the blinking might be too restrictive. I’m thinking of blind users that might want to use this but for whom blinking might not be the best way to communicate.</span><o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'> </span><o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'>Also, it might be nice to activate it in some other way than focus of visual gaze, as again the blind user in this scenario won’t know where to look.</span><o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'> </span><o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'>Take care,</span><o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span 
style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'>Sina</span><o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'> </span><o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'>Website: <a href="http://www.SinaBahram.com" target="_blank">www.SinaBahram.com</a></span><o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'>Twitter: @SinaBahram</span><o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'> </span><o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><b><span style='font-size:10.0pt;font-family:"Tahoma","sans-serif"'>From:</span></b><span style='font-size:10.0pt;font-family:"Tahoma","sans-serif"'> <a href="mailto:kde-accessibility-bounces@kde.org" target="_blank">kde-accessibility-bounces@kde.org</a> [mailto:<a href="mailto:kde-accessibility-bounces@kde.org" target="_blank">kde-accessibility-bounces@kde.org</a>] <b>On Behalf Of </b>Yash Shah<br><b>Sent:</b> Tuesday, March 20, 2012 4:28 PM<br><b>To:</b> <a href="mailto:kde-accessibility@kde.org" target="_blank">kde-accessibility@kde.org</a><br><b>Subject:</b> [Kde-accessibility] Working Demonstration for Simon[GSoC]</span><o:p></o:p></p><div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'> <o:p></o:p></p><div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'>Hi Peter,<br><br>I have been working on a face detection/recognition project for Simon for the last few days. I have implemented part of it. 
I would like to demonstrate it to you.<br><br>The major obstacle for command-and-control speech recognition systems is differentiating commands from background noise. We will therefore activate recognition only when the user is actively looking at the screen / robot, and we will also detect whether the user is speaking by recognizing mouth movements. <b>So we are not only detecting the face; we are also detecting whether the user is speaking.</b> <b>We can also activate/deactivate the microphone on eye blinks. </b>For example, if the user blinks twice in a row, we can activate/deactivate the microphone. We will also allow user-defined gestures to control it. This matches natural "human to human" communication.<o:p></o:p></p><p>I have uploaded a video of the working demo to YouTube.<o:p></o:p></p><p><a href="http://www.youtube.com/watch?v=wGI4lYXxlWg" target="_blank">http://www.youtube.com/watch?v=wGI4lYXxlWg</a><o:p></o:p></p><p><br>I am able to detect:<o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;margin-bottom:12.0pt'>1. Face<br>2. Eyes<br>3. Mouth<br>4. Whether the user is speaking or not<br><br>I can accurately track the cropped image of the face, as shown in the Cropped Window in the video. I can also track the size of the face and handle tilted faces. The image processing uses very little CPU. We check for users every 2 seconds, which keeps it fast and efficient.<br><br>This is just a demonstration of how things will be done. We will be using the libKface library for efficient face detection. It was developed by my friend Aditya Bhatt from my college in GSoC 2010. I will extend it to detect the mouth and other parts. 
This is not just about the 3 months of GSoC; we will keep linking computer vision to Simon after that as well.<br><span style='color:#888888'> <br><br></span><o:p></o:p></p><pre><span style='color:#888888'>-- </span><o:p></o:p></pre><pre><span style='color:#888888'>Regards,</span><o:p></o:p></pre><pre><span style='color:#888888'>Yash Shah</span><o:p></o:p></pre></div></div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'> <o:p></o:p></p></div></div></div></div></div><p class=MsoNormal><o:p> </o:p></p></div></body></html>