Hi, this is my revised proposal for the Google Summer of Code KDE-digiKam project to implement face recognition in digiKam.
<div>I would appreciate any suggestions or comments to improve the proposal.</div><div>Proposal follows:</div><div><br></div><div><span style="font-family: arial,sans-serif; font-size: 13px; border-collapse: collapse;"><u><b>Title</b></u>: Implementation of a face recognition engine in digiKam for automatic face recognition and tagging.</span></div>
<div><span style="font-family: arial,sans-serif; font-size: 13px; border-collapse: collapse;"><br><b><u>Motivation</u>:</b> I have always been a big fan and supporter of intelligent and elegant technologies, so when I had to join a Special Interest Group at our institute, my obvious choice was the face recognition SIG, as it interested me the most.</span></div>
<div><span style="font-family: arial,sans-serif; font-size: 13px; border-collapse: collapse;"><br></span></div><div><span style="font-family: arial,sans-serif; font-size: 13px; border-collapse: collapse;">Having used Picasa 3.5 and then digiKam after I moved to KDE, I felt the lack of face tagging in digiKam.</span></div>
<div><span style="font-family: arial,sans-serif; font-size: 13px; border-collapse: collapse;">Automatic face tagging being the most requested feature on the digiKam mailing lists (<a href="http://bugs.kde.org/show_bug.cgi?id=146288" target="_blank">request-id 146288</a>) motivated me to take it up as my GSoC project.</span></div>
<div><span style="font-family: arial,sans-serif; font-size: 13px; border-collapse: collapse;"><br></span></div><div><span style="font-family: arial,sans-serif; font-size: 13px; border-collapse: collapse;">
Also, since face recognition is my area of research at my institute, I have a strong desire to pursue this project, as it would be fruitful for both the project and me. (I have discussed this feature on the digikam-devel list: <a href="http://old.nabble.com/face-recognition-in-digikam-td26844374.html" style="color: rgb(0, 0, 204);" target="_blank">http://old.nabble.com/face-recognition-in-digikam-td26844374.html</a>.)</span></div>
<div><span style="font-family: arial,sans-serif; font-size: 13px; border-collapse: collapse;"><br>
<b><u>Implementation Details</u>:</b></span></div><div><span style="font-family: arial,sans-serif; font-size: 13px; border-collapse: collapse;"><b><span style="font-weight: normal;">I propose to follow a </span>library and plugin architecture<span style="font-weight: normal;">, wherein the front end would consist of menu additions to digiKam in the form of a face-recognition plugin and the back end would contain the detection and recognition engine that performs the image processing.</span><br>
<span style="font-weight: normal;"><br></span></b>
<div style="margin-left: 40px;"><b>Library implementation:</b></div><div style="margin-left: 40px;"><b><br></b></div><div style="margin-left: 40px;">1. Face detection would be done using the detector from <a href="http://libface.sourceforge.net/file/Home.html" target="_blank">Libface</a>, which is based on multiple Haar cascades and has been implemented and tested with OpenCV with promising results.</div>
<div style="margin-left: 40px;">2. Since I propose to use Elastic Bunch Graphs for recognizing faces (the reasons and a comparison with other models follow), I propose to use OpenCV for the image processing.</div><div style="margin-left: 40px;">
3. Since most modern CPUs have multiple cores, and given the recent advent of general-purpose GPU SDKs, I propose to implement the library routines in either CUDA or OpenCL (<b>discussions with mentors required</b>).<br>CUDA and OpenCL are attractive for the library because recognition speed would increase over time as CPUs with more cores and faster GPUs become common on desktops, making digiKam future-proof.<br>
<br><span style="font-family: arial,sans-serif; font-size: 13px; border-collapse: collapse;">The idea behind packaging this work as a library is to allow face recognition to be extended to general object recognition in the future,<br>
and to make it easy for other projects to adopt the same library in their code.</span><br></div>
<div style="margin-left: 40px;"><br></div><div style="margin-left: 40px;"><b>Why Elastic Bunch Graph (EBG) Based Face Recognition:</b></div><div style="margin-left: 40px;">1. It adds semantics to face recognition. It recognizes faces by considering individual facial features (eyes, nose, etc.) and uses matching criteria such as the distances between those features.</div>
<div style="margin-left: 40px;">In contrast, PCA- and LDA-based methods such as Eigenfaces and Fisherfaces do not extract facial feature data and thus do not use the patterns within faces, resulting in lower recognition accuracy.</div>
<div style="margin-left: 40px;"><br></div><div style="margin-left: 40px;">2. Elastic Bunch Graphs can store the feature data in matrices (this may vary depending on the implementation), so adding another training image is just a matter of adding a matrix entry. In contrast, Eigenfaces- and Fisherfaces-based methods create a set of representative images (eigenfaces or Fisherfaces) that need to be recalculated every time a new training image is added, leading to significant time overhead for large training sets.</div>
<div style="margin-left: 40px;">Because no retraining is needed, the EBG-based method is a better candidate for the library.</div><div style="margin-left: 40px;"><br></div><div style="margin-left: 40px;">3. Since the EBG-based method considers the semantic data of a face, it can cope with pose and expression variation in the test image. This is a major concern in a photo management application, where there may be photos of the same person in different moods and hence with different facial expressions.</div>
<div style="margin-left: 40px;"><br></div><div style="margin-left: 40px;">Further statistical data comparing the EBG method with various other face recognition methods can be found here: <a href="http://tinyurl.com/facerec-compare" target="_blank">http://tinyurl.com/facerec-compare</a></div>
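<div style="margin-left: 40px;"><br></div><div style="margin-left: 40px;">To illustrate point 2, here is a toy sketch in Python. It assumes each face has already been reduced to a list of per-landmark feature vectors ("jets" of Gabor magnitudes); the landmark search, the elastic matching step and any geometric distortion term are left out. The names <code>Gallery</code>, <code>jet_similarity</code> and <code>graph_similarity</code> are hypothetical and only for illustration. Jets are compared with a normalized dot product, and enrolling a new training image only appends one entry to the gallery.</div>

```python
import math


def jet_similarity(jet_a, jet_b):
    """Normalized dot product of two jets (lists of Gabor magnitudes)."""
    dot = sum(a * b for a, b in zip(jet_a, jet_b))
    norm = math.sqrt(sum(a * a for a in jet_a) * sum(b * b for b in jet_b))
    return dot / norm if norm else 0.0


def graph_similarity(graph_a, graph_b):
    """Mean jet similarity over corresponding landmarks of two face graphs."""
    sims = [jet_similarity(a, b) for a, b in zip(graph_a, graph_b)]
    return sum(sims) / len(sims)


class Gallery:
    """Hypothetical store of enrolled face graphs, keyed by person name."""

    def __init__(self):
        self.graphs = {}  # name -> list of face graphs

    def add(self, name, graph):
        # Enrolling a training image appends one entry;
        # nothing already stored has to be recomputed.
        self.graphs.setdefault(name, []).append(graph)

    def identify(self, probe):
        """Return (name, score) of the most similar enrolled graph."""
        scored = [
            (graph_similarity(probe, graph), name)
            for name, graphs in self.graphs.items()
            for graph in graphs
        ]
        if not scored:
            return None, 0.0
        score, name = max(scored)
        return name, score
```

<div style="margin-left: 40px;">For example, after <code>gallery.add("alice", graph)</code> for a few people, <code>gallery.identify(probe)</code> returns the closest name together with its similarity score, which the caller could compare against a threshold before suggesting a tag.</div>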
<div style="margin-left: 40px;"><br></div></span></div><blockquote style="border: medium none ; margin: 0pt 0pt 0pt 40px; padding: 0px;"><div><span style="font-family: arial,sans-serif; font-size: 13px; border-collapse: collapse;"><div>
<span style="font-family: arial,sans-serif; font-size: 13px; border-collapse: collapse;"><b>The UI and the digiKam/KDE integration:</b></span></div><div>The face recognition feature would be a plugin or built-in feature of digiKam that can be enabled or disabled from the "Configure digiKam" menu under the "Settings" main menu item already in digiKam.</div>
<div><br></div><div>Once enabled, detection would be run over all the digiKam albums. This would be done asynchronously so that it does not hinder usability.</div>
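<div><br></div><div>A minimal Python sketch of the asynchronous scan, using only the standard library. The function names <code>detect_faces</code> and <code>scan_albums</code> are hypothetical, and <code>detect_faces</code> is only a placeholder for the real detector; the actual plugin would be written against digiKam's own threading facilities.</div>

```python
from concurrent.futures import ThreadPoolExecutor


def detect_faces(image_path):
    """Hypothetical placeholder for the real detector; returns face boxes."""
    return []


def scan_albums(image_paths, on_result, detector=detect_faces, workers=2):
    """Run detection in background threads and report each result via a callback."""
    pool = ThreadPoolExecutor(max_workers=workers)
    futures = []
    for path in image_paths:
        future = pool.submit(detector, path)
        # Bind the path now so each callback reports the right image.
        future.add_done_callback(lambda f, p=path: on_result(p, f.result()))
        futures.append(future)
    # Return immediately; already submitted work keeps running in the background.
    pool.shutdown(wait=False)
    return futures
```

<div>The callback runs on a worker thread, so a real UI would have to hand the result over to its main thread before touching any widgets.</div>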
<div><br></div><div>Once detection is done, the user would be presented with a simple widget showing the detected faces, along with the UI elements needed to name the faces and to reject non-face regions picked up by the detection system.<br>
This would reuse code from the current "Edit" menu, where a box can be drawn on a portion of the image.<br></div>
<div><br></div><div>After that, the metadata would be stored in RDF or XMP format inside the image. Region tagging would be based on the following image annotation system: <a href="http://www.kanzaki.com/docs/sw/img-annotator.html" target="_blank">http://www.kanzaki.com/docs/sw/img-annotator.html</a>.<br>
<br>The detected faces would be stored in a database (the choice depends on project constraints, most likely SQLite) along with the bunch graphs for faster data access and updates.<br><br>The generated metadata, i.e. the names of the people in a photograph, would be registered with Nepomuk for linking with names in emails etc.<br>
</div>
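<div><br></div><div>As a rough sketch of what the face database could look like, assuming SQLite is chosen: the table layout, column names and helper functions below are hypothetical and would need to be aligned with digiKam's existing database schema. Each row holds one detected face region, an optional person name and the serialized bunch graph data.</div>

```python
import sqlite3

# Hypothetical schema: one row per detected face region.
SCHEMA = """
CREATE TABLE IF NOT EXISTS faces (
    id          INTEGER PRIMARY KEY,
    image_path  TEXT NOT NULL,
    x           INTEGER,
    y           INTEGER,
    width       INTEGER,
    height      INTEGER,
    person_name TEXT,   -- NULL until the user names the face
    bunch_graph BLOB    -- serialized feature data
);
"""


def open_face_db(path=":memory:"):
    """Open (or create) the face database and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def add_face(conn, image_path, box, bunch_graph=None):
    """Store one detected face region and return its row id."""
    x, y, width, height = box
    cur = conn.execute(
        "INSERT INTO faces (image_path, x, y, width, height, bunch_graph) "
        "VALUES (?, ?, ?, ?, ?, ?)",
        (image_path, x, y, width, height, bunch_graph),
    )
    conn.commit()
    return cur.lastrowid


def name_face(conn, face_id, person_name):
    """Attach the name the user entered to a detected face."""
    conn.execute(
        "UPDATE faces SET person_name = ? WHERE id = ?", (person_name, face_id)
    )
    conn.commit()


def photos_of(conn, person_name):
    """List the images in which the given person has been tagged."""
    rows = conn.execute(
        "SELECT DISTINCT image_path FROM faces "
        "WHERE person_name = ? ORDER BY image_path",
        (person_name,),
    )
    return [row[0] for row in rows]
```

<div>Rejecting a false detection from the naming widget would then simply delete the corresponding row.</div>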
<div><span style="font-family: arial,sans-serif; font-size: 13px; border-collapse: collapse;"><b><br>
</b></span></div></span></div></blockquote><div><span style="font-family: arial,sans-serif; font-size: 13px; border-collapse: collapse;"><u><b>Proposed schedule:</b></u><br></span><span style="font-family: arial,sans-serif; font-size: 13px; border-collapse: collapse;"><b>Now - April 9th</b>: <i>Build digiKam</i>, tinker with the code, and fix a few bugs to <i>increase familiarity</i>.<br>
<br></span><div style="margin-left: 40px;"><span style="font-family: arial,sans-serif; font-size: 13px; border-collapse: collapse;"></span></div><span style="font-family: arial,sans-serif; font-size: 13px; border-collapse: collapse;"><b>April 10th-April 30th</b>: Finalize the metadata format (i.e. RDF or XMP) and <i>get basic detection working</i>.<br>
<br><b>May 1st-June 1st</b>: OpenCL/CUDA-based recognition engine ready. <i>Documentation and bundling into a library with some test images</i>.<br><b>June 1st-July 1st</b>: Finish building the plugin UI and integrating it with the recognition engine. <i>Document code and write a User's Guide.</i> <br>
<br><b>July 1st-August 1st</b>: Write a small daemon to collect performance data from willing beta testers. Thoroughly test the system, collect anonymous usage data, and fine-tune the plugin and the recognition engine. <i>Document the tests and performance statistics.<br>
<br></i><b>August 1st-August 9th</b>: Last-minute modifications, <i>if any</i>.<br><br></span></div><div><span style="font-family: arial,sans-serif; font-size: 13px; border-collapse: collapse;"><br>
<span style="border-collapse: separate; font-family: arial; font-size: small;"><div><span style="font-family: arial,sans-serif; font-size: 13px; border-collapse: collapse;"><u><b>Why should I be chosen:</b></u></span></div>
<div><span style="font-family: arial,sans-serif; font-size: 13px; border-collapse: collapse;"><u><b><br></b></u></span></div><div><span style="font-family: arial,sans-serif; font-size: 13px; border-collapse: collapse;">1. I have a good understanding of the Qt framework and have both used it and given demos of it at local LUG meetings.</span></div>
<div><span style="font-family: arial,sans-serif; font-size: 13px; border-collapse: collapse;">2. I am pursuing research on face recognition at my institute and would be able to contribute a lot to this project.</span></div><div>
<span style="font-family: arial,sans-serif; font-size: 13px; border-collapse: collapse;">3. I have been an active member of the Bangalore Open Solaris User's Group (BOSUG) for over a year and am deeply committed to free software. I am also working on an installer based on Qt for Belenix OS.<br>
</span></div>
<div><span style="font-family: arial,sans-serif; font-size: 13px; border-collapse: collapse;">4. I have experience using svn and git and am comfortable working with tools like CMake.</span></div><div><span style="font-family: arial,sans-serif; font-size: 13px; border-collapse: collapse;"><br>
</span></div></span><br><u><b>About me</b></u></span></div><div><span style="font-family: arial,sans-serif; font-size: 13px; border-collapse: collapse;"><u><b><br></b></u></span></div><div><span style="font-family: arial,sans-serif; font-size: 13px; border-collapse: collapse;">Name: Kunal Ghosh</span></div>
<div><span style="font-family: arial,sans-serif; font-size: 13px; border-collapse: collapse;">IRC: gancient<br>Location: Bangalore, India<br><br>I am currently pursuing my Bachelor of Engineering in Computer Science and Engineering and have been deeply motivated and inspired by the philosophy of Free Software. I take an interest in music and arts of all forms.</span></div>
<div><span style="font-family: arial,sans-serif; font-size: 13px; border-collapse: collapse;"><br></span></div><div><span style="font-family: arial,sans-serif; font-size: 13px; border-collapse: collapse;">Pattern Recognition and Robotics interest me (though they are opposite fields, in my opinion they complement each other).</span></div>
<div><span style="font-family: arial,sans-serif; font-size: 13px; border-collapse: collapse;"><br></span></div><div><span style="font-family: arial,sans-serif; font-size: 13px; border-collapse: collapse;">I also love to use Python for my programming needs whenever possible (a language should be a means of easy and efficient communication, and Python is exactly that).</span></div>