[Digikam-devel] Regarding face recognition in digikam GSoC project

Sun Apr 4 08:12:18 BST 2010

Hi Kunal,
  As I already said, the proposal as a whole looks really
good and well baked already. There is not much for me to
suggest, since it is close to perfect. One important
thing: you must still pay closer attention to your
punctuation, since it is an integral part of technical writing.
Other than that, my minor suggestions are inlined below.

Also, don't wait for anything else: submit it first and then
read the review :P

It would be good, although not necessary, to wrap your proposal
at 80 chars per line. It will make the proposal look more
professional.

On Fri, Apr 2, 2010 at 11:50 PM, kunal ghosh <kunal.t2 at gmail.com> wrote:

> Hi, this is my revised proposal for the Google Summer of Code KDE-digikam
> project to implement face recognition in digikam.
> I kindly request you to provide suggestions / comments to
> improve the proposal.
> Proposal follows:
>
> *Title*: Implementation of Face Recognition engine in digikam for
> automatic face recognition and tagging.
>
> *Motivation:* I have always been a big fan and supporter of intelligent and
> elegant technologies, so when I had to join a Special Interest Group at
> our institute, my obvious choice was the face recognition SIG, as it
> interested me the most.
>
> Having used Picasa 3.5 and then Digikam after I shifted to KDE,
> I felt the lack of face tagging.
> Automatic face tagging being the most demanded feature on the digikam
> mailing lists
> (request-id 146288 <http://bugs.kde.org/show_bug.cgi?id=146288>) motivated
> me to take it up as my GSoC project.
>
> Also, since face recognition is my area of research
> at my institute, I have a deep desire to pursue this project, as it would
> be most fruitful for both the project and me. (I had discussed this feature
> with the digikam-devel list:
> http://old.nabble.com/face-recognition-in-digikam-td26844374.html ).
>
> *Implementation Details:*
> I propose to follow a library and plugin architecture wherein the
> front-end would be menu additions to digikam in the form of a
> face-recognition plugin and the back end would have the detection and
> recognition engine performing the image processing.
>
> *Library implementation:*
> 1. Face detection would be done using the detector from
> Libface <http://libface.sourceforge.net/file/Home.html>, which is based on
> multiple Haar cascades and has been tried, tested and implemented in OpenCV
> with promising results.
> 2. Since I propose to use Elastic Bunch Graphs for recognizing faces (reasons
> and a comparison with other models follow), I propose to use OpenCV for
> the image processing.
> 3. Since most modern CPUs have multiple cores, and given the recent advent
> of General Purpose GPU SDKs, I propose to implement the library routines in
> either CUDA or OpenCL (*discussions with mentors required*).
> CUDA and OpenCL would be more interesting for the library because the
> recognition speed would increase over time as CPUs with more cores and
> faster GPUs become common on desktops, making digikam future-proof.
>
>
> The idea behind splitting the work out into a library is to allow extending
> face recognition to object recognition in general in the future,
> and to make it easy for other projects to adopt the same library in their
> code.
>
> *Why Elastic Bunch Graph (EBG) Based Face Recognition:*
> 1. It adds semantics to face recognition. It recognizes faces by taking into
> consideration various facial features (eyes, nose etc.) and uses matching
> criteria such as the distance between the features.
> By comparison, PCA- and LDA-based methods such as Eigenfaces and
> Fisherfaces do not extract facial feature data and thus do not use the
> patterns in faces, resulting in lower recognition accuracy.
>
> 2. Elastic Bunch Graphs can store the feature data in matrices (may vary
> depending on implementation), so adding another training image is just a
> matter of adding a matrix entry, whereas Eigenfaces- and Fisherfaces-based
> methods create a set of representative images (eigenfaces or fisherfaces)
> which need to be recalculated every time a new training image is added,
> leading to a lot of time overhead for large training sets.
> Not needing this retraining makes the EBG-based method a better
> contender for the library.
>
> 3.Since EBG based method considers semantic data of face it understands
>

It would be good to have a comma in between; as a single
unbroken sentence, it creates some confusion.

>  where there is a pose and expression variation in the test image. This is
> a major concern in a photo management application wherein there may be
> photos of the same person in a different mood hence a different facial
> expression.
>
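
On point 2 above, a toy sketch in Python may make the "no retraining"
argument concrete. It is not an EBG implementation: the plain feature
vectors stand in for whatever the bunch graphs would actually store, cosine
similarity is an assumed matching criterion, and FaceGallery,
add_training_image and recognize are names invented for illustration, not
part of Libface or OpenCV. The only point shown is that adding a sample
appends an entry and leaves the existing entries untouched.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class FaceGallery:
    """Hypothetical gallery: one list of feature vectors per person."""

    def __init__(self):
        self.entries = {}  # person name -> list of feature vectors

    def add_training_image(self, name, features):
        # Adding a sample only appends; nothing already stored is recomputed.
        self.entries.setdefault(name, []).append(list(features))

    def recognize(self, features):
        # Return (name, score) of the most similar stored vector.
        best_name, best_score = None, float("-inf")
        for name, vectors in self.entries.items():
            for stored in vectors:
                score = cosine_similarity(stored, features)
                if score > best_score:
                    best_name, best_score = name, score
        return best_name, best_score
```

With an eigenface-style model, by contrast, the basis is computed from the
whole training set, so the equivalent of add_training_image would have to
recompute it.
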
> Further statistical data comparing the EBG method to various other methods
> for face recognition can be found here: http://tinyurl.com/facerec-compare
>
> *The UI and the Digikam/KDE integration:*
> The face recognition feature would be a plugin/built-in feature of digikam
> which can be enabled/disabled from the
> "Configure Digikam Menu" in the "Settings" Main Menu item already in
> digikam.
>
> Once enabled, detection would be run over all the digikam albums. This
> would be done asynchronously so as not to hinder usability.
>
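
The asynchronous scan described above could be sketched as follows, using
only the Python standard library. This is a sketch under assumptions, not
digikam code: start_scan is a hypothetical name, the detector is passed in
as a plain function (a real one would call the detection library), and
results are handed back through a queue that the UI side can poll without
blocking.

```python
import queue
from concurrent.futures import ThreadPoolExecutor


def start_scan(image_paths, detector, results, max_workers=2):
    """Start detection in background threads and return immediately.

    Each worker puts a (path, faces) tuple on the `results` queue.
    `detector` is any callable taking a path and returning a list of faces.
    """
    def work(path):
        results.put((path, detector(path)))

    pool = ThreadPoolExecutor(max_workers=max_workers)
    futures = [pool.submit(work, path) for path in image_paths]
    # Do not wait here; already submitted jobs still run to completion.
    pool.shutdown(wait=False)
    return futures
```

The caller keeps the returned futures only if it wants to know when the
whole scan has finished; otherwise it can drain the queue with
results.get_nowait() from a timer.
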
> Once the detection is done, the user is presented with a simple widget
> containing the detected faces along with UI elements necessary to name the
> faces and also to reject non-face images detected by the detection system.
> This would reuse code from the current "Edit Menu", wherein a box can be
> drawn on a portion of the image.
>
> After this, the metadata would be stored in RDF or XMP format inside the
> image. Region tagging would also be based on the following image annotation
> system: http://www.kanzaki.com/docs/sw/img-annotator.html .
>
> The detected faces would be stored in a database (choice depends on
> project constraints, most likely SQLite) along with the
> bunch graphs for faster data access and updates.
>
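
A minimal sketch of what such a face store might look like with SQLite,
using Python's sqlite3 module. The table layout, the column names and the
helper functions (open_face_db, add_face, name_face, images_of) are all
assumptions made for illustration; the proposal does not fix a schema, and
the serialized form of the bunch graph is left open as an opaque blob.

```python
import sqlite3


def open_face_db(path=":memory:"):
    """Open (or create) the hypothetical face database."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS faces (
               id          INTEGER PRIMARY KEY,
               image_path  TEXT NOT NULL,
               x INTEGER, y INTEGER, width INTEGER, height INTEGER,
               person      TEXT,   -- NULL until the user names the face
               bunch_graph BLOB    -- serialized feature data, format left open
           )"""
    )
    return conn


def add_face(conn, image_path, rect, bunch_graph=None):
    """Store one detected face region and return its row id."""
    x, y, width, height = rect
    cur = conn.execute(
        "INSERT INTO faces (image_path, x, y, width, height, bunch_graph) "
        "VALUES (?, ?, ?, ?, ?, ?)",
        (image_path, x, y, width, height, bunch_graph),
    )
    conn.commit()
    return cur.lastrowid


def name_face(conn, face_id, person):
    """Attach a name to a face once the user has confirmed it."""
    conn.execute("UPDATE faces SET person = ? WHERE id = ?", (person, face_id))
    conn.commit()


def images_of(conn, person):
    """List the images in which the given person has been tagged."""
    rows = conn.execute(
        "SELECT DISTINCT image_path FROM faces WHERE person = ? "
        "ORDER BY image_path",
        (person,),
    )
    return [row[0] for row in rows]
```

Rejected non-face detections would simply be deleted from the table, and an
index on the person column would keep lookups fast for large collections.
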
> The generated metadata, i.e. the names of people in a photograph, would be
> registered with Nepomuk for linking with names in emails etc.
>
> *Proposed schedule:*
> *Now - April 9th*: *Build digikam*, tinker around with the code
> and fix a few bugs to *increase familiarity*.
>
> *April 10-April 30th*: Finalize the metadata format, i.e. RDF / XMP, and
> *get the basic detection working*.
>
> *May 1st-June 1st*: OpenCL/CUDA based recognition engine ready.
> *Documentation and bundling into a library with some test images*.
>
> *June 1st-July 1st*: Finish building the UI plugin and integrating it with
> the recognition engine. *Document code and write a User's Guide.*
>
> *July 1st-August 1st*: Write a small daemon to collect performance data
> from willing beta testers. Thoroughly test the system and collect anonymous
> usage data, fine-tune the plugin and the recognition engine. *Document
> the tests and performance statistics.*
>
>

Month-long targets are a bit too coarse. It would be better
to have more fine-grained deliverables, say fortnightly.
But it depends on whether your mentor cares much about the
schedule in the proposal. If he doesn't, then don't
spend time on this.

> *August 1st-August 9th*: Last-minute modifications, *if any*.
>
>
> *Why should I be chosen:*
> 1. I have a good understanding of the Qt framework and have used as well as
> given demos of the same in local LUG meetings.
> 2. I am pursuing research on face recognition in my institute and would be
> able to give a lot to this project.
> 3. I have been an active member of the Bangalore Open Solaris User's Group
> (BOSUG) for over a year and am deeply committed to free software. I am
> also working on an installer based on Qt for Belenix OS.
> 4. I have experience using svn and git and am comfortable working
> with tools like cmake etc.
>
>
> *About me*
> Name: Kunal Ghosh
> IRC : gancient
> Location: Bangalore, India
>
> I am currently pursuing my Bachelor of Engineering in Computer Science and
> Engineering and have been deeply motivated and inspired by the philosophy of
> Free Software. I take interest in music and arts of all forms.
>
> Pattern Recognition and Robotics interest me (though they are quite
> different fields, in my opinion they complement each other).
>
> I also love to use Python to accomplish my programming needs whenever
> possible (a language should be a means of easy and efficient communication,
> and Python is exactly that).
>

Awesome proposal, I must say. Wishing you the very best of
luck, and let us share the love of summer!!!

-- 
Thanks and regards,
 Madhusudan.C.S

Blogs at: www.madhusudancs.info
My Online Identity: madhusudancs