[Digikam-devel] face recognition for digikam

Mahesh Hegde maheshmhegade at gmail.com
Tue Mar 27 21:50:54 BST 2012


*Project*: Face Recognition

*Name*: Mahesh M Hegde

*E-mail*: maheshmhegade at gmail.com

*Freenode/IRC*: mmh

*Github*: http://goo.gl/Ivblf

*Location*: Bangalore, India

*Motivation*:

Development is a continuous process, so integrating a library like OpenTLD
not only resolves the face recognition bug but also opens up many
possibilities for the future development of digiKam. For instance, I can
extend the face recognition idea to general object recognition. I believe
that adding face recognition and automatic tagging, which is the most
requested feature for digiKam (492 votes in KDE Bugzilla), will improve the
user experience a lot. It has long been my wish to be a developer of an
image editor, or of part of one.

*Proposal Title*: Face Recognition and Tagging in digiKam

*Implementation*:

digiKam already has face detection that handles most face poses in images
from an album. It also has a UI for manually tagging a face with a name
string. But this information is not yet used to recognise faces in new
images. This task needs face recognition using a modified version of OpenTLD.

OpenTLD is an open-source implementation of the TLD algorithm, and its
*working principle* is as follows:
As soon as we provide a template of the object to track, it initialises the
initial training parameters and the confidence parameter. The model consists
of growing events and pruning events; the pruning events focus on providing
negative examples to denoise the growing events. The detector and the
tracker are the key concepts in this algorithm, since they must agree on the
position of the object of interest.
I have communicated with the author of this algorithm, who has already
implemented a kind of face recognition from a webcam video stream. I briefly
discussed our digiKam project with him, and he will surely help us if we
face any obstacles.

Face recognition is the crucial part of the project and needs to be very
accurate and precise. I consider Tracking-Learning-Detection (TLD) the most
suitable algorithm for face recognition. Even though it is a kind of
Predator-based algorithm, it differs a lot from the classical approach; it
is a pretty new algorithm (designed in 2011) and works very well with any
kind of object. TLD uses a combination of Haar wavelets and local binary
patterns as the basis of its feature space. This is invariant to
illumination and still captures sufficient structural information. It has a
confidence parameter that makes it possible to vary the
accuracy-computation trade-off.
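
To illustrate why binary-pattern features are insensitive to illumination,
here is a minimal sketch of the classic 3x3 local binary pattern. It is not
taken from OpenTLD, whose own feature code may differ; the names GrayImage,
lbpAt and brighten are hypothetical, and pixels are stored as plain ints
with no clipping so that a brightness offset cannot overflow.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Row-major grayscale image; a hypothetical container for this sketch.
struct GrayImage {
    int width;
    int height;
    std::vector<int> pixels;

    int at(int x, int y) const { return pixels[y * width + x]; }
};

// Classic 8-bit local binary pattern at an interior pixel (x, y).
// Each neighbour contributes one bit, set when the neighbour is at
// least as bright as the centre pixel.
std::uint8_t lbpAt(const GrayImage& img, int x, int y)
{
    // The pixel needs all eight neighbours, so it must not lie on the border.
    assert(x >= 1 && y >= 1 && x < img.width - 1 && y < img.height - 1);

    // Neighbours are visited clockwise, starting at the top-left corner.
    static const int dx[8] = {-1, 0, 1, 1, 1, 0, -1, -1};
    static const int dy[8] = {-1, -1, -1, 0, 1, 1, 1, 0};

    const int centre = img.at(x, y);
    unsigned code = 0;
    for (int i = 0; i < 8; ++i) {
        if (img.at(x + dx[i], y + dy[i]) >= centre) {
            code |= 1u << i;
        }
    }
    return static_cast<std::uint8_t>(code);
}

// Hypothetical helper: add a uniform brightness offset to every pixel.
GrayImage brighten(GrayImage img, int offset)
{
    for (int& v : img.pixels) {
        v += offset;
    }
    return img;
}
```

Because the code depends only on comparisons between a pixel and its
neighbours, lbpAt(brighten(img, k), x, y) equals lbpAt(img, x, y) for any
uniform offset k, while the bit pattern still records the local structure.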


We will not use the tracking part of this algorithm, as face detection is
already implemented. An OpenTLD implementation is available in C++. I will
modify and use this library for recognition of a specific object (the
face). The general implementation flow will be to store new face feature
vectors in the database, then compare them with the feature vector of the
intensity-normalised input patch, finding the minimum distance and a
percentage of accuracy.
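
The flow described above could be sketched roughly as follows. This is only
an illustration under stated assumptions: raw patch intensities stand in for
the real feature vector, the distance is plain Euclidean, and the
confidence mapping 1 / (1 + distance) is an arbitrary choice. The names
StoredFace, Match, normalise, euclideanDistance and findClosest are
hypothetical and are not part of OpenTLD or digiKam.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <string>
#include <vector>

using FeatureVector = std::vector<double>;

// Hypothetical database record: the features of one manually tagged face.
struct StoredFace {
    std::string name;
    FeatureVector features;
};

struct Match {
    std::string name;   // empty when the database holds no faces
    double distance;    // distance to the closest stored face
    double confidence;  // assumed mapping 1 / (1 + distance), in [0, 1]
};

// Intensity normalisation: shift to zero mean and scale to unit variance.
// A completely flat patch is mapped to all zeros.
FeatureVector normalise(const FeatureVector& patch)
{
    assert(!patch.empty());
    double mean = 0.0;
    for (double v : patch) mean += v;
    mean /= static_cast<double>(patch.size());

    double variance = 0.0;
    for (double v : patch) variance += (v - mean) * (v - mean);
    const double stddev = std::sqrt(variance / static_cast<double>(patch.size()));

    FeatureVector out(patch.size(), 0.0);
    if (stddev == 0.0) return out;
    for (std::size_t i = 0; i < patch.size(); ++i) {
        out[i] = (patch[i] - mean) / stddev;
    }
    return out;
}

double euclideanDistance(const FeatureVector& a, const FeatureVector& b)
{
    assert(a.size() == b.size());
    double sum = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        sum += (a[i] - b[i]) * (a[i] - b[i]);
    }
    return std::sqrt(sum);
}

// Linear scan for the stored face with the minimum distance to the query.
Match findClosest(const std::vector<StoredFace>& db, const FeatureVector& query)
{
    Match best{"", std::numeric_limits<double>::infinity(), 0.0};
    for (const StoredFace& face : db) {
        const double d = euclideanDistance(face.features, query);
        if (d < best.distance) {
            best = Match{face.name, d, 1.0 / (1.0 + d)};
        }
    }
    return best;
}
```

A caller would normalise every patch before storing or querying it, and
could reject a match whose confidence falls below a chosen threshold, which
is one way to trade accuracy against the number of faces left untagged.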

digiKam already has face detection, perhaps with some very minor bugs, and
a UI for each detected face. Tags on regions of an image need coordinates
defining the region, which can be obtained during face detection itself.
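
As a rough sketch of how a detected rectangle could become a region tag:
the types PixelRect and RegionTag and the function makeRegionTag are
hypothetical, and storing the region as fractions of the image size is my
own assumption (so that the tag would survive resizing), not a description
of how digiKam stores regions.

```cpp
#include <cassert>
#include <string>

// Pixel rectangle as a face detector might report it (hypothetical type).
struct PixelRect {
    int x;
    int y;
    int width;
    int height;
};

// Named region stored as fractions of the image width and height.
struct RegionTag {
    std::string name;
    double x;
    double y;
    double width;
    double height;
};

// Convert a detected face rectangle into a size-independent region tag.
RegionTag makeRegionTag(const std::string& name, const PixelRect& face,
                        int imageWidth, int imageHeight)
{
    assert(imageWidth > 0 && imageHeight > 0);
    return RegionTag{name,
                     static_cast<double>(face.x) / imageWidth,
                     static_cast<double>(face.y) / imageHeight,
                     static_cast<double>(face.width) / imageWidth,
                     static_cast<double>(face.height) / imageHeight};
}
```

For example, a 200x100 face at (100, 50) in a 400x200 image would be stored
with x and y at a quarter of the image and width and height at half of it.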

Support for the PIMO person ontology and the NCO contact ontology will be
added for tagging. For this I will take help from the digiKam team and also
from a friend who was a Nepomuk GSoC student last year and is a developer.

Optionally, I will add a feature to detect birds, animals and commonly used
objects.

*Tentative Timeline*:

24th April to 10th May: Go through digiKam's internal architecture, the
coordination of its components and the source code related to tagging so far.

10th May to 25th May: Work on integrating OpenTLD into digiKam. Handle
library-related work and remove OpenTLD's manual selection of the image
template. Extend single-region selection to multiple regions (faces).

25th May to 10th June: Store the feature space of manually tagged
templates (faces). Check and polish the functionality by feeding it "real"
data: new images with detected faces.

10th June to 25th June: Evaluate which optional features are best added to
improve recognition at the cost of computation.

25th June to 10th July: Finalise the feature considerations and integrate
with Nepomuk, possibly via PIMO person, NCO contact, or both.

11th July to 30th July: Remove the irrelevant portions of the OpenTLD
source code, i.e. the tracking/Predator portion, and achieve stable
recognition.

1st August to ...: Continue developing digiKam by adding more features,
such as testing performance in detecting birds, animals and objects with a
Haar classifier, which can later be used for wiki page integration.

*About Me:*

I am a third-year telecommunication engineering student at Peoples
Education Society Institute of Technology (PESIT), India. I am very
passionate about image processing. I love playing chess, which is one of my
hobbies. I am extremely interested in artificial neural networks. A year
ago I developed a basic image editor in the MATLAB GUIDE framework. The
source code can be found here: http://goo.gl/OERwN and screenshots can be
found here: http://goo.gl/X7WOr


I was an advanced-track student of the machine learning course offered
online for free a few months ago by Andrew Ng (Stanford lecturer), where I
got the opportunity to program in Octave (a MATLAB equivalent) on
interesting systems like handwritten number/character recognition,
eigenface (principal component analysis) recognition, email spam
classification, recommender systems and a lot more.

I am familiar with KDE, C++, Qt and MATLAB/Octave. I am good at linear
algebra, algorithms and machine learning.

I am part of ARKA 2012, a 2-day event which includes a 1-day technical
workshop on open-source technologies and a 24-hour hackathon for all
open-source enthusiasts. I frequently attend FSMK (Free Software Movement
Karnataka) sessions.

I look forward to your suggestions.