[Digikam-devel] GPU Implementation of image processing algorithms

kunal ghosh kunal.t2 at gmail.com
Sun Apr 18 03:16:57 BST 2010


Thanks for the reply, Aditya.
My comments are inline.

> If you wish to use GPGPU, OpenCL is the right choice over CUDA. CUDA's API
> is not only proprietary, it also is specifically for nVidia hardware. OpenCL
> is good in this sense. If you have an nVidia card, it seems that OpenCL will
> internally use CUDA, therefore OpenCL will work on all users' computers.
>

 That is my opinion too.

> However, I'm not sure, but there seems to be a small problem with adoption
> : http://www.khronos.org/opencl/adopters/
> There are some weird conditions regarding publishable usage that I'm not
> entirely sure about - it seems that you must gain some sort of approval and
> pass some tests before you are allowed to say that you used OpenCL in
> digiKam. If one doesn't want to publish his/her code, but keep it for
> personal/closed usage, then you don't have to pay royalty.
> Please correct me if I'm wrong, since this seems free as in speech/usage,
> but not free as in beer - there are some royalty issues if you don't pass
> the conformance tests.
>

Actually, there is a small misinterpretation here: the conditions regarding
publishable usage apply only to the adopters (i.e. NVIDIA, ATI, etc.; see
http://www.khronos.org/members/conformant/ ) who ship OpenCL-compliant
drivers / API implementations. As mentioned on the main adopters page
<http://www.khronos.org/adopters>, we as implementers should not have any
problems. Quoting from the site (http://www.khronos.org/adopters):

"Implementers - for no cost or license fees you may:

   - Create and deliver a product using the publicly released
   specifications and technologies
   - But you may *not* claim that it is "compliant" unless they enter and
   pass conformance testing
   - And if the product is a software or hardware engine, you may not
   advertise it using the Khronos API technology logos or trademarks"

Regarding the second and third points, that is fine from an open-source
project's perspective: it gives us both freedoms (free as in speech and
free as in beer) and only asks us not to advertise compliance without
conformance testing :)


> As a side note - I had a talk with Alex about using EBGM a few days ago,
> and we decided not to use it for the moment. We don't have anything against
> the algorithm, it's just that if only I did it, there won't be enough time
> to implement all the
>

Actually, free implementations of EBGM are already available. As I pointed
out in a reply to Marcel's mail some time back, they can be found at
http://malic.sourceforge.net/ and at
http://www.cs.colostate.edu/evalfacerec/algorithms5.html . Also, I would
like to talk with you all on IRC. Which IRC channel do you (Alex and you)
usually meet on?

IMHO, there would be sufficient time to create the tagging widget, since it
is quite easy to write plugins for digiKam. Also, the EBGM algorithm should
not take more than about a month, as free implementations already exist. I
would only have to use the OpenCL API to modify the necessary portions of
the existing code.

> algorithms and also complete the tagging part within the GSoC period. We
> definitely want to have eigenface and fisherface, despite the limitations -
> the retraining is slow only if there are more than ~400 tagged friends in
> the database, which is a rarity. The main concern is pose variation - for
> that, I plan to link multiple poses of the same person with the same ID. As
> a consequence, the initial accuracy while training shall be less, but after
> some time it would be good enough.
>

As you mentioned, your main concern is pose variation. Fisherfaces is IMHO
(from my work on face recognition over the past year) not the right way to
go, for the following reasons:

1. Fisherfaces uses the same methodology as eigenfaces (which is apparent
   from a preliminary survey of the subject) and doesn't yield satisfactory
   results on images with *expression and pose variation*.

2. The only advantage of fisherfaces over eigenfaces is that it makes the
   recognition *illumination invariant*. But that's not much of a problem,
   as the family / group photos that digiKam will mostly encounter are
   taken with camera flash or outdoors, which results in well-lit
   photographs.

Also the problem with pose and retraining of eigenfaces and fisherfaces is
as follows:

Assume you are training your recognition model with training images of a
single person.
Since eigenfaces and fisherfaces rely on the *nearest neighbour classifier*
for recognition, you have to train the model with a larger number of images
to get satisfactory recognition results.
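
To illustrate why a nearest-neighbour classifier needs many training images,
here is a minimal sketch in Python. It is not taken from libface or any
existing implementation; the function name, the labels and the tiny 3-D
feature vectors are all hypothetical (real eigenface projections have many
more dimensions). A probe face is simply given the label of the closest
stored vector, so a pose that is not close to any training image cannot be
matched well.

```python
import math

def nearest_neighbour(gallery, probe):
    """Return (label, distance) of the gallery vector closest to probe."""
    best_label, best_dist = None, math.inf
    for label, vector in gallery:
        d = math.dist(vector, probe)  # Euclidean distance (Python 3.8+)
        if d < best_dist:
            best_label, best_dist = label, d
    return best_label, best_dist

# Hypothetical projected training faces: (label, feature vector).
gallery = [
    ("alice_frontal", (0.9, 0.1, 0.0)),
    ("alice_left",    (0.2, 0.8, 0.1)),
    ("bob_frontal",   (0.1, 0.2, 0.9)),
]

# A probe that resembles the left-facing pose is matched to that pose only
# because such a pose was present in the training set.
label, dist = nearest_neighbour(gallery, (0.3, 0.7, 0.2))
print(label, round(dist, 3))
```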

Now let us look at *how many* training images we would need (of a single
person) for satisfactory recognition results.

Take a person in front of the camera. Varying the pose in 1 degree steps
from 0 degrees (face towards left) to 180 degrees (face towards right)
would result in about 180 images.

Now if the person looks upwards by 1 degree, we would again have 180 images
from left to right, and if we keep varying the pose this way we would get
approximately 180 x 180 = 32,400 images of the same person.
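
The count above can be checked with a few lines of Python. This is only a
sketch of the arithmetic: it assumes 180 one-degree steps in each direction
(as in the estimate above), and the 100 x 100 grayscale image size used for
the storage figure is a hypothetical value, not something measured in
digiKam.

```python
YAW_STEPS = 180     # left-to-right poses, one per degree (assumption)
PITCH_STEPS = 180   # up-down poses, one per degree (assumption)

# Enumerate every (yaw, pitch) combination for a single person.
poses = [(yaw, pitch)
         for yaw in range(YAW_STEPS)
         for pitch in range(PITCH_STEPS)]

# Hypothetical image size: 100 x 100 pixels, 1 byte per pixel.
BYTES_PER_IMAGE = 100 * 100
storage_bytes = len(poses) * BYTES_PER_IMAGE

print(len(poses), "training images for one person")
print(storage_bytes / 1e6, "MB of raw pixels")
```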

Also, for each successive pose-varied image added to the training set, the
training time would increase steeply (much faster than linearly).

Now, there are two problems with this:

1. There will not be enough training images to get satisfactory results
   (so many pose variations of a single person are difficult to obtain),
   and even if they were available, training on them would take a long
   time.

2. Eigenfaces and fisherfaces (in general, principal-component-based
   models) calculate a single set of eigenfaces/fisherfaces from the whole
   training set, so for a well-trained recognizer the training time would
   be enormous.
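
As a back-of-the-envelope illustration of the second point, here is a rough
cost model in Python. It is a sketch under stated assumptions, not a
measurement: it assumes the common approach of decomposing the N x N matrix
of inner products between the N training images instead of the full pixel
covariance matrix, it ignores constant factors, and the function name and
the 100 x 100 image size are hypothetical.

```python
def eigenface_training_cost(num_images, pixels_per_image):
    """Rough operation count for one training run (constants ignored)."""
    # Building the N x N matrix of image inner products: about N^2 * D.
    gram_cost = num_images ** 2 * pixels_per_image
    # Dense eigen-decomposition of an N x N matrix: about N^3.
    eig_cost = num_images ** 3
    return gram_cost + eig_cost

PIXELS = 100 * 100  # hypothetical image size

small = eigenface_training_cost(400, PIXELS)      # ~400 training images
large = eigenface_training_cost(32_400, PIXELS)   # 180 x 180 poses

print("400 images:    ", small)
print("32,400 images: ", large)
print("ratio:         ", large // small)
```

Under these assumptions, going from about 400 images to the full 180 x 180
pose set multiplies the work by a factor of several thousand, and the whole
computation has to be repeated whenever the single set of eigenfaces is
recalculated.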


>
> But still, if you can implement EBGM for libface, it'd be great :) We'd
> have one more algorithm in the bag. It's just that one person can't finish
> everything in time. So since you're willing, start EBGM then. Eigenfaces is
> almost complete.
>

I would love to add the implementation to libface. But given the
similarity between eigenfaces and fisherfaces, IMHO the effort should go
into getting as many *different* algorithms implemented as possible.

> PS: Hopefully someone will figure out if OpenCL can be used in digiKam or
> not.
>

More opinions / suggestions from the digiKam community are welcome.


>
>
>
> On Sat, Apr 17, 2010 at 9:21 PM, kunal ghosh <kunal.t2 at gmail.com> wrote:
>
>> Hi,
>> I am implementing the face recognition algorithm for digikam, and wanted
>> to use GPGPU <http://en.wikipedia.org/wiki/GPGPU> frameworks for the
>> same. But I was not able to decide which framework to use:
>> OpenCL <http://en.wikipedia.org/wiki/OpenCL> or
>> CUDA <http://en.wikipedia.org/wiki/CUDA> (C for CUDA specifically).
>>
>> PS: I am willingly not including any more information about either of the
>> above frameworks to attract unbiased opinions.
>>
>> Also, I could code the part that would execute on the GPU in Python,
>> shortening the development cycle. Good Python bindings exist for both GPU
>> programming frameworks: PyCUDA and PyOpenCL.
>>
>> Python functions are easily callable from within C/C++ code, as
>> demonstrated by Link 1 <http://docs.python.org/release/2.5.2/ext/callingPython.html>,
>> Link 2 <http://www.codeproject.com/KB/cpp/embedpython_1.aspx> and
>> Link 3 <http://www.linuxjournal.com/article/8497>.
>> So, is it fine if the algorithms are implemented in Python and then called
>> from within digikam?
>>
>> all suggestions , comments welcome.
>>
>> --
>> regards
>> -------
>> Kunal Ghosh
>> Dept of Computer Sc. & Engineering.
>> Sir MVIT
>> Bangalore,India
>>
>> Quote:
>> "Ignorance is not a sin, the persistence of ignorance is"
>> --
>> "If you find a task difficult today, you'll find it difficult 10yrs later
>> too !"
>> -----
>> "Failing to Plan is Planning to Fail"
>>
>> Blog:kunalghosh.wordpress.com
>> Website:www.kunalghosh.net46.net
>> V-card:http://tinyurl.com/86qjyk
>>
>>
>> _______________________________________________
>> Digikam-devel mailing list
>> Digikam-devel at kde.org
>> https://mail.kde.org/mailman/listinfo/digikam-devel
>>
>>
>
>
> --
> Aditya Bhatt
> Blog : http://adityabhatt.wordpress.com
> Face Recognition Library : http://libface.sourceforge.net
>
>
>


-- 
regards
-------
Kunal Ghosh
Dept of Computer Sc. & Engineering.
Sir MVIT
Bangalore,India
