[gsoc19-faces-workflow] Face detection/recognition datasets

woenx marcpalaus at hotmail.com
Thu May 16 23:54:59 BST 2019

Hello everyone,

In the context of the GSoC 2019 face recognition workflow improvement
project, I think it would be wise to agree on a common picture dataset, so
we can all test on the same data and compare our results.

I have personally used the "Labeled Faces in the Wild" dataset in the past
(http://vis-www.cs.umass.edu/lfw/ ). It contains thousands of pictures of
celebrities, each labelled with the person's name in the filename. However,
pictures in that dataset only include one face per photo, so it wouldn't be
useful for testing features such as searching for pictures with multiple
people in them. For most other tests, though, it should work.
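
To illustrate how the filename labels could serve as ground truth, here is a
minimal Python sketch. It assumes the LFW naming pattern
<Person_Name>_<4-digit index>.jpg; the function name label_from_filename is
made up for this example and is not part of digiKam or of the dataset tools.

```python
import re
from pathlib import PurePosixPath

# Assumed LFW naming pattern: <Person_Name>_<4-digit index>.jpg
_LFW_STEM = re.compile(r"^(?P<name>.+)_(?P<index>\d{4})$")


def label_from_filename(path):
    """Return the person's name encoded in an LFW-style file name."""
    stem = PurePosixPath(path).stem  # drop directories and the extension
    match = _LFW_STEM.match(stem)
    if match is None:
        raise ValueError(f"not an LFW-style file name: {path!r}")
    # Underscores separate the parts of the name in the file name.
    return match.group("name").replace("_", " ")
```

The returned name can then be compared with the name the recognition step
assigns to the same picture.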

I found a huge list of datasets that we could choose from at

Ideally, we would need a dataset whose pictures carry no previous metadata,
where each person appears multiple times and we already know that person's
name, and which includes pictures with more than one person in them. If
possible, it should cover hundreds of different people.
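
These criteria could be checked automatically for a candidate dataset. The
following is only a rough Python sketch: it assumes the ground truth is
available as a mapping from picture name to the list of people in it, and the
thresholds (100 people, 2 appearances) are my own reading of "hundreds" and
"multiple times", not agreed values. The name summarize_dataset is
hypothetical.

```python
from collections import Counter


def summarize_dataset(annotations, min_people=100, min_appearances=2):
    """Summarize how well a labelled dataset matches the criteria above.

    annotations maps a picture name to the list of people known to be in it.
    """
    appearances = Counter()  # person -> number of pictures they appear in
    multi_person_pictures = 0
    for names in annotations.values():
        people = set(names)  # count a person once per picture
        appearances.update(people)
        if len(people) > 1:
            multi_person_pictures += 1

    recurring = sum(1 for n in appearances.values() if n >= min_appearances)
    return {
        "people": len(appearances),
        "recurring_people": recurring,
        "multi_person_pictures": multi_person_pictures,
        # Assumed rule: enough recurring people and at least one group picture.
        "suitable": recurring >= min_people and multi_person_pictures > 0,
    }
```

Running this over the annotations of each candidate would give us comparable
numbers before we commit to one dataset.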

What do you think? Do you have any experience with any of these datasets?
Do you know of any that would be perfect for our task?

Sent from: http://digikam.1695700.n4.nabble.com/digikam-devel-f1695701.html