[Digikam-users] Re: How to export duplicates list?

Elle Stone l.elle.stone at gmail.com
Fri Jan 21 13:52:24 GMT 2011


Hi, Francis,

I just dealt with the same issue, only with a lot fewer duplicates
than you need to deal with. As my solution was roundabout, I was
hoping someone would chime in with a better way to find the duplicate
images.

My first clue that there were in fact duplicate images in my database
(I mean exact duplicates, down to the metadata; only the file name and
sometimes the file path were different) came after I created a new
clean digikam database (I had already written the metadata to the
appropriate images before archiving the previous digikam database).
When I closed digikam and inspected the new databases with SQLite
Database Browser, to my surprise, there were more images in the
database than there were UniqueHashes.

Upon investigation, it turned out that some of the images were in fact
duplicate images. Some of these duplicate images were in the same
directory with slightly different names. Some had inadvertently,
somewhere along the way, been created in the wrong directory.

I used the SQLite Database Browser to locate the images. It wasn't
easy. You can click on "File", then "Export", then "Table as csv
file", to get a comma-separated listing of the contents of each table
in the database. If you export enough tables and pull them all into a
spreadsheet, you can use Images and thumbid and FilePath, along with
the UniqueHash, to locate all the images with duplicate UniqueHashes.
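
For anyone who would rather skip the spreadsheet step, the same lookup
can probably be done with a short script run against a copy of the
database. The following is only a rough sketch: the table name
"Images" and the column names "uniqueHash" and "name" are assumptions
on my part, and find_duplicate_hashes is just a name made up for
illustration, so check the names against what SQLite Database Browser
shows for your own database.

```python
import sqlite3


def find_duplicate_hashes(conn, table="Images", hash_col="uniqueHash", name_col="name"):
    """Return {hash: [file names]} for every hash shared by more than one row.

    The default table and column names are assumptions, not confirmed
    digikam schema; pass your own if they differ.
    """
    query = (
        f'SELECT "{hash_col}", "{name_col}" FROM "{table}" '
        f'WHERE "{hash_col}" IN ('
        f'  SELECT "{hash_col}" FROM "{table}" '
        f'  WHERE "{hash_col}" IS NOT NULL '
        f'  GROUP BY "{hash_col}" HAVING COUNT(*) > 1) '
        f'ORDER BY "{hash_col}", "{name_col}"'
    )
    groups = {}
    for file_hash, name in conn.execute(query):
        # Collect every file name that shares this hash.
        groups.setdefault(file_hash, []).append(name)
    return groups


# Example use (the path is a placeholder for a COPY of your database):
# conn = sqlite3.connect("/path/to/copy-of-database.db")
# for file_hash, names in find_duplicate_hashes(conn).items():
#     print(file_hash, names)
```

This only reads from the database, but I would still run it on a copy
rather than on the file digikam is using.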

As I was only dealing with about 10 duplicates out of 6000 images,
tracking them down by hand and verifying visually was not such a
chore, given that the spreadsheet I created using the exported
database tables told me where to look.

In your case, if you really have lots and lots of duplicates, and if
nobody comes up with a way to use digikam to track down the
duplicates, all is not lost, but you'll end up doing a lot more work
with the exported tables than I had to do. You can use SQLite Database
Browser to locate the duplicates and make a list. Then you can use
exiftool (or maybe exiv2?) at the command line to move all the
duplicates to a new directory, if that will help. I myself have never
used exiftool to move files listed in a spreadsheet, but I understand
that it can easily do so. Also, the exiftool forum is very friendly
and answers questions quickly.

I'd advise doing a lot of testing on a small set of files before using
exiftool on your real files, as getting the syntax wrong can wreak
major havoc. If you decide to go the exiftool route, I can help you
figure out the syntax to move images on a list.
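
If exiftool turns out to be more than you need for a plain move, a
small script is another option. To be clear, the sketch below is not
exiftool; it is a plain Python alternative that assumes you have saved
your list of duplicates as a text file with one full file path per
line. The function name move_listed_files and the list-file layout are
made up for illustration. It defaults to a dry run that only prints
what it would do, which fits the advice about testing first.

```python
import shutil
from pathlib import Path


def move_listed_files(list_file, dest_dir, dry_run=True):
    """Move every file named in list_file (one path per line) into dest_dir.

    Returns the list of (source, target) pairs. With dry_run=True nothing
    is moved; the planned moves are only printed.
    """
    dest = Path(dest_dir)
    planned = []
    for line in Path(list_file).read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines in the list
        src = Path(line)
        if not src.is_file():
            print(f"skipping (not found): {src}")
            continue
        target = dest / src.name
        # Never overwrite: add a numeric suffix if the name is already taken,
        # either on disk or by an earlier entry in this same list.
        n = 1
        while target.exists() or any(target == t for _, t in planned):
            target = dest / f"{src.stem}_{n}{src.suffix}"
            n += 1
        planned.append((src, target))
    for src, target in planned:
        if dry_run:
            print(f"would move {src} -> {target}")
        else:
            dest.mkdir(parents=True, exist_ok=True)
            shutil.move(str(src), str(target))
    return planned
```

Run it once with the default dry_run=True and read the output before
calling it again with dry_run=False.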

I know the above suggestions are not easy or quick and I really hope
someone else has an easier answer. It seems unreasonable that digikam
will happily create UniqueHashes that are the same for more than one
image and not issue a warning and a list of affected files.

Also, if your duplicate images don't have exactly the same metadata,
then they probably won't generate UniqueHashes that are the same for
the duplicates.

Elle Stone


