[KimDaBa] Duplicates

Jesper K. Pedersen blackie at blackie.dk
Sat Dec 3 15:47:18 GMT 2005

May I suggest you put this into the wiki.

On Saturday 03 December 2005 04:00, Lars Clausen wrote:
| So I found I had a number of duplicate images in my database and tried
| out the Remove Duplicates plugin.  I was surprised to find that it took
| forever even on Fast mode, so I looked at the index.xml file and came up
| with this bash one-liner to find duplicate files:
| grep md5sum index.xml | cut -d\" -f16- | sort | uniq -w 32 --all-repeated=separate | cut -d\" -f5
| This merely compares md5sums, but quickly, and prints out all the
| duplicates with a blank line between groups.  It is also very dependent
| on the XML format, since it picks attributes by field position.  This
| should be easy to reproduce in a Perl plugin; the problem (of course) is
| making a good interface.
| -Lars
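
For the wiki, the same idea can be sketched without counting fields, so it depends less on the exact XML layout. This is only a rough sketch, not what the Remove Duplicates plugin does: it assumes each image element in index.xml carries its path in an attribute called "file" (a guess, adjust path_attr if yours differs) next to the md5sum attribute, and the names find_duplicates and report are made up for illustration.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def find_duplicates(root, path_attr="file"):
    """Group paths by md5sum and return only groups with more than one path."""
    groups = defaultdict(list)
    # Walk every element and keep those carrying an md5sum attribute,
    # so attribute order and nesting depth do not matter.
    for elem in root.iter():
        md5 = elem.get("md5sum")
        path = elem.get(path_attr)  # "file" is an assumed attribute name
        if md5 and path:
            groups[md5].append(path)
    return {md5: paths for md5, paths in groups.items() if len(paths) > 1}


def report(index_path="index.xml"):
    """Print each group of duplicates, separated by a blank line."""
    root = ET.parse(index_path).getroot()
    for paths in find_duplicates(root).values():
        print("\n".join(paths))
        print()
```

Calling report("index.xml") prints output in the same shape as the one-liner. Like the one-liner, it only compares the stored md5sums and never looks at the image files themselves.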

Jesper K. Pedersen          |  Klarälvdalens Datakonsult
Senior Software Engineer    |  www.klaralvdalens-datakonsult.se
Prinsensgade 4a st.         |
9800 Hjørring               |  Platform-independent
Denmark                     |  software solutions
