[KimDaBa] Duplicates
Jesper K. Pedersen
blackie at blackie.dk
Sat Dec 3 15:47:18 GMT 2005
May I suggest you put this into the wiki.
Cheers
Jesper.
On Saturday 03 December 2005 04:00, Lars Clausen wrote:
| So I found I had a number of duplicate images in my database and tried
| out the Remove Duplicates plugin. I was surprised to find that it took
| forever even on Fast mode, so I looked at the index.xml file and came up
| with this bash one-liner to find duplicate files:
|
| grep md5sum index.xml | cut -d\" -f16- | sort | uniq -w 32 --all-repeated=separate | cut -d\" -f5
|
| This only compares md5sums, but it is quick, and it prints all the
| duplicates with a blank line between groups. It is also very dependent
| on the XML format, since it picks attributes by position. This should
| be easy to reproduce in a Perl plugin; the problem (of course) is
| making a good interface.
|
| -Lars
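
As a rough sketch of a less format-dependent variant of the one-liner quoted above, the same grouping could be done with an XML parser rather than by counting quote-delimited fields. This is Python, not the Perl plugin Lars mentions, and it is not how the Remove Duplicates plugin works. The attribute names `file` and `md5sum`, the function name `find_duplicates`, and the sample data are assumptions made for illustration; check them against your own index.xml.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def find_duplicates(xml_text, file_attr="file", md5_attr="md5sum"):
    """Return lists of file names that share the same checksum.

    Assumption: every image is an element carrying both a file-name
    attribute and an md5sum attribute (names are guesses, not verified
    against any particular KimDaBa version).
    """
    groups = defaultdict(list)
    root = ET.fromstring(xml_text)
    # Walk every element so the result does not depend on nesting depth
    # or on the order of attributes within a tag.
    for elem in root.iter():
        md5 = elem.get(md5_attr)
        name = elem.get(file_attr)
        if md5 and name:
            groups[md5].append(name)
    # Keep only checksums seen more than once, in a stable order.
    return [sorted(names) for _, names in sorted(groups.items()) if len(names) > 1]


# Hypothetical sample data, not taken from a real database.
SAMPLE = """<KimDaBa>
  <images>
    <image file="a.jpg" md5sum="d41d8cd98f00b204e9800998ecf8427e"/>
    <image file="b.jpg" md5sum="0cc175b9c0f1b6a831c399e269772661"/>
    <image file="c.jpg" md5sum="d41d8cd98f00b204e9800998ecf8427e"/>
  </images>
</KimDaBa>"""

if __name__ == "__main__":
    # Print each duplicate group followed by a blank line,
    # mirroring the output layout of the shell pipeline.
    for group in find_duplicates(SAMPLE):
        print("\n".join(group))
        print()
```

To try it on a real database, pass the contents of index.xml instead of `SAMPLE`, for example `find_duplicates(open("index.xml").read())`. Like the one-liner, this only compares the stored md5sums and never reads the image files themselves.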
--
Jesper K. Pedersen | Klarälvdalens Datakonsult
Senior Software Engineer | www.klaralvdalens-datakonsult.se
Prinsensgade 4a st. |
9800 Hjørring | Platform-independent
Denmark | software solutions