[KimDaBa] Duplicates

Ken Schutte kschutte at csail.mit.edu
Mon Dec 12 15:56:08 GMT 2005


You might also want to check out a package called 'fdupes', which I think
does basically the same thing.  It comes installed on some distributions
and is easy to find otherwise.

Ken

Lars Clausen wrote:
> So I found I had a number of duplicate images in my database and tried
> out the Remove Duplicates plugin.  I was surprised to find that it took
> forever even on Fast mode, so I looked at the index.xml file and came up
> with this bash one-liner to find duplicate files:
> 
> grep md5sum index.xml | cut -d\" -f16- | sort | uniq -w 32 --all-repeated=separate | cut -d\" -f5
> 
> This merely compares md5sums, but quickly, and prints out all the
> duplicates with a blank line between groups.  It is also very dependent
> on the XML format.  This should be easy to reproduce in a Perl plugin;
> the problem (of course) is making a good interface.
> 
> -Lars
> 
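For what it's worth, here is a rough sketch of the same idea that looks up
the attributes by name instead of by field position, so it should not break
if the attribute order changes.  It is only a sketch under some assumptions:
that each <image> element sits on a single line, that the file name lives in
an attribute called file (I have not checked this against any particular
KimDaBa version), and that GNU uniq is available for -w and
--all-repeated.  The function name find_dups is made up for illustration.

```shell
# find_dups INDEX: list files whose md5sum occurs more than once in INDEX,
# one group of duplicates per block, with a blank line between groups.
# Assumptions (not verified against a real index.xml): one <image> element
# per line, with md5sum="..." and file="..." attributes in any order.
find_dups() {
    awk '
        /<image / {
            md5 = ""; file = ""
            # The prefix [ md5sum="] is 9 characters, plus 1 closing quote.
            if (match($0, / md5sum="[^"]*"/))
                md5 = substr($0, RSTART + 9, RLENGTH - 10)
            # The prefix [ file="] is 7 characters, plus 1 closing quote.
            if (match($0, / file="[^"]*"/))
                file = substr($0, RSTART + 7, RLENGTH - 8)
            if (md5 != "" && file != "")
                print md5 "\t" file
        }
    ' "$1" |
        LC_ALL=C sort |
        uniq -w 32 --all-repeated=separate |   # compare only the 32-char md5
        cut -f2                                # keep just the file name
}
```

Call it as "find_dups index.xml".  Like the original one-liner, it only
compares the stored md5sums and never looks at the image files themselves.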



