[KimDaBa] Duplicates
Ken Schutte
kschutte at csail.mit.edu
Mon Dec 12 15:56:08 GMT 2005
You might also want to check out a package called 'fdupes', which I
believe does basically the same thing. I think it comes preinstalled on
some distributions, and it is easy to find otherwise.
Ken
Lars Clausen wrote:
> So I found I had a number of duplicate images in my database and tried
> out the Remove Duplicates plugin. I was surprised to find that it took
> forever even on Fast mode, so I looked at the index.xml file and came up
> with this bash one-liner to find duplicate files:
>
> grep md5sum index.xml | cut -d\" -f16- | sort |
>   uniq -w 32 --all-repeated=separate | cut -d\" -f5
>
> This merely compares md5sums, but it does so quickly, and it prints out
> all the duplicates with a blank line between groups. It is also very
> dependent on the XML format. This should be easy to reproduce in a Perl
> plugin; the problem (of course) is making a good interface.
>
> -Lars
>
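Since the one-liner above depends on the position of each attribute in
index.xml, here is a rough sketch of the same idea that looks attributes up
by name instead. It assumes each image element sits on a single line and
carries file="..." and md5sum="..." attributes, and that the md5sum is 32 hex
characters; the function name list_md5_dupes is made up for illustration. Like
the original, it needs GNU uniq for -w and --all-repeated.

```shell
#!/usr/bin/env bash
# Sketch only: print the files in an index.xml that share an md5sum,
# one group per block, with a blank line between groups.
# Assumptions (not confirmed by the post): one image element per line,
# attributes named file="..." and md5sum="...", 32-character md5sums.
list_md5_dupes() {
    awk '
        {
            sum = ""; file = ""
            # Pull each attribute out by name, wherever it sits on the line.
            if (match($0, / md5sum="[^"]*"/))
                sum = substr($0, RSTART + 9, RLENGTH - 10)
            if (match($0, / file="[^"]*"/))
                file = substr($0, RSTART + 7, RLENGTH - 8)
            if (sum != "" && file != "")
                print sum, file
        }
    ' "$1" |
        LC_ALL=C sort |
        uniq -w 32 --all-repeated=separate |  # compare only the md5sum column
        cut -c 34-                            # drop the md5sum and the space
}

# Usage (path is a placeholder):
#   list_md5_dupes /path/to/index.xml
```

This still only compares the checksums already stored in the file, so it
shares the speed and the limits of the original one-liner.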