[KPhotoAlbum] handling duplicate images

Benny Simonsen benny at slbs.dk
Mon Aug 9 15:04:42 BST 2010


Hi

> over the years I arranged the images in several directories with numerous
> duplicates according to different grouping criteria (which are actually
> symlinks and don't consume significant disc space). When I import the top
> level image directory into KPhotoAlbum, all these duplicates appear as
> multiple instances of the same image in the thumbnail view.

1: If the duplicates are symlinks, just remove them, provided you don't
need them any more.
2: If you don't want to remove them, you can make a "mirror" directory
containing symlinks to only those images that are not symlinks themselves.
3: If not all duplicates are symlinks: find the duplicates by creating a
file with checksums (sha1, md5 or similar) and remove the duplicates.

A little help for the solutions (not tested!):
1:
The command "find . -type l" lists all the symlinks below the current
directory; with GNU find, appending -delete removes them.
2:
Make a clean dir, go to it, and run: find <old dir> -type f -exec ln -s {} \;
Give <old dir> as an absolute path, otherwise the relative link targets
will not resolve from the new dir.
This requires that all file names are unique, which might not be true,
so the ln -s {} should maybe be extended with a new link name, e.g. with
the old path included in the file name.
If not all duplicates are symlinks, solution 3 can be applied to the
"mirror" dir, which then just removes duplicate symlinks.
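A minimal sketch of the mirror dir idea with the old path included in the
link name. The function name make_mirror and the choice of "_" as separator
are my assumptions, and it assumes file names without newlines. Like the
commands above it is only a starting point, so try it on a copy first.

```shell
#!/bin/sh
# Sketch: build a "mirror" directory of symlinks to the real image files.
# make_mirror is a hypothetical helper name, not part of KPhotoAlbum.
# Usage: make_mirror <old dir> <mirror dir>
make_mirror() {
    src=$(cd "$1" && pwd) || return 1   # absolute path, so the links stay valid
    mkdir -p "$2" || return 1
    dst=$(cd "$2" && pwd) || return 1
    # -type f skips symlinks, so only the real files get a link.
    find "$src" -type f | while IFS= read -r file; do
        rel=${file#"$src"/}                      # path relative to the old dir
        name=$(printf '%s' "$rel" | tr '/' '_')  # old path becomes part of the name
        ln -s "$file" "$dst/$name"
    done
}
```

Two different paths can still map to the same name (e.g. a/b_c.jpg and
a_b/c.jpg); in that case ln reports an error for the second one instead of
overwriting the first link.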
3:
This includes the following steps:
find <image dir> -type f -exec sha1sum {} \; > shasums.txt
sort shasums.txt > shasums.sorted.txt
Then you have to write a script that reads each line, stores the checksum
and file name (incl. path) in variables, and removes the file if its
checksum is identical to the checksum on the previous line.
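The script for the last step could look roughly like this. It is a sketch
only: the function name remove_dups and the DRY_RUN switch are my own
additions, and it assumes the default sha1sum line format
"<checksum>  <path>" and paths without newlines or backslashes.

```shell
#!/bin/sh
# Sketch: walk a sorted sha1sum listing and handle every file whose checksum
# equals the checksum on the previous line.
# remove_dups and DRY_RUN are hypothetical names chosen for this example.
# Usage: remove_dups shasums.sorted.txt
remove_dups() {
    prev=""
    while IFS= read -r line; do
        sum=${line%% *}     # everything before the first space
        file=${line#*  }    # everything after the two-space separator
        if [ "$sum" = "$prev" ]; then
            if [ "${DRY_RUN:-1}" = 1 ]; then
                echo "would remove: $file"   # default: only report
            else
                rm -- "$file"
            fi
        fi
        prev=$sum
    done < "$1"
}
```

By default nothing is deleted and the duplicates are only printed; setting
DRY_RUN=0 before calling remove_dups makes it really remove them. Of each
group of identical files, the one that sorts first is kept.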

Hope that this might help you.

/Benny


