[digiKam-users] Bug? Not all duplicates found.

digikam-users.johnny1000 at spamgourmet.com digikam-users.johnny1000 at spamgourmet.com
Mon Jan 24 12:53:30 GMT 2022


Thank you,

yes, I know different digikam starts give different outputs. That is why 
I wrote in the first place :o)

During the test I described in the first post _nothing_ changes in the 
digikam settings or in the file system :o)

If it is not a bug that digikam does not find _all_ similar image pairs 
in a static[1] QSet of images in 1 execution of the "Find duplicate 
images" maintenance function, then I don't understand why that is.
[1] Static in the sense that I assume _all_ the _same_ images are put in 
the QSet at every new digikam start, because _everything_ is the _same_ 
in the file system and in the digikam settings at every digikam start, 
regardless of the fact that a QSet is unordered.

If the fact that a QSet is unordered is the reason why the "Find 
duplicate images" algorithm can't make sure that _all_ images are 
compared to each other, I have a suggestion:

When the "Find duplicate images" algorithm starts, generate a temporary 
ordered list (indexed array) from the QSet, and use that ordered list to 
make sure that all images are compared to each other and registered in 
ImageSimilarity.

Is it as simple as that? (Probably not ;o))
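
Roughly what I have in mind, as a minimal Python sketch rather than 
anything resembling digiKam's actual C++ code. The names 
find_similar_pairs and toy_similarity are made up for illustration, and 
the similarity function is only a stand-in for whatever comparison 
digiKam really performs:

```python
from itertools import combinations


def find_similar_pairs(image_ids, similarity, threshold):
    """Compare every image with every other image exactly once.

    image_ids  : an unordered collection (a set here, standing in for the QSet)
    similarity : hypothetical callable returning a score between 0.0 and 1.0
    threshold  : lower bound of the similarity range, e.g. 0.65
    """
    # Temporary ordered list built from the unordered set, so that the
    # iteration order is the same on every run.
    ordered = sorted(image_ids)

    pairs = []
    # combinations() yields each unordered pair (a, b) with a < b once,
    # so no pair is skipped and none is visited twice.
    for a, b in combinations(ordered, 2):
        score = similarity(a, b)
        if score >= threshold:
            pairs.append((a, b, score))
    return pairs


def toy_similarity(a, b):
    # Stand-in only: treats IDs with the same last digit as "identical".
    return 1.0 if a % 10 == b % 10 else 0.0


result = find_similar_pairs({21, 3, 11, 13, 1}, toy_similarity, 0.65)
for a, b, score in result:
    print(a, b, score)
```

With n images this makes n*(n-1)/2 comparisons, so it may well be too 
slow for a large collection, which could be one reason it is not as 
simple as that.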

The reason I have come across this is that I'm trying to automate (with 
Python scripting) a workflow of going through hundreds, sometimes around 
a thousand, incoming images, where I have to delete/replace/group and 
tag images.
Quite a bit of that I could automate using the databases, if I could be 
sure that "Find duplicate images" would populate the similarity database 
with _all_ current image pairs that match the given similarity range.
I'd like to minimize the risk of physical damage from repetitive 
movement, as well as not waste a lot of time manually pointing and 
clicking my way through so many images one image at a time :o)
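
On the scripting side, a check along these lines could show whether two 
runs registered the same set of pairs, regardless of which image comes 
first within a pair. This is only a sketch: normalise_pairs and the 
sample IDs are made up, and the rows are plain (id, id) tuples, so 
nothing here assumes the real database schema:

```python
def normalise_pairs(rows):
    """Turn (id_a, id_b) rows into a set of order-independent pairs."""
    return {(min(a, b), max(a, b)) for a, b in rows}


# Made-up sample data standing in for the pairs exported after two runs.
run1 = [(1, 11), (3, 13), (5, 15)]
run2 = [(11, 1), (13, 3), (1, 11)]

pairs1 = normalise_pairs(run1)
pairs2 = normalise_pairs(run2)

# Pairs that one run registered and the other did not.
missing_in_run2 = pairs1 - pairs2
missing_in_run1 = pairs2 - pairs1

print("only in run 1:", sorted(missing_in_run2))
print("only in run 2:", sorted(missing_in_run1))
```

If both differences are empty, the two runs contain the same pairs and 
only the order differs.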

Best regards :o)
Johnny :o)

On 23.01.2022 at 16:32, Maik Qualmann - metzpinguin at gmail.com wrote:
> [SNIP very nice ASCII DB table examples, showing what Johnny observes]
> 
> Although the 2nd output has 2 more lines, there are no Image IDs that 
> are not also present in the first output, some are now duplicates 
> because of a different order compared to other pairs formed.
> 
> 
> The different order is not only caused by the QSet; it would also 
> arise if you were to rename images. All results are in the range of 
> 65%-100%.
> 
> 
> Maik


