[digiKam-users] Bug? Not all duplicates found.
digikam-users.johnny1000 at spamgourmet.com
Fri Jan 21 13:18:48 GMT 2022
Greetings,
I can't readily find anything about this in the bug tracker.
Is anyone else seeing what is described below?
== Test setup
First I use the maintenance tools to make sure that all images are
known to DigiKam, that all images have had their fingerprints generated,
and that the databases have been cleaned.
I set the similarity range to 50%-100% (that is a useful range in my use
case).
I make sure no images are added, deleted or moved between tests, so each
test has _exactly_ the same starting point.
As far as I know all parameters stay the same all the time.
I close DigiKam, and with the application sqlitebrowser, I manually
delete the duplicates registered in similarity.db by deleting all rows
in the table ImageSimilarity, so DigiKam has to start from scratch when
finding duplicates.
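
For anyone who would rather script the reset than click through sqlitebrowser, here is a minimal sketch using Python's standard sqlite3 module. It assumes DigiKam is closed and that you pass the path to your own similarity.db; the function name reset_similarity is just something made up for illustration, not part of DigiKam.

```python
import sqlite3


def reset_similarity(db_path):
    """Delete all rows from ImageSimilarity; return how many rows were there.

    Assumption: db_path points at DigiKam's similarity.db and DigiKam is
    not running while this executes.
    """
    conn = sqlite3.connect(db_path)
    try:
        # Count first, so the return value does not depend on how the
        # driver reports rowcount for an unconditional DELETE.
        (before,) = conn.execute(
            "SELECT COUNT(*) FROM ImageSimilarity"
        ).fetchone()
        with conn:  # commits on success, rolls back on error
            conn.execute("DELETE FROM ImageSimilarity")
        return before
    finally:
        conn.close()
```

Making a copy of similarity.db before running something like this is probably a good idea.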
== The test
1. Open DigiKam.
2. Go to Tools -> Maintenance and run _only_ Find duplicate items.
3. Close DigiKam.
4. Open similarity.db and check the number of rows in the table
ImageSimilarity.
5. Close similarity.db
Repeat 1-5 and note the number of rows in the database.
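
Steps 4 and 5 can also be done without a GUI. The following is only a sketch, again with Python's sqlite3 module; the function name count_similarity_rows is hypothetical, and the path to similarity.db is whatever it is on your system.

```python
import sqlite3


def count_similarity_rows(db_path):
    """Return the number of rows currently in the ImageSimilarity table.

    Assumption: db_path points at DigiKam's similarity.db and DigiKam
    has been closed, as in step 3.
    """
    conn = sqlite3.connect(db_path)
    try:
        (count,) = conn.execute(
            "SELECT COUNT(*) FROM ImageSimilarity"
        ).fetchone()
        return count
    finally:
        conn.close()  # step 5: release the database file again
```

Calling it once after each DigiKam run and writing down the result gives the sequence of row counts discussed in the results.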
== Test results
In my case the number keeps growing with each repeat, as if DigiKam
doesn't find _all_ duplicates the first time around.
Eventually it _seems_ as though the number of rows stabilizes. That is,
no additional duplicates are found on further repeats.
I stopped the individual test when the number hadn't changed for 3 repeats.
I have to repeat several times before the number stabilizes.
I can reproduce this behaviour at will.
I just delete all rows from the ImageSimilarity table to reset to 0
known duplicates.
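
The stopping rule used above (no change for 3 repeats) could be written down roughly like this. It is a sketch under one assumption: "unchanged for 3 repeats" is read as the last 3 runs each giving the same count as the run before them, so 4 equal readings in a row. The name has_stabilized is hypothetical.

```python
def has_stabilized(counts, unchanged_repeats=3):
    """Return True if the row count did not change over the last repeats.

    counts: list of row counts, one per run, oldest first.
    Assumption: "unchanged for N repeats" means the last N + 1 readings
    are all equal.
    """
    needed = unchanged_repeats + 1
    if len(counts) < needed:
        return False  # not enough runs yet to decide
    tail = counts[-needed:]
    return all(value == tail[0] for value in tail)
```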
Important to note:
1. I see this behaviour _only_ when restarting DigiKam between each repeat.
   If I leave DigiKam open between repeats, the number of rows does not change.
2. Each time I reset the database, the starting number of rows is different
   from the starting number in the previous test.
   The stabilizing number is also different from test to test.
3. I have consistently declined to download the large binary files needed
   for face recognition and red eye removal.
   I have no use for those functions, and want that 1/3 of a gigabyte for
   stuff I actually _do_ use.
   I don't see any obvious connection between those two functions and the
   find duplicates function, but of course I could be wrong.
== Conclusion
Is this a bug, or is this expected behaviour?
Any insights into this would be much appreciated.
And thank you to all developers for making DigiKam for us! :o)
Best regards
Johnny :o)