[digiKam-users] Bug? Not all duplicates found.

Maik Qualmann metzpinguin at gmail.com
Sun Jan 23 15:32:15 GMT 2022


No, it's not a bug. Here are two outputs of the same search, taken after two different program starts:

MariaDB [digikam]> SELECT * FROM ImageSimilarity;
+----------+----------+-----------+--------------------+
| imageid1 | imageid2 | algorithm | value              |
+----------+----------+-----------+--------------------+
|    92937 |    92938 |         1 | 0.7276767902341286 |
|    92908 |    92928 |         1 | 0.7030320169782883 |
|    92931 |    92932 |         1 | 0.8336685456455954 |
|    92933 |    92934 |         1 | 0.6706040721483033 |
|    92900 |    92913 |         1 | 0.7505396291793182 |
|    92905 |    92915 |         1 | 0.7405491284049031 |
|    92904 |    92905 |         1 | 0.7127389155463217 |
|    92907 |    92908 |         1 | 0.8155335605445546 |
|    92909 |    92910 |         1 | 0.7758440327474614 |
|    92899 |    92900 |         1 | 0.6565841722229999 |
+----------+----------+-----------+--------------------+
10 rows in set (0.001 sec)

MariaDB [digikam]> SELECT * FROM ImageSimilarity;
+----------+----------+-----------+--------------------+
| imageid1 | imageid2 | algorithm | value              |
+----------+----------+-----------+--------------------+
|    92905 |    92915 |         1 | 0.7405491284049031 |
|    92900 |    92913 |         1 | 0.7505396291793182 |
|    92908 |    92927 |         1 |  0.669382620559168 |
|    92927 |    92928 |         1 |  0.732409125233628 |
|    92899 |    92903 |         1 | 0.6504303766351522 |
|    92909 |    92910 |         1 | 0.7560619468353904 |
|    92907 |    92908 |         1 | 0.8155335605445546 |
|    92904 |    92905 |         1 | 0.7127389155463217 |
|    92934 |    92935 |         1 |  0.671921833944039 |
|    92933 |    92934 |         1 | 0.6706040721483033 |
|    92931 |    92932 |         1 | 0.8449821703465952 |
|    92937 |    92938 |         1 | 0.7095138493232303 |
+----------+----------+-----------+--------------------+
12 rows in set (0.000 sec)

Although the 2nd output has 2 more rows, it contains no image IDs that are not also
present in the first output. Some images now appear as duplicates of a different
partner because the pairs were formed in a different order.

The order differs not only because of the QSet; it would also change if you were
to rename images. All results are within the 65%-100% range.
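
To illustrate how the comparison order alone can change which pairs end up stored, here is a minimal Python sketch. It is a toy model, not digiKam's actual search code: the scores in SCORES, the 0.65 threshold constant, and the helper names stored_pairs and image_ids are all assumptions made up for this example.

```python
# Toy model (NOT digiKam's real algorithm): three near-identical images
# with hypothetical, symmetric similarity scores.
SCORES = {
    frozenset({1, 2}): 0.82,
    frozenset({2, 3}): 0.74,
    frozenset({1, 3}): 0.69,
}
THRESHOLD = 0.65  # assumed lower bound of the search range


def stored_pairs(order):
    """Walk the IDs in the given order. Each reference image is compared
    only with later images that have not yet been assigned to a reference."""
    assigned = set()
    pairs = []
    for i, ref in enumerate(order):
        if ref in assigned:
            continue  # already recorded as a duplicate of an earlier image
        for other in order[i + 1:]:
            if other in assigned:
                continue
            score = SCORES.get(frozenset({ref, other}), 0.0)
            if score >= THRESHOLD:
                pairs.append((min(ref, other), max(ref, other), score))
                assigned.add(other)
        assigned.add(ref)
    return sorted(pairs)


def image_ids(pairs):
    """Collect every image ID that occurs in any stored pair."""
    return {i for a, b, _ in pairs for i in (a, b)}


if __name__ == "__main__":
    for order in ([1, 2, 3], [2, 3, 1]):
        pairs = stored_pairs(order)
        print(order, pairs, sorted(image_ids(pairs)))
```

In this toy model both orders cover the same three images, but the stored rows differ: one run records the pair (1, 3), the other records (2, 3).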

Maik

On Sunday, 23 January 2022, 13:51:27 CET,
digikam-users.johnny1000 at spamgourmet.com wrote:
> Thank you Maik for taking the time :o)
> 
> Is it a bug then, so I should open a bug report?
> 
> I think I understand most of what you explain about QSet, since I know
> sets from Python, and they too are unordered and use hashes for their
> internal logistics.
> 
> The difference between GUI and database is a different matter, but it's
> a waste of time to go further into that if the behaviour I see is simply
> a bug :o)
> 
> Best regards :o)
> 
> Johnny :o)
> 
> On 22.01.2022 at 15:44, Maik Qualmann - metzpinguin at gmail.com wrote:
> > Well, the result you see in the GUI certainly won't differ. The entries
> > you see in the ImageSimilarity table are not the complete end result.
> > Why are there different entries in this table? We use a QSet to store the
> > image IDs. QSet is unordered and uses a hash to store the information.
> > Therefore, the order of the image IDs in the QSet is never the same across
> > program starts. As a result, the same images are not always compared
> > with one another. This can produce additional entries in the table, but
> > they fall within the same search range. The end result is always the same.
> > 
> > Maik
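
Since Python sets came up in this thread, the effect of a hash-based, unordered container can be tried out there as an analogy. The sketch below is not Qt code, and the image names and the helper order_with_seed are invented for illustration. It relies on the fact that Python randomizes string hashes per process unless the PYTHONHASHSEED environment variable pins the seed.

```python
import os
import subprocess
import sys

# Child program: print the iteration order of a small set of strings.
# The names are hypothetical placeholders, not real digiKam data.
SNIPPET = "print(','.join({'img_a', 'img_b', 'img_c', 'img_d'}))"


def order_with_seed(seed):
    """Run SNIPPET in a fresh interpreter with a fixed hash seed and
    return the set's iteration order as a list of strings."""
    env = dict(os.environ, PYTHONHASHSEED=str(seed))
    result = subprocess.run(
        [sys.executable, "-c", SNIPPET],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip().split(",")


if __name__ == "__main__":
    # Same elements every time; the order may differ from seed to seed.
    for seed in (1, 2, 3):
        print(seed, order_with_seed(seed))
```

The set always holds the same four elements; only the order in which they are visited depends on the seed, and anything that processes the elements pairwise in iteration order inherits that dependence.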

