[digikam] [Bug 376661] When importing ~200,000 video files Digikam crashes in about 2-5 seconds of starting.

Poz bugzilla_noreply at kde.org
Sat Feb 25 23:50:20 GMT 2017


https://bugs.kde.org/show_bug.cgi?id=376661

--- Comment #17 from Poz <pozniakdp at gmail.com> ---
Wow the discussion here is fantastic. Thank you for the time and thought!

So yes, the approach I suggested of just using the thumbnails is clearly not
robust enough given the wide array of video content out there.
I think a lot of the problems come from very uniform videos, for example
standard intros or outros. My case has very non-uniform videos (without any
intros or outros), where I can run through Windows Explorer and find duplicates
myself simply by looking at the thumbnails, so I know at least 20% are
duplicates just from simple observation. The problem is that it is too much to
go through that many files and click each one individually. I have used Digikam
before on photos for duplicates and was amazed at how well it worked so
naturally I thought, 'man, I wish I could get digikam to access these
thumbnails for me, I could get rid of +95% of these duplicates in a day'. I
know there could be false positives, but I could live with 1% or something like
that. To further get rid of false positives there could be a video length
option of +/-X seconds (default of 2 or so).
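
The idea above (match on thumbnails, then filter by video length within a
tolerance) could be sketched roughly as follows. This is only an illustration,
not how Digikam does it: the function names, the tiny 4-pixel "thumbnails", the
simple average-hash fingerprint, and the `max_bits` threshold are all
assumptions made for the example. Only the 2-second default comes from the
suggestion above.

```python
from itertools import combinations


def average_hash(pixels):
    """Return a bit tuple: 1 where a pixel is brighter than the mean."""
    mean = sum(pixels) / len(pixels)
    return tuple(1 if p > mean else 0 for p in pixels)


def hamming(a, b):
    """Count the positions where two equal-length bit tuples differ."""
    return sum(x != y for x, y in zip(a, b))


def likely_duplicates(videos, max_bits=2, max_seconds=2.0):
    """Pair up videos whose thumbnails and durations both nearly match.

    videos: list of (name, grayscale_thumbnail_pixels, duration_seconds).
    max_bits and max_seconds are hypothetical tuning knobs.
    """
    hashed = [(name, average_hash(px), dur) for name, px, dur in videos]
    pairs = []
    for (n1, h1, d1), (n2, h2, d2) in combinations(hashed, 2):
        if hamming(h1, h2) <= max_bits and abs(d1 - d2) <= max_seconds:
            pairs.append((n1, n2))
    return pairs


# Made-up sample data: the third entry has the same thumbnail as the first
# but a very different length, so the duration check rejects it.
videos = [
    ("a.mp4", [10, 200, 30, 220], 61.0),
    ("a_copy.avi", [12, 198, 33, 215], 62.5),
    ("same_thumb_other_video.mp4", [10, 200, 30, 220], 300.0),
]
print(likely_duplicates(videos))  # [('a.mp4', 'a_copy.avi')]
```

Comparing every pair like this is quadratic in the number of files, so for
~200,000 videos a real tool would need to bucket or index the fingerprints
first rather than loop over all pairs.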

I currently use http://www.alldup.de/alldup_help/alldup.php
The content method works very well, I would say less than 0.001% false
positives. But it misses so very, very much. It can take up to 48 hours to run,
but it builds a database so it only compares new files added to the search. I
even use the file size method; for large files, this works very well. Smaller
files (<10 MB?) tend to have more false positives. Unfortunately, due to
different compression and file types, this does not catch them all either.
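
For illustration, here is a minimal sketch of how a size check and a content
check can be combined. It is not AllDup's actual implementation; the function
names and the choice of SHA-256 are assumptions. Files are grouped by size
first (cheap), and only files that share a size are hashed (expensive). It
also shows why re-encoded copies are missed: a different codec or container
changes both the size and the bytes.

```python
import hashlib
import os
from collections import defaultdict


def content_digest(path, chunk_size=1 << 20):
    """Hash a file in chunks so large videos are never fully loaded."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def exact_duplicates(paths):
    """Return groups of byte-identical files among the given paths."""
    # Step 1: bucket by size; a file with a unique size has no exact copy.
    by_size = defaultdict(list)
    for path in paths:
        by_size[os.path.getsize(path)].append(path)

    # Step 2: within each size bucket, confirm matches by content hash.
    groups = []
    for same_size in by_size.values():
        if len(same_size) < 2:
            continue
        by_digest = defaultdict(list)
        for path in same_size:
            by_digest[content_digest(path)].append(path)
        groups.extend(sorted(g) for g in by_digest.values() if len(g) > 1)
    return groups
```

Stopping after step 1 is the pure file size method: two unrelated small files
can easily share a size, which fits the higher false positive rate seen on
small files.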

I think in the end, until computer hardware is faster, video duplicate searches
will require a number of different methods and some user input. Until then that
is what we have to work with/around. I was just hoping for another way to slim
down this video database. Thumbnails seemed like low-hanging fruit.

-- 
You are receiving this mail because:
You are the assignee for the bug.

