[digikam] [Bug 376661] When importing ~200,000 video files Digikam crashes in about 2-5 seconds of starting.

Mario Frank bugzilla_noreply at kde.org
Fri Feb 24 08:32:48 GMT 2017


https://bugs.kde.org/show_bug.cgi?id=376661

--- Comment #15 from Mario Frank <mario.frank at uni-potsdam.de> ---
Hi again,

This will be quite a long text - sorry. But I want to make the problems as
clear as possible.

I thought about the fuzzy search for videos a bit more during my train travel.
In fact, even the first non-plain frame is worthless. If a user really wants to
use digiKam as a catalog for videos (which is not the scope of digiKam in the
first place IMHO), he will potentially have videos that share the same
beginning, i.e. the same intro, but are otherwise different videos. Thus, even
the first non-plain frame will potentially lead to rubbish. I remember that I
found some tools for finding video duplicates. The process they applied was to
take the first n images of each video and compare them to those of all other
videos. A quite bad process IMO, as with m videos you generate n*m images and
then have to compare them pairwise. This is awfully bad from the viewpoint of
complexity theory. And in practice, this process is, as can be expected,
awfully slow.
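
For illustration, here is a back-of-the-envelope sketch (not digiKam code; the
frame count n is an assumed value) of how the numbers explode with this naive
approach:

#include <cstdint>
#include <iostream>

int main()
{
    const std::uint64_t m = 200000; // videos in the collection (number from this report)
    const std::uint64_t n = 10;     // frames extracted per video (assumed)

    const std::uint64_t extractedImages  = n * m;               // images to decode and store
    const std::uint64_t videoPairs       = m * (m - 1) / 2;     // every video against every other
    const std::uint64_t frameComparisons = videoPairs * n * n;  // naive frame-by-frame matching

    std::cout << "extracted images:  " << extractedImages  << '\n'
              << "video pairs:       " << videoPairs       << '\n'
              << "frame comparisons: " << frameComparisons << '\n';
}

With the numbers above, that is already two million extracted images and about
two trillion frame comparisons - which matches how slow these tools feel in
practice.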

Nevertheless, this process is probably the best way to really recognise
duplicate videos. So, one way could be to generate a fingerprint over the first
or last n images (which slows down fingerprint generation extremely). This is
still not robust, as many videos may have the same intro (at least the first m
seconds, i.e. about m*25 frames). Usual intros take many seconds, so a *rather*
stable approach would be to take 1000 frames. As you can imagine, this is a big
amount of data to compute fingerprints for. Just imagine your 200,000 videos.
Fingerprinting them would mean generating 200,000,000 images. Every image must
be decoded, which is not a constant-time operation but takes at least linear
time. So, even with 1000 videos, I would expect the computation time to be
measured in hours, not minutes.
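
A rough estimate (the 20 ms per frame is purely an assumption on my side, not a
measurement) already shows the order of magnitude:

#include <initializer_list>
#include <iostream>

int main()
{
    const double framesPerVideo  = 1000.0; // the "rather stable" setting from above
    const double secondsPerFrame = 0.02;   // assumed: 20 ms to decode and hash one frame

    for (const double videos : {1000.0, 200000.0})
    {
        const double hours = videos * framesPerVideo * secondsPerFrame / 3600.0;
        std::cout << videos << " videos -> about " << hours << " hours\n";
    }
}

Under that assumption, 1000 videos already take around 5-6 hours, and the full
200,000 videos land in the range of weeks.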

Let's look at it from the other side: outros are far more distinct than
intros. So, a lower number n can be taken, e.g. 100. This reduces the time
quite a lot, but is probably still not satisfying.

If there are no intros/outros, or only short ones, a few images should be
sufficient and the process could work quite well.

But we cannot predict how the videos are structured. The FPS count may/will
differ from video to video. So, working on an explicit number of frames may
again lead to low-quality results. The best way would therefore be to take the
first/last n seconds - though then the complexity cannot really be estimated.
Also, I think users should decide for themselves how many seconds are taken
(configuration) and whether the beginning or the ending is used (configuration
again).
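
A minimal sketch of that idea, assuming the frame rate can be read from the
video metadata (function and parameter names are my own, not existing digiKam
API):

#include <cmath>
#include <iostream>

// Translate the user-configured duration into a per-video frame count,
// so that the sampled span is the same regardless of the frame rate.
int framesToSample(double configuredSeconds, double frameRate)
{
    return static_cast<int>(std::ceil(configuredSeconds * frameRate));
}

int main()
{
    std::cout << framesToSample(5.0, 25.0) << '\n'; // 125 frames for a 25 fps video
    std::cout << framesToSample(5.0, 60.0) << '\n'; // 300 frames for a 60 fps video
}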

So, *if* this feature should be implemented, I see the following options for
users (a small settings sketch follows the list):
1) Take the first non-plain frame for fingerprinting (fast, probably not
suitable for e.g. cinema movies)
2) Take the first n seconds for fingerprinting (probably awfully slow, may be
suitable for e.g. cinema movies, overkill for self-produced movies)
3) Take the last n seconds for fingerprinting (probably slow, probably suitable
for e.g. cinema movies, less overkill for self-produced movies)
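
As a rough illustration of how these options could be exposed as settings - the
names and structure here are assumptions, not existing digiKam code:

#include <iostream>

// Hypothetical settings for video fingerprinting, mirroring the options above.
enum class VideoFingerprintMode
{
    FirstNonPlainFrame, // option 1
    FirstSeconds,       // option 2
    LastSeconds         // option 3
};

struct VideoFingerprintSettings
{
    VideoFingerprintMode mode    = VideoFingerprintMode::FirstNonPlainFrame;
    int                  seconds = 5; // only used for FirstSeconds / LastSeconds
};

int main()
{
    VideoFingerprintSettings settings;
    settings.mode    = VideoFingerprintMode::LastSeconds;
    settings.seconds = 10;
    std::cout << "mode=" << static_cast<int>(settings.mode)
              << " seconds=" << settings.seconds << '\n';
}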

In more precise algorithmic terms, we would need an adaptation of the
fingerprints maintenance stage:
Option 1: take the first non-plain frame for video fingerprints
Option 2: take the first or last n seconds for video fingerprinting (both the
number n and first/last being configurable)
Changing these options *must* trigger deletion of the current video
fingerprints, as otherwise different fingerprint variants would coexist, which
leads to wrong results - unless rebuilding all fingerprints is chosen. A small
sketch of that invalidation rule follows.
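
As an illustration of that rule (again with made-up names; digiKam's
maintenance code is of course structured differently):

#include <iostream>

// Hypothetical: compare the settings used when the stored video fingerprints
// were generated with the currently configured settings. If they differ, the
// stored fingerprints must be deleted (or rebuilt) before any fuzzy search,
// so that fingerprints produced with different settings never coexist.
struct FingerprintSettings
{
    int mode;    // 0 = first non-plain frame, 1 = first n seconds, 2 = last n seconds
    int seconds; // configured duration, only relevant for modes 1 and 2

    bool operator==(const FingerprintSettings& other) const
    {
        return mode == other.mode && seconds == other.seconds;
    }
};

bool mustInvalidateVideoFingerprints(const FingerprintSettings& stored,
                                     const FingerprintSettings& current)
{
    return !(stored == current);
}

int main()
{
    const FingerprintSettings stored{0, 5};
    const FingerprintSettings current{2, 10};

    if (mustInvalidateVideoFingerprints(stored, current))
    {
        std::cout << "Settings changed - delete or rebuild all video fingerprints first.\n";
    }
}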

Then, the fuzzy search could probably work without adaptations - but I am not
completely sure whether it would work out of the box.

Best,
Mario
