patch for new feature: acoustic fingerprinting and audio similarity

Soren Harward stharward at gmail.com
Thu Aug 14 20:15:41 UTC 2008


On Tue, Aug 12, 2008 at 1:53 AM, Jeff Mitchell
<kde-dev at emailgoeshere.com> wrote:
> -- You appear to calculate the fingerprint for a track when the track is first
> accessed.  Although you are caching it in the database, this is likely to be
> a very long process.  So forcing it on the user is a no-no.

Okay, I've been thinking about how best to handle scanning files to
calculate the fingerprints.  I agree that calculating on load is a
bad idea.  So I wrote a separate program, modeled on the
collectionscanner, which takes a list of files and calculates
fingerprints for them, writing the results as XML to STDOUT.
Now I'm trying to figure out how best to get that data into Amarok.
As I think about it, there are a few different ways I could do
this:

1. Integrate the fingerprinting algorithm into collectionscanner, so
that the fingerprint is just one more XML field in the result.  This
would be the easiest thing to do, but it would make the collection
scanner very slow, and it would recalculate fingerprints for files
that already have them.

2. Change ScanManager so that it runs two processes in series: the
collectionscanner, and then the fingerprinter only on files that are
still missing fingerprints.  This, to me, seems like the best option,
though it requires major changes to ScanManager.

3. Change SqlCollection so that it has both a ScanManager and a
FingerprintScanManager, running each as needed.  The collection would
need to make sure the two don't run on top of each other.
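
A rough sketch of the control flow in option 2, written in Python for
brevity rather than Amarok's C++.  Every name here is hypothetical:
scan_collection and scan_fingerprints stand in for launching the two
external scanner processes, and the fingerprints dict stands in for
the fingerprint column in the database.  It only shows the "run in
series, skip files that already have a fingerprint" logic and is not
a proposal for the actual ScanManager API.

```python
from typing import Callable, Dict, List, Optional


def files_missing_fingerprints(
    files: List[str], fingerprints: Dict[str, Optional[str]]
) -> List[str]:
    """Return the files with no cached fingerprint, preserving input order."""
    return [path for path in files if fingerprints.get(path) is None]


def run_scans_in_series(
    files: List[str],
    scan_collection: Callable[[List[str]], None],
    scan_fingerprints: Callable[[List[str]], Dict[str, str]],
    fingerprints: Dict[str, Optional[str]],
) -> List[str]:
    """Run the metadata scan first, then fingerprint only what is missing.

    Returns the list of files that were handed to the fingerprinter.
    """
    # Phase 1: the ordinary (fast) collection scan over all files.
    scan_collection(files)

    # Phase 2: the slow fingerprint scan, restricted to files that
    # do not have a cached fingerprint yet.
    pending = files_missing_fingerprints(files, fingerprints)
    if pending:
        fingerprints.update(scan_fingerprints(pending))
    return pending
```

Because the second phase starts only after the first returns, the two
scans cannot overlap, and files that already have a fingerprint are
never recalculated.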

So, suggestions about which one of these approaches I should follow?

-- 
Soren Harward


