Proposal for new music analysis algorithm with very practical uses inside Amarok

Sun Oct 12 23:40:37 CEST 2008

= Formal proposal sent to the Amarok developers =

Hello, gents.

Let's make this quick, since it's a long writeup.  First of all, Amarok rocks.  
Second, the purpose of this message is to put a proposal forward for a long-
overdue Amarok improvement, one that will give Amarok a series of features no 
other player has.

I'm talking about integrating Butterscotch into Amarok, and (in the 
medium-term future) replacing the features powered by MusicBrainz with a 
service that could, very conceivably, be integrated with Last.FM (this is 
technically possible).

So, what is it?  [ButterScotch Butterscotch] is an algorithm to fingerprint 
songs that is not designed to provide a unique ID for a file (technically, 
it can, as long as the file has not been transcoded or you have the first 2 
minutes of audio, but that's not the point).  Read that again: it is NOT like 
MusicBrainz (MB) or liblastfm or 
[http://mail.kde.org/pipermail/amarok/2008-August/006488.html Soren's patch].

Butterscotch's unique property -- what makes Butterscotch interesting -- is 
that it lets you identify "duplicates".  Let me illustrate by example: let's 
assume you have a track named "What hurts the most" in MP3 format 128 kbps, 
from an album you ripped that is called "Perfect day", and that you have 
another track "What hurts the most (radio mix U.S.)" from an album called 
"What hurts the most CD maxi" that you ripped in FLAC lossless.  Now, upon 
listening, you will discover that those are the exact same tracks, maybe one 
of them starts half a second later, but they sound exactly the same (except 
for the maxi cut being in perfect CD quality).  Butterscotch will tell you 
beyond a doubt that those songs are the same using simple math (correlation 
coefficient averages).
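
To make "correlation coefficient averages" concrete, here is a minimal sketch in Python.  It is not Butterscotch's actual code.  It assumes a fingerprint is a list of frequency bands, each band being a list of intensities over time; that layout, the function names and the `max_shift` parameter are all assumptions made for illustration.  The 0.9 cutoff is borrowed from the non-exact match threshold mentioned further down.

```python
from math import sqrt

DUPLICATE_THRESHOLD = 0.9  # assumed cutoff, taken from the "> 0.9" figure in this proposal


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n < 2:
        return 0.0  # too short to say anything about shape
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat series has no shape to compare
    return cov / sqrt(var_x * var_y)


def similarity(fp_a, fp_b):
    """Average the per-band correlations of two fingerprints.

    Hypothetical layout: fp[band][frame] is the intensity of one
    frequency band at one point in time.
    """
    scores = []
    for band_a, band_b in zip(fp_a, fp_b):
        n = min(len(band_a), len(band_b))  # compare the overlapping part only
        scores.append(pearson(band_a[:n], band_b[:n]))
    return sum(scores) / len(scores) if scores else 0.0


def best_similarity(fp_a, fp_b, max_shift=2):
    """Try small time offsets so a copy that starts a bit later still matches."""
    best = similarity(fp_a, fp_b)
    for shift in range(1, max_shift + 1):
        a_late = [band[shift:] for band in fp_a]  # drop leading frames of A
        b_late = [band[shift:] for band in fp_b]  # drop leading frames of B
        best = max(best, similarity(a_late, fp_b), similarity(fp_a, b_late))
    return best


def is_duplicate(fp_a, fp_b):
    """True when the two fingerprints correlate above the assumed cutoff."""
    return best_similarity(fp_a, fp_b) > DUPLICATE_THRESHOLD
```

Because correlation ignores overall scale and offset, a quieter or lossy copy whose band intensities keep the same shape still scores close to 1, and the small shift search covers the "starts half a second later" case.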

Several interesting possibilities follow from that (stage 1):

 * Amarok can auto-identify duplicates and normalize statistics (ratings, play 
counts and scores) across them.  This is useful for those of us who prefer full 
albums.  Also, no duplicates and normalized statistics mean that the true 
favorites really bubble up this time.
 * Amarok can avoid putting duplicates on your portable devices, where space 
is a concern.
 * We can now build a UI to weed duplicates out easily.  This is useful for 
those who prefer single tracks and no duplicates.
 * Amarok could tell you "this track is already in your collection under a 
different album" upon UI actions, such as "copy to collection".
 * (In the future, with another *really cheap* music analysis algorithm) 
Amarok can automatically select / visually identify the highest-quality 
versions of duplicate tracks for playback or portable devices.
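
The first item in the list above could be sketched roughly like this.  It assumes the pairwise comparison has already produced a list of duplicate pairs; the track IDs, the `stats` layout and the merge policy (sum the play counts, keep the highest rating) are all hypothetical choices, not anything Amarok does today.

```python
from collections import defaultdict


def group_duplicates(track_ids, duplicate_pairs):
    """Group tracks into duplicate sets with a small union-find.

    If A matches B and B matches C, all three end up in one group.
    """
    parent = {t: t for t in track_ids}

    def find(t):
        # Walk up to the group representative, shortening the path as we go.
        while parent[t] != t:
            parent[t] = parent[parent[t]]
            t = parent[t]
        return t

    for a, b in duplicate_pairs:
        parent[find(a)] = find(b)

    groups = defaultdict(list)
    for t in track_ids:
        groups[find(t)].append(t)
    # Tracks without any duplicate are not interesting here.
    return [group for group in groups.values() if len(group) > 1]


def normalize_stats(stats, group):
    """Give every track in a duplicate group the same merged statistics.

    Assumed policy: play counts add up, the highest rating wins.
    """
    merged = {
        "playcount": sum(stats[t]["playcount"] for t in group),
        "rating": max(stats[t]["rating"] for t in group),
    }
    for t in group:
        stats[t] = dict(merged)
    return merged
```

The same groups would also drive the other items: copy only one member of each group to a portable device, or list the groups in a "weed out duplicates" dialog.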

Now, how is this better than using MusicBrainz and the like?  Simple: you 
don't need a central server or unique IDs for tracks.  (MusicDNS, the partner 
of MusicBrainz, works that way, and it has a series of technical problems 
that make it less reliable than Butterscotch.)  All you need is computation 
power, which your computer has aplenty.  MusicBrainz also has problems with 
false positives and the like.  And you can decode any format, not just the 
formats that a closed source library can.

But if you throw a central server into the mix (stage 2), this gets much 
better:

 * Amarok can auto-submit tags+fingerprints+amarokuniqueid to the server.
 * Amarok can avoid computing the fingerprint by asking the server for the 
fingerprint corresponding to a uniqueid.
 * Amarok can auto-tag songs based on exact matches of the fingerprints, AND 
it can also show the user alternatives in case of non-exact matches 
(correlations > 0.9 in the current definition of the algorithm) which should 
be very few.  Imagine being able to complete incomplete tags from a database 
vetted by majority "vote" (submission, see below).
 * We can build the definitive music encyclopedia all by ourselves (okay, 
technically others can play too).  We can use that data to provide very 
accurate tagging since our service would know, mathematically, which tags 
correspond to which fingerprints.  We can also make it very user-
participative, allowing user submissions to grow the database and letting the 
user say "no, this info is wrong" so we get a self-correcting encyclopedia 
with minimal user or admin intervention.  The web page could also be wiki-like 
and allow others (even artists and producers themselves) to complete 
information that is not available from the ID3 tags.
 * We can make the service *completely anonymous* for Amarok users.
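
The "vetted by majority vote" idea from the list above could look something like this on the server side.  This is only a sketch: the `TagVotes` class, its method names and the `fingerprint_id` key (some stable identifier derived from a fingerprint) are hypothetical, and no such service exists yet.

```python
from collections import Counter, defaultdict


class TagVotes:
    """Collect tag submissions per fingerprint and report the majority value."""

    def __init__(self):
        # fingerprint_id -> tag field -> Counter of submitted values
        self._votes = defaultdict(lambda: defaultdict(Counter))

    def submit(self, fingerprint_id, tags):
        """Record one anonymous submission; empty values are ignored."""
        for field, value in tags.items():
            value = value.strip()
            if value:
                self._votes[fingerprint_id][field][value] += 1

    def reject(self, fingerprint_id, field, value):
        """A user said "no, this info is wrong": count it as a vote against."""
        self._votes[fingerprint_id][field][value] -= 1

    def consensus(self, fingerprint_id):
        """Return the most-voted value for every field that has support."""
        result = {}
        for field, counter in self._votes.get(fingerprint_id, {}).items():
            value, count = counter.most_common(1)[0]
            if count > 0:
                result[field] = value
        return result
```

A client would submit its tags once per track and later call `consensus()` to fill in whatever is missing locally.  Since nothing but tags and a fingerprint key is stored, the sketch needs no user identity, which fits the anonymity goal.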

There are a few technical roadblocks I still have to cross (the biggest one 
being that identical songs sung by the same performer but in different 
languages correlate too highly), but the technology should be sufficiently 
solid for me to write an Amarok 2 extension that makes stage 1 a reality in 
the next few days.

So, before you commit yourselves exclusively to a MusicBrainz or Last.FM type 
of thing, maybe you'd like to give me a shot and perhaps help me along?

Relevant documentation and working code can be found at:

http://projects.rudd-o.com/python-audioprocessing

Luck there!	
-- 

	Manuel Amador (Rudd-O) <rudd-o at rudd-o.com>
	Rudd-O.com - http://rudd-o.com/
	GPG key ID 0xC8D28B92 at http://wwwkeys.pgp.net/

Now playing, courtesy of Amarok: Weird Al Yankovic - Melanie
When confronted by a difficult problem, you can solve it more easily by
reducing it to the question, "How would the Lone Ranger handle this?"
