patch for new feature: acoustic fingerprinting and audio similarity
Soren Harward
stharward at gmail.com
Tue Aug 12 12:42:15 UTC 2008
On Tuesday 12 August 2008 0:41:17 you wrote:
> So it's MusicBrainz?
No. The big difference is that the fingerprints calculated by MusicBrainz,
the old Moodbar, or Last.fm don't attempt to preserve what the song "sounds
like". With those fingerprints, no matter what mathematical function you use
to compare two of them, you can't be sure that two songs that sound similar
will end up with a greater similarity value than two songs that sound wildly
different. In technical
terms, the metric space defined by these feature vectors (fingerprints)
doesn't correspond to the metric space of perceived human audio similarity.
The metric space defined by this new fingerprinting algorithm does.
The fingerprinting that I'm implementing in Amarok is based on an algorithm
defined by the Marsyas suite of software, which is developed by George
Tzanetakis. Marsyas performed very well in last year's MIREX, specifically
in the Audio Genre Classification and Audio Music Mood Classification tests,
which are most directly applicable to what I'm trying to accomplish:
http://www.music-ir.org/mirex/2007/index.php/Audio_Genre_Classification_Results
http://www.music-ir.org/mirex/2007/index.php/Audio_Music_Mood_Classification_Results
And most importantly, Marsyas is GPL software that's been under development
for 10 years. The holdup for the audio fingerprinting program (as described
in my announcement email) is that I'm trying to pare down the 8MB Marsyas
source code core into only what is needed to calculate the fingerprints.
I've got the fingerprinter working; it just isn't quite in a state that's
ready for distribution. I'll let you know when it is.
A program called Mirage has already been written to add similar functionality
to Banshee, and a GSoC student (Charlotte Curtis) is working on an audio
similarity project for Rhythmbox -- she also independently decided to use
Marsyas, and we're sharing fingerprint calculation code. The MusicIP Mixer
also implements this kind of functionality, though this program is not open
source and it lacks many features of Amarok and other audio players.
So, use cases:
1. I'm in the mood for only a certain kind of music, and I think that "Man on
the Moon" by R.E.M. represents this mood well. So I go to the dynamic
playlist editor and add a bias that everything has to be more than 80%
similar to "Man on the Moon" by R.E.M. "The Village Green Preservation
Society" by The Kinks and "We Both Go Down Together" by The Decemberists are
added to the playlist, but "Killing in the Name" by Rage Against the
Machine and "Mahadeva" by Astral Projection are excluded.
2. I want random shuffle, but I don't want any major jumps between genres; I
want a "smoothed" shuffle. So I add a restriction that each track has to be
at least 75% similar to the previous one. I start with "Busy Child" by The
Crystal Method, and the following songs play:
    "Take California" by The Propellerheads
    "Mindfields" by The Prodigy
    "Galvanize" by The Chemical Brothers
    "Satan" by Orbital
    "We Have Explosive" by The Future Sound of London
and so on. After an hour or so, the shuffle may wander over to "Hey Jude" by
The Beatles (which is only 15% similar to "Busy Child"), but it got there by
smoothly transitioning between songs.
2.1. As a perverse subcase, I could force the playlist to jump all over the
place by specifying that each track added has to be less than 20% similar to
the preceding one.
Both of these *could* be done using only the artists and the "retrieve similar
artists from Last.fm" function. But this is only a rough approximation that
fails to account for the diversity of an artist's music. For example, using
the artist similarity system, Moby and Air are about 80% similar, even though
their respective tracks "That's When I Reach for My Revolver" and "Alone in
Kyoto" are only about 25% similar. Such jumps are not really desirable.
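As a rough illustration of use cases 1 and 2 (a sketch only, not the actual
Amarok code), the two biases could look something like the Python below.
Everything in it is an assumption made for the example: fingerprints are
treated as plain tuples of numbers, the library is a dict mapping track names
to fingerprints, and similarity() is a hypothetical stand-in that squashes
Euclidean distance into the range 0 to 1. The real Marsyas-based measure may
well be different.

```python
import math
import random


def similarity(fp_a, fp_b):
    """Hypothetical similarity in (0, 1]: 1.0 means identical fingerprints."""
    return 1.0 / (1.0 + math.dist(fp_a, fp_b))


def similar_to_seed(seed, library, threshold=0.80):
    """Use case 1: keep every track more than `threshold` similar to the seed."""
    return [
        track
        for track in library
        if track != seed
        and similarity(library[seed], library[track]) > threshold
    ]


def smoothed_shuffle(start, library, length, threshold=0.75, rng=random):
    """Use case 2: random walk where each track is at least `threshold`
    similar to the one played just before it."""
    playlist = [start]
    while len(playlist) < length:
        previous = playlist[-1]
        candidates = [
            track
            for track in library
            if track not in playlist
            and similarity(library[previous], library[track]) >= threshold
        ]
        if not candidates:
            break  # nothing close enough is left, so stop early
        playlist.append(rng.choice(candidates))
    return playlist
```

Changing the comparison in smoothed_shuffle to "less than 0.20" would give
the perverse subcase 2.1.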
Possible use cases that would probably be better implemented as scripts which
call on the similarity functionality:
3. 30% of my tracks are missing genre tags, and I want to assign them to
genres, but I don't want to do it by hand. I run a script that assigns
tracks to genres based on how similar they are to tracks I've already
assigned to genres.
4. I want a "wake up" CD for the morning which starts off with "One Perfect
Sunrise" by Orbital and ends up at "Fuel" by Metallica. I run a script that
picks tracks from my collection to fill the 80 minutes with songs that
transition between the two.
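Scripts for use cases 3 and 4 might be sketched roughly as follows. Again,
this is an illustration under assumptions, not a real Amarok scripting API:
fingerprints are plain tuples of numbers, similarity() is a hypothetical
stand-in based on Euclidean distance, genres are guessed by a simple vote
among the k most similar tagged tracks, and the "wake up" playlist ignores
track durations and just picks a fixed number of steps along a straight line
between the two fingerprints.

```python
import math
from collections import Counter


def similarity(fp_a, fp_b):
    """Hypothetical similarity in (0, 1]: 1.0 means identical fingerprints."""
    return 1.0 / (1.0 + math.dist(fp_a, fp_b))


def guess_genre(fingerprint, tagged, k=3):
    """Use case 3: majority genre among the k most similar tagged tracks.

    `tagged` is a list of (fingerprint, genre) pairs.
    """
    nearest = sorted(
        tagged,
        key=lambda item: similarity(fingerprint, item[0]),
        reverse=True,
    )[:k]
    return Counter(genre for _, genre in nearest).most_common(1)[0][0]


def transition_playlist(start, end, library, steps):
    """Use case 4: walk from `start` to `end`, picking at each step the
    unused track closest to a point on the line between the two."""
    fp_start, fp_end = library[start], library[end]
    playlist = [start]
    for i in range(1, steps):
        frac = i / steps
        target = tuple(a + frac * (b - a) for a, b in zip(fp_start, fp_end))
        remaining = [t for t in library if t not in playlist and t != end]
        if not remaining:
            break  # the collection is too small to fill every step
        playlist.append(
            max(remaining, key=lambda t: similarity(library[t], target))
        )
    playlist.append(end)
    return playlist
```

A real script would also have to stop once the chosen tracks add up to the
80 minutes, which this sketch leaves out.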
I hope that's enough to explain what I'd like to be able to accomplish, and
why new functionality is needed. Your feedback about how I'm currently
implementing it is much appreciated; I'll make some changes to accommodate
these suggestions.
--
Soren Harward
stharward at gmail.com