gsoc idea and query

Wed Feb 25 15:27:39 UTC 2009

Soren,

I'm currently part of a research group[1] at Georgia Tech that is
looking at the same problem: generating recommendations with a
content-based approach.  We're creating a distance matrix based on
MFCCs of many small snippets of the tracks, modeling the distributions
of these MFCCs as a Gaussian mixture model for each track, and then
comparing those distributions with Earth mover's distance.  We're
running the algorithm on only about 1700 tracks of Indian music, but
it's still taking dozens of hours to complete the full analysis
(because by the nature of the approach, it's an O(n^2) algorithm --
calculating the distance between every pair of songs).  Songs
with a smaller distance are considered more similar, so we pull
a nearby song as a recommendation for the current track.
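
A minimal sketch of the all-pairs structure described above, to show
where the O(n^2) cost comes from.  It is not the research group's code
and rests on loud assumptions: each track is reduced to a small 1-D
histogram instead of a Gaussian mixture over MFCCs, and emd_1d is a
simplified stand-in for the Earth mover's distance between mixtures.
All function names here are hypothetical.

```python
from itertools import combinations


def emd_1d(p, q):
    """Earth mover's distance between two 1-D histograms.

    Assumes both histograms use the same unit-spaced bins and sum to 1.
    In that special case the distance is the sum of absolute differences
    between the two cumulative distributions.
    """
    if len(p) != len(q):
        raise ValueError("histograms must share the same bins")
    carried = 0.0  # mass that still has to move past the current bin
    total = 0.0
    for pi, qi in zip(p, q):
        carried += pi - qi
        total += abs(carried)
    return total


def distance_matrix(tracks, dist=emd_1d):
    """Build a symmetric n x n matrix, calling dist once per unordered pair."""
    n = len(tracks)
    matrix = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        matrix[i][j] = matrix[j][i] = dist(tracks[i], tracks[j])
    return matrix


def recommend(matrix, current):
    """Return the index of the track closest to `current` (excluding itself)."""
    others = (j for j in range(len(matrix)) if j != current)
    return min(others, key=lambda j: matrix[current][j])


def pair_count(n):
    """Number of distance evaluations needed for n tracks."""
    return n * (n - 1) // 2


# Toy data (made up): three "tracks" as 3-bin histograms.
tracks = [
    [1.0, 0.0, 0.0],
    [0.5, 0.5, 0.0],
    [0.0, 0.0, 1.0],
]
matrix = distance_matrix(tracks)
print(recommend(matrix, 0))  # nearest neighbour of track 0
print(pair_count(1700))      # pairs to evaluate for 1700 tracks
```

Exploiting symmetry halves the work but does not change the growth
rate: 1700 tracks still need 1700 * 1699 / 2 = 1,444,150 distance
evaluations, so the cost of a single comparison dominates the total
running time.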

I'm interested in how you're doing the recommendations -- what
features of tracks you're using and how you can create them on
demand, if that's what you're doing, without killing the CPU.

Do you have the code publicly available somewhere to look at?

Best,
Andrew Ash

[1] http://paragchordia.com/research/cbr.html

On Wed, Feb 25, 2009 at 7:26 AM, Soren Harward <stharward at gmail.com> wrote:
> On 2/25/09, amit sethi <amit.pureenergy at gmail.com> wrote:
>> I have been working on the idea of an automatic playlist generator.
>
> I've already got a playlist generator, including similarity metrics,
> about 80% done. It's sitting in a non-trunk git branch because it
> needs some UI work and a code review before we put it into the trunk
> (hopefully) some time in the 2.2 series.  So if you'd like to help me
> out with it, I'd gladly accept some assistance.
>
> --
> Soren Harward