GSOC Analyzer Support

Thu Feb 26 00:44:17 UTC 2009

On Wed, Feb 25, 2009 at 2:19 PM, Ricard Marxer Piñón
<email at ricardmarxer.com> wrote:
> Hi,
>
> I'm Ricard Marxer and I'm planning to apply for the GSOC Analyzer Support
> Idea presented by Ian Moore:
>
> http://techbase.kde.org/Projects/Summer_of_Code/2009/Ideas#Project:_Analyzer_Support
>
> Ideas extending GSOC Analyzer Support proposal
> -----------------------------------------------------------------
>
> Data:
> I have seen that Phonon offers an experimental AudioDataOutput class made
> for the purpose of visualization and analysis.  This is the obvious entry
> point for accessing the audio frames to perform further processing.

Yep! Whoever ends up mentoring, you'll probably get some direction
from Matthias Kretz (the creator of Phonon) or maybe one of the
Trolls, especially on API issues.

> Processing:
> As for the library to be used for the mathematical processing of the audio
> (FFT, MFCC, Onset Detection, Pitch Estimation...) I would like to use Eigen,
> since it makes the code very readable and clear and it is highly optimized.
> I would of course use some external libraries for some specific algos such
> as FFTW.

I'm not sure how much processing is needed and how much is already
done by Xine and/or GStreamer. Eigen is pretty nifty though, yeah. :)
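
To give a rough idea of the processing side: a basic bar analyzer
mostly needs the magnitude of a few frequency bins per frame of
samples. Here is a minimal C++ sketch of that, assuming a frame of
mono float samples has already been pulled from the backend.
analyzerBands is a made-up name, not anything from Phonon or Amarok,
and it uses a naive DFT purely for illustration; real code would more
likely call FFTW or reuse whatever the backend already computes.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of Phonon, Xine, or GStreamer):
// returns the normalized magnitude of the first `bands` DFT bins
// for one frame of mono samples.
std::vector<float> analyzerBands(const std::vector<float>& frame,
                                 std::size_t bands)
{
    const std::size_t n = frame.size();
    std::vector<float> out(bands, 0.0f);
    if (n == 0)
        return out;

    const double twoPi = 6.283185307179586;
    for (std::size_t k = 0; k < bands; ++k) {
        double re = 0.0;
        double im = 0.0;
        // Naive O(n) sum per bin; fine for a handful of bars,
        // an FFT is the better choice for a full spectrum.
        for (std::size_t t = 0; t < n; ++t) {
            const double phase = twoPi * double(k) * double(t) / double(n);
            re += frame[t] * std::cos(phase);
            im -= frame[t] * std::sin(phase);
        }
        out[k] = float(std::sqrt(re * re + im * im) / double(n));
    }
    return out;
}
```

Each returned value could drive the height of one bar; skipping the
call entirely is enough to turn the processing off.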

Here's an example from Amarok 1.4 of a Xine plugin used to feed
analyzer data to Amarok:
http://kollide.net:8060/browse/Amarok/src/engine/xine/xine-scope.c?r=10660
And GstEngine::scope() from Amarok 1.4:
http://kollide.net:8060/browse/Amarok/src/engine/gst/gstengine.cpp?r=2375

In both cases the code itself is probably useless to you now; I'm just
showing you the sort of thing you might end up doing on the backend
side.

Given your experience with audio analysis techniques, I'm wondering
whether we could expand the project to give applications access to
the raw decoded audio. This would be useful for some other things we
want to do in Amarok, and it would let you do more advanced analysis.
(The first priority is still the basic Codeine/Amarok 1.4 analyzer,
though.)

> Visualization:
> Here is where I would like to ask for some help about what would be the best
> choice.  I think to start with, the simplest thing would be to first hook
> the output of the processing directly to a Graphics View.
> If you guys think it is a good idea it might be nice to have the Graphics
> View inside a plasma applet (which could then fit in Amarok's context view).

Actually some of the analyzers in Amarok 1.4 were pretty cool and
could probably be ported. Some used QPainter and others used OpenGL to
do 3D stuff. Graphics View would also make sense.

For the bigger plasmoid-sized visualizations (as opposed to
analyzers) we would probably want to use projectM
(http://projectm.sourceforge.net/). Like the 'giving access to raw
audio' idea above, this might be getting out of scope for the project,
depending on what you want to do.

> Notes:
> Of course the main goal is to have the lowest possible hit in CPU and still
> keep beautiful visualization.  Also it should be possible to completely turn
> off the audio processing and visualization when in power saving mode.
>
> Anyway, this is just to create some discussion about the directions the idea
> could take.  What do you all think?

I'm really glad that someone is taking an interest in this. As you can
see, there's some flexibility in where you put the emphasis of your
proposal.

A little inside baseball (sorry, don't know a more international term,
lol): this project has a decent chance of being selected.
kdelibs-related projects (and Phonon at least used to be in kdelibs)
typically aren't proposed very often, but they get a lot of votes
since they help out many parts of KDE. Making this clear in your
proposal would be good politics.

Ian