GSOC Analyzer Support in Amarok and possibly Phonon

Sat Feb 28 08:44:42 UTC 2009

Hi,

I'm Ricard Marxer and I'm planning to apply for the GSOC Analyzer Support
idea presented by Ian Monroe:

http://techbase.kde.org/Projects/Summer_of_Code/2009/Ideas#Project:_Analyzer_Support

I have been exchanging some thoughts on the Amarok mailing list
<http://mail.kde.org/pipermail/amarok/2009-February/007726.html> and
I also wanted to discuss it here, since this project is closely
related to Phonon.

The main idea is to provide audio visualizations (using Phonon to access audio
data) for Amarok 2, similar to what was available in Amarok 1.x.  Most
visualizations there were spectrum based, so we would need some
processing of each audio frame to be able to do this.

I know that there already exists an experimental class in Phonon to access
the audio data (AudioDataOutput) which has been made for that purpose.

The following options arise:
A - Use AudioDataOutput and then perform some basic analysis (windowing,
FFT, bands and weighting) outside of Phonon.

B - Create another class in Phonon (something like AudioSpectrumOutput) that
will do the processing and just emit dataReady() signals with a QVector
containing the magnitudes per spectral band.

C - Do A and if it proves interesting to Phonon go for B.  We could keep the
processing we have implemented in A inside Phonon as a fallback, and then
little by little switch to using the backends when/if they are capable of
performing the processing needed.

Option A means doing the processing outside of the Phonon backends
(personally I would choose FFTW and Eigen to do this processing).
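
To make the processing chain in option A more concrete (windowing, FFT,
bands and weighting), here is a minimal sketch in plain C++.  It is only an
illustration under assumptions: all function names are hypothetical, it uses
a naive DFT instead of FFTW so that it stays self-contained, the bands are
simple equal-width averages, and the samples would in practice come from
AudioDataOutput rather than a std::vector built by hand.

```cpp
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

// Apply a Hann window to one frame of samples (hypothetical helper).
std::vector<double> hannWindow(const std::vector<double>& frame) {
    const std::size_t n = frame.size();
    const double pi = std::acos(-1.0);
    std::vector<double> out(n);
    for (std::size_t i = 0; i < n; ++i) {
        const double w = (n > 1)
            ? 0.5 * (1.0 - std::cos(2.0 * pi * static_cast<double>(i)
                                    / static_cast<double>(n - 1)))
            : 1.0;
        out[i] = frame[i] * w;
    }
    return out;
}

// Naive O(n^2) DFT returning the magnitudes of bins 0..n/2.
// Assumption: a real implementation would call an FFT library here.
std::vector<double> magnitudeSpectrum(const std::vector<double>& frame) {
    const std::size_t n = frame.size();
    const double pi = std::acos(-1.0);
    std::vector<double> mags(n / 2 + 1, 0.0);
    for (std::size_t k = 0; k < mags.size(); ++k) {
        std::complex<double> sum(0.0, 0.0);
        for (std::size_t t = 0; t < n; ++t) {
            const double angle = -2.0 * pi * static_cast<double>(k)
                                 * static_cast<double>(t) / static_cast<double>(n);
            sum += frame[t] * std::complex<double>(std::cos(angle), std::sin(angle));
        }
        mags[k] = std::abs(sum);
    }
    return mags;
}

// Average the bin magnitudes into bandCount equal-width bands and scale each
// band by its weight (bands without a matching weight use 1.0).
std::vector<double> bandMagnitudes(const std::vector<double>& mags,
                                   std::size_t bandCount,
                                   const std::vector<double>& weights) {
    std::vector<double> bands(bandCount, 0.0);
    for (std::size_t b = 0; b < bandCount; ++b) {
        const std::size_t begin = b * mags.size() / bandCount;
        const std::size_t end = (b + 1) * mags.size() / bandCount;
        if (end == begin) continue;  // more bands than bins: leave at 0
        double sum = 0.0;
        for (std::size_t k = begin; k < end; ++k) sum += mags[k];
        const double weight = (b < weights.size()) ? weights[b] : 1.0;
        bands[b] = weight * sum / static_cast<double>(end - begin);
    }
    return bands;
}

// Full chain for one frame: window, transform, group into weighted bands.
std::vector<double> analyzeFrame(const std::vector<double>& frame,
                                 std::size_t bandCount,
                                 const std::vector<double>& weights) {
    return bandMagnitudes(magnitudeSpectrum(hannWindow(frame)), bandCount, weights);
}
```

A visualization would call analyzeFrame() once per audio frame and draw one
bar per returned band.  Real analyzers often use logarithmically spaced
bands, which this sketch does not attempt.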

Option B would mean that we could use processing algorithms that are already
implemented in the backends and create an interface for them in Phonon.
Amarok 1.x performed the audio processing for visualizations in the
gstreamer and xine backends (see Ian's answer on the mailing list).  However,
we would still have to see whether other backends have this capability.  Of
course, as in the rest of Phonon, we could always just expose it as a
capability of the backend.

Option C might be the most appropriate, since it is a flexible plan that
lets us decide as we go.
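
The combination of options B and C (use the backend when it can do the
analysis, otherwise fall back to our own processing) could look roughly like
the sketch below.  Everything here is an assumption for illustration: the
Backend struct, the Analyzer type and the handler-based API are hypothetical
stand-ins written in plain C++, with std::function replacing Qt signals and
std::vector replacing QVector.  It is not existing Phonon API.

```cpp
#include <functional>
#include <utility>
#include <vector>

using Bands = std::vector<double>;
// Hypothetical: anything that turns one frame of samples into band magnitudes.
using Analyzer = std::function<Bands(const std::vector<double>&)>;

// Hypothetical stand-in for a Phonon backend.  An empty spectrumAnalyzer
// means the backend does not offer spectrum analysis as a capability.
struct Backend {
    Analyzer spectrumAnalyzer;
};

// Hypothetical AudioSpectrumOutput: prefers the backend's analyzer and
// otherwise uses the software fallback passed in by the caller.
class AudioSpectrumOutput {
public:
    AudioSpectrumOutput(const Backend& backend, Analyzer fallback) {
        if (backend.spectrumAnalyzer) {
            m_analyzer = backend.spectrumAnalyzer;
            m_usesBackend = true;
        } else {
            m_analyzer = std::move(fallback);
        }
    }

    // Stand-in for connecting a slot to a dataReady() signal.
    void setDataReadyHandler(std::function<void(const Bands&)> handler) {
        m_dataReady = std::move(handler);
    }

    // Analyze one frame and notify the handler, if both are available.
    void processFrame(const std::vector<double>& frame) {
        if (m_analyzer && m_dataReady) {
            m_dataReady(m_analyzer(frame));
        }
    }

    bool usesBackend() const { return m_usesBackend; }

private:
    Analyzer m_analyzer;
    std::function<void(const Bands&)> m_dataReady;
    bool m_usesBackend = false;
};
```

With this shape, the application code stays the same whether the spectrum
comes from the backend or from the fallback, which is what would allow
switching over to the backends little by little.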

Another main objective of the project is to make the AudioDataOutput more
mature by making use of it in a Plasma applet that will wrap
ProjectM <http://projectm.sourceforge.net/>.
This will serve as an applet for Amarok's Context or even on the Plasma
desktop for general visualization of audio passed through Phonon.

Any thoughts about this GSOC idea or the plan?

As for the long term, I think this could lead to having more AudioXXXOutput
classes such as AudioOnsetOutput or AudioPitchOutput, so maybe a special
Analysis namespace/submodule in Phonon would be interesting for this.

Note: I am CCing the Amarok mailing list to keep them up to date on the
discussion.
-- 
ricard
http://www.ricardmarxer.com
http://www.caligraft.com