Google Summer of Code: Visualization-support in Phonon

Ricard Marxer Piñón email at ricardmarxer.com
Mon Jun 1 19:19:15 BST 2009


Hi,

On Mon, Jun 1, 2009 at 7:17 PM, Martin Sandsmark <sandsmark at samfundet.no> wrote:
> Hi!
>
> This summer I'll work on getting support for visualizing audio into Phonon,
> and also into Amarok (and DragonPlayer, if there's time left over).
>
> As some of you are undoubtedly aware, there is some API for this already in
> the Experimental namespace of Phonon, but there is no support in any of the
> backends (there is a skeleton implementation in the Xine engine, but it
> doesn't actually do anything, seemingly copied from the null backend).
>
> I have spent most of my time so far peeking and poking at Amarok 1 and VLC,
> to see how they did/do visualizations, and talking a bit with Max Howell (who
> worked a lot on visualizations in Amarok 1), and some people (sorry, I've
> forgotten the nicks) in #phonon about what thoughts they had on the subject.
>
> I will spend the next couple of days finalizing this work, and fixing up the
> experimental API.
>
> On Friday I will spend most of the day on a train, with power but without
> internet, so I plan to bring my laptop and try to implement support in the
> Xine engine (which seems to be the most used/usable backend). I already have
> attempted a half-assed implementation of the current experimental API in the
> Xine backend, so I think I have a fairly good idea about how to implement my
> changes.
>
> I will probably push my work into a public git repo (github or gitorious),
> until it is finished and working.
>
>
> Now for the AudioDataOutput-NG:
>
> I plan on “emulating” a QIODevice, with a readyRead()-ish-signal, which is
> emitted when we have accumulated 512 samples of audio, which we store in a
> ring-buffer. 512 samples means that we get about 80 signals a second (which
> should be quite enough for a smooth visualization), and a 256 wide FFT (which
> should also be quite adequate for visualizations). Thanks to Max for input
> on this.
>

I like this idea.  If the ring buffer is going to live in the
AudioDataOutput, it would be nice to be able to set the frameSize and
the hopSize.  That would make it very useful for audio analysis too,
because applications wouldn't need to create another ring buffer of
their own, and for some visualizations it is also really nice to
control these values.  If you are writing a ring buffer anyway, it
shouldn't be too hard to support these two parameters as well.
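
To make the suggestion concrete, here is a very rough sketch of the
kind of buffering I mean (FrameCutter, setFrameCallback and the
callback are only illustrative names, not the real Phonon experimental
API; the callback stands in for the readyRead()-ish signal).  With
frameSize 512 and hopSize 512 at 44100 Hz you still get roughly 86
frames a second, so the behaviour from the original proposal is just
the default case:

#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Rough sketch only: names are illustrative, not the Phonon API.
class FrameCutter {
public:
    // Assumes 1 <= hopSize <= frameSize.
    FrameCutter(std::size_t frameSize, std::size_t hopSize)
        : m_frameSize(frameSize), m_hopSize(hopSize) {}

    // The callback plays the role of the readyRead()-ish signal.
    void setFrameCallback(std::function<void(const std::vector<float> &)> cb) {
        m_callback = std::move(cb);
    }

    // Feed raw samples from the backend.  One frame of frameSize
    // samples is delivered every hopSize new samples.
    void write(const float *samples, std::size_t count) {
        m_buffer.insert(m_buffer.end(), samples, samples + count);
        while (m_buffer.size() >= m_frameSize) {
            if (m_callback) {
                std::vector<float> frame(m_buffer.begin(),
                                         m_buffer.begin() + m_frameSize);
                m_callback(frame);
            }
            // Drop hopSize samples; the remaining overlap is reused.
            m_buffer.erase(m_buffer.begin(), m_buffer.begin() + m_hopSize);
        }
    }

private:
    std::size_t m_frameSize;
    std::size_t m_hopSize;
    std::function<void(const std::vector<float> &)> m_callback;
    std::vector<float> m_buffer;
};

A hopSize smaller than the frameSize gives overlapping frames, which is
what most analysis code wants.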

> This should allow this API to not be used only for visualizations, but also
> if you for example just want to dump audio data to a file.
>
> Another idea is to have some static functions to do FFT (or FHT, I have the
> impression that Hartley transform is easier to implement), so you don't have
> to re-implement this in every app (or depend on yet another third-party
> library).
>
> So, anyone have any thoughts or comments?
>

You could use an external library for the FFT or FHT; there are
several to choose from:

fftw: http://www.fftw.org/ (GPL)
FFT Ooura: http://www.kurims.kyoto-u.ac.jp/~ooura/fft.html (a very
permissive license)
FFTReal: http://ldesoras.free.fr/prod.html (LGPL)

I really like FFTW and have used it in some of my projects, but its
GPL license could be a problem for Phonon; I'm not sure.
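
For what it's worth, computing the spectrum of one 512-sample frame
with FFTW would look roughly like this (just a sketch; the fftwf_*
calls are FFTW's real single-precision API, but the frame here is only
a zero-filled placeholder for the samples the new AudioDataOutput
would deliver).  Build against the single-precision library, e.g. with
-lfftw3f:

#include <cmath>
#include <cstdio>
#include <fftw3.h>

int main() {
    const int frameSize = 512;  // one frame as proposed for AudioDataOutput

    // FFTW prefers its own aligned buffers; fftwf_* is the float API.
    float *frame = (float *) fftwf_malloc(sizeof(float) * frameSize);
    fftwf_complex *spectrum = (fftwf_complex *)
        fftwf_malloc(sizeof(fftwf_complex) * (frameSize / 2 + 1));

    // Placeholder input; in practice this would be the samples
    // delivered by the readyRead()-ish signal.
    for (int i = 0; i < frameSize; ++i)
        frame[i] = 0.0f;

    // Real-to-complex transform: 512 real samples -> 257 complex bins,
    // of which the first 256 are what a visualization typically draws.
    fftwf_plan plan = fftwf_plan_dft_r2c_1d(frameSize, frame, spectrum,
                                            FFTW_ESTIMATE);
    fftwf_execute(plan);

    // Magnitude per bin, e.g. for a spectrum analyzer.
    for (int i = 0; i < frameSize / 2; ++i) {
        float magnitude = std::sqrt(spectrum[i][0] * spectrum[i][0] +
                                    spectrum[i][1] * spectrum[i][1]);
        std::printf("%d %f\n", i, magnitude);
    }

    fftwf_destroy_plan(plan);
    fftwf_free(spectrum);
    fftwf_free(frame);
    return 0;
}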

> --
> Martin Sandsmark
> GSoC Student
> :wq
>
> _______________________________________________
> kde-multimedia mailing list
> kde-multimedia at kde.org
> https://mail.kde.org/mailman/listinfo/kde-multimedia
>

If you need any help with the signal processing part, I might be able
to give a hand.  I have some code on GitHub where I use Eigen to do
audio analysis processing:
http://github.com/rikrd/loudia/tree/master
It might be of interest to you.

-- 
ricard
http://www.ricardmarxer.com
http://www.caligraft.com


