KDE Sound and Multimedia Plan

Colin Guthrie gmane at colin.guthr.ie
Wed Dec 1 17:40:38 GMT 2010

'Twas brillig, and Arnold Krille at 01/12/10 09:01 did gyre and gimble:
> On Wednesday 01 December 2010 01:11:18 Alex Fiestas wrote:
>> This is my list of needs:
>> 2-Low latency easy concurrency
>>      The time between the "play" function in our application and when
>> the sound actually plays is important for a lot of applications (KNotify
>> for instance), we need to ensure the lowest latency possible. Also we
>> need to provide an easy way to play more than one sound in parallel
>> (KNotify as app example again).
> No, neither knotify nor multimedia players nor games or webcam-apps need low-
> latency.
> Of course they need a rather short time between api-call and actual playing.
> But low-latency is the stuff where the time from
> soundcard_in->processing_in_cpu->soundcard_out takes less than 10ms. If you really need
> this, use jack (www.jackaudio.org) natively and drop any layers between.
> What you need is a framework that plays files/media, regardless of whether it's
> wav, mp3, ogg, flac, mp4, avi, wmv or flv. You also want easy access to webcam
> and soundcard io to allow for phonon to be used in more cases than media
> players.
> But all that doesn't need latencies of 10ms or less. It will all be perfectly 
> well with latencies of 100ms. And you gain a lot in simplicity when you don't 
> have to deal with low-latency...

[Apologies, I've ended up writing way beyond what I want and what Alex
requested.... but I don't want to throw it away, so I'm going to send it
anyway.... I've written two requirements right at the end, so skip to
that if you don't want to read the rest, but I think it's useful :D]

Yeah I have to second the "low latency" comments in this reply.

To be honest, I really get P155ed off with a lot of application
developers who make comments about *needing* low latency (and this is
very much not directed at your good self Alex, just a general grumble!).
The main issue is that latency is often something that you literally
cannot control to any sensible degree and you have to cope as best you
can with what you are given, but the one thing you cannot do is to
mandate a system that has low latency - that is dangerous and just plain

To give some contextual examples:
 Bluetooth headsets will almost always have a much higher latency than
a built-in h/w sound card. Likewise for something like UPnP or Apple
AirPlay etc. These outputs simply have a latency built into them and we
have to live with it.

Using these types of devices for games is likely going to require the
"live with it" solution as you cannot delay the visuals to compensate as
the user will likely be fragged due to "unresponsive input", but for
media players (and this is where I've had most people (incorrectly) say
they /need/ low latency), then the visuals *can* be delayed to
compensate for the audio delays introduced by strange outputs (and/or
system load, other concurrently playing streams and anything else likely
to change latency).
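
A minimal sketch of that compensation, assuming the audio system can
report its current output latency (the function name and the latency
figures below are made up for illustration, not taken from any real
API): a frame is simply shown later by however long the audio takes to
reach the listener.

```python
def presentation_time(frame_pts: float, playback_start: float,
                      audio_latency: float) -> float:
    """Wall-clock time (seconds) at which to show a video frame.

    Audio handed to the sound system at playback_start + frame_pts is only
    heard audio_latency seconds later, so the matching frame is held back
    by the same amount.
    """
    return playback_start + frame_pts + audio_latency


# Same frame, two outputs with assumed (illustrative) latencies.
local_card = presentation_time(1.0, playback_start=100.0, audio_latency=0.020)
bt_headset = presentation_time(1.0, playback_start=100.0, audio_latency=0.200)
extra_video_delay = bt_headset - local_card
```

A media player can re-query the latency while playing and adjust, which
is the part a game cannot do without making input feel sluggish.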

So with that grumble out of the way, I'd agree with Arnold here that low
latency is not a requirement per se (despite what people think) and in
fact I'd go as far as to say we should in many cases advocate high
latencies whenever possible: e.g. allow the application to pump 2s or
more of data into the system and then just sleep, not bothering the CPU
again until 1.9s later. This is a particularly attractive scenario on
mobile devices such as mobile phones, netbooks and tablets etc. (on
mobile phones particularly as often there is no UI to care about either,
so you really can sleep properly without waking up to repaint the song
time UI component!). Note that pumping 2s of pre-decoded audio into the
audio system and having a latency of 2s does not preclude a quick (i.e.
"instantaneous") response in any given use case. That 2s can be thrown
away at any point and new data loaded, allowing users to skip forward
and back at leisure without waiting for a 2s buffer to run itself out.
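
To make the "big buffer, but still responsive" point concrete, here is a
toy model in Python. It is not a real audio API: PlaybackBuffer,
wakeups_per_minute and the 0.1s refill margin are assumptions invented
for this sketch. It shows that a 2s buffer means roughly one wakeup
every 1.9s, and that a seek just discards whatever is queued.

```python
from collections import deque


class PlaybackBuffer:
    """Toy model of a large playback buffer whose contents can be discarded."""

    def __init__(self, capacity_s: float):
        self.capacity_s = capacity_s
        self.chunks = deque()  # (label, duration in seconds) queued for output

    def queued_s(self) -> float:
        return sum(duration for _, duration in self.chunks)

    def write(self, label: str, duration_s: float) -> None:
        if self.queued_s() + duration_s > self.capacity_s + 1e-9:
            raise ValueError("buffer full")
        self.chunks.append((label, duration_s))

    def flush(self) -> None:
        """Throw away everything that has been queued but not yet played."""
        self.chunks.clear()


def wakeups_per_minute(buffer_s: float, refill_margin_s: float) -> float:
    """How often the application must wake up to keep the buffer fed."""
    return 60.0 / (buffer_s - refill_margin_s)


buf = PlaybackBuffer(capacity_s=2.0)
buf.write("song A @ 0:00", 2.0)      # pump 2s of decoded audio, then sleep
buf.flush()                          # user skips forward: drop the queued 2s
buf.write("song A @ 0:30", 2.0)      # new position is queued immediately

big_buffer_wakeups = wakeups_per_minute(2.0, 0.1)        # sleep 1.9s each time
small_buffer_wakeups = wakeups_per_minute(0.020, 0.002)  # 20ms buffer
```

The skip costs one flush and one write, not a 2s wait, while the large
buffer cuts the wakeup rate by about two orders of magnitude.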

I mention this because I've had people (e.g. from Intel) ask me in the
past how to enable support for this kind of operation via the Phonon API.
The Phonon API does not provide any method to control latency like this
even if the underlying backend does (gstreamer on top of PulseAudio is
the only backend configuration that would work here AFAIK).

So to be taken seriously and used in a mobile environment, Phonon would
need to support a method to *request* specific latencies. Whether this
is exposed to $APP or just dealt with internally in a semi-intelligent
way, I don't know. And I say "request" as the application (or perhaps
phonon itself in most cases) should always deal as gracefully as it can
with the latency that it is actually given or occurs at any point in the
future (e.g. a hotswap from a low latency local device to a high latency
BT headset etc.). There is no escaping that requirement!
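
A small sketch of the "request, then cope" idea. Everything here is
hypothetical: the Device class, grant_latency and the millisecond ranges
are invented for illustration and are not part of Phonon or PulseAudio.
The application asks for a latency, and what it gets back is whatever
the current device can actually do.

```python
from dataclasses import dataclass


@dataclass
class Device:
    name: str
    min_latency_ms: int  # the device cannot go lower than this
    max_latency_ms: int  # largest buffer the device will accept


def grant_latency(device: Device, requested_ms: int) -> int:
    """Return the latency actually granted: the request clamped to the device."""
    return max(device.min_latency_ms, min(requested_ms, device.max_latency_ms))


# Illustrative numbers only.
local_card = Device("built-in card", min_latency_ms=5, max_latency_ms=2000)
bt_headset = Device("BT headset", min_latency_ms=150, max_latency_ms=2000)

granted_local = grant_latency(local_card, 20)
# Hotswap to the headset: same request, different answer to live with.
granted_bt = grant_latency(bt_headset, 20)
```

The caller has to use the returned value (and re-check it after a device
change) rather than assume its request was honoured.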

As for Knotify, I find it very hard to feel that current systems support
the current method of operation well. Knotify does indeed want a quick
turn around on the event happening and the sound being output. This is
already handled very well in Gnome systems via libcanberra on top of
PulseAudio. libcanberra implements the FDO Sound Theme specification
(similar to the icon theme) and while some people have previously
reported a desire to reimplement it if this spec was to be used in KDE,
I find that quite hard to justify overall (the main arguments so far have
very much been non-technical and personal which I think is a terrible
method to judge something technical, but meh).

It would be easy enough to write a phonon backend for libcanberra and
adapt knotify to use libcanberra for event sounds rather than phonon
directly. If a system uses PulseAudio, we could avoid the phonon layer
completely and talk directly to PulseAudio which would gain the
following advantages:
 1. Sample Cache. Short sounds would be cached by the PA server for very
quick and responsive playback of event sounds when they happen.
 2. Ability to get positional sounds... if an event happens near the
left side of the screen, it will come mostly out of the left speaker.
Partly a gimmick but it's surprisingly nice in action :)
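
For the positional part, a rough illustration of mapping an on-screen
position to left/right gains. This uses ordinary constant-power panning
as a stand-in; how libcanberra or PulseAudio actually derive the balance
is not described here, and pan_gains is a made-up helper.

```python
import math


def pan_gains(x: float, screen_width: float) -> tuple[float, float]:
    """Map a horizontal screen position to (left_gain, right_gain).

    Constant-power panning: left^2 + right^2 stays at 1, so the sound does
    not get quieter or louder as it moves across the screen.
    """
    pos = min(max(x / screen_width, 0.0), 1.0)  # 0.0 = far left, 1.0 = far right
    angle = pos * math.pi / 2
    return math.cos(angle), math.sin(angle)


# Event near the left edge of an assumed 1920px-wide screen.
left, right = pan_gains(100, 1920)
centre_left, centre_right = pan_gains(960, 1920)
```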

Now, I've just done what I specifically tried not to do (i.e. talk about
implementation rather than requirements!), but the requirements can be
summed up thus:

 1. Request latency requirements (or handle automatically based on
category - e.g. games, communication = low latency, all others = high)
 2. Respond quickly to play certain sounds for notifications etc.
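
Requirement 1 in its "handle automatically" form could look roughly like
this. The category strings and the two latency targets are assumptions
chosen for illustration, not values from Phonon or any backend.

```python
# Hypothetical targets, in milliseconds.
LOW_LATENCY_MS = 20
HIGH_LATENCY_MS = 2000

# Categories where the user notices delay directly.
LOW_LATENCY_CATEGORIES = {"game", "communication"}


def target_latency_ms(category: str) -> int:
    """Pick a latency to request for a stream based on its category."""
    if category in LOW_LATENCY_CATEGORIES:
        return LOW_LATENCY_MS
    return HIGH_LATENCY_MS  # music, video and everything else can buffer deeply


game_target = target_latency_ms("game")
music_target = target_latency_ms("music")
```

The result is still only a request; the stream has to cope with whatever
latency it is actually given.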

Hope I've not ranted too much :D



Colin Guthrie

Day Job:
  Tribalogic Limited [http://www.tribalogic.net/]
Open Source:
  Mageia Contributor [http://www.mageia.org/]
  PulseAudio Hacker [http://www.pulseaudio.org/]
  Trac Hacker [http://trac.edgewall.org/]

More information about the kde-multimedia mailing list