KDE4 and video acquisition

Matthias Kretz kretz at kde.org
Thu Jan 26 22:30:22 GMT 2006


Hi Laurent,

thanks for all the insight from your side. I guess I should start with an  
explanation about the scope of Phonon.

Phonon is supposed to provide a simple and "task oriented" API to the KDE 
developer. With "task oriented" I mean that the API tries to reflect what a 
KDE program will need and not how the hardware works. So on the one hand the 
knowledge about what the hardware can do and what interfaces there are is 
useful, but on the other hand, when designing the Phonon API, we should 
concentrate on "use cases" of the API and how a developer can easily achieve 
his goal.

For the matter at hand I could think of a "restricted" API (restricted in the 
sense that the developer cannot use or set every feature the hardware 
provides) that allows a KDE application to select a video capture source and 
an audio capture source (if both are selected that implicitly means they need 
to be synchronized) and to set a few parameters like brightness, contrast, 
hue and saturation (which are useful to expose to the user of the 
application). Everything else can be configured in a KControl module. Let me 
draw a scenario:
- I buy a new webcam to finally be able to video-chat with Kopete
- I plug it into the USB port
- a dialog pops up: new USB device found: camera - do you want to configure 
it?
- I want to...
- I get a dialog where I see the video stream from the camera and can modify 
all parameters the hardware exposes (this is done completely on the side of 
the backend implementation and does not use the Phonon API).
- After some experimenting I'm satisfied and save the settings.
- I start Kopete and select the new webcam as the video source
- In the evening I realize the camera needs different settings because the 
lighting in the room has changed
- Kopete provides a button to get to the camera settings (alternatively I can 
reach it through KControl)
- there I create a new setting (profile) and save it with the name "evening 
lights"
- in Kopete I now have the possibility to select which camera profile I want 
to use

So Phonon exposes a rather simple API for video capture. The complete control 
is only accessible in the backend implementation and therefore through the 
KControl module of the backend. Phonon and Phonon-using applications don't 
have to know or understand any of these things.
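
To make that more concrete, here is a rough sketch of what such a restricted 
capture class could look like. None of these names or signatures are the 
actual Phonon API; AvCapture matches what I use further down, but the setters 
are only meant to illustrate the level of abstraction I have in mind:

#include <QObject>

class VideoSource;
class AudioSource;

// purely illustrative sketch, not the real Phonon API
class AvCapture : public QObject
{
    public:
        // select one of the capture sources the backend knows about
        void setVideoSource( const VideoSource &source );
        void setAudioSource( const AudioSource &source );

        // the few parameters worth exposing to the end user; everything
        // else is configured in the backend's KControl module
        void setBrightness( int brightness );
        void setContrast( int contrast );
        void setHue( int hue );
        void setSaturation( int saturation );
};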

If you agree with me on that, the question is still open what the Phonon API 
needs to expose. My rules here are:
- keep it simple (a developer should be able to use the API by looking at the 
class declaration - no docs, no explanations except the class name and method 
names)
- make common tasks simple
- make not so common tasks possible
- leave as much of the hard work to the backends as possible :-P

On Wednesday 25 January 2006 13:50, Laurent Pinchart wrote:
> I can only comment on V4L2 and IP-based devices, I'll rely on your
> experience for Firewire video capture. We also need someone with DVB
> experience.

Firewire capture seems a lot less involved than V4L2. It's only about 
receiving a DV stream, the main problem being that the amount of data has to 
be processed fast enough. Back when I worked with it, standard IDE systems 
were incapable of saving the data fast enough, so the DV stream would 
eventually drop out. But of course this is out of our control - we would only 
have to do good error reporting so that the user understands what he has to 
do to get it working.

> > Here are a few thoughts from the top of my head:
> > 1. it can be a capture source that delivers both audio and video (like a
> > DV stream)
>
> I can think of different kinds of capture devices:
>
> 1. video only : this is probably the simplest case. The device delivers
> video in a given format, either frame based (uncompressed, MJPEG, DV) or
> stream based (MPEG).
> 2. audio/video in a single stream : the device delivers a data stream which
> contains interleaved audio and video data. This applies to MPEG and DV
> devices.
> 3. audio/video in different streams : the device delivers two streams of
> data, one for audio, the other for video. The audio clock can be locked
> (synchronized) to the video clock or not. A common case of such devices are
> USB devices with separate audio and video interfaces.
>
> 1. and 3. are easy to handle with either a single VideoCapture instance
> (1.) or separate VideoCapture and AudioCapture instances (3.). 2. is a bit
> more difficult, and we will need input from experienced users.
>
> > 2. if it's a video only capture you often want to have the audio signal
> > from the soundcard to be in sync with the video signal. So either there
> > needs to be an interface for defining the sync between a SoundcardCapture
> > and a VideoCapture or there should be only one interface "Capture" where
> > the audio and video source can be independently set, and if both are set
> > they're implicitly kept in sync.
>
> Audio/video sync will probably be one of the major problems. Even if the
> device can capture both audio and video, the two streams can be unlocked,
> and we must provide a way to resynchronize them.

Phonon does not really care. ;-) Well, what Phonon does is tell the backend 
what the developer wants. The backend then has to take care of sync. For 
example, I create an AvCapture object and set the AudioSource to the 
soundcard and the VideoSource to a webcam. The VideoPath and the AudioPath 
are both connected to an AvWriter object:
.                  -------------
.                 ,| AudioPath |.
. -------------  / ------------- \  ------------
. | AvCapture |<                   >| AvWriter |
. -------------  \ ------------- /  ------------
.                 `| VideoPath |'
.                  -------------
The backend now knows it has to sync the soundcard and webcam signals. This is 
implicitly defined by the Phonon objects.
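
In (hypothetical) application code the diagram above could look roughly like 
this; the exact method names (addAudioPath, addOutput, ...) are assumptions 
for illustration, not the real API:

// sketch only: wire one AvCapture to an AvWriter via both paths
AvCapture capture;
capture.setAudioSource( soundcardSource ); // some AudioSource from the backend
capture.setVideoSource( webcamSource );    // some VideoSource from the backend

AudioPath audioPath;
VideoPath videoPath;
capture.addAudioPath( &audioPath );
capture.addVideoPath( &videoPath );

AvWriter writer;
audioPath.addOutput( &writer );
videoPath.addOutput( &writer );
// both paths originate from the same AvCapture and end in the same AvWriter,
// so the backend knows it has to keep audio and video in sync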

On the other hand, if the output does not multiplex the audio and video 
signals (internally), it wouldn't be implicitly clear that they should be 
synced. I think it makes sense to require the backend to always sync the 
audio and video data coming from the same AbstractMediaProducer (AvCapture, 
MediaObject and ByteStream are subclasses of AbstractMediaProducer).

Then the following would result in the video and audio data streams being in 
sync (meaning the RTP timestamps are correctly synchronized):
.                  -------------    ----------------
.                 ,| AudioPath |----| AudioOverRTP |
. -------------  / -------------    ----------------
. | AvCapture |<
. -------------  \ -------------    ----------------
.                 `| VideoPath |----| VideoOverRTP |
.                  -------------    ----------------
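
In the same hypothetical code as above, the sender side would be wired like 
this; AudioOverRTP and VideoOverRTP are just the made-up output objects from 
the diagram:

// sketch only: the outputs do not multiplex, but both paths still originate
// from the same AbstractMediaProducer (the AvCapture), so the backend is
// required to keep them in sync
AudioOverRTP audioSender;
VideoOverRTP videoSender;
audioPath.addOutput( &audioSender );
videoPath.addOutput( &videoSender );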

Though, AFAIU, there remains a problem on the receiver side:
. --------------------    -------------    ---------------
. | RTPAudioReceiver |----| AudioPath |----| AudioOutput |
. --------------------    -------------    ---------------
.
. --------------------    -------------    ---------------
. | RTPVideoReceiver |----| VideoPath |----| VideoWidget |
. --------------------    -------------    ---------------
If anybody knows more about the RTP streaming stuff: can you tell me whether 
this setup already keeps the two streams in sync on the receiver side?


> > 3. There's a class CaptureSource in SVN that describes the available
> > sources to the user. (It's a really simple class, like a struct
> > containing an id, name and description, but with users of the API only
> > being able to read, of course). This class could just as well be used to
> > describe the available video capture sources to the user, or is there any
> > information missing?
>
> Where in SVN ? Is there an SVN repository for Phonon ? In the main KDE
> repository ?

branches/work/kdelibs-phonon/phonon

> Maybe we could add some kind of capability. What is the CaptureSource used
> for exactly ?

The CaptureSource is now replaced by AudioSource and VideoSource. Both provide 
an id, a name, a description and the associated AudioSource/VideoSource.
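
Roughly (a simplified sketch; the real classes are in 
branches/work/kdelibs-phonon/phonon and the accessor names here are only 
indicative):

#include <QString>

class AudioSource;

// read-only descriptor as seen by users of the API
class VideoSource
{
    public:
        int index() const;            // backend-defined id
        QString name() const;         // shown to the user
        QString description() const;
        // the associated audio source, e.g. the built-in mic of a webcam
        AudioSource associatedAudioSource() const;
};
// AudioSource looks the same, with an associatedVideoSource() accessor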

> There is a gain for video signals as well.

Is it a common setting for the end user? In what ways does it differ from the 
brightness setting?

> I think we need to deal with properties exported by the hardware first, and
> then implement video processing in the VideoPath (whatever that is, can you
> tell me where I can find more information ?).

Yes, hardware processing should be defined in the source object, while 
software processing should be defined with the help of the VideoPath object. 
You can find more information at
- http://conference2005.kde.org/slides/multimedia-api--matthias-kretz.pdf
- http://wiki.kde.org/tiki-index.php?page=Multimedia+API+Talk
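
A (made-up) brightness example to illustrate that split, reusing the 
hypothetical capture and videoPath objects from the sketches above: a 
hardware control is set on the capture object and programmed into the device 
by the backend, while a software effect is inserted into the VideoPath and 
applied to the frames after capture. VideoEffect and insertEffect are only 
assumptions for illustration:

// hardware processing: the backend maps this to a device control
// (e.g. a V4L2 brightness control)
capture.setBrightness( 60 );

// software processing: an effect inserted into the VideoPath,
// applied to every frame after it leaves the device
VideoEffect brightnessEffect( "brightness" );
videoPath.insertEffect( &brightnessEffect );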

Enough for now, I have to go to bed.

-- 
C'ya
        Matthias
________________________________________________________
Matthias Kretz (Germany)                            <><
http://Vir.homelinux.org/
MatthiasKretz at gmx.net, kretz at kde.org,
Matthias.Kretz at urz.uni-heidelberg.de