[Nepomuk] The quest for media metadata in libstreamanalyzer

Evgeny Egorochkin phreedom.stdin at gmail.com
Wed Feb 3 05:45:58 CET 2010


I've just finished a libxine analyzer and run some tests.

Can't say I'm really happy.

The good news:
 * We've got tons of supported formats
 * It isn't going to require lots of maintenance

The bad news:
 * The metadata produced is really spotty.

The only reliably extracted properties are width, height, duration (except for 
mpg, where this information isn't always present) and codec names.

Especially nasty is that trivial-to-extract data like channel count or audio 
sample rate is absent most of the time :(

 * The speed is far from great. For users with movie collections it isn't such 
a problem, because the speed is offset by the small number of files that can 
fit on the HDD. If users collect small clips (e.g. music), it may be a bigger 
problem.

 * Only files and some remote URLs are supported. This means that one of the 
core advantages of libstreamanalyzer, its ability to work on streams, is lost...

What am I going to do about it?

Type ffmpeg -formats :)

The ffmpeg libs have a much more complex API, but they also support a huge 
number of formats and can be made to work with streams(!).
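
To give an idea of what "made to work with streams" could look like: in 
current ffmpeg, avio_alloc_context() accepts a read callback of the form 
int (*read_packet)(void *opaque, uint8_t *buf, int buf_size). Below is a 
minimal sketch of such a callback, not actual analyzer code. MemoryStream, 
readFromStream and kEndOfStream are hypothetical names made up for the 
example; a real callback would wrap the analyzer's input stream and return 
AVERROR_EOF at the end of the data.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical stand-in for whatever stream object the analyzer is handed.
struct MemoryStream {
    std::vector<std::uint8_t> data;
    std::size_t pos = 0;
};

// Hypothetical end-of-stream code; real ffmpeg code would return AVERROR_EOF.
constexpr int kEndOfStream = -1;

// Same shape as the read callback taken by avio_alloc_context():
// copies up to bufSize bytes into buf and returns the number of bytes copied,
// or kEndOfStream once the stream is exhausted.
int readFromStream(void* opaque, std::uint8_t* buf, int bufSize) {
    auto* stream = static_cast<MemoryStream*>(opaque);
    if (bufSize <= 0) {
        return 0;
    }
    const std::size_t remaining = stream->data.size() - stream->pos;
    if (remaining == 0) {
        return kEndOfStream;
    }
    const std::size_t n =
        std::min(remaining, static_cast<std::size_t>(bufSize));
    std::memcpy(buf, stream->data.data() + stream->pos, n);
    stream->pos += n;
    return static_cast<int>(n);
}
```

The point is that the demuxer only ever sees the callback, so the data can 
come from a file, an archive member or a network source alike.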

So even if ffmpeg-produced metadata turns out to suck as much (which may well 
happen), at least it's going to suck the same regardless of data location, and 
we have one less wrapper.

The last alternative is GStreamer, but it looks like it depends on ffmpeg too 
much to make any difference.

OK, really the last alternative is to write native analyzers :)

-- 
Evgeny

