[Nepomuk] Bangarang player media library

Sebastian Trüg trueg at kde.org
Mon Feb 1 11:52:43 CET 2010

Hi guys,

On Sunday 31 January 2010 18:12:32 Jamboarder wrote:
> > From: Evgeny Egorochkin
> > I have uncovered another issue with Bangarang. It seems to have its own
> > media library scanner and metadata extractor which also marks song
> > metadata as user-created, and not automatically generated.
> Bangarang doesn't really have an explicit media scanner function.  It just
> updates nepomuk whenever a file with extractable metadata is opened or
> when a user updates metadata for a media resource.  It uses the methods
> (setProperty, addType and setTypes) of Nepomuk::Resource to set/update
> data. Am I using the api incorrectly?

No, you are using it correctly. It is just that we try to keep data that can be extracted from files 
separate from the rest, i.e. have it in one graph which can easily be removed 
and recreated. That is what the Strigi indexer service does.

> > A much better solution would be to tell nepomuk which files to index and
> > add any of your custom data extractors into libstreamanalyzer or at least
> > refactor them as libstreamanalyzer plugins.
> Sounds good. That makes sense for metadata that is technically extractable
> from the file itself.  Since there's quite a lot of metadata that may or
> may not be extractable there will still be a need for Bangarang to
> maintain the in-context update of the nepomuk resource.  At the moment,
> when updating nepomuk data, Bangarang doesn't distinguish between
> technically extractable file metadata and non-extractable metadata. Video
> metadata is the prime example.  The entirety of the video metadata in
> Bangarang right now is user created. I hope to add some support of video
> file metadata formats in the 2.0 version.  I like the idea of refactoring
> the file metadata extractors as libstreamanalyzer plugins, but I'll still
> need to maintain functionality for in-context user-created metadata that
> is not technically extractable.

The problem here is that Bangarang uses Taglib which cannot be used in Strigi. 
Strigi is stream-based while Taglib is not. This is a well-known and old 
problem which leads to so much rewriting of code for Strigi.
So converting the Bangarang analyzer to lsa is not an option, at least not 
until the latter becomes a non-stream-based API.

> > You would get the same or better functionality, contribute to nepomuk and
> > avoid lots of pitfalls like your custom extractors and file indexing
> > service adding what essentially is 2 copies of the same data.
> That seems a little odd.  Does that mean a duplicate triple for the same
> resource is created when setProperty is called on an existing resource
> property? I can kinda see the reasoning for setting it to user-created but
> it seems like it shouldn't duplicate it.  Is there any way for an app to
> just update the automatically generated triples since, whether strigi or
> Bangarang or any other app does it, the data is still automatically
> generated from the file metadata?

Resource::setProperty overwrites any existing triple with that 
subject/property pair.
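
To make the overwrite behaviour concrete, here is a minimal sketch using a toy in-memory model. It is not the Nepomuk API: the class ToyResource, its storage, and the appending addProperty shown for contrast are assumptions made up for illustration. Only the "set replaces whatever was there" behaviour is taken from the statement above.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Toy stand-in for a resource. NOT Nepomuk::Resource, only a model of the
// overwrite semantics described above.
class ToyResource {
public:
    // Replace every existing value for `property` with the single `value`,
    // so repeated calls never accumulate duplicate triples.
    void setProperty(const std::string& property, const std::string& value) {
        m_values[property] = {value};
    }

    // Hypothetical contrast case: append a value and keep the existing ones.
    void addProperty(const std::string& property, const std::string& value) {
        m_values[property].push_back(value);
    }

    // Number of values stored for `property` (0 if there are none).
    std::size_t count(const std::string& property) const {
        auto it = m_values.find(property);
        return it == m_values.end() ? 0 : it->second.size();
    }

    // All values stored for `property` (empty if there are none).
    std::vector<std::string> values(const std::string& property) const {
        auto it = m_values.find(property);
        return it == m_values.end() ? std::vector<std::string>() : it->second;
    }

private:
    std::map<std::string, std::vector<std::string>> m_values;
};
```

In this model, calling setProperty twice with the same property leaves exactly one value, the most recent one.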

> In the short term I'd like to sort out this duplicate issue as a potential
> bugfix.  Then, if it still makes sense, in the medium term (version 2.0)
> I'd like to try for the libstreamanalyzer solution.

There is no duplication of data at the moment. The only thing that needs 
fixing, if I saw correctly, is that Bangarang uses plain strings for 
artists instead of nco:Contact resources.
This is another problem (not of Bangarang but of lsa): it would be good to 
reuse these contacts instead of recreating them every time. The latter results 
in a lot of Contact resources, which confuses KDEPIM.
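
The reuse idea could be sketched roughly like this. It is a toy model, not Nepomuk or lsa code: the ContactRegistry class and the "toy:/contact/" URI scheme are invented for illustration, and matching contacts purely by full name is a simplifying assumption that a real implementation would probably need to refine.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Toy registry that hands out one contact resource URI per artist name.
// Hypothetical helper, not part of Nepomuk or libstreamanalyzer.
class ContactRegistry {
public:
    // Return the URI of the existing contact for `fullName`, or create a new
    // one if this name has not been seen before.
    std::string findOrCreate(const std::string& fullName) {
        auto it = m_byName.find(fullName);
        if (it != m_byName.end())
            return it->second;  // reuse instead of creating a duplicate

        // Made-up URI scheme, numbered in creation order.
        std::string uri = "toy:/contact/" + std::to_string(m_byName.size() + 1);
        m_byName.emplace(fullName, uri);
        return uri;
    }

    // Number of distinct contacts created so far.
    std::size_t size() const { return m_byName.size(); }

private:
    std::map<std::string, std::string> m_byName;
};
```

With such a lookup in front of contact creation, indexing many tracks by the same artist yields a single contact resource rather than one per track.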

> > I can help you with strigi part, and I hope Sebastian can find a way to
> > let Bangarang extend the media collection via some nepomuk API call.

I have thought about this many times but have not come up with a good 
solution yet. The Strigi service creates one graph which is marked as the 
index graph for that particular file[1]. This graph contains only data that 
can be recreated by re-indexing the file. So in theory Bangarang would need 
to add its own graph containing only data extracted from the file. But then 
we would need to keep that data in sync. If we were to put it in the same 
graph that is used by Strigi, then Strigi would delete that data again on 
update. That is also not a perfect solution, but maybe the better one since 
media files almost never change.
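
The trade-off can be illustrated with a toy quad store. This is a sketch under stated assumptions, not how the Strigi service is implemented: Quad, ToyStore, reindex and the "toy:/graph/..." names are all invented here. It only models the behaviour described above, namely that a re-index throws away the file's index graph and refills it, so anything an application wrote into that graph is lost while data in other graphs survives.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// One statement plus the graph it lives in (toy model).
struct Quad {
    std::string graph;
    std::string subject;
    std::string predicate;
    std::string object;
};

// Minimal in-memory store, invented for illustration.
class ToyStore {
public:
    void add(const Quad& q) { m_quads.push_back(q); }

    // Drop every statement that belongs to `graph`.
    void removeGraph(const std::string& graph) {
        m_quads.erase(std::remove_if(m_quads.begin(), m_quads.end(),
                                     [&graph](const Quad& q) {
                                         return q.graph == graph;
                                     }),
                      m_quads.end());
    }

    // Number of statements currently stored in `graph`.
    std::size_t countInGraph(const std::string& graph) const {
        return static_cast<std::size_t>(
            std::count_if(m_quads.begin(), m_quads.end(),
                          [&graph](const Quad& q) { return q.graph == graph; }));
    }

private:
    std::vector<Quad> m_quads;
};

// Model of a re-index: discard the file's index graph, then store the
// freshly extracted statements. Other graphs are left untouched.
void reindex(ToyStore& store, const std::string& indexGraph,
             const std::vector<Quad>& extracted) {
    store.removeGraph(indexGraph);
    for (const Quad& q : extracted)
        store.add(q);
}
```

In this model a statement that an application added to the index graph disappears on the next reindex call, whereas a statement kept in a separate graph remains but has to be kept in sync by the application itself.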

> Thanks much Evgeny.   A quick first thought is that since a
> libstreamanalyzer plugin would result in a split in the code path for
> writing to nepomuk in Bangarang, to keep the user feedback consistent I'll
> probably need some progress feedback. Does that already exist?  Also,
> probably a silly question but can the plugin be delivered at app install
> time? Or does it have to be delivered at strigi install time?

Like most plugin systems, Strigi also loads all available plugins at runtime 
which means that you can install any plugin you want at any time. KDE actually 
contains a few of them.


[1] http://sourceforge.net/apps/trac/oscaf/ticket/27
