[Nepomuk] Web Metadata Extractor GSoC idea

Adrien Bustany madcat at mymadcat.com
Sun Feb 21 22:26:47 CET 2010


On Sun, 21 Feb 2010 19:40:41 +0200, Evgeny Egorochkin
<phreedom.stdin at gmail.com> wrote:
> On Sunday 21 February 2010 18:59:33 Adrien Bustany wrote:
>> On Sun, 21 Feb 2010 17:50:31 +0200, Evgeny Egorochkin
>> <phreedom.stdin at gmail.com> wrote:
>> > On Sunday 21 February 2010 16:15:55 z wrote:
>> >> What do you mean by "calculate a MusicBrainz ID"? The only way to know
>> >> the MBID is to query their online database with known terms about the
>> >> track of interest, AFAIK. And then there is the task of picking the one
>> >> real match from the returned results. Am I right?
>> > 
>> > Not at all. There's a lib which calculates a fingerprint of the audio
>> > itself:
>> > http://musicbrainz.org/doc/How_PUIDs_Work
>> > 
>> >> 2010/2/21 Evgeny Egorochkin <phreedom.stdin at gmail.com>
>> 
>> >> > Added one idea:
>> >> > http://community.kde.org/GSoC/2010/Ideas#Web_Metadata_Extractor_Framework_and_Service
>> >> > 
>> >> > It doesn't look very hard on the surface (mostly grunt work with
>> >> > configuration and such) and is very useful.
>> >> > 
>> >> > One more thing left out is a "slow indexing" mode. At the moment, we
>> >> > don't calculate hashes for files or MusicBrainz IDs for music, in
>> >> > order to speed up indexing. Maybe it's worth either letting the user
>> >> > enable these features (possibly for a subset of dirs) or implementing
>> >> > a second pass of crawling to handle the heavy lifting.
>> >> > 
>> >> > Basically, if you have a plain mp3 with no tags, the "slow" crawler
>> >> > could calculate a MusicBrainz ID, and the Web Metadata Extractor would
>> >> > fetch the rest of the metadata.
> 
>> Hi,
>> I'm currently developing such an app for videos, but using Tracker (the
>> GNOME counterpart of KDE-Nepomuk). I found that the best data source I
>> could use is actually DBpedia (LMDB misses quite a few titles).
> 
> Thanks for the advice.
> 
>> But for music, I think MusicBrainz is the way to go. The source isn't
>> published yet, but it will be soon; I just need time to polish a few
>> things.
> 
> Is this publicly announced already?
Nothing public yet, more because I don't want to blog about a half-baked
piece of code than because I want to keep it secret :)
I'll keep you posted if you wish.

Cheers

