GSoC application: tag guessing (improving and modularizing)
vedant agarwala
vedant.kota at gmail.com
Fri May 3 18:07:40 UTC 2013
Hello,
With less than one hour left before the student application deadline, I am
posting my final proposal for Google Summer of Code.
Below is a plain-text copy of my project proposal as submitted on Google
Melange.
Thank you,
Vedant
Name: Vedant Agarwala
Email Address: vedant.kota at gmail.com
Freenode IRC Nick: vedant
IM Service and Username: xmpp-google: vedant.kota at gmail.com
Location (City, Country and/or Time Zone): Kolkata, India GMT+5:30
Proposal Title: Improving and modularizing tag guessing
Motivation for Proposal and Goal: Currently, Amarok “guesses” tags of music
files through the MusicBrainz web service using either existing tags or
MusicIP (PUID) audio fingerprints. This method, however, is somewhat
outdated. MusicBrainz is phasing out MusicIP in favour of AcoustID, and
other methods of guessing tags have emerged. Hence, getting tags for tracks
with missing or incorrect track names doesn't work as it should.
Say a user imports tracks from a CD. The CD didn't carry track names, so
all the imported songs were named "Track 1", "Track 2", etc. The user at
least needs the correct track names. They try to get tags from MusicBrainz
but, as expected, don't get the correct tags. It is not uncommon for
people to have tracks with missing, incorrect or useless (as in this
case) track names. My project aims to add more tag getters, such as a
MusicBrainz AcoustID tag getter and a Last.fm tag getter, so that such
situations are minimized, if not removed entirely.
Also, getting tags from MusicBrainz is hard-coded into the Amarok source
code and not modularized; this makes it hard to add new tag getters (for
example, Last.fm). Moreover, the MusicBrainz code is not very well
documented. Adding features even just to MusicBrainz is time-consuming,
since a programmer first has to read and understand the code. As of now,
tag guessing is useful only for updating existing tags. Tags for songs
with no or incorrect tags are not fetched properly.
The aim of this project is to create an abstract base class for tag
guessing that can be suitably inherited by other classes that aim to guess
tags.
Implementation Details:
- Creating a generic framework for tag getters: I will replace the
existing musicbrainz directory with a taggetter directory. It will contain
Controller and Provider classes (similar to StatSyncing) under the
namespace TagGetter. The settings UI and related classes will also live
in this directory. I will rename musicbrainzTagger() in the
TagDialog class to tagGetterController(). Once this is called, the
TagGetter::Controller (a singleton class) will be invoked and the process
of tag guessing will begin. The Controller will hold a list (a QList of
ProviderPointers) of available Providers and will create an object of each
of them. Each of those objects will implement the abstract base class
Provider. As required, the Controller will call the methods (and
connect signals to slots) of the Providers on the main thread, and via
polymorphism the appropriate methods of each Provider will be called.
Each TagGetter::Provider will have its own directory under the taggetter
directory. Each directory will contain the provider's classes, one of
which will implement the Provider class. Providers will be memory-managed
across different classes (for example, the classes handling the GUI will
require information from the provider) via QExplicitlySharedDataPointer or
KSharedPointer. The providers will run in parallel. As the different
Providers do their work, results and progress will be made available to
the Controller via signals connected to the Controller's slots.
Establishing a connection to the web service, converting the music into
data that can be sent over the internet, fetching results and handling
errors will all be part of each Provider’s work. Separate threads won’t be
required for this because the network operations, which are the most
time-consuming part, will be handled by Qt (QNetworkRequest and
QNetworkReply). If something else takes time to run (like decoding the
audio and creating the fingerprint), the provider will have to manage it
on another thread (probably by using ThreadWeaver::Job). As part of the
contract of each provider, each of its methods must return quickly. This
is important because the Controller guarantees that these methods are
called on the main thread.
The Controller will provide the Providers with the TrackPtr and hence
with the track data (name, album, artist, length, file location, etc.). I
will write a TagGetterMeta class. Its member variables will include a
trackPtr (the track whose metadata is being fetched) and metadata
variables (name, artist, album, etc.). After data has been fetched,
Providers will fill the metadata into TagGetterMeta objects according to
the trackPtr and return these objects via a signal to
TagGetter::Controller. Providers should not store this data, but they
should keep the data that authenticates the provider (probably an API key
received on authentication) for as long as their objects exist, since many
lists of tracks can be sent to the Provider in quick succession. Each
Provider can store a small amount of data using KGlobal::config(). A
provider should add itself to the "plugins" section in Amarok Settings so
that it can be enabled/disabled by the user. It may also provide
additional user-changeable settings, but only if badly required, since
it's best to keep the workings of plugins as abstracted from the user as
possible.
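As a rough illustration of the framework described above, here is a
minimal sketch in plain standard C++. It is not Amarok code: Track stands
in for the real Meta::TrackPtr target, std::shared_ptr stands in for
KSharedPointer, a std::function callback stands in for a Qt signal
connected to a Controller slot, and FakeProvider, id(), isAvailable(),
guessTags(), registerProvider() and run() are hypothetical names chosen
only for this example. The singleton aspect and asynchronous delivery are
left out to keep it short.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <string>
#include <utility>
#include <vector>

namespace TagGetter {

// Stand-in for the track type behind Amarok's TrackPtr (assumption).
struct Track {
    std::string path;
    std::string name;
};
using TrackPtr = std::shared_ptr<Track>;

// Result object a Provider hands back for one track.
struct TagGetterMeta {
    TrackPtr track;          // the track this metadata belongs to
    std::string name;
    std::string artist;
    std::string album;
    std::string providerId;  // hypothetical: which provider produced it
};

// Stand-in for a Qt signal connected to a Controller slot.
using ResultCallback = std::function<void(const TagGetterMeta &)>;

// Abstract base class every tag getter implements.
class Provider {
public:
    virtual ~Provider() = default;
    virtual std::string id() const = 0;
    // False when, e.g., an optional library was missing at build time.
    virtual bool isAvailable() const { return true; }
    // Contract: must return quickly; results arrive through onResult.
    virtual void guessTags(const std::vector<TrackPtr> &tracks,
                           const ResultCallback &onResult) = 0;
};
using ProviderPtr = std::shared_ptr<Provider>;

// Holds the provider list and dispatches work via polymorphism.
class Controller {
public:
    void registerProvider(ProviderPtr p) { m_providers.push_back(std::move(p)); }

    // Synchronous for simplicity; the real design would be signal-driven.
    std::vector<TagGetterMeta> run(const std::vector<TrackPtr> &tracks) {
        std::vector<TagGetterMeta> results;
        for (const auto &p : m_providers) {
            if (!p->isAvailable())
                continue;
            p->guessTags(tracks, [&results](const TagGetterMeta &m) {
                results.push_back(m);
            });
        }
        return results;
    }

private:
    std::vector<ProviderPtr> m_providers;
};

// Toy provider that "guesses" a title from the file path.
class FakeProvider : public Provider {
public:
    FakeProvider(std::string id, bool available)
        : m_id(std::move(id)), m_available(available) {}
    std::string id() const override { return m_id; }
    bool isAvailable() const override { return m_available; }
    void guessTags(const std::vector<TrackPtr> &tracks,
                   const ResultCallback &onResult) override {
        for (const auto &t : tracks) {
            TagGetterMeta meta;
            meta.track = t;
            meta.name = "Guessed title for " + t->path;
            meta.providerId = m_id;
            onResult(meta);
        }
    }

private:
    std::string m_id;
    bool m_available;
};

} // namespace TagGetter
```

Under these assumptions, adding a new tag getter only means writing one
more Provider subclass and registering it; the Controller code does not
change.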
- Adapt the MusicBrainz tag getter to implement the abstract Provider
class: I will rewrite the MusicBrainz code according to the above
framework. All the existing classes will be moved to the new musicbrainz
subdirectory inside the taggetter directory. MusicBrainzFinder will become
MusicIpProvider, implementing the abstract Provider class. Other code will
also be rewritten. Currently, libofa is used to create the MusicIP
fingerprint (PUID) that is sent to MusicBrainz. MusicBrainz is phasing out
PUIDs in favour of AcoustIDs. Hence, another Provider will be made (in the
same directory): a MusicBrainzProvider that uses
Chromaprint <http://acoustid.org/chromaprint> to compute AcoustIDs rather
than the MusicIP fingerprints generated by libofa. Chromaprint uses the
standard C library
[*] <https://bitbucket.org/acoustid/chromaprint/src/master/src/chromaprint.h>,
so its code can easily be used within existing classes. I will make
Chromaprint a part of the optional tag guessing package (along with libofa)
and update the CMakeLists, making Chromaprint a requirement for
MusicBrainz tag guessing. Hence, the MusicBrainzProvider will be available
only if Chromaprint has been installed. A HAS_CHROMAPRINT macro will keep
track of this: the Chromaprint-dependent code will only be compiled if the
macro is defined, and the provider will be “available” only if the macro
is defined.
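The HAS_CHROMAPRINT gating described above could look roughly like the
following sketch. It assumes the build system defines HAS_CHROMAPRINT when
Chromaprint is found; the function names musicBrainzProviderAvailable()
and availableProviderNames() are hypothetical and exist only for this
illustration. With the macro undefined the code still compiles and simply
omits the provider.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Assumption: HAS_CHROMAPRINT is defined by the build system only when
// Chromaprint was found. The header is pulled in only in that case.
#ifdef HAS_CHROMAPRINT
#include <chromaprint.h>
#endif

// Reports whether the AcoustID-based provider was compiled in.
bool musicBrainzProviderAvailable()
{
#ifdef HAS_CHROMAPRINT
    return true;
#else
    return false;
#endif
}

// Builds the list of provider names the Controller could offer.
std::vector<std::string> availableProviderNames()
{
    std::vector<std::string> names;
    // Needs no optional library in this sketch, so always listed.
    names.push_back("LastFmTagProvider");
#ifdef HAS_CHROMAPRINT
    // Only listed when fingerprinting support was built.
    names.push_back("MusicBrainzProvider");
#endif
    return names;
}
```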
- Creating the Last.fm tag getter: First, I will add the tag-based service
and then, if the schedule permits, the fingerprint-based service. Hence,
two Last.fm Providers will be created. Implementation will be very easy
thanks to liblastfm <https://github.com/lastfm/liblastfm>, a Qt C++
library for the Last.fm web services that is already used in Amarok.
Both Providers will share the authentication data and reply so that
network requests aren’t needlessly duplicated. They will live in the same
directory but will be exposed as two separate providers. Last.fm's network
replies and their parsing will be handled by liblastfm itself. This is
where the similarities end: the network requests and replies will have to
be handled differently by the two providers, since one is tag-based and
the other fingerprint-based.
For the LastFmTagProvider, the track/artist name has to be sent to the
Last.fm web service API. The web service will then return the other track
metadata. For the LastFmFingerprintProvider we will first have to generate
a “fingerprint” of the track using the Last.fm
Fingerprinter <https://github.com/lastfm/Fingerprinter>.
This audio fingerprint will then be sent to the Last.fm web service via a
call to
track.getFingerprintMetadata <http://www.last.fm/api/show/track.getFingerprintMetadata>.
The web service will then return the track metadata along with a
“confidence rating” specifying how accurately the track has been
identified.
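One way a provider might use such a confidence rating is to keep only the
best candidate above a cut-off. The sketch below is an assumption-laden
illustration, not the Last.fm API: the Candidate struct, the
bestCandidate() helper, the 0 to 1 range of the rating and the threshold
value passed by the caller are all hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical parsed result for one candidate match.
struct Candidate {
    std::string title;
    std::string artist;
    double confidence; // assumed to lie in [0, 1]
};

// Returns the highest-confidence candidate at or above minConfidence,
// or nullptr if none qualifies. The pointer refers into `candidates`.
const Candidate *bestCandidate(const std::vector<Candidate> &candidates,
                               double minConfidence)
{
    const Candidate *best = nullptr;
    for (std::size_t i = 0; i < candidates.size(); ++i) {
        const Candidate &c = candidates[i];
        if (c.confidence < minConfidence)
            continue; // too uncertain to offer to the user
        if (best == nullptr || c.confidence > best->confidence)
            best = &c;
    }
    return best;
}
```

Returning nullptr when nothing clears the cut-off lets the caller fall
back to another provider instead of writing a doubtful tag.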
Tentative Timeline:
June-
<--- GSoC commences--->
week 3: Make the directories and write the abstract classes (with
documentation).
week 4: Polish and compile the written abstract classes; write their
CMakeLists and make tests.
July-
week 1: Re-write the MusicBrainz tag getter according to the framework with
better documentation
week 2: Add the features (like using AcoustID) to this MusicBrainz tag
getter
week 3: Compile and run the new MusicBrainz. Fix issues and repeat
week 4: Complete the MusicBrainz tag getter for mid term submission.
<--- Mid term --->
August-
week 1: Write the Last.fm tag getter (as many features as possible)
week 2: Compile and run the new Last.fm. Fix issues and repeat
week 3: Finish writing and test the Last.fm tag getter
week 4: Write make tests to make sure that new tag getters follow the set
framework
September-
week 1: Improve documentation. Fix bugs that have been discovered over the
weeks
week 2: Buffer Period
<--- suggested “pens down” --->
week 3: Fix Bugs, streamline and optimize the code.
<--- firm “pens down” --->
week 4: Do some code cleaning and fix bugs so that my project can be pushed
into the master branch.
October-
week 1: Improve documentation. Fix bugs.
Other Obligations: I have no other obligations. I can easily spend about 50
hours a week: around 8 hrs a day in slots of 2 to 3 hrs. Summer vacation
runs until mid-July. Even after college starts, I will follow the same
schedule for GSoC coding. This is easily possible because very few classes
are held at the beginning of the semester, and even if they do clash with
my coding hours, the college teachers excuse GSoC students' absence from
class. Since I will have no other coursework obligations, I can continue
to code 50 hours a week (with a similar schedule) right up to September.
About Me: I am currently in my second undergraduate year at the National
Institute of Technology, Durgapur, India, studying Computer Science and
Engineering. I have coding experience with C/C++, Java (including Android
and GUIs built with Java Swing) and web services. I have submitted 3
patches (Junior Jobs, one each for
Rekonq [1] <https://git.reviewboard.kde.org/r/107662/> and
Amarok [2] <https://git.reviewboard.kde.org/r/109283/>, as well as an
improved formatting patch for
Amarok [3] <https://git.reviewboard.kde.org/r/109295/>)
and 2 more are under review (both Junior Jobs for
Amarok [4] <https://git.reviewboard.kde.org/r/110101/>
[5] <https://git.reviewboard.kde.org/r/110082/>).
I love coding for open source. I have been hacking with Amarok recently.
Earlier I used to take weeks to fix bugs, but now I am able to submit a new
request within 2-4 days of taking up a bug. I have become familiar with
the source code of Amarok and its coding style. Most of my work has been
for Amarok mainly because it is a wonderful music player.
I’m sure working with a mentor who is virtually present won’t be a problem.
I have interacted with the developers (mainly with Matej, a.k.a. Strohel)
over IRC, the amarok mailing list, reviewboard and also the KDE bug
tracking system.
During the work period code can easily be shared via my personal scratch
git repositories.
After GSoC I plan to become an active developer for Amarok and also for
other projects of KDE.