GSoC draft application

Matěj Laitl matej at laitl.cz
Mon Apr 29 20:41:39 UTC 2013


On 29. 4. 2013 vedant agarwala wrote:
> In the meantime, I have updated my project proposal as shown below. Your
> inputs are is still as valuable so please comment :)

Hi, I'm very sorry to reply this late, I haven't unfortunately found time 
earlier, neither did any other developer apparently.

Alberto, perhaps you'd be able to review this proposal (unless you want to 
submit a similar one) too? (as you've lately touched the code significantly)

> Proposal Title: Improving and modularizing tag guessing
> 
> Motivation for Proposal / Goal: Currently, Amarok “guesses” tags of music
> files through the MusicBrainz web service using either existing tags or
> MusicIP (PUID) audio fingerprints. This method, however, is somewhat
> outdated. MusicBrainz is phasing out MusicIP in favour of AcousID and other
> methods of guessing tags have emerged. The aim of this project is to create
> an abstract base class for tag guessing that can be suitably inherited by
> other classes that aim to guess tags.
> 
> Getting tags from MusicBrainz is hard coded into the Amarok source code.
> Hence tag guessing is not modularized; this makes it hard to add new Tag
> Getters (like, for example, Last.fm). Moreover, MusicBrainz code is not
> very well documented. So, it'll be very time consuming for a programmer
> trying to add features even just to MusicBrainz since they will have to
> read and understand the code. As of now, every track’s tags can only be
> changed one by one that also after waiting a few seconds for each.

The last sentence is not true, you can change then all in batch using the Edit 
Tags dialog or throught existing MusicBrainz support. See also how Alberto 
Villa 

> More often than not, the tags returned by MusicBrainz is incorrect or tags
> of different kinds are returned.

This might get better thanks to recent Alberto's patch. I view MusicBrainz as 
a good way for slight corrections to existing tags, but is often provides very 
bad results for files with no tags at all (which is expected).

> Say when a user is listening to a particular song he/she realizes that some
> tags to the song are missing. He/she will try to correct this by himself
> and will be very happy to see a “Get tags from MusicBrainz” button. Rather
> than updating the tags himself he/she clicks on the button and selects the
> correct tags. But this is where it all ends. The user is only saved from
> the trouble of typing. He/she still has to choose the correct tags. Many
> times the user will not find any tags at all or they will be guessed wrong
> entirely.

I don't think that this project, even is successful, will make the situation 
above significantly better.

> Moreover, there is no way of updating/correcting tags of multiple files.

This is entirely false.

> If
> a user adds some new songs to the local collection with poorly formatted
> tags. It is not uncommon for tracks to contain a track name in the format
> “trackname - albumname - artistname” with the other details blank. Many
> tracks contain names of websites (like downloaded from)  at the end of one
> or more tags of the file. A user will want these tags to be fixed but it
> will be a real pain to do this one by one. If he/she clicks on the “guess
> tags from file” he/she is greeted by another dialog that takes a long while
> to understand. Its likely that he/she will just close this dialog. Even if
> he/she gets tags from MusicBrainz (assuming the fetched tags are correct),
> getting them from each file one by one will be too much of a task for him.

This paragraph is very misleading - such problem doesn't exist in current 
Amarok.

> Implementation Details:
>    -
>    Creating a generic framework for tag getters: I will replace the
>    existing musicbrainz directory by a taggetter directory. It will contain
>    Controller and Provider classes (similar to StatSyncing) under the
>    namespace TagGetter. The settings UI and related classes will also be
>    present in this directory. I will change the musicbrainzTagger() in the
>    TagDialog class to tagGetterController(). Once this is called, the
>    TagGetter::Controller (a singleton class) will be invoked and process of
>    tag guessing will begin. The Controller will have a list (a QList of
>    ProviderPointers) of available Providers and it will create objects of
> each of them. Each of those objects will implementing the abstract base
> class Provider. As required, the Controller will be calling the methods
> (and connecting signals to slots) of the Provider on the main thread and
> via polymorphism appropriate methods of each Provider will be called.

Right.

>    Each TagGetter::Provider will have its own directory under the TagGetter
>    directory. They will contain classes and one that will implement the
>    Provider class. Data will be shared among the classes (in the TagGetter
>    namespace) probably via QExplicitlySharedDataPointer.

what data?

>    Each provider will
> be running in parallel. As the different Providers do their work, results
> and progress of the work will available to the Controller via the signals
> connected to the slots of the Controller. Establishing a connection to the
> web service server, converting the music into data sendable over the
> internet, fetching results and the error handling will all be part of each
> Provider’s work. Separate threads won’t be required because the network
> operations will be handled by Qt (QNetworkRequest and QNetworkReply) and
> they are the most time consuming. If something else takes time to run (like
> decoding the audio and creating the fingerprint) then the provider will
> have to manage this on another thread (probably by using the
>    ThreadWeaver::Job). As part of the contract of each provider, each of its
> methods must return quickly. This is important because the Controller will
> guarantee calling these methods on the main thread.

Right.

>    The Controller will provide the Providers with track details (through
>    the track pointer)such as track data (name, album, artist, length, file
>    location etc.). The framework will contain a TagGetterMeta class. Objects
> of this class will store a TrackPtr to identify the the track in the
> Collection and metadata (name, artist, album, etc.) returned by the
> providers. Providers will return a QList of objects of this class. The
> metadata that has been updated will be non-zero.

Oh, I'm getting a bit lost in this, but I hope I get the point, which seems 
okay.

> Providers should not store
> this data but they should keep the data that authenticates the provider
> (probably an API key that is received on authentication) as long as the
> objects of themselves exist since many lists of tracks can sent to be the
> Provider in succession, either by the user or programmatically (refer to
> last point of “implementation details”).

> A AbstractSettings class will also
> be present so that users can customize the settings of each Provider and
> each Provider can store a small amount of data in the data using the
> KGlobal::config(). This will also be beneficial for the programmer(s) who
> add more Providers. It will also keep a consistent look and feel of all the
> TagGetterProviders.

Hmmm, the original intent was to make the providers rather dumb and almost 
invisible to the user. What do you think would be need to be configurable by 
the user?

>    Rewriting MusicBrainz tag getter: I will rewrite MusicBrainz code
>    according to the above framework. All the existing classes will shifted
> to the new musicbrainz subdirectory inside the taggetter directory. The
> MusicBrainzFinder will become MusicBrainzProvider, implementing the
> abstract Provider class.

Right, but I wouldn't say "rewrite", but rather "adapt to be an implementation 
of the new abstract class".

> Other code will also be re-written but its work
> will mainly remain the same, except for one major difference. Currently,
> libofa is used to create the MusicIP (PUID) that is sent to MusicBrainz.
> MusicBrainz is phasing out PUIDs in favour of AcustIDs.
> 
>    Hence, the MusicBrainzProvide will use
> Chromaprint<http://acoustid.org/chromaprint>to compute AcustIDs rather
> than the MusicIP generated by libofa.

Oh beware! The musizbrainz and musicip are rather independent and were 
intended to be factored out into 2 separate providers! (one changing to 
acoustid)

> Codecs will identified using ffmpeg.

What/why?

> Phonon will be used for decoding while Chromaprint will be used to generate
> the AcoustID.

Have you checked this is possible? What Phonon methods will you use?

> Chromaprint uses the
>    standard C
> library[*]<https://bitbucket.org/acoustid/chromaprint/src/master/src/chroma
> print.h>so using this won’t be a problem.

Eh? What is a connection between a project using a standard C library and 
feasibility of its usage?

> I will make Chromaprint a part of
> the optional tag guessing package (replacing libofa) and update the
> cmakelists, making Chromaprint as a requirement for MusicBrainz tag
> guessing. Hence, the MusicBrainzProvider will be available only if
> Chromaprint has been installed. A HAS_CHROMAPRINT macro will keep track of
> this. The provider will be “available” only if the macro is defined. Just
> as now, the “The::networkAccessManager()”
>    will handle the connection to the MusicBrainz server. The audio
>    fingerprint generated by Chromaprint will be sent to MusicBrainz via a
>    QNetoworkRequest and a slot will be connected to listen for its reply.

The above is obvious and it is perhaps redundant to mention it.

>    Creating Last.fm tag getter: Create the Last.fm tag getter. First, I
>    will add the tag based service and then, if schedule permits, I can add
> the fingerprint based service. Hence, two LastFm Providers will be created.
> Implementation will be very easy thanks to
> liblastfm<https://github.com/lastfm/liblastfm>->    A Qt C++ library for the Last.fm webservices. Both the Providers will
> share the authentication data and reply so that network requests aren’t
> needlessly duplicated. They will be in the same directory but will have two
> available providers. The network replies by Last.fm is in XML format.
> Hence, the QtXML module will parse the replies accordingly.

In fact, liblastfm does it. (and we already use it in Amarok)

> Now the
> similarities end. The network requests and replies will have to handled
> differently by the different providers, since one will be tag based and the
> other fingerprint based.
> 
>    For the LastfmTagProvider the track/artist name has to sent to the
>    Last.fm webservice API. The webservice will then return the other track
>    metadata. After being correctly parsed it will signal the required slot
> so that the Controller receives the tracks whose metadata has been fetched.
> For the LastFmFingerprintProvider we will first have to generate a
> “fingerprint” of the track using the Last Fm
> Fingerprinter<https://github.com/lastfm/Fingerprinter>.
>    This audio fingerprint will then be sent to the Last fm web service via a
> call to the
> track.getFingerprintMetadata<http://www.last.fm/api/show/track.getFingerpri
> ntMetadata>. Then the web service will return the track metadata along with
> a “confidence rating” to specify how accurately the track has been
> identified. All this data will be parsed by the QtXML module and sent to
> the Controller in a manner similar to above.

^^ the above could be shortened, many things mentioned are rather obvious.

>    A better GUI: A better GUI is required. Though I will need to discuss
>    with the Amarok developers what the new GUI should like so that it
>    follows the general look and feel of Amarok, I am proving my
> interpretation as follows: I will replace the existing "Get tags from
> MusicBrainz" button by "Get tags from the following services: ". It will be
> followed by a drop-down list with check-boxes for each item containing
> names of the services.
> 
>    [image: http://troll.ws/image/458a61bc#.UX0fS0GH7gx]

I don't like it. We should just query all enabled services, users don't want 
to think about it every time. If some service doesn't work well for them, they 
should be able to disable it in Plugins section.

>    In “Amarok Settings” there will be a tag guessing section and every
>    TagGetterProvider can create a tab to change settings of the individual
>    provider.

I don't like this either. Every setting just bothers the user, the providers 
should just work without any configuration.

>    As getting tags from the individual services will run in
>    parallel, results will be displayed as soon as they are fetched via the
>    TagGetter::Controller. This list will persist in memory so that if the
> user closes the dialog and then reopens it, tags won’t have to be fetched
> again (until Amarok restart).

Oh please not, this is very convoluted and may lead to very unexpected memory 
usage.

> The user can choose the best tags from the
> generated list. If tags from multiple sources are the same they should be
> clubbed and appear higher up the list compared to the others.

clubbed: yes; higher in the lists: yes, but I propose using the max(a, b) 
rather than a + b function to combine the scores. 

> Tags from
> different sources have to cross-checking for this similarity.

I don't understand this sentence.

> Each set of
> tags that have been fetched will be accompanied by a “confidence rating”-
> between 0 and 1, and provided by the respective Provider. For similar tags
> the ratings will be added (or maybe similar way of increasing the rating
> that can be found out by testing). The list will be continually sorted
> according to the rating of each tag. The list will be a bullet list with
> colors depicting the “confidence rating”- 0 to 1 = red to green (as it is
> now).

Well, this is largely what the current music brainz dialog is, isn't it?

> All user visible strings (buttons and tool-tips) will be written
> using i18n(). i18nc() will be used wherever requiered- like, for a label
> stating "number of track(s)".

This is an obvious requirement, no need to mention it.

>    A GUI for getting tags of multiple songs simultaneously: Its a  feature
>    by which tags for multiple tracks (if not the entire local collection)
> can be changed. Getting tags requires many user clicks as of now,
> especially for multiple tracks. To solve this, a "Tag Getter Wizard" will
> select the best tags for every track.

This is already working, please try to understand *actual* Amarok features 
more!

> Best ones are written to files
> without user approval but in a reversible way. Tags will be saved to
> database and can be reversed/changed anytime (even after amarok restart) in
> both ways- 1) for individual tracks 2) through a list showing the changed
> tags.

Oh, please don't. This would be really convoluted to implement and use.

>    A user can start the wizard select the playlists and/or tacks whose tags
>    need to be guesses, then make it run in the background or make it run in
>    the foreground showing the details of each track whose tags are being
>    changed (and missing ones added).

Hmm, no, please don't make this part of the project (+ it is already working 
to some extent).

>    Or, he/she can set it to run on the
>    entire local collection and/or as new tracks are added to the local
>    collection.

This doesn't look like a good idea to me.

>    I will make decoupled and modularized code so that support
>    for this wizard can easily be extended to other collections.
>    After the
>    wizard has completed its operation a list will show the user the updated
>    tags each a button that will reverse the the changes. Not all fetched
>    tags will be updated. Though I can achieve the right balance only by user
> testing, as of now I set the default as "tags with confidence rating >1
> will be updated". This can easily be changed by the user in the settings
> page. Again, I will write every code according to the KDE
>    internationalization guildlines. The exact look of this wizard will be
>    decided after discussion with the Amarok team but a rough idea of
> features is depicted below:[image:
> http://troll.ws/image/438a77c9#.UX0fXEGH7gx]

No, please don't do this. I'm already pretty fine this is current "MusicBrainz 
Tags Dialog" - especially when recent Alberto's patch is applied.

> <--- GSoC commences--->
> 
> week 3: Make the directories and write the abstract classes (with
> documentation).
> 
> week 4: Polish, make cmakelists for, write make tests and compile the
> written abstract classes.
> 
> July-
> 
> week 1: Re-write the MusicBrainz tag getter according to the framework with
> better documentation
> 
> week 2: Add the features (like using AcoustID) to this MusicBrainz tag
> getter
> 
> week 3: Compile and run the new MusicBrainz (displaying results in the old
> UI itself)
> 
> week 4: Write the new GUI code. Run MusicBrainz tag getter in this fashion
> 
> <--- Mid term --->
> 
> August-
> 
> week 1: Write the Last.fm tag getter (as many features as possible)
> 
> week 2: Finish writing and test the Last.fm tag getter
> 
> week 3: Test the Last.fm and MusicBrainz tag getter together in the new GUI
> 
> week 4: Make the tag getter wizard
> 
> September-
> 
> week 1: Test the new wizard. Write make tests to make sure that new tag
> getters follow the set framework
> 
> week 2: Improve documentation. Fix bugs that have been discovered over the
> weeks
> 
> <--- suggested “pens down” --->
> 
> week 3: Fix Bugs, streamline and optimize the code.
> 
> <--- firm “pens down” --->
> 
> week 4: Do some code cleaning and fix bugs so that my project can be pushed
> into the master branch.
> 
> October-
> 
> week 1: Improve documentation. Fix bugs.

Vedant, I really think that implementing all the stuff you've outlined is 
unachievable, at least not in a code quality we'd expect. Please take some 
time to significantly reduce the amount of work you plan to do. Many of the 
ideas you have on top of the idea (published by devs) simply don't align with 
Amarok goals and/or are based on misunderstanding of how current tag guessing 
works.

> Other Obligations: I have no other obligations. I can easily spent about 50
> hours a week: around 8 hrs a day in slots of 2 to 3 hrs, one each in the
> afternoon (10am-1pm), evening (4pm-6pm) and at night (10pm-1am), with some
> hours less on weekends.

Hehe, I don't think you need to be this detailed. :-)

> Summer vacations will be going on till mid July.
> Even after college starts, I will be following the same schedule for GSoC
> coding. It is easily possible because very few classes are held in the
> beginning of the semester and even if they do clash with my coding
> time-period, the college teachers excuse GSoC students' absence from class.
> Since I will have no other coursework obligations, I can continue to code
> 50 hours a week (with a similar schedule) even up till September.
> 
> About Me: I am currently in my second undergraduate year in National
> Institute of Technology, Durgapur, India, studying Computer Science and
> Engineering. I have experience coding experience with C/C++, Java
> (including Android and making GUI using Java swing) and web services. I
> have submitted 3 patches (Junior Jobs, one each for
> Rekonq[1]<https://git.reviewboard.kde.org/r/107662/>and Amarok
> [2] <https://git.reviewboard.kde.org/r/109283/> as well as a improved
> formatting patch for Amarok[3] <https://git.reviewboard.kde.org/r/109295/>)
> and 2 more are under review (both Junior Jobs for
> Amarok[4]<https://git.reviewboard.kde.org/r/110101/>
> [5] <https://git.reviewboard.kde.org/r/110082/>).
> 
> I love coding for open source. I’m sure working with a mentor who is
> virtually present won’t be a problem. I have interacted (mainly with Matej
> a.k.a. Strohel on IRC) over IRC, the amarok mailing list, reviewboard and
> also the KDE bug tracking system. During the work period code can easily be
> shared via GitHub.

We prefer personal scratch git repositories on KDE infrastructure for this.

> If it is possible, meeting a mentor face to face will
> obviously be much more helpful. I have worked with my college seniors in
> this fashion who have introduced me to Linux and Open-Source. Countless
> times I have been to their rooms for advice and to solve my problems.
> 
> After GSoC I plan to become an active developer for Amarok and also for
> other projects of KDE.

As stated above, this looks to me like a way too much work to do. Perhaps you 
are trying to make the proposal more interesting, but I fear that the effect is 
the opposite - we'd actually appreciate if you wanted to implement much less 
things (ideally just what the initial idea was about), but make it interesting 
by level of insight, research and thought that has been put to it.

Also please study/use the current tag guessing in Amarok more (and do have a 
look at Alberto's patch) to avoid a situation where you describe lack of 
Amarok feature while it is not the case.

Regards,
		Matěj


More information about the Amarok-devel mailing list