GSoC draft application

vedant agarwala vedant.kota at gmail.com
Mon Apr 29 20:56:32 UTC 2013


Thanks a lot Matej for this in-depth review. I will address all the issues
you pointed out and make another draft.

Reagards,


On Tue, Apr 30, 2013 at 2:11 AM, Matěj Laitl <matej at laitl.cz> wrote:

> On 29. 4. 2013 vedant agarwala wrote:
> > In the meantime, I have updated my project proposal as shown below. Your
> > inputs are is still as valuable so please comment :)
>
> Hi, I'm very sorry to reply this late, I haven't unfortunately found time
> earlier, neither did any other developer apparently.
>
> Alberto, perhaps you'd be able to review this proposal (unless you want to
> submit a similar one) too? (as you've lately touched the code
> significantly)
>
> > Proposal Title: Improving and modularizing tag guessing
> >
> > Motivation for Proposal / Goal: Currently, Amarok “guesses” tags of music
> > files through the MusicBrainz web service using either existing tags or
> > MusicIP (PUID) audio fingerprints. This method, however, is somewhat
> > outdated. MusicBrainz is phasing out MusicIP in favour of AcousID and
> other
> > methods of guessing tags have emerged. The aim of this project is to
> create
> > an abstract base class for tag guessing that can be suitably inherited by
> > other classes that aim to guess tags.
> >
> > Getting tags from MusicBrainz is hard coded into the Amarok source code.
> > Hence tag guessing is not modularized; this makes it hard to add new Tag
> > Getters (like, for example, Last.fm). Moreover, MusicBrainz code is not
> > very well documented. So, it'll be very time consuming for a programmer
> > trying to add features even just to MusicBrainz since they will have to
> > read and understand the code. As of now, every track’s tags can only be
> > changed one by one that also after waiting a few seconds for each.
>
> The last sentence is not true, you can change then all in batch using the
> Edit
> Tags dialog or throught existing MusicBrainz support. See also how Alberto
> Villa
>
> > More often than not, the tags returned by MusicBrainz is incorrect or
> tags
> > of different kinds are returned.
>
> This might get better thanks to recent Alberto's patch. I view MusicBrainz
> as
> a good way for slight corrections to existing tags, but is often provides
> very
> bad results for files with no tags at all (which is expected).
>
> > Say when a user is listening to a particular song he/she realizes that
> some
> > tags to the song are missing. He/she will try to correct this by himself
> > and will be very happy to see a “Get tags from MusicBrainz” button.
> Rather
> > than updating the tags himself he/she clicks on the button and selects
> the
> > correct tags. But this is where it all ends. The user is only saved from
> > the trouble of typing. He/she still has to choose the correct tags. Many
> > times the user will not find any tags at all or they will be guessed
> wrong
> > entirely.
>
> I don't think that this project, even is successful, will make the
> situation
> above significantly better.
>
> > Moreover, there is no way of updating/correcting tags of multiple files.
>
> This is entirely false.
>
> > If
> > a user adds some new songs to the local collection with poorly formatted
> > tags. It is not uncommon for tracks to contain a track name in the format
> > “trackname - albumname - artistname” with the other details blank. Many
> > tracks contain names of websites (like downloaded from)  at the end of
> one
> > or more tags of the file. A user will want these tags to be fixed but it
> > will be a real pain to do this one by one. If he/she clicks on the “guess
> > tags from file” he/she is greeted by another dialog that takes a long
> while
> > to understand. Its likely that he/she will just close this dialog. Even
> if
> > he/she gets tags from MusicBrainz (assuming the fetched tags are
> correct),
> > getting them from each file one by one will be too much of a task for
> him.
>
> This paragraph is very misleading - such problem doesn't exist in current
> Amarok.
>
> > Implementation Details:
> >    -
> >    Creating a generic framework for tag getters: I will replace the
> >    existing musicbrainz directory by a taggetter directory. It will
> contain
> >    Controller and Provider classes (similar to StatSyncing) under the
> >    namespace TagGetter. The settings UI and related classes will also be
> >    present in this directory. I will change the musicbrainzTagger() in
> the
> >    TagDialog class to tagGetterController(). Once this is called, the
> >    TagGetter::Controller (a singleton class) will be invoked and process
> of
> >    tag guessing will begin. The Controller will have a list (a QList of
> >    ProviderPointers) of available Providers and it will create objects of
> > each of them. Each of those objects will implementing the abstract base
> > class Provider. As required, the Controller will be calling the methods
> > (and connecting signals to slots) of the Provider on the main thread and
> > via polymorphism appropriate methods of each Provider will be called.
>
> Right.
>
> >    Each TagGetter::Provider will have its own directory under the
> TagGetter
> >    directory. They will contain classes and one that will implement the
> >    Provider class. Data will be shared among the classes (in the
> TagGetter
> >    namespace) probably via QExplicitlySharedDataPointer.
>
> what data?
>
> >    Each provider will
> > be running in parallel. As the different Providers do their work, results
> > and progress of the work will available to the Controller via the signals
> > connected to the slots of the Controller. Establishing a connection to
> the
> > web service server, converting the music into data sendable over the
> > internet, fetching results and the error handling will all be part of
> each
> > Provider’s work. Separate threads won’t be required because the network
> > operations will be handled by Qt (QNetworkRequest and QNetworkReply) and
> > they are the most time consuming. If something else takes time to run
> (like
> > decoding the audio and creating the fingerprint) then the provider will
> > have to manage this on another thread (probably by using the
> >    ThreadWeaver::Job). As part of the contract of each provider, each of
> its
> > methods must return quickly. This is important because the Controller
> will
> > guarantee calling these methods on the main thread.
>
> Right.
>
> >    The Controller will provide the Providers with track details (through
> >    the track pointer)such as track data (name, album, artist, length,
> file
> >    location etc.). The framework will contain a TagGetterMeta class.
> Objects
> > of this class will store a TrackPtr to identify the the track in the
> > Collection and metadata (name, artist, album, etc.) returned by the
> > providers. Providers will return a QList of objects of this class. The
> > metadata that has been updated will be non-zero.
>
> Oh, I'm getting a bit lost in this, but I hope I get the point, which seems
> okay.
>
> > Providers should not store
> > this data but they should keep the data that authenticates the provider
> > (probably an API key that is received on authentication) as long as the
> > objects of themselves exist since many lists of tracks can sent to be the
> > Provider in succession, either by the user or programmatically (refer to
> > last point of “implementation details”).
>
> > A AbstractSettings class will also
> > be present so that users can customize the settings of each Provider and
> > each Provider can store a small amount of data in the data using the
> > KGlobal::config(). This will also be beneficial for the programmer(s) who
> > add more Providers. It will also keep a consistent look and feel of all
> the
> > TagGetterProviders.
>
> Hmmm, the original intent was to make the providers rather dumb and almost
> invisible to the user. What do you think would be need to be configurable
> by
> the user?
>
> >    Rewriting MusicBrainz tag getter: I will rewrite MusicBrainz code
> >    according to the above framework. All the existing classes will
> shifted
> > to the new musicbrainz subdirectory inside the taggetter directory. The
> > MusicBrainzFinder will become MusicBrainzProvider, implementing the
> > abstract Provider class.
>
> Right, but I wouldn't say "rewrite", but rather "adapt to be an
> implementation
> of the new abstract class".
>
> > Other code will also be re-written but its work
> > will mainly remain the same, except for one major difference. Currently,
> > libofa is used to create the MusicIP (PUID) that is sent to MusicBrainz.
> > MusicBrainz is phasing out PUIDs in favour of AcustIDs.
> >
> >    Hence, the MusicBrainzProvide will use
> > Chromaprint<http://acoustid.org/chromaprint>to compute AcustIDs rather
> > than the MusicIP generated by libofa.
>
> Oh beware! The musizbrainz and musicip are rather independent and were
> intended to be factored out into 2 separate providers! (one changing to
> acoustid)
>
> > Codecs will identified using ffmpeg.
>
> What/why?
>
> > Phonon will be used for decoding while Chromaprint will be used to
> generate
> > the AcoustID.
>
> Have you checked this is possible? What Phonon methods will you use?
>
> > Chromaprint uses the
> >    standard C
> > library[*]<
> https://bitbucket.org/acoustid/chromaprint/src/master/src/chroma
> > print.h>so using this won’t be a problem.
>
> Eh? What is a connection between a project using a standard C library and
> feasibility of its usage?
>
> > I will make Chromaprint a part of
> > the optional tag guessing package (replacing libofa) and update the
> > cmakelists, making Chromaprint as a requirement for MusicBrainz tag
> > guessing. Hence, the MusicBrainzProvider will be available only if
> > Chromaprint has been installed. A HAS_CHROMAPRINT macro will keep track
> of
> > this. The provider will be “available” only if the macro is defined. Just
> > as now, the “The::networkAccessManager()”
> >    will handle the connection to the MusicBrainz server. The audio
> >    fingerprint generated by Chromaprint will be sent to MusicBrainz via a
> >    QNetoworkRequest and a slot will be connected to listen for its reply.
>
> The above is obvious and it is perhaps redundant to mention it.
>
> >    Creating Last.fm tag getter: Create the Last.fm tag getter. First, I
> >    will add the tag based service and then, if schedule permits, I can
> add
> > the fingerprint based service. Hence, two LastFm Providers will be
> created.
> > Implementation will be very easy thanks to
> > liblastfm<https://github.com/lastfm/liblastfm>->    A Qt C++ library
> for the Last.fm webservices. Both the Providers will
> > share the authentication data and reply so that network requests aren’t
> > needlessly duplicated. They will be in the same directory but will have
> two
> > available providers. The network replies by Last.fm is in XML format.
> > Hence, the QtXML module will parse the replies accordingly.
>
> In fact, liblastfm does it. (and we already use it in Amarok)
>
> > Now the
> > similarities end. The network requests and replies will have to handled
> > differently by the different providers, since one will be tag based and
> the
> > other fingerprint based.
> >
> >    For the LastfmTagProvider the track/artist name has to sent to the
> >    Last.fm webservice API. The webservice will then return the other
> track
> >    metadata. After being correctly parsed it will signal the required
> slot
> > so that the Controller receives the tracks whose metadata has been
> fetched.
> > For the LastFmFingerprintProvider we will first have to generate a
> > “fingerprint” of the track using the Last Fm
> > Fingerprinter<https://github.com/lastfm/Fingerprinter>.
> >    This audio fingerprint will then be sent to the Last fm web service
> via a
> > call to the
> > track.getFingerprintMetadata<
> http://www.last.fm/api/show/track.getFingerpri
> > ntMetadata>. Then the web service will return the track metadata along
> with
> > a “confidence rating” to specify how accurately the track has been
> > identified. All this data will be parsed by the QtXML module and sent to
> > the Controller in a manner similar to above.
>
> ^^ the above could be shortened, many things mentioned are rather obvious.
>
> >    A better GUI: A better GUI is required. Though I will need to discuss
> >    with the Amarok developers what the new GUI should like so that it
> >    follows the general look and feel of Amarok, I am proving my
> > interpretation as follows: I will replace the existing "Get tags from
> > MusicBrainz" button by "Get tags from the following services: ". It will
> be
> > followed by a drop-down list with check-boxes for each item containing
> > names of the services.
> >
> >    [image: http://troll.ws/image/458a61bc#.UX0fS0GH7gx]
>
> I don't like it. We should just query all enabled services, users don't
> want
> to think about it every time. If some service doesn't work well for them,
> they
> should be able to disable it in Plugins section.
>
> >    In “Amarok Settings” there will be a tag guessing section and every
> >    TagGetterProvider can create a tab to change settings of the
> individual
> >    provider.
>
> I don't like this either. Every setting just bothers the user, the
> providers
> should just work without any configuration.
>
> >    As getting tags from the individual services will run in
> >    parallel, results will be displayed as soon as they are fetched via
> the
> >    TagGetter::Controller. This list will persist in memory so that if the
> > user closes the dialog and then reopens it, tags won’t have to be fetched
> > again (until Amarok restart).
>
> Oh please not, this is very convoluted and may lead to very unexpected
> memory
> usage.
>
> > The user can choose the best tags from the
> > generated list. If tags from multiple sources are the same they should be
> > clubbed and appear higher up the list compared to the others.
>
> clubbed: yes; higher in the lists: yes, but I propose using the max(a, b)
> rather than a + b function to combine the scores.
>
> > Tags from
> > different sources have to cross-checking for this similarity.
>
> I don't understand this sentence.
>
> > Each set of
> > tags that have been fetched will be accompanied by a “confidence rating”-
> > between 0 and 1, and provided by the respective Provider. For similar
> tags
> > the ratings will be added (or maybe similar way of increasing the rating
> > that can be found out by testing). The list will be continually sorted
> > according to the rating of each tag. The list will be a bullet list with
> > colors depicting the “confidence rating”- 0 to 1 = red to green (as it is
> > now).
>
> Well, this is largely what the current music brainz dialog is, isn't it?
>
> > All user visible strings (buttons and tool-tips) will be written
> > using i18n(). i18nc() will be used wherever requiered- like, for a label
> > stating "number of track(s)".
>
> This is an obvious requirement, no need to mention it.
>
> >    A GUI for getting tags of multiple songs simultaneously: Its a
>  feature
> >    by which tags for multiple tracks (if not the entire local collection)
> > can be changed. Getting tags requires many user clicks as of now,
> > especially for multiple tracks. To solve this, a "Tag Getter Wizard" will
> > select the best tags for every track.
>
> This is already working, please try to understand *actual* Amarok features
> more!
>
> > Best ones are written to files
> > without user approval but in a reversible way. Tags will be saved to
> > database and can be reversed/changed anytime (even after amarok restart)
> in
> > both ways- 1) for individual tracks 2) through a list showing the changed
> > tags.
>
> Oh, please don't. This would be really convoluted to implement and use.
>
> >    A user can start the wizard select the playlists and/or tacks whose
> tags
> >    need to be guesses, then make it run in the background or make it run
> in
> >    the foreground showing the details of each track whose tags are being
> >    changed (and missing ones added).
>
> Hmm, no, please don't make this part of the project (+ it is already
> working
> to some extent).
>
> >    Or, he/she can set it to run on the
> >    entire local collection and/or as new tracks are added to the local
> >    collection.
>
> This doesn't look like a good idea to me.
>
> >    I will make decoupled and modularized code so that support
> >    for this wizard can easily be extended to other collections.
> >    After the
> >    wizard has completed its operation a list will show the user the
> updated
> >    tags each a button that will reverse the the changes. Not all fetched
> >    tags will be updated. Though I can achieve the right balance only by
> user
> > testing, as of now I set the default as "tags with confidence rating >1
> > will be updated". This can easily be changed by the user in the settings
> > page. Again, I will write every code according to the KDE
> >    internationalization guildlines. The exact look of this wizard will be
> >    decided after discussion with the Amarok team but a rough idea of
> > features is depicted below:[image:
> > http://troll.ws/image/438a77c9#.UX0fXEGH7gx]
>
> No, please don't do this. I'm already pretty fine this is current
> "MusicBrainz
> Tags Dialog" - especially when recent Alberto's patch is applied.
>
> > <--- GSoC commences--->
> >
> > week 3: Make the directories and write the abstract classes (with
> > documentation).
> >
> > week 4: Polish, make cmakelists for, write make tests and compile the
> > written abstract classes.
> >
> > July-
> >
> > week 1: Re-write the MusicBrainz tag getter according to the framework
> with
> > better documentation
> >
> > week 2: Add the features (like using AcoustID) to this MusicBrainz tag
> > getter
> >
> > week 3: Compile and run the new MusicBrainz (displaying results in the
> old
> > UI itself)
> >
> > week 4: Write the new GUI code. Run MusicBrainz tag getter in this
> fashion
> >
> > <--- Mid term --->
> >
> > August-
> >
> > week 1: Write the Last.fm tag getter (as many features as possible)
> >
> > week 2: Finish writing and test the Last.fm tag getter
> >
> > week 3: Test the Last.fm and MusicBrainz tag getter together in the new
> GUI
> >
> > week 4: Make the tag getter wizard
> >
> > September-
> >
> > week 1: Test the new wizard. Write make tests to make sure that new tag
> > getters follow the set framework
> >
> > week 2: Improve documentation. Fix bugs that have been discovered over
> the
> > weeks
> >
> > <--- suggested “pens down” --->
> >
> > week 3: Fix Bugs, streamline and optimize the code.
> >
> > <--- firm “pens down” --->
> >
> > week 4: Do some code cleaning and fix bugs so that my project can be
> pushed
> > into the master branch.
> >
> > October-
> >
> > week 1: Improve documentation. Fix bugs.
>
> Vedant, I really think that implementing all the stuff you've outlined is
> unachievable, at least not in a code quality we'd expect. Please take some
> time to significantly reduce the amount of work you plan to do. Many of the
> ideas you have on top of the idea (published by devs) simply don't align
> with
> Amarok goals and/or are based on misunderstanding of how current tag
> guessing
> works.
>
> > Other Obligations: I have no other obligations. I can easily spent about
> 50
> > hours a week: around 8 hrs a day in slots of 2 to 3 hrs, one each in the
> > afternoon (10am-1pm), evening (4pm-6pm) and at night (10pm-1am), with
> some
> > hours less on weekends.
>
> Hehe, I don't think you need to be this detailed. :-)
>
> > Summer vacations will be going on till mid July.
> > Even after college starts, I will be following the same schedule for GSoC
> > coding. It is easily possible because very few classes are held in the
> > beginning of the semester and even if they do clash with my coding
> > time-period, the college teachers excuse GSoC students' absence from
> class.
> > Since I will have no other coursework obligations, I can continue to code
> > 50 hours a week (with a similar schedule) even up till September.
> >
> > About Me: I am currently in my second undergraduate year in National
> > Institute of Technology, Durgapur, India, studying Computer Science and
> > Engineering. I have experience coding experience with C/C++, Java
> > (including Android and making GUI using Java swing) and web services. I
> > have submitted 3 patches (Junior Jobs, one each for
> > Rekonq[1]<https://git.reviewboard.kde.org/r/107662/>and Amarok
> > [2] <https://git.reviewboard.kde.org/r/109283/> as well as a improved
> > formatting patch for Amarok[3] <
> https://git.reviewboard.kde.org/r/109295/>)
> > and 2 more are under review (both Junior Jobs for
> > Amarok[4]<https://git.reviewboard.kde.org/r/110101/>
> > [5] <https://git.reviewboard.kde.org/r/110082/>).
> >
> > I love coding for open source. I’m sure working with a mentor who is
> > virtually present won’t be a problem. I have interacted (mainly with
> Matej
> > a.k.a. Strohel on IRC) over IRC, the amarok mailing list, reviewboard and
> > also the KDE bug tracking system. During the work period code can easily
> be
> > shared via GitHub.
>
> We prefer personal scratch git repositories on KDE infrastructure for this.
>
> > If it is possible, meeting a mentor face to face will
> > obviously be much more helpful. I have worked with my college seniors in
> > this fashion who have introduced me to Linux and Open-Source. Countless
> > times I have been to their rooms for advice and to solve my problems.
> >
> > After GSoC I plan to become an active developer for Amarok and also for
> > other projects of KDE.
>
> As stated above, this looks to me like a way too much work to do. Perhaps
> you
> are trying to make the proposal more interesting, but I fear that the
> effect is
> the opposite - we'd actually appreciate if you wanted to implement much
> less
> things (ideally just what the initial idea was about), but make it
> interesting
> by level of insight, research and thought that has been put to it.
>
> Also please study/use the current tag guessing in Amarok more (and do have
> a
> look at Alberto's patch) to avoid a situation where you describe lack of
> Amarok feature while it is not the case.
>
> Regards,
>                 Matěj
> _______________________________________________
> Amarok-devel mailing list
> Amarok-devel at kde.org
> https://mail.kde.org/mailman/listinfo/amarok-devel
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.kde.org/pipermail/amarok-devel/attachments/20130430/d8f4f495/attachment-0001.html>


More information about the Amarok-devel mailing list