GSoC application review - reimplementing personal metadata importers

Konrad Zemek konrad.zemek at gmail.com
Mon Apr 29 20:52:18 UTC 2013


I've made a few changes since I last sent my proposal for review.
Unfortunately, because the proposal is already on the Google Melange
website, it's now in HTML, so I'm no longer maintaining the contents in
a git repository and there's no change log this time. :-(

A few minor changes were made, like fixing some typos, changing "in
review" to "merged" and fixing the timeline (thanks again, Matěj!). I
also added a GUI section to "Implementation Details", with a few early
mockups to show in which direction I'm going; of course that's all
subject to change.

I should be able to prepare some design mockups by the submission
deadline, though I must say I'm a bit wary of this, as I'm almost
certain that I would come up with something much better - and possibly
much different - in the period between the application deadline and the
"coding start" date.

So below is the current incarnation of my proposal. I hope you'll
forgive me the HTML - I'm pasting it exactly as it is on Google Melange.
I tried to keep it as basic as possible, while still being correct and
nicely formatted.

*Name:* Konrad Zemek
*Email Address:* konrad.zemek at gmail.com
*Freenode IRC Nick:* kzemek
*IM Service and Username:* Skype, handle: konrad-kun
*Location:* Kraków, Poland. GMT+2 during the Summer.


      Proposal Title

Reimplement Amarok 1.4 (FastForward) & iTunes importers on top of 
Statistics Synchronization framework, and add Amarok 2.x and Rhythmbox 
as synchronization targets.


      Motivation for Proposal / Goal

Currently, Amarok has the ability to import personal track metadata, 
like playcounts, ratings and last played time, from Amarok 1.4 and 
iTunes databases. Although this code still fulfills its purpose, the 
StatSyncing framework has since been added to the codebase and serves as 
a perfect base for a rewrite of the importers. Perhaps the most
significant advantage of implementing importers on top of StatSyncing
is the ability to resynchronize information and to work with
collections other than the local one. Still, synchronization with
Amarok 1.4 and iTunes covers only a small portion of the users who may
be willing to use Amarok 2.x but are tied to their current music player
because switching would mean losing their personal metadata. To
alleviate this, I intend to add Rhythmbox as a new target for
synchronization. It stores the following personal track metadata
compatible with Amarok:

  * user rating
  * playcount
  * last played time

Rhythmbox is GNOME's default music player and, since Ubuntu 12.04, the
music player shipped by default in Ubuntu. I feel that this is an
important target for metadata synchronization, easing not only the
transition between music players, but between whole desktop environments
- GNOME and KDE. To complete the set, synchronization with Amarok 2.x
will be added, so that a user can synchronize their personal track
metadata between separate Amarok databases.


      Implementation Details

This project breaks down into a few parts with well-defined deliverables
(details about the timeline are in the Tentative Timeline section).

At a high level, the StatSyncing framework consists of the following
elements:

  * Controller, which registers providers and starts synchronization
  * Config, which stores preferences for statistics synchronization and
    can be set through the UI
  * Provider, which works with the backend and provides StatSyncing::Tracks
  * Track, which is an abstraction of a track and contains relevant
    metadata used for track matching, as well as methods to modify it
  * Process, which is responsible for matching the tracks between
    collections and synchronizing them
  * ScrobblingService, which is used for scrobbling capabilities
    (transferring information about played tracks; see [1] <#ref1>)
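
A minimal sketch of how a Provider, its Tracks and the Controller relate.
All classes here are simplified stand-ins written for illustration; the
names InMemoryProvider, providerCount() and provider(), and the reduced
Track fields, are assumptions and not the real StatSyncing API.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Simplified stand-ins for illustration; not the real StatSyncing classes.
struct Track {
    std::string artist;
    std::string title;
    int rating;     // 0 means "not rated"
    int playCount;
};

// A Provider wraps one backend and hands out the tracks it knows about.
class Provider {
public:
    virtual ~Provider() = default;
    virtual std::string id() const = 0;
    virtual std::set<std::string> artists() const = 0;
    virtual std::vector<Track> artistTracks(const std::string &artist) const = 0;
};

// Hypothetical provider backed by a plain vector, standing in for an importer.
class InMemoryProvider : public Provider {
public:
    InMemoryProvider(std::string id, std::vector<Track> tracks)
        : m_id(std::move(id)), m_tracks(std::move(tracks)) {}

    std::string id() const override { return m_id; }

    std::set<std::string> artists() const override {
        std::set<std::string> names;
        for (const Track &track : m_tracks)
            names.insert(track.artist);
        return names;
    }

    std::vector<Track> artistTracks(const std::string &artist) const override {
        std::vector<Track> matching;
        for (const Track &track : m_tracks)
            if (track.artist == artist)
                matching.push_back(track);
        return matching;
    }

private:
    std::string m_id;
    std::vector<Track> m_tracks;
};

// The Controller in this sketch only registers providers, keyed by their id.
class Controller {
public:
    void registerProvider(std::shared_ptr<Provider> provider) {
        assert(provider != nullptr);
        const std::string key = provider->id();
        m_providers[key] = std::move(provider);
    }

    std::size_t providerCount() const { return m_providers.size(); }

    // Returns nullptr when no provider with the given id is registered.
    const Provider *provider(const std::string &id) const {
        auto it = m_providers.find(id);
        return it == m_providers.end() ? nullptr : it->second.get();
    }

private:
    std::map<std::string, std::shared_ptr<Provider>> m_providers;
};
```

In this picture each importer would be one more Provider subclass, and
the matching and synchronizing done by Process would only ever talk to
the abstract interface.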

The project can be broken into three parts: reimplementation of existing
importers (read-only), introduction of new importers (read-only) and
finally, as an optional (soft) goal, making the synchronization work in
read-write mode (so that foreign statistics can be updated as well as
local ones).


        Importer tests

There are no tests for importing in the current trunk. Since a big part 
of this project consists of refactoring, I will have to prepare a 
quality suite of tests to make sure that I introduce no regressions.

  * Implement the whole test suite using QTest and gmock best practices
  * Test cases would involve using a dummy input, invoking the importer
    and verifying the resulting collection.
  * The actual logic behind importing tracks will be changed, so these
    tests must treat the importer as a black box (black box integration
    tests)
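
The shape of such a black-box test can be sketched as follows. This is
only an illustration in plain C++ with assert() rather than QTest; the
importStatistics() function, the Collection type and the
"title;playcount" dummy format are all made up for the sketch.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>

// Hypothetical collection: track title -> play count.
typedef std::map<std::string, int> Collection;

// Hypothetical importer under test. The dummy format "title;playcount"
// (one track per line) is invented for this sketch.
void importStatistics(std::istream &input, Collection &collection) {
    std::string line;
    while (std::getline(input, line)) {
        const std::string::size_type separator = line.find(';');
        if (separator == std::string::npos)
            continue;  // skip lines without a separator
        collection[line.substr(0, separator)] =
            std::stoi(line.substr(separator + 1));
    }
}

// Black-box test: feed dummy input, invoke the importer, verify the
// collection. Nothing here depends on how the importer works inside.
void testImporterFillsCollection() {
    std::istringstream dummyInput("Song 1;3\nSong 2;7\n");
    Collection collection;
    importStatistics(dummyInput, collection);
    assert(collection.size() == 2);
    assert(collection["Song 1"] == 3);
    assert(collection["Song 2"] == 7);
}
```

Because the test only touches the entry point and the resulting
collection, it should keep passing unchanged while the importing logic
behind it is rewritten.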


        Reimplementing importers

  * Reimplement importers to use StatSyncing
  * Data-reading modules will be cleaned up and factored out into
    their own classes that will be used by a subclass of
    StatSyncing::Provider
  * Current services using StatSyncing framework will be used as a
    reference (primarily last.fm service)
  * At each step where the code is functional (at least between
    reimplementing FastForwardImporter and ITunesImporter), supplement
    the tests so that new capabilities are accounted for (e.g. make sure
    that foreign playcounts are not simply stacked on locally saved
    playcounts with each synchronization)
  * Additionally, unit tests can now be written to confirm correct
    implementation
  * The importer code may prove to be repetitive (in the extreme case
    the only thing differentiating these importers will be their
    respective parsers); this code will be deduplicated. If it fits its
    responsibilities, a parent class of importers should contain the
    duplicated logic, otherwise I will introduce new helper classes.
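
The playcount concern above can be illustrated with a small sketch. The
merge rule used here (keep the larger playcount, fill in a missing
rating) is an assumption chosen for the example, not necessarily the
rule StatSyncing applies; the Stats struct and both functions are
hypothetical.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical reduced statistics of one matched track.
struct Stats {
    int playCount;
    int rating;  // 0 means "not rated"
};

// Assumed merge rule: keep the larger playcount and only fill in a
// rating when the local one is unset. Repeating it changes nothing.
Stats synchronize(const Stats &local, const Stats &foreign) {
    assert(local.playCount >= 0 && foreign.playCount >= 0);
    Stats merged = local;
    merged.playCount = std::max(local.playCount, foreign.playCount);
    if (merged.rating == 0)
        merged.rating = foreign.rating;
    return merged;
}

// The naive one-shot import rule: foreign playcounts are stacked on the
// local ones, so every repeated run inflates the result.
Stats naiveImport(const Stats &local, const Stats &foreign) {
    Stats merged = local;
    merged.playCount = local.playCount + foreign.playCount;
    return merged;
}
```

A regression test for resynchronization would then run the
synchronization twice against the same foreign data and check that the
second run leaves the local statistics unchanged.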


        GUI

Because of the new capabilities, importers will be moved from the Local
Collection -> Import tab to the Metadata tab (where the Statistics
Synchronization settings are located). Because these targets may be
added at will and multiple times (e.g. a user may want to synchronize
with several Amarok installations on different machines, with database
files available through NFS shares), an "Add" button will be added to
enable adding new synchronization targets. Also, because database files
can change URLs, a "Configure" button will be added to modify existing
settings. I prepared early mockups of the modified config UI [8]
<#config_metadata_tab>, of the "Add Synchronization Target" dialog [9]
<#add_synchronization_target_dialog> and of an example "Add Rhythmbox
Synchronization Target" dialog [10]
<#add_rhythmbox_synchronization_target_dialog>.


        Implement Rhythmbox and Amarok 2.x importers

Rhythmbox stores its personal metadata in an XML file, which should be
easy to read with the QXml SAX parser. When in doubt about the format, I
will consult the Rhythmbox source code.

  * The Rhythmbox parser shall be robust; that is, it shouldn't depend
    on the order in which information appears in the file (this is
    obvious, but important)
  * At this point both Amarok 1.4 and iTunes importers are rewritten, so
    their implementations will be used as a reference for Amarok 2.x and
    Rhythmbox importers.
  * The subgoal is to make sure that adding additional import targets is
    straightforward and easy, and if possible doesn't introduce code
    duplication (this may involve modifying the StatSyncing framework).
  * As with reimplementing importers, every piece of working code should
    be tested to ensure that it is performing correctly, and so that
    regressions don't occur in the future.
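
Order independence can be sketched by dispatching on the element name
instead of relying on position, much as a SAX handler would. The sketch
below is an assumption-laden illustration: the element names (title,
rating, play-count, last-played) are my understanding of the Rhythmbox
database format and would need checking against its source, and the
regular expression is only a stand-in for a real XML parser such as the
QXml SAX parser.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical result of reading one Rhythmbox entry.
struct EntryStats {
    std::string title;
    int rating;
    int playCount;
    long long lastPlayed;  // Unix timestamp; 0 if never played
};

// Reads the statistics of a single entry. Each simple child element is
// handled by name, so the order of elements does not matter and missing
// elements keep their default of 0.
EntryStats parseEntry(const std::string &entryXml) {
    EntryStats stats = {std::string(), 0, 0, 0};
    // Stand-in for a SAX parser: matches <name>text</name> without nesting.
    static const std::regex element("<([^<>/ ]+)>([^<]*)</\\1>");
    for (std::sregex_iterator it(entryXml.begin(), entryXml.end(), element), end;
         it != end; ++it) {
        const std::string name = (*it)[1].str();
        const std::string text = (*it)[2].str();
        if (text.empty())
            continue;  // nothing to read from an empty element
        if (name == "title")
            stats.title = text;
        else if (name == "rating")
            stats.rating = std::stoi(text);
        else if (name == "play-count")
            stats.playCount = std::stoi(text);
        else if (name == "last-played")
            stats.lastPlayed = std::stoll(text);
        // Unknown elements are ignored on purpose.
    }
    return stats;
}

// Two orderings of the same entry must yield the same statistics.
void checkOrderIndependence() {
    const EntryStats a = parseEntry(
        "<entry type=\"song\"><title>Song</title><rating>4</rating>"
        "<play-count>7</play-count></entry>");
    const EntryStats b = parseEntry(
        "<entry type=\"song\"><play-count>7</play-count>"
        "<rating>4</rating><title>Song</title></entry>");
    assert(a.title == b.title);
    assert(a.rating == b.rating);
    assert(a.playCount == b.playCount);
}
```

A test along the lines of checkOrderIndependence() is the kind of case
that would go into the importer test suite for Rhythmbox.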


      Tentative Timeline


        MIDTERM EVALUATION (deadline: August 02)

The hard goal for midterm evaluation is to have both FastForward and 
iTunes importers reimplemented, and if possible (soft goal) unit-tested.


        FINAL EVALUATION (suggested: September 16, deadline: September 23)

The hard goal for final evaluation is to have all importers implemented 
and working. The slightly softer goal is to have them unit-tested. The 
soft goal is to implement two-way synchronization.

 1. /June 17 - 23:/

    Drafting up test cases for FastForward and iTunes importers.

 2. /June 24 - 30:/

    Implementing the initial test suite for importers (exam week).

 3. /July 01 - 07:/

    Reimplementing FastForward importer, cleaning up data-reading
    module. At the end of the week I expect to have a working
    implementation of FastForwardProvider.

 4. /July 08 - 14:/

    Reimplementing FastForward importer. At the end of the week I expect
    to have a functional importer, possibly with some quirks to iron
    out. Implementing unit tests to assert correctness.

 5. /July 15 - 21:/

    Reimplementing iTunes importer, cleaning up data-reading module. As
    with FastForward, at the end of the week the ITunesProvider should
    be working.

 6. /July 22 - 28:/

    Reimplementing iTunes importer. As with FastForward importer, after
    this period the importer should be functional. Preparing GUI,
    implementing unit tests.

 7. /July 29 - August 04:/ *MIDTERM EVALUATION*

    Ironing out last issues with reimplemented importers, submitting
    current work for midterm evaluation.

 8. /August 05 - 11:/

    In-depth research of Rhythmbox metadata format, implementing
    Rhythmbox importer. I do not expect the RhythmboxProvider to be
    fully functional at the end of the week.

 9. /August 12 - 18:/

    Implementing Rhythmbox importer. RhythmboxProvider should be working
    at the start of the week, and the whole synchronization should work
    at the end of the week. Implementing unit tests.

10. /August 19 - 25:/

    Implementing Amarok 2.x importer. As with FastForward and iTunes, at
    the end of the week the AmarokProvider should be functional.

11. /August 26 - September 01:/

    Implementing Amarok 2.x importer. Ironing out any problems with
    AmarokProvider. At the end of the week one-way import should be
    working, along with the GUI. Implementing unit tests.

12. /September 02 - 08:/

    Investigating and implementing two-way synchronization. I expect at
    least one working implementation of two-way synchronization at the
    end of the week.

13. /September 09 - 15:/

    Implementing two-way synchronization for remaining importers,
    writing remaining tests.

14. /September 16 - 22:/ *FINAL EVALUATION*

    Ironing out bugs, writing remaining tests. Cushion period in case of
    delays. Submitting work for final evaluation.

15. /September 23 - 29:/

    Ironing out possible quirks, polishing soft goals in case they were
    not fully finished for the final evaluation.


      Obligations from late May to early August

I have school until very early July, with exams starting in the last
week of June. I will be putting my job on hold for the whole period of
Google Summer of Code, but up until mid-June I may still be required to
commit at least a few hours a week to it. So from mid-June to late
September I have no commitments except for school, and from early July
to late September no commitments at all.


      About me

I'm a student at AGH University of Science and Technology, Kraków, 
Poland. I study Computer Science, and I'm currently in my second year. 
At the moment I'm employed as a part time C++ programmer at X-Formation 
[2] <#ref2>, Poland, where my work ranges from conducting interviews 
with potential employees, through refactoring and bugfixing to 
implementing new features for our product.

I'm a big fan of open source, and of KDE in particular. I've been using
Amarok a lot and have not only read the source, but contributed to it
[3] <#ref3>[4] <#ref4>. I'm currently working on more changes to
Amarok's codebase. I am also familiar with the Qt framework and have
used it to prototype several small applications. I'm an early adopter,
and my fiddling with Qt 5 beta has led me to uncover an important bug
[5] <#ref5>. I also have source-level experience with the Boost
libraries.

I'm very skilled in C++. Other than that I program mainly in Scala, and
Java when required (and my university requires that a lot). I'm
experienced with test-driven development, continuous integration, code
reviews and other agile techniques. I have work experience with both SVN
and git, with the latter being my VCS of choice for personal projects. I
also hold the title of finalist of the Polish Olympiad in Informatics, a
prestigious annual programming competition for Polish high school
students [6] <#ref6> (about the Olympiad; Google translate).

If my application is accepted, I am going to commit my full time to
complete the project, including putting my job on hold for the duration
of GSoC. I expect to be working 40 hours a week. I am comfortable with
working independently under a supervisor living on the other side of the
globe. Although I have not worked in this style before, I foresee no
problems with this and can even change my working hours if needed. I
think that code reviews and email contact are the best way of tracking
progress in this case. I am fluent in English, so communication
shouldn't be a problem. Of course, I would work equally well with a
local mentor with whom I could discuss issues over a coffee.

During the bonding period I expect to complete some more tasks for 
Amarok, possibly continuing what I started with [3] <#ref3> (in review 
comments I mentioned streamlining track sorting and making it consistent 
across the source code). I will also use this time to thoroughly prepare 
for the coding period.

I see GSoC as a way to kickstart my involvement in open source
development at large. After the end of the Google Summer of Code program
I hope to remain in the Amarok community as a frequent contributor.

This proposal has been discussed on the Amarok-devel mailing list. [7]
<#ref7>


      Junior job link

[3] <#ref3>


      References

[1] http://www.last.fm/help/faq?category=Scrobbling
[2] http://www.x-formation.com/
[3] https://git.reviewboard.kde.org/r/110070/
[4] https://git.reviewboard.kde.org/r/110139/
[5] https://bugreports.qt-project.org/browse/QTBUG-26832
[6] 
http://translate.google.com/translate?js=n&sl=pl&tl=en&u=http://www.oi.edu.pl/l/35/
[7] http://mail.kde.org/pipermail/amarok-devel/2013-April/011946.html


      Images

[8] http://student.agh.edu.pl/~zemek/amarok/config_metadata_tab.png
[9] http://student.agh.edu.pl/~zemek/amarok/add_synchronization_target_dialog.png
[10] http://student.agh.edu.pl/~zemek/amarok/add_rhythmbox_synchronization_target_dialog.png

