GSoC Proposal v2: Statistics synchronization for pluggable devices and Last.fm

Matěj Laitl matej at laitl.cz
Wed Mar 28 23:23:10 UTC 2012


Hi,
this is the second, improved version of my GSoC proposal; it incorporates 
suggestions and comments made by Bart, Teo, Markey, Sam and Christoph. Thanks 
for them, and keep them coming! I will upload an identical version to the GSoC 
site, too. 
Changes from v1:
 * made it clear that there will be abstraction for Last.fm access to support 
alternatives in future
 * provided more implementation details and how access to Last.fm will be 
carried out (ruling out any shadow collections)
 * detailed (hopefully enough) timeline (Hi Teo) :-)
 * made my school obligations clear
 * bragged a tiny little bit more in About me ;-)

Abstract
========
Amarok can maintain useful per-track play statistics and meta-data such as: 
play count, first/last played date, rating and labels; these are tracked for 
each collection separately. The first part of this project will implement 
statistics synchronization between all collections that support them (local, 
iPod). The second part will implement synchronization of statistics from 
scrobbling services such as Last.fm to Amarok (the other way around already 
partially works).

Contact
=======
Name: Matěj Laitl
Email Address: matej at laitl.cz
Freenode IRC Nick: strohel
IM Service and Username: XMPP: matej at laitl.cz
Location: Prague, Czech Republic, GMT+1

Motivation for Proposal:
========================
Amarok is able to store per-track play statistics such as play count, 
first & last played date, rating and labels. It then has powerful means to 
generate custom-tailored playlists based on gathered statistics (e.g.: play me 
what I've listened to last month) that many users like to exploit. This works 
well when your computer is the only device you play music from. A more likely 
situation is that you play music using Amarok at home, listen to an iPod while 
commuting and use Amarok or another music player at work. All three devices 
are able to keep track of what you've listened to, but each one sees only 
about a third of it, which makes Amarok statistics more or less useless. This 
project aims to remedy the situation. Last.fm is an on-line service that can 
keep track of the music a user listens to [1] and can help us with a part of 
this project. The Amarok Users group on Last.fm has over 23 000 users [2].

Project Goals:
==============
This project will implement:
 * track statistics synchronization between Amarok collections that support 
statistics; these are currently Local Collection and iPod Collection, but the 
framework will be general
 * revamp of the existing Last.fm scrobbling mechanism, including the addition 
of a thin abstraction layer to allow alternative implementations
 * Last.fm scrobbling from pluggable media players that support statistics 
(iPods, currently)
 * ability to synchronize labels and ratings (see caveats below) from Amarok 
to Last.fm
 * ability to synchronize play counts, first & last played date from Last.fm 
to Amarok collections (other way around is already implemented by scrobbling)
 * GUI dialogue for performing the synchronization/resolving conflicts

Bonus points (what will Amarok gain for free):
 * ability to synchronize statistics between Amarok and any other media player 
that scrobbles to Last.fm
 * track statistics backup through Last.fm
 * synchronization of Amarok and Nepomuk meta-data when Nepomuk collection is 
implemented

Caveats:
 * Last.fm has no concept of track ratings. This can, however, be worked around 
with special Last.fm-side labels such as "7/10 stars"
 * advanced features will initially be available only to Last.fm users; 
implementations dealing with alternative sites can be added in the future
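
The rating workaround from the first caveat could look roughly like the sketch 
below. The function names ratingToLabel and labelToRating are hypothetical 
(they exist neither in Amarok nor in liblastfm), and the 0 to 10 rating scale 
is an assumption taken from the "7/10 stars" example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helpers for illustration only; not part of Amarok or liblastfm.

// Encode a rating (assumed to be 0..10) as a label such as "7/10 stars".
std::string ratingToLabel(int rating)
{
    assert(rating >= 0 && rating <= 10);
    return std::to_string(rating) + "/10 stars";
}

// Decode a label back into a rating. Returns false (and leaves 'rating'
// untouched) when the label is an ordinary, non-rating label.
bool labelToRating(const std::string &label, int &rating)
{
    static const std::string suffix = "/10 stars";
    if (label.size() <= suffix.size())
        return false;
    const std::size_t digitsLen = label.size() - suffix.size();
    if (label.compare(digitsLen, suffix.size(), suffix) != 0)
        return false;

    int value = 0;
    for (std::size_t i = 0; i < digitsLen; ++i) {
        const char c = label[i];
        if (c < '0' || c > '9')
            return false;
        value = value * 10 + (c - '0');
        if (value > 10)
            return false; // out of range, treat as an ordinary label
    }
    rating = value;
    return true;
}
```

Labels that do not follow the pattern (for example "rock") are left alone, so 
ordinary user labels and rating labels can coexist on the Last.fm side.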

Implementation Details:
=======================
Amarok represents audio tracks by the abstract C++ class Meta::Track, which 
provides getter methods for meta-data (title, artist, album...) and 
getter/setter methods for statistics (rating, play count...). These tracks are 
grouped into so-called Collections, where each Collection represents one 
source of songs (iPod, Local, USB Mass Storage...). Tracks from different 
collections will be matched using their meta-data, with the other 
collection's QueryMaker performing the search. Moreover, iPods provide 
additional data that can be used for conflict resolution: app_rating and 
recent_playcount [3]. I plan to expose these as a new capability offered by 
Collections. This capability will also be used to implement Last.fm scrobbling 
from iPods (in fact, from every collection that supports the capability), 
exploiting the recent_playcount field in the iPod case. It should be noted 
that I already implemented a similar synchronization in my spare time in 
summer 2011 [4], but I was not satisfied with its iPod-specific design and 
GUI, so I decided not to strive for its inclusion. I do, however, have the 
code working and ported to Amarok 2.5, so it can be used to jump-start this 
project.
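
To illustrate the meta-data matching, here is a minimal stand-alone sketch. It 
deliberately avoids Amarok's real classes: TrackInfo is a simplified stand-in 
for Meta::Track, and matchKey/matchTracks are hypothetical names. The real 
implementation would ask the other collection's QueryMaker instead of building 
an in-memory index, and case-insensitive comparison of artist, album and title 
is just one possible matching rule.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <cstddef>
#include <map>
#include <string>
#include <tuple>
#include <utility>
#include <vector>

// Simplified stand-in for Meta::Track; field names are illustrative only.
struct TrackInfo {
    std::string artist, album, title;
    int playCount;
};

typedef std::tuple<std::string, std::string, std::string> MatchKey;

// Lower-case copy of an (ASCII) string, used to compare case-insensitively.
static std::string lowered(std::string s)
{
    std::transform(s.begin(), s.end(), s.begin(), [](unsigned char c) {
        return static_cast<char>(std::tolower(c));
    });
    return s;
}

MatchKey matchKey(const TrackInfo &t)
{
    return MatchKey(lowered(t.artist), lowered(t.album), lowered(t.title));
}

// Returns pairs of indices (into a, into b) of tracks sharing a match key.
std::vector<std::pair<std::size_t, std::size_t>>
matchTracks(const std::vector<TrackInfo> &a, const std::vector<TrackInfo> &b)
{
    std::map<MatchKey, std::size_t> index;
    for (std::size_t i = 0; i < a.size(); ++i)
        index.insert(std::make_pair(matchKey(a[i]), i)); // first one wins

    std::vector<std::pair<std::size_t, std::size_t>> matches;
    for (std::size_t j = 0; j < b.size(); ++j) {
        std::map<MatchKey, std::size_t>::const_iterator it =
            index.find(matchKey(b[j]));
        if (it != index.end())
            matches.push_back(std::make_pair(it->second, j));
    }
    assert(matches.size() <= b.size()); // each track of b matches at most once
    return matches;
}
```

Tracks without a counterpart are simply absent from the result; duplicates 
within one collection would need extra handling that this sketch skips.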

Another interesting note is that scrobbling from iPod to Last.fm was 
functional in the Amarok 1.4 days, but the feature got dropped during the 
rewrites leading to 2.0, so this will fix one long-overdue regression.
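
As a rough sketch of how recent_playcount could serve both conflict resolution 
and scrobbling from a device: assuming the field counts plays made on the 
device since the last synchronization, those plays can be added to the stored 
play count and, at the same time, tell us how many scrobbles are still owed. 
All type and function names below are hypothetical and do not correspond to 
existing Amarok classes.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical shape of the data a recent-statistics capability could expose.
struct RecentStats {
    int recentPlayCount;   // plays on the device since the last sync (assumed)
    long long lastPlayed;  // seconds since the Unix epoch, 0 if unknown
};

// Statistics already stored for the matching track in another collection.
struct StoredStats {
    int playCount;
    long long lastPlayed;
};

struct SyncOutcome {
    StoredStats updated;   // statistics to store after the synchronization
    int playsToScrobble;   // plays not yet reported to a scrobbling service
};

SyncOutcome applyRecentStats(const StoredStats &stored, const RecentStats &recent)
{
    assert(recent.recentPlayCount >= 0);
    SyncOutcome out;
    // Recent plays are new information, so they are added, not compared.
    out.updated.playCount = stored.playCount + recent.recentPlayCount;
    out.updated.lastPlayed = std::max(stored.lastPlayed, recent.lastPlayed);
    out.playsToScrobble = recent.recentPlayCount;
    return out;
}
```

Only the time of the last play is known in this model, so timestamps for the 
earlier recent plays would have to be estimated when they are scrobbled.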

Speaking of Last.fm integration, Last.fm provides a rather nice RESTful API 
[5], a subset of which Amarok already uses through the liblastfm [6] library 
to submit (scrobble) currently played songs. I plan to introduce an 
abstraction layer, a class called ScrobblingService, with an interface to 
scrobble tracks and an optional interface to query scrobbled tracks. The 
existing Last.fm code will be adapted and extended to become an 
implementation of this interface; the Last.fm API is powerful enough to 
support all the features claimed above.
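
A first draft of the abstraction could look like the sketch below. Only the 
class name ScrobblingService comes from this proposal; the method names, the 
ScrobbledTrack struct and the in-memory implementation are assumptions made 
for illustration. The real code would use Qt types and Meta::Track, and the 
Last.fm implementation would talk to the web service through liblastfm.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Illustrative record of what a service knows about one track.
struct ScrobbledTrack {
    std::string artist, title;
    int playCount;
};

class ScrobblingService {
public:
    virtual ~ScrobblingService() {}

    // Mandatory part: report one play of a track.
    virtual void scrobble(const std::string &artist, const std::string &title,
                          long long playedAt) = 0;

    // Optional part: services that cannot be queried keep these defaults.
    virtual bool canQuery() const { return false; }
    virtual std::vector<ScrobbledTrack>
    tracksByArtist(const std::string & /*artist*/) const
    {
        return std::vector<ScrobbledTrack>();
    }
};

// Toy implementation that keeps scrobbles in memory; handy for unit tests.
class InMemoryScrobblingService : public ScrobblingService {
public:
    void scrobble(const std::string &artist, const std::string &title,
                  long long playedAt) override
    {
        assert(playedAt >= 0); // seconds since the Unix epoch
        ++m_plays[std::make_pair(artist, title)];
    }

    bool canQuery() const override { return true; }

    std::vector<ScrobbledTrack>
    tracksByArtist(const std::string &artist) const override
    {
        std::vector<ScrobbledTrack> result;
        for (const auto &entry : m_plays) {
            if (entry.first.first == artist) {
                ScrobbledTrack t = {artist, entry.first.second, entry.second};
                result.push_back(t);
            }
        }
        return result;
    }

private:
    std::map<std::pair<std::string, std::string>, int> m_plays;
};
```

Keeping the query part optional means a scrobble-only alternative service can 
implement just scrobble(), while the synchronizer checks canQuery() before 
asking for play counts.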

It should be noted that there is already a Last.fm on-line service Collection 
in Amarok, but it focuses just on playing Last.fm radio streams and won't be 
touched in this project. In order to implement synchronization between Last.fm 
and Amarok, the synchronizer will contain a special case for 
ScrobblingService, adapted to the constraints of a web service.

The plan is for the actual synchronizer to be N-way (i.e. to synchronize N 
collections at once) and to work in per-artist chunks, mainly because this 
allows efficient use of the Last.fm API [5], a part of which will be 
mimicked by ScrobblingService.
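
The N-way, per-artist idea can be sketched as follows. Everything here is a 
simplification under stated assumptions: Track and Stats are plain structs 
rather than Amarok classes, tracks are matched by exact artist and title, and 
the merge policy (highest play count, earliest first-played date, latest 
last-played date) is only one naive choice that the real conflict resolution 
would refine.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

struct Stats {
    int playCount;
    long long firstPlayed; // seconds since the Unix epoch, 0 = never/unknown
    long long lastPlayed;  // seconds since the Unix epoch, 0 = never/unknown
};

struct Track {
    std::string artist, title;
    Stats stats;
};

typedef std::vector<Track> Collection;
// artist -> (title -> merged statistics); one inner map is one "chunk".
typedef std::map<std::string, std::map<std::string, Stats>> ArtistChunks;

// Naive merge policy (an assumption of this sketch, see the text above).
Stats mergeStats(const Stats &a, const Stats &b)
{
    assert(a.playCount >= 0 && b.playCount >= 0);
    Stats m;
    m.playCount = std::max(a.playCount, b.playCount);
    if (a.firstPlayed == 0)
        m.firstPlayed = b.firstPlayed;
    else if (b.firstPlayed == 0)
        m.firstPlayed = a.firstPlayed;
    else
        m.firstPlayed = std::min(a.firstPlayed, b.firstPlayed);
    m.lastPlayed = std::max(a.lastPlayed, b.lastPlayed);
    return m;
}

// Walk all N collections once and merge statistics artist by artist.
ArtistChunks mergePerArtist(const std::vector<Collection> &collections)
{
    ArtistChunks chunks;
    for (const Collection &c : collections) {
        for (const Track &t : c) {
            std::map<std::string, Stats> &chunk = chunks[t.artist];
            std::map<std::string, Stats>::iterator it = chunk.find(t.title);
            if (it == chunk.end())
                chunk.insert(std::make_pair(t.title, t.stats));
            else
                it->second = mergeStats(it->second, t.stats);
        }
    }
    return chunks;
}

// Write the merged statistics back to every collection that has the track.
void applyMerged(const ArtistChunks &chunks, std::vector<Collection> &collections)
{
    for (Collection &c : collections) {
        for (Track &t : c) {
            ArtistChunks::const_iterator artist = chunks.find(t.artist);
            if (artist == chunks.end())
                continue;
            std::map<std::string, Stats>::const_iterator merged =
                artist->second.find(t.title);
            if (merged != artist->second.end())
                t.stats = merged->second;
        }
    }
}
```

Because each chunk covers exactly one artist, a web service would only need to 
be asked once per artist rather than once per track.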

The GUI will be implemented as a new window using the Model/View pattern as 
supported by Qt's QAbstractItemModel and QListView; custom delegates will 
probably be needed to display in-line controls for conflict resolution.

Tentative Timeline:
===================
week 1: Core classes (StatsSynchronizer, TrackMatcher probably) with stub 
implementations created; matching tracks from different collections by meta-
data
week 2: First iteration of the GUI; will be ugly but will show matched tracks
week 3: TrackSynchronizer class implemented to actually synchronize meta-data 
of N tracks; at this stage unable to cope with conflicts
week 4: tracks being updated and conflicts shown in the GUI; basic conflict 
resolution in the GUI ("this collection wins")
week 5: RecentStatsCapability introduced and implemented for iPod collection, 
plugged into StatsSynchronizer to aid with conflict resolution
week 6: Introduction of the ScrobblingService class, existing Last.fm 
implementation turned to implementation of this class (just scrobbling at this 
time, no support for querying Last.fm data)
week 7: making the GUI beautiful, more convenient conflict resolution (a 
button to set a "master" collection, with a way to override this for 
individual tracks)

******: mid-term evaluation: inter-collection synchronization should be 
working at this time

week 8: LastfmScrobblingService extended to be able to query user's Last.fm 
library for tracks she played, their playcounts, labels
week 9: Basic ScrobblingService support in StatsSynchronizer: ability to match 
collection tracks with remote Last.fm tracks
week 10: LastfmScrobblingService extended to be able to set track labels; 
track rating getting and setting implemented using special labels
week 11: Actual synchronization with ScrobblingServices in StatsSynchronizer; 
may support just one instance of StatsSynchronizer if supporting multiple at 
the same time proves tricky
week 12: final touches to the GUI, improving usability, progress bars for 
longer-running operations; scrobbling from just-plugged devices that offer 
RecentStatsCapability.

*******: suggested pencils-down date

week 13: resolving any remaining issues, testing the code for various cases 
(ability to use Last.fm as a statistics data backup in case of their loss 
etc.)

*******: hard pencils-down

week 14: preparing the code (stored in a git branch) for inclusion into 
mainline, proof-reading it, stripping debugging statements
week 15: review request sent, resolving any possible remarks, ended by 
inclusion into Amarok master hopefully.

Do you have other obligations from late May to early August?
============================================================
If accepted, GSoC will be my main commitment during the summer. I plan to have 
a week-long vacation and a few 3-day trips, but I'm used to working on open 
source projects during weekends, so the vacation will be compensated for. The 
first 6 weeks of the GSoC coding period coincide with my university 
examination period, but I have just 2 real exams this semester and coding is 
my favourite excuse-not-to-study, so I expect it won't hamper my productivity.

About Me:
=========
I'm a 24-year-old student of mathematical informatics from Prague, Czech 
Republic. I've been passionate about FLOSS since high school and I've recently 
started contributing to a couple of projects (mainly KDE-related), most 
notably Amarok, where I have been fixing various bugs since last autumn [7]. 
I have also recently rewritten the iPod collection from scratch [8], as 
suggested by Amarok's Bart Cerneels; I plan to submit a review request for it 
in the coming weeks. I've also been particularly active on KDE's bugzilla 
(where I've commented on more than 300 bugs [9]) and reviewboard [10]. I know 
C, C++, Python, Java, a bit of French (pun intended) and some other less 
relevant languages. Thanks to my work on Amarok I have some experience in GUI 
programming with Qt & KDE libs.

I've chosen Amarok because I fell in love with it last year, and statistics 
synchronization because it is an area where I found Amarok a bit lacking; I'm 
a music enthusiast who dislikes listening to the same song twice in a week and 
I believe many Amarok users are just as picky and will therefore benefit from 
this work.

[1] http://www.last.fm/user/strohel
[2] http://www.last.fm/group/Amarok+Users
[3] http://www.gtkpod.org/libgpod/docs/libgpod-Tracks.html#Itdb-Track
[4] http://mail.kde.org/pipermail/amarok/2011-June/032736.html
[5] http://www.last.fm/api/intro
[6] https://github.com/mxcl/liblastfm/
[7] https://www.ohloh.net/accounts/strohel
[8] http://goo.gl/B3Odu
[9] http://goo.gl/afJN3
[10] https://git.reviewboard.kde.org/r/104307/

