GSoC application review - reimplementing personal metadata importers

Matěj Laitl matej at laitl.cz
Sat Apr 20 17:04:35 UTC 2013


On 20. 4. 2013 Konrad Zemek wrote:
> Hi fellow developers,

Hi,

> For example I have experience with Boost, and
> submitted a bug report recently with a proposed fix, which was
> promptly integrated into upstream. On the one hand that's experience
> with developing and open source, on the other hand Amarok has nothing
> to do with Boost. Should I add that?

Perhaps just a sentence.

> Motivation for Proposal / Goal:
> 	Currently, Amarok has the ability to sync personal track metadata
> 	with Amarok 1.4 and iTunes databases.

Not true, just one-way import. You should probably mention what these 
"personal data" are, it may not be clear to someone new to Amarok.

> 	Although this code still
> 	fulfills its purpose, the StatSyncing framework has since been added
> 	to the codebase, and serves as a perfect base for a rewrite of
> 	importers. Perhaps the most significant advantage implementation
> 	on top of StatSyncing offers over the current one, is the ability
> 	to resynchronize information and to work with collections other
> 	than local.

English probably uses less commas than our Slavic languages; *of* an 
implementation?

> 	Still, synchronization with Amarok 1.4 and iTunes covers only
> 	a small portion of users that may be willing to use Amarok 2.x,
> 	but because of the consequences of losing their personal metadata,
> 	are tied to their current music player. To alleviate this, I intend
> 	to add Rhythmbox as a new target for synchronization.

What type of statistics that would be compatible with Amarok Rhythmbox 
maintains? Please research and mention it in the proposal.

> 	Rhythmbox
> 	it the default GNOME's music player, and since Ubuntu 12.04, the
> 	music player shipped by default in Ubuntu. I feel that this is an
> 	important target for metadata synchronization, easing not only the
> 	transition between music players, but between whole desktop
> 	environments - GNOME and KDE.

Right.

> 	To complete the set, synchronization with Amarok 2.x will be added,
> 	so that user can synchronize his personal track metadata between
> 	separate Amarok databases.

Good.

> Implementation Details:
> 	This project breaks up in a few parts with well-defined deliverables
> 	(details about timeline are in Tentative Timeline section).
> 
> 	* Importer tests
> 		There are either no tests for importing in the current trunk,
> 		or they are in a wrong place. Regardless, since a big part of
> 		this project consists of refactoring, I will have to prepare
> 		a quality suite of tests to make sure that I introduce no
> 		regressions.

I think there are no importer tests, please check it and then be more certain 
in the proposal.

> 		- Review current importer tests (if any)

Better to check before submitting the proposal.

> 		- Implement whole test suite using QTest and gmock best
> 		practices
> 		- The main test cases would involve using a dummy input,
> 		invoking importer and verifying collection.
> 		- For best test results, at least the collection class may
> 		be mocked

I think you don't know yet what you'll have to mock, so better be more 
general.

I think there could be 2 types of tests: unit-test that would check whether 
you implement the trackprovider API correctly and return tracks with right 
metadata (when given a pre-filled database to import), and integration tests 
(perhaps optional if time permits?), which would actually run the 
synchronization and check the result. (problem: interactivity)

> 	* Reimplementing importers
> 		- Reimplement importers to use StatSyncing
> 		- Data-reading modules, after needed changes, would become track
> 		providers

Using fully qualified class name would clear any doubt from what you mean by 
"track providers". (because we have multiple in Amarok)

> 		- Current services using StatSyncing framework will be used
> 		as a reference (primarily last.fm service)
> 		- Clean-up the code responsible for data-reading

I think you rather mean factor out data-reading code from importers to newly 
written trackproviders.

> 		- At each step where the code is functional (at least between
> 		reimplementing FastForwardImporter and ITunesImporter),
> 		supplement the tests so that new capabilities are accounted for
> 		(e.g. make sure that foreign playcounts are not simply stacked
> 		on locally saved playcounts with each synchronization)

The example would be actually testing the statsyncing framework, but is still 
beneficial.

> 		- The importer code may prove to be repeatable (in the extreme
> 		case the only thing differing these importers will be their
> 		respective parsers); this code will be deduplicated. If it fits
> 		DatabaseImporter (a parent class for importers) responsibilities
> 		then the code should land in DatabaseImporter implementation,
> 		otherwise I will introduce new helper classes.

Hmm, I think DatabaseImporter serves as a common superclass to the 2 
importers, now the track provider will serve the role. OTOH yes, it is 
important not to duplicate code, perhaps using mentioned helper classes and/or 
more elaborate class hierarchy.

You should also mention whether the newly implemented provider would be read-
only or whether you plan to make some of them read-write (perhaps optionally 
if time permits?).

To prove that you understand the StatSyncing framework, you could also provide 
high-level *brief* overview how it currently works, what are the involved 
classes and what's their purpose. This is the rare part of Amarok that is 
actually documented, so it shouldn't be hard.

> 	* Implement Rhythmbox and Amarok 2.x importers
> 		Rhythmbox stores its personal metadata in an XML file, which
> 		should be easy to read through QXml SAX parser. When in doubt
> 		about the format, I will consult Rhythmbox source code.
> 
> 		- Rhythmbox parser shall be robust, that is it shouldn't depend
> 		on the order of information (this is obvious, but important)
> 		- At this point both Amarok 1.4 and iTunes importers are
> 		rewritten, so their implementations will be used as a reference
> 		for Amarok 2.x and Rhythmbox importers.
> 		- The subgoal is to make sure that adding additional import
> 		targets is straightforward and easy, and if possible doesn't
> 		introduce code duplication.

Yeah, nice point. That may actually involve touching the StatSyncing framework 
itself however.

> 		- As with reimplementing importers, every piece of working code
> 		should be tested to ensure that it is performing correctly,
> 		and so that regressions don't occur in the future.

Right. The implementation part looks good otherwise from my PoV.

> Tentative Timeline:
> 	June 17 - 23:
> 		Investigating tests and drafting up test cases
> 	June 24 - 30:
> 		Investigating tests and drafting up test cases (exam week)
> 	July 01 - 07:
> 		Preparing the test suite
> 	July 08 - 14:
> 		Implementing the test cases
> 	July 15 - 21:
> 		Reimplementing FastForward importer
> 	July 22 - 28:
> 		Reimplementing FastForward importer / iTunes importer
> 	July 29 - August 04:
> 		Reimplementing iTunes importer
> 	August 05 - 11:
> 		Implementing test cases for new functionality
> 	August 12 - 18:
> 		In-depth research of Rhythmbox metadata format
> 	August 19 - 25:
> 		Implementing Rhythmbox importer
> 	August 26 - September 01:
> 		Implementing Rhythmbox importer / Amarok 2.x importer
> 	September 02 - 08:
> 		Implementing Amarok 2.x importer
> 	September 09 - 15:
> 		Implementing test cases for new importers
> 	September 16 - 22:
> 		fix bugs, QA
> 	September 23 - 29:
> 		fix bugs, QA

Hmmm, thinking more about the "fist tests, then reimplementation" approach I 
think it would least to a fair deal of wasted work. I agree that this is the 
right approach for "pure" reimplementation, but here you are actually 
reimplementing *and* changing the function. I think the tests you would write 
for old importers would require so significant changes to be adapted to 
StatSyncing that it would equal writing a brand new test suite.

Therefore I think it would be best if you started with FastForward importer 
and then implement tests right after it starts to work, similarly for other 
importers. Writing tests right with the implementation might also be less 
monotonous. I also think you'll need more time for FF and iTunes iporters than 
3 weeks.

It may be wise to mention where the mid-term evaluation is and to provide 
criteria for us to judge your work. Perhaps your hard goal would be 
reimplemented FF and iTunes importers and soft goal to have them tested? What 
should be the criteria for final evaluation?

Other general remark could be that the timeline could be more detailed.

> 	I'm a big fan of open source, and KDE in general. I've used Amarok
> 	a lot and not only read the source, but contributed to it [2].
> 	I am also familiar with Qt framework and have used it to prototype
> 	several small applications. I'm an early adopter, and my fiddling
> 	with Qt 5 beta has led me to uncover an important bug [3].

Nice. I should be able to review & merge your review request soon, but please 
still make a distinction between already merged and yet-in-review patches. 
Having submitted/merged one more little patch would be a small bonus, too.

> 	If my application is accepted, I am going to commit my full time to
> 	complete the project, including putting my job on hold for the
> 	duration of GSoC. I expect to be working 40 hours a week. I am
> 	comfortable with working independently under a supervisor living
> 	on the other side of the globe. Although I have not have worked
> 	in this style before, I foresee no problems with this and can even
> 	change my working hours if needed. I think that code reviews
> 	and email contact are the best way of tracking progress in this
> 	case. I am fluent in English, so communication should be a problem.

"shouldn't"? ;-) Otherwise looks well.

> 	Of course I would work as well with a local mentor, with whom
> 	I could discuss issues over a coffee.
> 
> 	During the bonding period I expect to complete some more tasks for
> 	Amarok, possibly continuing what I started with [2] (in review
> 	comments I mentioned streamlining track sorting and making it
> 	consistent across the source code).

If you have time, we would expect you'd also do more research on how you'll 
proceed with the implementation and perhaps some high-level sketching of the 
design.

Regards,
		Matěj


More information about the Amarok-devel mailing list