<html>
<body>
<div style="font-family: Verdana, Arial, Helvetica, Sans-Serif;">
<table bgcolor="#f9f3c9" width="100%" cellpadding="8" style="border: 1px #c9c399 solid;">
<tr>
<td>
This is an automatically generated e-mail. To reply, visit:
<a href="http://git.reviewboard.kde.org/r/104294/">http://git.reviewboard.kde.org/r/104294/</a>
</td>
</tr>
</table>
<br />
<blockquote style="margin-left: 1em; border-left: 2px solid #d0d0d0; padding-left: 10px;">
<p style="margin-top: 0;">On March 18th, 2012, 9:39 p.m., <b>Ralf Engels</b> wrote:</p>
<blockquote style="margin-left: 1em; border-left: 2px solid #d0d0d0; padding-left: 10px;">
<pre style="white-space: pre-wrap; white-space: -moz-pre-wrap; white-space: -pre-wrap; white-space: -o-pre-wrap; word-wrap: break-word;">Are the unit test running?
There is one specific scenario that the current scanner is handling (since years. No new change).
ITunes stile collections put tracks from the same album but different artist into different directories like this:
Michael Jackson
+ Bad
+ Smooth Criminal
Quincy Jones
+ Bad
+ Commentary
If your fix doesn't break the unit test I am all for committing the fix.
Also you might want to make a new unit test to prevent somebody later on to change the behaviour for the worse.</pre>
</blockquote>
<p>On March 19th, 2012, 12:45 a.m., <b>Alexey Neyman</b> wrote:</p>
<blockquote style="margin-left: 1em; border-left: 2px solid #d0d0d0; padding-left: 10px;">
<pre style="white-space: pre-wrap; white-space: -moz-pre-wrap; white-space: -pre-wrap; white-space: -o-pre-wrap; word-wrap: break-word;">Ralf,
I wasn't able to run the unit test; even though I installed google mock 1.6.0 ("apt-get install google-mock") on Kubuntu 12.04, cmake does not detect it:
-----------------------------------------------------------------------------
-- The following REQUIRED packages could NOT be located on your system.
-- You must install these packages before continuing.
-----------------------------------------------------------------------------
* gmock (1.4 or higher) <http://code.google.com/p/googlemock/>
Used in Amarok's tests.
-----------------------------------------------------------------------------
For the same reason, I didn't add a new unit test.
But I am afraid this is exactly the behavior I am trying to fix. With this change, your example would result in two (or more separate albums): "Michael Jackson - Bad" and "Quincy Jones - Bad".
So, your change was fixing iTunes but breaking the heuristics Amarok used before. Could you think of a solution that would also handle the following scenario:
ABBA
+ The Album
+ Eagle
+ Take a chance on me
+ One man one woman
Chris Norman
+ The Album
+ As good as it gets
+ Every little thing
+ Red hot screaming love
I.e., two completely separate albums being merged? One solution I could think of would be to merge tracks into single compilation if they all share the same year (is it the case with iTunes?). However, this is still prone to errors, for example if the tracks in both directories lack the year tag.
Perhaps, make it a config option? Something like:
[ ] Detect iTunes-style compilations
</pre>
</blockquote>
<p>On March 19th, 2012, 12:49 a.m., <b>Alexey Neyman</b> wrote:</p>
<blockquote style="margin-left: 1em; border-left: 2px solid #d0d0d0; padding-left: 10px;">
<pre style="white-space: pre-wrap; white-space: -moz-pre-wrap; white-space: -pre-wrap; white-space: -o-pre-wrap; word-wrap: break-word;">By the way, is it the case that if one had both albums from my example in iTunes collection, they would also be merged in a compilation?</pre>
</blockquote>
<p>On March 19th, 2012, 1:23 a.m., <b>Alexey Neyman</b> wrote:</p>
<blockquote style="margin-left: 1em; border-left: 2px solid #d0d0d0; padding-left: 10px;">
<pre style="white-space: pre-wrap; white-space: -moz-pre-wrap; white-space: -pre-wrap; white-space: -o-pre-wrap; word-wrap: break-word;">And, just a few examples of such "coincidental" clashes in album names:
http://en.wikipedia.org/wiki/The_Album_%28disambiguation%29
http://en.wikipedia.org/wiki/The_White_Album_%28disambiguation%29
http://en.wikipedia.org/wiki/The_Black_Album_%28disambiguation%29
http://en.wikipedia.org/wiki/The_Brown_Album_%28disambiguation%29
http://en.wikipedia.org/wiki/Animal_%28disambiguation%29
http://en.wikipedia.org/wiki/Red_%28disambiguation%29
http://en.wikipedia.org/wiki/Big_Bang_%28disambiguation%29
http://en.wikipedia.org/wiki/The_Hits_%28disambiguation%29
http://en.wikipedia.org/wiki/Blues_%28disambiguation%29
http://en.wikipedia.org/wiki/Monolith_%28disambiguation%29
...
And a lots more, googling by [disambiguation "album by" site:wikipedia.org]. That it, if one had two albums named "Monolith" from iTunes, they would also be merged into a single "compilation".</pre>
</blockquote>
<p>On March 19th, 2012, 2:04 p.m., <b>Sam Lade</b> wrote:</p>
<blockquote style="margin-left: 1em; border-left: 2px solid #d0d0d0; padding-left: 10px;">
<pre style="white-space: pre-wrap; white-space: -moz-pre-wrap; white-space: -pre-wrap; white-space: -o-pre-wrap; word-wrap: break-word;">Here's a thought: if it's a single compilation album, you won't get multiple copies of a given track number/disk number pair. If they're two different albums, you will. You'd have to account for the disk number tag not being present, and it's still not perfect, but it should be a better heuristic for both cases discussed above than either system.</pre>
</blockquote>
<p>On March 19th, 2012, 5:11 p.m., <b>Alexey Neyman</b> wrote:</p>
<blockquote style="margin-left: 1em; border-left: 2px solid #d0d0d0; padding-left: 10px;">
<pre style="white-space: pre-wrap; white-space: -moz-pre-wrap; white-space: -pre-wrap; white-space: -o-pre-wrap; word-wrap: break-word;">Thanks Sam, for detection of compilations/regular albums that sounds better. So, I think the algorithm should be:
- if there are tracks with disc# tag and without disc# tag, they are separate albums
- if there are matching (disk#,track#) tuples, excluding tracks without track# tag, they are separate
- otherwise, it is a compilation album
This would solve the detection issue, but the issue of separation remains. If they are detected to be separate albums, should they be separated by the directory? Or by some more complex rules? There are several possible types of clashes:
a. Regular album (in a single dir) vs. regular album
b. Regular album vs. iTunes-style compilation
c. iTunes-style compilation vs. iTunes-style compilation
d. Multi-way clashes (e.g. two regular albums and iTunes-style compilation)
Current code handles none of these. Separation by directory would handle (a) and partially (b)/(d) for regular albums, but would be overzealous in separating iTunes-style compilations. However, I don't think cases (b) and (c) can be reliably handled in any way. Imagine the following structure:
Artist_1
+ Album_A
+ 1_Foo
+ 2_Far
+ 3_Faz
Artist_2
+ Album_A
+ 4_Zeb
Artist_3
+ Album_A
+ 1_Joo
+ 2_Jar
+ 3_Jaz
In this case, Amarok would be able to detect clash by track# tags, but how could one separate them? Is it (compilation of artists 1&2 + regular album of artist 3)? Or (compilation of artists 2&3 + regular album by artist 1)? Or (regular albums by all three, with tracks missing from artist 2's album)?
</pre>
</blockquote>
<p>On March 19th, 2012, 6:15 p.m., <b>Alexey Neyman</b> wrote:</p>
<blockquote style="margin-left: 1em; border-left: 2px solid #d0d0d0; padding-left: 10px;">
<pre style="white-space: pre-wrap; white-space: -moz-pre-wrap; white-space: -pre-wrap; white-space: -o-pre-wrap; word-wrap: break-word;">Ralf,
I just thought: shouldn't the iTunes compilation that you gave as an example have a proper "Album Artist" tag? In your example,
Michael Jackson
+ Bad
+ Smooth Criminal
Quincy Jones
+ Bad
+ Commentary
if both tracks had album artist set to "Michael Jackson" (even with different track artist tags), they would be properly merged into a single album by the following code in ScanResultProcessor::commit(), even before the compilations are analyzed:
for( int i = tracks.count() - 1; i >= 0; --i )
{
CollectionScanner::Album *album = sortTrack( tracks.at( i ) );
if( album )
{
dirAlbums.insert( album );
dirAlbumNames.insert( album->name() );
tracks.removeAt( i );
}
}
On other hand, in my example:
ABBA
+ The Album
+ Eagle
+ Take a chance on me
+ One man one woman
Chris Norman
+ The Album
+ As good as it gets
+ Every little thing
+ Red hot screaming love
the albums will be improperly merged into a compilation despite properly set tags (all ABBA's tracks have album artist and track artist set to "ABBA", all Chris Norman's tracks have album artist and track artist tags set to "Chris Norman").
So the question becomes: it is okay to mishandle properly tagged collections to somewhat improve handling of improperly tagged ones?</pre>
</blockquote>
</blockquote>
<pre style="white-space: pre-wrap; white-space: -moz-pre-wrap; white-space: -pre-wrap; white-space: -o-pre-wrap; word-wrap: break-word;">Hi Alexey,
regarding the google mock. Have you tried to remove the ccache file? Sometimes that remembers when libraries are not present and prevents cmake from rechecking.
Regaring the behaviour,
sometimes design decisions later on turn out to be bad.
In this case I am not yet sure but I remember thinking about some more heuristics,
e.g. if you just have too many tracks in an album.
Also I implemented the "Best Of" heuristic and the compilation flag two years ago.
I think that going by the amount of tracks would really help further.
So add a rule: If ceil(<tracks> / 15.0) >= <albums with the same name but different artists> then don't split them up.
</pre>
<br />
<p>- Ralf</p>
<br />
<p>On March 16th, 2012, 12:13 a.m., Alexey Neyman wrote:</p>
<table bgcolor="#fefadf" width="100%" cellspacing="0" cellpadding="8" style="background-image: url('http://git.reviewboard.kde.org/media/rb/images/review_request_box_top_bg.png'); background-position: left top; background-repeat: repeat-x; border: 1px black solid;">
<tr>
<td>
<div>Review request for Amarok.</div>
<div>By Alexey Neyman.</div>
<p style="color: grey;"><i>Updated March 16, 2012, 12:13 a.m.</i></p>
<h1 style="color: #575012; font-size: 10pt; margin-top: 1.5em;">Description </h1>
<table width="100%" bgcolor="#ffffff" cellspacing="0" cellpadding="10" style="border: 1px solid #b8b5a0">
<tr>
<td>
<pre style="margin: 0; padding: 0; white-space: pre-wrap; white-space: -moz-pre-wrap; white-space: -pre-wrap; white-space: -o-pre-wrap; word-wrap: break-word;">Older amarok used the following heuristics to determine whether a particular
album is a compilation (from a comment in
amarok-2.2.1/src/collection/sqlcollection/ScanResultProcessor.cpp):
//using the following heuristics:
//if more than one album is in the dir, use the artist of each track as albumartist
//if all tracks have the same artist, use it as albumartist
//try to find the albumartist A: tracks must have the artist A or A feat. B (and variants)
//if no albumartist could be found, it's a compilation
However, more recent Amarok versions started to merge different albums
with different artist in separate directories together, as explained above.
Amarok started to assume all albums with same name to be compilations
(even if in separate directories) since the following commit:
dfd8b457d7094144563c51b2528afdbe23ffc344
Ralf Engels
Fix all collection scanner auto tests.
Now, amarok first scans all directories (sorting albums by the name)
and then tries to process *album names*, one at a time. If it finds
more than one instance of an album name, it assumes it to be a compilation.
Thus, it lost the heuristics in employed before ("if more than one album
is in the dir...").
While it is still possible to force the right behavior
by selecting "Do not show under Various Artists" for each of the erroneous
albums, it would still be better to restore the original heuristics as there
may be lots of albums merged this way. I think the old heuristics made sense
(why would albums be put into separate directories otherwise, if they are
a single compilation album?).
The attached patch restores the following logic: If any given directory
contains tracks that were sorted into a single album and and that album
was not created as a compilation (i.e. it has non-empty artists), this
album is excluded from being merged with other albums to create a "compilation".</pre>
</td>
</tr>
</table>
<h1 style="color: #575012; font-size: 10pt; margin-top: 1.5em;">Diffs</b> </h1>
<ul style="margin-left: 3em; padding-left: 0;">
<li>src/core-impl/collections/db/ScanResultProcessor.cpp <span style="color: grey">(4f02a16)</span></li>
</ul>
<p><a href="http://git.reviewboard.kde.org/r/104294/diff/" style="margin-left: 3em;">View Diff</a></p>
</td>
</tr>
</table>
</div>
</body>
</html>