A question for devs
Jeff Mitchell
kde-dev at emailgoeshere.com
Thu Jun 21 13:09:24 UTC 2007
On Thursday 21 June 2007, Seb Ruiz wrote:
> On 21/06/07, Vladimir Kulev <me at lightoze.net> wrote:
> > Hello Amarok devs! I would like to know - what are you thinking about
> > https://bugs.kde.org/show_bug.cgi?id=144761, and would it be covered in
> > Amarok 2 database model?
>
> We haven't spoken about making changes to accommodate these sorts of
> problems. I would anticipate that it is very easy to make a database
> change by adding a foreign key which links to the duplicate song.
>
> However, my personal opinion is that it would be a big effort to
> implement such a feature, and I don't know if it could be justified.
> The hardest thing, and most error prone would be determining if two
> songs are the same. How can this be done, I don't think that we can
> rely on simply tags since many users have very poor tag management
> features (think Track 01.mp3). Using an external library to analyse
> the files would also be out of the question.
>
> Seb

To go into a little more detail (I'm probably going to close the bug): the
bug suggests detecting "duplicate songs" by matching tags/metadata. This is
actually a bad idea. Even if the tags are *exactly* the same, that doesn't
mean the song _data_ is the same. Maybe you have two files with the same
tags (especially if all you have is the name of the track and the artist) but
they are both VBR with different bit rates. Clearly these are not duplicates,
then. Maybe one is live but isn't marked as such. You could let the user
pick which one to remove, but that gets messy.
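
To make the point concrete, here is a minimal sketch (plain Python, standard
library only, not anything Amarok does) that groups files by their tags and
reports the groups whose file contents differ. The function names and the
(artist, title, path) tuple layout are made up for illustration. Note that a
whole-file hash also changes when only the tags are edited, so this shows that
"same tags" does not imply "same data"; it is not a usable duplicate detector.

```python
import hashlib
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file's raw bytes."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def tags_match_but_data_differs(tracks):
    """tracks: iterable of (artist, title, path) tuples (hypothetical layout).

    Returns the sorted (artist, title) keys shared by at least two files
    whose contents are not byte-identical.
    """
    digests_by_tags = defaultdict(set)
    for artist, title, path in tracks:
        digests_by_tags[(artist, title)].add(file_digest(path))
    return sorted(key for key, digests in digests_by_tags.items()
                  if len(digests) > 1)
```

Any key this returns is a pair a tag-only check would have called a duplicate
even though the files are different on disk.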

This goes back to the use case in the bug of the person with both .flac
and .mp3 files of the same music. I used to do this too, and I never ran
into any problem, because I put the .flac files and .mp3 files in separate
but identical directory trees. So my .flacs would be rooted
in /mnt/music/FLAC and my .mp3s would be rooted in /mnt/music/MP3 with the
directory trees underneath being the same. Then you simply add one, or the
other, to Amarok's collection (and if you want to use the .flacs and
put .mp3s onto devices, there's always the File Browser). If you already
have your .flacs and .mp3s in the same directories, it's trivial to write a
script to separate them.
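
A rough sketch of such a script, in Python with only the standard library.
The function name, the dry_run flag, and the choice to skip files already
under a destination root are my own assumptions, so treat it as a starting
point and try it on a copy first.

```python
import shutil
from pathlib import Path


def split_by_extension(source_root, dest_roots, dry_run=True):
    """Move files into per-format trees, keeping their relative paths.

    dest_roots maps a lowercase extension (e.g. ".flac") to the root of
    the tree that should hold files of that type. Files with other
    extensions, and files already under a destination root, are left alone.
    Returns the list of (old_path, new_path) pairs.
    """
    source_root = Path(source_root)
    dests = {ext.lower(): Path(root) for ext, root in dest_roots.items()}

    # Plan every move first so the walk never sees freshly moved files.
    moves = []
    for path in sorted(source_root.rglob("*")):
        if not path.is_file():
            continue
        if any(root in path.parents for root in dests.values()):
            continue  # already sorted into one of the target trees
        dest_root = dests.get(path.suffix.lower())
        if dest_root is None:
            continue  # not a format we were asked to separate
        moves.append((path, dest_root / path.relative_to(source_root)))

    if not dry_run:
        for old, new in moves:
            new.parent.mkdir(parents=True, exist_ok=True)
            shutil.move(str(old), str(new))
    return moves
```

With the layout described above, a call might look like
split_by_extension("/mnt/music", {".flac": "/mnt/music/FLAC", ".mp3": "/mnt/music/MP3"}),
which only reports the planned moves until dry_run=False is passed.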

I don't agree with the assertion that scores become inaccurate, as scores are
based on file usage, not "song" usage. There are many reasons why basing them
on "song" usage doesn't make a lot of sense, the main one being that people
don't usually have libraries with good, proper tagging. And if scores relied
solely on metadata and the metadata was changed, there goes your score. By
scoring per file, the statistics are still fine if the metadata changes; if
the file name changes, in the majority of cases AFT detects this and the
statistics are again fine.
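
A toy illustration of the difference (this is not Amarok's schema; the Track
class, the file_id field, and the two dictionaries are hypothetical, with
file_id standing in for whatever stable per-file identity the tracking
provides):

```python
from dataclasses import dataclass


@dataclass
class Track:
    file_id: str  # hypothetical stable per-file identifier
    artist: str
    title: str


file_scores = {}  # keyed by file identity
tag_scores = {}   # keyed by (artist, title) metadata


def record_score(track, score):
    """Store the same score under both keying schemes."""
    file_scores[track.file_id] = score
    tag_scores[(track.artist, track.title)] = score


def lookup(track):
    """Return (score found by file, score found by metadata)."""
    return (file_scores.get(track.file_id),
            tag_scores.get((track.artist, track.title)))
```

Record a score for a track tagged only "Track 01", then fix its tags: the
lookup by file still finds the score, while the lookup by metadata comes back
empty.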

As for the other use cases (2 and 3 in the bug report), I think there are
better ways to handle these than by putting kludgy duplicate-detection in
Amarok:
#2: Delete one or the other of the files when you find them. This doesn't
seem like it'd happen too often.
#3: Take the duplicate files and move them into a separate place on
your local machine that is not a part of Amarok's collection. Then when you
have access to the NFS share, you can get at the files there; when you don't,
add the directory back into your collection or use the file browser/Konqueror
to add the files to your playlist when you want. This will also keep
statistics sane. In Amarok 2.0 there will be support for multiple local
collections, which would mean that you could have these files in a
second/third/whatever collection and simply ignore that collection when you
have access to the local NFS mount.
--Jeff