[Nepomuk] Duplicate resources because of using different sources and creators/performers
trueg at kde.org
Fri Oct 14 13:46:40 UTC 2011
The Vorbis comments spec http://xiph.org/vorbis/doc/v-comment.html
states that for popular music "artist" is normally the performing band
and "performer" is omitted. But for classical music "artist" would be
the composer while "performer" would be the, well, performer. :P
So the question is: how do we handle this in the flac (and also
vorbis, btw) analyzer plugin? Do something like "if performer is set,
map artist to nco:creator, otherwise map it to nmm:performer"? That
seems reasonable. Opinions?
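A minimal sketch of that rule, written in Python for brevity (the real
analyzer plugins are C++). The function name and the dict-of-lists
input are hypothetical and are not Strigi API; only the two property
names come from the proposal above.

```python
def map_vorbis_people(comments):
    """Map Vorbis ARTIST/PERFORMER comments to (property, value) pairs.

    `comments` is assumed to be a dict from upper-cased field names to
    lists of values (a hypothetical input shape, not a Strigi type).
    """
    artists = comments.get("ARTIST", [])
    performers = comments.get("PERFORMER", [])
    statements = []
    if performers:
        # PERFORMER is set: treat ARTIST as the creator (e.g. the composer).
        statements += [("nco:creator", a) for a in artists]
        statements += [("nmm:performer", p) for p in performers]
    else:
        # PERFORMER is omitted: ARTIST is the performing band or singer.
        statements += [("nmm:performer", a) for a in artists]
    return statements


pop = map_vorbis_people({"ARTIST": ["Electric Light Orchestra"]})
classical = map_vorbis_people(
    {"ARTIST": ["Some Composer"], "PERFORMER": ["Some Orchestra"]}
)
```

With this rule a popular-music file always ends up with nmm:performer,
so a search by performer would not need a second "creator" clause.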
As for resource merging: we have API to merge two resources in the data
management service, which is also easily accessible through DBus. But
there is no code yet that tries to find duplicates.
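A rough sketch of where such a duplicate finder could start, in Python.
It only groups names that become identical after stripping Latin
diacritics, ignoring case and ignoring word order. All function names
are hypothetical, nothing here calls the data management service, and
the result is only a list of merge candidates for a human to confirm.

```python
import unicodedata
from collections import defaultdict


def normalize_name(name):
    """Return an order-insensitive, accent-insensitive key for a name."""
    decomposed = unicodedata.normalize("NFKD", name)
    # Drop only the Latin combining diacritics (U+0300..U+036F), so that
    # marks used by other scripts, such as kana voicing marks, survive.
    stripped = "".join(
        ch for ch in decomposed if not ("\u0300" <= ch <= "\u036f")
    )
    recomposed = unicodedata.normalize("NFKC", stripped)
    # Sort the tokens so "Given Family" and "Family Given" compare equal.
    return tuple(sorted(recomposed.casefold().split()))


def find_duplicate_candidates(names):
    """Group names sharing a normalized key; keep groups of two or more."""
    groups = defaultdict(list)
    for name in names:
        groups[normalize_name(name)].append(name)
    return [group for group in groups.values() if len(group) > 1]


candidates = find_duplicate_candidates([
    "Kô Shibasaki",
    "Kō Shibasaki",
    "Shibasaki Kô",
    "shibasaki kou",
    "Electric Light Orchestra",
    "ELO",
])
```

This catches the accent and word-order variants, but not different
romanizations ("kou" vs. "kô"), different scripts or abbreviations such
as ELO. Those would need extra knowledge or user input.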
On 10/14/2011 02:19 PM, Ignacio Serantes wrote:
> As my Nepomuk database grows, I have found that duplicate resources
> are a growing problem, and I'm not referring to the indexing bug.
> As Nepomuk collects data from several sources (in my case Strigi,
> Bangarang and me), the collected data is not coherent, which is
> expected. One simple example with actress/singer Shibasaki Kou
> (柴咲コウ) <http://en.wikipedia.org/wiki/Kou_Shibasaki>:
> ignacio at misaki:~> nepoogle --nogui contacts:shibasaki or tag:shibasaki
> 柴咲コウ (Shibasaki Kou)
> shibasaki kou, 柴咲コウ
> Kô Shibasaki, Kô Shibasaki
> and there are more combinations (Kō Shibasaki, Shibasaki Kô, Kou
> Shibasaki, etc.), all more or less valid. This is, in fact, a common
> problem with Asian names, but Western names are not immune, for
> example ELO and Electric Light Orchestra.
> Obviously, if you search only for "shibasaki" you find what you're
> looking for, but also other resources you don't want.
> This is not a bug; it is simply a logical problem that comes from
> meeting the Real World™, so I wonder whether merging all these records
> has been implemented, planned or discussed.
> A different question is the fact that there are "performers" and
> "creators". Both are the singers of a music file, and the main
> difference is whether the data was collected from mp3 files or flac
> files. To my surprise, I found yesterday that Strigi is indexing flac
> files again. Terrific :).
> So, for example, if I want to search for all Shibasaki Kou's songs I
> must type:
> nepoogle --nogui performer:shibasaki or creator:shibasaki
> Of course I could search by contact
> nepoogle --nogui contact:shibasaki
> but in this case I also get movies and other records, and I am only
> looking for music.
> I'm thinking of adding a shortcut named "singer", which implements
> "performer or creator", to nepoogle, but I want to confirm whether
> this is going to change or not.
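Until the analyzers agree on one property, a "singer" shortcut could be
a plain query rewrite. The sketch below is hypothetical and is not
nepoogle's actual code. It assumes the "key:value" and "or" syntax used
in the commands quoted above, and it does not handle quoted values or
operator precedence when other terms are present.

```python
import re

# Hypothetical alias table: shortcut name -> properties it stands for.
ALIASES = {"singer": ("performer", "creator")}

# A simple "key:value" term; the value runs up to the next whitespace.
_TERM = re.compile(r"(\w+):(\S+)")


def expand_aliases(query, aliases=ALIASES):
    """Rewrite alias terms such as singer:x into an "or" of real terms."""
    def replace(match):
        key, value = match.group(1), match.group(2)
        targets = aliases.get(key.lower())
        if targets is None:
            # Not an alias: leave the term exactly as it was typed.
            return match.group(0)
        return " or ".join(f"{target}:{value}" for target in targets)

    return _TERM.sub(replace, query)


expanded = expand_aliases("singer:shibasaki")
```

If the analyzer mapping changes later, only the alias table needs to be
updated or dropped.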
> Best wishes,