[Nepomuk] Duplicate resources because of using different sources and creators/performers

Sebastian Trüg trueg at kde.org
Fri Oct 14 13:46:40 UTC 2011


The Vorbis comments spec http://xiph.org/vorbis/doc/v-comment.html
states that for popular music "artist" is normally the performing band
and "performer" is omitted. But for classical music "artist" would be
the composer while "performer" would be the - well performer. :P

So the question is: how do we handle this in the flac (and also
vorbis, btw) analyzer plugin? Do something like "if performer is set, map
artist to nco:creator, otherwise map it to nmm:performer"? That seems
reasonable. Opinions?
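
To make the proposed rule concrete, here is a minimal sketch in Python. It
is not the analyzer's code: the function name map_vorbis_people and the
dict layout are assumptions made up for illustration. Only the field names
(artist, performer) and the two properties come from the discussion above.

```python
def map_vorbis_people(comments):
    """Map Vorbis comment fields to property names.

    `comments` is a hypothetical dict of lowercase field name -> list of
    values; the real analyzer plugin receives its data differently.
    """
    artists = comments.get("artist", [])
    performers = comments.get("performer", [])
    props = {"nco:creator": [], "nmm:performer": []}
    if performers:
        # Classical-style tagging: "artist" is the composer.
        props["nco:creator"] = list(artists)
        props["nmm:performer"] = list(performers)
    else:
        # Popular-music tagging: "artist" is the performing band.
        props["nmm:performer"] = list(artists)
    return props
```

With only an artist set, everything ends up in nmm:performer; once a
performer is present, the artist moves to nco:creator.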

As for resource merging: we have an API to merge two resources in the data
management service, which is also easily accessible through D-Bus. But there
is no code yet that tries to find duplicates.
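
Just to illustrate what a first duplicate finder could look like, here is a
rough Python sketch that groups names differing only in case, diacritics or
word order. It is purely hypothetical (name_key and duplicate_candidates
are made-up names), it does not touch the data management service, and it
would not catch cases like "Kou" vs "Kô" or ELO vs Electric Light Orchestra.

```python
import unicodedata
from collections import defaultdict


def name_key(name):
    # Decompose accented characters, then drop the combining marks.
    decomposed = unicodedata.normalize("NFKD", name)
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    # Ignore case and word order.
    return tuple(sorted(stripped.casefold().split()))


def duplicate_candidates(names):
    # Group names sharing the same key; keep groups with more than one entry.
    groups = defaultdict(list)
    for name in names:
        groups[name_key(name)].append(name)
    return [group for group in groups.values() if len(group) > 1]
```

The groups returned are only candidates; whether two resources really are
the same person would still need a human (or a much smarter heuristic) to
decide before merging.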

Cheers,
Sebastian

On 10/14/2011 02:19 PM, Ignacio Serantes wrote:
> Hi,
> 
> As my Nepomuk database grows, I find that duplicate resources are a
> growing problem, and I'm not referring to the bug when indexing.
> 
> As Nepomuk collects data from several sources (in my case Strigi,
> Bangarang and me), the collected data is not coherent, which is
> expected. One simple example with actress/singer Shibasaki Kou (柴咲コウ)
> <http://en.wikipedia.org/wiki/Kou_Shibasaki>:
> 
> ignacio at misaki:~> nepoogle --nogui contacts:shibasaki or tag:shibasaki
> 柴咲コウ (Shibasaki Kou)
> shibasaki kou, 柴咲コウ
> Kô Shibasaki, Kô Shibasaki
> 
> and there are more combinations (Kō Shibasaki, Shibasaki Kô, Kou
> Shibasaki, etc.), all more or less valid. This is, in fact, a common
> problem with Asian names, but Western names are not immune either, for
> example ELO and Electric Light Orchestra.
> 
> Obviously, if you search only for "shibasaki" you find what you're looking
> for, but also other resources you don't want.
> 
> This is not a bug, just a simple logical problem because we have met the
> Real World™, so I wonder whether anything for merging all these records is
> implemented, planned or has been discussed.
> 
> 
> A different question is the fact that there are "performers" and
> "creators". Both are the singers of a music file, and the main
> difference is whether the data was collected from mp3 files or flac files.
> To my surprise, I found yesterday that Strigi is indexing flac files again.
> Terrific :).
> 
> So, for example, if I want to search for all Shibasaki Kou's songs I
> must type:
> 
> nepoogle --nogui performer:shibasaki or creator:shibasaki
> 
> Of course I could search by contact
> 
> nepoogle --nogui contact:shibasaki
> 
> but in this case I also get movies and other records, and I'm only looking
> for music.
> 
> I'm thinking of adding a shortcut named "singer", which implements a
> "performer or creator" search, to nepoogle, but I want to confirm whether
> this will be changed or not.
> 
> -- 
> Best wishes,
> Ignacio
> 
> 
> 
> 

