[Digikam-users] Difference between collection types

Paul Waldo paul at waldoware.com
Thu Jun 11 15:28:06 CEST 2009


Hi Marcel, comments and analysis below.

Paul
----- "Marcel Wiesweg" <marcel.wiesweg at gmx.de> wrote:

> > Hi Marcel,
> >
> > I've imported the new album and indeed I see the messages that you speak
> > of.  I imported the album and then let digikam sit "idle" and it looked
> > through the whole collection and found many identical images.  The new
> > album shows no tags though!
> 
> When this message appears AlbumDB::copyImageAttributes is called immediately
> afterwards, no way to avoid that, and it copies the tags (among all other
> info). Very strange that this does not happen.
> When I add a symlink to a subfolder of my collection here as a second
> collection, all tags are available.
> Is it possible that the database contains third entries to identical files
> already that are not tagged?

It looks like there is simply no additional tagging going on.  Let's look at one image that has tags in my original local collection (which is actually pointing to a mounted Samba share).  Its name is CRW_1507.CRW.

sqlite> select * from Images where name='CRW_1507.CRW';
31341||CRW_1507.CRW|3|1|2006-06-24T17:39:28|2557936|8e3bed88d7cf91c811991e86fcf9394c
33963|3|CRW_1507.CRW|1|1|2006-06-24T17:39:28|2557936|8e3bed88d7cf91c811991e86fcf9394c
51007||CRW_1507.CRW|3|1|2006-06-24T17:39:28|2557936|f8e04060cbcaa34b5f7dd6618259ada4
68188|1682|CRW_1507.CRW|1|1|2006-06-24T17:39:28|2557936|f8e04060cbcaa34b5f7dd6618259ada4
83523|2417|CRW_1507.CRW|1|1|2006-06-24T17:39:28|2557936|f8e04060cbcaa34b5f7dd6618259ada4

sqlite> .schema Images
CREATE TABLE Images
 (id INTEGER PRIMARY KEY,
  album INTEGER,
  name TEXT NOT NULL,
  status INTEGER NOT NULL,
  category INTEGER NOT NULL,
[...]

Now let's see if we can look at the tags:

sqlite> .schema ImageTags
CREATE TABLE ImageTags
 (imageid INTEGER NOT NULL,
  tagid INTEGER NOT NULL,
  UNIQUE (imageid, tagid));
CREATE INDEX tag_index  ON ImageTags (tagid);
sqlite> select * from ImageTags where imageid in (31341, 33963, 51007, 68188, 83523);
31341|41
31341|54
31341|55
31341|56
33963|41
33963|54
33963|55
33963|56

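So, as we can see, two of the five image rows (31341 and 33963) carry the same four tags, and the other three rows have no tags at all.  Having two rows with matching tags is good!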
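
The tag comparison can also be done mechanically instead of by reading the rows. The following is only a sketch, not digikam code: it uses Python's sqlite3 module on an in-memory copy of the ImageTags rows shown above (a real check would open the digikam database file instead), and the helper name tag_sets is made up for illustration.

```python
import sqlite3
from collections import defaultdict


def tag_sets(conn, image_ids):
    """Return {imageid: frozenset of tagids} for the given image ids."""
    marks = ",".join("?" * len(image_ids))
    rows = conn.execute(
        f"SELECT imageid, tagid FROM ImageTags WHERE imageid IN ({marks})",
        image_ids,
    )
    tags = defaultdict(set)
    for image_id, tag_id in rows:
        tags[image_id].add(tag_id)
    # Ids without any ImageTags row end up with an empty set.
    return {i: frozenset(tags.get(i, ())) for i in image_ids}


# In-memory stand-in holding only the ImageTags rows from the output above.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ImageTags (imageid INTEGER NOT NULL, "
    "tagid INTEGER NOT NULL, UNIQUE (imageid, tagid))"
)
conn.executemany(
    "INSERT INTO ImageTags VALUES (?, ?)",
    [(i, t) for i in (31341, 33963) for t in (41, 54, 55, 56)],
)

result = tag_sets(conn, [31341, 33963, 51007, 68188, 83523])
for image_id, tags in result.items():
    print(image_id, sorted(tags))
```

Rows whose sets compare equal carry identical tags; an empty set means the image id has no entry in ImageTags.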


Let's go back to the albums.  As you can see, this image is in the DB 5 times, but only belongs to three albums :-O  Which albums are these?
sqlite> select * from Albums where id in (3, 1682, 2417);
3|1|/2003/2003-06-29|2008-01-19|||
1682|3|/2003/2003-06-29|2008-01-19|||
2417|5|/2003/2003-06-29|2008-01-19|||
sqlite> .schema Albums
CREATE TABLE Albums
 (id INTEGER PRIMARY KEY,
  albumRoot INTEGER NOT NULL,
  relativePath TEXT NOT NULL,
  date DATE,
  caption TEXT,
  collection TEXT,
  icon INTEGER,
  UNIQUE(albumRoot, relativePath));
CREATE TRIGGER delete_album DELETE ON Albums
BEGIN
 DELETE FROM Images
   WHERE Images.album = OLD.id;
END;
sqlite> select * from AlbumRoots where id in (1, 3, 5);
1|camera|0|1|volumeid:?path=%2Fhome%2Fpaul%2FPictures%2Fcamera|/
5|Camera|0|3|networkshareid:?mountpath=%2Fmnt%2Fcamera|/
sqlite> .schema AlbumRoots
CREATE TABLE AlbumRoots
 (id INTEGER PRIMARY KEY,
  label TEXT,
  status INTEGER NOT NULL,
  type INTEGER NOT NULL,
  identifier TEXT,
  specificPath TEXT,
  UNIQUE(identifier, specificPath));
CREATE TRIGGER delete_albumroot DELETE ON AlbumRoots
BEGIN
 DELETE FROM Albums
   WHERE Albums.albumRoot = OLD.id;
END;

So, if we trace all of this back to the image: row 33963 sits in album 3 under album root 1 (the original collection) and row 83523 sits in album 2417 under album root 5 (the newly added network share), so from a tags perspective images 33963 and 83523 should have the same tags.  The mapping between images and tags shows instead that the tagged rows are 31341 and 33963.  A mismatch!  That is also the reason the tags don't show up in the newly added album.  Any ideas on where to proceed from here?
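
The trace above can be collapsed into a single join, which is less error-prone than four separate selects. This is a sketch under assumptions, not digikam code: it rebuilds a trimmed copy of the tables in an in-memory SQLite database via Python's sqlite3 module, using only the columns and rows quoted above, then joins image, album and album root and counts the tags per image row. Against the real database one would open the digikam DB file and run only the query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Trimmed stand-in schema: only columns that appear in the output above.
conn.executescript("""
CREATE TABLE AlbumRoots (id INTEGER PRIMARY KEY, label TEXT);
CREATE TABLE Albums (id INTEGER PRIMARY KEY,
                     albumRoot INTEGER NOT NULL,
                     relativePath TEXT NOT NULL);
CREATE TABLE Images (id INTEGER PRIMARY KEY, album INTEGER,
                     name TEXT NOT NULL);
CREATE TABLE ImageTags (imageid INTEGER NOT NULL, tagid INTEGER NOT NULL);
""")
conn.executemany("INSERT INTO AlbumRoots VALUES (?, ?)",
                 [(1, "camera"), (5, "Camera")])
conn.executemany("INSERT INTO Albums VALUES (?, ?, ?)",
                 [(3, 1, "/2003/2003-06-29"),
                  (1682, 3, "/2003/2003-06-29"),
                  (2417, 5, "/2003/2003-06-29")])
conn.executemany("INSERT INTO Images VALUES (?, ?, ?)",
                 [(31341, None, "CRW_1507.CRW"),
                  (33963, 3, "CRW_1507.CRW"),
                  (51007, None, "CRW_1507.CRW"),
                  (68188, 1682, "CRW_1507.CRW"),
                  (83523, 2417, "CRW_1507.CRW")])
conn.executemany("INSERT INTO ImageTags VALUES (?, ?)",
                 [(i, t) for i in (31341, 33963) for t in (41, 54, 55, 56)])

# LEFT JOINs keep image rows that have no album, no album root or no tags.
QUERY = """
SELECT i.id, i.album, a.albumRoot, r.label, COUNT(t.tagid) AS tag_count
FROM Images i
LEFT JOIN Albums a ON a.id = i.album
LEFT JOIN AlbumRoots r ON r.id = a.albumRoot
LEFT JOIN ImageTags t ON t.imageid = i.id
WHERE i.name = ?
GROUP BY i.id
ORDER BY i.id
"""
rows = conn.execute(QUERY, ("CRW_1507.CRW",)).fetchall()
for row in rows:
    print(row)
```

With the rows above, only 33963 comes back with both an album root label and tags, and 83523 comes back with a label but a tag count of 0. Row 68188 points at album root 3, for which the AlbumRoots query above returned nothing, so its label is empty.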

> 
> 
> >
> > I'm not sure what is happening with digikam though.  It finally finished
> > finding identical images (after 15 hours: ugh!),
> 
> It needed 15 hours to scan a collection??


Yup.  15511 images stored on a NAS Samba share.  Digikam running at 28% CPU on a Quad Xeon and approx 256 MB/sec constant network throughput.  Think something might be wrong?  As you can imagine, Digikam startup with DB scan is quite painful...

> 
> 
> > but I still see major
> > network (disk) activity and digikam is still chugging along doing
> > something.  I see lots of messages that say
> > "Digikam::AlbumManager::slotDirWatchDirty: KDirWatch detected change at
> > /mnt/camera". Should I allow this activity to finish before expecting tags
> > to be present in the newly imported album?  Thanks!
> 
> KDirWatch behavior was quite fundamentally changed between KDE 4.2.2 and
> 4.2.3. Since then it reports single files instead of directories. (I don't
> want to comment further on such changes on undocumented behavior between
> minor revisions and breaking applications)
> This leads to an endless loop of KDirWatch triggering a collection scan,
> which then again accesses the db file and triggers KDirWatch. Ignore it.

Part of the 15 hours above was probably this behavior.  Based on the DB analysis above, I'm sure I'll be trying the album import again, so I'll update the numbers ;-)

> 
> Marcel
> _______________________________________________
> Digikam-users mailing list
> Digikam-users at kde.org
> https://mail.kde.org/mailman/listinfo/digikam-users

