[KPhotoAlbum] Very experimental image-grouping patch

Wed Nov 28 13:55:01 GMT 2007

2007/11/28, Robert Krawitz <rlk at alum.mit.edu>:
>    Date: Wed, 28 Nov 2007 14:21:46 +0100
>    From: "Paul Fleischer" <paul at xpg.dk>
>
>    2007/11/28, Robert Krawitz <rlk at alum.mit.edu>:
>    > It depends upon what you're doing with those 50,000 images.  If you're
>    > looking at a large number of other images for each of those images,
>    > it's going to be slow.  If you're going to hit the disk for each of
>    > those images (check if it's present), it's going to be very slow
>    > indeed.  If you're just checking an in-memory flag, it might not be so
>    > bad.
>    >
>    > Can you write out the algorithm in pseudo-code?
>
>    Sure:
>
>    images = loadImagesFromXML();
>    foreach img in images {
>      if( images->parent() != null ) {
>         DB::ImageDB::instance()->info( img->parent() )->addChildImage(
>    img->fileName() );
>      }
>    }
>
>    So, basically, we just do in-memory modifications. A more efficient
>    algorithm would be, if
>    loadImagesFromXML() builds a list of images that have parents:
>
>    (images, childImages) = loadImagesFromXML();
>    foreach img in childImages {
>      if( images->parent() != null ) {
>         DB::ImageDB::instance()->info( img->parent() )->addChildImage(
>    img->fileName() );
>      }
>    }
>
>    (where childImages is a subset of images).
>
>    As far as I see it, there is a small gain in the second algorithm,
>    especially if no grouping is not used. But as far as I see it, all
>    four calls (parent(), info(), fileName() and addChildImage()) that are
>    made in each iteration of the loop are fairly cheap.
>
> info() is the one I'd be most concerned about -- it looks like it has
> to do a lookup against the image database.  If that lookup is fast
> constant time, that's OK, but if it isn't, we'll need to be careful.
> If the lookup is logarithmic, we'll need to measure the performance.

info() uses a QMap, which has logarithmic lookup time. So there is a
reason to be careful.

> I don't think the second form would gain all that much -- parent() is
> presumably just an instance variable, and if there's no parent,
> there's no problem.

The advantage of the second version, is that the extra complexity
becomes independent of the total number of images, and dependent on
the total number of images being part of a group.

BTW, I'm currently working with an implementation of the second scheme.

/Paul