One of my goals is to push all this onto a worker thread, so it runs in
the background. The md5sum calculation especially is tedious when new
images are found.

On Sunday 24 December 2006 17:16, Robert L Krawitz wrote:
|    From: "Jesper K. Pedersen" <blackie at blackie.dk>
|    Date: Sun, 24 Dec 2006 11:32:17 +0100
|    Seems like you are right indeed. That just shows how little I use
|    subcategories myself.
|    I spent the better part of an hour profiling it, and came to the
|    conclusion that it unfortunately is the algorithm that sucks
|    rather than the implementation. In my database with ~7000 images,
|    it takes billions of string compares to search for a large enough
|    category :-/
| This is probably what I reported about a month ago.
|    It is unfortunately too late to do something about that for the
|    next release.  I do, however, plan to spend some time to make a
|    patch level release after the next one, which will include bug
|    fixes and optimizations only.
| Another thing I'd like to look at is scanning for images.  It takes me
| about 100 seconds on a database of 20000 images the first time I do it
| (succeeding times are very fast -- a fraction of a second).  This is
| using reiserfs with a single 250 GB 7200 RPM drive.  It's likely that
| different filesystems will yield different -- possibly radically
| different -- results.
| This is due to the fact that every file is being stat()'ed, even if
| it's already in the database.  I tried rearranging the code in
| question to not call isReadable() unless the file is not registered in
| the database, but that didn't help.  The problem is that QDir always
| calls stat() on every directory entry even if no filters are
| specified.  Since we're not going to be able to change Qt, we may want
| to devise our own directory class (or derive a class from QDir) for
| this purpose.
| There's one other issue: people who use RAW+JPEG and choose not to
| index both, or who use cameras that generate thumbnail files, will
| still have to stat all of these files.  There may be workarounds, such as
| lists of extensions to ignore outright (which of course would break if
| someone actually named a directory foo.thm or the like), or some other
| way to ignore JPEG files if the RAW files are already indexed.
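
A rough sketch of the scanning idea from the quoted mail, in case it helps
the discussion. It assumes a POSIX system and reads directory entries with
plain readdir(), so nothing is stat()'ed during the listing; only names that
are neither in the database nor on the ignore list are returned for further
checks. The names findNewEntries and extensionOf, and the use of std::set
for the known files, are made up for illustration and are not taken from the
existing code.

```cpp
#include <algorithm>
#include <cctype>
#include <dirent.h>
#include <set>
#include <string>
#include <vector>

// Lower-cased extension of `name` without the dot, or "" if there is none.
// A leading dot (hidden file) is not treated as an extension separator.
std::string extensionOf(const std::string &name)
{
    const std::string::size_type dot = name.rfind('.');
    if (dot == std::string::npos || dot == 0)
        return "";
    std::string ext = name.substr(dot + 1);
    std::transform(ext.begin(), ext.end(), ext.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return ext;
}

// Return the entry names in `dir` that are not in `known` and whose
// extension is not in `ignoredExts` (lower-case, without the dot).
// Only opendir()/readdir() are called here, no per-entry stat().
std::vector<std::string> findNewEntries(const std::string &dir,
                                        const std::set<std::string> &known,
                                        const std::set<std::string> &ignoredExts)
{
    std::vector<std::string> fresh;
    DIR *d = opendir(dir.c_str());
    if (!d)
        return fresh;  // missing or unreadable directory: report nothing
    while (struct dirent *e = readdir(d)) {
        const std::string name = e->d_name;
        if (name == "." || name == "..")
            continue;
        if (known.count(name) != 0)
            continue;  // already registered, skip without touching the file
        if (ignoredExts.count(extensionOf(name)) != 0)
            continue;  // e.g. "thm"; this also skips a directory named foo.thm
        fresh.push_back(name);
    }
    closedir(d);
    return fresh;
}
```

Only the names returned here would still need a stat() or isReadable()
check, for instance to tell new images from new subdirectories. The
extension filter has exactly the weakness mentioned above: it cannot tell a
directory called foo.thm from a thumbnail file without looking at it.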

