[KPhotoAlbum] Speed up new image load time

Robert Krawitz rlk at alum.mit.edu
Tue May 30 00:05:47 BST 2017


On Mon, 29 May 2017 18:47:05 -0400 (EDT), Robert Krawitz wrote:
> On Mon, 29 May 2017 17:27:49 -0400 (EDT), Robert Krawitz wrote:
>> Some timings, for loading 1133 images:
>>
>>      	      Old 	  New
>> 20 MP	      5:41	  0:32
> ...
>> It looks like storing the EXIF data in the database takes about 3
>> seconds.  The next big time consumer is file version detection; if I
>> turn that off, the total time drops off to about 7 seconds.  At that
>> point, in a realistic scenario, I'd likely be I/O-bound; if I were
>> loading 3000 images (30 GB, typically), I'd need on the order of
>> 250-300 seconds just to read the data from disk.  But if someone were
>> storing their images on NVMe, it might matter.
>
> Well, there's some very low hanging fruit here: the modified file
> detection computes the MD5 checksum of each file twice!  It's a very
> simple matter to get rid of one of those; the time drops to about 20
> seconds (which is consistent with what I saw running md5sum on all of
> the files: it took about 10 seconds).

If I take out MD5 checksumming altogether it drops to about 8 seconds,
as would be expected.
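
A minimal sketch of the "checksum each file only once" idea, in Python
rather than KPhotoAlbum's C++.  The names file_md5 and Md5Cache are
hypothetical and are not taken from the KPhotoAlbum source; this only
illustrates hashing a file once and reusing the result for any later
lookup during the same scan:

```python
import hashlib


def file_md5(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


class Md5Cache:
    """Hypothetical helper: remember checksums so each file is hashed once."""

    def __init__(self):
        self._sums = {}
        self.computed = 0  # how many times a file was actually read and hashed

    def get(self, path):
        # Only read the file on the first request; later requests hit the dict.
        if path not in self._sums:
            self._sums[path] = file_md5(path)
            self.computed += 1
        return self._sums[path]
```

With something like this, both the modified-file check and whatever
stores the checksum could call get() on the same path, and the file
would be read from disk only once.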

Of that time, about 3-4 seconds is spent in what looks like saving the
EXIF data, 2-3 seconds scanning the filesystem, and 2-3 seconds
reading the files in (when I interrupted the program under gdb several
times during that phase, most of the time appeared to be in library
routines scanning the EXIF headers).

So, 20-ish seconds to read in 1100 files, which would normally be
around 10 GB.  And that's with a fairly slow processor; with a
contemporary fast processor it would be more like 10 seconds.  With a
large amount of data, that would be completely I/O-bound unless you
had an NVMe drive.
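
A rough back-of-the-envelope check of the I/O-bound claim, as a small
Python sketch.  The function names are made up for illustration, and
the throughput figures are assumptions: about 110 MB/s is what the
earlier "30 GB in 250-300 seconds" estimate implies, and 2000 MB/s is
only a guess at what an NVMe drive might sustain:

```python
def read_seconds(total_gb, throughput_mb_s):
    """Time to read total_gb of data at the given sustained throughput."""
    return total_gb * 1000 / throughput_mb_s


def required_mb_s(total_gb, cpu_seconds):
    """Throughput the disk must sustain for the CPU work to be the limit."""
    return total_gb * 1000 / cpu_seconds


def is_io_bound(total_gb, cpu_seconds, throughput_mb_s):
    """True if just reading the data takes longer than the processing."""
    return read_seconds(total_gb, throughput_mb_s) > cpu_seconds


# 10 GB of images with about 20 seconds of processing:
print(required_mb_s(10, 20))      # MB/s needed to keep up with the CPU
print(is_io_bound(10, 20, 110))   # assumed ordinary disk
print(is_io_bound(10, 20, 2000))  # assumed NVMe drive
```

Under those assumptions, the disk would have to deliver 500 MB/s to
keep up with 20 seconds of processing for 10 GB, which an ordinary
disk cannot do and a fast NVMe drive plausibly can.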

I think this problem is solved.
-- 
Robert Krawitz                                     <rlk at alum.mit.edu>

***  MIT Engineers   A Proud Tradition   http://mitathletics.com  ***
Member of the League for Programming Freedom  --  http://ProgFree.org
Project lead for Gutenprint   --    http://gimp-print.sourceforge.net

"Linux doesn't dictate how I work, I dictate how Linux works."
--Eric Crampton
