[KPhotoAlbum] More thumbnail investigations
rlk at alum.mit.edu
Mon May 14 13:41:43 BST 2018
I was able to get the thumbnail build time for ~10000 images down
further, but it took somewhat drastic measures.
There are two ways to load JPEG files using libjpeg: from a FILE *
and, as of libjpeg 8, from a memory buffer. Loading the file into
memory myself and then feeding the memory buffer to libjpeg improved
the load time with 3 threads from 20 to 18 minutes (which is
significant, since nothing else I had tried, including increasing the
stdio buffer size, made a difference). It also decreased the I/O rate
to around 60 operations/sec. Increasing the max threads from 3 to 8
got the time down to 16 minutes, with slightly higher I/O rates. I'm
using 20 MB as the upper limit on the size of files loaded that way,
just for experimentation.
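
As an illustration of the load-into-memory approach, here is a minimal
sketch in C++. It is not the actual KPhotoAlbum code: the function name
slurp_file and the constant kMaxSlurpBytes are made up for this
example, and only the 20 MB cap comes from the text above. The idea is
to read the whole file with one large fread() and then hand the buffer
to the decoder.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>
#include <utility>
#include <vector>

// Experimental cap from the text; larger files are not slurped.
// (The constant name is hypothetical.)
constexpr std::size_t kMaxSlurpBytes = 20u * 1024u * 1024u;

// Read an entire file into `out` with a single large fread().
// Returns false (leaving `out` untouched) if the file cannot be opened,
// is larger than max_bytes, or cannot be read completely; the caller
// would then fall back to the FILE *-based path.
bool slurp_file(const std::string &path, std::vector<unsigned char> &out,
                std::size_t max_bytes = kMaxSlurpBytes)
{
    std::FILE *fp = std::fopen(path.c_str(), "rb");
    if (!fp)
        return false;

    bool ok = false;
    if (std::fseek(fp, 0, SEEK_END) == 0) {
        long size = std::ftell(fp);
        if (size >= 0 && static_cast<unsigned long>(size) <= max_bytes &&
            std::fseek(fp, 0, SEEK_SET) == 0) {
            std::vector<unsigned char> buf(static_cast<std::size_t>(size));
            if (buf.empty() ||
                std::fread(buf.data(), 1, buf.size(), fp) == buf.size()) {
                out = std::move(buf);
                ok = true;
            }
        }
    }
    std::fclose(fp);
    return ok;
}

// With libjpeg 8 or later (or libjpeg-turbo), the buffer would then be
// passed to jpeg_mem_src(&cinfo, out.data(), out.size()) in place of
// jpeg_stdio_src(&cinfo, fp).
```

The size check happens before any allocation, so an oversized file
costs one fopen() and one fseek() before falling back to the old path.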
Even with such a large number of threads, it's using very little CPU
time -- mostly about 8-13% (less than one hyperthread). iostat
indicates that it's spending between 1/2 and 2/3 of the time in I/O.
Running it on the SSD, I got well in excess of 400 MB/sec, with rather
modest IOPS in the range of 500/sec, indicating average I/O size on
the order of 1 MB. That's pretty close to saturating the SATA SSD,
which is rated in the range of 500 MB/sec (and is far better than I
can get with any single-threaded program). That lends further
credence to my hypothesis that it's I/O limited with more typical
image storage. However, the iostat numbers I'm getting on the regular
disk don't look saturated; it should be able to sustain about
100-120 MB/sec and 120 I/O operations/sec or thereabouts.
To do better, I'd likely either need to use a scout thread or increase
the number of threads still further. Due to the file buffers, that
would likely increase memory consumption, although at least on my
system (which has plentiful memory by typical standards) that's not
likely to cause a major problem. Introducing a scout thread into this
code would not be particularly easy.
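
To make the scout thread idea concrete, here is a rough standalone
sketch in C++. It is not based on the KPhotoAlbum code, and every name
in it (Scout, touch_file, max_lead) is hypothetical. The scout walks
the file list ahead of the workers and reads each file only to pull it
into the OS page cache, staying at most max_lead files ahead so that it
does not evict data the workers have not used yet.

```cpp
#include <condition_variable>
#include <cstddef>
#include <cstdio>
#include <mutex>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Read a file sequentially and discard the data; the only purpose is
// to warm the page cache. Returns the number of bytes read.
std::size_t touch_file(const std::string &path)
{
    std::FILE *fp = std::fopen(path.c_str(), "rb");
    if (!fp)
        return 0;
    std::vector<char> chunk(1 << 20);
    std::size_t total = 0, n;
    while ((n = std::fread(chunk.data(), 1, chunk.size(), fp)) > 0)
        total += n;
    std::fclose(fp);
    return total;
}

class Scout {
public:
    Scout(std::vector<std::string> paths, std::size_t max_lead)
        : paths_(std::move(paths)), max_lead_(max_lead)
    {
        thread_ = std::thread(&Scout::run, this);
    }
    ~Scout()
    {
        { std::lock_guard<std::mutex> lock(mutex_); stop_ = true; }
        cond_.notify_all();
        thread_.join();
    }
    // Workers call this after finishing file number `index` (0-based).
    void consumed(std::size_t index)
    {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            if (index + 1 > consumed_)
                consumed_ = index + 1;
        }
        cond_.notify_all();
    }
    // Block until the scout has gone through every path.
    void wait_done()
    {
        std::unique_lock<std::mutex> lock(mutex_);
        cond_.wait(lock, [this] { return scouted_ == paths_.size(); });
    }
    std::size_t bytes_read()
    {
        std::lock_guard<std::mutex> lock(mutex_);
        return bytes_;
    }

private:
    void run()
    {
        for (std::size_t i = 0; i < paths_.size(); ++i) {
            {
                // Wait until file i is within max_lead of the workers.
                std::unique_lock<std::mutex> lock(mutex_);
                cond_.wait(lock, [this, i] {
                    return stop_ || i < consumed_ + max_lead_;
                });
                if (stop_)
                    return;
            }
            std::size_t n = touch_file(paths_[i]);
            {
                std::lock_guard<std::mutex> lock(mutex_);
                bytes_ += n;
                scouted_ = i + 1;
            }
            cond_.notify_all();
        }
    }

    std::vector<std::string> paths_;
    std::size_t max_lead_;
    std::size_t consumed_ = 0;  // files the workers have finished
    std::size_t scouted_ = 0;   // files the scout has read
    std::size_t bytes_ = 0;
    bool stop_ = false;
    std::mutex mutex_;
    std::condition_variable cond_;
    std::thread thread_;        // assigned last, in the constructor body
};
```

On Linux, a lighter-weight variant might call posix_fadvise() with
POSIX_FADV_WILLNEED instead of actually reading the data; whether that
helps here is untested.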
The best solution would be to generate thumbnails upon image load for
images up to a certain size. That would combine nicely with the MD5
code, which can also profit from having the entire file (since the
underlying crypto code in Qt only does 16K I/O ops). We could always
postpone the thumbnail generation for really big files (and files that
need load methods other than JPEG or thumbnail extraction from RAW).
This work may not be entirely trivial, but it could have a pretty big
payoff when loading files.
Robert Krawitz <rlk at alum.mit.edu>
*** MIT Engineers A Proud Tradition http://mitathletics.com ***
Member of the League for Programming Freedom -- http://ProgFree.org
Project lead for Gutenprint -- http://gimp-print.sourceforge.net
"Linux doesn't dictate how I work, I dictate how Linux works."