[KPhotoAlbum] Very, very weird performance problem

Robert Krawitz rlk at alum.mit.edu
Sun May 20 02:49:23 BST 2018


Well, I guess the first three words are redundant.

(One quick aside, it looks like readdir_r is deprecated, so we should
remove it from FastDir, but I verified that that had no impact on my
observations below.)

I got things to the point where the performance on my test directory
is essentially I/O limited; it took about 16:30 to read all of the
files, which works out to about 90 MB/sec.  That's not too bad, but
the disk should be capable of about 115 MB/sec.  So just for yuks, I
tried cat'ing all of the files through dd to observe the
performance...and it took about 14 minutes, somewhere up around 112
MB/sec.  So it sure looks like I'm not I/O limited...except that I'm
using about 6% CPU time, am spending a lot of time in I/O wait, and am
seeing about 100-110 IO/sec.

Just on a hunch, I looked at the order in which readdir() returned the
files; it looked random.  Which is strange since ext4 uses a B-tree
(well, an H-tree, but same difference) to store directory entries, and
the files were laid down on the disk in alphanumeric order (read out
from the camera by a script that does "for i in img_*; do...").  So
for the heck of it, I tried modifying kpa to sort the filenames...and
guess what, I got times in the 14:30-15:00 range.  Well, that's weird
(and reproducible).  So I captured the order in which kpa read
filenames via readdir, and pushed that list through the same cat |
dd...and now it takes about 1055 seconds (17:35 or so).

The other oddity is that find prints the filenames in alphabetical
order.  But I verified with other directories that it doesn't actually
sort the filenames; it just uses getdents to grab all of the entries.
So obviously these files *are* sorted in the directory, and the inodes
happen to be sequential too (again, no surprise).  It would appear
that what's happening is that the kernel's having to do extra I/O to
look up inodes when they're not in sequence (for whatever value of
"sequence"), and that even with 10-20 MB files, the extra IO ops are
significant.

The really squirrely thing is why readdir() appears to be messing up
the order.  Again, it's apparent that find is not deliberately sorting
by either filename or inode number.

I don't particularly want to put anything that ad hoc in code,
especially since it's dependent on cthulu-knows-what, but a difference
of 15% or so in load time is not insignificant.  It is likely that
when people are loading new images that they'll have been written in
sequence, since digital cameras write files in sequence.

I'm trying this with another directory to see what happens.  But I am
glad to see at least that kpa can load photos and be fully limited by
irreducible I/O (the I/O needed to compute MD5 checksums on all
files), at low CPU utilization.


[2(rlk)||{!52}<rlk-mobile>/mnt_images2/images/7dmk2/dcim/100eos7d]
$ /usr/bin/time cat /tmp/files.out |xargs cat | dd of=/dev/null bs=10485760
0.00user 0.00system 0:00.00elapsed 42%CPU (0avgtext+0avgdata 1548maxresident)k
0inputs+0outputs (0major+118minor)pagefaults 0swaps
cat: .thumbnails: Is a directory
0+702571 records in
0+702571 records out
94791940984 bytes (95 GB, 88 GiB) copied, 1055.95 s, 89.8 MB/s
[2(rlk)||{!53}<rlk-mobile>/mnt_images2/images/7dmk2/dcim/100eos7d]
$ /usr/bin/time cat /tmp/files.out |xargs cat | dd of=/dev/null bs=10485760
0.00user 0.00system 0:00.00elapsed 42%CPU (0avgtext+0avgdata 1532maxresident)k
0inputs+0outputs (0major+120minor)pagefaults 0swaps
cat: .thumbnails: Is a directory
0+680643 records in
0+680643 records out
94791940984 bytes (95 GB, 88 GiB) copied, 1055.5 s, 89.8 MB/s
[2(rlk)||{!53}<rlk-mobile>/mnt_images2/images/7dmk2/dcim/100eos7d]
$ /usr/bin/time sort /tmp/files.out |xargs cat | dd of=/dev/null bs=10485760
0.00user 0.00system 0:00.00elapsed 66%CPU (0avgtext+0avgdata 2328maxresident)k
240inputs+0outputs (1major+279minor)pagefaults 0swaps
cat: .thumbnails: Is a directory
0+685892 records in
0+685892 records out
94791940984 bytes (95 GB, 88 GiB) copied, 847.027 s, 112 MB/s
[2(rlk)||{!54}<rlk-mobile>/mnt_images2/images/7dmk2/dcim/100eos7d]
$ /usr/bin/time sort /tmp/files.out |xargs cat | dd of=/dev/null bs=10485760
0.01user 0.00system 0:00.01elapsed 83%CPU (0avgtext+0avgdata 2244maxresident)k
224inputs+0outputs (2major+272minor)pagefaults 0swaps
cat: .thumbnails: Is a directory
0+696308 records in
0+696308 records out
94791940984 bytes (95 GB, 88 GiB) copied, 889.148 s, 107 MB/s

-- 
Robert Krawitz                                     <rlk at alum.mit.edu>

***  MIT Engineers   A Proud Tradition   http://mitathletics.com  ***
Member of the League for Programming Freedom  --  http://ProgFree.org
Project lead for Gutenprint   --    http://gimp-print.sourceforge.net

"Linux doesn't dictate how I work, I dictate how Linux works."
--Eric Crampton



More information about the Kphotoalbum mailing list