[KPhotoAlbum] Hopefully final (at least for now) performance improvements
Robert Krawitz
rlk at alum.mit.edu
Tue May 22 03:07:18 BST 2018
BTW, I'd like the NFS users to give this a try. I suspect that NFS,
like SATA SSD, will actually benefit from more scout threads, so you
may want to try increasing that (in DB/NewImageFinder.cpp on the
Load-performance branch, try setting imageScoutCount to 2 or 3 and see
if you do any better).
Perhaps counterintuitively, I suspect that NFS behaves a lot like a
very slow SATA SSD -- the actual transfer rate off the media is very
fast compared to the protocol latency, so a higher degree of
parallelism will allow for better I/O overlap and therefore better
throughput. With hard disks, on the other hand, transfers are
dominated by rotational speed and by rotational and head-seek latency --
things that are not amenable to parallelization. The only thing the
scout thread really gives us -- and it's not unimportant, to the tune
of 10% -- is the ability to keep the disk busy, because we don't have
to have the disk wait for us to finish processing an image and load
another one. So we can pipeline operations on HDDs, but not really
parallelize them, whereas we can with SATA SSD, and I suspect with NFS
too.
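To put that reasoning in concrete terms, here is a rough
back-of-the-envelope model in C++. It is only a sketch: the Device
struct, the estimatedThroughput function, and any numbers fed to it
are made up for illustration and are not part of KPhotoAlbum or
measured from real hardware. It also ignores the pipelining gain that
a single scout thread gives on hard disks.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical device description; all figures are illustrative only.
struct Device {
    double latencySec;            // per-request latency (protocol round trip or seek)
    double bandwidthBytesPerSec;  // peak transfer rate once data is flowing
    bool latencyOverlaps;         // true if waits of concurrent requests can overlap
};

// Estimated throughput in bytes/sec for `threads` concurrent readers,
// each fetching files of `fileBytes` bytes.
double estimatedThroughput(const Device &dev, double fileBytes, int threads)
{
    assert(threads >= 1);
    const double transferSec = fileBytes / dev.bandwidthBytesPerSec;
    const double oneStream = fileBytes / (dev.latencySec + transferSec);
    if (!dev.latencyOverlaps)
        return oneStream;  // seeks serialize, so extra readers add nothing
    // Overlapping waits scale with the reader count until the link saturates.
    return std::min(threads * oneStream, dev.bandwidthBytesPerSec);
}
```

With, say, 10 ms of latency, 100 MB/sec of bandwidth and 1 MB files,
this model gives roughly 50 MB/sec for one reader and saturates the
link with two when the waits overlap, while the non-overlapping case
stays at roughly 50 MB/sec no matter how many readers are added.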
NVMe is something else; the interface transfer rate is quite a lot
faster, but the latency is also a lot lower. But the image loading
pipeline simply isn't fast enough to keep up with it on the processors
I currently have available. It's possible that an i7-8700K or i9-7940X or the
like might just be fast enough to take some advantage of an NVMe
device, particularly if overclocked (the former because of its single
thread performance, the latter because of the combination of very good
single thread performance and high thread count to process
thumbnails). I don't have such a system available, but it's possible
that either of those with fast memory might be able to load images at
1-1.2 GB/sec with the kpa image pipe, which is a pretty good match for
NVMe throughput. Either one would likely need some extra scout
threads. But in reality, someone needs to have an awfully big photo
shoot, a ridiculous way to transfer data to the system (maybe raw
4K or 8K video frames over Infiniband?), and an absurd budget to make
this meaningful in any practical sense.
NVMe is simply too fast right now for most workloads to take full
advantage of it. Historically, it's not common for the CPU to be the
limiting factor with data-intensive workloads, but NVMe with current
CPUs is an exception.
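As a rough illustration of that CPU-versus-storage argument, here is
a small C++ sketch that reports which side limits a load. The names
(estimatePipeline, PipelineEstimate, Bottleneck) are hypothetical, and
it assumes processing scales linearly with worker threads, which real
systems will not quite manage.

```cpp
#include <cassert>

enum class Bottleneck { Storage, Cpu };

// Hypothetical result type: achievable rate and what limits it.
struct PipelineEstimate {
    double bytesPerSec;
    Bottleneck limitedBy;
};

// Compare what the device can deliver against what the workers can
// process, assuming (optimistically) perfect scaling across threads.
PipelineEstimate estimatePipeline(double storageBytesPerSec,
                                  double perThreadBytesPerSec,
                                  int workerThreads)
{
    assert(workerThreads >= 1);
    const double cpuRate = perThreadBytesPerSec * workerThreads;
    if (cpuRate < storageBytesPerSec)
        return {cpuRate, Bottleneck::Cpu};
    return {storageBytesPerSec, Bottleneck::Storage};
}
```

With made-up figures, a 3 GB/sec device fed by 4 workers at
0.2 GB/sec each comes out CPU-limited at about 0.8 GB/sec, whereas
the same workers behind a 0.15 GB/sec disk are storage-limited.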
--
Robert Krawitz <rlk at alum.mit.edu>
*** MIT Engineers A Proud Tradition http://mitathletics.com ***
Member of the League for Programming Freedom -- http://ProgFree.org
Project lead for Gutenprint -- http://gimp-print.sourceforge.net
"Linux doesn't dictate how I work, I dictate how Linux works."
--Eric Crampton