[KPhotoAlbum] Speaking of performance...
rlk at alum.mit.edu
Sun Feb 17 00:06:16 GMT 2019
On Sun, 27 Jan 2019 10:28:38 -0500, Robert Krawitz wrote:
> We really do need to attack the startup performance somehow, for
> people (like myself) with big collections. I currently have a little
> over 300,000 shots in my collection, and depending upon how well we do
> in the postseason that could increase by another 10% over the next few
> months. It takes about 13 seconds for kpa to start up, largely due to
> the XML parsing. Since the XML file is "only" 57 MB, it's clearly not
> I understand (and agree with) the desire for a readable and editable
> file format. I've fixed things up myself on occasion. But I don't
> want to pay that kind of startup price every time.
> What I'm thinking in terms of is to save the file in two formats, a
> fast format (which could be an SQL database, a binary serialization,
> or such) and the XML format. The fast format would have an embedded
> timestamp; if the XML file were newer, it would be used instead, or
> the user would be prompted to choose which.
> Autosave would save only the fast format (possibly only a delta, but
> that would likely be quite difficult). Full save would save both
> formats; if we were really clever, we might be able to parallelize the
> two operations.
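
For illustration, the timestamp rule in the quoted proposal could be sketched roughly as below. This is only a sketch under assumptions: the names `Source` and `chooseSource` are hypothetical, and both times are taken to be plain seconds since the epoch. It does not cover the alternative of prompting the user.

```cpp
#include <cstdint>

// Hypothetical: which on-disk representation to load at startup.
enum class Source { FastCache, Xml };

// Prefer the fast format unless the XML file is newer than the
// timestamp embedded in the fast format (e.g. after a hand edit).
Source chooseSource(std::int64_t fastEmbeddedSecs, std::int64_t xmlModifiedSecs)
{
    return xmlModifiedSecs > fastEmbeddedSecs ? Source::Xml : Source::FastCache;
}
```

With equal timestamps this picks the fast format, which matches a full save that writes both files together.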
So one possible way to eke out a little improvement -- I haven't
actually tried this to evaluate the performance difference, but it
saved about 20% (44 MB vs. 55 MB) in size -- might be to use
single-character XML tags rather than the verbose ones we currently
use. It would at least allow dispatching via a switch on the first
character of the tag rather than having to do a strcmp.
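
A rough sketch of the dispatch difference, assuming a hypothetical mapping of one-character tags to element kinds (the tag letters, the `Tag` enum and both function names are made up for illustration and are not KPhotoAlbum's actual parser):

```cpp
#include <string_view>

// Hypothetical element kinds; the real set of tags is not listed here.
enum class Tag { Image, Option, Value, Unknown };

// Single-character tags: one switch on the first (and only) character.
Tag classifyShortTag(std::string_view name)
{
    if (name.size() != 1)
        return Tag::Unknown;
    switch (name[0]) {
    case 'i': return Tag::Image;
    case 'o': return Tag::Option;
    case 'v': return Tag::Value;
    default:  return Tag::Unknown;
    }
}

// Verbose tags: a chain of full string comparisons, one per candidate.
Tag classifyLongTag(std::string_view name)
{
    if (name == "image")  return Tag::Image;
    if (name == "option") return Tag::Option;
    if (name == "value")  return Tag::Value;
    return Tag::Unknown;
}
```

Whether this is measurably faster in practice would depend on how much of the load time is spent in tag comparison rather than elsewhere in the XML parser.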
We could also store the times as a simple decimal (or hex?) seconds
since the epoch. In decimal, these would be 10 or (for old photos) 9
bytes; in hex they would be 8 bytes for quite a while yet. That
compares to 19 currently, plus the more complex parsing needed.
Saving 10 bytes per photo, with 300,000 photos, would represent
another 3 MB or so, which would bring the total to about a 25% saving
over the current format (although that would make dates unreadable).
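
To make the byte counts concrete, here is a small sketch. The helper names are hypothetical, and it assumes one timestamp per photo and a 19-character date string of the form "2019-02-17T00:06:16".

```cpp
#include <cstdint>
#include <cstdio>
#include <string>

// Seconds since the epoch as decimal text (10 digits for current dates).
std::string epochDecimal(std::int64_t secs)
{
    return std::to_string(secs);
}

// Seconds since the epoch as lowercase hex text (8 digits until 2106).
std::string epochHex(std::int64_t secs)
{
    char buf[17];
    std::snprintf(buf, sizeof buf, "%llx",
                  static_cast<unsigned long long>(secs));
    return buf;
}

// Total bytes saved if each photo's record shrinks by bytesPerPhoto.
constexpr std::int64_t totalSaving(std::int64_t photos,
                                   std::int64_t bytesPerPhoto)
{
    return photos * bytesPerPhoto;
}
```

For a value from February 2019 such as 1550361976, `epochDecimal` yields 10 characters and `epochHex` yields 8, against 19 for the readable form; `totalSaving(300000, 10)` gives 3,000,000 bytes, i.e. the "3 MB or so" above.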
I suspect the improvement in load time would be small, very likely not
enough to really matter. In any event, it will be a while yet before
I have any chance to look at this.
Robert Krawitz <rlk at alum.mit.edu>
*** MIT Engineers A Proud Tradition http://mitathletics.com ***
Member of the League for Programming Freedom -- http://ProgFree.org
Project lead for Gutenprint -- http://gimp-print.sourceforge.net
"Linux doesn't dictate how I work, I dictate how Linux works."