[KPhotoAlbum] Search performance
Andreas Schleth
schleth_es at web.de
Thu Oct 18 22:02:41 BST 2018
Hi everybody,
On 18.10.18 at 21:26, Johannes Zarl-Zierl wrote:
...
> I should have mentioned in the first mail what I mean by "canonical file
> format". I've no problems with storing the data into a persistent database for
> caching.
> But I still think that the index.xml format has good properties (resistant
> against file corruption, easy/robust versioning, readable and writable "by
> hand"). Also, many people use kphotoalbum on different machines in different
> versions - with the XML format, you can easily pull that off as long as you
> take some care.
Yes, yes, yes!
E.g., I still use an old KPA 4.2 with all the glorious KIPI plugins to
shift timestamps (when someone gives me pictures with the date/time off,
to sync them with my own images). This works nicely with index files that
are otherwise used with the latest git master.
And I occasionally tweak the database manually. E.g., setting the time to
somewhere between Jan 1st and Dec 31st makes the image show up in at
least 2 consecutive years. Changing this to Jan 2nd and Dec 30th takes
just two commands in vim.
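
For anyone who would rather script that tweak than do it in vim, here is a minimal sketch. It assumes the date range is stored in index.xml as `startDate`/`endDate` attributes with ISO-style timestamps; those names and the sample line are assumptions, so check your own file (and work on a copy) first. The helper `narrow_year_ranges` is hypothetical and not part of KPA.

```python
import re

# ASSUMPTION: attribute names and timestamp layout as described above;
# verify against your own index.xml before running this on real data.
START = re.compile(r'(startDate="\d{4})-01-01T')
END = re.compile(r'(endDate="\d{4})-12-31T')


def narrow_year_ranges(xml_text: str) -> str:
    """Move Jan 1st start dates to Jan 2nd and Dec 31st end dates to Dec 30th."""
    xml_text = START.sub(r"\g<1>-01-02T", xml_text)
    return END.sub(r"\g<1>-12-30T", xml_text)


# Made-up sample line, for illustration only.
sample = (
    '<image file="a.jpg" startDate="2004-01-01T00:00:00" '
    'endDate="2004-12-31T23:59:59"/>'
)
print(narrow_year_ranges(sample))
```

This is the same pair of substitutions one would type in vim, just wrapped so it can be applied to the whole file text in one go.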
Even if I am usually a bit critical about XML because it is a bit chatty
(lots of text in names and attributes), it has the great benefit of
being very robust. Robustness must come first, then the code has to be
understood by future maintainers, then performance. We are talking about
data that we want to keep for (many) years to come. My own databases
date back to 2004/2005, when Blackie himself twiddled with the code.
This is at least 2 generations of maintainers back.
Thus, everybody involved did a really terrific job in keeping the file
format stable and backwards compatible over so long a time frame.
> If we take the caching approach, we should be able to eat our cake (index.xml
> format, fast queries) and still have half of it (usually fast loading with
> "slow" saving to index.xml).
>
I somewhat doubt that a large number of images really makes loading much
slower. There are other factors too, such as (maybe) total tree size or
type and size of media.
My image databases all load fairly quickly - all around 30 to 40k images:
as@wshome5:~/eigene_Bilder> time kphotoalbum -c index.xml
real 0m8,219s
user 0m5,219s
sys 0m0,448s
(open & close without save / tree size: 141,449,556 kB / 35457 images /
index 31 MB)
My movie database with only around 1k clips and movies takes "forever"
to load:
as@wshome5:~/Filme> time kphotoalbum -c index.xml
real 0m40,944s
user 0m8,874s
sys 0m4,718s
(open & close without saving / tree size: 1,568,340,336 kB / 1100 films
/ index 1.7 MB)
This big difference tells me (I did not look into the code) that looking
at a few large files takes KPA much longer than looking at many smaller
ones...
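
To put numbers on that, here is a small sketch that normalises the two runs above per item. The figures are copied from the timings in this mail; the variable and function names are made up for illustration, and treating "kB" as units of 1024 bytes is an assumption.

```python
# Figures copied from the two `time` runs above (real time, item count, tree size).
runs = {
    "photos": {"real_s": 8.219, "items": 35457, "tree_kb": 141_449_556},
    "films": {"real_s": 40.944, "items": 1100, "tree_kb": 1_568_340_336},
}


def per_item_ms(run: dict) -> float:
    """Wall-clock load time per database entry, in milliseconds."""
    return run["real_s"] / run["items"] * 1000


def avg_file_mb(run: dict) -> float:
    """Average media file size in MB (ASSUMPTION: 1 kB = 1024 bytes)."""
    return run["tree_kb"] / run["items"] / 1024


for name, run in runs.items():
    print(f"{name}: {per_item_ms(run):.2f} ms per item, "
          f"{avg_file_mb(run):.0f} MB per file on average")
```

Per entry, the film database comes out at well over a hundred times the load time of the photo database, while the average film file is several hundred times larger than the average photo. That fits the guess that file size rather than entry count dominates, but it does not prove it.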
All my files sit on an NFS share (spinning rust) via Gigabit Ethernet.
Just my thoughts.
Best regards & thanks for keeping the project alive!
Andreas