[KPhotoAlbum] Optimization of index.xml

Jesper K. Pedersen blackie at blackie.dk
Mon Aug 28 01:43:32 BST 2006


Hmmm, I'm rather undecided on this. It would indeed make it harder to write
scripts and similar things against the index.xml file.

Here are a few random thoughts:
- If it really would change the speed, then I would be rather interested.
- My long-term goal is to get away from index.xml and use a database
instead (no no, breathe again, please, and read the rest of the sentence :).
This database should be something which does not require any installation,
like SQLite. In addition there would be an option to export the db to the
index.xml format on exit, and an option to import from this file, so that
people would still have this safety net. The reason for this move would be to
free resources from maintaining two backends.
- The compressed index.xml option in the settings menu actually saves only
an index for each image; did you try that?
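
To make the database idea a bit more concrete, here is a minimal sketch in Python of the import/export round trip. Everything in it is an assumption made for illustration: the table layout, the three columns, and the function names import_xml and export_xml are hypothetical, and the element and attribute names are a simplified stand-in for the real index.xml format, not KPhotoAlbum's actual implementation.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified schema: one row per image.
SCHEMA = (
    "CREATE TABLE IF NOT EXISTS images "
    "(file TEXT PRIMARY KEY, label TEXT, angle INTEGER)"
)


def import_xml(conn, xml_text):
    """Load <image> elements from an index.xml-like string into the db."""
    conn.execute(SCHEMA)
    root = ET.fromstring(xml_text)
    for img in root.iter("image"):
        conn.execute(
            "INSERT OR REPLACE INTO images (file, label, angle) "
            "VALUES (?, ?, ?)",
            (img.get("file"), img.get("label", ""), int(img.get("angle", "0"))),
        )
    conn.commit()


def export_xml(conn):
    """Dump the db back to an index.xml-like string (the safety net)."""
    root = ET.Element("KPhotoAlbum")
    images = ET.SubElement(root, "images")
    rows = conn.execute("SELECT file, label, angle FROM images ORDER BY file")
    for file, label, angle in rows:
        ET.SubElement(images, "image", file=file, label=label, angle=str(angle))
    return ET.tostring(root, encoding="unicode")
```

Since sqlite3 ships with Python and works on a single file (or ":memory:"), nothing has to be installed, which is the property asked for above. Calling export_xml on exit and import_xml on demand would keep the XML file usable as a backup.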

Cheers
Jesper.

On Sunday 27 August 2006 20:34, Robert L Krawitz wrote:
| I think we could further optimize the index.xml file by removing data
| that either has obvious defaults or can otherwise be computed easily
| without having to look at the actual image file.
|
| 1) What's the purpose of storing both a startDate and an endDate in
|    the index.xml file?  Is this for videos (and if so, my index.xml
|    shows identical start and end dates for my videos)?  Would it make
|    sense to store only the startDate unless the endDate differs?
|
| 2) All images have a "description", even though I rarely use it.
|    Would it make more sense to not insert the description unless it's
|    actually present?
|
|    Also, would it make more sense for the description to be a child of
|    the image, rather than an attribute?  That way it could be free
|    text.
|
| 3) The angle is always stored, even though for most people it's 0
|    (landscape format) for most images.  Again, would it make more
|    sense to only store this if needed?
|
| 4) Finally, the label is usually (if not always) simply the basename
|    of the image.  Would it be better to not actually store this and
|    simply find it when loading the file?  It could be found
|    efficiently while parsing the folder -- simply skip beyond the last
|    separator and search for the final . in the filename.
|
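
Points 1) to 4) all follow the same pattern: leave an attribute out when writing if it equals a default that can be recomputed when loading. A minimal sketch in Python, assuming each image is a plain dict of string attributes; the helper names default_label, compact and expand are hypothetical and only illustrate the idea:

```python
import posixpath


def default_label(file_path):
    """Basename without its final extension: skip past the last
    separator, then cut at the final '.' in the filename."""
    base = posixpath.basename(file_path)
    stem, _ext = posixpath.splitext(base)
    return stem


def compact(attrs):
    """Keep only the attributes that differ from their defaults."""
    out = {"file": attrs["file"], "startDate": attrs["startDate"]}
    # 1) endDate only if it differs from startDate
    if attrs.get("endDate", attrs["startDate"]) != attrs["startDate"]:
        out["endDate"] = attrs["endDate"]
    # 2) description only if non-empty
    if attrs.get("description"):
        out["description"] = attrs["description"]
    # 3) angle only if non-zero
    if int(attrs.get("angle", 0)) != 0:
        out["angle"] = str(attrs["angle"])
    # 4) label only if it is not simply derived from the file name
    label = default_label(attrs["file"])
    if attrs.get("label", label) != label:
        out["label"] = attrs["label"]
    return out


def expand(attrs):
    """Inverse of compact(): fill in the defaults that were left out."""
    full = dict(attrs)
    full.setdefault("endDate", attrs["startDate"])
    full.setdefault("description", "")
    full.setdefault("angle", "0")
    full.setdefault("label", default_label(attrs["file"]))
    return full
```

As long as expand(compact(x)) gives back x for every image, nothing is lost by writing the shorter form.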
| Some stats for my current index.xml:
|
|                 Size      % vs. snapshot   % vs. SVN
| Last snapshot:  8801509   100.0            N/A
| Current SVN:    7108972    80.8            100.0
| (1):            6616631    75.2             93.1
| (2):            6372641    72.4             89.6
| (3):            6232615    70.8             87.7
| (4):            5953336    67.6             83.7
|
| We could save more by storing option values as their ids rather than
| in actual text form.  That would offer the potential of quite
| substantial savings, but I'm not so sure that we should do that
| because it's a lot riskier if something goes wrong -- if index numbers
| get mixed up, it could be very hard to unscramble -- and because it
| makes it harder for someone to examine the file.  On the other hand, I
| don't really see why we need to have the index numbers stored for each
| value as opposed to simply building up the list of values as the file
| is loaded (is it to preserve ordering in the attribute lists?).
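
The alternative raised in the last sentence, building up the list of values as the file is loaded instead of storing an index number with each value, could look roughly like this in Python. This is only a sketch under assumptions: the input is taken to be a list of per-image value lists, and the names intern_values and decode are hypothetical.

```python
def intern_values(images):
    """Assign each distinct value a small integer id in first-seen order.

    Returns (values, encoded): values[i] is the text for id i, and
    encoded mirrors the input with every value replaced by its id.
    """
    ids = {}      # value text -> id
    values = []   # id -> value text
    encoded = []
    for tags in images:
        row = []
        for tag in tags:
            if tag not in ids:
                ids[tag] = len(values)
                values.append(tag)
            row.append(ids[tag])
        encoded.append(row)
    return values, encoded


def decode(values, encoded):
    """Turn ids back into text; only correct with the matching values list."""
    return [[values[i] for i in row] for row in encoded]
```

The sketch also shows where the risk mentioned above comes from: the encoded rows mean nothing without exactly the values list they were built with, so a reordered or damaged list silently relabels every image.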

-- 
Having trouble finding a given image in your collection containing
thousands of images?

http://www.kphotoalbum.org might be the answer.



