[KPhotoAlbum] Optimization of index.xml

Robert L Krawitz rlk at alum.mit.edu
Mon Aug 28 01:34:57 BST 2006


I think we could further optimize the index.xml file by removing data
that either has obvious defaults or can otherwise be computed easily
without having to look at the actual image file.

1) What's the purpose of storing both a startDate and an endDate in
   the index.xml file?  Is this for videos?  (If so, note that my
   index.xml shows identical start and end dates for my videos.)
   Would it make sense to store only the startDate unless the endDate
   differs?

2) All images have a "description", even though I rarely use it.
   Would it make more sense to not insert the description unless it's
   actually present?

   Also, would it make more sense for the description to be a child of
   the image, rather than an attribute?  That way it could be free
   text.

3) The angle is always stored, even though for most people it's 0
   (landscape format) for the majority of their images.  Again, would
   it make more sense to store it only when it's nonzero?

4) Finally, the label is usually (if not always) simply the basename
   of the image.  Would it be better to not store it at all when it
   matches the basename, and simply derive it when loading the file?
   It could be derived efficiently while parsing the file name --
   skip past the last path separator, then cut the name at its final
   '.' to drop the extension.
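
To make points 1-4 concrete, here is a minimal sketch in Python of a
writer that omits the defaults and a reader that restores them.  It is
not KPhotoAlbum's actual code: the "image" element name, the "file"
attribute and the helper names are assumptions made for illustration;
only startDate, endDate, description, angle and label come from the
discussion above.

```python
import xml.etree.ElementTree as ET


def default_label(path):
    """Label derived from the file name: basename without the final extension."""
    base = path.replace("\\", "/").rsplit("/", 1)[-1]  # skip past the last separator
    dot = base.rfind(".")                              # position of the final '.'
    return base[:dot] if dot > 0 else base


def write_image(file, start_date, end_date=None, description="", angle=0, label=None):
    """Build an <image> element, leaving out anything that equals its default."""
    attrs = {"file": file, "startDate": start_date}
    if end_date is not None and end_date != start_date:   # point 1
        attrs["endDate"] = end_date
    if angle != 0:                                        # point 3
        attrs["angle"] = str(angle)
    if label is not None and label != default_label(file):  # point 4
        attrs["label"] = label
    elem = ET.Element("image", attrs)
    if description:                                       # point 2: child, not attribute
        ET.SubElement(elem, "description").text = description
    return elem


def read_image(elem):
    """Restore the omitted defaults when loading."""
    file = elem.get("file")
    start = elem.get("startDate")
    return {
        "file": file,
        "startDate": start,
        "endDate": elem.get("endDate", start),
        "angle": int(elem.get("angle", "0")),
        "label": elem.get("label", default_label(file)),
        "description": elem.findtext("description", default=""),
    }
```

With this scheme an unrotated image with no description, a single date
and the default label is written with only the file and startDate
attributes, and read_image() still returns all six fields.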

Some stats for my current index.xml:

                Size (bytes)   % vs. snapshot   % vs. SVN
Last snapshot:       8801509            100.0         N/A
Current SVN:         7108972             80.8       100.0
(1):                 6616631             75.2        93.1
(2):                 6372641             72.4        89.6
(3):                 6232615             70.8        87.7
(4):                 5953336             67.6        83.7
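
For anyone who wants to rerun the numbers on their own index.xml, the
percentage columns can be recomputed from the sizes with a few lines
of Python.  This is only a sketch; the sizes are the ones listed in
the table, and the variable and function names are made up here.

```python
# Sizes from the table, in the same row order.
sizes = [
    ("Last snapshot", 8801509),
    ("Current SVN", 7108972),
    ("(1)", 6616631),
    ("(2)", 6372641),
    ("(3)", 6232615),
    ("(4)", 5953336),
]

snapshot = sizes[0][1]
svn = sizes[1][1]


def percent(size, base):
    """Size as a percentage of a baseline, rounded to one decimal place."""
    return round(100.0 * size / base, 1)


# Each row: (name, size, % vs. snapshot, % vs. SVN).
rows = [(name, size, percent(size, snapshot), percent(size, svn))
        for name, size in sizes]

for name, size, vs_snapshot, vs_svn in rows:
    print(f"{name:<14} {size:>8} {vs_snapshot:>6} {vs_svn:>6}")
```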

We could save more by storing option values as their IDs rather than
in actual text form.  That would offer the potential of quite
substantial savings, but I'm not so sure that we should do that
because it's a lot riskier if something goes wrong -- if index numbers
get mixed up, it could be very hard to unscramble -- and because it
makes it harder for someone to examine the file.  On the other hand, I
don't really see why we need to have the index numbers stored for each
value as opposed to simply building up the list of values as the file
is loaded (is it to preserve ordering in the attribute lists?).
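
As an illustration of building up the list of values while loading,
rather than storing an index number with each value, here is a
minimal Python sketch.  The ValueTable class and its method names are
hypothetical and are not taken from KPhotoAlbum.

```python
class ValueTable:
    """Assigns IDs to option values in order of first appearance."""

    def __init__(self):
        self._ids = {}      # value -> id
        self._values = []   # id -> value

    def intern(self, value):
        """Return the ID for value, assigning the next free one if it is new."""
        ident = self._ids.get(value)
        if ident is None:
            ident = len(self._values)
            self._ids[value] = ident
            self._values.append(value)
        return ident

    def value(self, ident):
        """Look up the text of a previously assigned ID."""
        return self._values[ident]
```

IDs assigned this way depend only on the order in which values are
first seen during loading, so they are not stable between files.  If
the stored numbers do exist to preserve a particular ordering, that
ordering would have to be recorded some other way.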

-- 
Robert Krawitz                                     <rlk at alum.mit.edu>

Tall Clubs International  --  http://www.tall.org/ or 1-888-IM-TALL-2
Member of the League for Programming Freedom -- mail lpf at uunet.uu.net
Project lead for Gutenprint   --    http://gimp-print.sourceforge.net

"Linux doesn't dictate how I work, I dictate how Linux works."
--Eric Crampton



