[KPhotoAlbum] Optimization of index.xml
Jesper K. Pedersen
blackie at blackie.dk
Mon Aug 28 02:53:28 BST 2006
On Sunday 27 August 2006 21:47, Robert L Krawitz wrote:
| From: "Jesper K. Pedersen" <blackie at blackie.dk>
| Date: Sun, 27 Aug 2006 20:43:32 -0400
|
| Hmmm rather undecided on this. It would indeed make it harder to
| write scripts and other similar things against the index.xml file.
|
| I don't see how any of the suggestions I made would make it harder to
| write scripts against it, but I agree that storing option values as
| ID's would make it harder.
Well, scripts can't assume that a given item is there. I'm not saying it
gets much harder, but a little.
| Here are a few random thoughts:
| - if it really would change the speed, then I would be rather interested
|
| - My long term goal is to get away from the index.xml and rather
| use a database (no no, breathe again, please, and read the rest of
| the sentence :) this database should be something which does not
| require any installation, like sqlite. In addition there would be
| an option to export the db to the index.xml format on exit, and
| an option to import from this file, so that people would still have
| this safety net. The reason for this move would be to free resources
| from maintaining two backends.
|
| Actually, I have nothing really against a database back end per se,
| other than the fact that it seems like overkill. My intuition may not
| be correct, however. Certainly a flat file is very expensive if
| you're typically doing only a few updates, and that may be a very
| common way of doing things.
Well there are two issues.
1) loading everything into memory is bad when you have a big DB - heck, someone
sent me an index.xml file the other day that I could not open on my laptop
with 500 MB of RAM.
2) loading everything into memory means that only one person can access the db
at a time. Therefore you and your wife (or you and your coworkers) can't
annotate images at the same time.
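
To make the database idea concrete, here is a minimal Python sketch. It is not
KPhotoAlbum code: the table layout, the function names and the "images"/"image"
element names are assumptions made for illustration; only the attribute names
startDate, angle and description come from this thread. It uses sqlite3 from
the standard library, which needs no server installation, and shows the
export/import "safety net" as a round trip through XML.

```python
import sqlite3
import xml.etree.ElementTree as ET


def open_db(path=":memory:"):
    """Open (or create) the database; the schema here is hypothetical."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS image ("
        " file TEXT PRIMARY KEY,"
        " start_date TEXT NOT NULL,"
        " angle INTEGER NOT NULL DEFAULT 0,"
        " description TEXT NOT NULL DEFAULT '')"
    )
    return conn


def add_image(conn, file, start_date, angle=0, description=""):
    """Insert or update a single image without touching the other rows."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO image (file, start_date, angle, description)"
            " VALUES (?, ?, ?, ?)",
            (file, start_date, angle, description),
        )


def export_xml(conn):
    """Write every row as an <image> element and return the XML text."""
    root = ET.Element("images")
    query = "SELECT file, start_date, angle, description FROM image ORDER BY file"
    for file, start_date, angle, description in conn.execute(query):
        ET.SubElement(root, "image", file=file, startDate=start_date,
                      angle=str(angle), description=description)
    return ET.tostring(root, encoding="unicode")


def import_xml(conn, xml_text):
    """Load <image> elements from XML text back into the database."""
    for node in ET.fromstring(xml_text).iter("image"):
        add_image(conn, node.get("file"), node.get("startDate"),
                  int(node.get("angle", "0")), node.get("description", ""))
```

With a layout like this, a query such as SELECT ... WHERE file = ? fetches one
row instead of loading the whole collection, and SQLite lets several processes
open the same file, using locks to serialize writes.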
| - the compressed index.xml option in the settings menu actually
| saves only an index for each image; did you try that?
|
| Given the history of the compressed option, no. If the compressed
| index.xml isn't simply a zip or gzip or bzip2 of the index.xml file
| (and it apparently isn't, given what you say here and what other
| people have reported), I'm not touching it with a ten foot pole.
| Anything that increases the number of code paths through the save code
| is asking for trouble.
OK, here is a very good reason for using this if you think that way:
*I* am using the compressed option.
Basically, it saves a less readable index.xml (indexes instead of the real
names). This index.xml loads approximately twice as fast as the full
index.xml.
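
As a rough illustration of "indexes vs. the real names", here is a Python
sketch of the general idea only; it is an assumption about the principle, not
the actual compressed file format, and the function names and sample values
are made up. Each distinct name is stored once in a shared table, and every
image then refers to names by their position in that table.

```python
def compress(images):
    """Replace each name with its position in a shared table of names.

    `images` maps a file name to the list of names attached to it.
    Returns (table, packed): table[i] is the name with id i, and
    packed maps each file name to a list of ids.
    """
    table = []   # id -> name, built up in order of first appearance
    ids = {}     # name -> id, for fast lookup while packing
    packed = {}
    for file, names in images.items():
        row = []
        for name in names:
            if name not in ids:
                ids[name] = len(table)
                table.append(name)
            row.append(ids[name])
        packed[file] = row
    return table, packed


def expand(table, packed):
    """Undo compress(): turn ids back into the real names."""
    return {file: [table[i] for i in row] for file, row in packed.items()}
```

The table is what makes the file hard to read by hand, and also what makes a
mixed-up id hard to repair: an id only means something together with the table
it was written against.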
|
| On Sunday 27 August 2006 20:34, Robert L Krawitz wrote:
| | I think we could further optimize the index.xml file by removing data
| | that either has obvious defaults or can otherwise be computed easily
| | without having to look at the actual image file.
| |
| | 1) What's the purpose of storing both a startDate and an endDate in
| | the index.xml file? Is this for videos (and if so, my index.xml
| | shows identical start and end dates for my videos)? Would it make
| | sense to store only the startDate unless the endDate differs?
| |
| | 2) All images have a "description", even though I rarely use it.
| | Would it make more sense to not insert the description unless it's
| | actually present?
| |
| | Also, would it make more sense for the description to be a child of
| | the image, rather than an attribute? That way it could be free
| | text.
| |
| | 3) The angle is always stored, even though for most people it's 0
| | (landscape format) for most images. Again, would it make more
| | sense to only store this if needed?
| |
| | 4) Finally, the label is usually (if not always) simply the basename
| | of the image. Would it be better to not actually store this and
| | simply find it when loading the file? It could be found
| | efficiently while parsing the folder -- simply skip beyond the last
| | separator and search for the final . in the filename.
| |
| | Some stats for my current index.xml:
| |
| |                 Size (bytes)   % vs. snapshot   % vs. SVN
| | Last snapshot:       8801509            100.0         N/A
| | Current SVN:         7108972             80.8       100.0
| | (1):                 6616631             75.2        93.1
| | (2):                 6372641             72.4        89.6
| | (3):                 6232615             70.8        87.7
| | (4):                 5953336             67.6        83.8
| |
| | We could save more by storing option values as their id's rather than
| | in actual text form. That would offer the potential of quite
| | substantial savings, but I'm not so sure that we should do that
| | because it's a lot riskier if something goes wrong -- if index numbers
| | get mixed up, it could be very hard to unscramble -- and because it
| | makes it harder for someone to examine the file. On the other hand, I
| | don't really see why we need to have the index numbers stored for each
| | value as opposed to simply building up the list of values as the file
| | is loaded (is it to preserve ordering in the attribute lists?).
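
Suggestions 1) to 4) above can be sketched in Python as follows. This is a
hedged illustration, not the real reader or writer: the "image" element name
and the function names are assumptions, while the attribute names (startDate,
endDate, description, angle, label) are the ones discussed in this thread. The
writer omits any attribute that equals its default, and the reader fills the
defaults back in, deriving the label from the file name.

```python
import posixpath
import xml.etree.ElementTree as ET


def default_label(file):
    """Basename of the file without its final extension."""
    return posixpath.splitext(posixpath.basename(file))[0]


def write_image(file, start_date, end_date=None, description="",
                angle=0, label=None):
    """Build an <image> element, leaving out attributes that are defaults."""
    attrs = {"file": file, "startDate": start_date}
    if end_date is not None and end_date != start_date:
        attrs["endDate"] = end_date          # 1) only if it differs
    if description:
        attrs["description"] = description   # 2) only if present
    if angle != 0:
        attrs["angle"] = str(angle)          # 3) only if rotated
    if label is not None and label != default_label(file):
        attrs["label"] = label               # 4) only if not the basename
    return ET.Element("image", attrs)


def read_image(node):
    """Read an <image> element, restoring the omitted defaults."""
    file = node.get("file")
    start_date = node.get("startDate")
    return {
        "file": file,
        "startDate": start_date,
        "endDate": node.get("endDate", start_date),
        "description": node.get("description", ""),
        "angle": int(node.get("angle", "0")),
        "label": node.get("label", default_label(file)),
    }
```

A script reading such a file has to apply the same defaults, which is the
cost of this approach mentioned at the top of this message.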
|
| --
| Having trouble finding a given image in your collection containing
| thousands of images?
|
| http://www.kphotoalbum.org might be the answer.