[Digikam-users] backup and data integrity

Arnd Baecker arnd.baecker at web.de
Wed Jan 16 20:19:35 GMT 2008


Hi,

one important issue with digital images is the
question of backup (e.g. CD/DVD or other optical media,
several separate hard-disks, off-site hard-disks, ...).

Another (maybe often overlooked?) aspect is whether
the data (both on the master disk and on the backups)
are still correct.
It may well happen that files just get corrupted
on the hard-disk.
(I recently had such an experience; fortunately,
an old backup on CD allowed me to recover
the few affected files.)

So my question is:
How do you ensure the correctness of your data?

What methods are usable, and could one maybe
integrate/provide some of the needed tools
inside digikam?

One approach might be to use a hash value (e.g. MD5):

- Digikam could compute a hash value for every
  image and store it inside of digikam's database.

  An additional tool could then periodically
  check the images for any changes (= corruption).

- Of course, if an image gets changed
  (e.g. by adding comments, ratings, tags or other meta-data),
  the hash needs to be recomputed by the photo management application.

  ((Another possibility is to compute the hash of the image
  data only, but I think that a hash of the full file is better.))

- Also, one might even think of checking the hash before
  editing an image to ensure that it did not get corrupted.

  ((And maybe for the paranoid: even after saving a file
  one could compare with the data in memory?))
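
To make the idea above a bit more concrete, here is a minimal sketch in
Python. It is not digikam code: the SQLite table `image_hashes` and the
function names are made up for illustration, and the hash covers the full
file (not only the image data). `record_hash` would be called whenever
the application itself changes a file; `find_changed` is the periodic check.

```python
import hashlib
import sqlite3
from pathlib import Path


def file_md5(path, chunk_size=1 << 20):
    """Return the MD5 hex digest of the whole file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def open_hash_db(db_path=":memory:"):
    """Open (or create) a small SQLite store; the schema is hypothetical."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS image_hashes ("
        "path TEXT PRIMARY KEY, md5 TEXT NOT NULL)"
    )
    return conn


def record_hash(conn, path):
    """Store or refresh the hash, e.g. after tags or comments were written."""
    conn.execute(
        "INSERT OR REPLACE INTO image_hashes (path, md5) VALUES (?, ?)",
        (str(path), file_md5(path)),
    )
    conn.commit()


def find_changed(conn):
    """Return stored paths that are missing or whose hash no longer matches."""
    changed = []
    rows = conn.execute("SELECT path, md5 FROM image_hashes ORDER BY path")
    for path, stored in rows.fetchall():
        if not Path(path).is_file() or file_md5(path) != stored:
            changed.append(path)
    return changed
```

Any file reported by `find_changed` was modified outside the application
(or has disappeared) and would be a candidate for restoring from backup.
Note that MD5 is only used here to detect accidental corruption, not
deliberate tampering.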

For backups one could add a file listing the hash values of
all the files. Alternatively, each image file could be
accompanied by a *.hash file.
Again with a (simple) tool these hash values could
be recomputed and compared.
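
A simple tool of this kind could look roughly like the following Python
sketch. The manifest name `MD5SUMS` and the function names are just
assumptions for illustration; each line holds the hash, two spaces and
the path relative to the backup root, similar to what `md5sum` prints.

```python
import hashlib
from pathlib import Path

MANIFEST_NAME = "MD5SUMS"  # hypothetical file name, not a digikam convention


def file_md5(path, chunk_size=1 << 20):
    """Return the MD5 hex digest of the whole file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_manifest(root):
    """Hash every file below root and write one manifest into root."""
    root = Path(root)
    lines = []
    for path in sorted(root.rglob("*")):
        # Skip directories and any file that is itself a manifest.
        if path.is_file() and path.name != MANIFEST_NAME:
            rel = path.relative_to(root).as_posix()
            lines.append(f"{file_md5(path)}  {rel}\n")
    (root / MANIFEST_NAME).write_text("".join(lines), encoding="utf-8")
    return len(lines)


def verify_manifest(root):
    """Recompute the hashes; return relative paths that are missing or differ."""
    root = Path(root)
    failures = []
    text = (root / MANIFEST_NAME).read_text(encoding="utf-8")
    for line in text.splitlines():
        expected, rel = line.split("  ", 1)
        target = root / rel
        if not target.is_file() or file_md5(target) != expected:
            failures.append(rel)
    return failures
```

One would run `write_manifest` once before burning or copying the backup
and `verify_manifest` later on the backup medium; an empty list means all
listed files still match. Files added after the manifest was written are
not detected by this sketch.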

While maybe not yet fully sophisticated, this might
already be better than blindly believing
that all files on the hard-disk are still ok ;-).

Are there any other important aspects that digikam
would need to cover to enable checks of data integrity?

Note that this is to some extent related to
- "Md5 Checksums to identify pictures"
     http://bugs.kde.org/show_bug.cgi?id=110066
- "Uniquely identifying each image in a collection of images"
     http://bugs.kde.org/show_bug.cgi?id=125736
- "backup on dvd (and maybe sync with dvd-ram?)"
     http://bugs.kde.org/show_bug.cgi?id=113715

Any comments, thoughts, suggestions are very welcome!

Best, Arnd


