[Digikam-users] backup and data integrity

Arnd Baecker arnd.baecker at web.de
Mon Jan 21 19:44:41 GMT 2008


On Thu, 17 Jan 2008, Arnd Baecker wrote:

[...]

> Would some checksum system, integrated into digikam, be useful,
> in view of ensuring data integrity for backups?
> I think it wouldn't be too difficult to implement something like
> this (I briefly discussed it with Marcel on IRC, and
> with digikam >=0.10 such additions to the database will be easy).
> Note that it might come with a bit of a speed penalty when
> images/metadata get changed; however, this could be made
> configurable.

So, to try this out rather than just talk about it, I
wrote two Python scripts which
A) Generate a recursive tree that contains,
   for each file below digikam's root (e.g. ~/Pictures),
   a corresponding *.hash file holding its md5sum

B) Check, for each file in the backup,
   whether its checksum matches.
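
For illustration, steps A and B could be sketched roughly as follows.
This is not the pair of scripts mentioned above, only a minimal
reconstruction of the idea. The function names (md5sum, write_hash_tree,
check_backup) and the "<name>.hash" naming are assumptions, and the hash
tree is assumed to live outside the picture root so that the hash files
are not hashed themselves.

```python
import hashlib
from pathlib import Path


def md5sum(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_hash_tree(root, hash_root):
    """A) Mirror root under hash_root with one <name>.hash file per file."""
    root, hash_root = Path(root), Path(hash_root)
    for src in sorted(root.rglob("*")):
        if src.is_file():
            rel = src.relative_to(root)
            target = hash_root / rel.parent / (rel.name + ".hash")
            target.parent.mkdir(parents=True, exist_ok=True)
            target.write_text(md5sum(src) + "\n")


def check_backup(backup_root, hash_root):
    """B) Return relative paths in the backup whose hash is missing or differs."""
    backup_root, hash_root = Path(backup_root), Path(hash_root)
    mismatches = []
    for f in sorted(backup_root.rglob("*")):
        if f.is_file():
            rel = f.relative_to(backup_root)
            hash_file = hash_root / rel.parent / (rel.name + ".hash")
            stored = hash_file.read_text().strip() if hash_file.is_file() else None
            if stored != md5sum(f):
                mismatches.append(rel)
    return mismatches
```

One would call write_hash_tree("~/Pictures" expanded to a full path,
some hash directory) on the original and then check_backup(backup
directory, the same hash directory); any path returned is a file whose
backup copy differs from the original or has no stored hash.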

Interestingly, in my case this already revealed
around 500 files which did not match.
(In this particular case it was essentially a user
error: I had changed the metadata (GPS info) of
those files without changing the file date.
Because of the way I used rsync, it did not copy these
files over, so the backup went out of sync.)

So without a hash comparison, I would never have noticed
the inconsistency!

Well, in my opinion we should get some tools for
checking data integrity into digikam itself ...

Any thoughts/comments/suggestions/... are welcome
to flesh out the ideas of what would be necessary/what makes sense/...!

Best, Arnd



