[Digikam-users] backup and data integrity
Gerry Patterson
thedeepvoice at gmail.com
Mon Jan 21 20:25:35 GMT 2008
On Jan 21, 2008 1:44 PM, Arnd Baecker <arnd.baecker at web.de> wrote:
> On Thu, 17 Jan 2008, Arnd Baecker wrote:
>
> [...]
>
> > Would some checksum system, integrated into digikam, be useful,
> > in view of ensuring data integrity for backups?
> > I think it wouldn't be too difficult to implement something like
> > this (I briefly discussed with Marcel on the IRC and
> > with digikam >=0.10 such additions to the database will be easy).
> > Note that it might come with a bit of a speed penalty when
> > images/metadata get changed; however, this could be made
> > configurable.
>
> So in order to not just talk about stuff, but to try it out, I
> set up two Python scripts which
> A) Generate a recursive tree which contains,
> for each file below digikam's root (e.g. ~/Pictures),
> a corresponding md5sum *.hash file
>
> B) Perform a check for each file in the backup
> if the checksum matches.
>
> Interestingly, in my case this already revealed
> around 500 files which did not match.
> (In this particular case it was essentially a user
> error, because I changed the metadata (GPS info) for
> those files, but without changing the file date.
> As I used rsync such that it would not copy over these
> files, the back-up went out of sync).
>
> So without a hash comparison, I would have never realized
> the inconsistency!
>
> Well, in my opinion we should get some tools for
> checking data integrity into digikam itself ...
>
> Any thoughts/comments/suggestions/... are welcome
> to flesh out the ideas of what would be necessary/what makes sense/...!
>
> Best, Arnd
>
Hello Arnd,
What options are you passing to rsync? If you give it the '-c' (--checksum)
option, rsync decides which files to skip by comparing checksums instead of
mod-time and size. This would at least keep your backup consistent with your
master. However, it would not catch the case you brought up earlier, where
the original gets corrupted and is then backed up.
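To illustrate the difference, here is a minimal Python sketch of the two comparison strategies. It is not rsync's actual implementation: `same_by_quick_check` only approximates the default size and mod-time test, `same_by_checksum` stands in for what '-c' does, and the file names and contents in `demo` are made up for the example.

```python
import hashlib
import os
import tempfile


def file_md5(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_by_quick_check(a, b):
    """Approximation of rsync's default test: equal size and mod-time."""
    sa, sb = os.stat(a), os.stat(b)
    return sa.st_size == sb.st_size and int(sa.st_mtime) == int(sb.st_mtime)


def same_by_checksum(a, b):
    """Approximation of 'rsync -c': compare file contents via a checksum."""
    return file_md5(a) == file_md5(b)


def demo():
    """Two files with equal size and mod-time but different content."""
    with tempfile.TemporaryDirectory() as tmp:
        master = os.path.join(tmp, "master.jpg")  # hypothetical names
        backup = os.path.join(tmp, "backup.jpg")
        with open(master, "wb") as fh:
            fh.write(b"GPS=48.1,11.5")  # metadata changed in the master
        with open(backup, "wb") as fh:
            fh.write(b"GPS=00.0,00.0")  # stale copy, same length
        # Force identical timestamps, as when the file date is not updated.
        os.utime(master, (1200000000, 1200000000))
        os.utime(backup, (1200000000, 1200000000))
        return same_by_quick_check(master, backup), same_by_checksum(master, backup)


if __name__ == "__main__":
    quick, checksum = demo()
    print("quick check says identical:", quick)
    print("checksum says identical:", checksum)
```

In this situation the quick check considers the files identical and would skip the copy, while the checksum comparison notices the difference. The price is that every file has to be read in full on both sides.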
As I think about this, it sounds like implementing an SCM (source code
management) system. Basically, you want to know whether a file has changed
on disk, with or, in your case, without intention. In theory, when you have
a new file you would 'check it in' to the picture repository. If you make
changes, you 'check in' the new version of the file. In your case a
'check-in' would be to create a checksum of the file. This leads me to
think about the "Versioned image" request that already exists for digikam.
Perhaps a single solution would handle both cases?
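A rough Python sketch of what such a checksum 'check-in' could look like, assuming a single JSON manifest rather than one *.hash file per image. The names `check_in` and `verify` and the manifest layout are hypothetical, chosen only for illustration; nothing here is existing digikam functionality.

```python
import hashlib
import json
import os


def file_md5(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_in(root, manifest_path):
    """'Check in' every file below root by recording its checksum."""
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            manifest[os.path.relpath(full, root)] = file_md5(full)
    with open(manifest_path, "w") as fh:
        json.dump(manifest, fh, indent=2, sort_keys=True)
    return manifest


def verify(root, manifest_path):
    """Return the relative paths that are missing or no longer match."""
    with open(manifest_path) as fh:
        manifest = json.load(fh)
    changed = []
    for rel, recorded in sorted(manifest.items()):
        full = os.path.join(root, rel)
        if not os.path.isfile(full) or file_md5(full) != recorded:
            changed.append(rel)
    return changed
```

The manifest should live outside the picture tree so that it does not end up checksumming itself on the next check-in. Running `verify` against the master tree would flag unintended changes since the last check-in, and running it against the backup tree would flag files that have gone out of sync.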
Best Regards,
Gerry