[Digikam-users] backup and data integrity

Gerhard Kulzer gerhardkgmx at gmail.com
Mon Jan 21 22:17:05 GMT 2008


On Monday 21 January 2008, Arnd Baecker wrote:
> On Thu, 17 Jan 2008, Arnd Baecker wrote:
>
> [...]
>
> > Would some checksum system, integrated into digikam, be useful,
> > in view of ensuring data integrity for backups?
> > I think it wouldn't be too difficult to implement something like
> > this (I briefly discussed with Marcel on the IRC and
> > with digikam >=0.10 such additions to the database will be easy).
> > Note that it might come with a bit of a speed penalty when
> > images/metadata get changed; however, this could be made
> > configurable.
>
> So in order to not just talk about stuff, but to try it out, I
> set up two python scripts which
> A) Generate a recursive tree which contains
>    for each file below digikam's root (e.g. ~/Pictures)
>    a corresponding md5sum *.hash file
>
> B) Perform a check for each file in the backup
>    if the checksum matches.
>
> Interestingly, in my case this already revealed
> around 500 files which did not match.
> (In this particular case it was essentially a user
> error, because I changed the metadata (GPS info) for
> those files, but without changing the file date.
> As I used rsync such that it would not copy over these
> files, the back-up went out of sync).
>
> So without a hash comparison, I would have never realized
> the inconsistency!
>
> Well, in my opinion we should get some tools to
> enable the check of data-integrity into digikam itself ...
>
> Any thoughts/comments/suggestions/... are welcome
> to flesh out the ideas of what would be necessary/what makes sense/...!
>
> Best, Arnd

Arnd, can you send me the script? I'd like to try too.
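
In the meantime, here is a rough sketch of how the two steps (A and B)
described above might look in Python. This is not Arnd's script. The
function names and the layout of the hash tree (one <name>.hash file per
image, kept in a separate directory outside the picture root) are
assumptions made for illustration only.

```python
import hashlib
from pathlib import Path


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_hash_tree(root, hash_root):
    """Step A: mirror `root` under `hash_root`, one <name>.hash per file.

    Assumption: `hash_root` lies outside `root`, so the hash files are
    not hashed themselves on the next run.
    """
    root, hash_root = Path(root), Path(hash_root)
    for src in sorted(root.rglob("*")):
        if src.is_file():
            rel = src.relative_to(root)
            target = hash_root / rel.parent / (rel.name + ".hash")
            target.parent.mkdir(parents=True, exist_ok=True)
            target.write_text(md5_of(src) + "\n")


def check_backup(backup_root, hash_root):
    """Step B: list backup files whose checksum is missing or differs."""
    backup_root, hash_root = Path(backup_root), Path(hash_root)
    mismatches = []
    for f in sorted(backup_root.rglob("*")):
        if f.is_file():
            rel = f.relative_to(backup_root)
            hash_file = hash_root / rel.parent / (rel.name + ".hash")
            if (not hash_file.is_file()
                    or hash_file.read_text().strip() != md5_of(f)):
                mismatches.append(str(rel))
    return mismatches
```

Running write_hash_tree() on ~/Pictures and then check_backup() on the
backup copy would return the relative paths of files like the roughly 500
that went out of sync in Arnd's case.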

I just read that Strigi does exactly what we want: it compares files using
sha1. Maybe sha1 is faster than md5?
Strigi creates a sha1 of every file and stores it in its DB. It then checks
for file date changes and, if the date has changed, runs sha1 to see whether
the file has really changed before grepping it thoroughly.
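
To make the idea concrete, here is a minimal sketch of the "check the date
first, hash only if it changed" logic as I understand it from that
description. It is not Strigi's code: the function names are made up, and
a plain dict stands in for the database.

```python
import hashlib
import os


def sha1_of(path, chunk_size=1 << 20):
    """Return the hex SHA-1 digest of a file, reading it in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def needs_reindex(path, index):
    """Return True if the file content changed since it was last seen.

    `index` is a hypothetical stand-in for the DB: it maps a path to a
    (modification time in ns, sha1 hex digest) tuple and is updated
    in place.
    """
    mtime = os.stat(path).st_mtime_ns
    entry = index.get(path)
    if entry is not None and entry[0] == mtime:
        return False  # date unchanged: skip the expensive hash
    new_hash = sha1_of(path)
    changed = entry is None or entry[1] != new_hash
    index[path] = (mtime, new_hash)
    return changed
```

Note that a check like this would not have caught Arnd's case: there the
content changed while the file date stayed the same, so the hash would
never have been recomputed. For a backup check, the hash has to be compared
regardless of the date.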

Gerhard

-- 
><((((º> ¸.·´¯`·... ><((((º> ¸.·´¯`·...¸ ><((((º>
http://www.gerhard.fr