<br><br><div class="gmail_quote">On Jan 21, 2008 1:44 PM, Arnd Baecker <<a href="mailto:arnd.baecker@web.de">arnd.baecker@web.de</a>> wrote:<br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
On Thu, 17 Jan 2008, Arnd Baecker wrote:<br><br>[...]<br><div class="Ih2E3d"><br>> Would some checksum system, integrated into digikam, be useful<br>> for ensuring data integrity of backups?<br>> I think it wouldn't be too difficult to implement something like<br>> this (I briefly discussed this with Marcel on IRC, and<br>> with digikam >=0.10 such additions to the database will be easy).<br>> Note that it might come with a bit of a speed penalty when<br>> images/metadata get changed; however, this could be made
<br>> configurable.<br><br></div>So, in order not just to talk about it but to try it out, I<br>set up two Python scripts which:<br>A) Generate a recursive tree which contains,<br> for each file below digikam's root (e.g. ~/Pictures),<br> a corresponding md5sum *.hash file.<br><br>B) Check, for each file in the backup,<br> whether its checksum matches.<br><br>Interestingly, in my case this already revealed<br>around 500 files which did not match.
<br>(In this particular case it was essentially a user<br>error: I changed the metadata (GPS info) for<br>those files, but without changing the file date.<br>As I invoked rsync such that it would not copy over these<br>files, the backup went out of sync.)<br><br>So without a hash comparison, I would never have realized<br>the inconsistency!<br><br>Well, in my opinion we should build tools for<br>checking data integrity into digikam itself ...
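<br><br>A minimal sketch of what two such scripts might look like (the function names, the mirror-tree layout, and the *.hash suffix are assumptions for illustration; the actual scripts were not posted):<br><br>

```python
# Sketch of the two scripts described above (hypothetical names and
# layout).  Part A walks the picture root and writes an md5 *.hash file
# for every image into a mirror tree; part B re-hashes every file in
# the backup and reports the ones whose digest no longer matches.

import hashlib
import os

def md5sum(path, blocksize=1 << 20):
    """Return the hex md5 digest of a file, read in chunks."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(blocksize), b""):
            h.update(block)
    return h.hexdigest()

def generate_hashes(root, hashdir):
    """(A) Mirror `root` under `hashdir`, one <name>.hash file per file."""
    for dirpath, _dirnames, filenames in os.walk(root):
        rel = os.path.relpath(dirpath, root)
        target = os.path.join(hashdir, rel)
        os.makedirs(target, exist_ok=True)
        for name in filenames:
            digest = md5sum(os.path.join(dirpath, name))
            with open(os.path.join(target, name + ".hash"), "w") as f:
                f.write(digest + "\n")

def check_backup(backup, hashdir):
    """(B) Return backup files whose md5 does not match the stored hash."""
    mismatches = []
    for dirpath, _dirnames, filenames in os.walk(backup):
        rel = os.path.relpath(dirpath, backup)
        for name in filenames:
            hashfile = os.path.join(hashdir, rel, name + ".hash")
            if not os.path.exists(hashfile):
                continue  # no stored hash for this file
            with open(hashfile) as f:
                stored = f.read().strip()
            if md5sum(os.path.join(dirpath, name)) != stored:
                mismatches.append(os.path.join(rel, name))
    return mismatches
```

<br>Running `generate_hashes` against the master tree and `check_backup` against the rsync target would flag exactly the kind of silently-diverged files described above.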
<br><br>Any thoughts/comments/suggestions/... are welcome<br>to flesh out the ideas of what would be necessary/what makes sense/...!<br><div><div></div><div class="Wj3C7c"><br>Best, Arnd<br></div></div></blockquote><div><br>
Hello Arnd,<br><br>What options are you passing to rsync? If you give it the '-c' option, rsync decides whether to skip a file based on its checksum instead of its mod-time and size. This would at least keep your backup consistent with your master. However, it would not avoid the original-corrupted-then-backed-up issue you brought up earlier.
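<br><br>To make the failure mode concrete, a small sketch (file contents and names are made up) contrasting the default size+mtime quick check with a content checksum:<br><br>

```python
# Hypothetical demonstration of why a size+mtime "quick check" (rsync's
# default skip test) can miss an in-place metadata edit, while comparing
# content checksums (roughly what 'rsync -c' does) catches it.

import hashlib
import os

def quick_check_equal(a, b):
    """Mimic rsync's default skip test: same size and same mtime."""
    sa, sb = os.stat(a), os.stat(b)
    return sa.st_size == sb.st_size and int(sa.st_mtime) == int(sb.st_mtime)

def checksum_equal(a, b):
    """Compare two files by the md5 digest of their contents."""
    def md5(path):
        with open(path, "rb") as f:
            return hashlib.md5(f.read()).hexdigest()
    return md5(a) == md5(b)
```

<br>If GPS tags are rewritten in place with the same byte length and the mtime restored afterwards (which is effectively what happened above), `quick_check_equal` still reports the files as identical while `checksum_equal` does not.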
<br><br>As I think about this, it sounds like implementing an SCM. Basically, you want to know whether a file has changed on disk with or, in your case, without intention. In theory, when you have a new file you would 'check it in' to the picture repository. If you make changes, you 'check in' the new version of the file. In your case, a "check-in" would mean creating a checksum of the file. This leads me to thinking about the "Versioned image" request that is already filed against digikam. Perhaps a single solution could handle both cases?
<br><br>Best Regards,<br><br> Gerry<br></div></div>