[Digikam-users] backup and data integrity
Gerhard Kulzer
gerhardkgmx at gmail.com
Wed Jan 23 16:28:31 GMT 2008
On Tuesday 22 January 2008, Arnd Baecker wrote:
> On Mon, 21 Jan 2008, Gerhard Kulzer wrote:
>
> [...]
>
> > Arnd, can you send me the script? I'd like to try too.
>
> Done (off-list, it is really not meant for general consumption ... ;-)
>
> > I just read that strigi is exactly doing what we want, comparing files
> > with sha1. Maybe sha1 is faster than md5?
>
> No idea. Maybe we should do a speed test at some point ;-)
>
> > Strigi creates a sha1 of every file and stores it in its DB. Then it
> > checks for file date changes and, if there are any, runs sha1 to see
> > whether the file has really changed before grepping it thoroughly.
>
> Looking at
> http://strigi.sourceforge.net/?q=features
> it does not seem to support images?
>
> I don't yet fully see how strigi will finally fit into
> "the" solution, this is definitely something to look at in more detail!
> Thanks for the pointer.
>
> Best, Arnd
Hi Arnd,
I'll try to summarize what we said last night on IRC, just as a public memo.
The aim is to
a) prevent corrupt images from being saved to disk and to
b) detect existing corrupt files on disk
(to prevent overwriting of potentially good backups)
Strategies like DIF and HARD will not be available in the consumer market for
another couple of years, but given the increase in size, speed and
complexity of systems, consumer systems will implement some kind of ECC
(horizon ~ 3y).
Protection at the file system level, as provided by zfs and btrfs, is good but
insufficient, as it protects the disk only and not the transmission chain
appl - OS - I/O controller - fs
So we have to do it 'by hand' (meaning in digikam).
For problem a), while saving a file after modification:
1. keep it in memory
2. save it to disk
3. flush to disk to clear the cache
(3a. optional: make sure all disk-internal buffers are cleared by reading
other data the size of the disk buffer)
5. run a CRC checksum on the file on disk and on the file in memory, and
compare the two
5a. alternative: store the checksum in the metadata beforehand and save it
with the file.
6. on mismatch, rewrite the file and repeat the procedure
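
Steps 1 to 6 could be sketched roughly as follows in Python. This is only an
illustration under assumptions, not digikam code: the names save_verified and
crc32_of_file and the retry limit are made up for the example. Note that the
read-back may still be served from the OS page cache rather than from the
platters, which is exactly the concern behind optional step 3a; the sketch
does not address that.

```python
import os
import zlib


def crc32_of_file(path, chunk_size=1 << 20):
    """Return the CRC-32 of a file, read back in chunks."""
    crc = 0
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            crc = zlib.crc32(chunk, crc)
    return crc & 0xFFFFFFFF


def save_verified(path, data, max_attempts=3):
    """Write data (kept in memory) to path and verify it after flushing.

    Hypothetical helper: returns the number of attempts that were needed.
    """
    expected = zlib.crc32(data) & 0xFFFFFFFF   # checksum of the in-memory copy
    for attempt in range(1, max_attempts + 1):
        with open(path, "wb") as f:            # step 2: save to disk
            f.write(data)
            f.flush()                          # step 3: flush Python's buffer
            os.fsync(f.fileno())               # ... and ask the OS to flush too
        if crc32_of_file(path) == expected:    # step 5: compare checksums
            return attempt
        # step 6: mismatch, so loop and rewrite the file
    raise IOError("could not verify %s after %d attempts" % (path, max_attempts))
```

A caller would keep the image bytes in memory, call save_verified(path, data),
and only release the in-memory copy once the call has returned.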
For problem b):
7. if 5a was used, a simple scrubbing scan can be launched, either manually
or scheduled at frequency X
7a. try to open the files and look for errors (this method is not reliable,
though: I have images that show only the upper part, are corrupt, and produce
no error message. The more severe errors can be found this way, however)
8. generate a user alert so that one can manually compare the backup and the
original.
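
A scrubbing scan along the lines of steps 7 and 8 might look like the sketch
below. It assumes the checksums are kept in a separate manifest (a dict
mapping path to sha1 digest) rather than in the image metadata as proposed in
5a, purely to keep the example self-contained; build_manifest and scrub are
hypothetical names.

```python
import hashlib
import os


def file_digest(path, chunk_size=1 << 20):
    """Return the sha1 hex digest of a file, read in chunks."""
    h = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def build_manifest(root):
    """Record a digest for every file below root (done once, at save time)."""
    manifest = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            manifest[path] = file_digest(path)
    return manifest


def scrub(manifest):
    """Return the sorted paths whose current digest differs or that are unreadable."""
    suspects = []
    for path, recorded in sorted(manifest.items()):
        try:
            ok = file_digest(path) == recorded
        except OSError:          # missing or unreadable file counts as suspect
            ok = False
        if not ok:
            suspects.append(path)
    return suspects
```

The list returned by scrub() is what the user alert in step 8 would be based
on; the scan only flags files and leaves the comparison with the backup to
the user.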
This method may seem tedious, but it has the advantage of being independent of
OS and file system; it works on nfs as well.
Gerhard
--
><((((º> ¸.·´¯`·... ><((((º> ¸.·´¯`·...¸ ><((((º>
http://www.gerhard.fr