[Digikam-users] Images.uniquehash calculation

Wed Jun 19 01:23:51 BST 2013

On Tuesday 18 June 2013 21:00:44 Marcel Wiesweg wrote:
> 
> > For the second data block (the last 100 kB), as there is a seek just
> > before, that could make a difference if the file is <100kB:
> > - in C++, the file's probably in an error state, so no data will be
> > read, so the second data block will not be fed to the hash routine.
> > - in Python, the data block /is/ fed, but will probably contain rubbish
> > if the file is <100kB...
> 
> Interesting observation. Anyway, if this was a bug, we won't change it, to
> keep the hash stable.

The C++ version seems to me to do the correct thing, in that it doesn't 
feed data to the hash generator if the file doesn't provide the data...

What I meant to show was that the two routines are /not/ identical, in that 
they can feed different data to the hash generator, and in that case 
/should/ end up with a different hash value.
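
To make the difference concrete, here is a minimal sketch of the two behaviours. It is not the digiKam code or the Python script from this thread: the function names, the use of MD5 and the exact block size are my assumptions, and the second routine clamps the seek to the start of the file as a stand-in for "the block is fed anyway".

```python
import hashlib
import io

BLOCK = 100 * 1024  # "100 kB" from the thread; the exact byte count is an assumption


def hash_skip_short_tail(stream, size):
    """Hypothetical routine: feed the tail block only if the file has one."""
    md5 = hashlib.md5()  # MD5 is an assumption, used only for illustration
    stream.seek(0)
    md5.update(stream.read(BLOCK))
    if size >= BLOCK:
        # The seek is valid, so the last BLOCK bytes exist and are fed.
        stream.seek(size - BLOCK)
        md5.update(stream.read(BLOCK))
    return md5.hexdigest()


def hash_always_tail(stream, size):
    """Hypothetical routine: always feed a second block, even for short files."""
    md5 = hashlib.md5()
    stream.seek(0)
    md5.update(stream.read(BLOCK))
    # Clamp the seek to offset 0, so a short file is fed a second time.
    stream.seek(max(0, size - BLOCK))
    md5.update(stream.read(BLOCK))
    return md5.hexdigest()


small = bytes(range(256)) * 4      # 1 KiB, shorter than BLOCK
large = bytes(range(256)) * 1024   # 256 KiB, longer than BLOCK

for name, data in (("small", small), ("large", large)):
    a = hash_skip_short_tail(io.BytesIO(data), len(data))
    b = hash_always_tail(io.BytesIO(data), len(data))
    print(name, "same" if a == b else "different")
```

For files of at least one block the two routines see the same bytes and agree; for the short file the second routine feeds extra data, so the digests diverge.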

Remco

P.S. There might be a situation where the hash isn't stable: if the data 
buffer isn't initialised, and not completely filled by the file reads, the end 
of the buffer could differ between two calls on the same file, and thus the 
hash value could differ (as the full buffer is sent to the generator).
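
A small sketch of that situation, under stated assumptions: Python always initialises its buffers, so the "leftover" bytes are simulated here by pre-filling the buffer with different values. The function names and the use of MD5 are mine, for illustration only.

```python
import hashlib
import io


def hash_full_buffer(stream, buf):
    """Hypothetical routine: hash the whole buffer, whatever the read filled."""
    stream.readinto(buf)  # may fill only the first part of buf
    return hashlib.md5(buf).hexdigest()  # stale bytes at the end are hashed too


def hash_bytes_read(stream, buf):
    """Hypothetical routine: hash only the bytes the read actually delivered."""
    n = stream.readinto(buf)
    return hashlib.md5(memoryview(buf)[:n]).hexdigest()


data = b"short file"               # shorter than the 64-byte buffer below
buf_a = bytearray(64)              # zero-filled buffer
buf_b = bytearray(b"\xff" * 64)    # stands in for whatever memory held before

unstable = (hash_full_buffer(io.BytesIO(data), buf_a),
            hash_full_buffer(io.BytesIO(data), buf_b))
stable = (hash_bytes_read(io.BytesIO(data), buf_a),
          hash_bytes_read(io.BytesIO(data), buf_b))

print("full buffer :", "same" if unstable[0] == unstable[1] else "different")
print("bytes read  :", "same" if stable[0] == stable[1] else "different")
```

Hashing only the bytes that were read gives the same digest for the same file regardless of what the buffer held beforehand; hashing the full buffer does not.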