Re: Updated patch for Detecting Duplicates in F-Spot

On 9/9/05, Gabriel Burt <gabriel burt gmail com> wrote:
> On Wed, 2005-09-07 at 20:56 +0200, Mattias Holmlund wrote:
> > One problem is if exif-editing is implemented in f-spot. Take for
> > example the Edit-Time function. Upon import, the user is given the
> > opportunity to correct the timestamp of the image. If we decide to
> > write the new timestamp to the exif-information, then both the
> > Exif-information and the md5sum of the image is changed before it is
> > stored in the archive.
> I don't see that as being a problem - when the file changes (due to EXIF
> changes, image rotation, etc) then it should just trigger a recompute of
> the MD5.  Am I missing something?
> If you're talking about catching duplicates when importing, then do the
> duplicate-detection with the MD5 taken from before any import-related
> changes take place. After you've seen it's not a dupe, you can change it
> as needed (eg update the date/time or something) and recompute the MD5,
> storing that one. That does mean two MD5 computations for each image,
> but we're talking about rolling two operations (import and batch modify)
> into one, so I think it's fair, and sane. 

I think it is perfectly OK to store two MD5 sums for each image: one
computed when the image was first imported (the original) and one
computed on the current contents of the image. What I was pointing out
was simply that storing a single MD5 sum calculated on the current
image file is not enough if you want to take exif-editing into account
and still detect duplicates. Storing both the original MD5 sum and the
current one, and comparing against both of them upon import, probably
solves the relevant part of the problem. This solution does not detect
duplicates if the user tries to import two image files that originally
came from the same picture but has changed the exif information for
one of them in another application. But that case is probably not
worth taking into account.
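
To make the two-sum idea concrete, here is a rough sketch in Python
rather than in F-Spot's own code. Everything in it (PhotoStore,
import_photo, the edit callback) is a hypothetical name made up for
illustration and not part of any patch; a dict in memory stands in for
the real database.

```python
import hashlib


def md5_of(data: bytes) -> str:
    """Hex MD5 digest of the raw file contents."""
    return hashlib.md5(data).hexdigest()


class PhotoStore:
    """Hypothetical in-memory stand-in for the photo database."""

    def __init__(self):
        # photo id -> [original_md5, current_md5]
        self.records = {}
        self._next_id = 1

    def find_duplicate(self, data: bytes):
        """Return the id of a stored photo whose original OR current
        MD5 matches the incoming file, or None if there is no match."""
        digest = md5_of(data)
        for photo_id, (original, current) in self.records.items():
            if digest in (original, current):
                return photo_id
        return None

    def import_photo(self, data: bytes, edit=None):
        """Import a file; returns (photo_id, was_newly_added).

        The duplicate check uses the MD5 taken before any
        import-time change. 'edit' is an optional callable standing
        in for something like the Edit-Time function.
        """
        duplicate = self.find_duplicate(data)
        if duplicate is not None:
            return duplicate, False

        original = md5_of(data)
        if edit is not None:
            data = edit(data)  # e.g. rewrite the exif timestamp

        photo_id = self._next_id
        self._next_id += 1
        # Second MD5 computation, on the contents as stored.
        self.records[photo_id] = [original, md5_of(data)]
        return photo_id, True
```

With this, importing the untouched camera file again matches the
original sum, and importing a copy of the archived (edited) file
matches the current sum. A file whose exif data was changed in another
application matches neither, which is the case I would leave alone.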


