Re: Updated patch for Detecting Duplicates in F-Spot


On Mon, 2005-09-05 at 11:02 +0800, Indulis Bernsteins wrote:
> Alvaro, 
> I think what you're doing is very important and useful - can I suggest
> an idea I've been thinking about for a while to extend what you've
> been doing?  I'm not suggesting you code this personally :-) 

:) It sounds like a big job reading your email.

> What about also using the EXIF data, as well as the filename, date
> modified, and other "metadata" (implicit or explicit)? 

The problem with using the EXIF data is that some images may not have
any EXIF data at all.

> If you have 2 images, you can easily look at the EXIF information
> common between the 2.  If the camera body serial number is the same,
> and the date/time of shooting is the same, and the EXIF stored image
> name is the same, you don't have to check the MD5 sum to know that the
> 2 are at least derived from the same image.  You can then just check
> file size.  If they are the same, then you've got a match. 

Currently F-Spot code follows this approach:

But I think that the MD5 signature is very useful because you can be
sure the two files are really the same without error.
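
As a rough illustration of that kind of check (a minimal Python sketch,
not the F-Spot code; the function names md5_of_file and same_content are
made up for this example), the file size can be compared first as a cheap
filter, and the MD5 digest only computed when the sizes match:

```python
import hashlib
import os


def md5_of_file(path, chunk_size=65536):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large photos are not loaded whole.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_content(path_a, path_b):
    """Cheap size comparison first, then compare MD5 digests."""
    if os.path.getsize(path_a) != os.path.getsize(path_b):
        return False
    return md5_of_file(path_a) == md5_of_file(path_b)
```

Note that this only detects byte-identical files; a resized or edited
copy has a different digest.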

> When you apply for a passport in Australia, each piece of ID you show
> has "points" associated with it, when you get to the right # of
> points, you can get a passport.  A similar thing could be done for
> images and the user can select the metadata to use (inc explicit
> metadata like info in f-spot's database and in the EXIF, as well as
> "implicit" metadata like filesize and filename), and the weightings
> applied to each factor to see that one image "qualifies" to be a copy
> (or derivative) of another image. 
> I've also been thinking that it'd be possible to implement an approach
> where the images are recognised as not just being the same, but also
> derived from the same image.  This metadata is currently constructed
> when f-spot creates a new image, so there is an "original" and
> copies. 

Yes, I think F-Spot already works this way when you modify images: it
stores the relationship between the original and its derivatives.
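
The "points" idea quoted above could be sketched roughly like this in
Python. Everything here is an assumption made for illustration: the field
names, the weights and the threshold are invented, and the metadata is
passed in as plain dictionaries rather than read from EXIF or from
F-Spot's database:

```python
# Hypothetical weights: points awarded when a field matches in both images.
WEIGHTS = {
    "camera_serial": 40,
    "shot_time": 30,
    "exif_image_name": 20,
    "file_size": 10,
}
# Hypothetical number of points needed to "qualify" as a copy or derivative.
THRESHOLD = 70


def match_points(meta_a, meta_b, weights=WEIGHTS):
    """Sum the weights of all fields present and equal in both images."""
    points = 0
    for field, value in weights.items():
        a = meta_a.get(field)
        b = meta_b.get(field)
        # A missing field (e.g. an image without EXIF) earns no points.
        if a is not None and a == b:
            points += value
    return points


def probably_related(meta_a, meta_b, threshold=THRESHOLD, weights=WEIGHTS):
    """True when the matching fields add up to at least the threshold."""
    return match_points(meta_a, meta_b, weights) >= threshold


original = {"camera_serial": "X1", "shot_time": "2005:09:01 10:00:00",
            "exif_image_name": "IMG_0001", "file_size": 2000000}
resized = dict(original, file_size=150000)
print(match_points(original, resized))      # 90
print(probably_related(original, resized))  # True
```

The user-selectable weightings from the proposal would correspond to
passing a different weights dictionary and threshold.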

What I would really love is to extract some metadata from the content of
the images themselves, using some kind of image pattern detection. For
example, finding the images with people in them.

> The long term aim would be to be able to use the classification of
> related images to allow the user to set a policy for their images.
>  For example, "I want the original kept in this directory, and a copy
> in this directory.  I'd also like a small version (640x480 or as close
> as you can get to this) in this directory, kept for 12 months only.
> I'd also like 2 copies on off-line storage (CDs).  Then I'd like a
> medium size copy uploaded to Flickr."  Then, f-spot can use its
> knowledge about related images to make sure that all of the images
> have had a second copy made, and have been shrunk down. If not, this
> can happen in the background.  And if you want to change your policy,
> for example, to have 3 copies, or to store your photos in
> directories sorted by the predominant colour, f-spot automagically
> does this for you without you having to manually make the changes.
>  Duplicate photos would be simply deleted (according to your policy) 

I am thinking of a less technical user who doesn't have to configure
the program with policies at all. I understand what you are proposing
and it is really cool, but it is for advanced users. A normal user will
want, for example, to export photos to Flickr (as is already
implemented) without being asked any questions.

> I think there is a real advantage to losing the "filesystem" oriented
> approach to managing images, and moving to a metadata-based one.
> f-spot could become not just an image viewer, but also a darn good
> image manager! 

Yes! But we have to be very careful and not introduce complex concepts
or configurations that would make F-Spot a solution for Professional


> Cheers, 
> Indulis
