Re: [Evolution] Deleting duplicate emails



run md5sum on the mail message body and store the resulting string in
a file then compare each message against this list in the file, if the
md5sums of the message body are the same then the message is
guaranteed to be the same.

Nope.

If the md5sum hashes are different, the messages are guaranteed to be
different. If the hashes are the same, there is still a slight
probability that the messages are *NOT* the same.

With a hash value of limited length, you cannot guarantee that longer
data chunks always map to distinct hashes.
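
To put a number on it: an MD5 digest is 128 bits, so there are only
2^128 possible values, while there are far more possible message
bodies than that. A matching hash is therefore best treated as "possibly
the same", to be confirmed by comparing the actual bytes. A minimal
sketch of that idea, assuming the message bodies are already available
as byte strings (the function name is made up for illustration):

```python
import hashlib
from collections import defaultdict


def find_duplicate_bodies(bodies):
    """Return (first_index, duplicate_index) pairs for identical bodies.

    The MD5 digest is only used to narrow down candidates. A body is
    reported as a duplicate only after a byte-for-byte comparison.
    """
    seen = defaultdict(list)  # digest -> indices of distinct bodies
    duplicates = []
    for i, body in enumerate(bodies):
        digest = hashlib.md5(body).hexdigest()
        for j in seen[digest]:
            if bodies[j] == body:  # confirm, do not trust the hash alone
                duplicates.append((j, i))
                break
        else:
            # Digest not seen before, or same digest with different
            # content (a collision): remember it as a distinct body.
            seen[digest].append(i)
    return duplicates
```

Different hashes skip the comparison entirely, so the extra check only
costs anything for the candidates that actually match.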


In some folder, for some reasons I have duplicate mails (same mail, two
or three times).

Vincent,

I have posted a small hack (shell script using formail) to delete
duplicate messages based on the Message-Id: header. Search the archive
for it and read my notes carefully.
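
For readers who cannot find it: the following is NOT that script, just
a rough sketch of the same idea using Python's standard mailbox module
instead of formail. It assumes an mbox-format folder. The function names
and the choice to write the result to a new file are my own assumptions
for illustration.

```python
import mailbox


def unique_by_message_id(messages):
    """Keep the first message for each Message-Id, drop later copies.

    Messages without a Message-Id header are always kept, since there
    is nothing to compare them by.
    """
    seen = set()
    kept = []
    for msg in messages:
        msg_id = msg.get("Message-Id")  # header lookup is case-insensitive
        msg_id = msg_id.strip() if msg_id else None
        if msg_id:
            if msg_id in seen:
                continue  # duplicate of an earlier message
            seen.add(msg_id)
        kept.append(msg)
    return kept


def dedupe_mbox(src_path, dst_path):
    """Copy src_path to dst_path without duplicate Message-Ids."""
    src = mailbox.mbox(src_path)
    dst = mailbox.mbox(dst_path)
    try:
        for msg in unique_by_message_id(src):
            dst.add(msg)
        dst.flush()
    finally:
        src.close()
        dst.close()
```

Writing to a separate file leaves the original folder untouched, so
nothing is lost if the result is not what you expected. Work on a copy
of the folder, with Evolution closed.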

Since I got some feedback, and since it is currently not wise to run it
more than once [1], I had already planned to rewrite it and post it
again. Silly me even sort of announced it without having the time to code.

This seems like a good opportunity to actually rewrite it and release
it...

...guenther


-- 
char *t="\10pse\0r\0dtu\0  ghno\x4e\xc8\x79\xf4\xab\x51\x8a\x10\xf4\xf4\xc4";
main(){ char h,m=h=*t++,*x=t+2*h,c,i,l=*x,s=0; for (i=0;i<l;i++){ i%8? c<<=1:
(c=*++x); c&128 && (s+=h); if (!(h>>=1)||!t[s+h]){ putchar(t[s]);h=m;s=0; }}}



