[Evolution] Spam Filter




I'm using both spambouncer (1.5) and spamassassin (2.40) via procmail,
which serves as the local delivery agent. Both are set up to just tag
the messages.

I have filters set up in evolution to move the spam tagged messages into
folders, to mark them as important, to mark them as read, and to cease
further processing. Right now, I just act on spambouncer tags alone,
while I compare the two.
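
For anyone who wants to see what "acting on the tags" boils down to,
here is a rough sketch in Python. It is only an illustration, not what
evolution does internally. The header names X-SBClass and X-Spam-Flag
are the ones mentioned further down in this message; the function name
spam_tags is made up, and treating a missing X-SBClass header as "not
tagged" is my assumption.

```python
from email import message_from_string


def spam_tags(raw_message):
    """Return the names of the taggers that flagged this message.

    Assumption: spambouncer marks good mail with "X-SBClass: OK" and
    spamassassin marks spam with "X-Spam-Flag: YES".
    """
    msg = message_from_string(raw_message)
    tags = []

    sb_class = msg.get("X-SBClass")
    # A missing header is treated as "not tagged" (an assumption).
    if sb_class is not None and sb_class.strip().upper() != "OK":
        tags.append("spambouncer")

    sa_flag = msg.get("X-Spam-Flag") or ""
    if sa_flag.strip().upper() == "YES":
        tags.append("spamassassin")

    return tags
```

Acting on spambouncer alone, as I do right now, would mean filing the
message as spam only when "spambouncer" is in the returned list.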

If you want to do both, like me, you have to do spamassassin first in
your procmail file, then after spamassassin adds its tags, let
spambouncer do its thing. Use terse reports in the headers, don't defang
the MIME, and don't let spamassassin modify the subject, or spambouncer
won't work right.

http://www.spambouncer.org

http://www.spamassassin.taint.org

http://www.procmail.org

And, of course, you'll need perl for spamassassin. Spambouncer is purely
a set of procmail-based scripts.

And again, of course, you may want to install Vipul's Razor and the DCC
stuff, as they are pretty efficient now at spotting spam.

I've added some updates to spambouncer that make it better at catching
the base64 stuff, better at spotting letters that say they are from
whitelisted individuals or yourself, but really aren't, and better at
finding out if the sender is a free email account. 

Personally, it's almost a tie between the two; spambouncer catches a
somewhat higher percentage when spamassassin has a threshold of 5. I
haven't had the time to crank down the threshold for spamassassin until
it catches about the same percentage as spambouncer; my guess is that
would result in a good number of false positives, which would be odious.
Right now the false positives are fairly low, maybe 2 or 3 a week, out of
maybe 200 or so junk emails spotted in the same period. I've gotten
spambouncer to where I hardly EVER get a false negative.

With spamassassin, the threshold of 5 is letting a few spams thru
unchallenged. At least they both unerringly catch the
Korean/Chinese spam, which seems to be around 80% of the junk I get
personally.


Tips and hints:

1. With spambouncer, define all program paths explicitly in the
.procmailrc; don't count on any of them being in the path.

2. With evolution, and only because I'd like to contribute
 to dsbl, razor, dcc, and ordb, I add a recipe to the .procmailrc which saves
 a copy of each incoming email into a directory, with the (hopefully unique)
 message-id as its file name. This way, I have a copy of the ORIGINAL letter
 on hand, which I can feed to razor-report, dccproc, etc.
 The following are the very last lines in my .procmailrc file:

...
...
...
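# Directory that holds untouched copies of incoming mail.
LETTERDIR=<path to some lonely directory>
# Fallback file name if a message has no Message-Id header.
MESSID=XXXX

# Capture the Message-Id value; \/ marks the start of what goes into MATCH.
# No local lockfile here: this block only sets a variable.
:0
* ^Message-Id: \/.+
{
        MESSID=${MATCH}
}

# Save a copy of the original message, locking the target file.
:0c:
${LETTERDIR}/${MESSID}

# Let spamassassin tag the message first...
:0fw
| /usr/local/bin/spamassassin -P -a

# ...then hand the tagged message to spambouncer.
INCLUDERC=${SBDIR}/sb.rc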


3. My advice is to mark the letter as read when you file it as spam. This
 is purely for psychological reasons. You can check in the "spam" folder 
 as often as you please, but at least your mail reader isn't going to be 
 calling to you to read the stupid stuff.

4. I wrote a little program that searches stdin for Message-Id lines and,
 for each one found, uses the message-id to find the saved copy of the
 original message and feeds that copy to razor-report, the dsbl spamtrap,
 and DCC. I could submit it to ORDB as well. I set up an alias so that mail
 sent to the alias gets run thru this program. spamassassin has a "report"
 option too, which submits to dcc and razor, and I could have used that
 just as well. To report spam, highlight the appropriate messages in the
 spam folder, hit control-j, give the spamtrap address, and away it goes.
 This will probably be even easier with 1.1 of evolution, so you don't have
 to address the message to your spamtrap alias every time. Setuid the
 executable of such sendmail-invoked programs.

5. I add a crontab command to limit the number of saved messages to maybe the
 last 3 days or so.

6. Take ALL your regular mailing lists (I'm on maybe 65 or so), and prefilter
 them out in your .procmailrc, so neither spamassassin nor spambouncer sees
 them, and you don't save any copies. Why waste cpu cycles on stuff you know
 won't be spam?

7. To keep down false positives, do a whitelist in spambouncer. It seems less
 important in spamassassin (at least, if you stick with the default threshold
 of 5!). Filter your addressbook into the .nobounce file. And if you do short
 domain names in there, like "sun.com", say "@sun.com" instead.

8. I mark spam letters as important, because they end up in different folders,
 based on the tags from spambouncer. All I have to do is glance into the
 "Important mail" VFolder to see them all at once. Nice.

9. Do NOT automatically report all mail the filters tag as spam to razor. You
 could do this with dcc, if you are careful that the count increment is "1".
 In general, I'd advise that a human eyeball each message and classify it as
 spam before submitting. It will make for much better databases. And, you can
 tell dcc that it's spam with an "infinite" count that way.
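
Tips 2, 4, and 5 fit together, and the idea can be sketched roughly as
follows in Python. This is not my actual program, just a minimal sketch
under some assumptions: the saved copies are named exactly after the
Message-Id value (as in the recipe in tip 2), all function names are
made up for illustration, and the step that feeds each copy to
razor-report, dccproc, or the spamtrap is left out entirely.

```python
import os
import re
import time

# Matches a Message-Id header line and captures its value.
MESSAGE_ID_LINE = re.compile(r"Message-Id:[ \t]*(.+)", re.IGNORECASE)


def find_message_ids(text):
    """Return every Message-Id value found at the start of a line."""
    ids = []
    for line in text.splitlines():
        match = MESSAGE_ID_LINE.match(line)
        if match:
            ids.append(match.group(1).strip())
    return ids


def archived_copies(text, letter_dir):
    """Return paths of saved originals whose names match a Message-Id."""
    paths = []
    for message_id in find_message_ids(text):
        if "/" in message_id:
            continue  # cannot be a plain file name inside letter_dir
        path = os.path.join(letter_dir, message_id)
        if os.path.isfile(path):
            paths.append(path)
    return paths


def prune_old_copies(letter_dir, max_age_days=3, now=None):
    """Delete saved copies older than max_age_days; return the count."""
    current = time.time() if now is None else now
    cutoff = current - max_age_days * 86400
    removed = 0
    for entry in os.scandir(letter_dir):
        if entry.is_file() and entry.stat().st_mtime < cutoff:
            os.remove(entry.path)
            removed += 1
    return removed
```

The reporting program would call archived_copies() on the forwarded
mail it gets on stdin, and a daily cron job could call
prune_old_copies() on the same directory to keep only the last 3 days
or so.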

It would be real cool if evolution would more tightly couple with the
spam filters, and have some features to help in spam submission to razor
and dcc, etc. But it munges the incoming letter, which may be bad for
checksum databases like dcc and razor. Then again, maybe not, but rather
than take a chance, I keep an original around.

What could evolution do to better couple with spam checking software? How
about:

1. Have a special attribute, similar to "importance", for "spam".
2. Provide a default "spam" folder in evolution, which is just part of the
   default installation.
3. Default filters that file stuff in the spam folder if X-SBClass: is not "OK",
   or if "X-Spam-Flag: YES" is set. This'd give a new user a "fast start". And
   they could set the "spam" attribute.
4. If the configure script spots a working installation of razor-report or
   dccproc, it'd be nice to turn on options that would allow you to submit
   selected letters to them. If everybody feels that evolution-munged letters
   yield the same checksums as the original, that is!
5. Auto-whitelist update for spambouncer and spamassassin, based on the
   addressbook. Maybe even options in the addressbook to mark (checkbox?) 
   for those you don't want to add to the whitelists, with general options
   of what whitelists to generate, and where they would be located. Spamassassin
   and spambouncer don't use the exact same format. Maybe a button to write the 
   whitelists from the addressbook... or somesuch.
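
To make item 5 a bit more concrete, here is a rough sketch in Python
of turning a list of addressbook entries into the two whitelists. It
is only an illustration under assumptions: I assume the .nobounce file
takes one entry per line, and that spamassassin reads whitelist_from
lines with a "*" wildcard for whole domains; the function names and
the "excluded" parameter (standing in for the checkbox idea) are made
up. Bare domains get an "@" in front, as suggested in tip 7.

```python
def whitelist_entries(addresses, excluded=()):
    """Normalize addressbook entries, dropping blanks, dupes, exclusions."""
    skip = {item.strip().lower() for item in excluded}
    seen = set()
    entries = []
    for raw in addresses:
        entry = raw.strip().lower()
        if not entry or entry in skip:
            continue
        if "@" not in entry:
            # Bare domain such as "sun.com": anchor it as "@sun.com".
            entry = "@" + entry
        if entry not in seen:
            seen.add(entry)
            entries.append(entry)
    return entries


def nobounce_text(entries):
    """Render entries for spambouncer, one per line (assumed format)."""
    return "".join(entry + "\n" for entry in entries)


def spamassassin_text(entries):
    """Render entries as whitelist_from lines (assumed format)."""
    lines = []
    for entry in entries:
        pattern = "*" + entry if entry.startswith("@") else entry
        lines.append("whitelist_from " + pattern + "\n")
    return "".join(lines)
```

A "write whitelists" button would then just write each rendered text
to wherever the user says the corresponding file lives.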


murf


On Tue, 2002-07-09 at 12:22, evolution-admin ximian com wrote:
Message: 14
From: "Patrick J. Doland" <pjdoland pjdoland com>
To: evolution ximian com
Date: 09 Jul 2002 14:33:11 -0400
Subject: [Evolution] Spam Filter

Howdy-

Does anyone have a relatively effective set of filter rules to reduce
spam with Evolution?



