Re: [Evolution-hackers] spam filtering

From: Brett Johnson <brett hp com>
To: Radek Doulík <rodo ximian com>
Cc: Evolution Hackers Mailing List <evolution-hackers ximian com>
Subject: Re: [Evolution-hackers] spam filtering
Date: Thu, 02 Oct 2003 09:59:27 -0600
On Wed, 2003-10-01 at 10:27, Radek Doulík wrote:

>       * new [No]Spam button on toolbar and item in menubar
>         Actions/[No]Spam. when message was identified as Spam,
>         button/item says NoSpam to revive the message from Spam folder
>         (spam flag is set to false and incoming message filters are
>         applied). For nospam messages it says spam to mark message as
>         Spam (spam flag is set to true and message is moved to Spam
>         folder).

From a UI standpoint this seems unnecessarily complicated.  If we're
going to have a dedicated spam folder, couldn't the act of simply moving
(dragging) a message in or out of the spam folder indicate a
reclassification (and call the filter's report_(no)spam methods)?  I'd
much prefer this to adding more cruft to the toolbar & menus (and having
a toolbar button move the message as a side-effect scares me).
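
A minimal sketch of the drag-to-reclassify idea, assuming the mailer can
tell whether the source and destination folders are spam folders.  The
enum and function names are hypothetical, not part of the proposal; the
two non-trivial results correspond to calling the plugin's report_spam
and report_nospam methods.

```c
#include <stdbool.h>

/* Hypothetical result type: which plugin report, if any, a move implies. */
typedef enum {
	ACTION_NONE,
	ACTION_REPORT_SPAM,
	ACTION_REPORT_NOSPAM
} MoveAction;

/* Decide what a move between two folders means for classification.
 * Only a move that crosses the spam-folder boundary changes anything. */
static MoveAction action_for_move(bool src_is_spam_folder, bool dst_is_spam_folder)
{
	if (!src_is_spam_folder && dst_is_spam_folder)
		return ACTION_REPORT_SPAM;    /* dragged into the spam folder */
	if (src_is_spam_folder && !dst_is_spam_folder)
		return ACTION_REPORT_NOSPAM;  /* dragged out of the spam folder */
	return ACTION_NONE;                   /* ordinary move, no reclassification */
}
```

Moves between two ordinary folders, or between two spam folders, leave
the classification alone.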

> 
>       * new page labeled "Spam filtering" in Mail preferences section
>         of Settings dialog 
>                 [checkbox] filter incoming messages - default: enabled

A point of clarification: This checkbox should not disable the ability
to reclassify spam/ham.  In my case, filtering incoming messages is a
complete waste of time, as I run my spam filter on the server.  But
being able to easily reclassify messages for the server (via an
appropriate plugin) would be somewhat useful.

> Additional features
> 
>       * display spam filter score

Seems like pure geek eye-candy to me.  What would be the possible
utility for Average Joe User?  If you really want to see the score, just
display the message headers (almost all spam filters have the ability to
add a spam-score type header to the message).

> How do you feel about forcing spam messages to be listed only in Spam
> folder?

It's OK as long as it's consistent (i.e. putting a message in the spam
folder automatically classifies it as spam, and removing it
automatically classifies it as non-spam).

> Implementation
> 
> I believe it's worth to make spam filter(s) pluggable.

I'll go a step further and state that IMO it's the only way to implement
spam filtering that's worth doing.  If it's not pluggable, don't bother.

> typedef struct _SpamFilterPlugin SpamFilterPlugin;
> struct _SpamFilterPlugin
> {
> 	/* spam filter human readable name */
> 	gchar *name;
> 	/* should be set to 1 */
> 	gint   api_version;
> 
> 	/* when called, it should return TRUE if message is identified as spam,
> 	   FALSE otherwise */
> 	gboolean (*check_spam)    (CamelMimeMessage *message);
> 	/* called when user identified a message to be spam */
> 	void     (*report_spam)   (CamelMimeMessage *message);
> 	/* called when user identified a message not to be spam */
> 	void     (*report_nospam) (CamelMimeMessage *message);

I think it should be specified that these pointers are allowed to be
NULL, and are simply not called in that case.  (I can see an immediate
utility for a plugin that doesn't implement "check_spam", but that does
implement both the "report_" routines).
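
As a rough illustration of the NULL-tolerant behaviour, here is a sketch
of how the host side might guard each call.  It uses plain C stand-ins
(Message for CamelMimeMessage, bool for gboolean) so that it compiles
without Camel or GLib; the wrapper functions and the report-only plugin
are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for CamelMimeMessage so the sketch compiles without Camel. */
typedef struct Message { const char *subject; } Message;

/* Mirrors the proposed struct, with plain C types in place of GLib's. */
typedef struct SpamFilterPlugin {
	const char *name;
	int api_version;
	bool (*check_spam)(Message *message);     /* may be NULL */
	void (*report_spam)(Message *message);    /* may be NULL */
	void (*report_nospam)(Message *message);  /* may be NULL */
} SpamFilterPlugin;

/* Hypothetical host-side wrapper: a plugin without check_spam
 * never flags a message itself. */
static bool plugin_check_spam(const SpamFilterPlugin *p, Message *m)
{
	return p->check_spam != NULL && p->check_spam(m);
}

/* Hypothetical host-side wrapper: a missing report entry point is skipped. */
static void plugin_report(const SpamFilterPlugin *p, Message *m, bool is_spam)
{
	void (*fn)(Message *) = is_spam ? p->report_spam : p->report_nospam;
	if (fn != NULL)
		fn(m);
}

/* Example plugin that only reports, e.g. to retrain a server-side filter.
 * Here the reports are merely counted. */
static int spam_reports, ham_reports;
static void count_spam(Message *m) { (void)m; spam_reports++; }
static void count_ham(Message *m)  { (void)m; ham_reports++; }

static const SpamFilterPlugin report_only_plugin = {
	"report-only", 1, NULL, count_spam, count_ham
};
```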

Also, there needs to be some way to *re*classify spam/ham, ala
bogofilter (i.e. remove it from the spam database, and enter it into the
ham database).  This could be two new entrypoints:

	void     (*reclassify_spam)   (CamelMimeMessage *message);
	void     (*reclassify_nospam) (CamelMimeMessage *message);

Or, could be an additional argument to the "report_*" methods:

	void     (*report_spam)   (CamelMimeMessage *message, gboolean reclassify);
	void     (*report_nospam) (CamelMimeMessage *message, gboolean reclassify);
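
To make the second variant concrete, here is a sketch of how a
bogofilter-backed plugin might turn the reclassify argument into
command-line flags.  The helper name is hypothetical; the flags follow
bogofilter's registration options as I understand them (-s/-n register
as spam/non-spam, -S/-N unregister), so check them against the man page
for the version in use.

```c
#include <stdbool.h>

/* Pick the bogofilter registration flags for a user report.
 * is_spam:    the user's verdict on the message.
 * reclassify: the message was previously registered under the opposite
 *             class, so that registration has to be undone first. */
static const char *bogofilter_flags(bool is_spam, bool reclassify)
{
	if (is_spam)
		return reclassify ? "-Ns" : "-s";  /* -N undoes ham, -s adds spam */
	return reclassify ? "-Sn" : "-n";          /* -S undoes spam, -n adds ham */
}
```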

> Spam flag will be stored in X-Spam: header. Also for IMAP we may need
> X-Evolution-Spam-Checked header.

I think it would be *much* simpler to identify spam by what folder it's
in (kinda like presence in the "Drafts" folder currently identifies an
unfinished message), and leave message headers (or any munging of the
message contents) completely out of this.  This concept obviously
requires that the user have the ability to specify which folder(s)
contain "spam" -- maybe this could be added to the "Defaults" tab in the
account editor (ala the "drafts" folder)?
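
A sketch of what that per-account setting could look like, assuming
folders are identified by URI strings.  The struct, its fields, and the
lookup function are hypothetical and not existing Evolution API.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical per-account defaults, alongside the drafts folder. */
typedef struct AccountDefaults {
	const char *drafts_folder;
	const char *const *spam_folders;  /* NULL-terminated list; may be NULL */
} AccountDefaults;

/* A message counts as spam exactly when its folder is in the list. */
static bool folder_is_spam(const AccountDefaults *acct, const char *folder_uri)
{
	if (acct->spam_folders == NULL)
		return false;  /* no spam folder configured for this account */
	for (const char *const *f = acct->spam_folders; *f != NULL; f++) {
		if (strcmp(*f, folder_uri) == 0)
			return true;
	}
	return false;
}
```

With this approach nothing is written into the message itself, so a
server-side filter that files spam into one of the listed folders needs
no extra cooperation from the client.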

> From discussion on the mailing list, it looks like everybody is for
> using vFolder for Spam folder.

Well, using a vfolder pretty much dictates that the implementation muck
with the message to mark it as spam somehow (which, as I just stated,
would be much more complicated than simply moving the message to a
specific folder).  Not using vfolders would allow the implementation to
co-exist with other server-side spam filtering/classification techniques
(so I'd vote for *not* using vfolders).

> If we put them in vfolder, are they going to be visible in the source
> folder?

Ack -- that's just what I need -- double the number of places that spam
is displayed...  No thanks ;o)

> If spam messages will stay in Spam folder only, we don't need new mail
> message list column with spam flag and also "Delete spam mails" action
> in menu.

Yay!  Less complexity == goodness.

> So the spam mails location seems to be crucial here. I like the
> simplicity of spam mails to be only visible in Spam folder. What do
> you think, are there any advantages of having spam messages visible in
> source folders?

I can't think of any reason to display spam in two places (in fact,
doing so seems to defeat the whole purpose of spam filtering in the
first place, doesn't it?).

Cheers!
-- 
Brett Johnson <brett hp com>
   -  i  n  v  e  n  t  -