[Evolution-hackers] spam filtering

From: Radek Doulík <rodo ximian com>
To: Evolution Hackers Mailing List <evolution-hackers ximian com>
Subject: [Evolution-hackers] spam filtering
Date: Wed, 01 Oct 2003 18:27:19 +0200

Hi all,

Before I start implementing spam filtering for Evolution, I would like to discuss my plan. Please read the whole mail and comment. I describe the model from the user's point of view first, then the implementation details and some things to think about. I took Ettore's model as a base and modified it a little, mostly by simplifying it.

User view

Incoming messages are classified by the spam filter as spam or nospam (IMAP messages are filtered once they are fully downloaded).
Spam messages are moved to the Spam folder or deleted.
A new [No]Spam button on the toolbar and a matching item in the menubar (Actions/[No]Spam). When a message has been identified as spam, the button/item says NoSpam and revives the message from the Spam folder (the spam flag is set to false and the incoming message filters are applied). For nospam messages it says Spam and marks the message as spam (the spam flag is set to true and the message is moved to the Spam folder).

A new page labeled "Spam filtering" in the Mail preferences section of the Settings dialog:

[checkbox] filter incoming messages - default: enabled
Spam messages are [option menu - moved to Spam folder/deleted] default: moved to Spam folder
Spam filter [option menu - spam filters list] default: 1st filter
Filter options frame with filter specific options

The model described above is the simplest one I have. I think simplicity is good here; it also lowers the risk for a time-based schedule. Additional features can be implemented once this model works.

Additional features

display spam filter score

"Check spam" filter rule

Some people may not want to filter every incoming message (because it could be too slow) and would rather filter messages only per folder (it is OK to have spam messages in mailing list folders, but not in the personal mail folder).

more - add your favorite feature here

What do you think about this model?

How do you feel about forcing spam messages to be listed only in the Spam folder?

Implementation

I believe it is worth making the spam filter(s) pluggable. This has several advantages:

a spam filter plugin can be developed outside Evolution => faster development, lower barrier for external developers
simple API, no added complexity
I don't see anything that a filter implemented inside Evolution could do and a plugin could not

A plugin will be a shared library loaded with dlopen/dlsym. Evo will get the SpamFilterPlugin struct via dlsym, check api_version and then use the supplied methods.

typedef struct _SpamFilterPlugin SpamFilterPlugin;
struct _SpamFilterPlugin
{
	/* spam filter human readable name */
	gchar *name;
	/* should be set to 1 */
	gint   api_version;

	/* when called, it should return TRUE if message is identified as spam,
	   FALSE otherwise */
	gboolean (*check_spam)    (CamelMimeMessage *message);
	/* called when user identified a message to be spam */
	void     (*report_spam)   (CamelMimeMessage *message);
	/* called when user identified a message not to be spam */
	void     (*report_nospam) (CamelMimeMessage *message);

	/* when called, it should insert its own configuration GUI into the
	   supplied container. It returns a data pointer which is later passed
	   to apply. The plugin has to call (*changed_cb) (); whenever the
	   configuration is changed, to notify the settings dialog about the
	   change. If setup_config_ui is NULL, the plugin has no options */
	gpointer (*setup_config_ui) (GtkWidget *container, void (*changed_cb) (void));
	void     (*apply)           (gpointer data);
};
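
As a rough illustration of the loading step (dlopen, dlsym, api_version check), the host side could look something like the sketch below. The exported symbol name spam_filter_plugin, the helper names spam_plugin_load and spam_plugin_is_usable, and the extra check that check_spam is non-NULL are my assumptions for illustration only. The typedefs are stand-ins so that the sketch compiles without the GLib, GTK and Camel headers.

```c
#include <dlfcn.h>
#include <stddef.h>
#include <stdio.h>

/* Stand-ins so the sketch compiles on its own; real code would include
   the GLib, GTK and Camel headers instead. */
typedef char gchar;
typedef int gint;
typedef int gboolean;
typedef void *gpointer;
typedef struct _CamelMimeMessage CamelMimeMessage;
typedef struct _GtkWidget GtkWidget;

typedef struct _SpamFilterPlugin SpamFilterPlugin;
struct _SpamFilterPlugin {
	gchar *name;
	gint   api_version;
	gboolean (*check_spam)    (CamelMimeMessage *message);
	void     (*report_spam)   (CamelMimeMessage *message);
	void     (*report_nospam) (CamelMimeMessage *message);
	gpointer (*setup_config_ui) (GtkWidget *container, void (*changed_cb) (void));
	void     (*apply)           (gpointer data);
};

#define SPAM_FILTER_API_VERSION 1               /* "should be set to 1" */
#define SPAM_FILTER_SYMBOL "spam_filter_plugin" /* assumed symbol name */

/* Returns 1 if the host can use this plugin, 0 otherwise.
   Requiring check_spam to be non-NULL is an assumption of this sketch. */
int
spam_plugin_is_usable (const SpamFilterPlugin *plugin)
{
	return plugin != NULL
		&& plugin->api_version == SPAM_FILTER_API_VERSION
		&& plugin->check_spam != NULL;
}

/* Loads the plugin at path. Returns NULL on any failure. On success,
   *handle_out holds the dlopen handle so the caller can dlclose it later. */
SpamFilterPlugin *
spam_plugin_load (const char *path, void **handle_out)
{
	void *handle = dlopen (path, RTLD_NOW | RTLD_LOCAL);
	SpamFilterPlugin *plugin;

	*handle_out = NULL;
	if (handle == NULL) {
		fprintf (stderr, "spam plugin: %s\n", dlerror ());
		return NULL;
	}

	plugin = (SpamFilterPlugin *) dlsym (handle, SPAM_FILTER_SYMBOL);
	if (!spam_plugin_is_usable (plugin)) {
		dlclose (handle);
		return NULL;
	}

	*handle_out = handle;
	return plugin;
}
```

A plugin with an unknown api_version is rejected before any of its methods is called, so the struct layout can change in a later version without old plugins being misread.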

Spam will be identified by the check_spam method; changes of spam status will be reported to the filter through the report_[no]spam methods. A plugin may or may not provide a configuration GUI for the Settings dialog.
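
To make the division of work concrete, here is a minimal sketch of a toy plugin plus the host side of the [No]Spam action. Everything named toy_* and host_toggle_spam is hypothetical, the keyword test is only a placeholder for a real filter, and the CamelMimeMessage below is a toy stand-in with a single subject field, not the real Camel type. Moving the message between folders is left out.

```c
#include <stddef.h>
#include <string.h>

/* Stand-ins so the sketch compiles without GLib, GTK or Camel. */
typedef char gchar;
typedef int gint;
typedef int gboolean;
typedef void *gpointer;
typedef struct _GtkWidget GtkWidget;
/* Toy message type: NOT the real CamelMimeMessage. */
typedef struct _CamelMimeMessage { const char *subject; } CamelMimeMessage;

typedef struct _SpamFilterPlugin {
	gchar *name;
	gint   api_version;
	gboolean (*check_spam)    (CamelMimeMessage *message);
	void     (*report_spam)   (CamelMimeMessage *message);
	void     (*report_nospam) (CamelMimeMessage *message);
	gpointer (*setup_config_ui) (GtkWidget *container, void (*changed_cb) (void));
	void     (*apply)           (gpointer data);
} SpamFilterPlugin;

/* Counters showing that user corrections reach the plugin; a real filter
   would retrain itself here. */
int toy_spam_reports = 0;
int toy_nospam_reports = 0;

gboolean
toy_check_spam (CamelMimeMessage *message)
{
	/* Placeholder rule: a fixed phrase in the subject means spam. */
	return message->subject != NULL
		&& strstr (message->subject, "FREE MONEY") != NULL;
}

void
toy_report_spam (CamelMimeMessage *message)
{
	(void) message;
	toy_spam_reports++;
}

void
toy_report_nospam (CamelMimeMessage *message)
{
	(void) message;
	toy_nospam_reports++;
}

/* setup_config_ui and apply are NULL: this plugin has no options. */
SpamFilterPlugin spam_filter_plugin = {
	"Toy keyword filter", 1,
	toy_check_spam, toy_report_spam, toy_report_nospam,
	NULL, NULL
};

/* Host side of the [No]Spam action: tells the plugin about the user's
   correction and returns the new value of the spam flag. */
gboolean
host_toggle_spam (SpamFilterPlugin *plugin, CamelMimeMessage *message,
                  gboolean is_spam)
{
	if (is_spam)
		plugin->report_nospam (message);
	else
		plugin->report_spam (message);
	return !is_spam;
}
```

The host calls check_spam for incoming mail and host_toggle_spam only when the user disagrees with the result, so the report_[no]spam methods see exactly the messages the filter got wrong.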

The spam flag will be stored in an X-Spam: header. For IMAP we may also need an X-Evolution-Spam-Checked header.
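
A minimal sketch of reading the flag back from a raw header block follows. The value format is not decided yet, so the "true"/"false" values, the function names and the rule that a missing header means "not checked yet" are all assumptions of this sketch. Folded (continued) header lines are not handled.

```c
#include <stddef.h>
#include <string.h>
#include <strings.h> /* strncasecmp (POSIX) */

typedef enum {
	SPAM_STATE_UNCHECKED, /* no X-Spam header found (assumed meaning) */
	SPAM_STATE_NOSPAM,
	SPAM_STATE_SPAM
} SpamState;

/* Returns a pointer to the value of header `name` inside `headers`
   (lines separated by '\n'), or NULL if there is no such header.
   The name must be followed directly by ':', so "X-Spam" does not
   match "X-Spam-Level". */
const char *
find_header_value (const char *headers, const char *name)
{
	size_t len = strlen (name);
	const char *line = headers;

	while (line != NULL && *line != '\0') {
		if (strncasecmp (line, name, len) == 0 && line[len] == ':') {
			const char *value = line + len + 1;

			while (*value == ' ' || *value == '\t')
				value++;
			return value;
		}
		line = strchr (line, '\n');
		if (line != NULL)
			line++; /* step over the newline to the next header */
	}
	return NULL;
}

/* Maps the X-Spam header to a three-valued state. Any value other than
   "true" is treated as nospam (an assumption of this sketch). */
SpamState
spam_state_from_headers (const char *headers)
{
	const char *value = find_header_value (headers, "X-Spam");

	if (value == NULL)
		return SPAM_STATE_UNCHECKED;
	return strncasecmp (value, "true", 4) == 0
		? SPAM_STATE_SPAM : SPAM_STATE_NOSPAM;
}
```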

From the discussion on the mailing list, it looks like everybody is in favor of using a vFolder for the Spam folder. I am not sure it is that great. Consider this: about 90% of spam messages are identified correctly, so at worst only 10% of spam will be moved between folders. I am not sure how resource-hungry vFolders are. Also, messages which end up in a vFolder stay there until Expunge. So if I am correct, we would have to implement message removal from a vFolder. Mail guys, is that right?

If we put them in a vFolder, are they going to be visible in the source folder as well?

If spam messages stay only in the Spam folder, we need neither a new message list column for the spam flag nor a "Delete spam mails" action in the menu.

So the location of spam mails seems to be crucial here. I like the simplicity of spam mails being visible only in the Spam folder. What do you think: are there any advantages to having spam messages visible in the source folders?

I plan to write SpamAssassin and Bogofilter plugins (I expect Bogofilter may work faster, but I have tried only SpamAssassin so far).

Looking forward to your comments
Radek

Follow-Ups:
- Re: [Evolution-hackers] spam filtering
  - From: Steve Heggood
- Re: [Evolution-hackers] spam filtering
  - From: Mark Gordon
- Re: [Evolution-hackers] spam filtering
  - From: Jeffrey Stedfast
- Re: [Evolution-hackers] spam filtering
  - From: Curtis C. Hovey
- Re: [Evolution-hackers] spam filtering
  - From: Not Zed
- Re: [Evolution-hackers] spam filtering
  - From: Brett Johnson
- Re: [Evolution-hackers] spam filtering
  - From: Ettore Perazzoli
- Re: [Evolution-hackers] spam filtering
  - From: guenther
- Re: [Evolution-hackers] spam filtering
  - From: Dave Malcolm
