Re: [Evolution] Junk handling wishlist



On Mon, 2009-03-09 at 13:12 -0400, Sal Valente wrote:
Patrick O'Callaghan wrote:

Have you read

http://www.go-evolution.org/FAQ#Why_does_Evolution_not_automatically_filter_for_spam.3F ?

Yes.  It says:

Note: "Learn", not identify. Messages are learned either by manually
classifying them, or if a certain threshold is reached (which is more
extreme than the line between Spam and Ham).

That's exactly the distinction I'm talking about.

The above refers to the *initial* learning phase when you first install
the filter. In fact it's best to do that manually, outside of Evo.

Patrick wrote:
 
2. It should be easy to make sure that I train my spam filter with
every single email that I receive.

If it's being junk-filtered, the filter is being trained.

A message can be filtered but not trained.  Or, to use the terminology
in the FAQ, a message can be identified as spam but not trained as spam.

You may be right. I've reviewed the documentation and I find it very
unclear on this point, but the online help does contain the following:
"When you correct it, the filter can recognize similar messages in the
future, and becomes more accurate as time goes on.", which would favor
your interpretation.

However, if this is the case, the point of training is to tell the filter
where it went wrong, i.e. it trains when you manually correct its
classification. Why would you train it if it got it right?

2a. For the messages outside of my Junk folder, can the user interface
show the status - Not Junk or Unknown?

If it's not in the Junk folder, it's not Junk.

Once again from the online help: "Messages that are flagged as junk mail
are displayed only in the Junk folder."

When I see an old message in some folder other than the Junk folder, I
know that one of three things has happened.  Either:

1. bogofilter said the message was 100% ham, and the message was learned.
2. bogofilter said the message was 50% ham, and then I clicked "Not Junk",
   and the message was learned.
3. bogofilter said the message was 50% ham, and the message has not been
   learned.

4. Bogofilter has not seen the message. The Junk filter only looks at
new (\Unseen) messages, unless you mark them manually for training or do
Message->Check For Junk.

I want the user interface to identify "Not Junk" messages (types 1 and
2) and "Unknown" messages (type 3).

Currently Evo only marks Junk messages by putting them (virtually) in
the Junk folder. Everything else is either not Junk or not classified,
i.e. it has no concept of "Classified but Unknown". What you are asking
for is an enhancement, which you should request on
http://bugzilla.gnome.org.

Note that you can use "<Junk Test> <is Junk>" in filters, which might
get you part of the way, perhaps assigning a color or label as the
filter action.

The interface should discourage me from clicking "Not Junk" (again) on
messages of type 1 and 2, and
it should discourage me from deleting messages of type 3 without first
clicking "Not Junk".  This seems like a fairly basic requirement for
spam handling.  I've used Thunderbird a little bit, and I've used
Apple Mail a little bit, and I think that they both do it.  I think
they color-code the message headers.  I assumed that Evolution can do
this too, somehow.  Can't it?

Not as far as I know, see above.

Also:

2c. When I do "Check for Junk" (or the check happens automatically)
and Evolution moves a message to the Junk folder, can it train the
message as Junk while moving it?

That's what it does. Do you have an indication that that isn't
happening?

To repeat: simply running bogofilter (BF) doesn't train it. It trains
when you correct it. See above.

Yes.  First, I run
"bogoutil -d .bogofilter/wordlist.db | grep MSG_COUNT"
and it says:
.MSG_COUNT 430 919 20090309

Then, I go to evolution, select a new message, and do "Check for
Junk".  Evolution moves the message into my junk folder.  Then I run
the bogoutil command again, and it still says "430 919".

AFAIK "bogoutil -d .bogofilter/wordlist.db | grep MSG_COUNT" gives you
the spam and ham word counts, which is not the whole story, e.g. if your
new spam message doesn't contain any new words these counts will not
change, *even if* BF is training on the message. You'd need to fabricate
a message that looks to BF like ham, but with some unknown word,
explicitly mark it as Junk, and repeat the experiment to see what
happens.
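
For anyone repeating the before/after comparison described above, here
is a minimal Python sketch. It assumes only what appears in this thread:
the "bogoutil -d" command and a ".MSG_COUNT" line with two integers and
a date. The names parse_msg_count, counts_changed and dump_wordlist are
made up for illustration, and labelling the two numbers "spam" and "ham"
simply follows the order given above, so treat that as an assumption.

```python
import subprocess
from typing import NamedTuple, Optional


class MsgCount(NamedTuple):
    spam: int   # first number on the .MSG_COUNT line (assumed: spam)
    ham: int    # second number on the .MSG_COUNT line (assumed: ham)
    date: str   # trailing field, e.g. "20090309"


def parse_msg_count(dump_text: str) -> Optional[MsgCount]:
    """Return the .MSG_COUNT entry from a wordlist dump, or None if absent."""
    for line in dump_text.splitlines():
        fields = line.split()
        if len(fields) == 4 and fields[0] == ".MSG_COUNT":
            return MsgCount(int(fields[1]), int(fields[2]), fields[3])
    return None


def counts_changed(before: MsgCount, after: MsgCount) -> bool:
    """True if either number differs; the date field is ignored."""
    return (before.spam, before.ham) != (after.spam, after.ham)


def dump_wordlist(path: str) -> str:
    """Run "bogoutil -d PATH" (the command quoted in this thread)."""
    result = subprocess.run(
        ["bogoutil", "-d", path], capture_output=True, text=True, check=True
    )
    return result.stdout
```

Call parse_msg_count(dump_wordlist(".bogofilter/wordlist.db")) once
before and once after doing "Check for Junk" (or after explicitly
marking the test message as Junk), then pass both results to
counts_changed.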

Sorry for the confusion.

poc



