[Kde-pim] CRM114 antispam score display (was: Re: branches/KDE/3.5/kdepim/kmail)
Martin Steigerwald
Martin at lichtvoll.de
Sun Jul 8 18:07:56 BST 2007
On Sunday 03 June 2007, Ingo Klöcker wrote:
Hi Ingo, Andreas, KMail and KDEPIM developers,
> On Saturday 02 June 2007 01:18, Martin Steigerwald wrote:
[...]
> > On Monday 28 May 2007, Ingo Klöcker wrote:
[...]
> > Headers look like this:
> >
> > ---------------------------------------------------------------------
> > martin at shambala:~/Mail> grep -ir "X-CRM114-Status:" * | cut -d":"
> > -f3,4 | grep SPAM
> > X-CRM114-Status: SPAM ( -43.62 )
> > X-CRM114-Status: SPAM ( -17.78 )
> > X-CRM114-Status: SPAM ( -61.96 )
> > X-CRM114-Status: SPAM ( -15.03 )
> >
> > martin at shambala:~/Mail> grep -ir "X-CRM114-Status:" * | cut -d":"
> > -f3,4 | grep GOOD | head -10
> > X-CRM114-Status: GOOD ( 11.09 )
> > X-CRM114-Status: GOOD ( 304.35 )
> > X-CRM114-Status: GOOD ( 81.34 )
[...]
> > martin at shambala:~/Mail> grep -ir "X-CRM114-Status:" * | cut -d":"
> > -f3,4 | grep UNSURE | head -10
> > X-CRM114-Status: UNSURE ( -1.80 )
> > X-CRM114-Status: UNSURE ( 3.46 )
> > X-CRM114-Status: UNSURE ( 3.68 )
> > X-CRM114-Status: UNSURE ( 9.94 )
[...]
> It seems we have to introduce yet another score type since with CRM114
> spam has large negative scores while ham has large positive scores.
Well yes. Maybe something general where you can specify the complete score
range and the necessary thresholds would be suitable.
ScoreRange=-400,400
ScoreUnsureThreshold=-10
ScoreGoodThreshold=10
Or just one range for each of those?
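A minimal sketch of how such a range-plus-thresholds setting could be
evaluated, in Python purely for illustration. This is not KMail code. The
class and function names are hypothetical, the -10 / 10 values are only the
example numbers from above (the real CRM114 thresholds are not known here),
and which side of each boundary counts as which class is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScoreConfig:
    """Hypothetical settings mirroring the keys proposed above."""
    range_min: float = -400.0         # lower bound of ScoreRange
    range_max: float = 400.0          # upper bound of ScoreRange
    unsure_threshold: float = -10.0   # ScoreUnsureThreshold (example value)
    good_threshold: float = 10.0      # ScoreGoodThreshold (example value)


def classify(score: float, cfg: ScoreConfig = ScoreConfig()) -> str:
    """Return 'SPAM', 'UNSURE' or 'GOOD' for a CRM114-style score.

    Assumption: scores below the unsure threshold are spam, scores at or
    above the good threshold are good, everything in between is unsure.
    """
    # Clamp into the configured range so out-of-range values stay sane.
    score = max(cfg.range_min, min(cfg.range_max, score))
    if score < cfg.unsure_threshold:
        return "SPAM"
    if score < cfg.good_threshold:
        return "UNSURE"
    return "GOOD"


if __name__ == "__main__":
    for s in (-43.62, -1.80, 9.94, 11.09, 304.35):
        print(s, classify(s))
```

With these example thresholds the sketch agrees with the sample headers
quoted above (for instance -15.03 is SPAM, 9.94 is UNSURE, 11.09 is GOOD).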
> > From what I understand, I need to know at least the exact threshold
> > at which CRM114 classifies a mail as SPAM?
>
> Yes.
I will ask on the crm114-general mailing list about that one. CRM114 does
not seem to specify the threshold in its headers, and the thresholds may
vary depending on the classifier one uses. Maybe it would be good if
CRM114 put the thresholds for SPAM and UNSURE into the headers somehow.
> > Ingo, Andreas what about mails classified as UNSURE? Does spam score
> > display in KMail support those?
>
> Well, I guess for scores corresponding to UNSURE the color bar should
> be partially filled. For ham the color bar should be empty and for spam
> it should be completely filled.
Actually, I do not fully understand the spam score display...
> > I have holidays in the next two weeks, I will be with limited
> > internet access next week but after that really like to take the time
> > to look into trying to bring together suitable KMail spam score
> > display configuration statements for KMail to finally complete the
> > CRM114 configuration for KMail...
... well, after facing the difference between theory and practice, I
managed to do at least a minimal spam score display for CRM114. I just put
in a boolean filter for now[1]:
ScoreName=CRM114
ScoreHeader=X-CRM114-Status
ScoreType=Bool
ScoreValueRegexp=SPAM
ScoreThresholdRegexp=
As far as I understand, that's the best that works out of the box for
now. At least KMail distinguishes between spam and ham/unsure mails in
the spam score display.
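To illustrate what such a boolean rule amounts to, here is a rough Python
sketch. It is not how KMail implements it. It only assumes that the
ScoreValueRegexp is searched for anywhere in the value of the ScoreHeader;
the function name and its defaults are made up for this example.

```python
import re
from email import message_from_string


def is_spam(raw_message: str,
            header: str = "X-CRM114-Status",
            value_regexp: str = "SPAM") -> bool:
    """Return True if the header exists and its value matches the regexp.

    Assumption: the regexp may match anywhere in the header value, so
    'SPAM ( -43.62 )' matches while 'GOOD ...' and 'UNSURE ...' do not.
    """
    msg = message_from_string(raw_message)
    value = msg.get(header)  # None when the header is missing
    if value is None:
        return False
    return re.search(value_regexp, value) is not None


if __name__ == "__main__":
    sample = "X-CRM114-Status: SPAM ( -43.62 )\nSubject: test\n\nbody\n"
    print(is_spam(sample))
```

A rule like this can only say spam or not spam, which is why UNSURE and
GOOD mails end up looking the same in the display.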
But there is one thing I do not get yet: when a mail is spam, I get a
color gradient from green over yellow to red displayed. Is that correct
behaviour? I wonder why there is green in there at all when it's spam.
When a mail is unsure or ham, I get a blank box. I would have expected
something green here ;-). For some mails that were flagged by
SpamAssassin I got a partially filled box with a partial color gradient,
for example the gradient up to yellow. I would have interpreted this as
UNSURE.
So how does this actually work? Maybe it should be rethought a bit; I do
not think it's very intuitive. I would use the following:
- a red (SPAM) / yellow (UNSURE) / green (HAM) box for a boolean /
triplean ;-)
- a red (SPAM) / yellow (UNSURE) / green (HAM) bar that displays the
amount of spamicity, unsurecity or hamicity. Hmmm, but this might be
confusing as well. Need to think about this a bit more.
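
The two variants above could be sketched roughly like this, in Python for
illustration only. Everything here is an assumption: the function names,
the regular expression for the X-CRM114-Status value, and the full_scale
value of 100 used to turn a score into a fill level. CRM114 writes GOOD in
its header where the list above says HAM. How an UNSURE score should map
to a fill level is left open, so the sketch simply uses the magnitude.

```python
import re

# Matches values such as "SPAM ( -43.62 )" or "GOOD ( 304.35 )".
STATUS_RE = re.compile(r"(SPAM|UNSURE|GOOD)\s*\(\s*(-?\d+(?:\.\d+)?)\s*\)")

COLORS = {"SPAM": "red", "UNSURE": "yellow", "GOOD": "green"}


def parse_status(value):
    """Return (status, score) from a header value, or None if it is unparsable."""
    m = STATUS_RE.search(value)
    if m is None:
        return None
    return m.group(1), float(m.group(2))


def box_color(value):
    """Variant 1: a plain box colored by status only."""
    parsed = parse_status(value)
    return COLORS[parsed[0]] if parsed else None


def bar(value, full_scale=100.0):
    """Variant 2: return (color, fill), with fill between 0.0 and 1.0.

    fill is the score magnitude relative to full_scale (a made-up value),
    capped at 1.0.
    """
    parsed = parse_status(value)
    if parsed is None:
        return None
    status, score = parsed
    return COLORS[status], min(abs(score) / full_scale, 1.0)


if __name__ == "__main__":
    for v in ("SPAM ( -43.62 )", "UNSURE ( 3.46 )", "GOOD ( 304.35 )"):
        print(v, box_color(v), bar(v))
```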
Anyway, supporting unsure mails in the spam score display means touching
some C++ code, as does supporting a new score type for the CRM114 score
range. I have not dug into this yet. My last C programming experience was
years ago, and that wasn't C++, although it did use an object-oriented
GUI framework. Well, let's see. If I manage to take some more time for
this, I will have a look at the source code of the antispam stuff and see
whether I can make sense of it.
If someone wants to help with the C++ part, I would gladly appreciate it.
And if I have questions when looking at the source, I will find someone
to ask ;-).
[1]
http://websvn.kde.org/trunk/KDE/kdepim/kmail/kmail.antispamrc?r1=675120&r2=685300
Regards,
--
Martin 'Helios' Steigerwald - http://www.Lichtvoll.de
GPG: 03B0 0D6C 0040 0710 4AFA B82F 991B EAAC A599 84C7