Kmail and spam filtering

Nigel Henry cave.dnb at tiscali.fr
Thu Jul 27 23:05:04 BST 2006


On Saturday 22 July 2006 23:01, Thierry de Coulon wrote:
> On Saturday 22 July 2006 22.16, Nigel Henry wrote:
> > I DL'd Bogofilter 1.0.2, and 1.0.3, and read through a lot of the FAQs,
> > and have just found some HOWTOs for Bogofilter and Kmail, on Google.
> >
> > Thanks.  Nigel.
>
> I hope you get it working.
>
> In short, here is how it works for me:
>
> I set up three filters in kmail:
>
> 1) filter "bogoheader" set to:
>
> match all of the following
>
> <any header>           matches regular expression              .*
>
> remove header             X-Bogosity
> remove header            X-attachments
>
> 2) filter "bogofilter" set to:
>
> match all of the following
>
> <any header>           matches regular expression              .*
>
> Pipe through         bogofilter  -epv
>
> 3) filter "bogofilter_is_spam" set to:
>
> X-Bogosity    contains   YES
>
> move to   <whatever you like>
>
> I first chose to move "spam" to a special directory to check them, but now
> that I know it works well they are moved directly to trash, and that is
> emptied automatically.
>
> I also set up two mail directories (spam and NonSpam). Spam messages that
> got through go into "spam", and, when training, examples of good messages
> go into "NonSpam". I then wait until "spam" contains around 100 messages
> and run the following script (I use it under the name bogolearn.sh from my
> home directory). You must of course check the settings in the first part
> and adapt them to your setup:
>
> ************************ bogolearn.sh***************************
> #!/bin/sh
> # Train bogofilter with new spam and non-spam.
> BOGOFILTER="/usr/bin/bogofilter"
> MAILDIR="/home/<user>/Mail"
> GOODDIR="NonSpam/cur"
> SPAMDIR="spam/cur"
> GOODLIST="goodlist"
> SPAMLIST="badlist"
>
> # The lists record which messages were already used for training.
> touch "$MAILDIR/$SPAMLIST" "$MAILDIR/$GOODLIST"
>
> cd "$MAILDIR/$SPAMDIR" || exit 1
> echo Spam:
> for i in *; do
>   [ -f "$i" ] || continue    # skip the literal "*" of an empty folder
>   # -F matches the file name as a fixed string, not as a regex
>   if ! grep -qF -e " $i " "$MAILDIR/$SPAMLIST"; then
>     echo "Processing Mail ID #$i"
>     "$BOGOFILTER" -s -v < "$i"
>     echo " $i " >> "$MAILDIR/$SPAMLIST"
>   fi
> done
>
> cd "$MAILDIR/$GOODDIR" || exit 1
> echo NonSpam:
> for i in *; do
>   [ -f "$i" ] || continue
>   if ! grep -qF -e " $i " "$MAILDIR/$GOODLIST"; then
>     echo "Processing Mail ID #$i"
>     "$BOGOFILTER" -n -v < "$i"
>     echo " $i " >> "$MAILDIR/$GOODLIST"
>   fi
> done
> *********************** end bogolearn.sh **********************
>
> That's it. I tried spamassassin but it's fairly slow if you check your mail
> once or twice a day.
> The first few spam-filtering runs with bogofilter will probably be rather
> poor - put what got through in "spam" and run bogolearn. In my experience,
> filtering results get good very soon and remain so.
>
> Thierry

Hi Thierry.  I've got Bogofilter 1.0.2 working OK. It's separating the Spam
and the Ham correctly, although I've had a few problems setting it up. There
appear to be quite a few changes in 1.0.2 from the version you're using.

For example, in the bogofilter_is_spam filter, "bogosity   contains    Yes"
has now been changed to "bogosity   contains   Spam".  Yes and No have been
changed to Spam and Ham respectively, with the addition of a third
possibility, "Unsure". So the default with version 1.0.2 is tri-state.
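
The renaming described above can be written down as a small lookup. This is
only a sketch to make the mapping explicit: the function name
normalize_verdict is hypothetical, and "Unknown" is a made-up fallback for
anything that is not one of the five words mentioned in this thread.

```shell
#!/bin/sh
# normalize_verdict WORD: print the newer verdict name for WORD.
# Old two-state names (Yes/No) map onto the new Spam/Ham names;
# Unsure exists only in the newer tri-state scheme.
normalize_verdict() {
    case "$1" in
        Yes|Spam) echo Spam ;;
        No|Ham)   echo Ham ;;
        Unsure)   echo Unsure ;;
        *)        echo Unknown ;;   # hypothetical fallback, not a bogofilter value
    esac
}

# Example: an old-style verdict reported under its new name.
normalize_verdict Yes
```

A filter written for the older names therefore needs its match string
changed, since a rule looking for "Yes" would not find "Spam".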

Using the line, "bogosity     contains     Spam (Yes)" just results in all the 
incoming mail going straight to the temporary "spam" folder.

I had to add a second line, so that "Filter Criteria" now reads:
bogosity           contains             Spam  (Yes)
bogosity      doesn't contain      Ham   (No)

There is still only one entry for the mailbox to send the Spam to. The Ham 
seems to automatically end up in the inbox.

I'm still not sure if this is working as it should, as all the mail that ends 
up in the spam box does not have any X-Bogosity additions to the headers, but 
all the mail ending up in the inbox has X-Bogosity additions to the headers, 
indicating that it is Ham.
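
One way to check this from a shell, rather than by opening messages one at a
time, is to count how many files in a folder carry the header. The sketch
below assumes the folder is a directory with one message per file (as with
the spam/cur and NonSpam/cur directories in the script above); the function
name count_bogosity and the example path are hypothetical.

```shell
#!/bin/sh
# count_bogosity DIR: report how many files in DIR have an X-Bogosity
# header and how many do not. Only the header block (everything up to
# the first blank line) is examined, so a mention in the body is ignored.
count_bogosity() {
    with=0
    without=0
    for f in "$1"/*; do
        [ -f "$f" ] || continue
        if sed '/^$/q' "$f" | grep -q '^X-Bogosity:'; then
            with=$((with + 1))
        else
            without=$((without + 1))
        fi
    done
    echo "with=$with without=$without"
}

# Hypothetical path; adjust to wherever the spam folder is stored.
# count_bogosity "$HOME/Mail/spam/cur"
```

If every message in the spam folder is counted under "without", the spam
rule is matching on something other than the verdict bogofilter wrote.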

Also I haven't figured out how to set up handling for mail that bogofilter is
"Unsure" about.  Adding a third line to "Filter Criteria" for "Unsure", and then
applying, just results in the line being greyed out.  Apart from that, I'm
getting very good results from bogofilter. No false negatives, and only 2
false positives up to now.
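
The three-way decision being aimed at here can be pictured as a small shell
function that maps each verdict to its own folder. This is a sketch outside
KMail, not a KMail setting: the function name folder_for_message and the
folder names spam, unsure and inbox are hypothetical, and it assumes the
verdict is the first word after "X-Bogosity:" in the header block.

```shell
#!/bin/sh
# folder_for_message: read one message on stdin and print the name of
# the folder it should go to. Mail without a verdict stays in the inbox.
folder_for_message() {
    # Stop at the first blank line so only headers are inspected.
    verdict=$(sed -n -e '/^$/q' \
                     -e 's/^X-Bogosity:[[:space:]]*\([A-Za-z]*\).*/\1/p' |
              head -n 1)
    case "$verdict" in
        Spam)   echo spam ;;
        Unsure) echo unsure ;;
        *)      echo inbox ;;   # Ham, or no header at all
    esac
}

# Example with a made-up message:
printf 'Subject: test\nX-Bogosity: Unsure\n\nbody\n' | folder_for_message
```

Each verdict gets exactly one destination, which is the behaviour that one
rule per verdict (rather than extra criteria lines on the spam rule) would
give.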

Nigel.



___________________________________________________
This message is from the kde mailing list.
Account management:  https://mail.kde.org/mailman/listinfo/kde.
Archives: http://lists.kde.org/.
More info: http://www.kde.org/faq.html.