Spam Filtering on the iSchool server

  
  
  
  
Metadata
Publisher: School of Information
Creator: Sam Burns, video by Quinn Stewart


One of the unfortunate realities of having an email account is that sooner or later you will receive unwanted (spam) emails from a variety of sources, possibly in substantial quantities. Negative feelings about spam range from mild annoyance to outrage, depending on the type and quantity of spam received. Spam often makes reading and managing the email you care about more cumbersome and, at its worst, may contain offensive language, marketing scams, or other malicious content.

Because determinations about spam are ultimately unique to each individual user, there is no way to systematically mark and delete spam email from the server AND have complete assurance that all non-spam email makes it to you. In light of this, the School of Information uses open source spam filtering software called SpamAssassin, which scores all of the email sent to your ischool.utexas.edu address (or forwarded to that address) and adds the spam score to the header portion of your messages. SpamAssassin is a content-based email filtering system that uses machine learning algorithms and other rule-based methods to determine the relative "spamminess" of each email from a number of its features.

This tutorial explains how to filter your spam email based on the spam score applied by the SpamAssassin email filtering tool, using Webmail to set up the filter. This will filter spam and either set it aside or delete it BEFORE it is sent to your email client. If you choose to set your spam aside, you will have to log in to monitor your spam and make sure you are not exceeding your disk quota.


How SpamAssassin Scores

X-Spam-Flag: YES
X-Spam-Checker-Version: SpamAssassin 3.1.1 (2006-03-10) on
        fiat.ischool.utexas.edu at Tue, 28 Mar 2006 16:13:47 -0600
X-Spam-Level: ************
X-Spam-Status: Yes, hits=12.8 required=5.0 tests=DATE_IN_PAST_06_12=0.746,
        EXTRA_MPART_TYPE=0.815,HTML_IMAGE_ONLY_32=0.836,HTML_MESSAGE=0.001,
        HTML_TAG_EXIST_TBODY=0.126,RAZOR2_CF_RANGE_51_100=0.5,
        RAZOR2_CF_RANGE_E4_51_100=1.5,RAZOR2_CF_RANGE_E8_51_100=1.5,
        RAZOR2_CHECK=0.5,URIBL_SBL=1.094,URIBL_SC_SURBL=3.6,
        URIBL_WS_SURBL=1.533 autolearn=spam version=3.1.1
X-Spam-Report:
        *  0.8 EXTRA_MPART_TYPE Header has extraneous Content-type:...type= entry
        *  0.7 DATE_IN_PAST_06_12 Date: is 6 to 12 hours before Received: date
        *  0.8 HTML_IMAGE_ONLY_32 BODY: HTML: images with 2800-3200 bytes of words
        *  0.1 HTML_TAG_EXIST_TBODY BODY: HTML has "tbody" tag
        *  0.0 HTML_MESSAGE BODY: HTML included in message
        *  1.5 RAZOR2_CF_RANGE_E8_51_100 Razor2 gives engine 8 confidence level
        *      above 50%
        *      [cf: 100]
        *  1.5 RAZOR2_CF_RANGE_E4_51_100 Razor2 gives engine 4 confidence level
        *      above 50%
        *      [cf: 100]
        *  0.5 RAZOR2_CHECK Listed in Razor2 (http://razor.sf.net/)
        *  0.5 RAZOR2_CF_RANGE_51_100 Razor2 gives confidence level above 50%
        *      [cf: 100]
        *  1.1 URIBL_SBL Contains an URL listed in the SBL blocklist
        *      [URIs: nullywood.com]
        *  1.5 URIBL_WS_SURBL Contains an URL listed in the WS SURBL blocklist
        *      [URIs: nullywood.com]
        *  3.6 URIBL_SC_SURBL Contains an URL listed in the SC SURBL blocklist
        *      [URIs: nullywood.com]

The lines above are the header entries that SpamAssassin adds to an email when it runs the message through its series of spam tests. In this example, SpamAssassin identified the email as spam and gave it a spam score of 12.8 (hits=12.8 in the X-Spam-Status header). The X-Spam-Level header shows the same score as a row of asterisks, one for each whole point, so a score of 12.8 appears as 12 asterisks. Scores of 5 or more (required=5.0) are considered to be spam, and in those cases the X-Spam-Flag header is set to YES.
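As an illustration of how these headers can be read programmatically, here is a minimal Python sketch using only the standard library. The function name `spam_summary` and the shortened sample message are hypothetical and are not part of the iSchool setup; the header names and the `hits=`/`required=` fields are taken from the example above.

```python
import re
from email import message_from_string

# A shortened, made-up message that reuses the header layout shown above.
SAMPLE = (
    "X-Spam-Flag: YES\n"
    "X-Spam-Level: " + "*" * 12 + "\n"
    "X-Spam-Status: Yes, hits=12.8 required=5.0 tests=URIBL_SBL=1.094,\n"
    "        URIBL_SC_SURBL=3.6 autolearn=spam version=3.1.1\n"
    "Subject: example\n"
    "\n"
    "body\n"
)


def spam_summary(raw_message: str) -> dict:
    """Pull the SpamAssassin verdict out of a raw email's headers."""
    msg = message_from_string(raw_message)
    status = msg.get("X-Spam-Status", "")
    hits = re.search(r"hits=(-?\d+(?:\.\d+)?)", status)
    required = re.search(r"required=(-?\d+(?:\.\d+)?)", status)
    return {
        # X-Spam-Flag is only present (as YES) on mail judged to be spam.
        "flagged": msg.get("X-Spam-Flag", "").strip().upper() == "YES",
        # One asterisk per whole point of the score.
        "stars": msg.get("X-Spam-Level", "").count("*"),
        "hits": float(hits.group(1)) if hits else None,
        "required": float(required.group(1)) if required else None,
    }


print(spam_summary(SAMPLE))
```

The filters described below only look at the row of asterisks in X-Spam-Level, which is why the `stars` count is the value that matters for the rest of this tutorial.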

In order to set up filters on the server that will route spam email, you will need to write one or more rules that take some action (e.g. move spam to a directory called spam, or delete it) based on some condition (e.g. a spam score). The following steps show you how to set up two spam filtering rules: one that deletes all spam email at or above a certain score (we selected a score of 15), and another that moves all email with a spam score of at least 5 to a directory called spam on the server. A score of 5 is the baseline threshold at which SpamAssassin labels an email as spam. We are confident that automatically deleting spam with a score of 15 or higher carries only a very small chance of deleting a non-spam email (i.e. a false positive), but you are free to choose not to automatically delete any of your email, or to set the second filter to a slightly higher score. If you only want to delete spam from your email automatically, just follow Part 1 of this tutorial and set your spam score accordingly. Likewise, if you only want to separate your spam without deleting it, follow only Part 2 of this tutorial.

Part 1: Set up a spam filter on the server

First we will set up a rule that will delete all spam above a certain score

  1. If you have never checked your @ischool mail using Webmail or Pine, you need to do this first so that a mail directory will be created in your account. If you are unfamiliar with Webmail, follow these instructions in order to log into Webmail. When you have finished, log out and return to this tutorial.
  2. Go to http://www.ischool.utexas.edu/horde and log in using your iSchool user name and password.
     [Screenshot: horde_login.gif]
  3. Click on the small + sign next to the Mail link on the left-hand navigation and then click on the Filters link.
     [Screenshot: mail_filter.gif]
  4. At this point you should see a list of approximately four Existing Rules. At the bottom of that list, click the button marked New Rule. [Note: the color scheme of this tool varies based on permissions, so your color scheme will be one or the other of the ones depicted here.]
     [Screenshot: new_rule.gif]
  5. In the field marked New Rule, type your new rule name. Since our first rule is going to delete all email above a certain score, this rule will be named spam_delete or something similar.
     [Screenshot: rule_name.gif]
  6. Make sure the All of the Following radio button is selected (it should be by default), and then select X-Spam-Level from the Select a field dropdown box.
     [Screenshot: x_spam_level.gif]
  7. A new dropdown box with the word contains should appear alongside a blank text field. Leave the contains field as it is, and type a number of asterisks equal to your delete threshold score. A score of 15 is very unlikely to result in any false positives, but you may adjust this number depending on your experience. If you have no other information to go on, just make it 15 and type *************** in the blank.
     [Screenshot: spam_score.gif]
  8. Click the Do this dropdown box and select Delete message completely.
     [Screenshot: do_this_delete.gif]
  9. Next click Save.
  10. Now click on Script in the menu bar along the top (Script is the sixth icon from the left).
     [Screenshot: script.gif]
  11. Click the Activate Script button. This will save a script called .procmailrc in the home directory of your server space.
     [Screenshot: activate_script.gif]
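Why does a rule that contains exactly 15 asterisks also catch mail with higher scores? The following Python sketch illustrates the idea, under the assumption that the contains condition behaves like a plain substring test. The function name `contains_rule_matches` is hypothetical, and the script that Webmail actually generates may express the match differently.

```python
def contains_rule_matches(x_spam_level: str, threshold: int) -> bool:
    """Return True if the header value contains `threshold` asterisks in a row.

    Assumption: the Webmail "contains" condition is a simple substring test.
    """
    return "*" * threshold in x_spam_level


for score in (4, 5, 14, 15, 20):
    header_value = "*" * score  # what X-Spam-Level would hold for this score
    print(score, contains_rule_matches(header_value, 15))
```

A header with 20 asterisks still contains a run of 15, so the rule acts as "15 or higher" rather than "exactly 15". Only the scores 15 and 20 in the loop above match.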
    

Part 2: Next we will set up a rule that will move the remaining spam to a directory called spam

  1. If you are not already logged into Horde, go to http://www.ischool.utexas.edu/horde and log in using your iSchool user name and password.
  2. Click on the small + sign next to the Mail link and then click on the Filters link.
     [Screenshot: mail_filter.gif]
  3. Click the button marked New Rule.
     [Screenshot: new_rule.gif]
  4. In the field marked New Rule, type your new rule name. This rule will move all email that is spam (i.e. has a score of 5 or higher) to a directory called spam. Remember that the first rule deletes all email with a spam score of 15 or higher, so this rule should move any email that has a spam score between 5 and 15. You can call this rule spam.
     [Screenshot: rule_name_spam.gif]
  5. Make sure the "All of the Following" radio button is selected (it should be by default), and then select X-Spam-Level from the "Select a field" dropdown box.
     [Screenshot: x_spam_level.gif]
  6. A new dropdown box with the word "contains" should appear alongside a blank text field. Leave the "contains" field as it is, and type a number of asterisks equal to your move threshold score, which is 5. Type ***** in the blank.
     [Screenshot: spam_score2.gif]
  7. Click the Do this dropdown box and select the Deliver to folder: option.
    • If next to the Do this dropdown you have a blank field, type mail/spam as the directory name.
      [Screenshot: spam_move_1.gif]
    • If instead you have a dropdown box labeled Select target folder, choose Create new folder and name your folder spam. Then select your newly created folder from the dropdown menu.
      [Screenshot: spam_move_2.gif]
  8. Next click Save.
  9. Now click on Script in the menu bar along the top (Script is the sixth icon from the left).
     [Screenshot: script.gif]
  10. Click the Activate Script button.
     [Screenshot: activate_script.gif]
    

Now you should have two active spam filtering rules on the server. The first one deletes all spam email with a spam score of 15 or higher, and the second one moves the remaining spam to a folder called spam that is located in your home directory. It is important that you follow these instructions in this order, with delete first and deliver to folder second; otherwise none of your spam will be deleted automatically.
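The ordering requirement can be illustrated with a small Python simulation. This is only a sketch: it assumes that rules are tried from top to bottom, that checking stops at the first rule that matches, and that contains is a plain substring test. The names `route`, `RECOMMENDED` and `SWAPPED` are hypothetical.

```python
# Each rule: (rule name, number of asterisks to look for, action).
RECOMMENDED = [("spam_delete", 15, "delete"), ("spam", 5, "mail/spam")]
SWAPPED = list(reversed(RECOMMENDED))


def route(stars: int, rules) -> str:
    """Return the action for a message whose X-Spam-Level has `stars` asterisks."""
    level = "*" * stars
    for _name, threshold, action in rules:
        if "*" * threshold in level:
            return action  # stop checking once a rule matches
    return "inbox"


for stars in (3, 8, 20):
    print(stars, route(stars, RECOMMENDED), route(stars, SWAPPED))
```

With the recommended order, a message with 20 asterisks is deleted. With the rules swapped, the 5-asterisk rule matches first, so the same message lands in mail/spam and the delete rule is never reached.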

You should check the spam folder in your server space periodically, both to remove messages from your server space and to decide whether you want to adjust your spam score thresholds higher or lower based on what you see being filtered.
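If you have shell access and want a quick overview of what is accumulating, a short script can tally the scores of the messages in the spam folder. The sketch below assumes the folder is stored as a single mbox-format file (for example at mail/spam under your home directory); this tutorial does not state the storage format, so treat both the format and the path as assumptions. The function name `star_histogram` is hypothetical.

```python
import mailbox
from collections import Counter


def star_histogram(mbox_path: str) -> Counter:
    """Count messages in an mbox file by number of X-Spam-Level asterisks."""
    counts = Counter()
    # create=False: raise an error instead of creating an empty file
    # if the path is wrong.
    box = mailbox.mbox(mbox_path, create=False)
    try:
        for message in box:
            counts[message.get("X-Spam-Level", "").count("*")] += 1
    finally:
        box.close()
    return counts
```

For example, `sorted(star_histogram("mail/spam").items())` would list each score alongside the number of messages that received it. If nearly everything sits well above 5 and nothing you wanted is in the folder, that is some evidence that a lower delete threshold would be safe for your mail.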

Adapted from Sam Burns's most excellent tutorial

Transcript
This tutorial is going to guide you through using SpamAssassin to filter spam out of your e-mail on the iSchool server.
From the main iSchool webpage, at the end of the URL, we're going to type horde, h-o-r-d-e,
and then we're going to log in with our username and password.
The next thing we'll do is click Mail here, and you can see we've got a pretty good problem with spam in this e-mail account.
Let's quickly take a look at one of these, and we'll show you what SpamAssassin is doing in the background.
I'm going to go down here to where it says Headers, and have it show all headers.
And what we're interested in is this line right here, X-Spam-Level. Each one of these asterisks represents a point of the score that SpamAssassin has assigned to this e-mail.
And so it's pretty sure that this is spam, and we're going to trust it to make this decision for us, in a simple fashion.
So what we're actually going to do is go up here and create some filters, and filter based on this X-Spam-Level here.
So I'm going to go to Filters here, and yours should probably look something like this. What we're going to do is go down here and create a new rule.
I'm going to click on New Rule here, and we're going to give this rule a name. We're going to call this spam_delete.
We're going to make sure that All of the following is selected right here, and then we're going to pull this down and use X-Spam-Level here.
And here we're going to leave contains alone right here, and remember this uses asterisks right here to actually score this, so we're going to use 15 of them right here,
and then we're going to go down here to Do this, and have it delete messages completely.
And we're going to stop checking if this rule matches. Now we're going to click Save here, to save this rule.
And you can see that this has activated our script here. We'll need to make sure in our options here, for filters, that this box is checked, and that it stays checked, and that will automatically update the script each time we make a change to it.
So now let's go back to our filter rules, and what we're going to do now is add another rule.
And here we're just going to call this spam. Because our first rule actually eliminated everything that had a score of 15 or above, what we're going to do is make sure All of the following is selected here.
We're going to go back to that X-Spam-Level again, and we're going to make sure this time it contains five asterisks: 1, 2, 3, 4, 5.
So what that's going to do is move any e-mail that has five asterisks or above into a folder called spam.
Well, actually we're going to have to create that folder, by going down here and going to Deliver to folder, and we'll have to select a target folder. We're going to create a new folder, and we're going to call it mail/spam.
And we're going to click OK here, and what we're going to see is now we have a folder called spam, so that if it scores between five and 15, it goes into this folder so we can actually check and make sure it really is spam.
So now I'm going to save this rule, and it should have activated the script as well, and we can go back to our e-mail here, and if we actually pull this down, we'll see the spam folder that we created down here.
So everything that has a spam score between five and 15 will go into this folder, everything that is five or below will go into our regular inbox, and anything that's over 15 will just go away.
Now let's see if our spam filter's actually working. I've waited a little bit of time till I'm pretty sure somebody has tried to send me some spam, and now what I'm going to do is go look into the spam folder here.
And in here I can see some spam. This was yesterday at 6:30, and here's 4:30 in the morning, 4:42, 12:45 in the morning, so I know these have come in since the last e-mail that went into my inbox, which was yesterday.
So these have successfully been moved to the spam folder, and if I take a look at them here, and look at the full headers here, I should see a score between five and 15.
So I've got 1, 2, 3, 4, 5, 6, 7, 8, 9, it looks like 11 or 12 right here, so that lets me know that this filter is actually working, and that anything between five and 15 should be going into this box.
But the important thing for me is to go back to my inbox, and see that after yesterday, all the spam has disappeared. And anything that scored over 15 has gone away.
So let's look one more time at our spam folder here, and what we can see rather quickly is that this is all spam too.
So you can monitor this for a while, and change the number of asterisks selected in your filter rules, to kind of fine-tune what you determine is spam and what's not, based upon how SpamAssassin scores things.
Table of contents
Set up spam filters
To delete spam completely
To move spam to a folder for evaluation
Testing spam filters