Too much spam is getting through MailScanner. What is the problem and how can I fix it?

Spammers are always trying to stay one step ahead of spam scanning systems, and keeping up with them is a constant battle. It is therefore important to keep MailScanner and all its components up to date, and to make sure everything is still running as correctly as when it was first installed. There are a few things you can check to improve MailScanner's effectiveness against spam:

1. Make sure MailScanner and all the components are up to date, including SpamAssassin.

2. Make sure crond is running (try "ps axf | grep crond").
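
One caveat with the ps check above: grep can match its own command line, so a hit does not always mean crond is running. A minimal sketch that avoids this, assuming the ps output is piped into a small helper (the function name crond_in_ps is hypothetical, not part of MailScanner or cPanel):

```shell
# Hypothetical helper: reads process-list text on stdin and succeeds only if a
# crond process is listed, ignoring any grep commands that mention crond.
crond_in_ps() {
    grep -v 'grep' | grep -Eq '(^|[[:space:]/])crond([[:space:]]|$)'
}

# Usage on the server:
#   ps axf | crond_in_ps && echo "crond is running" || echo "crond is NOT running"
```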

3. Check whether the file /etc/mail/spamassassin/ exists. If it doesn't, you might want to create such a file and add the following lines to it:

score URIBL_SBL 5.0
score URIBL_AB_SURBL 5.0
score URIBL_OB_SURBL 5.0
score URIBL_PH_SURBL 5.0
score URIBL_SC_SURBL 5.0
score URIBL_WS_SURBL 5.0
score URIBL_JP_SURBL 5.0

These lines increase the score for SA rules that check for known spam-related URLs contained within emails.
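
If you prefer to add the score lines from the command line, the following is a rough sketch rather than an official procedure. It appends each line only if the rule does not already have a score in the file, so it is safe to run twice. The function name add_scores is hypothetical, and you must pass the path of your own configuration file as the argument, since the file name is not given above:

```shell
# Hypothetical helper: appends each "score RULE 5.0" line to the given
# SpamAssassin config file unless that rule already has a score line there.
add_scores() {
    conf_file=$1
    for rule in URIBL_SBL URIBL_AB_SURBL URIBL_OB_SURBL URIBL_PH_SURBL \
                URIBL_SC_SURBL URIBL_WS_SURBL URIBL_JP_SURBL; do
        if ! grep -Eq "^[[:space:]]*score[[:space:]]+${rule}[[:space:]]" "$conf_file" 2>/dev/null; then
            echo "score ${rule} 5.0" >> "$conf_file"
        fi
    done
}

# Usage: add_scores /path/to/your/spamassassin/config/file
```

After changing the file it is worth running a SpamAssassin lint test to make sure the configuration still parses.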

4. Make sure DCC and Razor are working and being used by SpamAssassin. Check in Mail Control by doing a Search and creating a report with the filter "SpamAssassin Rule CONTAINS dcc" to check for DCC, and "SpamAssassin Rule CONTAINS razor" for Razor. You can check them both at the same time, as both are often triggered by the same spam message. You should get a positive result, i.e. a message count greater than 0.

If DCC and/or Razor are not working, check that the following ports are open in your firewall:

DCC   - out-bound UDP port 6277
DCC   - out-bound TCP port 587 (for reporting spam)
Razor - out-bound TCP port 2703
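
The two TCP ports in the list above can be probed from the server itself. This is only a sketch under some assumptions: it relies on bash's /dev/tcp redirection and the coreutils timeout command, the function name tcp_port_open is hypothetical, and the host name in the usage line is a placeholder for whichever server your DCC or Razor client actually contacts. The UDP port used by DCC cannot be verified this way.

```shell
# Hypothetical helper: succeeds if an outbound TCP connection to HOST:PORT can
# be opened within 5 seconds. Uses bash's /dev/tcp redirection, so bash must be
# installed. UDP ports (DCC port 6277) cannot be checked with this method.
tcp_port_open() {
    timeout 5 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

# Usage (razor.example.com is a placeholder host name):
#   tcp_port_open razor.example.com 2703 && echo "open" || echo "blocked or unreachable"
```
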

If those ports are open but DCC or Razor is still not working, try reinstalling them. You can find instructions in #2 in this article:

5. Check that your Bayes database is working correctly.

Check that SpamAssassin is actually using the Bayes database. Go to WHM > ConfigServer MailScanner Front-End > MailControl. In MailControl, click on the Menu button at the top right, then SpamAssassin Lint Test. Search for "dbg: bayes" and make sure there aren't any errors related to opening and accessing the bayes database. There should be a line that says dbg: bayes: score = [number]. This means bayes is working and it has given a score for the lint test.
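
The same check can be made over SSH. As a sketch, assuming the spamassassin command is on your PATH and that the debug output contains a line of the form described above, the hypothetical helper below pulls out the Bayes score:

```shell
# Hypothetical helper: reads SpamAssassin debug output on stdin and prints the
# value from the first "dbg: bayes: score = N" line; fails if none is found.
bayes_lint_score() {
    score=$(sed -n 's/.*dbg: bayes: score = \([^[:space:]]*\).*/\1/p' | head -n 1)
    [ -n "$score" ] && echo "$score"
}

# Usage on the server (debug output goes to stderr, hence 2>&1):
#   spamassassin -D --lint 2>&1 | bayes_lint_score || echo "no bayes score found"
```

If no score is printed, look through the full "dbg: bayes" lines for errors about opening or accessing the database.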

In MailControl, click on the Menu button (top right) and then SpamAssassin Bayes. The dates for the Last Journal Sync and Last Expiry should be within the last day or two. If they are not that recent, it could be because the bayes database is too large and it is timing out when attempting to expire and/or sync it. The size at which this happens varies from server to server, so it is not possible to give an optimum or maximum size. If the database seems large or the last expiry date is not in the last 24-48 hours, you should either delete the database (see below) or attempt manual expiry. For manual expiry do the following in SSH as root:

/usr/local/cpanel/3rdparty/bin/sa-learn --force-expire

If the database is quite large this may take several minutes.
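
To confirm from the command line whether expiry has run recently, you can read the database metadata with sa-learn --dump magic. The sketch below assumes the usual layout of that output, where the third column of the "last expiry atime" line is a Unix timestamp. The function name hours_since_expiry is hypothetical.

```shell
# Hypothetical helper: reads "sa-learn --dump magic" output on stdin and prints
# how many whole hours ago the last expiry ran. The optional first argument is
# the current time in epoch seconds (defaults to now). Fails if the
# "last expiry atime" line is missing.
hours_since_expiry() {
    now=${1:-$(date +%s)}
    awk -v now="$now" '/last expiry atime/ { print int((now - $3) / 3600); found = 1 }
                       END { exit !found }'
}

# Usage on the server:
#   /usr/local/cpanel/3rdparty/bin/sa-learn --dump magic | hours_since_expiry
```

A result well above 48 suggests expiry is not completing, which matches the situation described above.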

If you are seeing obvious spam being given a negative score for low bayesian spam probability, consider deleting the bayes database and starting again. Instructions for this are in this FAQ.

6. You could start training the bayes database yourself using examples of emails that have been mis-identified. Please see these two FAQs:

7. Check the SpamAssassin Rules section in the headers or in MailControl for a selection of emails that are clearly spam but have not been identified as such. Make sure they are not being whitelisted for some reason. If the spam report for some of them says "too large", you might want to increase the "Max Spam Check Size" setting in the MailScanner configuration to something larger than what it's set to by default.

8. Check the SpamAssassin Rules section in MailControl for the SpamAssassin test "URIBL_BLOCKED". If you are getting a lot of these, it may mean that your nameserver is not allowed to access certain RBLs via SpamAssassin. In this case you may want to switch to using a local nameserver if you are able to. BIND must be configured as a local caching nameserver on your server (i.e. you must make sure that you allow localhost DNS recursion in /etc/named.conf - your server admin should be able to do this if it is not already configured this way). Change the first nameserver listed in /etc/resolv.conf to, eg. like this:

If you do this step and have not already been manually training your bayes database, it might be a good idea to reset your bayes database at the same time so that it becomes more accurate more quickly. Instructions are in this FAQ.
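
After making the nameserver change in step 8, you can confirm which resolver is consulted first. A minimal sketch, assuming a standard resolv.conf layout (the function name first_nameserver is hypothetical):

```shell
# Hypothetical helper: prints the address on the first "nameserver" line of a
# resolv.conf-style file (defaults to /etc/resolv.conf).
first_nameserver() {
    awk '$1 == "nameserver" { print $2; exit }' "${1:-/etc/resolv.conf}"
}

# Usage on the server:
#   first_nameserver
```

The address printed should be the one for your local caching nameserver. If it is still your provider's resolver, the change has not taken effect.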

9. Make sure your exim configuration is optimised. See this FAQ for our recommendations: Exim Configuration

In particular we would recommend using the spamhaus and spamcop RBLs at exim level rather than MailScanner.

10. If you want to do even more, you could try adding some extra SpamAssassin rules and other add-ons such as the sought rules and the KAM ruleset.

KAM rules are here (just put this file in your /etc/mail/spamassassin/ directory):

(The KAM rules are now included by default on cPanel servers.)

