Aah, a challenging one. I got asked this same question just today. So you’re sending out e-mails, but the receiver tells you they arrived marked as spam? Then consider yourself lucky: another spam filter might have blocked your e-mail entirely.
Spam filters analyze your e-mail content and headers for certain patterns, and assign a score to each pattern (also called a Block Rule). If the sum of all those scores exceeds a pre-defined value, the mail is tagged as spam and is either blocked from arriving or arrives with a modified subject (prefixed with \*\*SPAM\*\*, as Plesk does).
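To make the scoring idea a bit more concrete, here is a minimal Python sketch. The rule names come from this article, but the scores and the threshold of 5.0 are made-up values for illustration only; every real filter ships its own.

```python
# Hypothetical per-rule scores; real spam filters define their own values.
RULE_SCORES = {
    "INVALID_DATE": 1.5,
    "NO_REAL_NAME": 1.0,
    "SUBJ_ALL_CAPS": 2.0,
    "RCVD_IN_BL_SPAMCOP_NET": 3.5,
}

# Assumed cut-off: a message whose total exceeds this is tagged as spam.
SPAM_THRESHOLD = 5.0


def score_message(matched_rules):
    """Return (total_score, is_spam) for the rules a message triggered."""
    total = sum(RULE_SCORES.get(rule, 0.0) for rule in matched_rules)
    return total, total > SPAM_THRESHOLD


if __name__ == "__main__":
    print(score_message(["NO_REAL_NAME", "SUBJ_ALL_CAPS"]))
    print(score_message(["INVALID_DATE", "SUBJ_ALL_CAPS", "RCVD_IN_BL_SPAMCOP_NET"]))
```

A single rule rarely pushes a mail over the limit in this model; it is the combination of several smaller hits that does.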
Here are just some of the reasons mail might be getting blocked on your recipient’s side.
Note: the names in bold are the Block Rule names often used by spam filters. They might seem like gibberish to you, and you’re probably right too ;-) .
**INVALID_DATE**
If the Date header of your mail does not comply with RFC 2822 (see section 3.3, Date and Time Specification), it’ll be flagged for having an invalid date in the header.
Invalid means it contains illegal characters or doesn’t follow the standard layout of the Date field. Here’s a header that failed and received a spam score.
Date: Wed, 27 Aug ´ëÇѹα¹ Ç¥ÁؽÃ
Note the odd run of trailing characters. It’s not quite clear why a spammer would use this technique, but it’s easily picked up by mail scanners. A correct date format would look like this:
Date: Mon, 25 Aug 2008 17:22:29 +0200
Other rule names that hint at a similar issue (a header not being in the correct format) are SARE_HEAD_8BIT_DATE and SARE_HEAD_8BIT_RECV. The latter refers to the “Received” header, which has incorrect syntax.
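If you generate mail from your own scripts, you could sanity-check the Date value before sending. The sketch below is only a rough approximation: the pattern covers the common layout shown above and deliberately ignores the obsolete forms the RFC also permits. The function names are my own, not part of any spam filter.

```python
import re

# Simplified pattern for the common RFC 2822 date layout:
# optional day name, day, month, 4-digit year, time, numeric zone.
DATE_PATTERN = re.compile(
    r"^(?:(?:Mon|Tue|Wed|Thu|Fri|Sat|Sun),\s+)?"
    r"\d{1,2}\s+"
    r"(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)\s+"
    r"\d{4}\s+"
    r"\d{2}:\d{2}(?::\d{2})?\s+"
    r"[+-]\d{4}$"
)


def has_8bit_chars(value):
    """True if the header value contains any non-ASCII character."""
    return any(ord(ch) > 127 for ch in value)


def looks_like_valid_date(value):
    """True if the value is pure ASCII and matches the simplified layout."""
    return not has_8bit_chars(value) and DATE_PATTERN.match(value) is not None


if __name__ == "__main__":
    print(looks_like_valid_date("Mon, 25 Aug 2008 17:22:29 +0200"))
    print(looks_like_valid_date("Wed, 27 Aug ´ëÇѹα¹ Ç¥ÁؽÃ"))
```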
**NO_REAL_NAME**
Probably the easiest to decipher: this rule fires when the From header contains no display name. The header then falls back to showing only the e-mail address, and the mail receives a penalty for that.
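A quick way to check your own From header is Python’s standard `email.utils` module. This is just a sketch; the helper name `has_real_name` and the example name and address are placeholders of mine.

```python
from email.utils import formataddr, parseaddr


def has_real_name(from_header):
    """True if the From header carries a display name besides the address."""
    display_name, _address = parseaddr(from_header)
    return bool(display_name.strip())


# Building a From header that includes a display name.
# The name and address are placeholders.
good_from = formataddr(("Jane Doe", "jane@example.com"))

if __name__ == "__main__":
    print(good_from)
    print(has_real_name("jane@example.com"))
    print(has_real_name(good_from))
```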
**BAYES_99**
This rule uses Bayesian spam filtering to determine the spam probability based on the content of the e-mail and the number of times certain keywords are present. Here’s the “scientific” explanation.
Bayes’ theorem, in the context of spam, says that the probability that an email is spam, given that it has certain words in it, is equal to the probability of finding those certain words in spam email, times the probability that any email is spam, divided by the probability of finding those words in any email.
This is a more mathematical approach to determining whether the content of an e-mail is legitimate.
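The quoted sentence translates almost word for word into code. In this sketch the “probability of finding those words in any email” is computed from the spam and non-spam cases; the parameter names are mine and the numbers in the example are invented.

```python
def spam_probability(p_words_given_spam, p_words_given_ham, p_spam):
    """Apply Bayes' theorem to get P(spam | words)."""
    p_ham = 1.0 - p_spam
    # Probability of seeing the words in any email (spam or not).
    p_words = p_words_given_spam * p_spam + p_words_given_ham * p_ham
    if p_words == 0:
        raise ValueError("the words never occur, so the result is undefined")
    return p_words_given_spam * p_spam / p_words


if __name__ == "__main__":
    # Invented inputs: the words show up in 90% of spam, in 10% of
    # legitimate mail, and half of all mail is assumed to be spam.
    print(spam_probability(0.9, 0.1, 0.5))
```

With those made-up inputs the result is about 0.9. A rule name like BAYES_99 suggests the filter landed in the very top probability band for your message.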
**RAZOR2_CHECK**
Vipul’s Razor (not to be confused with Occam’s) is yet another piece of software to detect and prevent spam e-mails. It uses user input to track down unwanted e-mails. Users can contribute by marking spam mails, whose signatures are then stored and shared with other users.
Detection is done with statistical and randomized signatures that efficiently spot mutating spam content. User input is validated through reputation assignments based on consensus on report and revoke assertions which in turn is used for computing confidence values associated with individual signatures.
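To illustrate the report/revoke idea only, here is a toy sketch. It is not Razor’s protocol or signature scheme: it uses an exact SHA-256 hash of normalized text, whereas the real signatures are statistical and built to survive mutations. The class name and the `min_reports` threshold are assumptions of mine.

```python
import hashlib
from collections import Counter


class SharedSignatureStore:
    """Toy stand-in for a collaborative spam database (not Razor's protocol)."""

    def __init__(self, min_reports=2):
        self.min_reports = min_reports  # assumed consensus threshold
        self.reports = Counter()

    @staticmethod
    def signature(body):
        # Exact hash of lowercased, whitespace-normalized text.
        # Any real mutation of the content defeats this toy signature.
        normalized = " ".join(body.lower().split())
        return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

    def report(self, body):
        """A user marks this content as spam."""
        self.reports[self.signature(body)] += 1

    def revoke(self, body):
        """A user says this content is not spam after all."""
        sig = self.signature(body)
        if self.reports[sig] > 0:
            self.reports[sig] -= 1

    def is_spam(self, body):
        """True once enough users have reported the same signature."""
        return self.reports[self.signature(body)] >= self.min_reports
```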
**RCVD_IN_BL_SPAMCOP_NET**
Probably the most common reason e-mails are blocked is that the sending mailserver (either your company’s or your ISP’s) is on a blacklist. The SpamCop Blocking List contains blocked hosts/IPs, based on user input. If you think you might be on this list, you can check it on their website.
If your IP address is found on that list, it’ll give you an explanation as to why the block occurred (wrong HELO/EHLO, missing reverse DNS, invalid SPF records, …) so you can act quickly to resolve the issue.
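Blacklists like this are typically queried over DNS: the IP address is reversed and looked up under the list’s zone. The sketch below shows that for IPv4 using `bl.spamcop.net`; the function names are mine, and `is_listed` needs network access, so treat it as an illustration rather than a monitoring tool.

```python
import ipaddress
import socket

# SpamCop's DNS zone for blocklist lookups.
SPAMCOP_ZONE = "bl.spamcop.net"


def dnsbl_query_name(ip, zone=SPAMCOP_ZONE):
    """Reverse the IPv4 octets and append the zone, e.g. 1.2.0.192.<zone>."""
    address = ipaddress.IPv4Address(ip)  # raises ValueError on bad input
    reversed_octets = ".".join(reversed(str(address).split(".")))
    return f"{reversed_octets}.{zone}"


def is_listed(ip, zone=SPAMCOP_ZONE):
    """True if the lookup resolves. Requires network access and working DNS.

    Note: a DNS failure is indistinguishable from "not listed" here.
    """
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
        return True
    except socket.gaierror:
        return False


if __name__ == "__main__":
    # 192.0.2.1 is a documentation address, used here as a placeholder.
    print(dnsbl_query_name("192.0.2.1"))
```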
**FORGED_MUA_OUTLOOK**
When the sending utility (often a hacked webserver sending out mails through sendmail/qmail) pretends to be a program it’s not, the mail will get blocked. A faked header could look like this.
_X-Mailer: Microsoft Outlook Express 6.00.2600.0000
X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2600.0000_
Here’s what legitimate X-Mailer headers look like.
_X-Mailer: Apple Mail (2.752.3)
X-Mailer: Microsoft Office Outlook 12.0
X-Mailer: Microsoft Office Outlook 11_
**MSGID_FROM_MTA_HEADER**
Every mail you send gets a Message-ID assigned to it. This is usually done by the sending mailserver (your local mailserver, or your ISP’s). If the Message-ID is missing, it can be added by a mail relay, which is a mailserver your e-mail passes through on its way to its destination.
If a mail relay adds the Message-ID, a spam score will be assigned to the mail. This rule is also known as MSGID_FROM_MTA_ID.
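If your application builds mail itself (a web form, a cron job, …), one way to avoid this is to set the Message-ID before handing the mail off. Here is a minimal sketch with Python’s standard `email` package; the addresses, the subject and the `example.com` domain are placeholders.

```python
from email.message import EmailMessage
from email.utils import make_msgid

msg = EmailMessage()
msg["From"] = "Jane Doe <jane@example.com>"  # placeholder addresses
msg["To"] = "john@example.org"
msg["Subject"] = "Monthly report"
# Generate the Message-ID ourselves so no relay has to add one later.
msg["Message-ID"] = make_msgid(domain="example.com")
msg.set_content("Hello John, here is the monthly report.")

if __name__ == "__main__":
    print(msg["Message-ID"])
```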
**SUBJ_ALL_CAPS**
Writing your subject in all capital letters is never a good idea, certainly not when your receiver has a spam filter enabled.
**URIBL_BLACK**
The content of the message passes through several scanners, and one of those checks the URLs used in your e-mail. If one of those URLs is on a blacklist (such as SURBL), a spam score will be assigned.
Related block rules include URIBL_JP_SURBL, URIBL_OB_SURBL, URIBL_SBL, URIBL_SC_SURBL and URIBL_WS_SURBL. You can check whether a URL is blacklisted by visiting the SURBL+ Checker Tool.
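The scanning step can be pictured roughly like this: pull the hostnames out of every URL in the body and compare them with a blacklist. In this sketch the blacklist is a hypothetical local set, whereas real filters query services such as SURBL; the regular expression is also a simplification.

```python
import re
from urllib.parse import urlparse

# Simplified pattern: http(s) URLs up to the next whitespace or quote.
URL_PATTERN = re.compile(r"https?://[^\s<>\"']+", re.IGNORECASE)

# Hypothetical local blacklist, for illustration only.
BLACKLISTED_HOSTS = {"spam.example.net"}


def extract_hosts(body):
    """Return the set of lowercase hostnames found in http(s) URLs."""
    hosts = set()
    for url in URL_PATTERN.findall(body):
        try:
            host = urlparse(url).hostname
        except ValueError:
            continue  # malformed URL, skip it
        if host:
            hosts.add(host)
    return hosts


def blacklisted_hosts(body, blacklist=BLACKLISTED_HOSTS):
    """Return the hosts in the body that appear on the blacklist."""
    return extract_hosts(body) & set(blacklist)
```

Running your own newsletter text through something like this before sending shows you exactly which links a scanner will see.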
**UPPERCASE_25_50**
It’s important to keep a good upper-case/lower-case ratio. If 25% to 50% or more of your e-mail is written in capital letters, you have a high chance of receiving a penalty for this.
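You can estimate that ratio yourself before sending. The sketch below counts uppercase letters as a share of all letters, which is my own approximation and not necessarily how a given spam filter measures it; the 25% limit is taken from the rule name.

```python
def uppercase_ratio(text):
    """Fraction of alphabetic characters that are uppercase (0.0 if none)."""
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        return 0.0
    return sum(ch.isupper() for ch in letters) / len(letters)


def too_shouty(text, limit=0.25):
    """True if the uppercase share reaches the assumed 25% limit."""
    return uppercase_ratio(text) >= limit


if __name__ == "__main__":
    print(too_shouty("BUY NOW, limited offer"))
    print(too_shouty("Hello John, see you tomorrow"))
```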
**INLINE_IMAGE**
This gets more common by the day, as more and more people have a standard signature or footer with their corporate logo in it on every mail they send out.
It might look nice, but it’s a technique often used by spammers, so it’s severely punished by spam filters. Other block rules that indicate images in a mail are SARE_GIF_ATTACH and HTML_IMAGE_ONLY_32.
**Now that’s a lot to check …**
Yes it is. And this is just about 0.0001% of the possible reasons your mail is being blocked. This article mostly covers mail content, but it could also be a misconfiguration of the mailserver, bad DNS records (such as SPF/SRV records), missing reverse DNS records, …
Here are some tools to help you see what might be wrong with the mailserver.
- BoxCheck 3-way mail check: http://www.boxcheck.com/
- Blacklist checker by IP: http://www.mxtoolbox.com/blacklists.aspx
- Multi RBL check: http://www.robtex.com/rbl/
- Full DNS Report check: http://www.checkdns.net/quickcheckdomainf.aspx
If this was even remotely useful, I might go into more detail on the server side of this story, more specifically the DNS side of it.