Spam Prevention by Enforcing Standards

Rick Moen [rick at linuxmafia.com]

Wed, 11 Jun 2008 20:51:25 -0700

Recently received Chez Moen, and I replied back saying it's indeed an astute and elegant idea, which I might very well adopt for my own domains. (In the absence of an MX = mail exchanger record in the DNS for a mail-receiving machine on the Internet, RFC 2821 specifies that the sending host should fall back on the "A" = forward lookup record, instead. This was also true with the original SMTP-defining RFC, RFC 821.)

And, in case anyone is wondering, Gokhan Gucukoglu's name is Turkish. He seems to be based in the UK, and is one of those protean characters active broadly across free / open source software, just to make the rest of us look bad by comparison. ;->

----- Forwarded message from Sabahattin Gucukoglu <mail@sabahattin-gucukoglu.com> -----

Date: Wed, 11 Jun 2008 09:48:55 +0100
From: Sabahattin Gucukoglu <mail@sabahattin-gucukoglu.com>
To: rick@linuxmafia.com
Subject: Spam Prevention by Enforcing Standards

Hi,

I notice that linuxmafia.com has just one MX, linuxmafia.com. See RFC 2821 section 5: remove your MX record; there is an implicit MX rule. Good MTAs know it, most spammers, their spamware, their agents, etc, etc, still don't. It's a great little trick and is doing me bloody wonders. (It may on very, very rare occasions break mailers which insist that there be an MX record when you issue MAIL FROM linuxmafia.com; such mailers are broken and you really don't want to talk to them anyway :-) .)

(If you wonder why I noticed at all, it's your multiline greeting.)

Cheers, Sabahattin

-- 
Sabahattin Gucukoglu <mail<at>sabahattin<dash>gucukoglu<dot>com>
Address harvesters, snag this: feedme@yamta.org
Phone: +44 20 88008915
Mobile: +44 7986 053399
https://sabahattin-gucukoglu.com/


Ben Okopnik [ben at linuxgazette.net]

Fri, 13 Jun 2008 15:36:56 -0400

On Wed, Jun 11, 2008 at 08:51:25PM -0700, Rick Moen wrote:

> Recently received Chez Moen, and I replied back saying it's indeed an
> astute and elegant idea, which I might very well adopt for my own
> domains.  (In the absence of an MX = mail exchanger record in the DNS
> for a mail-receiving machine on the Internet, RFC 2821 specifies that
> the sending host should fall back on the "A" = forward lookup record,
> instead.  This was also true with the original SMTP-defining RFC, RFC
> 821.)

Since I'm not an SMTP guru, I'm mystified. Spammers harvest email addresses off the Web; why would they need to know what the MX is for a given domain when they have the actual address? If they do manage to get a useful result out of 'dig -tmx <domainname>', what does that actually get them?

> Sabahattin Gucukoglu wrote:
> >
> > (If you wonder why I noticed at all, it's your multiline greeting.)

So what was the greeting? I gather it was one of those snarky "I'm a sysadmin and you're NOT - nyah!" types, which made your correspondent go 'seebling!'

-- 
* Ben Okopnik * Editor-in-Chief, Linux Gazette * https://LinuxGazette.NET *


René Pfeiffer [lynx at luchs.at]

Fri, 13 Jun 2008 22:01:51 +0200

On Jun 13, 2008 at 1536 -0400, Ben Okopnik appeared and said:

> On Wed, Jun 11, 2008 at 08:51:25PM -0700, Rick Moen wrote:
> > Recently received Chez Moen, and I replied back saying it's indeed an
> > astute and elegant idea, which I might very well adopt for my own
> > domains.  (In the absence of an MX = mail exchanger record in the DNS
> > for a mail-receiving machine on the Internet, RFC 2821 specifies that
> > the sending host should fall back on the "A" = forward lookup record,
> > instead.  This was also true with the original SMTP-defining RFC, RFC
> > 821.)
>
> Since I'm not an SMTP guru, I'm mystified. Spammers harvest email
> addresses off the Web; why would they need to know what the MX is for a
> given domain when they have the actual address? If they do manage to get
> a useful result out of 'dig -tmx <domainname>', what does that actually
> get them?

The spammers don't need the MX records, their software does (provided they don't use a misconfigured MTA). Their spamming tools need to get the MX record unless they use the fallback mentioned in RFC 2821. Some do, and some anti-spam rules don't accept email from domains without MX record. Either way you get less mail.

Best, René.


Ben Okopnik [ben at linuxgazette.net]

Fri, 13 Jun 2008 16:24:05 -0400

On Fri, Jun 13, 2008 at 10:01:51PM +0200, René Pfeiffer wrote:

> On Jun 13, 2008 at 1536 -0400, Ben Okopnik appeared and said:
> > 
> > Since I'm not an SMTP guru, I'm mystified. Spammers harvest email
> > addresses off the Web; why would they need to know what the MX is for a
> > given domain when they have the actual address? If they do manage to get
> > a useful result out of 'dig -tmx <domainname>', what does that actually
> > get them?
> 
> The spammers don't need the MX records, their software does (provided
> they don't use a misconfigured MTA). Their spamming tools need to get
> the MX record unless they use the fallback mentioned in RFC 2821.
> Some do, and some anti-spam rules don't accept email from domains
> without MX record. Either way you get less mail.

Perhaps I'm simply unclear on spammers' methods. Why would they use anything other than a standard MTA? Does, e.g., 'sendmail' instantly die of shame when it's used in that manner?

-- 
* Ben Okopnik * Editor-in-Chief, Linux Gazette * https://LinuxGazette.NET *


René Pfeiffer [lynx at luchs.at]

Fri, 13 Jun 2008 22:35:35 +0200

On Jun 13, 2008 at 1624 -0400, Ben Okopnik appeared and said:

> On Fri, Jun 13, 2008 at 10:01:51PM +0200, René Pfeiffer wrote:
> > [...]
> > The spammers don't need the MX records, their software does (provided
> > they don't use a misconfigured MTA). Their spamming tools need to get
> > the MX record unless they use the fallback mentioned in RFC 2821.
> > Some do, and some anti-spam rules don't accept email from domains
> > without MX record. Either way you get less mail. 
>
> Perhaps I'm simply unclear on spammers' methods. Why would they use
> anything other than a standard MTA? Does, e.g., 'sendmail' instantly die
> of shame when it's used in that manner?

The opinion is divided on this feature of Sendmail. ;) Gathering from the reports and articles I read, most spammers move their SMTP operations to botnets. They give money to botherders and have their infected PCs spew out the spams. This means that at the first stage no MTA is involved. If the bots have to send the email directly they might have to look up the MX record. If they use the ISP's upstream mail hub, then this might not work, but I doubt that the software infecting the bots has a highly complicated SMTP code (yet).

I'll have a chat with some anti-spam guys later this year, I may know more before the Christmas spams arrive. ;)

Best, René.


Rick Moen [rick at linuxmafia.com]

Fri, 13 Jun 2008 13:36:05 -0700

Quoting Ben Okopnik (ben@linuxgazette.net):

> Since I'm not an SMTP guru, I'm mystified. Spammers harvest email
> addresses off the Web; why would they need to know what the MX is for a
> given domain when they have the actual address? If they do manage to get
> a useful result out of 'dig -tmx <domainname>', what does that actually
> get them?

Certainly, those address lists are built in part through Web-spidering and harvesting mailto: hyperlinks -- not to mention the equally common practice of grabbing addresses from virus-infected Wind0ws hosts via MAPI address-book calls and grepping through the MSIE Web cache. That's part of what builds the CDs of believed-good addresses that the real pros sell to spammers. (The real pros, who also develop mass-mailing software dummied down enough so that even spammers can use them, are probably the only ones actually making money from this industry. Spammers themselves -- most of them -- are the technological equivalent of Herbalife distributors. Losers. Criminal-leaning luftmenschen.[1])

However, Sabahattin's point is that the spammers' mass-mailing tools have to _not only possess_ a large list of mailbox targets to hit, but then must actually _deliver to them_. And those tools tend to be very badly written, by people who don't give a damn about SMTP RFCs but just want to push stuff out the door.

The fact that spammers' software characteristically ignores RFCs can be used against them: Just make sure that your receiving MTA enforces the RFCs, and reject any mail from a non-compliant host. (He had a specific idea, in that area: I'm building up to that.)

I already do follow that general strategy, using "callouts": The RFCs require that any domain doing SMTP mail accept mail to two standard mailboxes for administrative reasons, postmaster and abuse.[2] My own MTA, at any time when a delivering MTA has a pending SMTP session open to drop off mail and mine hasn't yet accepted it, does a return SMTP side-session to the claimed delivering domain's mail server and initiates a test message to postmaster and one to abuse. If the other side indicates that those are acceptable delivery addresses, my MTA considers the domain to pass that test (and drops the test mails without completing their delivery). Otherwise, my MTA issues a 550 SMTP permanent refusal code on the incoming delivery attempt, including an explanatory error code telling the sender why his/her sending mail system needs to comply with the RFCs.
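A rough sketch of such a callout probe, in Python, might look like the following. It is an illustration under stated assumptions, not the configuration actually running on any system mentioned here: the function name callout_ok, the HELO name mail.example.org, and the smtp_factory parameter are all hypothetical, and for brevity it probes both mailboxes in a single session. Only standard-library smtplib calls are used, and DATA is never sent, so nothing is delivered.

```python
import smtplib


def callout_ok(mx_host, domain, localparts=("postmaster", "abuse"),
               helo_name="mail.example.org", smtp_factory=smtplib.SMTP):
    """Return True if mx_host accepts RCPT TO for every localpart@domain.

    Hypothetical helper: the names and defaults are illustrative only.
    """
    conn = smtp_factory(mx_host, 25, timeout=30)
    try:
        code, _ = conn.helo(helo_name)
        if not 200 <= code < 300:
            return False
        # Use the null reverse-path (MAIL FROM:<>), as bounces do.
        code, _ = conn.mail("")
        if not 200 <= code < 300:
            return False
        for localpart in localparts:
            code, _ = conn.rcpt(f"{localpart}@{domain}")
            # This sketch treats any non-2xx reply, including a temporary
            # 4xx, as a failed check; a real policy might retry on 4xx.
            if not 200 <= code < 300:
                return False
        return True
    finally:
        # Abandon the transaction without ever sending DATA.
        try:
            conn.rset()
            conn.quit()
        except (smtplib.SMTPException, OSError):
            pass
```

The smtp_factory argument exists only so the logic can be exercised against a stand-in object instead of a live mail server.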

Some accuse me of being too militant in this area: I've had sysadmins write back complaining that they'd manually disabled acceptance of postmaster and abuse mail, because they've become spam-targets. (In other words, they've adopted the "hide from spammers" strategy.) My response is "Sorry, but you really do need to follow the RFCs" -- though I also do whitelist their domains, exempting them from the callout check.[3] I also point out to the complaining sysadmins that their strategy will make their systems look like spam-sources, and so they might want to consider switching strategies, to implementing better spam-rejection instead of hiding.

Getting back to Sabahattin's suggestion: One of RFC 2821's fine points is that, although the primary means in the DNS of specifying where a domain's mail goes is "MX" entries, there's also an implicit fallback: If a domain (or fully qualified domain name) lacks an MX entry, the sending SMTP system is supposed to fall back on the "A" (forward lookup) entry.

Spammers' software tools being designed in an arrogantly RFC-ignorant fashion, they will often fail to be -able- to deliver mail to domains lacking MX records. Thus, Sabahattin's suggestion was: Remove your MX entry. Doing so will trip up a significant fraction of spammers, while it should not sabotage any but a very few legitimate mailers.
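The fallback rule a compliant sender follows can be sketched in a few lines of Python. This is only an illustration: the function name delivery_targets and the (preference, hostname) tuple layout are assumptions, the DNS lookup itself is left out, and the sketch sorts equal-preference hosts alphabetically where a real MTA would be expected to randomise among them.

```python
def delivery_targets(domain, mx_records):
    """Return the hosts to try, in order, for mail addressed to domain.

    mx_records is a list of (preference, hostname) tuples from an MX
    lookup, or an empty list if the domain publishes no MX at all.
    """
    if not mx_records:
        # Implicit MX: with no MX record, the domain's own address record
        # is treated as a single exchanger for that domain.
        return [domain]
    # Lowest preference value is tried first.
    return [host for _, host in sorted(mx_records)]


# A domain with no MX record is still deliverable by a compliant sender:
# delivery_targets("linuxmafia.com", []) returns ["linuxmafia.com"]
```

Mass-mailing software that handles only the second branch finds nothing to connect to when the MX list is empty, which is the whole trick.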

[1] Yiddish is the only language with quite the requisite degree of disdain, in this context. Quoting Leo Rosten's _The Joys of Yiddish_:

  luftmensch
 
   Pronounced LOOFT-mensh.  German:  Luft "air"; Mensch "man".
 
   1.  Someone with his head in the clouds.
   2.  An impractical fellow, but optimistic.
   3.  A dreamy, sensitive, poetic type.
   4.  One without an occupation, who lives or works ad libitum.

Except, luftmenschen would typically be harmless, likeable people, in which category spammers do not qualify.

[2] A fair reading of the RFCs is that the addresses must accept reasonable mail related to their purposes, e.g., they're not obliged to accept arbitrary content or spam.

[3] All common SMTP daemons default to handling those two mailboxes as valid incoming addresses: The sysadmins who have trouble are those who've manually acted to disable that RFC-required feature. Additionally, I am told that some releases of Microsoft Exchange have not bothered to be RFC-compliant.


Rick Moen [rick at linuxmafia.com]

Fri, 13 Jun 2008 13:39:37 -0700

Quoting Ben Okopnik (ben@linuxgazette.net):

> Perhaps I'm simply unclear on spammers' methods. Why would they use
> anything other than a standard MTA? Does, e.g., 'sendmail' instantly die
> of shame when it's used in that manner?

Spammers typically do not use standard MTAs at all. Instead, they use off-the-shelf spammer-specialty software they've purchased, that mass-crams SMTP out the door directly to target MTAs' port 25/tcp, usually breaking a whole bunch of RFCs in the process.

In many cases, these specialised MTA engines are built into malware that people get tricked into running on MS-Windows desktop boxes, which thereby become zombified and fall under spammers' remote control as spam sources at their direction.


René Pfeiffer [lynx at luchs.at]

Fri, 13 Jun 2008 22:55:50 +0200

On Jun 13, 2008 at 1336 -0700, Rick Moen appeared and said:

> Quoting Ben Okopnik (ben@linuxgazette.net):
>
> > Since I'm not an SMTP guru, I'm mystified. Spammers harvest email
> > addresses off the Web; why would they need to know what the MX is for a
> > given domain when they have the actual address? If they do manage to get
> > a useful result out of 'dig -tmx <domainname>', what does that actually
> > get them?
> [...]
> The fact that spammers' software characteristically ignores RFCs can be
> used against them:  Just make sure that your receiving MTA enforces the
> RFCs, and reject any mail from a non-compliant host. [...]

Here is an example of a rejected SMTP session by a Postfix MTA getting impatient (taken from https://scott.yang.id.au/2004/01/a-recent-postfix-log-on-spamming-attempt/):

 Out: 220 mx.example.net ESMTP Postfix
 In:  POST / HTTP/1.0
 Out: 502 Error: command not implemented
 In:  Content-Type: text/plain
 Out: 502 Error: command not implemented
 In:  Content-Length: 1111
 Out: 502 Error: command not implemented
 In:  Host: mx.example.net
 Out: 502 Error: command not implemented
 In:  X-Forwarded-For: [Spammer's fake real address]
 Out: 502 Error: command not implemented
 In:  Connection: Keep-Alive
 Out: 502 Error: command not implemented
 In:
 Out: 500 Error: bad syntax
 In:  RSET
 Out: 250 Ok
 In:  HELO yahoo.de
 Out: 250 mx.example.net
 In:  MAIL FROM:<Spammer's fake Hotmail address>
 Out: 250 Ok
 In:  RCPT TO:
 Out: 554 Service unavailable; [Spammer's real IP] blocked using bl.spamcop.net,
     reason: Blocked - see https://www.spamcop.net/bl.shtml?Spammer's real IP
 In:  DATA
 Out: 554 Error: no valid recipients
 In:  To: <My real email address>
 Out: 502 Error: command not implemented
 In:  From: "eddie" <Spammer's another fake Hotmail address>
 Out: 221 Error: I can break rules, too. Goodbye.

I especially like the last line. As you can see from the first "commands", the spammers use a lot of hardcoded stuff which breaks (or can be made to break, as Rick explained).

You can do a lot with Postfix's protocol checks, even more if you add some vicious regexps.
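To illustrate the kind of check involved, here is a hypothetical Python sketch (not Postfix's own code or configuration) that tests whether each incoming line even starts with a known SMTP verb and counts the ones that don't. The verb list and function names are assumptions made for the example.

```python
# Verbs a plain SMTP receiver might recognise (an illustrative list).
SMTP_VERBS = {"HELO", "EHLO", "MAIL", "RCPT", "DATA", "RSET",
              "NOOP", "QUIT", "VRFY", "EXPN", "HELP"}


def is_smtp_command(line):
    """True if the first word of line is a known SMTP verb (any case)."""
    verb = line.strip().partition(" ")[0].upper()
    return verb in SMTP_VERBS


def count_protocol_errors(lines):
    """Count the lines in a session that are not SMTP commands at all."""
    return sum(1 for line in lines if not is_smtp_command(line))


session = ["POST / HTTP/1.0", "Content-Type: text/plain", "RSET", "HELO yahoo.de"]
# count_protocol_errors(session) returns 2: the two HTTP-style lines.
```

A receiver could drop the connection once the count passes some small threshold, since a legitimate MTA should not produce such lines at command time.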

Best, René.


Neil Youngman [ny at youngman.org.uk]

Fri, 13 Jun 2008 22:01:41 +0100

On Friday 13 June 2008 21:36, Rick Moen wrote:

> (The real pros, who also develop mass-mailing
> software dummied down enough so that even spammers can use them, are
> probably the only ones actually making money from this industry.

Apparently not. There is apparently a lot of money made by pushing "high value" items via SPAM, specifically (counterfeit) pharmaceuticals. [1]

> I already do follow that general strategy, using "callouts":

A valuable technique, but a little controversial. Some people regard the use of sender verification callouts as abuse and will blacklist. They claim they shouldn't have to deal with the load imposed by a large burst of joe jobs. This ignores the fact that it's pretty cheap compared with the load caused by the backscatter that the joe jobs also create.

Neil

[1] https://www.darkreading.com/document.asp?doc_id=156139&f_src=drdaily


Rick Moen [rick at linuxmafia.com]

Fri, 13 Jun 2008 14:08:33 -0700

Quoting Neil Youngman (ny@youngman.org.uk):

> Apparently not. There is apparently a lot of money made by pushing "high 
> value" items via SPAM, specifically (counterfeit) pharmaceuticals. [1]

I guess I should have qualified what I said: Almost all spammers are no-hope luftmenschen. Some few make real money at it.

[callouts:]

> A valuable technique, but a little controversial. Some people regard the use 
> of sender verification callouts as abuse and will blacklist. They claim they 
> shouldn't have to deal with the load imposed by a large burst of joe jobs.

This assumes that callouts inevitably impose a significant load. That is not the case: Implemented properly, the results are cached and reused -- which is what is done on my system.
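As a minimal sketch of what caching buys, assuming a per-domain cache with a fixed lifetime (the class name CalloutCache, the one-day default, and the check callable are all hypothetical, and this is not a description of any particular MTA's implementation):

```python
import time


class CalloutCache:
    """Remember per-domain callout results so each domain is probed rarely."""

    def __init__(self, check, ttl_seconds=86400, clock=time.monotonic):
        self._check = check        # callable(domain) -> bool; does the real probe
        self._ttl = ttl_seconds    # how long a cached result stays valid
        self._clock = clock
        self._results = {}         # domain -> (time stored, result)

    def passes(self, domain):
        key = domain.lower()
        now = self._clock()
        cached = self._results.get(key)
        if cached is not None and now - cached[0] < self._ttl:
            return cached[1]       # fresh enough: no network traffic at all
        result = self._check(key)
        self._results[key] = (now, result)
        return result
```

With something like this in place, a burst of forged mail claiming one sender domain costs that domain's mail server a single probe per cache lifetime rather than one per message.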

You're right that some places make the trigger-happy assumption that any callouts are inherently bad. I've just now forwarded an example of people thinking exactly that.

I believe them to be in error. They are, of course, not obliged to accept my system's mail. They think I'm misguided (usually without actually bothering to know the particulars of what I'm really doing). I think they're misguided. We each get our way, to the extent property laws permit. ;->


Ben Okopnik [ben at linuxgazette.net]

Fri, 13 Jun 2008 17:17:46 -0400

On Fri, Jun 13, 2008 at 10:35:35PM +0200, René Pfeiffer wrote:

> On Jun 13, 2008 at 1624 -0400, Ben Okopnik appeared and said:
> > On Fri, Jun 13, 2008 at 10:01:51PM +0200, René Pfeiffer wrote:
> > > [...]
> > > The spammers don't need the MX records, their software does (provided
> > > they don't use a misconfigured MTA). Their spamming tools need to get
> > > the MX record unless they use the fallback mentioned in RFC 2821.
> > > Some do, and some anti-spam rules don't accept email from domains
> > > without MX record. Either way you get less mail. 
> > 
> > Perhaps I'm simply unclear on spammers' methods. Why would they use
> > anything other than a standard MTA? Does, e.g., 'sendmail' instantly die
> > of shame when it's used in that manner? 
> 
> The opinion is divided on this feature of Sendmail. ;)

Ah - it only happens if the binary was compiled with the DIE_OF_SHAME_IF_USED_FOR_LAME_PURPOSES flag! Understood; carry on.

> Gathering from
> the reports and articles I read most spammers move their SMTP operations
> to botnets. They give money to botherders and have their infected PCs
> spew out the spams. This means that at the first stage no MTA is
> involved. If the bots have to send the email directly they might have to
> look up the MX record. If they use the ISP's upstream mail hub, then this
> might not work, but I doubt that the software infecting the bots has a
> highly complicated SMTP code (yet).
> 
> I'll have a chat with some anti-spam guys later this year, I may know
> more before the Christmas spams arrive. ;)

This sounds like waiting for the arrival of the Christmas winds in the Caribbean - ~30 knot north/northwesterlies for a couple of weeks straight. You have to make sure, beforehand, that your rodes are in good shape and your anchors are well set - or you might not make it through the bad stretch.

-- 
* Ben Okopnik * Editor-in-Chief, Linux Gazette * https://LinuxGazette.NET *


Ben Okopnik [ben at linuxgazette.net]

Fri, 13 Jun 2008 17:42:05 -0400

On Fri, Jun 13, 2008 at 01:36:05PM -0700, Rick Moen wrote:

> 
> However, Sabahattin's point is that the spammers' mass-mailing tools
> have to _not only possess_ a large list of mailbox targets to hit, but
> then must actually _deliver to them_.  And those tools tend to be very
> badly written, by people who don't give a damn about SMTP RFCs but just
> want to push stuff out the door.

Ah. I had a guess about your original statement that tended in that direction; that's the only way I could see this working.

> Some accuse me of being too militant in this area:  I've had sysadmins
> write back complaining that they'd manually disabled acceptance of
> postmaster and abuse mail, because they've become spam-targets.  (In
> other words, they've adopted the "hide from spammers" strategy.)  My
> response is "Sorry, but you really do need to follow the RFCs" -- though
> I also do whitelist their domains, exempting them from the callout
> check.[3] I also point out to the complaining sysadmins that their
> strategy will make their systems look like spam-sources, and so they
> might want to consider switching strategies, to implementing better
> spam-rejection instead of hiding.

From everything I've seen, a competent sysadmin on a system with a decent connection shouldn't have to hide - or, more to the point, can't. The mail, as the expression goes, must go through - and hiding a couple of boxes seems like a pitiful attempt to propitiate the Gods of Spam. What's supposed to happen to the addresses in their domain that can't hide?

> Spammers' software tools being designed in an arrogantly RFC-ignorant
> fashion, they will often fail to be -able- to deliver mail to domains
> lacking MX records.

Nice. Of course, the spammers - who clearly do spend money on this stuff - will have someone code around the problem when it becomes a major one, but it'll work for a while. Years ago, I used to convert all the email addresses on my site (along with a couple of test addresses used for just this purpose) to HTML entities - e.g., ben@ben.com would look like

&#98;&#101;&#110;&#64;&#98;&#101;&#110;&#46;&#99;&#111;&#109;

The point, of course, was that browsers would display it just fine (and the 'mailto' links using the above 'address' also worked fine) - but text-processing bots wouldn't see an address at all. That gave me something like five years of not having those addresses harvested; an excellent return on a minimal time investment.
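The conversion itself is trivial to script. A small Python sketch (the function name obfuscate is made up for illustration) replaces every character with its decimal HTML entity:

```python
import html


def obfuscate(address):
    """Encode every character of address as a decimal HTML entity."""
    return "".join(f"&#{ord(ch)};" for ch in address)


encoded = obfuscate("ben@ben.com")
# encoded is "&#98;&#101;&#110;&#64;&#98;&#101;&#110;&#46;&#99;&#111;&#109;"

# A browser (or html.unescape) turns it straight back into the address.
decoded = html.unescape(encoded)
# decoded is "ben@ben.com"
```

The round trip through html.unescape also shows why this only stops naive harvesters: any bot that decodes entities before matching sees the address as usual.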

> [3] All common SMTP daemons default to handling those two mailboxes as
> valid incoming addresses:  The sysadmins who have trouble are those
> who've manually acted to disable that RFC-required feature.
> Additionally, I am told that some releases of Microsoft Exchange have
> not bothered to be RFC-compliant.

Y'know, that's odd. I've just looked through 2821 again, and I couldn't find a requirement for 'abuse' - although there definitely is one for 'postmaster'. Couldn't find it anywhere the last time I looked, either. Do you happen to know where it's defined?

-- 
* Ben Okopnik * Editor-in-Chief, Linux Gazette * https://LinuxGazette.NET *


Rick Moen [rick at linuxmafia.com]

Fri, 13 Jun 2008 15:03:52 -0700

Quoting Ben Okopnik (ben@linuxgazette.net):

> From everything I've seen, a competent sysadmin on a system with a
> decent connection shouldn't have to hide - or, more to the point, can't.
> The mail, as the expression goes, must go through - and hiding a couple
> of boxes seems like a pitiful attempt to propitiate the Gods of Spam.
> What's supposed to happen to the addresses in their domain that can't
> hide?

Quite so.

At a certain point, if you're serious about having a stable Internet presence, you find yourself obliged to take a stand and say "No. I will not attempt to conceal my perfectly valid e-mail address, just because someone might try to abuse it. This is mine. I am not going to be driven away from the ability to make it publicly known as a way to reach me."

The postmaster@ and abuse@ addresses are required (see below) to be deliverable because they serve a functional, legitimate need. My own address serves a functional, legitimate need. Accordingly, I will neither disable nor obscure any of them.

Some percentage of sysadmins, and a much larger percentage of users, have not yet arrived at that conclusion. I predict that logic will drive acceptance of that view, but might be wrong, and in any event it's a very slow process.

[the fact that spammers' mass-mailing software tools characteristically run roughshod over various RFC requirements:]

> Nice. Of course, the spammers - who clearly do spend money on this stuff
> - will have someone code around the problem when it becomes a major one,
> but it'll work for a while.

Better than that: It's not just a useful heuristic over the short term. In very general terms, the more spammers are obliged to behave in an RFC-compliant way in order to hit their targets, the more readily their mailings can be programmatically identified and refused.

This is the same logic that underlies adoption of SPF (and DomainKeys) for mail domains' DNS reference records: Critics object that spammers will merely start publishing SPF RRs of their own. However, (1) the main goal of preventing them from believably masquerading as legitimate (non-spammer) sender domains gets achieved, and (2) they are obliged to establish their own sending domains rather than pretend to be legitimate ones, which spammer domains can then be treated as having poor (or unknown) reputations by receiving systems.

[RFC requirement for abuse@]

> Y'know, that's odd. I've just looked through 2821 again, and I couldn't
> find a requirement for 'abuse' - although there definitely is one for
> 'postmaster'. Couldn't find it anywhere the last time I looked, either.
> Do you happen to know where it's defined?

My friend Derek Balling operates an RBL and related informational Web site, out of rfc-ignorant.org. It's worth visiting.

On the https://www.rfc-ignorant.org/ front page's left-side navbars, you'll note "Listing Policy" links including "abuse". Follow that link, and you reach https://www.rfc-ignorant.org/policy-abuse.php , which cites the exact RFC requirement: RFC2142, section 4.
