Spam Tracing

Posted: 1st March 2000 | Filed under: Press Articles, Technical
Author: Paul Ockenden
First Appeared in PC Pro 2000

It's been a month of email problems. First, we and several other PC Pro contributing editors were placed on a junk mailing list and ended up with inboxes full of rubbish. We've also had to deal with two cases of fake emails.

Unsolicited commercial email - UCE for short, spam for too long - is one of the big bugbears for most Internet users. Once your address appears on a 'spamming list' you'll start to see an avalanche of junk, which will probably start with people trying to sell you a spamming list, followed by offers of holidays in Florida and get-rich-quick schemes.

As well as finding its way into my inbox, such email is also arriving addressed to our children. It takes up a huge proportion of the already overcrowded Internet bandwidth and costs us all money while we download it. Basically, it's a nightmare and its perpetrators are thought of as scum. So what happens when one day the perpetrator suddenly becomes you? Not really you, but someone else sending out spam that either purports to be from you, or else from someone within a domain for which you are responsible.

We've seen this scenario three times this month, two of them involving our clients. Imagine the consequences. If you're a company, your reputation is at stake. Firms spend fortunes to project a clean image, but spamming can trash this in an instant. For personal domains or email accounts, the damage is no less worrying as you're probably a professional whose clients and friends will think less of you after you've appeared to spam them.

There are far more serious consequences: once you're labelled as a spammer, people may take measures against you. You'll probably be 'mail bombed' by people with fast Internet connections who will flood your inbox with multi-megabyte email attachments for weeks and render your email account useless. You'll also find your email being blocked, as individuals use facilities such as the Junk Sender List in Outlook Express to make sure they never see another email from you.

More seriously still, you may find that someone adds your mail server to a 'blocking list' - a list of IP addresses known to be associated with junk email. Various ISPs and companies check all incoming email against such lists and reject the mail if a match is found. Most serious of all, some network administrators are now tying these blocking lists into their firewall rules. Once you've been found 'guilty' of spamming, not only will you be unable to send email to much of the Internet, but many people won't be able to access anything within your address space.

In short, a nightmare, and even more so if the original spam that sparked it off had nothing to do with you. So what can you do?

Detective work

With the help of Brian Dorricott, MD of Gordano (www.gordano.com), which created the excellent NTMail and NTList email server products, let's see how we can trace who was really behind the UCE that trashed your good name. There are three levels of evidence that you can examine: the received clauses in the message itself, the mail server log files, and DNS resolution if you have the IP addresses at hand.

So what is a received clause? These are part of the 'headers' you'll see if you examine the original source message of an incoming email. Emails are typically not sent point-to-point, but will bounce around different machines before reaching their destination; a received clause gets added to the top of the message by each server as the email passes through it, providing an 'audit trail' showing where the message has been. It's important that the header contains as much information as possible about the identities of all machines participating in mail transfer, and so received clauses are defined by RFC822, the standard for the format of Internet email messages. Received clauses are important to us because the spammer has little or no control over them.
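To make the 'audit trail' idea concrete, here's a rough sketch of our own (not something from Gordano) that pulls the received clauses out of a raw message using Python's standard email module. The message, host names and addresses are all invented for illustration. Because each server adds its clause at the top, the list is reversed so that it reads in the order the message travelled.

```python
from email import message_from_string

# Hypothetical raw message; hosts and addresses are invented for illustration.
RAW_MESSAGE = """\
Received: from relay.example.net by mail.example.org with ESMTP id abc123
\tfor <user@example.org>; Sun, 14 Nov 1999 11:55:02 +0000
Received: from [192.0.2.10] by relay.example.net with SMTP id xyz789;
\tSun, 14 Nov 1999 11:54:40 +0000
From: someone@example.com
To: user@example.org
Subject: Test

Body text.
"""


def received_trail(raw_message):
    """Return the received clauses oldest first, each collapsed to one line."""
    msg = message_from_string(raw_message)
    clauses = msg.get_all("Received", [])
    # Each server adds its clause at the top, so reverse to get travel order.
    return [" ".join(clause.split()) for clause in reversed(clauses)]


trail = received_trail(RAW_MESSAGE)
for hop, clause in enumerate(trail, start=1):
    print(hop, clause)
```

Bear in mind that the sketch simply reports what the headers say: clauses below the first server you trust may have been forged by the sender.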

RFC822 defines the syntax of a received clause as follows:

received = "Received" ":"      ; one per relay
           ["from" domain]     ; sending host
           ["by" domain]       ; receiving host
           ["via" atom]        ; physical path
           *("with" atom)      ; link/mail protocol
           ["id" msg-id]       ; receiver msg id
           ["for" addr-spec]   ; initial form
           ";" date-time       ; time received

Let's look at the meaning of some of those fields:

from domain: The machine that delivered this email message to your server. The RFC recommends that the reverse look-up of the IP address be used for this field, but many SMTP servers don't do this because it can cause loss of information and slow down the acceptance of messages while the mail server waits for the DNS to fail.

by domain: The name of the machine receiving the message, for instance, your mail server.

id msg-id: A server-specific identifier that the mail server uses to trace the email message through the system; it should be unique to each message on that server.

for addr-spec: This is the full 'global address' of the destination, for example, 'for brian@gordano.com'.

date-time: The date and time at which the message was received (note received, not sent), in the standard RFC822 format. You can use this information to establish whether there have been delays in delivering the mail and where they occurred.

A full set of headers should lead you on your way to identifying the spamming culprit.
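As an illustration of using the date-time field to spot delays, the sketch below works out the time a message spent between consecutive hops. It's a minimal example under our own assumptions: the clauses are invented, they're listed oldest first, and each ends with a date-time after the final semicolon that carries an explicit time zone offset.

```python
from email.utils import parsedate_to_datetime


def received_time(clause):
    """Parse the date-time that follows the last ';' in a received clause."""
    date_text = clause.rsplit(";", 1)[1].strip()
    return parsedate_to_datetime(date_text)


def hop_delays(clauses_oldest_first):
    """Return the seconds elapsed between each pair of consecutive hops."""
    times = [received_time(c) for c in clauses_oldest_first]
    return [(later - earlier).total_seconds()
            for earlier, later in zip(times, times[1:])]


# Hypothetical clauses, oldest first; the host names are invented.
clauses = [
    "from [192.0.2.10] by relay.example.net with SMTP id xyz789; "
    "Sun, 14 Nov 1999 06:52:44 +0100",
    "from relay.example.net by mail.example.org with ESMTP id abc123; "
    "Sun, 14 Nov 1999 11:55:02 +0000",
]
delays = hop_delays(clauses)
print(delays)  # [21738.0], a little over six hours between the two hops
```

Note that the two servers report different time zones, so the offsets matter. A date-time with a '-0000' offset is parsed by Python as a time without a zone, and would need converting before it could be compared with the others. Server clocks may also be wrong, so treat small differences with caution.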

There are times when the headers can't be relied on:

Some mail servers provide the service of removing all the received clauses within the message: these are known as anonymous remailers. This practice is legitimised by privacy considerations, where the identity of the parties must remain hidden.

Some mail servers can be configured not to add a received clause, which will cause a discrepancy in the audit trail so progress through servers can't be observed fully.

Some mail servers don't insert all the information available into the received clauses: for example, the 'for' clause is often missing. Without this clause, it can be difficult to establish why a message has been routed to a specific server.

There can also be problems with the reverse look-ups. We've said that many mail servers don't even attempt it, but there are several reasons why those that do may fail to give correct information. Failure isn't usually deliberate but a result of errors or external ISP requirements. Examples include badly-configured DNS servers and mail clients running on temporary IP addresses.
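If a received clause gives you only an IP address, you can attempt the reverse look-up yourself. The following is a hedged sketch rather than a finished tool: the function names are our own, it handles only IPv4 addresses, and it treats any look-up failure as 'no name available'. The resolver is passed in as a parameter purely so that the example can be run without a network connection.

```python
import re
import socket


def sender_ip(clause):
    """Pull the first dotted-quad IPv4 address out of a received clause."""
    match = re.search(r"\b(\d{1,3}(?:\.\d{1,3}){3})\b", clause)
    return match.group(1) if match else None


def reverse_lookup(ip, resolver=socket.gethostbyaddr):
    """Return the host name registered for ip, or None if the look-up fails."""
    try:
        hostname, _aliases, _addresses = resolver(ip)
        return hostname
    except OSError:
        # Covers socket.herror and socket.gaierror: no reverse record,
        # DNS server unreachable, and so on.
        return None


# Offline demonstration with a stand-in resolver. A real run would omit the
# second argument and query DNS.
def stub_resolver(ip):
    return ("host.example.net", [], [ip])


ip = sender_ip("Received: from [111.222.111.222] by mail.example.org")
print(ip, reverse_lookup(ip, stub_resolver))
```

A name returned this way is only as trustworthy as the DNS records behind it, for the reasons given above.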

One of the biggest headaches is 'IP spoofing'. Every TCP/IP packet carries details of the address of its sender, but it's possible to fake these and make emails appear to come from any IP address, without the message ever having been near the machine owning that IP address.

Let's take a look at some received clauses. The following one gives the full information (as per RFC822) and allows investigation of the route that the email message may have taken:

'Received: from [111.222.111.222] by mail.net-shopper.co.uk (NTMail 5.00.0010/AB0000.00.719cfeeb) with ESMTP id vkgdtcaa for ; Sun, 14 Nov 1999 11:55:02 +0000.'

However, take a look at the following:

'Received: from mail.xxx.com by nico.yyyy.co.uk with SMTP (Microsoft Exchange Internet Mail Service Version 5.0.1460.8) id VJSBPLAC; Fri, 22 Oct 1999 10:07:13 +0100.'

Its missing 'for' clause means there's no way you can determine where this message is going to at this point, so another server could have changed the destination address and you wouldn't know anything about it.

'Received: from 111.222.111.222 by dns (SMI-8.6/SMI-SVR4) id GAA04407; Sun, 14 Nov 1999 06:52:44 +0100.'

Here we know neither which server processed the message nor where it is going: you could only identify this mail server by looking at the previous received clause, if there is one.

As for this one:

'Received: (qmail 7775 invoked from network); 22 Oct 1999 06:39:17 -0000.'

This is the classic. You don't know anything about the message except when it passed through your local server, and that the software being used is 'qmail'.
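Checking clauses like these by eye soon gets tedious, so here's a minimal sketch of how the fields might be picked out and the gaps reported automatically. It's a simplification of our own and not a full RFC822 parser: it drops parenthesised comments (without handling nested ones), takes everything after the last semicolon as the date-time, and treats each keyword followed by a value as a field. The three incomplete clauses are the ones quoted above; the complete one is invented, with a made-up server name and address.

```python
import re

KEYWORDS = ("from", "by", "via", "with", "id", "for")
EXPECTED = ("from", "by", "with", "id", "for")  # 'via' is rarely present


def parse_received(clause):
    """Split a received clause into its keyword fields and its date-time."""
    text = re.sub(r"^Received:\s*", "", clause.strip())
    text = re.sub(r"\([^()]*\)", " ", text)  # drop non-nested comments
    body, _, date = text.rpartition(";")
    fields = {"date-time": date.strip()}
    tokens = body.split()
    for keyword, value in zip(tokens, tokens[1:]):
        if keyword.lower() in KEYWORDS and value.lower() not in KEYWORDS:
            fields.setdefault(keyword.lower(), value.strip("<>"))
    return fields


def missing_fields(clause):
    """List the expected fields that the clause does not supply."""
    fields = parse_received(clause)
    return [name for name in EXPECTED if name not in fields]


# Hypothetical complete clause; the server name and address are invented.
COMPLETE = ("Received: from [192.0.2.10] by mail.example.org (ExampleMTA 1.0) "
            "with ESMTP id abc123 for <user@example.org>; "
            "Sun, 14 Nov 1999 11:55:02 +0000")
# The three incomplete clauses quoted in the article.
EXCHANGE = ("Received: from mail.xxx.com by nico.yyyy.co.uk with SMTP "
            "(Microsoft Exchange Internet Mail Service Version 5.0.1460.8) "
            "id VJSBPLAC; Fri, 22 Oct 1999 10:07:13 +0100")
SENDMAIL = ("Received: from 111.222.111.222 by dns (SMI-8.6/SMI-SVR4) "
            "id GAA04407; Sun, 14 Nov 1999 06:52:44 +0100")
QMAIL = "Received: (qmail 7775 invoked from network); 22 Oct 1999 06:39:17 -0000"

for name, clause in [("complete", COMPLETE), ("exchange", EXCHANGE),
                     ("sendmail", SENDMAIL), ("qmail", QMAIL)]:
    print(name, "is missing:", missing_fields(clause) or "nothing")
```

The sketch checks only whether a field is present, not whether it's useful: the third clause does have a 'by' field, but its value, 'dns', tells you nothing about which server handled the message.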

Using a combination of the received clauses from the message, a few traceroutes, some Internet searches and some intuition, you can probably trace the sender of the spam, or his ISP. Once you've found this out, you'll want to notify someone. First, email the accounts 'postmaster' and 'abuse' at the domain of the originator of the message. Most ISPs will monitor these accounts and take action against any customers using their facilities for spamming.
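Once you've identified the originating domain, working out where to send your complaint is mechanical. A trivial sketch, using a function name of our own and an invented domain:

```python
def report_addresses(originating_domain):
    """Return the postmaster and abuse mailboxes for a domain."""
    domain = originating_domain.strip().lower().rstrip(".")
    return [f"{mailbox}@{domain}" for mailbox in ("postmaster", "abuse")]


print(report_addresses("Example.NET"))
```

Whether either mailbox is actually monitored is down to the ISP concerned.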

We asked Gordano's Brian Dorricott some questions about spam from the mail server vendor's perspective:

PO: Can mail software vendors prevent their software being used for spamming?

BD: No. A mail server is designed to send email and send it fast. However, vendors can help make email traceable by including all the information available in the headers of messages, so the trace of where the message went can be found and the owners of the software that created the UCE can be contacted.

As you can see from the examples above, Brian's product, NTMail, puts the maximum amount of information into the header of each mail message. This is to be applauded.

PO: Is it possible to stop spam?

BD: It's not possible to stop UCE without losing some wanted emails because the definition of UCE depends upon the attitude of the person reading the message. However, a series of techniques can be applied (filtering, volume monitoring) that can reduce the amount of UCE entering systems. Our NTMail product has an additional option called 'JUCE' (Junk Unsolicited Commercial Email) that does some of this.

So the message seems to be to stay on your guard, use sensible mail software and don't send spam emails.