Why Is It So Easy To Spoof E-Mail Headers?

I’m not that technically knowledgeable about the inner workings of e-mail, so I’d be obliged for a little explanation, or better yet, citation, as to how spoofing/altering/masking return e-mail address/header information is done (in general – I’m not trying to get a primer on how to start my own mass spam ring). Why, technically, is it so simple?

Or is it? How much of the “source” information can be concealed, and how much, technically, must be accessible (by source I mean the combination of data about the originating computer but also about relays, routes, etc. used after the original sending of the e-mail)? I ask because Spamcop usually gets some information, so it’s not literally true (as far as I can tell) that a message could land in your inbox with absolutely no indication of from where it came (if only in the sense that you could trace it back to the last hijacked relay in the chain, etc.).

I know that part of the answer to “why is this so easily hacked?” will likely be that the existing e-mail protocols were developed at a time when no one thought there was any money in e-mail or the Internet, so they didn’t build in big safeguards that might hinder e-mail’s ease of use. Okay . . . but given that ingenuous initial approach, what specifically would have to be done (and how hard would it be) to indelibly mark every e-mail with an accurate source/return address? (I’m not saying this would be good on policy grounds, etc., or that systems based on such source-identification might not be used for ‘bad’ purposes, if only in the sense of imposing/tracking an electronic postage requirement, but it might stop some of the spam at least for a while.)

And even there . . . I’m sure that ending spoofing wouldn’t end spam. I’m just curious, as someone non-knowledgeable about e-mail mechanics, how much of an opportunity hackers/spammers have to disguise the source of e-mail, why this loophole exists, and how hard it would be to close it.

The analogy of physical mail is a good one here. The return address on an envelope is not checked by anyone at the post office unless the mail is not deliverable. You can write anything you want for the return address.

Likewise, almost nothing is checked in the email system. Each machine that processes the mail puts a little something in the header and passes it on. Nothing in the system checks that these messages are true. I pulled up an email that I received today, and there are 11 machines listed in the headers. Only the last 3 are within my company; let’s assume I trust those. The only thing that I know is not forged is the IP address of the machine that handled the message just before it got to my company.

This is similar to the postmark on a physical envelope: you trust that the post office that stamped the envelope is reporting true information, but all else is just stuff someone wrote down.

For comparison, you could ask why the return address on postal mail can be easily forged. It’s nominally required, but there is no mechanism to verify its accuracy. The From and Reply-to headers exist for convenience, but the routing which occurs in delivering a message does not use them or verify them, so they can be forged to whatever the sender desires. Only the recipient address is required to be accurate in order for the email to reach its destination.

You can gain some information about the real sender by examining the “Received:” headers in email. This field tracks the SMTP servers which relayed the message. False Received headers can be added, but each mailserver which sends the message will also insert a valid one. These are like the post office post mark indicating where the letter was sent. I believe this is what SpamCop and other tools use to trace spam to its source.
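
As a rough illustration of reading that trail, here is a minimal Python sketch using the standard library's email parser. The message text, host names, and addresses are all made up for the example (the IPs come from ranges reserved for documentation), and real Received lines vary a lot from server to server, so treat it as a sketch rather than a tracing tool.

```python
from email import message_from_string

# Hypothetical raw message; every host and IP here is invented.
RAW = """\
Received: from mx.example.net (mx.example.net [203.0.113.7])
\tby mail.mycompany.example; Mon, 5 Jan 2004 10:00:03 -0500
Received: from fake.relay.example ([198.51.100.9])
\tby mx.example.net; Mon, 5 Jan 2004 10:00:01 -0500
From: "Your Bank" <security@bank.example>
To: me@mycompany.example
Subject: Urgent

Click here.
"""

def received_chain(raw_message):
    """Return the Received header values in travel order (oldest first)."""
    msg = message_from_string(raw_message)
    hops = msg.get_all("Received", [])
    # Each server adds its line at the top, so reverse to get travel order.
    # split/join collapses the line folding into single spaces.
    return [" ".join(hop.split()) for hop in reversed(hops)]

for number, hop in enumerate(received_chain(RAW), start=1):
    print(number, hop)
```

The sketch only lists the lines in order. It does not judge which of them are genuine.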

There are a lot of reasons why spoofing is possible. Some are simply oversights - as you mention, the designers probably didn’t anticipate the current situation since email was used only within a very tightly controlled group for many years. Other reasons are practical - you have to make a judgement on whether it’s worth the computing power to resolve and verify the sender addresses. Given the current situation, we might make a different judgement, but back in the day it just wasn’t worth requiring the SMTP server to validate every sender.

Also, there are good reasons for spoofing. I may have an email client that uses several different addresses (e.g. my address at several different company domains) but I always want to send mail using a single return address regardless of which domain I’m sending through. Spoofing allows me to do that, and there’s nothing fraudulent about it.

Headers are not special data; they are merely lines of text that are added at the start of the body of a message. Headers + body = one big glob of text with nothing more than a blank line to separate the two parts. (Outlook hides the nature of email messages so this may not be apparent to you, but one can see this clearly in other software.) So it is really just as easy to write lies in the headers of a message as it is in the body, because it’s all one big text document. To start with, your own email software puts some basic headers at the top of the body, such as From: and Subject: and Date: and X-Mailer:. Then each time your message passes through an email server, that server adds one or more lines of text (headers) to the top of the message.
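
To make the "one big text document" point concrete, here is a small Python sketch. The function names `split_message` and `relay`, the addresses, and the server name are hypothetical. It also simplifies things: real mail uses CRLF line endings, and a real Received line carries a date and more detail.

```python
def split_message(raw):
    """Split raw message text at the first blank line into (headers, body)."""
    headers, _, body = raw.partition("\n\n")
    return headers, body

def relay(raw, server_name, client_ip):
    """Mimic a relay: put one new Received line on top of the existing text."""
    stamp = f"Received: from [{client_ip}] by {server_name}"
    return stamp + "\n" + raw

# The sender's own software writes these lines; nothing checks them.
original = (
    "From: anyone.at.all@example.com\n"
    "Subject: Hello\n"
    "\n"
    "This is the body.\n"
)

relayed = relay(original, "mx.example.net", "192.0.2.44")
headers, body = split_message(relayed)
print(headers)
print("---")
print(body)
```

The only line the relay contributes is the one it stamps on top. Everything below it arrives as the sender wrote it.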

In the case of a spam email, the topmost (most recently appended) headers are likely telling the truth, while the bottommost headers and the body are full of lies. One trick is knowing where to draw the line between the truth at the top and the lies at the bottom. There is also the problem that certain older email server software doesn’t write good headers, and ignorant people throughout the world still run such software, and spammers illicitly take advantage of them.
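
One naive way to draw that line, sketched in Python: walk the Received lines from the top and stop at the first one that was not written by a server you trust. The function name `last_trusted_hop`, the host names, and the sample lines are hypothetical, and the "by" pattern is a simplification of what real servers write.

```python
import re

def last_trusted_hop(received_headers, trusted_hosts):
    """Walk Received lines top-down; return the last one stamped by a trusted host.

    received_headers: header values, topmost (most recent) first.
    trusted_hosts: lowercase names of servers we run or otherwise trust.
    Returns None if even the topmost line is not from a trusted host.
    """
    boundary = None
    for value in received_headers:
        match = re.search(r"\bby\s+([^\s;]+)", value)
        if match is None or match.group(1).lower() not in trusted_hosts:
            break  # from here down, anyone could have written the text
        boundary = value
    return boundary

# Invented example: two hops inside the company, one outside.
HOPS = [
    "from gw.mycompany.example by mail.mycompany.example; Mon, 5 Jan 2004 10:00:05 -0500",
    "from [198.51.100.9] by gw.mycompany.example; Mon, 5 Jan 2004 10:00:03 -0500",
    "from trusted.bank.example by relay.unknown.example; Mon, 5 Jan 2004 09:59:00 -0500",
]
TRUSTED = {"mail.mycompany.example", "gw.mycompany.example"}

print(last_trusted_hop(HOPS, TRUSTED))
```

The "from" part of the returned line is the last thing you can rely on. Note that this simple check can be fooled by a forged line that names one of your own servers after "by", so it is a starting point at best.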

And for more fun:

There is an “Answer to” address that gets used by your e-mail program when you respond, and there is a “return path” address that gets used when a server along the way decides the “send to” address can’t be found.

The “Return Path” can be a single account for everyone using the same domain (mail gets bounced to the postmaster) or it may be a copy of the “Answer to” address.

Some ISPs have begun checking the return path - they do a name server lookup (they ask for the IP address of the domain name after the @) and drop the message with comment if the domain is invalid.
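
A check of that kind might look roughly like the Python sketch below. The function name `sender_domain_resolves` is hypothetical, and the sketch assumes a plain address lookup is what the ISP does; a real mail server may well look for mail-specific DNS records instead, which the standard library cannot query. The resolver is passed in as a parameter so the example can run against a stand-in without touching the network.

```python
import socket

def sender_domain_resolves(return_path, resolve=socket.gethostbyname):
    """Return True if the domain after the @ resolves to an IP address."""
    address = return_path.strip().strip("<>")
    _local, at, domain = address.rpartition("@")
    if not at or not domain:
        return False  # no usable domain part at all
    try:
        resolve(domain)
    except OSError:  # socket.gaierror is a subclass of OSError
        return False
    return True

def fake_resolve(name):
    """Stand-in resolver for the example: knows exactly one domain."""
    if name == "example.com":
        return "192.0.2.1"
    raise socket.gaierror("no such host")

print(sender_domain_resolves("<me@example.com>", fake_resolve))
print(sender_domain_resolves("me@fantasy.invalid", fake_resolve))
```

Leaving out the second argument would use the system resolver instead of the stand-in.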

I know they do this because for a while I was running a sendmail system that sent mail out with a fantasy return path - I didn’t have a domain of my own at the time. Most of the time it was no problem, but some messages weren’t getting through to the receiver. The easy fix was to rewrite the “Return Path” address to match the “Answer To” address. The best solution was of course to get my own domain name and IP address.

So, there is some checking of e-mail headers going on out there.

I do think something has got to be done about this crap. My new domain doesn’t get spam, but I still have a regular e-mail account from an ISP, and it gets about twenty pieces a day. I can (luckily) filter the crap quite easily, but I know people who don’t have the options that I do in that direction, and they are getting flooded.

SMTP (the current email standard) was published in 1982, and was designed as a stop-gap measure, with the intent that something better would come along in a few years. 22 years later, we’re still stuck with it.

This type of thing happens often in computers. You are in a rush to put out something, anything, that will just work. You don’t intend for it to last forever, but it gets too popular too fast, and suddenly the cost of replacing it with something better becomes prohibitive, or the functionality of the “better” thing is crippled by being forced to maintain backwards compatibility.

Authenticated SMTP is one way of enforcing the sanctity of e-mail, but it is not a panacea.

It basically injects another level of security before giving the e-mail sender access to the mailserver. You can’t send mail until you provide the server with a valid username and password. Any mail then sent from the server will include a header line (encrypted) that indicates (to the mailserver admins) which username was used to send the mail, regardless of what appears in the message’s From: or Return Path: fields.

Why isn’t AuthSMTP our panacea for unwanted mail?

Because it relies too heavily on a very weak link in the cyberspace chain: password security.

AuthSMTP is only as secure as the usernames and passwords that drive it. Unfortunately, your average internet user doesn’t understand the concept of password security any more than they understand string theory.

[QUOTE=honeydewgrrl]
Why isn’t AuthSMTP our panacea for unwanted mail?

Because it relies too heavily on a very weak link in the cyberspace chain: password security.

[/QUOTE]

SMTPAuth only applies to the first outgoing step, authenticating the user sending email from their own computer to their own ISP/company/school/whatever mail server. Assuming the recipient is not served by the same mail server, the mail server will contact another server at another ISP/company/school/whatever and pass the message along. There may be multiple server-to-server hops, all unauthenticated. The last hop delivers the mail to the recipient, again unauthenticated.

The main hole here is that it is trivial to bypass that first server. I can write a simple send-only mail client in about 100 lines of C code. Many viruses, worms, etc. have their own little mail client, and they look up the mail server of the recipient and connect directly to that server. This is an unauthenticated connection.
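
To show why such a client can be so small, here is a Python sketch of the basic SMTP command sequence. It only builds the lines as strings and sends nothing. The function name `smtp_commands` and every address are hypothetical, and a real client would also look up the recipient's mail server, wait for the server's reply after each command, and escape lines in the message that start with a dot.

```python
def smtp_commands(envelope_from, envelope_to, message_text, helo_name="client.example"):
    """Return, in order, the lines a bare-bones SMTP client would send."""
    return [
        f"HELO {helo_name}",
        f"MAIL FROM:<{envelope_from}>",  # whatever the client claims
        f"RCPT TO:<{envelope_to}>",      # the one address that must be real
        "DATA",
        message_text + "\r\n.",          # headers, blank line, body, then a lone dot
        "QUIT",
    ]

# Hypothetical message: the From: header need not match MAIL FROM.
message = "\r\n".join([
    "From: someone.else@example.org",
    "To: recipient@example.net",
    "Subject: Hello",
    "",
    "Body text goes here.",
])

for line in smtp_commands("made.up@example.com", "recipient@example.net", message):
    print(line)
```

Nothing in the sequence asks the client to prove who it is, which is the unauthenticated connection described above.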

There are a few fixes for this. First, an ISP can block outgoing mail connections except those going through one of its approved servers. If you try to make a direct mail connection to an outside mail server, the ISP’s routers will just drop it on the floor. Many large ISPs already do this. It isn’t perfect, because you can get around this by setting up proxies around the internet on computers that aren’t restricted on outgoing mail connections and bouncing your stuff off these proxies.

Second, there are a few proposals to do rudimentary (or better) authentication on those server-to-server hops. The simplest of these looks at the network the message came from and looks up which IP addresses that network claims are legitimate mail servers. Again, you could get around this by using proxies to bounce stuff off insecure networks.
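
The core of that simplest check can be sketched in a few lines of Python. The dictionary `PUBLISHED_SENDERS` is a hypothetical stand-in for whatever list the sending network publishes (the proposals put it in DNS), and the function name and addresses are invented. This is not any particular proposal's actual record format.

```python
import ipaddress

# Hypothetical published lists: domain -> networks it says may send its mail.
PUBLISHED_SENDERS = {
    "example.com": ["192.0.2.0/28", "198.51.100.25/32"],
}

def sender_ip_allowed(claimed_domain, connecting_ip, published=PUBLISHED_SENDERS):
    """True if connecting_ip is inside a network the domain lists as a mail sender."""
    ip = ipaddress.ip_address(connecting_ip)
    networks = published.get(claimed_domain.lower(), [])
    return any(ip in ipaddress.ip_network(net) for net in networks)

# A listed server passes; an arbitrary machine claiming the same domain does not.
print(sender_ip_allowed("example.com", "192.0.2.5"))
print(sender_ip_allowed("example.com", "203.0.113.77"))
```

The receiving server already knows the connecting IP for certain, so the check compares the one unforgeable fact against the sender's own published list. A domain that publishes nothing simply fails here; a real deployment would have to decide how to treat that case.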