Why HTML is Inappropriate for E-Mail
Here’s the overview:
- There was already a better way
- It reduces interoperability
- You need MIME, anyway
- It opens security holes
Now, I’ll explain in detail.
A little history
Netscape, once upon a time, had a web browser. To make their offering a more complete Internet client, they decided to add email support. This made some sense; they already had the most popular client for the Internet technology that was bringing people to the ’net in bigger numbers than anyone had previously imagined, and e-mail was the undisputed Internet "Killer App." Why not offer both, and take over the world?
Well, that was probably the plan, anyway.
Now, consider for a moment where e-mail was at the time. Most people used text-based mail readers, while a growing number of people were using graphical mail readers. The graphical readers (and many of the text-based ones) supported styled text, attachments, and a whole host of modern features. They did this using such well-designed and extensible standards as MIME (Multipurpose Internet Mail Extensions) and Enriched text.
Now, return to Netscape. They saw that they had a fairly extensive code base implementing their web browser, and they wanted to do as little work as possible to implement a mail client. They decided that they could get style and embedded images (one of the countless things that MIME messages can embed) if they used HTML as their "rich" email format, leveraging their existing code base. And it was true; they could.
Doing that, of course, was the Wrong Thing (tm), for all the reasons that we’re going to discuss in this essay. Suffice it to say, for the moment, that what they did was good for them in the short term, but it was lazy, it departed from working and well-accepted standards, and it caused a whole host of problems.
Now, the rest of us are stuck fighting our way back out of the hole that Netscape has dug us into.
Why standards are good
To be filled in...
Refer to the RFC repository at http://www.rfc-editor.org/
There was already a better way
The reason that most people think they want HTML email is to enable styled text. They want to be able to make the text larger, or make some text bold, or change the font style or color.
All of those features are already present in a more appropriate format, called Enriched text (MIME type "text/enriched"). I hope you will see in a moment why this format is more appropriate.
HTML as a text format does not allow the flexibility that enriched text does, while it adds a number of features that are entirely inappropriate for email.
Remember, email is not hypertext. HTML is the HyperText Markup Language. Email is a message-based communication medium. It doesn’t make sense to use a language that’s intended for marking up hypertext to apply style to email messages. That’s what the Enriched text content-type is for. HTML gives you only some of the features that you want for styled text, and it brings along with it a number of features that are wholly inappropriate for email.
It reduces interoperability
Enriched text was created to be easily parsable with even pure text-based mail readers. HTML mail, on the other hand, requires a complete HTML rendering engine in your mail client to be able to make sense of it. Unless you happen to be using Netscape as your mail client, you probably don’t care to have all that baggage dragged into your mail client, making it bigger, slower, and more prone to serious bugs. And many people don’t even have the option.
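To give a sense of how little machinery enriched text needs, here is a minimal sketch in Python that reduces a text/enriched body to plain text. It assumes only the basic rules of RFC 1896: formatting commands look like <bold> and </bold>, a literal "<" is written as "<<", and a lone line break is just a space. The function name enriched_to_plain is my own invention, and a real mail reader would handle more cases (<nofill>, for one).

```python
import re


def enriched_to_plain(body: str) -> str:
    """Reduce a text/enriched body to plain text (a sketch, not a full parser)."""
    # "<<" is the escape for a literal "<"; park it behind a placeholder first.
    body = body.replace("<<", "\x00")
    # Drop <param>...</param> blocks, which carry arguments such as a color name.
    body = re.sub(r"<param>.*?</param>", "", body, flags=re.IGNORECASE | re.DOTALL)
    # Drop the remaining formatting commands, e.g. <bold> and </bold>.
    body = re.sub(r"</?[A-Za-z0-9-]+>", "", body)

    # One line break is a soft wrap (a space); N in a row mean N-1 real breaks.
    def fold(match):
        count = len(match.group())
        return " " if count == 1 else "\n" * (count - 1)

    body = re.sub(r"\n+", fold, body)
    # Restore the escaped "<" characters.
    return body.replace("\x00", "<")


print(enriched_to_plain("<bold>Now</bold> is the time for\n<italic>all</italic> good men."))
# prints: Now is the time for all good men.
```

That is a handful of string substitutions, which any text-based reader can afford. Nothing comparable exists for HTML short of a real parser.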
Then there’s the issue of attachments. While attachments are a simple matter using MIME, there’s no mechanism to attach a file to an HTML mail message, other than the image links I mentioned above. It turns out that, even if you use HTML, you will need to use MIME (remember that?) to attach anything. Interestingly, you will also end up using MIME if you want to have more than one version of the text of the message. For instance, many HTML mail clients format messages as both HTML and text/enriched, so that they’ll be viewable in most mail readers. Text-only users still have to deal with the ugly HTML markup in the HTML section, but they can read the enriched part.
So, at this point, you have to ask: if we need to use MIME to do what we really want to do with mail messages, and if we need to use text/enriched to interoperate with other users, and if those formats allow us to do everything we want to do with mail, then why would we use HTML in email messages at all?
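As a rough illustration of MIME doing that work, the sketch below uses Python’s standard email package to build one message with a plain-text body, a text/enriched alternative, and an attached file, with no HTML anywhere. The addresses, subject, file name, and the build_message function are all made up for the example.

```python
from email.message import EmailMessage


def build_message() -> EmailMessage:
    """Build a MIME message with two versions of the body plus one attachment."""
    msg = EmailMessage()
    msg["From"] = "sender@example.com"        # hypothetical address
    msg["To"] = "recipient@example.com"       # hypothetical address
    msg["Subject"] = "Quarterly report"

    # First version of the body: text/plain, readable everywhere.
    msg.set_content("The report is attached.\n")
    # Second version: text/enriched, for readers that can show styled text.
    msg.add_alternative("The report is <bold>attached</bold>.\n", subtype="enriched")
    # The attachment itself; the bytes stand in for a real file's contents.
    msg.add_attachment(
        b"region,total\nnorth,42\n",
        maintype="application",
        subtype="octet-stream",
        filename="report.csv",
    )
    return msg


for part in build_message().walk():
    print(part.get_content_type())
```

Walking the parts shows the nesting: a multipart/mixed container holding a multipart/alternative pair (text/plain and text/enriched) followed by the attachment.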
The answer is, of course, that if we thought about it at all, we wouldn’t.
It opens security holes
It’s interesting to note that spammers (see spam) are particularly fond of HTML mail. Why would this be?
A spammer is in the business of sending email messages to people who don’t want to receive them. So, even if you are very careful to protect your email address, not using it in public places, not posting it on web sites, and so on, a spammer might guess your email address. The spammer can send you an HTML message that references an image on their own server; if you read it, your mail client fetches that image, and then they know that your email address is a real, valid email address, and that you read the spam that’s sent to you. Your address will go immediately onto the "A" list -- the list of addresses to which it would be particularly advantageous to send spam.
But it’s worse than that. If someone cared to, they could send you an HTML message and then be able to tell what computer, running what operating system, on what network, connected via which ISP, you used to read the message. This may be more information than you care to give out to anyone who can send you email -- which is to say, anyone in the world.
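One way a message reports back is through content that the mail client fetches from the sender’s server while displaying it. The sketch below, using Python’s standard html.parser module, lists the remote URLs that an HTML body would cause a client to fetch. The class and function names, the tracker.example host, and the choice to check only the src and background attributes are my own assumptions for illustration; a thorough check would cover more than that.

```python
from html.parser import HTMLParser


class RemoteReferenceFinder(HTMLParser):
    """Collect http(s) URLs that a client would fetch just to display a message."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            # Only src= and background= are checked here; this is not exhaustive.
            if name in ("src", "background") and value:
                if value.lower().startswith(("http://", "https://")):
                    self.urls.append(value)


def remote_references(html_body: str) -> list:
    """Return the remote URLs referenced by an HTML message body."""
    finder = RemoteReferenceFinder()
    finder.feed(html_body)
    finder.close()
    return finder.urls


# A hypothetical message body with a one-pixel image tagged with a recipient ID.
sample = '<p>Hello!</p><img src="http://tracker.example/open.gif?id=abc123" width="1" height="1">'
print(remote_references(sample))
```

Each fetch of such a URL tells the server which recipient opened the message, and the request itself carries the client’s network address and, typically, a string identifying the software that made it.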
(To be fair, several recent viruses have relied on really brain-dead behavior in Microsoft email clients, which are perfectly happy to run any program that an attacker sends to a victim in email. HTML isn’t strictly necessary to exploit these Microsoft bugs.)
Worse yet, at least the AOL client software (and perhaps others) doesn’t even allow users to turn off HTML in the messages they send! It’s just not possible anymore for an AOL user to do the right thing!
HTML in email is neither necessary nor useful. It doesn’t do quite what you want in email messages (remember: it was designed for a wholly different purpose), and there are purpose-built technologies that do a much better job. It opens you up to severe security problems.
HTML in email was a quick kluge by Netscape, meant not to produce an innovative solution to any need that existed, but rather to get a bloated, bug-ridden product out the door a little faster. Is this really something that you want to buy into?
- Geoff Adams
18 Apr 2002