Protecting your email address: how to prevent spam

As a website operator you’re advised to include your e-mail address on your website so that you can easily be contacted by visitors. The problem with including your e-mail address is that you could find yourself inundated with spam. This isn’t just annoying, it could also mean you’re at risk from phishing e-mails as well as malicious programmes that cyber criminals hide in attachments. Necessity being the mother of invention, website operators try everything to stop spambots from gaining access to their e-mail addresses. We compare the most popular methods and explain their advantages and disadvantages.

E-mail harvesting: how spambots stalk their prey

E-mail harvesting is the automated collection of e-mail addresses for unsolicited advertising, phishing attacks, or spreading malicious software. For this purpose, specialised programmes (known as ‘e-mail harvesters’) search websites, mailing lists, internet forums, or social media platforms for e-mail addresses. The characteristic syntax shared by all e-mail addresses gives the contact information away. Simple search patterns scan a website’s source text for @ signs. This sign rarely occurs in natural text, but is used in e-mail addresses to separate the username from the domain. Transcribing the address provides little protection: more refined spambots also search for popular alternative spellings such as [at], [AT], (at), (AT):

User@domain.com

User[at]domain.com

If the @ sign or its equivalent is followed by two character strings separated by a dot, this is a clear indication for the harvester that it has found an e-mail address. Even rewriting the dot in front of the top-level domain offers comparatively little protection and makes the address harder to read:

User[AT]domain[DOT]com

Even more revealing than the @ sign, is the HTML email link 'mailto:user@domain.com'. This allows website visitors to open their preferred e-mail programme with a simple click. The recipient address is automatically copied to the corresponding field. This is practical, but still doesn’t stop the spambots from realising this is an e-mail address. Website operators are therefore advised not to use traditional patterns when it comes to providing contact options. At the same time, human site visitors should still be able to read the address so they can easily contact you if they need to.
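The search patterns described above can be sketched in a few lines of JavaScript. This is a minimal illustration, not the code of any real harvester: the function name harvest and the exact regular expressions are assumptions chosen for this example. It shows why spellings such as [at] and [DOT] offer little protection.

```javascript
// Hypothetical harvester sketch: undo common transcriptions, then match.
function harvest(source) {
  const normalised = source
    // "[at]", "(AT)", " [ at ] " and similar become "@"
    .replace(/\s*[\[(]\s*at\s*[\])]\s*/gi, "@")
    // "[dot]", "(DOT)" and similar become "."
    .replace(/\s*[\[(]\s*dot\s*[\])]\s*/gi, ".");

  // Simplified address pattern: local part, "@", domain with at least one dot.
  const pattern = /[a-z0-9._%+-]+@[a-z0-9-]+(?:\.[a-z0-9-]+)+/gi;

  // Remove duplicates, e.g. an address that appears in both href and link text.
  return [...new Set(normalised.match(pattern) || [])];
}
```

Calling harvest on the text User[AT]domain[DOT]com or on a plain mailto link returns the address in both cases, which is why the following sections rely on other techniques.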

Classic representation of the e-mail address without protection

To protect an e-mail address as well as possible from being read automatically by e-mail harvesters, it helps to look at how it is usually integrated into a web page. A simple, easily accessible e-mail address can be inserted into any HTML page using the following code example:

<p>If you have any questions or suggestions, please write an e-mail to: 
  <a href="mailto:user@domain.com">user@domain.com</a>.
</p>

When a visitor accesses a web page with this code, the web browser displays the following information including a clickable mailto link:

If you have any questions or suggestions, please write an e-mail to: user@domain.com.

From the user’s point of view this is an ideal representation of an online e-mail address. In order to keep the display user-friendly, the most popular method for protecting an e-mail address is to make it unrecognisable in the source text without changing how it looks in the browser. Alternatively, it is possible to separate the e-mail address from the actual website and forward the user to the mailto link with a server-side redirect. Rewriting the e-mail address in the browser view, however, is becoming less and less common. The reason for this is that it doesn’t look as good for the user and isn’t that much more effective at preventing spam.

Effective tricks that can be used to protect you against spam rely on substitutions, masks, or encryptions in the source text, which hinder the spambot and not the user.

Substituting the e-mail address

Protection strategies based on substitution involve removing the entire e-mail address from the source text and replacing it with a graphical representation or a referral to the mailto link.

Integrate e-mail address as a graphic

If an e-mail address is implemented as a graphic, it can still be read by the human eye, but texts written as graphics are hard for e-mail harvesters to recognise. Occasionally there are spambots that are able to convert image elements into text elements using OCR (Optical Character Recognition), but this isn’t common. Including corresponding contact information as a graphic therefore offers a comparatively high protection against spam. However, website operators must realise that these can limit the user-friendliness of their website. The following HTML code shows how an e-mail address can be integrated into a website as a graphic file:

<img src="Path/graphicfile.png" width="120" height="20" alt="If you have any questions or suggestions, please write an e-mail to: user@domain.com">

Website visitors then see the e-mail address as an image.

This e-mail address is legible for most people. The text can neither be copied nor linked to a mailto link. While manually typing the address is merely inconvenient for most users, text in the form of a graphic is often not accessible to users with visual impairments. Therefore, it makes sense to include a description of the graphic as alt text. This can be read out by screen readers, but the downside is that spambots can read it as well, so this method alone is not recommended as a preventative measure against spam.

HTML e-mail link via redirect

In order to effectively protect e-mail addresses from harvesters, it’s a good idea to separate them from the website. A script is generally used, which redirects human users to the mailto link after the first click. This opens the user’s e-mail programme and displays the address. For spambots that scan the source code of a website, this link will look like a file link and therefore prevents automatic reading. This protection mechanism can, for example, be implemented as a link to a PHP file that contains the redirect:

<p> If you have any questions or suggestions, please write us an
  <a href="redirect-mailto.php">e-mail</a>.
</p>

The content of the redirect-mailto.php file is a script that redirects to the actual mailto link:

<?php
header("Location: mailto:user@domain.com"); 
?>

Since PHP is processed on the server side, spambots that read a website’s source code have no chance of getting to the e-mail address. If it is necessary for the e-mail address to be displayed on the website, it’s recommended for you to combine this method with graphically integrating the e-mail address.

The disadvantage of this spam prevention solution is that users need a handler for mailto: to get to the e-mail address. In practice, this is usually an e-mail programme such as Outlook or Thunderbird. However, webmail services can also be registered as handlers in modern browsers.
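For sites that run on JavaScript rather than PHP, the same redirect can be sketched as a request handler. This is only an illustration under assumptions: the function name contactRedirect and the path /contact are hypothetical, and the handler merely follows the (request, response) shape used by Node.js HTTP servers.

```javascript
// Hypothetical server-side handler: the address exists only on the server
// and is sent in the Location header, never in the page's HTML source.
function contactRedirect(req, res) {
  if (req.url === "/contact") {
    // 302 redirect to the mailto link, mirroring the PHP header() call.
    res.writeHead(302, { Location: "mailto:user@domain.com" });
    res.end();
  } else {
    res.writeHead(404, { "Content-Type": "text/plain" });
    res.end("Not found");
  }
}
```

In Node.js such a handler could be passed to http.createServer, and the link in the page would then point to /contact instead of redirect-mailto.php.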

Masking the e-mail address

If you don’t want to completely replace an e-mail address with a graphic or a redirect to the mailto link, there are alternative strategies. They make it possible to encode an e-mail address, to mask it with additional elements, or to assemble it in the browser using JavaScript. Simple encoding can be implemented with HTML entities, for example, as well as with URL or HEX encoding. Simple masking strategies rely on the comments feature, HTML elements, and CSS. Dynamically composing the address is a bit more complex.

What these methods have in common is that they only manipulate the address in the source code and do not affect how it is displayed in the browser.

Masking by character encoding

The character encodings commonly used to mask e-mail addresses in the source code are based on HTML entities, HEX code, or the percent encoding used in URLs. These notations were originally developed to represent special characters using standard characters. They are suitable for masking e-mail addresses because the browser automatically translates the respective character references for display. If characters of the e-mail address user@domain.com are masked using HTML entities, they are first written in the alternative notation.

&commat; = @

&period; = . (dot)

This results in the following source code:

<p> If you have any questions or suggestions, please write an e-mail to: 
  <a href="mailto:user&commat;domain&period;com"> user&commat;domain&period;com</a>
</p>

Since named HTML entities have only been defined for special characters, neither the entire e-mail address nor the telltale text string mailto: can be masked with this character encoding. Alternatively, a representation using numeric character references (HEX encoding) is possible. These use the Unicode number of the character and follow this basic schema:

&#characternumber;

In hexadecimal notation, the character number is preceded by a lowercase 'x'. Thus the letter 'm' can be written as '&#x6d;' in hexadecimal or as '&#109;' in decimal. The e-mail address user@domain.com including the mailto link will look like this:

<p>If you have any questions or suggestions, please write an 
<a href="&#x6d;&#x61;&#x69;&#x6c;&#x74;&#x6f;&#x3a;&#x75;&#x73;&#x65;&#x72;&#x40;&#x64;&#x6f;&#x6d;&#x61;&#x69;&#x6e;&#x2e;&#x63;&#x6f;&#x6d;">e-mail</a>.
</p>

The corresponding reference characters for translating an email address can be easily found from lists available online. A clear overview is provided on htmlarrows.com. If you want to encode the complete e-mail address, we recommend encoding programmes that are offered free of charge as web applications on numerous websites.
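Instead of an online tool, the conversion into numeric character references can also be done with a short script. The following sketch is one possible way to do it; the function names toHexEntities and toDecimalEntities are made up for this example.

```javascript
// Encode every character as a hexadecimal character reference (&#x…;).
function toHexEntities(text) {
  return Array.from(text, function (ch) {
    return "&#x" + ch.codePointAt(0).toString(16) + ";";
  }).join("");
}

// Same idea with decimal character references (&#…;).
function toDecimalEntities(text) {
  return Array.from(text, function (ch) {
    return "&#" + ch.codePointAt(0) + ";";
  }).join("");
}
```

The result of toHexEntities("mailto:user@domain.com") can be pasted directly into the href attribute of a link.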

Another way to protect e-mail addresses from spam is to use URL encoding. This method was originally developed to represent special characters in a URL in a form that the browser can interpret. It uses three-character combinations consisting of a percent sign followed by the two-digit hexadecimal ASCII code of the respective character. The following example shows an @ sign being masked by URL encoding:

<p>If you have any questions or suggestions, please write an
  <a href="mailto:user%40domain.com">e-mail</a>.
</p>

In principle, masking the e-mail address by character encoding is quick and easy. However, the protection it offers is now comparatively low, since most spambots are programmed to decode this simple form of masking.
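How little effort the decoding takes can be illustrated with a short sketch. The helper percentEncodeAscii is a hypothetical name and assumes plain ASCII input; decodeURIComponent is a standard JavaScript function.

```javascript
// Percent-encode every character of an ASCII string, e.g. "@" becomes "%40".
function percentEncodeAscii(text) {
  return Array.from(text, function (ch) {
    return "%" + ch.charCodeAt(0).toString(16).toUpperCase().padStart(2, "0");
  }).join("");
}

const masked = percentEncodeAscii("user@domain.com");

// A single standard call is enough to undo the masking.
const unmasked = decodeURIComponent(masked);
```

Any harvester that applies such a decoding step before searching for addresses sees the original text again.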

Masking by supplementing

It is also possible to hide e-mail addresses from spambots by inserting additional characters into them. The hope is that programmes will no longer recognise the address as a whole and therefore cannot read it automatically. HTML comments provide a simple way to do this.

<!-- Comment -->

Ideally, these include just the characters that are normally used in e-mail addresses.

<!-- abc@def -->

<!-- @abc.com -->

If comments like these are added into the e-mail address, spambots (who scan the website) will stumble across the following code:

<p>If you have any questions or suggestions, please write an e-mail to:
us<!-- abc@def -->er@domai<!-- @abc.com -->n.com. 
</p>

In the browser view, however, the HTML comments are invisible.

Alternatively, it is possible to insert any characters without comments, as long as they are hidden in the browser view using CSS. In the following example, the e-mail address is interrupted by a span element. The content between the start and the end tag is not displayed because the display property is set to the value none.

<style type="text/css">
span.spamprotection {display:none;}
</style>

<p>If you have any questions or suggestions, please write an e-mail to:
user<span class="spamprotection">CHARACTER SEQUENCE</span>@domain.com. 
</p>

While a human user sees the correct e-mail address in the web browser, a spambot is expected to read the hidden text in the span element as part of the address. This gives website operators the option to use the e-mail address userCHARACTERSEQUENCE@domain.com as a so-called honeypot in order to identify the sender addresses of spam and block them.

A disadvantage of masking by supplementing is that with this method the e-mail address can’t be connected with an HTML e-mail link. In this case, users must manually copy the address into their e-mail programme.
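What a spambot might make of both variants can be sketched as follows. The functions stripComments and stripTags are hypothetical and only assume that a harvester pre-processes the source with simple regular expressions.

```javascript
// Remove HTML comments, as a harvester might do before searching.
function stripComments(html) {
  return html.replace(/<!--[\s\S]*?-->/g, "");
}

// Remove tags but keep their text content, ignoring any CSS.
function stripTags(html) {
  return html.replace(/<[^>]*>/g, "");
}

// Comment variant: the original address is recovered.
const fromComments = stripComments("us<!-- abc@def -->er@domai<!-- @abc.com -->n.com");

// Hidden-span variant: the bot ends up with the honeypot address instead.
const fromSpan = stripTags('user<span class="spamprotection">TRAP</span>@domain.com');
```

Under these assumptions, comments alone are easy to defeat, whereas the hidden span leads a bot that ignores CSS to an address real visitors never see.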

Reversing a string

CSS can be used not only to hide additional characters in the source code, but also to reverse the string. This enables website operators to store e-mail addresses in the wrong order in the source code in order to deceive spambots.

<style type="text/css">
span.ltrText {unicode-bidi: bidi-override; direction: rtl}
</style>
<p>If you have any questions or suggestions, please write an e-mail to:
<span class="ltrText"> moc.niamod@resu</span>.
</p>

While spambots find the character string moc.niamod@resu in the source code, the CSS property unicode-bidi (with the value bidi-override) ensures that all characters within the correspondingly marked span elements are displayed by the browser in the order specified by the direction property, in this case from right to left (rtl).

This masking means that the e-mail address does not appear in its usual form in the source code. However, more advanced spambots are not deceived by this trick.
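The reversed string for the source code does not have to be typed by hand. A minimal sketch, with the hypothetical function name reverseString:

```javascript
// Reverse a string character by character for use with direction: rtl.
function reverseString(text) {
  return Array.from(text).reverse().join("");
}

const reversed = reverseString("user@domain.com");
```

Applying the function twice returns the original address, which also shows how easily a spambot could undo the reversal.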

Dynamic composition with JavaScript

JavaScript offers another way to make sure the correct e-mail address is displayed in the browser. The address is divided into several parts that are dynamically composed by the browser when the website is called up.

<script type="text/javascript">
var part1 = "user";
var part2 = Math.pow(2,6);
var part3 = String.fromCharCode(part2);
var part4 = "domain.com";
var part5 = part1 + String.fromCharCode(part2) + part4;
document.write("If you have any questions or suggestions, please write an e-mail to: " +
   "<a href=\"" + "mai" + "lto" + ":" + part5 + "\">" + part1 + part3 + part4 + "</a>.");
</script>

In lines 2 to 6, the individual sections of the e-mail address are defined. The @ sign is defined in two steps. The Math.pow(2,6) function in part2 calculates the number of the character in ASCII-compatible character sets (2^6 = 64). This is converted to the corresponding character in part3 by the function String.fromCharCode(part2). The parts defined in part1 to part5 are output in lines 7 and 8 by the document.write() function. The e-mail address becomes available only after client-side execution of the script. It’s also possible to have a variant where the script is only started once the user has clicked.

Anti-spam methods that use scripts for dynamic composition are based on the assumption that e-mail harvesters can’t fully execute JavaScript. Where this is the case, the method offers a high level of protection. The disadvantage of this method is that users who have deactivated JavaScript in their browser are not shown the contact information. This doesn’t affect many users today though.
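The click-triggered variant mentioned above could look roughly like the following sketch. The function composeMailto and the element id contact-link are assumptions made for this example and do not appear elsewhere on the page.

```javascript
// Assemble the mailto link from parts; the @ sign never appears literally.
function composeMailto(user, domain) {
  var at = String.fromCharCode(Math.pow(2, 6)); // character number 64 is "@"
  return "mai" + "lto" + ":" + user + at + domain;
}

// In a browser, compose the address only when the visitor clicks the link.
// "contact-link" is a hypothetical id of an <a> element in the page.
if (typeof document !== "undefined") {
  var link = document.getElementById("contact-link");
  if (link) {
    link.addEventListener("click", function () {
      link.setAttribute("href", composeMailto("user", "domain.com"));
    }, { once: true });
  }
}
```

Until the first click, neither the source code nor the rendered page contains the complete address.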

Encrypting the e-mail address

With JavaScript, e-mail addresses can not only be assembled from individual parts, but the scripting language also enables you to encrypt the e-mail address to protect it from spam. A common method for e-mail encryption is ROT13, which can be implemented with just a few lines of JavaScript.

<script type="text/javascript">
function decode(a) {
  return a.replace(/[a-zA-Z]/g, function(c){
    return String.fromCharCode((c <= "Z" ? 90 : 122) >= (c = c.charCodeAt(0) + 13) 
                               ? c : c - 26);
  })
}; 
function openMailer(element) {
var y = decode("znvygb:hfre@qbznva.pbz");
element.setAttribute("href", y);
element.setAttribute("onclick", "");
element.firstChild.nodeValue = "Open e-mail software";
};
</script>
<a id="email" href=" " onclick='openMailer(this);'>E-mail: please click</a>

Line 9 of the sample code contains the ROT13-encoded version of the e-mail address user@domain.com including the mailto text string (znvygb:hfre@qbznva.pbz). The function in lines 2 to 7 decodes it. The function in lines 8 to 13 writes the decoded address into the link so that the user’s preferred e-mail programme opens with the address in the recipient field.

The script is started by clicking on the link with the anchor text 'E-mail: please click' (line 15). After the click, the link displays the text 'Open e-mail software' (line 12).

Just like the JavaScript-based composition of the email address, the encryption method is based on the assumption that spambots can’t interpret the entire client-side script language or can only partly interpret it. Theoretically, the encrypted email address could be used as a honeypot. In this case the domain should not be encrypted.
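To produce the encoded string for your own address, the same rotation can be applied in the other direction, since ROT13 is its own inverse. The following is a minimal sketch; the function name rot13 is chosen for this example.

```javascript
// Rotate every letter by 13 places; other characters stay unchanged.
function rot13(text) {
  return text.replace(/[a-zA-Z]/g, function (ch) {
    var base = ch <= "Z" ? 65 : 97; // char code of "A" or "a"
    return String.fromCharCode((ch.charCodeAt(0) - base + 13) % 26 + base);
  });
}

var encoded = rot13("mailto:user@domain.com");
```

The value of encoded is the string that would be placed in the source code. Applying rot13 to it again yields the original mailto link.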

CAPTCHAs

CAPTCHAs offer the possibility of protecting an e-mail address from spam. Encrypted e-mail addresses are only displayed in plain text if a check has revealed that the user is human. These checks come in different forms such as asking the user to type a letter or number combination. Easy calculations, combination tasks and puzzles are also options for CAPTCHAs. A free CAPTCHA service is provided by Google with reCAPTCHA.

CAPTCHAs offer a comparatively high level of protection against spam since e-mail addresses are either not displayed at all or only in the encrypted form in the source code. CAPTCHAs can also be easily integrated into a website’s design. However, the additional effort required to get to an e-mail address does have a negative impact on a site’s user friendliness since it hinders the user from easily accessing the contact information.
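The principle of a simple calculation task can be sketched as follows. This is not how reCAPTCHA works; it is a deliberately basic illustration with hypothetical function names, and it assumes the check runs on the server so that the address is never sent to the browser before the task is solved.

```javascript
// Hypothetical server-side helpers for a simple arithmetic check.
function createChallenge(a, b) {
  return { question: "What is " + a + " + " + b + "?", expected: a + b };
}

// Return the address only if the submitted answer matches, otherwise null.
function revealAddress(challenge, answer, address) {
  var given = String(answer).trim();
  return given === String(challenge.expected) ? address : null;
}
```

A task this simple only deters basic bots, which is one reason why established CAPTCHA services are usually preferred.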

Alternative: feedback form?

Instead of posting an e-mail address on their website many website operators provide a feedback form that allows visitors to enter their messages as well as leave their name and contact address. These are redirected in the background to a stored recipient address. Integrating it into the website can be done using server-side programming languages such as PHP. In order to prevent spambots from automatically filling out these forms and sending them, they are usually secured by CAPTCHAs.

Conclusion

The strategy you should use to protect your e-mail address depends primarily on the presentation requirements that need to be met and on which technical possibilities are available. Redirecting to the mailto link with the help of PHP or a similar server-side programming language is a good protection method. However, this must be supported by your hosting provider. If you decide to list your e-mail address on your website, it’s recommended to display the e-mail address as a graphic.

Transcribing the address and encoding it using HTML entities, HEX code, or URL encoding offer less protection in comparison. However, these encodings can serve as a preliminary step for any subsequent encryption. Masking or encrypting via JavaScript provides comparatively reliable protection against spambots, and you could also consider presenting your e-mail address as a graphic. This is definitely a good idea when the address isn’t written into the website itself, but only appears in the mailto handler.