From Fedora Project Wiki

Revision as of 14:13, 24 May 2008 by fp-wiki>ImportUser (Imported from MoinMoin)

HTML Encoding

Spam bots, also called email harvesters, are always crawling the Internet. They browse every page they can reach, grab every email address they find, and add those addresses to a list or database that spammers then use. There are many techniques to protect your address. The most common is to write it in an obfuscated form, such as myname SPAMFREE AT domain DOT com. Smarter bots are now learning to read even these kinds of addresses. Addresses in this format must also be reconstructed by a human reader and cannot be used in mailto links.

There is another, less common way to protect your address, and few (if any) spam bots are programmed to detect it. You can encode your email address and mailto link using HTML entities. You can also combine this with other techniques. When you encode the address using HTML entities, it appears normally in a browser (and therefore to a human), but a bot reading the page source sees only a series of seemingly meaningless character codes. For example, when you encode the project name:

Fedora

...into HTML entities, you get:

&#70;&#101;&#100;&#111;&#114;&#97;

...which continues to appear as:

Fedora

(This is a live example!)

There are several ways to encode your address like this. I have provided an online tool to do this quickly and easily at:

http://n-man.com/htmlencode.html
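
If you would rather not depend on an online tool, the encoding itself can be sketched in a few lines of Python. This is only an illustration, not the source of the encoder linked above; the function name `encode_entities` is made up for this example, and it assumes that every character is written as a decimal character reference.

```python
def encode_entities(text):
    """Return text with every character written as a decimal HTML entity.

    Hypothetical helper for illustration; not part of any library.
    """
    return "".join(f"&#{ord(ch)};" for ch in text)


print(encode_entities("Fedora"))
# prints: &#70;&#101;&#100;&#111;&#114;&#97;
```

A browser turns each `&#NNN;` reference back into the character with that code, which is why the encoded text still displays as Fedora.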

You may also want to know how to use this technique to make a full mailto link in MoinMoin. It is very easy to do, but the HTML parser must be installed for it to work. This wiki already includes that parser.

1. Get the converted form of your email address.

You can use the link I provided for an easy-to-use online tool.

2. Start your page and add the link.

{{{#!html
 <a href="&#109;&#97;&#105;&#108;&#116;&#111;&#58;[your address]">[your address]</a>
}}}

Replace [your address] with the encoded form of your email address.

Save your page and you're done!
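
The two steps above can also be sketched as a small Python script that produces the whole block ready to paste into a page. This is a minimal sketch under stated assumptions: `encode_entities`, `mailto_link` and `moin_html_block` are hypothetical helper names, and `user@example.com` is a placeholder address.

```python
def encode_entities(text):
    """Write every character as a decimal HTML entity (hypothetical helper)."""
    return "".join(f"&#{ord(ch)};" for ch in text)


def mailto_link(address):
    """Build an anchor tag whose href and link text are both entity-encoded."""
    scheme = encode_entities("mailto:")  # the same prefix used in the example above
    encoded = encode_entities(address)
    return f'<a href="{scheme}{encoded}">{encoded}</a>'


def moin_html_block(address):
    """Wrap the link in the MoinMoin HTML parser block shown in step 2."""
    return "{{{#!html\n" + mailto_link(address) + "\n}}}"


# Placeholder address; substitute your own.
print(moin_html_block("user@example.com"))
```

Encoding the link text as well as the href keeps the plain address out of the page source entirely.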

Advantages Over the MailTo Macro

This method has several advantages over MoinMoin's MailTo macro. As email harvesters are developed further, they are coded to work around the more popular techniques. The most basic use of this technique will probably be no exception, and bots will most likely be developed to work around it, too. It can, however, be combined with other techniques to extend its useful life.

The macro garbles your email address for users who aren't logged in, replacing some of the symbols and punctuation with capitalized words. Bots have been developed that can work around this, which removes its protective value. It also forces any legitimate user who isn't logged in to reconstruct your email address before sending you a message.

The HTML encoding method does not share all of these pitfalls. Few bots, if any, have been created that work around the HTML encoding method. This method is also not specific to MoinMoin and can be used on other sites. It allows the creation of normal mailto links that users can click to send you a message. They can also highlight, copy and paste your email address normally.
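
To make the reasoning concrete, here is a rough Python sketch of how a simple harvester might behave. It is an assumption for illustration, not a description of any real bot: the regular expression, the `harvest` function and the sample address are all made up. The pattern finds the address in a plain link, finds nothing in the encoded link, and finds it again once the entities are decoded, which is why a bot that decodes entities would defeat the basic technique.

```python
import html
import re

# A deliberately simple pattern, standing in for what a naive harvester might use.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def encode_entities(text):
    """Write every character as a decimal HTML entity (hypothetical helper)."""
    return "".join(f"&#{ord(ch)};" for ch in text)


def harvest(page_source):
    """Return every substring of the raw page source that looks like an address."""
    return EMAIL_RE.findall(page_source)


address = "user@example.com"  # placeholder address
plain_link = f'<a href="mailto:{address}">{address}</a>'
href = encode_entities("mailto:" + address)
text = encode_entities(address)
encoded_link = f'<a href="{href}">{text}</a>'

print(harvest(plain_link))                   # the plain link exposes the address
print(harvest(encoded_link))                 # the encoded link contains no literal "@"
print(harvest(html.unescape(encoded_link)))  # a bot that decodes entities sees it again
```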

Download My Encoder

I have provided the source code for my encoder page. It is written in PHP. I have also provided a GPG signature. Key ID: 0x299407D8

License: Public Domain

This script works on PHP 4 or 5 with register_globals off and produces HTML 4.01 Strict output. It could use some improvement. Maybe someday I'll rewrite it in Python. :-)