Coding Corporate Profiles without fear of SPAM

I’ve been asked to add a few bios for new employees at my company to our corporate website. I noticed that the current template for employees has the full email address in both the link and the text,

i.e. <a href=mailto:jsmith@abc.com>jsmith@abc.com</a>

I’m not a real web designer or programmer, but this seems like poor form to me because the addresses will be harvested in no time at all and bombarded with spam. I know on some personal sites people add text strings like REMOVETHIS into the link, but that is clearly not practical on a professional site. So what is the best alternative? Are there redaction PHP scripts out there that can hide the address while still permitting real people to get through? Should we go all out and get a CAPTCHA in place to protect the addresses? I assume spam bots don’t respect “norobots” files?

Your alternatives are:

  • Don’t publish their email address at all (but interested customers will have a harder time getting in contact with your employees).
    or
  • Publish their email address, and accept that SPAMmers will try to send them stuff (your corporate email server does have anti-spam protection, doesn’t it?).

You have to decide which of those alternatives is “best”.

Neither one of those is acceptable.

hmm?

Are there spam bots with sufficiently intelligent semantic algorithms to know which images to use OCR upon? Or would just making JPGs of the e-mail addies be sufficient means of stopping harvesters in their tracks?

As far as I know, you’d be fine to have a small JavaScript snippet in there that takes a decomposed version of the address and puts it back together, like so:


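// Keep the address in two pieces so it never appears whole in the HTML source.
var part1 = "jsmith";   // local part
var part2 = "abc.com";  // domain
var address = part1 + "@" + part2;
document.write("<a href=\"mailto:" + address + "\">" + address + "</a>");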

This assumes that the spammers’ crawler software just downloads HTML files and parses them without running the JavaScript, which the visitor’s browser executes client-side. This seems like a valid assumption to me and should work out.
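If you have more than a couple of profiles, the same reassembly trick can be wrapped in a small helper. This is only a rough sketch, not a tested recipe: the function name buildMailtoLink, the "email" class and the data-user / data-domain attributes are all names I made up for illustration, and the script has to run after the placeholder spans exist in the page.

```javascript
// Hypothetical helper: joins the two halves of an address at run time,
// so the complete address never appears in the static HTML source.
function buildMailtoLink(user, domain) {
  var address = user + "@" + domain;
  return '<a href="mailto:' + address + '">' + address + "</a>";
}

// Browser-only part: fill every placeholder such as
// <span class="email" data-user="jsmith" data-domain="abc.com"></span>
// (the class and data-* attribute names are assumptions for this sketch).
if (typeof document !== "undefined") {
  var placeholders = document.querySelectorAll("span.email");
  for (var i = 0; i < placeholders.length; i++) {
    var el = placeholders[i];
    el.innerHTML = buildMailtoLink(
      el.getAttribute("data-user"),
      el.getAttribute("data-domain")
    );
  }
}
```

Like the inline version, this only helps against harvesters that don’t execute scripts, and visitors with JavaScript switched off will see an empty span.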

That would likely work, though clicking on the address wouldn’t open the visitor’s mail program. You could also completely remove the email address but put in a “contact me” type link, which links to a form on your website, and then the website would send the actual email. Of course, bots can submit web forms…
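For the “contact me” form route, the point is that the page only carries an employee id and the real address stays on the server. Here is a rough sketch of that lookup step, assuming a server-side JavaScript environment; DIRECTORY, buildContactMessage and the form field names (employeeId, senderEmail, message) are hypothetical, and actually sending the mail is left to whatever mail setup your site already has.

```javascript
// Hypothetical server-side directory: the public page only ever sees the keys.
var DIRECTORY = {
  jsmith: "jsmith@abc.com"
};

// Turns a submitted contact form into a message description.
// Returns null when the id is unknown, so the form cannot be used
// to send mail to arbitrary addresses.
function buildContactMessage(form) {
  if (!Object.prototype.hasOwnProperty.call(DIRECTORY, form.employeeId)) {
    return null;
  }
  return {
    to: DIRECTORY[form.employeeId],
    replyTo: form.senderEmail,
    subject: "Website contact form",
    text: String(form.message)
  };
}
```

It does nothing about bots filling in the form itself; it only keeps the addresses out of the page source.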

That would probably work unless your site is such a rich mine of information that it’s worth someone’s while to customise a bot to work with it (a site with tens of thousands of addresses presented as images, and new ones being added daily, would be tempting bait - a site with two dozen that seldom change isn’t worth the effort).

But doing it this way makes it less convenient for the people trying to contact you - they can’t click on the email, because it’s not a link; they can’t copy and paste the address, because it’s not text. Might just about work if your addresses are really simple (john@foo.com), but if the addresses are at all complex, user frustration will translate into some people going elsewhere - making potential customers jump through hoops before they’ve even met you might not be a good idea.

I’ve had luck finding methods of doing this by Googling “obfuscate email”.

No, I’m not saying “just Google it.” The search phrase itself isn’t obvious, so I thought I’d mention what works for me.