How can I stop Web form spam?

For several weeks now, one of the forms on my Web site has been used by spammers to send me spam. Their spiders fill out the “address” field of the form, which is a large multi-line field, with URLs and other glurge and send it to me.

Here is a recent sample:

AFAICT, they are not hijacking my Web server to send their e-mail to other people, which is one form of Web form spamming I learned about while looking for a solution. They just seem to be sending me their junk. It all seems to be coming from Italy, for some reason.

The problem is that I use this form to take orders for a product I sell, and the spam is now swamping the actual orders by an order of magnitude. I can’t afford to just disregard all replies from this form, or I wouldn’t have created it in the first place. But oddly, the spammers are only using one of several forms on the site (so far), even though the others have similar fields that could be misappropriated to this purpose.

So does anyone have any suggestions for how I can prevent this? I’ve just tried renaming the “Address” field and will see how that works. I may also try creating a new page for this product, and turn the old page that the spammers are using into a dead page that I can safely ignore. But I’d rather find a method that prevents their using the form in the first place, without forcing real customers to jump through hoops like using a CAPTCHA image.

(FYI, the site is hosted on a commercial server, not one in my direct control.)

Any ideas? Thanks.

I don’t have any surefire suggestions…just want to say that the exact same thing (I think with the exact same content) was happening to me. In my case though the form was on an old site that didn’t require the use of the form so I was able to just take down the form.

I looked at the logs for the site and the page and didn’t see any one specific IP making the posts to the form, so I couldn’t just block the IP from the server.

The email WAS from the same address each time so I could have just blocked said address from email (you could do this if that’s the case) but since I am the server owner I preferred not to process the email at all.

Good luck. This sort of spam absolutely SUCKS :(

I’ve never dealt with this myself but I’ve heard a few ideas.

One is to put an input field on the page somewhere, hidden somehow (either marked as hidden, or put ‘underneath’ something, or styled to not look like a form field). Give it an enticing name. Spam bots tend to fill in all fields by default, so just ditch any submission with that field filled in.
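For what it’s worth, the server-side half of that hidden-field trick can be tiny. Here is a rough Python sketch, not anything I have run in production. The field name "website" and the function name are made up for illustration, and it assumes the submitted form arrives as a plain dict of field names to strings.

```python
# Hypothetical honeypot check. HONEYPOT_FIELD is an assumed name for the
# hidden input; real visitors never see it, so they leave it empty.
HONEYPOT_FIELD = "website"


def is_probably_bot(form: dict) -> bool:
    """Return True when the hidden honeypot field contains any text."""
    value = form.get(HONEYPOT_FIELD, "")
    return bool(value.strip())


if __name__ == "__main__":
    human = {"name": "Pat", "address": "12 Main St", "website": ""}
    bot = {"name": "x", "address": "x", "website": "http://spam.example"}
    print(is_probably_bot(human), is_probably_bot(bot))
```

Anything flagged this way can be dropped silently, so the bot gets no hint that it was caught.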

Anything else tends to involve some user confirmation. The simplest is to make the form a two-stage thing. Most spam bots fill in forms and click submit, then wander off the page; they don’t expect to have to do something on the following page. So if the form goes to an ‘are you sure’ page (or something), you can cut down on the spam.

Otherwise you’re going to have to code something to try and see if the address data looks ‘spammy’ or not. That may or may not be easy depending what legit data looks like.
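As a starting point for that kind of check, here is a minimal Python sketch. It assumes a legitimate postal address never contains a URL and is never very long. The function name and both thresholds are my own guesses, so they would need tuning against real orders.

```python
import re

# Matches things that look like links: http://..., https://..., or www....
URL_PATTERN = re.compile(r"(?:https?://|www\.)\S+", re.IGNORECASE)


def looks_spammy(address: str, max_urls: int = 0, max_length: int = 300) -> bool:
    """Flag an address field that contains links or is implausibly long.

    max_urls and max_length are assumed thresholds, not measured values.
    """
    url_count = len(URL_PATTERN.findall(address))
    return url_count > max_urls or len(address) > max_length


if __name__ == "__main__":
    print(looks_spammy("12 Main St\nSpringfield"))
    print(looks_spammy("cheap pills http://spam.example/buy now"))
```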

Of course it’s all an arms race, and spammers get progressively better at getting round these tricks once they’re in widespread use.

Try this web form –> Beast-Blog.com - Mike Cherim's Professional and Personal Web Log - Home Page

<mod>

If you quote the entire OP, please remember to uncheck “Automatically Parse Links in Text.”

I just spent 10 minutes removing all the URL tags to make the URLs non-clickable.

Thank you.

</mod>

Comment spam is hell!

Before I came up with my own method for blocking spam, some people who help me with my website and I would kill about 200-500 comments in the guestbook and galleries on my site, and I would also get about another 100 spam emails.

My system is on shared hosting on an Apache server. It is written in PHP and uses parts of phpBB to work, but if you are using something similar, I can share my source and you can adapt it to your needs. If you have a bulletin board, make sure that you only allow registered users to post on it, because BBs get spammed like mad.

What I do is a four-pronged attack.

1.) System side - I have quite a bit blocked via .htaccess using deny for certain IP blocks and also some mod_rewrite rules. Also I have no indexing of any of my forms permitted by search engines in my robots.txt.

2.) Generate and set a sessionID for the user, and put it in the form as a hidden field. (I use the phpBB sessionID, because I let the users of my bulletin board bypass steps 3 and 4.) When a user or spammer submits the form, check that a sessionID was sent and compare it against the user’s sessionID. If they match, go to the next step. If the spammer posted directly to the form handler, the sessionID will be missing or will not match the newly generated session.

Unfortunately some spammers will read your page & headers and get the sessionID from the headers and form and send correct headers & it will look like a real request.

3.) I send all the data from a post to linksleeve’s XML-RPC service. They check the URLs in the content and send back a reject for URLs that are in their system.

check it out at:
http://www.linksleeve.org

Unfortunately linksleeve is still pretty new, so some spam gets through.

4.) I break the whole post into words by the spaces - do a replace on the whacky characters, and then match the words against my own spam_words.txt file.

If one word matches a spam word - I reject the post.

Some still gets through, but it is down from hundreds of spams per day to about 10 or 20.

I still have to add occasional spam words - but this final step helps.
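My real code is PHP tied into phpBB, but steps 2 and 4 boil down to something like the Python sketch below. Treat it as an illustration only: the function names are invented for this post, the token is a fresh random value rather than the phpBB sessionID, and I am assuming the "whacky characters" are simply stripped from each word before matching.

```python
import hmac
import re
import secrets
from typing import Iterable, Optional


def new_form_token() -> str:
    """Create a random token to store in the session and in a hidden field."""
    return secrets.token_hex(16)


def token_is_valid(session_token: Optional[str], submitted_token: Optional[str]) -> bool:
    """Step 2: reject posts whose hidden token is missing or does not match."""
    if not session_token or not submitted_token:
        return False
    # Compare as bytes in constant time.
    return hmac.compare_digest(session_token.encode(), submitted_token.encode())


def contains_spam_word(text: str, spam_words: Iterable[str]) -> bool:
    """Step 4: split on whitespace, strip odd characters, match a word list."""
    blocked = {word.lower() for word in spam_words}
    for raw_word in text.lower().split():
        cleaned = re.sub(r"[^a-z0-9]", "", raw_word)
        if cleaned in blocked:
            return True
    return False


if __name__ == "__main__":
    token = new_form_token()
    print(token_is_valid(token, token))
    print(contains_spam_word("Cheap CASINO!!! offers", ["casino"]))
```

In practice spam_words would be loaded from a file like my spam_words.txt, one word per line.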

My stats for blocked posts yesterday alone were:

130 blocked by .htaccess
605 blocked by the sessionID method
40 blocked by linksleeve
19 blocked by keyword
And I didn’t have to use a captcha and I still can allow my phpBB users to bypass the spam filters - all of this is behind the scenes and doesn’t disrupt the flow for visitors on my site.

I was considering adding this method - placing a blank input form field in a div which is hidden using an external css file to see if it helps any more - it does sound like a good trick.

I was looking at today’s 403s, not yesterday’s. The number of requests blocked by .htaccess yesterday was 323.

Here is some of the code I used if anyone wants to adapt it:
http://www.punkhistorycanada.ca/test/SPAMSTUFF.zip

Thanks for all the suggestions. As I mentioned in the OP, I tried renaming the address field, and sure enough, I started getting messages that looked just like the previous ones, but without the glurge in the (renamed) address field. Therefore whatever bot is doing this is so dumb it only looks for the “address” field.

But since getting a bunch of replies without glurge is only slightly less annoying than getting replies with glurge, and since only one form on the site is being spammed (at the moment), I have taken the following steps: I replaced the existing page with an identical one with a different name. I left the one with the original name on the site, but with its form essentially deactivated. It behaves the same way as the old one, but it doesn’t send an e-mail or save any data to a file. No other pages on the site link to it. (The renamed one is linked from all the pages that linked to the old one.) So presumably the bot will continue looking for the old file, filling it in, and hitting submit. But nothing will happen.

Now, if the bots get tired of this, or get smarter, or decide to start spamming other forms, I may have to take some of the measures you’ve suggested. Till then, thanks again for your help.