weird Spam - Poetry ?

Lately I get a lot of spam email like the following:

What’s the point?
They are not trying to make me buy something; it’s just arbitrary words. It almost looks to me like some bizarre poetry.

Spammers often put nonsense words in their emails in order to fool ISPs and other providers into thinking they didn’t really just send 9 million copies of the exact same email. I’m not sure how effective this is, but that’s the reasoning behind it.

In your case, it looks like whatever script or program they are using has not been configured correctly and is missing the actual intended content.

Or, they just want to see that the mail was received and opened, in which case they have a live address that opens unsolicited email, and more targeted spam can follow.

It’s not poetry, it’s random words lifted from an online dictionary. Each spam sent out gets a different set of random words. As Eleusis says, it’s trying to fool spam filters by making a percentage of the content of each email unique.
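To illustrate why the padding helps at all, here is a minimal sketch in Python. It assumes a provider spots bulk mail by comparing exact checksums of message bodies, which is only one possible approach. The word pool, `fingerprint` and `pad_with_random_words` are made-up names for illustration.

```python
import hashlib
import random

# Hypothetical word pool; a real spammer's list would be far larger.
WORD_POOL = ["acute", "erasmus", "atrocious", "border", "pedestrian",
             "townhouse", "updraft", "eclectic", "bugaboo"]


def fingerprint(body):
    """Checksum that an exact-duplicate detector might compare."""
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def pad_with_random_words(body, rng, count=5):
    """Append `count` distinct words drawn at random from the pool."""
    filler = " ".join(rng.sample(WORD_POOL, count))
    return body + "\n" + filler


rng = random.Random()
pitch = "Buy now!"
copy_a = pad_with_random_words(pitch, rng)
copy_b = pad_with_random_words(pitch, rng)

# Identical pitches share a checksum; padded copies almost never do.
print(fingerprint(pitch) == fingerprint(pitch))
print(fingerprint(copy_a) == fingerprint(copy_b))
```

A filter that compares messages more loosely than an exact checksum would not be thrown off this easily.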

The fact that it’s just the random words alone suggests either that the idiot spammer has messed up or that your email reader can’t, or is configured not to, display the spam content. Spam often comes as embedded HTML so that it can automatically redirect you to their spam site.

It may be that the payload is invisible - in an HTML-formatted message, this might just be a linked image (linked to a non-existent, but unique, URL) - when the spammer’s server receives your machine’s request for the image, it knows that your email address is a live one.
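A mail reader can defend against this by not fetching remote images. A minimal sketch in Python of how one might list them, assuming the message body is available as an HTML string; `RemoteImageFinder`, `remote_images` and the example URL are made up for illustration.

```python
from html.parser import HTMLParser


class RemoteImageFinder(HTMLParser):
    """Collects the src of every <img> that points at a remote server."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        for name, value in attrs:
            if name == "src" and value and value.lower().startswith(
                    ("http://", "https://")):
                self.urls.append(value)


def remote_images(html_body):
    """Return the remote image URLs a mail reader would fetch on display."""
    finder = RemoteImageFinder()
    finder.feed(html_body)
    finder.close()
    return finder.urls


# The unique part of the URL is what would tie the request to one recipient.
body = '<p>acute erasmus</p><img src="http://example.invalid/i/8f3k2.gif" width="1">'
print(remote_images(body))
```

A reader that refuses to load anything on that list gives the spammer no confirmation that the address is live.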

“Eclectic Bugaboo” (line 6)? Does that remind anyone else of anything? Nah, I’m sure it’s just a coincidence.

Seems to be a lot of idiots out there. Most of the stuff I’m receiving is either empty or unreadable (or in html which amounts to the same with my mailtool).
Here is an interesting legal aspect of the issue:
All these emails are marked with the added characters: SPAM in the subject line by my administrator. But as I am allowed to use the account for private mails, too, he is not allowed to delete them.
But, whatever spam-filter he is using, it’s pretty good, since it also detects these “poems” as spam.

Acute erasmus, atrocious? Disastrous!

Why would you call that Spam? It’s beautiful! Almost makes me want to grow a beard and wear a silly hat…

…On second thought, this may be from a modern-time Nostradamus. His chilling description of the imminent Saudi updraft sends shivers down my spine, it does.

I also see great significance in the “border englander
pedestrian townhouse” part but I can’t make it out yet, let’s hope to God that someone deciphers this before The Day of Reckoning is upon us.

It’s not so much to make each email different - spammers have been doing that for a while by adding strings of random letters. The reason they are starting to use real, albeit nonsensical, words is that they hope to fool the new breed of Bayesian adaptive filters. These filters rate ‘spamminess’ by looking at the spammiest and least spammy words in a message. Something like ‘Viagra’ (or ‘v1.aGra’, ‘V^I^A^G^R^A’ or whatever) would be considered very spammy, so the spammers hope to counterbalance such words by including innocent words. Trouble is, what each person regards as an innocent word is different - if you get a lot of emails about motorcycling from your friend Bob, then ‘Bob’ and ‘Honda’ might be very good non-spam indicators for you, but not necessarily for somebody else. Well-implemented adaptive filters shouldn’t be fooled by these chunks of random text.
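A rough sketch in Python of how such a filter might combine per-word scores. This is a simplified toy, not any particular filter’s algorithm: the word probabilities, the 0.4 score for unfamiliar words and the cut-off of 15 words are all assumptions for illustration, and `spam_score` is a made-up name.

```python
from math import prod

# Hypothetical per-word spam probabilities learned from ONE user's mail.
WORD_SPAM_PROB = {
    "viagra": 0.99,
    "click": 0.90,
    "bob": 0.02,
    "honda": 0.05,
}
UNKNOWN_PROB = 0.4  # assumed score for a word the filter has never seen
TOP_N = 15          # assumed: only the most telling words are combined


def spam_score(words):
    """Combine per-word probabilities into one spam probability (0..1)."""
    unique = {w.lower() for w in words}
    probs = [WORD_SPAM_PROB.get(w, UNKNOWN_PROB) for w in unique]
    # Keep the words whose probability is furthest from neutral (0.5).
    probs.sort(key=lambda p: abs(p - 0.5), reverse=True)
    probs = probs[:TOP_N]
    spam = prod(probs)
    ham = prod(1.0 - p for p in probs)
    return spam / (spam + ham)


print(spam_score(["Viagra", "click"]))
print(spam_score(["Viagra", "click", "erasmus", "townhouse", "updraft"]))
print(spam_score(["Viagra", "click", "Bob", "Honda"]))
```

With these made-up numbers, the three unfamiliar dictionary words leave the score above 0.99, while ‘Bob’ and ‘Honda’ pull it down to roughly 0.49. The spammer has no way of knowing which words carry that kind of weight for a given recipient.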

Quite a lot of the spam that is currently slipping past my filters uses nonexistent HTML tags to break up the spammy words, so ‘Viagra’ becomes ‘V<hjk>iag</hjk>ra’ - html rendering engines typically just ignore such tags, making the word appear intact in the message.

I meant to add a link to Paul Graham’s essay, which kicked off the whole Bayesian filtering vogue, although the latest algorithms are considerably better than that outlined by Graham.

Mangetout - some filters render the HTML before ‘tokenising’ (extracting words from) the text. OTOH, HTML tags themselves can be very spammy, so it’s debatable whether this is necessary. Paul Graham considers these points in a later essay.
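A crude sketch in Python of the difference this makes. It strips tags with a regular expression rather than truly rendering the HTML, which is an assumption made to keep the example short; `strip_tags` and `tokenise` are made-up names.

```python
import re

TAG = re.compile(r"<[^>]*>")
WORD = re.compile(r"[a-z0-9']+")


def strip_tags(text):
    """Drop anything that looks like an HTML tag, real or invented."""
    return TAG.sub("", text)


def tokenise(text):
    """Extract lower-cased words from the text."""
    return WORD.findall(text.lower())


raw = "V<hjk>iag</hjk>ra"
print(tokenise(raw))              # fragments plus the bogus tag name
print(tokenise(strip_tags(raw)))  # the intact word
```

Tokenising the raw text yields fragments such as ‘iag’ and the tag name ‘hjk’; stripping first yields ‘viagra’. Note that stripping throws the tags away, so a filter that wants to use them as evidence would have to tokenise both versions.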

It’s a pain in the ass.

The words they use aren’t a random grab from the unabridged dictionary – if you save 20 such messages and index the words used, cross-comparing each email, you’ll see a lot more overlap than if the words were chosen at random from a large pot. My guess is that they’ve got fewer than 10,000 and use an algorithm to snag a handful and throw it into each outbound email.
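The cross-comparison could be done along these lines. This is a minimal Python sketch, assuming each saved message is available as a plain string; the function names and the three sample messages are made up for illustration.

```python
import re
from collections import Counter
from itertools import combinations


def word_set(text):
    """The distinct lower-cased words in one message."""
    return set(re.findall(r"[a-z]+", text.lower()))


def shared_words(messages):
    """Words that appear in more than one message, with message counts."""
    counts = Counter()
    for msg in messages:
        counts.update(word_set(msg))
    return {word: n for word, n in counts.items() if n > 1}


def average_overlap(messages):
    """Mean Jaccard similarity (shared / total words) over message pairs."""
    pairs = list(combinations([word_set(m) for m in messages], 2))
    if not pairs:
        return 0.0
    scores = [len(a & b) / len(a | b) if (a | b) else 0.0 for a, b in pairs]
    return sum(scores) / len(scores)


saved = [
    "acute erasmus atrocious",
    "erasmus border townhouse",
    "pedestrian townhouse updraft",
]
print(shared_words(saved))
print(average_overlap(saved))
```

If the words really came from a huge dictionary, short messages would share almost nothing, so a consistently high overlap points to a small pool.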

And unfortunately, among the words they are using are a couple of my special filter IN words designed to rescue any inquiries/responses to my web site content from the ravages of my other spam filters.

So I’ve had to write four dozen some-odd three-tier filters that look for combinations of words that tend to reappear in these lists, and then, before filtering in messages with mention of the subjects covered in my web site, filter out those that were previously tagged for having overlaps of various combinations of other often-used words.
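The ordering described above might look something like this in Python. It is only a sketch of the idea, not the actual filters: the word combinations, the rescue words and the name `classify` are all hypothetical.

```python
import re

# Hypothetical word combinations that keep reappearing in the random lists.
SPAM_COMBOS = [
    {"erasmus", "updraft"},
    {"englander", "pedestrian", "townhouse"},
]
# Hypothetical 'filter IN' words that mark genuine web site inquiries.
RESCUE_WORDS = {"motorcycle", "honda"}


def classify(text):
    """Tag known word combinations first, then rescue site-related mail."""
    found = set(re.findall(r"[a-z]+", text.lower()))
    # Step 1: a message containing every word of any combination is spam.
    if any(combo <= found for combo in SPAM_COMBOS):
        return "spam"
    # Step 2: only untagged messages may be rescued by a filter-IN word.
    if found & RESCUE_WORDS:
        return "inbox"
    # Step 3: everything else is left to the other spam filters.
    return "undecided"


print(classify("border englander pedestrian townhouse honda"))
print(classify("Question about your Honda page"))
```

The order matters: checking the combinations first is what stops a random list that happens to contain a rescue word from being filtered in.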