What is up with spam these days?

I’ve gotten reasonably used to the random collections of letters that appear in spam subject lines, and I understand that they’re supposed to circumvent filters. The same goes for the leet speak they use, like “/iagra”. But today I got the mother of all bizarre spams. Here it is in its entirety:

"Free CableTV!No more pay!*

restoration stillwater fellow invert secession junkerdom befell wolves battery drake aldebaran jove dissuade
unison chronic confession enquire handclasp sunbeam gel garlic jinx frank gallberry demarcate buyer allegro eratosthenes and thornton fiesta
epigraph cairn duck resemblant manslaughter tidy antiquary during wyoming sheer usher accede notify inflater decrement proxy fracture average stultify affront annihilate fuzzy dreyfuss commission otherworldly durance invoke stalemate coachwork susanne
deprave andre burgundy around apparent embrace personnel balsa morphophonemic repertory
circumflex platte assurance aboveground bella mexican rank quicksand chalcedony accidental davis eventuate cutout victrola chateau meditate ice compton disgruntle deviate food clad edmondson rift apology stationery whence forensic dna ferroelectric technocrat cumulate bewilder
congenial murder rundown invite colorado coolidge jocular gascony certainty neapolitan apothecary finitary fireboat functorial convent atavistic decisive cosmetic millstone commodious packet baden romance intramolecular curt sting preview style carol rubdown wapato yalta sylvester swart which
enviable eutectic scull trainmen leander ethic illusive smell soothsayer mile bloomfield setscrew glasgow dent pontiff pyroelectric zan mica gyp sancho delicatessen bellum coriander associate sistine momentous speculate gabrielle andy atalanta baronet kitchen heap darling
pegboard debugged bagatelle repetitive cotillion uphill aborning ovum afferent secession aplomb repulsive comic orthography rakish seaweed radiotherapy mach abrasion blew hot wonderful rabid arhat dell ebullient eerie tire detain pervade dynamism gail delineate explicit phon greenery cometary petersen
aileen anionic clara collimate alfalfa congenial afield append tampa spacious bassinet chapel
percent brunette decile backpack infuse referent embroider bloodline flier directory odd boar adamant snort abstention tapeworm
kresge also paymaster sumptuous brazil cannibal choosy edelweiss acton chock geophysical bark crankshaft diagnosis alveolus carney rooseveltian tether petersen pleura skyscrape identity smyrna yukon"

WTF is up with that? Without looking each one up, it seems like they are all real words, people’s names, places, or the like, except for maybe a couple. What kind of a person is going to see that and click on the link (which was a gif that wouldn’t load that linked to a site that doesn’t exist, by the way)? What could possibly be the point of something like this? Does someone actually expect this to generate revenue? It almost seems like it’s maybe the product of a company contracted to send out spam, but they didn’t want to actually take the time to create a legible product, and used some random word generator in the hopes that the company who contracted it would never know. Or if their intent was that you simply click on the link, why bother adding all the gibberish? This is just so absurd I’m almost unspeechified.

Guesses:

  1. It’s to confuse spam filters that look for obvious spam keywords – like “hot” “sexy” “FREE” – into thinking it’s a legit e-mail because it’s full of real words that aren’t spam keywords.

  2. Operator error. They screwed up the mail-merge and those are supposed to be the choice of “subject” for the header of the e-mail.

The usual reason a sequence of words like this is put in is to defeat Bayesian (probabilistic) spam scanners. Basically, each word or token is given a score based on how often it has turned up in spam versus normal mail, and by including all these words they are trying to make the spam score about the same as a normal email would.
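To make the scoring idea concrete, here is a rough sketch of a naive Bayes word scorer. It is not how POPFile or any particular scanner is implemented; the function names, the add-one smoothing, and the tiny training messages are all made up for illustration. It only shows why padding a spam with innocent words can drag the score down.

```python
import math
from collections import Counter


def train(messages):
    """Count how many messages of one class contain each word."""
    counts = Counter()
    for text in messages:
        counts.update(set(text.lower().split()))
    return counts


def spam_probability(text, spam_counts, ham_counts, n_spam, n_ham):
    """Naive Bayes estimate of P(spam | words), with add-one smoothing."""
    log_odds = math.log(n_spam / n_ham)  # prior odds from class sizes
    for word in set(text.lower().split()):
        p_word_spam = (spam_counts[word] + 1) / (n_spam + 2)
        p_word_ham = (ham_counts[word] + 1) / (n_ham + 2)
        log_odds += math.log(p_word_spam / p_word_ham)
    return 1 / (1 + math.exp(-log_odds))


# Hypothetical training mail, invented for this example.
spam = [
    "free cable tv no more pay",
    "free prescription drugs cheap",
    "cheap cable free offer",
]
ham = [
    "meeting notes from the chapel restoration",
    "the garlic and coriander recipe",
    "colorado trip directory and kitchen plans",
]
spam_counts, ham_counts = train(spam), train(ham)

plain = "free cable tv"
padded = plain + " restoration garlic coriander colorado kitchen chapel directory"

plain_score = spam_probability(plain, spam_counts, ham_counts, len(spam), len(ham))
padded_score = spam_probability(padded, spam_counts, ham_counts, len(spam), len(ham))
print(f"plain:  {plain_score:.2f}")
print(f"padded: {padded_score:.2f}")
```

In this toy setup the padded message scores well below the plain one, because every "innocent" word it contains has only ever been seen in good mail. Words the filter has never seen at all contribute nothing either way here, which is one reason the padding tends to use ordinary dictionary words and names.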

Ah… On preview, I see that qts got it first.

I’m thinking that this is an attack against Bayesian filtering tools such as POPFile.
These tools are trained a bit like neural networks: you teach them what is spam and what is not over a period of a few weeks, and the Bayesian filter learns to recognize what spam looks like.

I suppose that if you fill your message with perfectly decent words and proper names, this may fool the filter.

As it is, I’m having a devil of a time trying to get POPFile to recognize the good stuff – it tosses everything, for lack of a sufficient supply of good messages to learn from.

Yeah, I recently got one that described the work of various scientists to identify certain fossils of extinct animals. Right in the middle of that, it stops and says “GET PRESCRIPTION DRUGS CHEAP!!!” with some of their usual garbage. Then it goes back to the article.

I can’t help but notice, though, that I’ve received a distinct lack of “sex” spam and a lot more “read their email!” spam lately.

Thanks.

I guess I was penalized because I couldn’t remember the name of the filtering system.

Such a technique can also work against manually created custom filters, if you have any that spare every email containing certain words or strings. One such piece got into my inbox last week because it had the word “feminism” nested among others, as in your example above.
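For anyone curious how a rule like that gets tripped, here is a minimal sketch. The `spare` function and the one-word whitelist are hypothetical, not the syntax of any real mail client's filter rules.

```python
def spare(message, whitelist):
    """Return True if any whitelisted word appears, so the message skips the spam check."""
    words = set(message.lower().split())
    return any(term in words for term in whitelist)


# Hypothetical rule: never junk mail that mentions this word.
WHITELIST = {"feminism"}

spam_like = "free cable tv secession feminism garlic jinx"
spared = spare(spam_like, WHITELIST)
print(spared)
```

With enough random dictionary words in a message, sooner or later one of them matches a rule like this and the whole thing gets waved through.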

Oops… I guess I didn’t say it the way I meant. I really meant to say that qts mentioned the Bayesian angle first. :slight_smile:

<spoilt child> hmph Don’t care! Doodyhead! :stuck_out_tongue: </spoilt child>

Couldn’t remember for the life of me the name. Got stuck thinking “Babelfish”…