The Straight Dope

  #1  
Old 05-24-2009, 03:03 PM
alfonzos alfonzos is offline
Guest
 
Join Date: Aug 2005
Letter Frequency/Usage

Where can I find a list of frequency and use of letters and letter combinations in the English language?
  #2  
Old 05-24-2009, 03:20 PM
running coach running coach is online now
Charter Member
 
Join Date: Nov 2000
Location: Riding my handcycle
Posts: 14,262
Here.
  #3  
Old 05-24-2009, 03:21 PM
Mahaloth Mahaloth is offline
Charter Member
 
Join Date: Apr 2000
Location: 地球
Posts: 21,359
I know it is new, but there is a web site for searching. It's called Google and I used it to find the answer to your question in less than 10 seconds.

Google Search "Engine"

Frequency of Letters in English

Scroll down for various answers.
  #4  
Old 05-24-2009, 06:16 PM
seosamh seosamh is offline
Guest
 
Join Date: Aug 2004
In the good old days of typeset newspapers, you would often find rows of meaningless letters appearing at random; but they would always say:

ETAOIN SHRDLU
ETAOIN SHRDLU

No doubt someone will put me right but I understood that the typesetting machines weren't Qwerty but arranged the letters in order of commonness, and if a typesetter accidentally leant onto the keyboard, a slug of the commonest letters was produced.

I am probably talking through my hat entirely. But that is why I know the first 12 most frequently used letters....
  #5  
Old 05-24-2009, 06:18 PM
Lynn Bodoni Lynn Bodoni is offline
Creature of the Night
 
Join Date: Mar 1999
Location: Fort Worth, Texas
Posts: 20,803
Quote:
I am probably talking through my hat entirely. But that is why I know the first 12 most frequently used letters....
I used to be fascinated by codes and ciphers, and quickly learned to try ETAOINS first for ciphers.
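
For anyone who wants to build their own ranking from a sample of text, here is a minimal sketch in Python. The function name `letter_ranking` and the sample sentence are made up for illustration, and a sample this short will not reproduce ETAOIN; a large body of ordinary English text is needed for that.

```python
from collections import Counter

def letter_ranking(text: str) -> str:
    """Return the letters A-Z that occur in text, most frequent first."""
    # Upper-case everything and ignore spaces, digits and punctuation.
    counts = Counter(ch for ch in text.upper() if "A" <= ch <= "Z")
    # Sort by descending count; break ties alphabetically so output is stable.
    ranked = sorted(counts, key=lambda ch: (-counts[ch], ch))
    return "".join(ranked)

# Hypothetical sample text; substitute the plaintext corpus of your choice.
sample = "the rain in spain stays mainly in the plain"
print(letter_ranking(sample))
```

Running the same function over a ciphertext gives the cipher letters in frequency order, which can then be lined up against ETAOINS as a first guess.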
  #6  
Old 05-24-2009, 07:06 PM
samclem samclem is offline
graphite is a great moderator
Moderator
 
Join Date: Aug 1999
Location: Akron, Ohio
Posts: 21,271
You could have guessed Cecil addressed this........

http://www.straightdope.com/columns/...-etaoin-shrdlu

Also, a previous thread on this. http://boards.straightdope.com/sdmb/...LU#post6127767

Last edited by samclem; 05-24-2009 at 07:08 PM..
  #7  
Old 05-25-2009, 12:34 AM
panache45 panache45 is online now
Member
 
Join Date: Oct 2000
Location: NE Ohio (the 'burbs)
Posts: 22,026
In fact, there's a short story by Fredric Brown called "Etaoin Shrdlu." Yup, it's about an evil Linotype machine.
  #8  
Old 05-25-2009, 12:37 AM
panache45 panache45 is online now
Member
 
Join Date: Oct 2000
Location: NE Ohio (the 'burbs)
Posts: 22,026
Quote:
Originally Posted by seosamh View Post
. . . the typesetting machines weren't Qwerty but arranged the letters in order of commonness . . .
And there was no shift key; there were separate sections of the keyboard for lowercase and uppercase, and a third section for numbers, punctuation, etc.
  #9  
Old 05-25-2009, 09:38 AM
CalMeacham CalMeacham is offline
Guest
 
Join Date: May 2000
As a longtime solver of cryptograms, I'm aware of the typical frequencies. (I also know that if you make your own lists from particular sources, the frequencies shift around a little, but not by much.) But there's something I've long wondered about.


What are the frequencies of letter usage in other languages? If I were solving a cryptogram in, say, French, what is the most likely order of letters? Or German, or Italian.

For that matter, are there significant differences (larger than the variations you get in binning letters from different sources) between British English and American English, or with Australian English?
  #10  
Old 05-25-2009, 11:19 AM
Chronos Chronos is offline
Charter Member
 
Join Date: Jan 2000
Location: The Land of Cleves
Posts: 52,859
One I've wondered, meanwhile... ETAOIN SHRDLU etc. are the most common letters in English usage, but that's largely because of a few very common words ("T", for instance, is as high as it is thanks to common words like "the", "it", "at", "to", "this", and "that"). But what's the frequency order for words in the dictionary? That is, giving the same weight to "the" as to "syzygy"?
  #11  
Old 05-25-2009, 11:43 AM
Omphaloskeptic Omphaloskeptic is offline
Guest
 
Join Date: Oct 2001
Quote:
Originally Posted by CalMeacham View Post
What are the frequencies of letter usage in other languages? If I were solving a cryptogram in, say, French, what is the most likely order of letters? Or German, or Italian.
Wiki table.

Quote:
Originally Posted by Chronos View Post
But what's the frequency order for words in the dictionary? That is, giving the same weight to "the" as to "syzygy"?
With the web2 word list, I get EIAORNTSLCUPMDHYGBFVKWZXQJ. This word list doesn't include obvious inflected forms, so adding those would probably move S (for example) up the list.
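
A rough sketch of that kind of count, in Python. This is not necessarily the method used for the ranking above; it assumes each distinct word is counted once (ignoring case) and that every occurrence of a letter within a word counts. The function name and the tiny word list are hypothetical.

```python
from collections import Counter
from typing import Iterable

def dictionary_letter_ranking(words: Iterable[str]) -> str:
    """Rank letters by frequency over a word list, one vote per distinct word."""
    # De-duplicate case-insensitively so "the" and "The" count as one entry.
    unique = {w.strip().upper() for w in words}
    counts = Counter(ch for w in unique for ch in w if "A" <= ch <= "Z")
    # Descending count, ties broken alphabetically for a stable result.
    return "".join(sorted(counts, key=lambda ch: (-counts[ch], ch)))

# Hypothetical tiny word list; "the" gets the same weight as "syzygy".
print(dictionary_letter_ranking(["the", "The", "syzygy", "letter"]))

# With a real word list (the path is an assumption; it varies by system):
# with open("/usr/share/dict/web2") as f:
#     print(dictionary_letter_ranking(f))
```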
  #12  
Old 05-25-2009, 11:51 AM
glee glee is offline
Guest
 
Join Date: Aug 1999
Quote:
Originally Posted by Mahaloth View Post
I know it is new, but there is a web site for searching. It's called Google and I used it to find the answer to your question in less than 10 seconds.

Google Search "Engine"

Frequency of Letters in English

Scroll down for various answers.
Haven't you posted this before?
If so, why not link to it, instead of typing it again?

And of course asking questions like this leads to interesting discussions.
Unlike your post.
  #13  
Old 05-25-2009, 12:44 PM
Mahaloth Mahaloth is offline
Charter Member
 
Join Date: Apr 2000
Location: 地球
Posts: 21,359
Quote:
Originally Posted by glee View Post
Haven't you posted this before?
If so, why not link to it, instead of typing it again?

And of course asking questions like this leads to interesting discussions.
Unlike your post.
No, I have not posted it before. Why would I have? And I did answer the question... and was joking about the Google thing.

Sorry if I offended you. I'm not sure I even understand if you were talking to me.
  #14  
Old 05-25-2009, 02:49 PM
panache45 panache45 is online now
Member
 
Join Date: Oct 2000
Location: NE Ohio (the 'burbs)
Posts: 22,026
Quote:
Originally Posted by Omphaloskeptic View Post
I notice that "C" has now surpassed "U," so we now have "ETAOINSHRDLC."
  #15  
Old 05-25-2009, 03:35 PM
Chronos Chronos is offline
Charter Member
 
Join Date: Jan 2000
Location: The Land of Cleves
Posts: 52,859
Quote:
With the web2 word list, I get EIAORNTSLCUPMDHYGBFVKWZXQJ. This word list doesn't include obvious inflected forms, so adding those would probably move S (for example) up the list.
Ah, that explains the "R S T L N E" on Wheel of Fortune, then, since the answers are mostly single words in the bonus round.
  #16  
Old 05-25-2009, 05:39 PM
BrotherCadfael BrotherCadfael is offline
Guest
 
Join Date: Feb 2003
Quote:
Originally Posted by CalMeacham View Post
As a longtime solver of cryptograms, I'm aware of the typical frequencies. (I also know that if you make your own lists from particular sources, the frequencies shift around a little, but not by much.) But there's something I've long wondered about.

What are the frequencies of letter usage in other languages? If I were solving a cryptogram in, say, French, what is the most likely order of letters? Or German, or Italian.

For that matter, are there significant differences (larger than the variations you get in binning letters from different sources) between British English and American English, or with Australian English?
Professional cryptographers keep language-specific frequency distributions for whatever language the target messages are in. The distributions also depend on the characteristics of the target message: military jargon has a different distribution than Valley Girl speak, etc.

In addition to the simple one-character frequency distributions, frequency distributions are also compiled for digraphs, trigraphs, etc.
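
Digraph and trigraph tables can be compiled the same way as single-letter ones. A minimal Python sketch follows; it assumes combinations are only counted within a word (never across a space or punctuation mark), which is one convention among several. The function name and sample sentence are made up for illustration.

```python
from collections import Counter
import re

def ngram_counts(text: str, n: int) -> Counter:
    """Count n-letter combinations (n=2 digraphs, n=3 trigraphs) within words."""
    counts = Counter()
    # Split into runs of letters so no n-gram spans a space or punctuation.
    for word in re.findall(r"[A-Z]+", text.upper()):
        # Slide a window of width n across the word.
        for i in range(len(word) - n + 1):
            counts[word[i:i + n]] += 1
    return counts

# Hypothetical sample; a real table needs a much larger body of text.
sample = "the weather there is rather thin"
print(ngram_counts(sample, 2).most_common(3))
print(ngram_counts(sample, 3).most_common(3))
```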
  #17  
Old 05-25-2009, 05:43 PM
AppallingGael AppallingGael is offline
Guest
 
Join Date: Dec 2004
Quote:
Originally Posted by Chronos View Post
Ah, that explains the "R S T L N E" on Wheel of Fortune, then, since the answers are mostly single words in the bonus round.
It used to be that players would choose five and one for the bonus round, but eventually everybody was picking the same six, so the formality seemed silly. So they began to give you those six, and have you request another three and one. I wondered how long till everyone settled on the same four, but so far it is 20 years+ and counting.
  #18  
Old 05-28-2009, 03:24 PM
dwc1970 dwc1970 is offline
Guest
 
Join Date: Oct 2001
Quote:
Originally Posted by panache45 View Post
I notice that "C" has now surpassed "U," so we now have "ETAOINSHRDLC."
No doubt the advent of the Internet has changed some of the rankings. You'd think W would have moved up in rank a couple notches with all those www's out there. The C has also likely gained rank with .com which would mean the O should gain some ranking (which would also have .org working in its favor).
  #19  
Old 05-28-2009, 03:50 PM
Chronos Chronos is offline
Charter Member
 
Join Date: Jan 2000
Location: The Land of Cleves
Posts: 52,859
Quote:
I wondered how long till everyone settled on the same four, but so far it is 20 years+ and counting.
Well, the first six are chosen blindly, but the next four are after you've seen some letters come up on the board, so theoretically you ought to be basing your choices on what you already have. For instance, if you have an N in the penultimate position, I and G might be good guesses, and T blank vowel or S blank vowel are likely to have an H in between.
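
To make that idea concrete, here is a rough Python sketch that picks letters based on what is already showing. It assumes you have some word list to test against; the function name, the `_` notation for blanks, and the sample words are all hypothetical. It relies on the rule that a called letter is revealed everywhere it occurs, so a blank can never hide a letter that is already showing or has already been tried.

```python
from collections import Counter
from typing import Iterable

def likely_letters(pattern: str, words: Iterable[str],
                   tried: str = "") -> list[tuple[str, int]]:
    """Rank unrevealed letters by how many candidate words contain them.

    pattern uses '_' for a blank, e.g. '____N_'; tried lists letters
    already called, whether or not they turned out to be in the puzzle.
    """
    pattern = pattern.upper()
    revealed = set(pattern) - {"_"}
    excluded = revealed | set(tried.upper())
    counts = Counter()
    for w in words:
        w = w.strip().upper()
        # A word fits if shown letters match and blanks hide no excluded letter.
        fits = len(w) == len(pattern) and all(
            (p == "_" and ch not in excluded) or p == ch
            for p, ch in zip(pattern, w)
        )
        if fits:
            # Count each new letter once per candidate word.
            counts.update(set(w) - revealed)
    # Most useful letters first; ties broken alphabetically.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))

# Hypothetical candidates, with R S T L N E already called.
words = ["baking", "taking", "biking", "tablet"]
print(likely_letters("____N_", words, tried="RSTLE"))
```

With an N in the penultimate position, the words that still fit tend to end in -ING, so I and G come out near the top, as described above.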
  #20  
Old 05-28-2009, 04:02 PM
Kingspades Kingspades is offline
Guest
 
Join Date: Nov 2003
Quote:
Originally Posted by alfonzos View Post
Where can I find a list of frequency and use of letters and letter combinations in the English language?
Come on, man! Grab a copy of the New York Times and do it the old-fashioned way!
  #21  
Old 05-31-2009, 04:58 PM
alfonzos alfonzos is offline
Guest
 
Join Date: Aug 2005
Thank you for your comments (even the snarky ones; they show you care). Runner pat's link was the most helpful.
  #22  
Old 06-01-2009, 01:20 AM
Oslo Ostragoth Oslo Ostragoth is offline
Charter Member
 
Join Date: Feb 2004
Location: the Prairie
Posts: 6,728
How about two- and three-letter combinations?

Copyright 2013 Sun-Times Media, LLC.