simple way of extracting ordered lists from a text file

I have a dictionary list (alphabetical) in text format of about 200,000 words. I want to extract all the two, three, four letter words etc separately in alphabetical order. (this is to help me learn the valid words for scrabblicious, prolific etc).

Is there a simple way of doing so (say using Word, Excel, Filemaker etc) or would I need to write a program to work on the list? I suspect many programs might struggle with the number of entries even if in principle they could reorder it in letter length.

(I know there are scrabble dictionaries out there and I have one. But these online games have their own peculiar set of words)

Simple? Dunno.

I would think you could import the list into Excel (is it tab separated?) and then run an if/then function something like this:

If contents of cell “a” is two characters long, then return those characters here.

I haven’t really used that function much, so I’m not sure how the formula would be written.

In Excel, you could use the LEN function.
Create a column using LEN that contains the number of letters.
Then sort on that column.
Copy and paste to extract words of the same length.

This would be fairly straightforward on UNIX (also Linux, FreeBSD, Mac OS X, etc) with the standard text processing tools provided with the operating system. Off the top of my head, you could do it with sort and awk. Sort the file with sort and use awk to extract the N-character words. Many of the UNIX text processing tools have been ported to other operating systems, such as Windows.

Some Perl expert will probably chime in here with a one-liner that will do the job and also do your taxes :).

200,000 is more than will easily fit into Excel 2003. If you have Excel 2007, you can import the words into the first column and, as, Lionel says, use the LEN function to get the length of the words. If you don’t have Excel 2007, but do have Access, you can import the dictionary into a table, and then run a SELECT query where LEN([Field1]) is 2, or 3, or 4, etc., which is a little more complicated to figure out, but doable.

My aunt has a two and three letter word dictionary for scrabble. Looking around online, I couldn’t find it, but there are lists two and three letter words. Here’s one Two-Letter and Three-Letter Scrabble Words

thanks for the ideas. Only have Office 2003, so will try Access. I guess I could also first split the file into smaller chunks for Excel to work on too.

Here’s a simpler idea, if you don’t really need the lists to be separate: Simply have the second column be the LEN(), and sort by that column first. This was your list will have everything, with the 1-letter words first, 2-letter words next, and so on.