Reformatting a text file

I have a text file that I need to reformat into a specific order. Right now it’s a mess of names and addresses that all just run into each other on one line like this:

“MR “,“BLEEH BLAH”,“BLEEH”,””,“BLAH”,"999 BOARD ST “,“APT
A1”,“SPRINGFIELD,”??”,“90210”,“DATA”,“THAT”,“I”,“DONT”,“NEED”

I have some experience in using Matlab and Excel, but not for anything like this. I’d like to get it into a format like:

MR BLEEH BLAH
999 BOARD ST APT A1
SPRINGFIELD ?? 90210

Any helpful ideas on whether it’d be easier to do it in Excel or Matlab (I’d do C/C++ but I’m really rusty on those)? Oh, yes, and it’s a list, so there’s a blank line between each thing and the next set of info.
It doesn’t even have to look exactly like what I have above. Even if I could just get rid of all the extraneous quotes and repeated names and spaces, I wouldn’t mind hitting enter just to get them into the correct format.

Thanks in advance for any help you can give me.

If there is a space between the data then just open the file using excel. The import wizard with help you reformat the file into Excel. One the file is in Excel you can delete columns of information that you do not need.

Use Excel.

When coverting the text in Excel, use the “,” as your deliminator between columns. Delete the extraneous columns. Now concate the columns (street addresses) you require.

I actually tried to do it like that in Excel and for some reason couldn’t get the “,” as my deliminator to work. Fortunately I happened to try it in Access and it read it perfectly. Then I had to do some craziness with copying and pasting to get it back into a purely text file, so now I’m going to try writing a Matlab script to reformat the text (minus the repeated info and the stupid quotation marks).

Thanks for the help! And if anybody has any other helpful hints for writing the Matlab script, I’d love to hear it. If anybody has a better way to reformat the text in Excel, that’d be good too.

I just wanted to post that I was able to do what I needed to. Thanks for the input. It was quite easy to do after getting it into a reasonable format in Excel. Question, you have been answered.

Actually, the hardest part of the work has essentially been done for you – the parsing. Each piece of your desired output has already been isolated. Personally I’d write some simple code to read in all the components and write them out in the desired format. You can do it in Matlab, C, VB, whatever. Yes, you can finagle Excel to do it too, but what’s tricky is that your input is one line while your output consists of three lines. Why don’t you just write some VB within Excel? That should work fine.

In Matlab, you would do this with the function “csvread” which is used to read data from comma-seperated files.

Once you’ve read the information, you would can get rid of the quotes by using “strrep” to replace information in strings.

You can then create a new file using “fopen,” write to it using “fprintf,” and then save the file by using “fclose.” The help for “fprintf” in recent versions of Matlab gives some good information about formatting the file.

What I ended up doing was putting it in Excel, getting rid of the info I didn’t want, and then using [a b rawdata] = xlsread to read in the data (getting rid of the a and b). Then I used fprintf to write it out. The strrep would have been handy for doing it all from Matlab though. I’ll have to keep that function in mind if something like this ever comes up again.

Your original file was not a mess, it was properly stored in comma-separated-values (.csv) format. Probably output from some spreadsheet or database program.

You could have taken that directly into a word-processing program like MS Word, and used the mail merge feature to generate your addresses. Either directly to labels or to a saved file. Probably would take about 5 minutes, maybe 15 minutes if you’re real new to word processing.

Converting this to a text file is the wrong thing to do.

You should:

  • Keep the actual data file in a database or spreadsheet program. (Like your Excel version.) Make all adds & updates to that.

  • Each time you need to print the addresses, export the file in .csv format, and use your word processing program to print them. Delete the .csv file when you are done.

That way, you have only one original file to maintain, so no worry about which file is most current, etc. You are using each program to do what it’s best at: database for storing/updating data; word processor for producing documents. And you can easily re-use your data for other tasks; for example, printing a phone list instead of address labels would be a simple addition.

Huh. That would have been much easier. Actually, all told, it was only about an hours worth of work, so it wasn’t really that hard, but this appears to be the “correct” way to do things. I’ll pass on the info. The list wasn’t for me, but for a friend, so I’ll let them know.
Thanks!