My work has begun to involve a lot of list formatting. I use Windows. I cannot code. I am profoundly uncomfortable in UNIX. Again and again, I need to do something like “remove the first nine characters from each line in this document” or “insert a comma between each of these two-digit numbers and the name that follows”.
The coder doodz are always telling me that if it was them, they’d write a UNIX/VB/Python script to do it. This advice does nothing for me. What I need is a text application for Windows that treats text as a grid, with each character one cell high and one cell wide, and that will let me insert and remove columns.
It sounds like the type of thing I’d use MS Excel’s ‘text to columns’ feature for.
There’s also an application called ‘Notepad++’ which allows block-selection (selecting the block of text nine characters in would achieve what you want). In it you hold down either the alt or the ctrl key (can’t remember which) and then drag a rectangle round the text you want. Then copy it to a new file.
I have a program called TextPad. It can select text in normal character and block (column) mode. It also has a simple macro record and playback facility. Using this you could record “move left twice, insert comma, move to beginning of line, move to next line”, all by just having it record your typing, and then play it back.
There are many programs that could fancy up the process, but a simple text editor should be all you need. Just set the font to be “Courier New” and everything will be in rows and columns.
Seriously though, a few minutes reading about Unix scripting, sed, and AWK would be enough to do stuff like your examples. Get one of the coder doodz to write you a starter script that you can just modify as you go. Later you’ll be using grep and regular expressions like a pro.
I’ve used Excel for this sort of thing before. Also, in Microsoft Word, if you hold down the CTRL and ALT keys you can draw a box around a block of text. I tried this and was able to delete the first eight or nine characters from each line in a paragraph.
Plus it has a simple record/run macro feature. Basically you start recording, do the sequence of operations on one line, move down to the following line, stop recording. You can then repeat the operation over and over. It has an option to run a fixed number of times or until end of file.
All of these tools (sed, grep, awk, Perl, Python) are available for Windows as well, so if you want to do a little automation you can do it without getting a new computer.
I use this quite often, also. You can also select columns in Notepad++ by holding the ALT key and dragging vertically, though the macro recording is probably your best option.
I’ll agree that learning to use grep, sed and the other UNIX utilities is worth it if you’re going to be doing a lot of this sort of thing, and the learning curve isn’t that steep. You can download Cygwin for free and install it on a Windows machine. BTW, don’t forget cut, which is simpler for a lot of special cases than figuring out the edit command you want to use with sed, particularly as it can cut by delimited fields as well as character positions. For instance, your first task is simply “cut -c10-”.
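For anyone who ends up with Python rather than Cygwin, here is a rough sketch of what “cut -c10-” does: keep everything from the tenth character onward on each line. The function name `drop_prefix` and the sample text are made up for illustration.

```python
def drop_prefix(text, count=9):
    """Remove the first `count` characters from every line of `text`.

    Hypothetical helper; lines shorter than `count` simply become empty.
    """
    return "\n".join(line[count:] for line in text.splitlines())


# Invented sample: an eight-digit code plus a space, then a name.
sample = "00000001 Alice\n00000002 Bob"
print(drop_prefix(sample))
```

To use it on a real file you would read the file's text into a string first and write the result back out; that part is left out to keep the sketch short.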
It’s also convenient to do that kind of massaging inside vi, rather than using sed, once you learn how to make use of the “g” command. Saves you from having to redirect command output into new files, and shuffle files around. And lets you do several bits of massaging a step at a time.
Both examples could be done quickly in Excel. To delete the first 9 characters: open Excel and open the txt file. On the import screen select fixed width, then Next. Click between the 9th and 10th characters to create a break. Click OK. Now in Excel delete the first column. Save as a txt file and you are done. For commas, follow the same steps but create the breaks where you want the commas. Then do a Save As and select CSV. Rename the extension to whatever you want.
MS Word (and OpenOffice I would assume) can do this easily. Format all the text in Courier or another fixed-width font. Hold down ALT while selecting the text columns you want removed. Hit delete. You can now go get a cup of coffee.
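If you ever want the same fixed-width trick without opening Excel, it can be sketched in a few lines of Python: cut each line at the chosen character positions and write the pieces out comma-separated. The function name `fixed_width_to_csv` and the `breaks` parameter are invented for this example, and trimming the spaces around each piece is my own assumption about what you would want.

```python
import csv
import io


def fixed_width_to_csv(text, breaks):
    """Split every line at the character positions in `breaks`, return CSV text.

    Hypothetical helper: `breaks` plays the role of the break lines you
    click into place on Excel's fixed-width import screen.
    """
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for line in text.splitlines():
        edges = [0, *breaks, len(line)]
        # Assumption: surrounding spaces are not wanted in the output cells.
        pieces = [line[a:b].strip() for a, b in zip(edges, edges[1:])]
        writer.writerow(pieces)
    return out.getvalue()


print(fixed_width_to_csv("12 Smith\n34 Jones", [2]))
```

Dropping the first column afterwards is just a matter of leaving out the first piece of each row.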
You can do this in MS Word with “Find and Replace” (control-H), by pressing the “Special” button or knowing the codes. In this case you would search for ^p^?^?^?^?^?^?^?^?^? (a line break followed by any nine characters) and replace it with just ^p (the line break).
If all the numbers are exactly two digits, search for ^#^# (any occurrence of two consecutive digits) and replace with ^&, (the search-for text and a comma).
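The same find-and-replace idea can be written as a regular expression in Python, for whoever ends up scripting it. This is only a sketch: the name `comma_after_two_digits` is made up, and unlike the Word search it uses word boundaries so that only numbers of exactly two digits get a comma, which is my assumption about what is wanted.

```python
import re


def comma_after_two_digits(text):
    """Insert a comma after each stand-alone two-digit number.

    Hypothetical helper. \\b marks a word boundary, so the digits inside
    a longer number such as 123 are left alone.
    """
    return re.sub(r"\b(\d{2})\b", r"\1,", text)


print(comma_after_two_digits("12 Smith\n34 Jones"))
```

If the two digits can also appear inside longer numbers that should get commas too, drop the two `\b` markers to match the Word behaviour.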