Removing line numbers from document produced by court stenographer

I’m helping a friend produce a book about his brother. He hired a court stenographer to take down two interview sessions with the brother and now she has sent me the digital files.

I’ve got a PDF, a text document, and a PTX. As one would expect with such a transcript, all the lines are numbered. I want to place/paste this text into an InDesign document. I need to remove the line numbers. This first document is 65 pages long and I don’t relish the idea of doing that manually.

I downloaded the E-Transcript Viewer so I can open the PTX file. Under “page setup,” leaving out the line numbers is not an option. I can highlight (select) and copy the text *without* the line numbers being highlighted, but when I paste into a Word doc, the line numbers reappear.

I can open the PDF and the text doc in Word, but of course, the line numbers are there. And selecting text includes the line numbers.

Is there any way to remove the line numbers except deleting each one manually? I saw something online in passing that said this can be done with macros. I don’t know how to work with macros, but if someone can tell me, I’ll send them a pan of cashew fudge today.

Is there some way the stenographer can reformat and leave the line numbers out? (I just called her, but she’s not in. I’ll retry if the answer to this is yes.)

In Word, you can highlight vertically, by holding down the ALT key while using the mouse just as if you were selecting text normally; you might be able to use that trick to highlight the line numbers on the right, and then delete them.

See here for example https://www.howtogeek.com/howto/microsoft-office/select-text-vertically-in-microsoft-word/

If you can copy the text as “plain text” (no formatting), removing the line numbers would be easy using regular expression search-and-replace.
A free tool like TextWrangler (on OS X) could do this.

If you can send me the text, I can do it for you.

OMG. That worked. You are The Bomb. PM me your address and a pan of cashew fudge will be on its way. Incredible.

I’m on a PC, but you get large points for responding quickly.

I’m curious. How would you accomplish this by using search-and-replace? What would be the search term? I’m assuming the replace term would be a space… but one space or two?

I’d have to see the way the line numbers are formatted, but the search string would be something like “carriage return / any number of numeric characters / any number of spaces” replace with carriage return:

regex search would be something like: \r\d+\s+
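For anyone who would rather script it, here is a rough Python sketch of the same idea (line break, digits, spaces). The sample text is made up, and I swapped \s+ for [ \t]+ so the match cannot run across an empty line; the line break is captured so it can be put back, whichever kind it is.

```python
import re

# Made-up sample: each line starts with a line number and two spaces.
sample = "1  Q. Where were you born?\n2  A. In Texas, in 1955.\n3  Q. And after that?\n"

# A line break (\r\n, \r, or \n), then digits, then spaces or tabs.
# The line break is captured in group 1 and written back by the replacement.
pattern = re.compile(r"(\r\n|\r|\n)\d+[ \t]+")


def strip_after_breaks(text: str) -> str:
    """Remove a number that directly follows a line break."""
    return pattern.sub(r"\1", text)


result = strip_after_breaks(sample)
print(result)
```

Note that the very first line keeps its number, because nothing precedes it for the line break to match. That one is easy to delete by hand.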

Just for fun, and maybe cashew fudge…

You can paste everything into a program called Notepad++ (which is a plain text editor) and do a regular expression (regex) find-and-replace using the find term [0-9]+\t (meaning “any number followed by a tab”) and a blank for replacement. That’s assuming that the line numbers are pasted in as “1[tab]” etc. Depending on how they get pasted in (maybe they are “1[space]” or “1[space][space][space]”), you can tweak the expression to be "[0-9]+ " or "[0-9]+   " in those instances.

What about numbers that might be included in the text? (“I made the touchdown in 1955 and our team won by 72 points!”)

You need to be careful with this search, because it will also find numbers in the middle of the text. Start the search by looking for a carriage return or newline.

ETA: The OP got that one before I did.
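To see the difference, here is a small Python sketch using the touchdown sentence from the question above. The leading "12" is an invented line number; the two patterns are just illustrations of "unanchored" versus "anchored to the start".

```python
import re

# One transcript line with an invented line number (12) in front.
line = "12  I made the touchdown in 1955 and our team won by 72 points!"

# Unanchored: removes every run of digits, including the ones in the text.
unanchored = re.sub(r"[0-9]+\s*", "", line)

# Anchored with ^: removes only the digits at the very start.
anchored = re.sub(r"^[0-9]+\s*", "", line)

print(unanchored)
print(anchored)
```

The unanchored version loses the year and the score along with the line number, while the anchored version leaves them alone.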

Putting a caret at the start will match only at the beginning of a line. Depending on the type of regex supported, one of these might work:

^\d+

^[0-9]+

^[0-9][0-9]*

If the line number may have leading spaces, you would add “\s*” or " *" after the caret.
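As a sketch of how that looks in Python (one regex flavor among many), the caret needs the MULTILINE flag to match at the start of every line rather than only at the start of the whole text. The sample lines are invented, and I used [ \t] instead of \s so the pattern cannot swallow an empty line; the trailing [ \t]+ also removes the gap after the number.

```python
import re

# Invented sample with right-aligned line numbers (leading spaces).
sample = (
    " 1  Q. Where were you born?\n"
    " 2  A. In Texas, in 1955.\n"
    "10  Q. How many points?\n"
    "11  A. 72.\n"
)

# ^ with re.MULTILINE matches at the start of each line.
# Optional leading blanks, the number, then the blanks that follow it.
line_number = re.compile(r"^[ \t]*\d+[ \t]+", re.MULTILINE)

cleaned = line_number.sub("", sample)
print(cleaned)
```

Numbers inside the text (1955, 72) survive because they are not at the start of a line.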

(BTW, is this board ever going to fix newlines in code blocks?)

I’m a little late to this party, but there are a lot of online webform tools like this one: http://remove-line-numbers.ruurtjan.com/ for doing various reformatting tasks, just a quick google away. Copy/paste your plaintext into the form, and bam, done. Somebody else did the heavy lifting of figuring out the right regex for you. :wink:

Well, dang! That worked! Thanks!

You can also do a column-mode select in Notepad++, similar to what AndyL described above for Word.

Glad to help. No need for fudge (I’m trying to cut down on sweets).

I understand. Here, reading the recipe should be almost as good as eating it: ThelmaLou’s Cashew Fudge.

Thanks!

I’m really too late for this, but definitely post any other formatting questions you might have. I’ve just started brushing off my manuscript formatting skills (stretches hands) and this looks like fun. :slight_smile:

Today I went into the stenographer’s transcript and first I removed the line numbers per AndyL’s instructions. There was a paragraph break at the end of every line, and between every line. So I used find/replace to remove all of those. Then I entered paragraph breaks when the speaker changed (it was an interview). Then I went through manually, added a few more paragraph breaks, and removed extra spaces and just odd shit that the stenographer didn’t hear correctly. (But all in all, she did a fantastic job!) Finally I changed the font, the margins, and double-spaced the whole shebang so my 80-year-old client can go through and manually edit the text. It’s down to about 30 pages now. All of that only took 45 minutes. A second session is coming next week. I asked my client on the phone today how many more sessions he anticipates and he said four or five. This is a very sweet gig for me.

I’m also scanning many, many family photographs (all black & white) to put into the text where the client tells me to. Of course, after I scan them, I’m doing some brightening and some cleaning up in Photoshop. I’m going to bring my computer when I meet with the client on Monday and show him how easy it is for me to place the text into InDesign. I’ve been using it since it was hatched as PageMaker in 1985 and I’m still learning all that it can do.

I appreciate the willing and helpful resource here. :slight_smile:
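For the next session, the cleanup steps described above (strip line numbers, remove the break after every line, start a new paragraph when the speaker changes) could be roughed out in Python along these lines. This is only a sketch: the function name is made up, and it assumes speakers are marked "Q." and "A.", which is a guess since the thread does not say how the transcript labels them. The manual pass for mis-heard words still has to be done by hand.

```python
import re


def reflow_transcript(text: str) -> str:
    """Hypothetical helper: numbered transcript lines -> one paragraph per speaker turn."""
    # 1. Drop the line number (and the blanks around it) at the start of each line.
    text = re.sub(r"^[ \t]*\d+[ \t]*", "", text, flags=re.MULTILINE)
    # 2. Remove every line and paragraph break by joining all words with single spaces.
    text = " ".join(text.split())
    # 3. Start a new paragraph before each speaker label.
    #    ASSUMPTION: labels look like "Q." and "A."; adjust for the real transcript.
    text = re.sub(r" (?=[QA]\. )", "\n\n", text)
    return text


# Invented sample with a blank line between every numbered line.
sample = "1  Q. Where were you\n\n2  born?\n\n3  A. In Texas,\n\n4  in 1955.\n"
result = reflow_transcript(sample)
print(result)
```

Step 3 is deliberately naive: a sentence that happens to end in "A." would also trigger a paragraph break, so the output still needs a read-through.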

Boy, am I glad I made this last post! I just got the second session and I couldn’t remember how the heck I edited the first one. (Hey, that was a week ago. I can’t be expected to remember everything from the ancient past. :rolleyes: ) I’ve just copied my method into a note that I’m keeping in the folder with the documents. In case the SDMB is destroyed in the coming Armageddon. Carry on.

OMG, that’s food porn! :smiley:

Thanks for the update (that for some reason I didn’t notice until today).