Removing manual line breaks in Excel for Macs

Some of the data I receive has non-display characters that cause problems when converting the file to text. (They break the record.) On a PC I can search-and-replace ALT+010 with pipes. I can’t make an ALT+010 on my mac. Someone here (a Doper from Tel Aviv – I’ve forgotten his name) gave me a work-around: Copy the file into Word, choose Editing>Replace>Special>Manual line break and put in a pipe. This not only gets rid of the ALT+010 characters, but also other non-display characters that cause the problem, and whose codes I don’t know. It works great. After fixing the breaks, I copy the Word document back into the Excel spreadsheet.

But of course there’s a problem. Some larger files either take forever to paste into Word, or they take so long that I assume the Paste isn’t working. This is the case for a file I have right now that has 29,000 records.

I’ve emailed the file to myself, and I’m trying to fix it entirely on the Mac; as opposed to my Mac connected to my office PC. I searched for a solution and found this one. It says to use =SUBSTITUTE. Following the instructions, I have put =SUBSTITUTE(I24,CHAR(13),“|”) into an adjacent cell, and I inserted a manual line break between two words (option+return). The option+return put in a manual line break. The SUBSTITUTE formula put the words on a single line (which is good), but did not insert a pipe between the words. (I’ve also tried using a space instead of a pipe.)

Why does the SUBSTITUTE command not put a pipe in place of a manual line break?

If I can get the substitute formula to work properly, I’ll have to do it in close to a dozen columns; but at least it would be quicker than waiting for the records to be pasted into Word.

Reading this makes my head hurt like a five martini hangover.

Learn to process text files in Python or Perl. It will save you so much time and frustration over trying to make Excel do things that it was never designed to do, or copying files over to Word to just to eliminate carriage returns.

Stranger

I would have to take an in-person class for that. It’s not something I’d be able to learn from a book.

As for the current file: I could wait until Tuesday when I’m in the office and change ALT+010 to pipes, but this file also has the other manual line breaks I don’t know the code for.

The file was finally copied into Word for Macs. I can do a Find using the Search in Document box and putting in ^l (lowercase L), and I can find the breaks that way. But I don’t see in Word for Macs how to do a Find & Replace. I can’t find an icon for that. I found Replace by right-clicking the Find box.

Now I need to copy-and-paste the Word document back into Excel. That might take an hour.

It would literally take about ten minutes to learn how to import the necessary modules, open a file for read/write access, and do a text replacement. You don’t even need to crack a book, actually; there are plenty of examples on stackoverflow.com (and doubtless other sites) such as this one which you can just crib and modify as necessary.

Stranger

I use Easytrieve, which is an ancient sequential language. I have no experience with OO languages. But anyway, your link points to this:

with open("Stud.txt", "rt") as fin:
    with open("out.txt", "wt") as fout:
        for line in fin:
            fout.write(line.replace('A', 'Orange'))

I assume that ‘Stud.text’ is the file you want to modify. In my case it would be ‘ar_file.xlsx’. I assume that ‘out.txt’ is the output file. In my case it would be ‘ar_file_fixed’. And it appears that the code replaces all instances of capital A with the word ‘Orange’.

But I haven’t a clue how to run it. And it seems four lines aren’t enough. It looks like it’s only part of a routine.

Are these plain text files?
If so, get the free version of BBEdit, and just search and replace for the character you want.
You can “show invisibles” and copy the character you want to search for, if you don’t know how to generate it with the keyboard.

No, they’re Excel files. I need to get rid of the NDCs so that I can turn them into text files.

You might still try BBEdit.

Why would it take more than a moment?

I hate to be dense, but I don’t know what that is.

In any case, the file finally copied into Word and I was able to change the NDCs to pipes. I’ve sent the file back to my office PC and am just waiting for the virus scan to finish.

The Aging file is much smaller, and I was able to copy it into word in a couple/few minutes for editing.

I haven’t a clue. I was hoping it would be quicker on my Mac than it is on the PC, but this is not the case.

CTRL-A, CTRL-C, CTRL-V or the Macintosh equivalent.

I just gave it a try myself and found that I needed to use char(10) to see the pipe character inserted. Char(13) removed the break but did not add the pipe.

=SUBSTITUTE(E6,CHAR(10),"|")

Yes, I know that. That’s what I do. It still takes forever to copy a large Excel file into Word.

That worked! Thank you.

I guess I’m not understanding why you have to copy and paste into Word to clean up the line breaks, then paste back into Excel so you can export to text. It sounds like what you need is a plain text editor that can display non printable characters. @beowulff recommended BBEdit. I use Notepad++ but I’m not sure there is a Mac version. But I can’t imagine why you can’t just export to text or csv from Excel and then do your line break cleanup directly in the text file with a text editor using a simple find/replace.

I telecommute most of the time, and I use a Mac. I connect directly to my office PC to work. A couple/few files I receive have non-display characters. On the PC, I can find ALT+010 and replace with pipes. This is impossible on a Mac. Also, one or two files have non-display characters I don’t know the ALT codes for. So the Excel file needs to be copied into Word so that I can replace manual line breaks with pipes. This takes care of all of the NDCs, whether they are ALT+010 or the other ones. Once replaced, I can Paste as text back into the Excel file.

Why not just save the Excel file as Text? Because when I started doing this 20 years ago, there were two ways of saving Excel files as text. One way truncated the records (which are hundreds of characters long), and the other way split the records and put the ends at the bottom of the file. Easytrieve requires fixed-position text files. Fields must be in the proper position and be the proper length. So I save the Excel files as .csv files and import them into Access, then export the files as fixed-position text.

When there is a manual line break, Access breaks the record upon exporting it.

First, you need to save the file as a CSV (comma separated file) so that it is plaintext instead of Excel format. (There are libraries for working with various versions of Excel from Python but not really necessary in your case since you want to export to a text file anyway.)

The reason that copy and paste from Excel to Word or reverse takes so long is because you are copying the entire text and formatting of the document into the clipboard, which it was never intended to facilitate. This is a terrible, time-wasting ‘hack’ that, as you note, doesn’t work well or at all as filesize increases.

The with open(filename, read/write) as filehandle: just opens a text file for reading or writing as . The for line in filehandle: iterates through the lines of the file. The filehandle.write(line.replace()) command does the actual text replacement. In the case of “non-display characters” (which is what I interpret what you mean by “NDC”), you can just have it trim off all of the non-ASCII characters, or use a regular expression (regex), or any number of other solutions. All you need to do to run this on your Mac (which has Python installed in the system) is copy and edit this into a file (scriptname.py) in the directory where your CSV file is, open up a Terminal in the folder where your files are, type python, and then run scriptname.py. Those four lines are literally all you need to do this task.

For all the time you spend fucking around with Easytrieve, and Excel and Word, you could just spend a few hours and learn some very basic functionality in Python or Perl, write some generic functions to take inputs like filenames and text formats, and essentially automate this entire process without ever again firing up Easytrieve or Excel. You don’t need to know anything about object-oriented programming, or any obscure libraries, or even most of the functionality of a language like Python to do this simple thing. Most community colleges have extension classes which include basic programming if you have to have an in-person learning experience, although for what you are trying to do you could pay some junior high school nerd twenty bucks, or just go on stackexchange and ask people to solve your problem for you. Any of these options would be less painful, both for you and the posters here who keep giving you guidance on how to create a functional solution instead of weird, inefficient hacks to add to what already seems to be a convoluted process just to process and format text data.

Stranger

I agree that there is a better solutions. Unfortunately they came about long after ‘my time’. I have to use the tools I have. If I had one ‘routine’ or whatever that someone sets up, I could probably adapt it. But I’d need a template and the tools to run it (and we’re not allowed to install anything on our computers).

Sorry I’m such a dinosaur.

Seconding (thirding?) that if you are not using a script and want to edit a text file manually, you need a text editor (not Word). Literally any one will do (except possibly for a shitty one without proper search-and-replace). Are there any text editors installed on your computer (since you say you are not permitted to install additional software)? I am positive there is at least something (perhaps TextEdit and/or vi/vim?)

Excel files are not plain text files, so you will need to work with CSV as explained in Stranger_On_A_Train’s message.

You can call yourself a “dinosaur” if you want, but Word does not enter into it; as you said yourself in the first message, it chokes, so you cannot really use it.

Download BBEdit. Save Excel file as .csv format. Open in BBEdit. It is very good at revealing all the various line break codes (there are more than one of them) and letting you mass-substitute to nuke the ones you don’t want.

I do it all the time to convert .csv files that use carriage return to .csv files that use line feeds because Microsoft Word for Windows’ mailmerge function is ridiculously pickly and only accepts line feed line breaks.