Importing text file into Excel

I am using third party software to generate text files that I want to import into Excel. If I save the file with the extension .csv, I can open it in Excel and it works fine. However, if I use the import data function, it comes into Excel all screwed up.

Well, you may ask, why don’t I just open the csv? Because I want to import the data using VBA. Certain conditions in the sheet will determine which text file I import, so I need to VBA to be able to do this.

The only solution I can think of would be to have VBA open the csv file and then copy and paste the entire sheet. This seems really sloppy and overdone to me. I have other VBA programs that import text and I am happy with them, so I’d like to keep doing it the way I’m comfortable with. Anybody have an idea why the import data function messes up the data, while opening it does not? It seems like a bug (Excel 2010, BTW.) Any ideas?

Eh, forget it-solved. If anyone has an explanation for why importing the text would be any different than opening the file though, I’d like to hear it. It seems like it has to be a bug.

I’d like to take a look. Can you put an example up?

Unfortunately I can’t. It’s got stuff in it that falls under HIPPA and is confidential.

I can tell you what is going wrong. One of the fields being imported has text that was typed in by a user. If the user types a return in the field, when importing the text Excel sees that return as an End of Line character and starts a new row, even though the return falls within the text qualifier ("). When opening the csv, though, this doesn’t happen and the returns are imported into the cell with the rest of the text.

Bottom line, the software vendor sucks. I had this problem in another one of their reports and they fixed it. Then I found it in this report, and when I asked them to fix it they refused because they discovered that it would open correctly if you open the csv. That worked fine for a while, until I wanted to import the data using VBA. I only need to import one column, but because of this BS now I have to open the whole thing in a new workbook and copy/paste. I found another problem in the report that I can ask them to fix, so they may be willing to fix this problem at the same time, since they clearly already know how to fix it.

RFC-4180 (the ‘standard’ definition for the csv file format) does allow line breaks (carriage returns) provided they are contained within a quoted field. So your vendor IS providing a file that meets the proper standards.

It is Excel (or VBA) that is not properly handling the standard csv file format. (But then, Microsoft is traditionally pretty uncaring about meeting standards.)

A suggestion: replace the double-quotes (") with single-quotes (’). Excel sometimes deals better with them. (Note - specifying which version of Excel you are using would have been helpful.)

If that doesn’t help, a work around might be to scan the incoming files, replacing all returns in quoted fields with another, unused character (like ~, for example), before importing the file. Then it should import properly. You would have that character instead of returns in your spreadsheet, but they would still be readable text. Possibly a replace macro or VBA could then change it back to a return.

Can’t your VBA script scan for CR/LFs or whatever’s causing the issue and change them into whatever they need to be?

Thanks for the info on the csv file standards, I wasn’t aware of that. Doesn’t change my opinion of the vendor though; there is no reason for that field to be included in the csv in the first place.

(I did specify Excel 2010 in my OP.)

I tried replacing the double quotes with single. It didn’t work, and the replace took almost 15 minutes (there were 1,464,285 replacements). So, I don’t think replacements would be a good solution in this case, even if it did work.

Out of curiosity, how would I just replace returns in quoted fields, without also replacing the returns that are legitimate EOL characters? The only thing I could think of was to replace "¶ with ~, then replace all the remaining ¶ with something else, and finally change ~ back to "¶. With replaces taking 15 minutes, that’s obviously not an option, but is there an easier way?

More curiosity, what would happen if a user were to type quote marks in that text field? How would that affect the import?

I feel your pain. There’s no reason to forbid the user from entering a quotation mark or a line break in a text field. (Often, there’s even a helpful note showing how to do it, like SHIFT-RETURN or something.) Still, these converters and readers and such ignore that possibility, and just merrily chug away, processing the data without ever actually looking at it, and they just give bad results - or stupidly error out - when they’re blindsided.