Excel/Access non-display character?

I need to import .csv files into Access, so I make sure there are no commas or manual line breaks in the data. But there’s one file that’s weird. When I import it into Access, some fields have double-quotes around their contents. Unlike a comma or manual line break, this does not result in the data being split into separate cells, shifting data to the right. When I export as fixed-position text though, the cells are expanded and the data is shifted.

All I can think of is that there’s some sort of bizarre non-display character there that I’ve never heard of. Does anyone know what it is?

When I’m in the office, I get rid of manual line breaks by replacing ALT+010 with pipes. At home I copy the file to Word and replace manual line breaks with pipes. So if anyone can give me a clue as to what the ALT code is, and also what it’s called in Word, that would be great.

Have you tried examining the problematic snippets with a hex editor to check for invisible characters?

It’s been 20 years since I’ve looked at hex, and wouldn’t know how to do it or what I’m reading anymore.

Johnny L.A., you might look at the following link I found while searching for “alt 010”:
https://www.proz.com/forum/office_applications/184273-how_to_remove_all_of_the_line_breaks_inside_of_the_cells_in_a_row_in_excel.html

If you can isolate the problem to a short snippet, ideally just a few characters, perhaps you could copy and paste it here? (Assuming it does not get filtered by the forum software.)

ETA “alt-010” is indeed a line feed, but I thought the problem was some other invisible character? To figure out the ALT code, you first need to identify the character.

Yes, that’s how I use Word to get rid of the line breaks. In this case, the lines aren’t broken, as the data are intact in the fields within Access. It’s only upon exporting that they shift. If I knew what these are, and they exist in Word’s special character search, I can delete the offending characters.

In Excel it looks like this:


Sayaka Hibachi & Sushi Bar

I copied the cell and just tried pasting it here. It looks like this:


"Sayaka Hibachi & Sushi Bar			"

Yes, ALT+010 is no problem. This is some other character, and I don’t know what it is. If anyone knows what it is, then I can try using Word to get rid of it (telecommuting) or using the ALT code (at the office).

Your request for a snippet allowed me to determine the character is in the Excel file, so Access isn’t the problem.

You’re saying that when you export the file from Excel as a CSV flle, the quote are added, correct? RFC 4180 only REQUIRES quotes around fields that contain a comma, quote, carriage return or linefeed. But it ALLOWS quotes around any field.

My guess is you’re on the wrong track looking for nonprintable characters. Does the quoting happen if you remove the ampersand? Excel may be treating ampersand as a metacharacter (which it is in the context of HTML, XML and other formats, but not CSV).

No.

These non-display character exist in Excel. They cannot be seen until the file is saved as .csv and imported to Access. So I need to know what these characters in the Excel file are so that I can get rid of them.
ETA: No, the ampersand is a legitimate character that does not result in weirdness.

Are you using Windows? If you are, open the CSV in Notepad++ and use that to clean it up, not Word. Word is going to hang on to those odd characters for dear life!

In NPP you can go to View - Show Symbols - Show all characters to see characters.

Also the search is really powerful so you can write a regular expression to search and replace whatever is ailing you.

(Emphasis added.)
Wait, you’re saying you can see these characters when you save to CSV? What do they look like? Or are you just inferring that they must exist because the fields got quoted?

My office computer, which I connect to on my Mac, does not have NPP. We’re not allowed to download any programs. It has to be done by an administrator, which is an outside IT support provider. They’re expensive, and this is only happening with one pair of files. Once I export the files to .txt., I use Notepad to manually get rid of the double-quotes and extraneous spaces. Word has been very useful getting rid of the ALT+010 characters and another manual line break I don’t know the code to. If I knew what the currently aggravating character is, and what it is called in Word, I should be able to do a global change instead of having to hand-edit in Notepad.

Sorry, poor wording on my part. I cannot see the characters.

In Excel it looks like this:


Sayaka Hibachi & Sushi Bar

Note that there are no double-quotes or extraneous spaces.

I copied the cell and tried pasting it here. It looks like this:


"Sayaka Hibachi & Sushi Bar			"

This is also how it looks when I open the file in Access, and when I export as a .txt file. Note the double-quotes and the spaces.

So I don’t see the characters, but I can see that they are there because of what they do to the data.

I see some “horizontal tab” characters there, ASCII code 9

Unfortunately I’m telecommuting today and the rest of the week, then I’ll be on vacation for two weeks. I can’t use ALT codes from my Mac, so it will be almost three weeks before I can try looking for ALT+090.

I did copy the file into Word. I went to Replace => Special and tried several of the options. For example, Tab Character, Column Break, Nonbreaking Space, and Whitespace. None of these is the correct option.

Tab, in Windows, should be just ALT 09, not ALT 090. I do not have a copy of Word, so cannot test how it reacts to various characters (I have not used Word for a long time; for modifying “plain text” files I find dedicated text editors like Notepad (also Notepad++, etc.), Vim, Emacs, etc., are simpler to use and what is on the screen corresponds more closely to the file contents).

One unfortunate possibility is that the Tab characters were an artifact of cutting and pasting from Excel into this thread :frowning: In that case we could be on a wild-goose chase. One thing I did try is searching for “hex editor”, downloading the first result, and opening a file I prepared with various kinds of characters in it. This took only a few seconds, and by scrolling around I can see the actual contents of the file. For Mac I am seeing Hex Fiend and 0xED, for Windows HxD. I don’t think you have to know too much about hex editing to do this, beyond Tab = 09 hex, a normal space is 20 hex, and anyway if you move around and/or highlight something you can see the correspondence between the displayed characters and the hex version.

Since I can’t do ALT codes from home, I’ve sent the file to my boss to see if she can look for ALT+09.

She said it didn’t work for her. Oh, well.

Sorry to hear that. My advice would be to take the original (csv) file, hit it with Hex Fiend (or whatever) and copy/paste the hex version of the problematic segment here in the thread where people should be able to decipher it. E.g., Sayaka Hibachi should appear as 53 61 79 61 6b 61 20 48 69 62 61 63 68 69.

I’m not sure about copying and pasting between a VPN-connected PC and the Mac I’m on. In any case, when I pasted into the left window it said AAABACBA in the left window and ™´¨∫ in the right window.