I am working with a client who is sending us an excel spreadsheet that I am loading into a SQL database via openrowset.
I recently started receiving some conversion errors trying to get data into some numeric cells.
It turns out there are some cells that appear to be empty in the Excel spreadsheet, but do not come into SQL as NULLs as they always have in the past.
When I look at the cell and compare it to a neighbor cell they appear to be exactly the same. Same formatting, same everything as far as I can tell. When make that cell active, the formula bar shows nothing. Here are other tests I’ve run, several of which are redundant in their own way; but I’m grasping at straws…
=ISBLANK(): FALSE
=CODE(): #VALUE
=LEN(): 0
=ISTEXT(): TRUE
=TYPE(): 0
What I do know about Excel is that you can have zero length strings, and you can also have non printable characters in a cell, and in both cases the isblank() will return false, the len() will return zero. i.e. entering ="" or = char(9) in a cell will result in the above scenario, but you’ll be able to see that information in the formula bar. Nothing is locked or hidden on that cell. Furthermore, I can open the Excel file, put my cursor into the cell, press backspace, save it, and it performs as i would like (passing a NULL value to openrowset). This leads me to believe there is a hidden ascii code in there, but I’m not understanding why it’s not showing up in the formula bar. Could the individual cell not be marked hidden, but that part of the worksheet is somehow hiding the formula display?
I’m thinking my next step is write some VBA to translate some cells but I thought I’d throw it out here first.
(* Yes I know there are no cell references in the above formulas, but I did use cell references in my spreadsheet)
In Excel VBA the IsNull function checks for null, if you want any cell that’s “empty” visually to go into SQL Server as null then it has to actually be something that is recognized as null or “DbNull” by Sql Server. Typically this means no empty strings, no line characters (like new line), no whitespace, no tab character etc.
I don’t know how any of this is structured so this may not be viable, but one solution would be to set anything that’s LEN is zero to null programmatically in VBA.
ah yes extra spaces cause problems.
Its only treating empty string as NULL.
anything else, even a single space, is a string and its giving an error.
You can select the column and do “find and replace”… space to nothing.
Or you can write a macro that delete spaces on the selected cells… or by column name… When these clean up work becomes tedius, you might have other common clean ups to the macro …
Yes, it’s easy to fix on the import. As you stated, I can set anything with a LEN() =0 to NULL, but I’m not asking how to fix it… I want to know how to figure out what the character is on the cell.
There are no extra spaces. Whatever is in the cell is of LEN()=0; And until I know what the zero-length character is, I can’t search and replace either.
If you’re just curious as to why it doesn’t appear on the formula bar I’d say that’s just by design. Most markup characters do not. In a typical computer program where say, a user can enter true free text into a text box that means they can tab, space, carriage return etc. You don’t want the user to see whatever the character literals are for those (in a lot of languages it may be
, , etc, in VBA tab is like VbTab etc.) This is no different than why Word doesn’t show a carriage return indicator every time you’ve hit enter, or a tab character every time you tab. That’s just now how strings are typically rendered in really any computer program.
The formula bar’s purpose is to let you type in Excel formulas and show you the information in the cell, and there is nothing in that which would require you “see” the literal for stuff like tabs, space etc.
If you’re just really really obsessed with what it is, then in VBA you should be able to take the cell in question, cast it to a string, and then manually inspect the string at run time (you’ll need to debug and examine the string variable once its value is set programmatically.) Or you could do some Excel/VBA equivalent of printing it to console if Excel/VBA can do that (I’m not sure.)
Or (and I’ve done some VBA, but not much) assuming VBA has regex functions you could do a series of tests, see if it contains new line, tab, carriage return etc and return a message as to what it is.
I only mentioned the formula bar to provide more information. I figured if I didn’t mention it the first bit of help would be for me to see if there is a formula in there.
Basically that’s the issue… It’s easy to work around the problem, but I’m obsessed with finding out what is in that cell.
Since the file is from a client its not in my purview to clean it up on the excel side. It’s an easy fix. My issue is one of curiosity. I simply want to know what’s in that field.
If it’s an xlsx file, that’s simple a zip file containing all the data & such in a set of folders and files in an ungodly xml format.
You could look up the OpenXml standard to learn how to decode it, or just dig around and have fun sleuthing it out all by yourself. If you REALLY have to know, that might be a fun way to blow an afternoon.
Just firing it up in Excel & using VBA debug would be orders of magnitude easier. But not nearly so satisfying.
Why force us to speculate blindly? Can you just put up a copy & paste of the cells in question on an online worksheet? OneDrive lets you share Excel via a web viewer.
We’re like minded on the sleuthing idea. I actually already tried something similar. I saved the file in its original format and saved it after fixing that one cell. I then opened both files up in a binary editor and compared them. That didn’t work as well as I hoped, and got me nowhere.
Not true. If I put a char(9) (tab character) in a cell it comes back as a zero length field.
Thanks for the link. Ill give it a try. As for forcing you to do anything… I guess I’m sorry you feel forced to speculate blindly? The reason I didn’t post anything is because I’m not specifically seeking what’s in this one cell… I’m seeking the knowledge to determine what is in any cell similar to this and I thought it would be quick and easy answer for an excel expert.
Even though I only changed the one cell, the files were massively different, and in several parts of the document too. Not at all what I expected. I was expecting the files to be 99% the same. Nope.
The simplest solution would be to cast the variables as strings in VBA and examine them at runtime…would take maybe 10 seconds using nothing but what comes with Excel.
Maybe I’m misunderstanding this, but wouldn’t casting it as string still result in a non-displaying character? I mean aren’t " ", " ", and " ", all strings even though they’re different characters?
You know, try this function first:
=unicode(a1) to return the unicode codepoint of the first character in that cell. Does it show anything at all for the blank ones?
I interpreted this to mean use a function to cast it as a readable ascii string; which did using vba and the asc() function. It’s weird, when I tried the vba function failed. I placed a tab character into a test cell, ans using the same vba code it correctly return the ascii value; so I know I coded it right.