NAN: What is it good for? (programming)

In the olden days, if you tried to perform an illegal arithmetic operation in a computer program (dividing by zero, taking the log or sqrt of a negative number in a context that didn’t support complex numbers, etc.), the program would crash with an error message something like “Runtime error: Divide by zero”. This was good: whenever a program tried to do something like that, it was almost always a sign that the programmer was doing something stupid, so it was appropriate to alert him to this as soon as possible, and to give some indication of what the problem was, so he would notice and mend his sinful ways.

Nowadays, however, due to advances in compilers, processors, or both, an attempt at such an operation no longer produces an error. Instead, it quietly returns a result of NAN, or “Not a Number”, and the program continues on its merry way, propagating that NAN all over the place and trashing the program’s internals in a manner that can be extremely frustrating to track down. This is bad.

But obviously, the folks who implemented this whole NAN business had very good reasons for it, and believed that their new invention would do more good than harm. So… what is it good for? Where are all these situations where a programmer who’s tried to do something stupid would actually want it to succeed, instead of being warned sternly of the error of his ways?

NAN isn’t anything new. It’s been part of the IEEE floating-point standard (IEEE 754) since it was first drafted in the late 1970s. Positive and negative infinity are also part of the standard. The standard became official in 1985, but it was in use in real hardware long before then.

In the IEEE 754 implementation, NaN consists of the exponent bits all set and a non-zero value for the significand. This allows you to store an error value in the significand which you can use later in your calculations.
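
For the curious, here’s a minimal C sketch of that bit layout (assuming the usual 64-bit double; the 0x42 payload is just an arbitrary illustration, and whether a payload survives real arithmetic is implementation-dependent):

#include <math.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
    /* Build a quiet NaN by hand: all 11 exponent bits set, non-zero significand.
       0x7FF8... sets the exponent and the "quiet" bit; 0x42 is an arbitrary
       payload stored in the low significand bits. */
    uint64_t bits = 0x7FF8000000000000ULL | 0x42;
    double d;
    memcpy(&d, &bits, sizeof d);

    printf("isnan: %d\n", isnan(d));   /* prints 1 */

    /* Recover the payload from the low 51 significand bits. */
    uint64_t out;
    memcpy(&out, &d, sizeof out);
    printf("payload: 0x%llx\n", (unsigned long long)(out & 0x0007FFFFFFFFFFFFULL));
    return 0;
}

(C99 also provides nan("…") for building such values without the bit twiddling, though how it interprets the string is implementation-defined.)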

Let’s say, for example, you are executing some sort of control algorithm (this is what I do for a living, which is why I picked this example). If you get some sort of error in your algorithm, the last thing you want is for the control algorithm to just crash; then you lose control over whatever it is you are trying to control. If you have a NaN implementation, though, you can check for NaN after you’ve executed your algorithm, and if your result is NaN you can go to some particular error state that you’ve encoded in the significand instead of just crashing and burning.
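
Very roughly, something like this (run_controller and the numbers are made up for illustration; a real loop would obviously do more):

#include <math.h>
#include <stdio.h>

/* Hypothetical controller step: a NaN measurement propagates into the command. */
static double run_controller(double setpoint, double measurement)
{
    return 2.0 * (setpoint - measurement);
}

int main(void)
{
    double last_good = 0.0;
    double measurement = NAN;                 /* e.g. a failed sensor read */
    double cmd = run_controller(100.0, measurement);

    if (isnan(cmd)) {
        /* Don't let the loop die: hold the last known-good output
           (or decode an error code from the NaN payload) and keep running. */
        cmd = last_good;
    }
    printf("command = %g\n", cmd);
    return 0;
}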

And your program doesn’t have to just continue, even with a modern compiler. In .NET, for example, NaN throws an exception, which can be caught and handled. There is probably a point to be made that this tends to encourage lazy programmers who don’t handle all of the exceptions that they should.

Another example might be a computer game. Let’s say your rendering library happens to have a bug in it, and in certain parts of the game this results in a NaN being generated. The old-fashioned crash-and-burn approach means your program dies and the user gets mad. By using NaN and ignoring the exception, the game continues on; something on the screen might not look right, but instead of a game-stopping crash all you get is a minor annoyance. Again, a point can be made that the right way of fixing it is to get to the bottom of the bug in the rendering engine, but with complex software these days and short development and test cycles, you do what works best instead of what is right. It makes the end user happier.

I wonder if the change in the standard behaviour has to do with floating point operations being done in hardware now. Hardware designers are more of the “this must return something no matter what” types.

I haven’t seen much use for NaN, but I have seen plenty of use for +inf and -inf. Unlike NaN, a function taking an inf can return a sensible answer if the limit is finite (e.g. erf(inf) = 1 in a lot of libraries). I’ve found that this behaviour gives the “right” answer in a reasonable number of cases where an “internal inf” occurs, and saves some edge-case handling.
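
A few concrete cases from C’s math library, for the curious (these limits are the standard IEEE/C99 behaviour, though I’d still double-check any particular library):

#include <math.h>
#include <stdio.h>

int main(void)
{
    printf("erf(inf)  = %g\n", erf(INFINITY));        /* 1 */
    printf("atan(inf) = %g\n", atan(INFINITY));       /* pi/2, about 1.5708 */
    printf("exp(-inf) = %g\n", exp(-INFINITY));       /* 0 */
    printf("1/inf     = %g\n", 1.0 / INFINITY);       /* 0 */
    printf("inf - inf = %g\n", INFINITY - INFINITY);  /* nan, as the next post notes */
    return 0;
}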

Arguably, if you have (+/-)inf, you need to also have NaN to represent inf-inf, &c.

It’s handy for Javascript form validation :slight_smile:

There’s nothing fundamentally wrong or stupid about asking for sqrt(-1). It’s a perfectly reasonable question to ask “what is the square root of -1”; it’s just that the answer is a little complex. The sqrt() function doesn’t know in what context it was called, so it shouldn’t make assumptions about whether this is a complete disaster or a reasonable question.
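
And indeed, in a context that does know about complex numbers, the very same question has a perfectly good answer (a quick C99 sketch; complex support varies by compiler):

#include <complex.h>
#include <math.h>
#include <stdio.h>

int main(void)
{
    double complex z = csqrt(-1.0 + 0.0 * I);
    printf("sqrt(-1)  = %g\n", sqrt(-1.0));              /* nan in a real context */
    printf("csqrt(-1) = %g%+gi\n", creal(z), cimag(z));  /* 0+1i in a complex context */
    return 0;
}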

Think about how you handle an error given various error signalling types:

if ( is_error( sqrt(a) ) ) …

if ( isnan( sqrt(a) ) ) …

try { sqrt(a) } catch (ArithmeticException e) { … }

Practically speaking, they achieve essentially the same thing. The NaN version involves the least syntactic baggage and stays closest to the simple mathematical function. (One wrinkle: NaN compares unequal to everything, including itself, which is why you test with isnan() rather than ==.)

Except, if you’re dealing with a situation where you might end up trying to take the sqrt of a negative number, the best way to handle it is to put in a statement like "if (a < 0.) …" just before the square root. That situation surely indicates that you’re dealing with something nonstandard, so you should be explicit about exactly what behavior you want. And if you’re taking the square root of something that you don’t think can be negative, but it turns out that it can be, well, that’s something you would want to be informed of, isn’t it?

Now, I can see that there might be some use for ±inf, in some cases. But that still doesn’t imply that you need NAN to handle inf - inf and the like, because you could just bump the error-and-crash behavior out to that situation.

NaN is also used for an unknown/indeterminate number.

For example, in XAML (Silverlight/WPF), you specify height and width of your controls (such as a button or text) with a number of options. You can specify a fixed size which of course is always a value. You can specify “Auto” which means take the space you need and no more. You can specify relative sizes as well, so if you have two buttons, sized “2*” and “5*” then the buttons get 2/7 and 5/7 the total available size respectively.

The size for the latter two above will report back to the programmer as NaN. They are not fixed and don’t have a static value.

In production code, I have seen NANs used as fill values (“no data here”) in images and other kinds of arrays. Then, when you do array arithmetic, the CPU isn’t spending time on validity checks for every element of the input data. If your arrays have millions of elements, this can be a huge time savings. (But it does assume you didn’t want to halt at the first discovery of an invalid value.)

Why would you want to check for NaN after you’ve executed your algorithm? Wouldn’t it make more sense to validate the data before executing it?

Similar to what Chronos posted. If you wait until afterward, you won’t know what went wrong. Was it a number that was too big? Was it a number that was too small? Maybe it was alpha text. It’s been a while since I did any serious programming, but I clearly remember how much trouble I got into because I failed to distinguish between blank and null.

First verify. Then execute. What’s good for the battlefield is good for the code too.

It also assumes that you don’t care which values are the invalid ones.

I’m from an accounting background. Every value had better be a valid one. Can you suggest an area where it wouldn’t matter so much? I’m really curious.

Now you have two more problems.

One, that looks ok in the case of something simple like sqrt(a). But what if it’s calculate_foobar(a)? You wind up with duplication or coupling in your conditional. Better to call the function and let it tell you if the result can’t be calculated.

And two, you have to assume that someone will come along later and change your code, and you’ve just doubled the opportunities for them to screw it up. For example, if they change it to sqrt(a - 1) you also have to rely on them to change your conditional to “if (a < 1) …”. They have to get two things right, as opposed to just one in the other idioms.
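
To make that concrete (calculate_foobar and handle_error are made-up names):

#include <math.h>

/* A made-up function whose domain the caller shouldn't have to know about. */
static double calculate_foobar(double a)
{
    return sqrt(a - 1.0);              /* only valid for a >= 1 */
}

static void caller(double a)
{
    /* Pre-checking duplicates the function's internals; if the "- 1.0" ever
       changes, this guard silently goes stale:
       if (a < 1.0) { handle_error(); return; }                              */

    /* Letting the function report failure keeps that knowledge in one place: */
    double r = calculate_foobar(a);
    if (isnan(r)) {
        /* handle_error(); */
        return;
    }
    /* ... use r ... */
    (void)r;
}

int main(void)
{
    caller(0.5);   /* takes the NaN branch */
    caller(2.0);   /* fine */
    return 0;
}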

There is no law that requires a fatal error to return a NAN; the compiler could just as easily have produced a crash. But with a NAN the program can trap the error and respond appropriately. Seems to me this is a feature, not a bug.

I don’t know the answer for sure, but I think the OP is mistakenly assuming that the designers of NaN felt the same way he does: that it’s best to have programs crash as soon as possible to inform the developers of clear bugs, as opposed to letting those bugs escape in a harder-to-track, but less “devastating”, manner. It seems quite plausible that the designers of NaN in fact explicitly preferred the latter option and developed the semantics of NaN for that reason. leachim’s “Hardware designers are more of the ‘this must return something no matter what’ types” rings reasonably true to my ears as well.

And you’ll end up with a user-hostile generic message like “Invalid entry” without giving the user a clue as to what was invalid about it. If you check it first, and tell the user what was wrong, not only will the user be less frustrated, but the programmer just might catch some things that he thought were invalid but should really be allowed.

Let’s not forget the point made in the original post (emphasis mine):

Put another way: why should there be programming languages with NULL pointers? It’s just the same concept, and the cause of quite similar hassles (to a far greater degree of ubiquity), but someone at some point thought it was a good idea… They were mistaken, but the explanation is still just that they sincerely bought into a poor design.

Why do you say “mistakenly”?

Preferring “the latter option” is why we have so much crappy software out there. Please fight my ignorance: Under what circumstances would “the latter option” result in better software?

I don’t prefer the latter option. I am of a similar philosophy as the OP and you. The potential “mistake” is not the OP’s philosophy; it’s the OP’s assumption that the designers of NaN were of this philosophy as well. That need not be true; indeed, the very issue we are discussing could be taken as evidence against it.

You care, but maybe not immediately, when you’re doing computations. You care later, when you’re displaying or analyzing the end product.

You’ve received an image from a spacecraft — but this being an imperfect world, it’s a little “damaged”. Some of the scan lines, or perhaps a few thousand bad pixels scattered here and there, are marked as garbage. These pixels get a special “fill value” in the image you receive.

It’s natural, or at least sometimes useful, to map these fill values to NANs in your image arrays, to make the processing easier to code. Then you can just write A = 100 * (B - C) for example (if you’re subtracting two images and scaling the result), without the CPU needing to validate every pair of input numbers.

Obviously at some point you will care about how many NANs there are. Too many, and the end image is worthless. But, putting that check “inside the loop” of the image arithmetic is not necessarily a cost you need or want to pay. It can be more efficient to check the image’s validity after the main computation is done.
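
In C-ish terms the inner loop can stay that simple (a toy sketch with a six-pixel “image”; real code would of course read the arrays from the instrument):

#include <math.h>
#include <stddef.h>
#include <stdio.h>

#define N 6   /* tiny "image" for illustration */

int main(void)
{
    double b[N] = { 10, 12, NAN, 11, NAN, 13 };  /* NaN marks a missing pixel */
    double c[N] = {  9, 11,  10, 10,  12, 12 };
    double a[N];
    size_t bad = 0;

    /* No per-pixel validity checks: NaN just propagates through the arithmetic. */
    for (size_t i = 0; i < N; i++)
        a[i] = 100.0 * (b[i] - c[i]);

    /* Check validity once, after the main computation. */
    for (size_t i = 0; i < N; i++)
        if (isnan(a[i])) bad++;

    printf("%zu of %d pixels are fill values\n", bad, N);
    return 0;
}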

You don’t always have the luxury of validating a number. One example: you are reading the temperature out of a piece of monitoring equipment and using it to control the heat being applied to a vessel. If the machine returns NaN when the temperature probe breaks, you know to shut off the heat. Similarly, plus infinity might signal an overrange value (the temperature exceeds what you are capable of measuring) and minus infinity might indicate an underrange.
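
A sketch of that kind of check (read_temperature and set_heater are hypothetical stubs, and the fail-safe policy is of course application-specific):

#include <math.h>
#include <stdio.h>

/* Hypothetical driver calls, stubbed out for illustration. */
static double read_temperature(void) { return NAN; }   /* pretend the probe broke */
static void   set_heater(int on)     { printf("heater %s\n", on ? "on" : "off"); }

int main(void)
{
    double setpoint = 80.0;
    double t = read_temperature();

    if (isnan(t)) {
        set_heater(0);              /* broken probe: fail safe, heat off */
    } else if (isinf(t)) {
        set_heater(t < 0);          /* +inf = overrange (too hot, heat off);
                                       -inf = underrange (colder than measurable) */
    } else {
        set_heater(t < setpoint);   /* normal bang-bang control */
    }
    return 0;
}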

Another example is the game example I posted above. What if you didn’t write the rendering library and don’t have access to its source? If it has a math bug you don’t want it to crash since you can’t debug it anyway. Or maybe you do have access to the source but since it is someone else’s library you aren’t familiar enough with it to debug it. You want it to keep on going and maybe put something in an event log somewhere so you can figure out what happened. You don’t want it to crash. These days people buy libraries all the time to save code development. This isn’t an uncommon scenario, and isn’t restricted to game development either.

This doesn’t mean you should always do it this way, but the OP was asking what it’s good for. No one is saying it should always be used like this.

Quoth tellyworth:

And then, when that later programmer tries to run the incorrect code, it’ll crash and give him an error message like “Invalid argument for sqrt”, and he’ll be forced to go back and fix his mistake.