Why are assignments in conditionals legal in C?

if (x=4) {} is legal in C (for those who are not familiar with C, this statement does NOT test if x is equal to 4, rather it assigns four to x, and then tests if that value is non-zero (it is)).
I can’t think of any case where doing this would lead to faster or more readable code. And, it obviously leads to all kinds of logic errors, due to the difficulty of distinguishing between ‘=’ and ‘==’.
So, why is it legal?

Here are some answers: http://stackoverflow.com/questions/151850/why-would-you-use-an-assignment-in-a-condition

For those who don’t feel like clicking the link, the one common pattern in which assignments are used in conditionals looks like this:



while(x = getSomeData()) {
    // Do something
}


By the way, there’s a GCC option, -Wparentheses, that will trigger warnings when you use an assignment inside a condition.
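A rough sketch of how that warning plays out, assuming GCC with -Wparentheses (or -Wall, which includes it). The helper next_value is a made-up stand-in for getSomeData() above, not anything from a real library. Wrapping the assignment in an extra pair of parentheses is the usual way to tell the compiler the assignment is deliberate:

```c
#include <stddef.h>

/* Hypothetical data source: returns the next item, or 0 when exhausted. */
static int next_value(const int *items, size_t count, size_t *pos)
{
    return (*pos < count) ? items[(*pos)++] : 0;
}

/* Sums items until the source returns 0. */
int sum_until_zero(const int *items, size_t count)
{
    size_t pos = 0;
    int x;
    int total = 0;

    /* Writing `while (x = next_value(...))` here would draw the warning.
       The doubled parentheses mark the assignment as intentional. */
    while ((x = next_value(items, count, &pos))) {
        total += x;
    }
    return total;
}
```

Calling sum_until_zero on an array such as {3, 4, 5, 0, 9} stops at the zero, just like the getSomeData() loop stops on a zero return.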

Debugging tip: get in the habit of putting the constant on the left. Turns a warning into a compile error. :slight_smile:
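A small sketch of the constant-on-the-left habit (is_four is just an illustrative name). It only helps when one side of the comparison is a constant or some other non-assignable expression:

```c
/* Returns 1 when x equals 4, otherwise 0. */
int is_four(int x)
{
    /* With the constant on the left, dropping one '=' gives `4 = x`,
       which the compiler rejects because 4 is not an assignable lvalue. */
    if (4 == x) {
        return 1;
    }
    return 0;
}
```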

And another common pattern:



if (failureCode = doOperation()) {
    handleFailure(failureCode);
}


I wouldn’t really recommend the pattern for isolated bits of code, but it works well when you have a bunch of similar error checks in a row. It saves a bit of screen space. In fact, at this very moment I’m looking at some code that does the following:



if (FAILED(status = doSomething()))
    return status;

if (FAILED(status = doAnotherThing()))
    return status;

if (FAILED(status = doYetAnotherThing()))
    return status;


What makes it particularly error prone is not just the = vs. the ==, but the C use of integer 0 = false, non-zero = true. Java retained as much C/C++ basic syntax as it could, expressly to make it easy for C/C++ programmers to move to it. It kept embedded assignments in expressions, including conditional expressions, but introduced a boolean type and insisted that conditional tests be expressions of type boolean. If x is an integer variable, then

if (x = 3) {

winds up failing to compile not because of the choice of operator, but because the expression is the wrong type. It keeps the feature that the expression in a conditional is syntactically like any other expression while preventing the common error from a typo. Saying something like:

if (b = [boolean expression]) {

where b is a variable of type boolean just isn’t going to come up very often.

C is a language with an extremely simple, extremely elegant, very consistent syntax. The if statement has the form

if (expression) statement

To make the language “friendlier” by outlawing some expressions in this context might appeal to some, but would contradict the simple and consistent syntax that many think gives C a special beauty.

Which doesn’t necessarily answer the question, “Why does C make assignment an expression operator?” (As opposed to a type of statement, of the form ‘identifier = expression’, having no value of its own.)

I realize that there’s a lot of cool tricks you can do with assignment expressions, but I do sometimes question if they’re worth the trouble.

Well, if it weren’t, it would break other things that look like they should work, such as:
x = y = 10;

Maybe they aren’t worth the trouble, but C takes maximum expressiveness as a core principle. Not that you want to maximize code density at all times, but there’s a certain value to it.

IMO, that only looks like it should work if you’re used to it. :wink:

One possible answer is:

They wanted a=b=c to be valid. So the idea was that when generating code for b=c, you leave c in the register, and then the value can be reused for anything else.

Another possible answer is:

It’s perfectly understandable and deterministic… reliable… usable, costs nothing, and in fact saves… saves on coding, saves on CPU power.
I would have made the assignment token something different, e.g. “a #= b” is assign b to as… … it was a mistake to have == being to test for equality and = for assignment.

A better example of where this is useful is in while loops:


while((bytes_read = read(fd, buffer, size)) > 0) {
    //do something with buffer
}

If assignment statements are not legal expressions, it’s more awkward:


while (1) {
    bytes_read = read(fd, buffer, size);

    if (bytes_read <= 0)
        break;

    // do something with buffer
}

The short answer is: Because of its philosophy of maximum simplicity and consistency. There is no need to create a new type of statement (identifier = expression) as you advocate.

Whether such facilities are “useful” may be debatable. The most common occurrence
x = y = n;
is easily replaced, e.g.
{ y = n; x = y; }
Presumably you would also outlaw the simpler
y = n, x = y;
which would make many straightforward for statements impossible.
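To make that last point concrete, here is a sketch of the kind of for statement that leans on the comma operator, with two indices initialized and stepped together. The function name reverse_ints is only for illustration:

```c
#include <stddef.h>

/* Reverses the first n elements of a in place. */
void reverse_ints(int *a, size_t n)
{
    size_t i, j;

    if (n < 2) {
        return; /* nothing to swap; also avoids n - 1 wrapping when n is 0 */
    }
    /* Both the init clause and the step clause use the comma operator. */
    for (i = 0, j = n - 1; i < j; i++, j--) {
        int tmp = a[i];
        a[i] = a[j];
        a[j] = tmp;
    }
}
```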

Here’s a real line from the Linux kernel boot inflater:
z = (z = g - w) > (unsigned)l ? l : z; /* upper limit on table size */

Here’s a line from a simple quadratic prober:
ix += incr ? (incr += 2) : 2;

You’re welcome to call these “bad style,” but crippling the syntax to enforce “good style” was not a priority for C’s designers.

Thanks for all the answers.
The reason I asked the question is I introduced a baffling bug in some code I was working on recently. I had created a macro which consisted of a few tests OR’d together ( (A==B) || (C==D) || (E=F)), and the code failed in a weird way. Turns out I made a typo, and left off the last equals sign (as I did above). It got me thinking about how common this problem must be. I do agree with some of the reasoning above - I use the construct of type “if !(c = read(…))” frequently, but I tend to write it in a more explicit fashion - “if (((c = read()…) != 0))” so that I can remember what’s going on…

I do like the concept behind Java’s “conditional tests must be of type boolean” - I’d be interested to know if that makes a noticeable dent in these types of errors.

I wish I had thought of that (or heard it suggested) 30 years ago!

Algol and the whole family of Algol-like languages did use a different symbol. Conceptually, they wanted to use a left-pointing arrow for assignment, as in:


i ← i + 1

So they chose the digraph := to suggest this, writing:


i := i + 1

while using = solely for the comparison operator.

Are you sure that it’s deterministic? In Algol, it was ambiguous what would happen if you wrote:


i := 2 ;
a[i] := i := 3 ;

Would it store 3 into a[2] or into a[3]? What if you wrote the equivalent in C? Is it made clear anywhere which element of a would be affected? (You can’t answer this simply by saying that the assignments are done from right-to-left. It could still evaluate the i in a[i] first as 2, and remember that, even though it then assigns 3 to i before assigning 3 to a[i].)
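On the C side of that question: as far as I can tell, the C standard doesn’t pick either answer. Writing a[i] = i = 3; both modifies i and reads it with no sequencing between the two, which I understand makes the behavior undefined. So the safe move is to spell out which meaning you want. A sketch, with function names made up for illustration:

```c
/* Stores 3 at the index i held before the assignment, then sets i to 3. */
void store_at_old_index(int *a, int *i)
{
    a[*i] = 3;
    *i = 3;
}

/* Sets i to 3 first, then stores 3 at the new index. */
void store_at_new_index(int *a, int *i)
{
    *i = 3;
    a[*i] = 3;
}
```

Starting from i equal to 2, the first version writes a[2] and the second writes a[3]; both leave i at 3.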

Are A, B, C, D, E, and F parameters to the macro? If so, you might be well advised to write the expression as:


( ((A)==(B)) || ((C)==(D)) || ((E)==(F)) )

Otherwise, if the actual parameters when the macro is used are expressions, then the operator precedences can get garbled. You can get all kinds of subtle bugs that way. Similarly, the body of any conditional statement or loop should always be enclosed within { … } even if it’s just one single statement (get in the habit of it!) since that single statement might be a macro call (and you might not necessarily even know it) that expands to multiple statements. If you didn’t surround the macro call with { } then only the first of the expanded statements within the macro will be in the body of the conditional or loop! OOPS.
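A sketch of both hazards, using made-up macro names (SAME_BAD, SAME_GOOD, RESET_PAIR). The do { … } while (0) wrapper is a separate, complementary habit for whoever writes the macro; it is not a replacement for the braces advice above:

```c
/* Unparenthesized parameters: operator precedence can regroup the expansion. */
#define SAME_BAD(A, B)  (A == B)
/* Parenthesized parameters: each argument is evaluated as a unit. */
#define SAME_GOOD(A, B) ((A) == (B))

/* Hypothetical multi-statement macro; the do { ... } while (0) wrapper
   makes the whole expansion behave as a single statement. */
#define RESET_PAIR(x, y) do { (x) = 0; (y) = 0; } while (0)

/* Expands to (flags & 2 == 2), which groups as flags & (2 == 2), i.e. flags & 1. */
int bad_result(int flags)  { return SAME_BAD(flags & 2, 2); }

/* Expands to ((flags & 2) == (2)), which is what was meant. */
int good_result(int flags) { return SAME_GOOD(flags & 2, 2); }

/* The un-braced `if` is safe here only because RESET_PAIR is one statement. */
void reset_if(int cond, int *x, int *y)
{
    if (cond)
        RESET_PAIR(*x, *y);
}
```

With flags set to 2, the two versions disagree about whether the bit is set. GCC may also flag the bad expansion under -Wparentheses.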

¡Bravo! for you! I’ve always coded like that, and it makes a big difference in a program’s codability and readability. It boils down to writing code that actually says what it actually means.

Yes it does, but that’s a side effect of Java’s rather strict type rules. C, by design, has very loose type rules and even if it did have a boolean type, like C++, it wouldn’t solve the issue. Stricter type casting rules would, but then C wouldn’t be C anymore.

This goes well beyond this particular issue, but in general stricter type rules mean that what would result in runtime problems causes compile time errors instead.

Because C was designed to be simple and powerful, not protect n00bs from themselves. I don’t think it would have really occurred to K&R to design features into the language that protected people from typos or thinkos. The ethos of programming was different in the 70s, cf. this hoary bit of usenet flotsam:

http://home.nvg.org/~venaas/jargon/jargon_48.html#SEC55

Perhaps more relevant is that the amount of time and effort expended on programming, on a man-hours-per-line basis, was much, much higher. In those days the license for a good compiler cost $1,000 (like $1,500 to $2,000 these days), and the machine it ran on cost $50,000, about three times the cost of a new car. So you invested a lot of time and effort to craft your code. Even into the late 80s it was fairly routine, once you had the stuff working, to go back over it and spend a lot of time tuning it up to be faster and more efficient, and to take a squint at the compiler output or even write some bits in assembler yourself to get the best performance. By modern standards this is crazy, of course, unless you’re writing the Linux kernel or a realtime OS.

Because it fell out of the language construction. C wasn’t systematically designed like Pascal, nor designed by committee, like Algol. It was cut down from a larger language, and they cut out the stuff they didn’t see any need for.

In C, assignment statements have a value, so that you can do assignments like A=B=C=5. People liked doing assignments like that. It was even more important in C, because the people writing the assignments had to do their own typing, and they didn’t know how to type.

(Programmers never could type. If they were lucky, they had key-punch operators to do their typing. If they didn’t have key-punch operators, they preferred descriptive names like “I” and “J” for their variables. C was particularly popular with people who couldn’t type.)

Conditionals evaluate expressions like “a > 5+c”. When your assignment statements are expressions that have a value, like A=B=C=5, then you can use them in a conditional just like any other expression.
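A small sketch of that idea, with illustrative function names: the same "assignment yields a value" rule that makes chained assignment work is what lets an assignment sit inside a condition.

```c
/* Chained assignment: c = 5 yields 5, which is assigned to b, then to a. */
int chain_sum(void)
{
    int a, b, c;
    a = b = c = 5;
    return a + b + c;
}

/* The assignment's value feeds the test directly.
   Returns the stored value if it is non-zero, otherwise -1. */
int assign_and_test(int *dest, int value)
{
    if ((*dest = value) != 0) {
        return *dest;
    }
    return -1;
}
```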

You could of course make assignment expressions a magic kind of expression, an expression that could not be used in a conditional, but that would just be adding another layer of complexity into the C compiler, something that was already notoriously difficult to get correct and often partly broken.

Programming languages of that era were primarily thought of as “assembly with less typing and maybe a modicum of portability”, intended to be used by already skilled assembly programmers looking to save themselves some time. The idea that they could do error checking that assembly language could not was a later concept. Your processor probably does not have a separate “Boolean” type, so why should your programming language? The only thing it can do is take flexibility away from you.