Extremely basic programming question.

Perl actually has different operators for numeric and string types (numeric uses the conventional C-like operators, strings use “alphabetic” operators like eq, le, gt, etc.). Some examples:
print '1000' == '1e3';
> 1

print '1000' eq '1e3';
> (prints nothing: a false comparison returns the empty string, which is “false” in Perl)

print '1000' eq '1000';
> 1

print 1 == 1.0000000000000001;
> 1 (not enough precision in a double)

print 1 eq 1.0000000000000001;
> 1 (Perl converted to a double before promoting to a string)

Okay, upon reading, I see that Jragon also gave a reasonably detailed and correct analysis, and Indistinguishable also had the right idea.

But by Og, don’t y’all agree that the K&R version is awful? (BTW, I don’t think any current versions of K&R have this. It was in their first edition, way back when.)

Oh, and you don’t actually need to keep separate copies of the original values of src and dst in the case actually used in K&R. This really WAS the full body of the strcpy() function, and src and dst were the parameters passed by value into the function, so there was no need to preserve their original values.

Well, it may be awful, but it’s at least correct. And it doesn’t exploit any weirdness with the compiler or parser–the behavior is completely explainable by the basic properties of C (implicit booleans, operator precedence, postfix operator behavior, etc.).

The main problem is really the code density; there are like a half-dozen operations packed into one line. So while it’s solid code (ignoring the more fundamental problems with null-terminated strings), I give it 3/10 for maintainability. Then again, for a library function like that, who cares?
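For anyone who hasn’t seen it, the one-liner in question looks roughly like this (a from-memory sketch with the parameter names used above, not the actual K&R or library source; the real strcpy() also returns a char *):


/* K&R-style strcpy: the assignment *dst++ = *src++ copies one
   character and advances both pointers, and the copied character
   doubles as the loop condition, so the loop exits right after the
   terminating '\0' has been copied. */
void kr_strcpy(char *dst, char *src)
{
    while (*dst++ = *src++)
        ;   /* empty body: all the work happens in the condition */
}

And since dst and src are the function’s own pass-by-value copies, clobbering them as you go costs nothing, which is the point made above.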

If you *really* want an abomination of structured programming, try Duff’s Device:


   register n = (count + 7) / 8;      /* count > 0 assumed */

   switch (count % 8)
   {
   case 0:        do {  *to = *from++;
   case 7:              *to = *from++;
   case 6:              *to = *from++;
   case 5:              *to = *from++;
   case 4:              *to = *from++;
   case 3:              *to = *from++;
   case 2:              *to = *from++;
   case 1:              *to = *from++;
                      } while (--n > 0);
   }

Of course, everyone’s first reaction is “there’s no way that even compiles”.

Well, yeah. Fact is, C and C++ are basically glorified assembly code. This is a feature, not a bug, but it does mean that you lose some things that you get in “managed” languages.

Just *try* to do this in another language:


float magic( float number )
{
        long i;
        float x2, y;
        const float threehalfs = 1.5F;
 
        x2 = number * 0.5F;
        y  = number;
        i  = * ( long * ) &y;                       // evil floating point bit level hacking
        i  = 0x5f3759df - ( i >> 1 );               // what the f*ck?
        y  = * ( float * ) &i;
        y  = y * ( threehalfs - ( x2 * y * y ) );   // 1st iteration
//      y  = y * ( threehalfs - ( x2 * y * y ) );   // 2nd iteration, this can be removed
 
        return y;
}

(I did some slight spoiler-hiding for those that want to figure out what it does–the link has the details).

Of course, you could perform the same computation in many other languages–it’s just that it would take thousands (even millions) of clock cycles instead of tens. And despite what anyone tells you (and what is probably the greatest lie I was told in school), performance still matters.

C++ is just a bunch of syntactic sugar on top of C. That’s also a good thing.

Probably a matter of debate.

Yikes! That’s the kind of code that really should be written only in Assembly Language (because only thus would it actually look like assembly language, which is what it is)! But even then it would still be totally incomprehensible!

I kinda-sorta guessed that it might be something along the lines of a quick square root something-or-other, as I’ve heard discussions of problems like that before. One idea for a quick square root was to start with the original number, divide the characteristic (exponent) field by 2 while leaving the mantissa alone, which gives you a good first approximation; then use Newton’s method. (It doesn’t matter much what you do with the mantissa at that first stage.)
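That recipe really is only a few lines of C. Here’s a minimal sketch of the idea (my own illustration, assuming IEEE-754 floats; it shifts the whole word rather than isolating the exponent field, which also perturbs the mantissa a bit, but as noted that hardly matters for the first guess):


#include <math.h>
#include <stdio.h>
#include <string.h>

/* Crude sqrt: halve the (biased) exponent for a first approximation,
   then polish with a couple of Newton iterations. */
float quick_sqrt(float x)
{
    unsigned int i;
    float y;

    memcpy(&i, &x, sizeof i);       /* grab the bit pattern */
    i = (i >> 1) + 0x1fc00000u;     /* exponent/2, re-biased (0x1fc00000 = 63.5 * 2^23) */
    memcpy(&y, &i, sizeof y);       /* back to float: rough sqrt(x) */

    y = 0.5f * (y + x / y);         /* Newton's method for sqrt */
    y = 0.5f * (y + x / y);
    return y;
}

int main(void)
{
    printf("%f vs. %f\n", quick_sqrt(2.0f), sqrtf(2.0f));
    return 0;
}

Two Newton steps are typically enough to land within a few ULPs of the library result for ordinary inputs.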

ANYWAY, for Sicks Ate, I promised to post a link for an earlier discussion like this one.

Here it is: Date puzzle - Factual Questions - Straight Dope Message Board

Just a little more light evening reading for you, 68.

That is bad. Well, why DOES it compile? Why is that even valid C? Do all C compilers compile that? Has anybody tried running lint on that? (Remember lint?) What does lint say?

The reason that it compiles is that the spec doesn’t specifically disallow it, and because switch statements are a bit like glorified gotos (and so it doesn’t actually care if other control flow statements are interleaved). As far as I know, all C compilers will handle it, though I don’t know if it’s considered valid C++ (C++ is almost but not entirely backwards compatible with C). Not sure about lint, but the last time I tried it on MSVC, it didn’t give any warnings.
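A stripped-down toy version (my own, not from any real source) shows the principle: case labels are just labels on statements inside the switch body, so nothing stops them from sitting inside a loop, and the switch simply jumps to whichever one matches.


#include <stdio.h>

/* count_from(0) prints 0 1 2; count_from(2) jumps straight into the
   middle of the loop body and prints just 2.  Duff's Device is the
   same trick with the body unrolled eight ways. */
void count_from(int i)
{
    switch (i) {
    case 0:
        while (i < 3) {
    case 1:
    case 2:
            printf("%d\n", i);
            i++;
        }
    }
}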

Yeah, it’s a fun one. The 1st/2nd iteration lines are easy to figure out; they’re just a Newton’s Method refinement. But that constant 0x5f3759df is tricky. The key property is that shifting the whole word right by one roughly halves the stored exponent, and subtracting that from this constant gives you approximately the right exponent for x^-1/2 (once you account for the +127 bias). That gets you most of the way to the approximation, and the rest of the accuracy comes from the constant’s low-order (mantissa) bits.

I wouldn’t say it belongs as assembly. It’s completely correct, legit C that depends only on the machine having a floating-point representation that resembles IEEE-754. So it’s far more portable than ASM. The only real problem with that particular code is that when it was used, it had quite murky origins, and therefore wasn’t documented well.
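If you’re curious just how good the approximation is, a throwaway harness like this one (assuming the magic() listing above is pasted into the same file) will tell you; the error lands well under one percent even before the commented-out second iteration:


#include <math.h>
#include <stdio.h>

float magic(float number);   /* the spoiler-hidden listing above */

int main(void)
{
    const float tests[] = { 0.25f, 1.0f, 2.0f, 10.0f, 12345.0f };

    for (int k = 0; k < 5; k++) {
        float x      = tests[k];
        float approx = magic(x);
        float exact  = 1.0f / sqrtf(x);
        printf("x = %10.2f   approx = %.6f   exact = %.6f   rel err = %+.4f%%\n",
               x, approx, exact, 100.0f * (approx - exact) / exact);
    }
    return 0;
}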

(Bolding added.)

Maybe and maybe. Yes, a switch is nothing more than a glorified goto (in structured programming speak, it should actually have been a glorified multi-branch if…then…elseif…elseif…elseif…else…endif construct), but I only half buy that. Are you allowed to branch into a block from outside it? Any language that allows that has some SERIOUS compilation complexities to deal with. What if the block has some local variables of its own? They have to get stacked when you enter the block and unstacked when you leave. Allowing exits (break) from anywhere in such a block is messy enough. Allowing entries into any point in the block is REALLY messy. That’s what the compiler should reject, I think. (And if it’s valid C, it probably shouldn’t be.) We studied problems like that in my Compiler Construction class, way back in the day when it was de rigueur to write some portion of an Algol compiler, which allowed that and even worse horrors!

However your computer might represent floats, you also had to worry about how the hardware instructions might represent pointers! C scolds will always admonish you that “pointers are NOT integers,” and they might be right. I had a friend who programmed at Atari for most of his career, and he told me stories of machines whose pointers were nothing like plain integer addresses. So playing casting games with pointers and ints could fail very badly on machines like that.

The late Dennis MacAlistair Ritchie (dmr), Praise be unto him, stated that all compilers he believes in accept Duff’s Device:

[QUOTE=Tom Duff]
The device is legal dpANS C. I cannot quote chapter and verse, but Larry Rosler, who was chairman of the language subcommittee (I think), has assured me that X3J11 considered it carefully and decided that it was legal. Somewhere I have a note from dmr certifying that all the compilers that he believes in accept it. Of course, the device is also legal C++, since Bjarne uses it in his book.
[/QUOTE]

As to why it is legal C, recall that in C programs keywords like for, do, while can be replaced mechanically with constructions using only labels, if and goto to yield a program with identical effect.

This is technically known as translating the C program into FORTRAN. :smiley:
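For instance, the do/while at the heart of Duff’s Device can be spelled out by hand (a sketch; the compiler performs essentially the same rewrite internally):


/* The inner loop of Duff's Device, written two ways.
   ('to' is left un-incremented here, exactly as in the listing above.) */
void copy_structured(short *to, short *from, int n)
{
    do {
        *to = *from++;
    } while (--n > 0);
}

/* The same loop after mechanically replacing do/while with
   label + if + goto. */
void copy_goto(short *to, short *from, int n)
{
top:
    *to = *from++;
    if (--n > 0)
        goto top;
}

Once every loop is just labels and gotos, a case label landing in the middle of one is no stranger than any other goto target, which is why the device is legal.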

I can see how that would be done, if one also removes all blocks (that is, removes all { and } except the ones beginning and ending the whole function). But given a block of code that still exists in the source, surrounded by { and }, can you jump into that block from outside it?

Well, okay, I can believe that C allows this, it’s that loose of a language. I don’t know for sure because I sure as hell never wrote a C program that does that. As I asked above, how does it compile such a block if it has any local variables in it?

Now I’m going to have to try something like that with a few C compilers I’ve gathered over the years and see what it really does – and look at the assembler code it generated. And also, see what it compiles if a block has a local variable and you jump from inside the block to outside the block. Either way, that’s gotta have some kind of ugliness with how it handles the local variables! (Yes, I can see ways to do it, but it’s gotta be messy.)

I want to say that you’re not allowed to declare a local variable in a switch statement at all, ever, for precisely that reason. At least I remember getting some error recently in a graphics class whenever I declared a variable within a switch statement.

ETA: In Visual Studio 2010:


char x = 'a';
switch (x) {
case 'a':
    int x = 4;
default:
    x = 4;
    break;
}

Fails to compile, with an error complaining that the default label skips the initialization of x.

So it looks like it doesn’t let you skip an initialization with a label, though it may let you declare a variable within a switch statement if all the labels come before it.

Confirmed,


char x = 'a';
switch (x) {
case 'a':
default:
    int x = 4;
    x = 4;
    break;
}

Compiles successfully, and


char x = 'a';
switch (x) {
case 'a':
    int x = 4;
    break;
default:
    int x = 4;
    x = 4;
    break;
}

Doesn’t, even though the second declaration is unreachable from the first one (you get both a multiple initialization AND a “default label skips declaration” error).

ETA: D’oh, I forgot, this is Visual C++ though, it may be different in C.

Yep. Like septimus said, you can imagine all the flow control constructs as being translated to if/goto/label. The compiler will handle all the stack nastiness, but you can definitely get yourself in trouble by skipping variable initialization (among other things) if you jump into a block in the middle.
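Here’s the sort of thing that’s legal C but leaves you holding the pieces (my own example; C allows the jump as long as the skipped variable isn’t a variable-length array, while C++ rejects it outright):


#include <stdio.h>

/* Legal C, rejected by C++: the goto jumps into the middle of the
   block, skipping x's initialization.  x is in scope at the label,
   but its value is indeterminate, which is exactly the trouble
   described above. */
int main(void)
{
    goto inside;

    {
        int x = 42;                /* initializer skipped by the jump */
inside:
        printf("x = %d\n", x);     /* prints whatever junk is there */
    }
    return 0;
}

The C++ rule against bypassing an initialization is also what was behind the Visual C++ error in the switch examples above.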

As they say: C allows you to shoot yourself in the foot. In C++, you can blow your leg clean off.

The pointers are fine. Despite appearances, no pointer is ever converted to or from an integer in that code; the casts only change what a pointer is treated as pointing at. Specifically, this line:
i = * ( long * ) &y;

… is just a way of getting at the bits of the float y in int form. The take-the-address / cast / dereference dance is just a quick way of doing that, and amounts to the same thing as this:
memcpy(&i, &y, sizeof(i));

Of course, a memcpy call might not get optimized away properly, and the pointer trick better allows the compiler to rename the register. That said, I just tried it in MSVC and I get the same code either way, so if I were using it I’d go the route of clarity.
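For the record, the memcpy spelling fits nicely into a pair of tiny helpers (a sketch, nothing official), which keeps the bit-fiddling in one obvious place; it also sidesteps the original’s quiet assumption that long is 32 bits wide:


#include <stdint.h>
#include <string.h>

/* Reinterpret a float's bit pattern as a 32-bit integer and back,
   via memcpy rather than pointer casts.  Assumes float is 32 bits. */
static uint32_t float_bits(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return u;
}

static float bits_float(uint32_t u)
{
    float f;
    memcpy(&f, &u, sizeof f);
    return f;
}

/* The two punning lines in the listing above then read:
       i = float_bits(y);
       ...
       y = bits_float(i);                                   */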

Autoconversion doesn’t have anything to do with the type system: That’s part of the interface, not something you can only get through breaking the interface.

I have to disagree. Wikipedia has a list of properties that contribute towards strong typing:

  • Absence of unchecked run-time type errors. This definition comes from Luca Cardelli’s article Typeful Programming. [1] In other writing, the absence of unchecked run-time errors is referred to as safety or type safety; Tony Hoare’s early papers call this property security.
  • Strong guarantees about the run-time behavior of a program before program execution, whether provided by static analysis, the execution semantics of the language or another mechanism.
  • Type safety; that is, at compile or run time, the rejection of operations or function calls which attempt to disregard data types. In a more rigorous setting, type safety is proved about a formal language by proving progress and preservation.
  • The guarantee that a well-defined error or exceptional behavior (as opposed to an undefined behavior) occurs as soon as a type-matching failure happens at runtime, or, as a special case of that with even stronger constraints, the guarantee that type-matching failures would never happen at runtime (which would also satisfy the constraint of “no undefined behavior” after type-matching failures, since the latter would never happen anyway).
  • The mandatory requirement, by a language definition, of compile-time checks for type constraint violations. That is, the compiler ensures that operations only occur on operand types that are valid for the operation. However, that is also the definition of static typing, leading some experts to state: “Static typing is often confused with StrongTyping”. [2]
  • Fixed and invariable typing of data objects. The type of a given data object does not vary over that object’s lifetime. For example, class instances may not have their class altered.
  • The absence of ways to evade the type system. Such evasions are possible in languages that allow programmer access to the underlying representation of values, i.e., their bit-pattern.
  • Omission of implicit type conversion, that is, conversions that are inserted by the compiler on the programmer’s behalf. For these authors, a programming language is strongly typed if type conversions are allowed only when an explicit notation, often called a cast, is used to indicate the desire of converting one type to another.
  • Disallowing any kind of type conversion. Values of one type cannot be converted to another type, explicitly or implicitly.
  • A complex, fine-grained type system with compound types.
    Brian Kernighan: “[…] each object in a program has a well-defined type which implicitly defines the legal values of and operations on the object. The language guarantees that it will prohibit illegal values and operations, by some mixture of compile- and run-time checking.”[3]

I bolded the ones that are relevant here.

Also, I don’t understand what you mean by “interface”. Are you claiming that direct bit access in C is “breaking the interface”? What would ever make you think that?

No, the second definition is reachable from the first one; the code steps from char x = 'a' into the switch statement. Perhaps you meant this example, which highlights how scope is handled in the switch statement?



            char x = 'a';
            switch (x)
            {
                case 'a':
                    int y = 4;
                    break;
                default:
                    int y = 4;
                    y = 4;
                    break;
            }

which results in the error ‘A local variable named ‘y’ is already defined in this scope’, when one would think the second definition of y should be fine, as it is unreachable. However, the C# compiler treats everything in the switch statement as the same scope, hence the error. You can get round this as follows:



            char x = 'a';
            switch (x)
            {
                case 'a':
                    {
                        int y = 4;
                        break;
                    }
                default:
                    {
                        int y = 4;
                        y = 4;
                        break;
                    }
            }


but it is not good practice.

Actually, your example and my example are the same; I was just stupid and reused “x”. The second x is in a “more local” scope than the first “x”, and thus does not count as a redefinition error to the compiler (that is, the Visual C++ compiler treats char x and int x as distinct identifiers because they’re in different scopes), so my example with “x” is equivalent to your first example with “y”.

ETA: I wasn’t thinking, though; what I did was a terrible thing to do in any real situation, especially since I suspect that scoping behavior is compiler-dependent.