Extremely basic programming question.

Dr. Strangelove: The interface is the list of things you can do with the object without exposing its internal representation. For integers, it includes addition and incrementing up to the point you attempt to reach beyond a maximum value, which invokes overflow semantics. For floating-point numbers, it’s everything you can do without a first course in numerical analysis. It’s comprehensively broken by, for example, taking the address of the double-precision float, casting it to a pointer to unsigned char, and directly accessing the bytes which compose the numeric value.
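A minimal C sketch of that byte-level poke (everything here is illustrative; the output comment assumes a little-endian IEEE 754 machine):

#include <stdio.h>

int main(void)
{
    double d = 1.0;

    /* Accessing any object's bytes through an unsigned char pointer
       is permitted by C -- internal representation and all. */
    unsigned char *p = (unsigned char *)&d;

    for (size_t i = 0; i < sizeof d; i++)
        printf("%02x ", p[i]);   /* e.g. 00 00 00 00 00 00 f0 3f */
    putchar('\n');

    return 0;
}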

Your list is, to some extent, a list of alternatives. For example, Haskell is strongly-typed but it allows what is essentially automatic conversion among numeric types, by allowing you to, for example, add all numeric values to each other. This gets even more complicated when you consider type signatures such as ‘Eq a => a -> a -> Bool’, which is the type signature of the == (equality test) function.

Thanks! The more I read, the more it makes sense…which isn’t to say that I’m not completely in over my head, 'cuz I am.

On the bright side, I was happy to discover that I understood what was going on in the program in post #4 of your linked thread, which certainly wouldn’t have been the case yesterday morning. Though not surprisingly, I have no idea why different languages would produce different results.

Is this really the same phenomenon, though? Scheme, for instance, has a “numeric tower”, because all integers are rationals are reals are complex. But there’s just no way to add a string to an integer without a function that does this explicitly.

Excellent example of this for me today as well…this is what I received from a team member in response to a minor rant I e-mailed this morning:

Problem number 3: Although [project manager (whole 'nother rant)] does have some experience in [field of discipline] most of the programmers and engineers have very little knowledge as to how [my line of work] works which makes reporting problems much more difficult because they have a hard time understanding the desired result unless it is provided in a very detailed technical document. I have spent a great deal of time over the last year + just acting as a translator between practitioner and IT engineer. *Which is the role I am playing today.*

emphasis mine

nitpick: surely that should be “… the set of things you can do with the object…”

Huh, from my knowledge of C, I thought that the only “locality” that counted for variables was what function they were in. This is the first I’ve ever encountered the concept that a variable could be local to a single (compound) statement. I’ve always been taught that all type declarations should precede any executable code in a function: I guess this is just a structured programming practice, and not an actual part of the language standard?

The block-scope variables are a new-fangled feature (though 2+ decades old by now :smack: ). I’ve never used this feature myself, preferring to write my C code the way [DEL]God[/DEL] Dennis Ritchie intended.

It does? That’s not my experience… I constantly have to manually fiddle with this (hence, fromIntegral, toRational, float2Double, etc.).

The only thing automatic I’m aware of is that many operations are polymorphic to the extent that they allow their operands to be any numeric type, so long as they’re all the same type. Mixing variables of different types in Haskell arithmetic has always been a pain, but perhaps I’m behind the times.

(cite)

Probably there have been one or two over the decades.

Then there’s IDL, from Exelis, a data plotting language I’ve used before. In IDL, the standard rule is that odd integers count as True, and even integers count as False.

C’s ‘strcmp’ isn’t really returning a boolean value or an error status. It returns an integer that indicates the relative order of the two strings: negative if the first string precedes the second, positive if it’s the other way around, and 0 if the strings are equal.

Still, it’s common to see real-world C code looking like this:

if (!strcmp(str1, str2)) { ...

… which suggests the programmer either believed ‘strcmp’ returns a boolean, or was lazy or unconcerned with clarity and didn’t want to write out the explicit comparison with 0.
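A sketch of the explicit spelling (the strings here are placeholders, not from any post above):

#include <stdio.h>
#include <string.h>

int main(void)
{
    const char *str1 = "abc";
    const char *str2 = "abc";

    /* Same test as !strcmp(str1, str2), but the comparison with 0
       keeps strcmp's three-way "relative order" semantics visible. */
    if (strcmp(str1, str2) == 0)
        printf("equal\n");

    return 0;
}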

Yes, every brace-delimited block of statements in C introduces a new nested scope for variables. Also for type definitions, if you want to go really nuts.

You’re also allowed to introduce a brace-delimited compound statement (and new scope) wherever any statement is allowed — not just in conjunction with an if, for, while, or switch.
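A minimal sketch of both points (the freestanding braces and the shadowing are deliberate):

#include <stdio.h>

int main(void)
{
    int x = 1;

    {                           /* a bare compound statement: new scope */
        int x = 2;              /* block-local; shadows the outer x */
        printf("%d\n", x);      /* prints 2 */
    }                           /* the inner x ends here */

    printf("%d\n", x);          /* prints 1 */
    return 0;
}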

That, I did know, though it’s not something you typically have use for unattached to a control keyword.

I hate to be the spoilsport here, but why are we talking C and C++ in these threads? It’s 2012, right?

C++ is (barely) acceptable as long as you religiously follow RAII methods. Or if you’re working on a base of code that’s already in C++. Or if you’re ok with spending three times more effort tracking down bugs than you would in a more modern language.

But if you’re starting a new project today, you need to be using a language with garbage collection. As a minimum. (And a reasonable syntax-- sorry Perl fans!) (And a decent library-- sorry PHP fans!)

I’m also disappointed at the “trick” code here, like Carmack’s square-root approximator and Duff’s Device. Sure, these gimmicks are “fun”, but the challenge of programming is writing *clear* code. Focusing on examples like this is counter-productive for someone learning how to write software, since they should never show up in any actual software.
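(For readers who haven’t met it: Duff’s Device interleaves a switch with a do-while to unroll a copy loop. Roughly, after Tom Duff’s 1983 original, where ‘to’ is a memory-mapped output register and so deliberately never advances:)

/* count must be positive; 'to' is a memory-mapped register. */
void send(volatile short *to, short *from, int count)
{
    int n = (count + 7) / 8;

    switch (count % 8) {
    case 0: do { *to = *from++;   /* the switch jumps INTO the loop */
    case 7:      *to = *from++;
    case 6:      *to = *from++;
    case 5:      *to = *from++;
    case 4:      *to = *from++;
    case 3:      *to = *from++;
    case 2:      *to = *from++;
    case 1:      *to = *from++;
            } while (--n > 0);
    }
}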

What we should be talking about are the fundamentals of good software development:

  • Source control
  • Automated/unit testing
  • Use of modern, well-organized, development environments
  • Debuggers
  • Development methodologies

Believe me, you’ll never come across Duff’s Device in any non-WTF code. You will come across those concepts every hour of every day.

As for recommendations:

  • The Mythical Man-Month is the Bible of software development
  • I find Raymond Chen’s blog interesting, as he discusses Windows development-- with emphasis on the bad decisions the Windows team made in the past and how those decisions came back to haunt them later.
  • The DailyWTF is also a good (and amusing, once you’re more familiar with the profession) website that teaches via providing shockingly bad software practices.

I am aware of the definition. I just wanted to know what you thought belongs as a part of the interface.

Says who? You? Yes, we can define the numbers this way, but there is no inherent reason why we must limit ourselves this way.

In C, part of the interface to floating point numbers is the ability to access the underlying representation. The fast reciprocal-square-root routine above doesn’t break the interface; it utilizes it.
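(For reference, the routine under discussion, roughly as it appears in the released Quake III source; it assumes a 32-bit ‘long’, as on its original targets:)

float Q_rsqrt(float number)
{
    long i;
    float x2, y;
    const float threehalfs = 1.5F;

    x2 = number * 0.5F;
    y  = number;
    i  = *(long *)&y;                       /* read the float's bits as an integer */
    i  = 0x5f3759df - (i >> 1);             /* magic initial estimate */
    y  = *(float *)&i;                      /* reinterpret the bits as a float */
    y  = y * (threehalfs - (x2 * y * y));   /* one Newton-Raphson iteration */

    return y;
}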

You are also simply wrong about “For floating-point numbers, it’s everything you can do without a first course in numerical analysis.” A basic property of all rational and real numbers is that a / b * b == a for b != 0. And yet, one of the very first bugs that beginning programmers encounter goes something like this:
float x = 1.0 / 10.0;
if ((x * 10.0) != 1.0)
    printf("what happened?");

So if your “interface” consists simply of grade-school arithmetic operations and their properties, then the interface is already broken for the vast majority of computer languages.

It gets worse, though. Any language with sufficiently powerful file I/O can again get at the internal representation. Perl and Python have an API for it, but any language that allows reading/writing a binary file will enable peeking at an internal representation, since files are untyped.
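A sketch of the file-based version (filename is arbitrary and error handling is elided for brevity):

#include <stdio.h>

int main(void)
{
    double d = 1.0;
    unsigned char bytes[sizeof d];

    /* Write the double out, then read the very same bytes back as
       unsigned char. The file neither knows nor cares; it's untyped. */
    FILE *f = fopen("peek.bin", "wb");
    fwrite(&d, sizeof d, 1, f);
    fclose(f);

    f = fopen("peek.bin", "rb");
    fread(bytes, sizeof bytes, 1, f);
    fclose(f);

    for (size_t i = 0; i < sizeof bytes; i++)
        printf("%02x ", bytes[i]);
    putchar('\n');

    return 0;
}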

In my industry, performance matters (if we’re 2% slower than the competition, then we lose). Furthermore, predictability matters (garbage collection without very fine control is out). And last, it’s important to have as much control over bugs as possible (If you find a compiler bug in C#, then maybe you dick around with the code until it seems to go away. In C++, you inspect the assembly to see exactly what went wrong, and exactly what needs to be done to fix it).

I’ll admit that my industry isn’t typical.

PS: Duff’s Device remains stupid because it doesn’t actually gain you anything on modern compilers (besides, you should be using SSE or something for memcpy). The fast reciprocal-square-root is in fact useful in some situations.

I read the DailyWTF for a while, and although it was often amusing, far too often they cited completely legitimate things as WTFs. It was obvious that their editors and commenters had an incredibly limited worldview when it came to coding, and that most of them had never worked in an environment where performance mattered, or there were limited resources (like on a microcontroller), or similar.

It’s a bit of a shame that so few college grads have any insight into computer internals (though it does make for good job security).

Sorry for breaking this into several posts…

I cited Duff’s Device only as a joke and an example of an abomination. But the fast RSQ was found when the Quake III code was released. It’s ridiculous to say that it should never show up in any actual software, since it did. Games are one of the few remaining areas where every last trick is used to squeeze out more perf.

Luckily, C and C++ do well in all those areas. Very good IDEs and standalone debuggers are available. And they have been around long enough to build up a huge base of best practice knowledge.

Popular current languages like JavaScript don’t do nearly as well in these areas.

(we can agree that PHP needs to die horribly)

(Referring to that one-line C string copy code given and much-discussed above):

No dispute that it is correct C. And no dispute that it has a problem with “code density” as you call it, or “code obscureness density” as I would prefer. My complaint is exactly with its obscureness, but also with the surreptitious betrayal of “structured programming” in an environment that otherwise glorifies structured programming. I’m not a structured programming fanatic myself (I see problems with it), but this code, which violates the catechism while using constructs designed to support it, is really bad. To be sure, I wouldn’t hesitate to write a bottom-test loop or even a middle-test loop, but I’d insist that the test be up front and visible in the place where it happens. This is just as much a complaint that K&R actually deigned to present this code in their book as an alleged good example, as it is a complaint about actually writing code like that.
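(Presumably the idiom in question, essentially as K&R present it; the copy itself doubles as the loop test, which is exactly the hidden-test complaint:)

/* Copy t into s, terminating '\0' included. */
void kr_strcpy(char *s, const char *t)
{
    while ((*s++ = *t++))   /* the assignment's value is the loop test */
        ;
}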

Earlier in this thread, I listed several steps that a program goes through (compile, link, load, begin run, continue run). My complaint about declaring and initializing variables in the same statement is that the initialization could happen in any of several of these phases, and unless you really know the internal guts of the implementation, it might happen in ways you didn’t expect. Jumping into a block, skipping its declare/init statements, might get the variable initialized or not, depending on how it works. I object to the very idea that the programmer should need to know that kind of detailed insider information.
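A minimal sketch of the worry (legal C; the label and names are mine): jumping into the block bypasses the initializer, so the variable exists but holds garbage until assigned.

#include <stdio.h>

int main(void)
{
    goto inside;            /* jumps over the declaration below */
    {
        int x = 42;         /* the declaration is in scope here, but
                               this initialization never runs */
inside:
        x = 0;              /* must assign before reading: x was
                               never initialized */
        printf("%d\n", x);
    }
    return 0;
}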

I don’t always know how this would work from one language to another:

int i = 5;

But I have a much better idea of how this would work from one language to another:

int i;
i = 5;

And, Chronos: You were better off NOT knowing that you could declare block-local variables. You’re right, there’s no really good reason you should ever have to do this. (It may have been more useful in the old days when memory was expensive and computers had small memories, as when C was invented. This could enable you to conserve precious memory space in your program.)

Language designers just gotta have some sense about what to allow and what not to allow. FORTRAN and C, for example, have always allowed “mixed-mode arithmetic”, that is, expressions with a mixture of INT and FLOAT (and even COMPLEX, in FORTRAN) values. There was and is a well-defined type hierarchy. One could argue that this is perfectly legitimate because all combinations of mixed-mode operands are DEFINED operations with DEFINED results! Contrast Pascal, which took “strong typing” to such an absurd extreme that you could not write: x := 1 + 2.0;
because Wirth was such an anal control freak about forbidding loose typing! That does nothing to aid Good Programming; it only makes programming a pain in the arse. As Indistinguishable suggests, expressions like 1 + 2 and 1.0 + 2.0 work simply because + is overloaded. Exactly! And 1 + 2.0 or 1.0 + 2 can be defined to work just as well. There is no violation of strong typing here!
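A minimal sketch of C’s mixed-mode arithmetic (the conversion happens under C’s “usual arithmetic conversions”):

#include <stdio.h>

int main(void)
{
    int    i = 1;
    double d = 2.0;

    /* i is implicitly converted to double before the addition:
       a DEFINED operation with a DEFINED result. */
    double x = i + d;
    printf("%f\n", x);      /* prints 3.000000 */

    return 0;
}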

But I definitely object to things like 1.0 + “2.0” because it’s so ambiguous as to what it could do. Never mind memorizing the rules about what it will do! There are too many ways that would be reasonable to expect it to work. I could just as well have two variables containing strings “1.0” and “2.0” and add them (not necessarily even knowing or caring if they are strings or floats, but just knowing that they are numbers that, say, I got from some HTML form), expecting to get “3.0” and end up with “1.02.0” instead. But if the “string concatenation” operator is a different symbol from the “numeric addition” operator (as is the case in PHP, for example), then I’m okay with that. (ETA: Well, sort of okay with that. It’s still best to use explicit conversions to ensure you are dealing with the types you think you are, and when you start writing longer expressions, concatenating lots of variables into one lengthy formatted string, it’s too easy to get it all bolluxed up.)

Bit-fiddling in C is perfectly “legal”, if only because the language is defined to permit it. That doesn’t necessarily make it a good idea to do (it’s legal in C because of C’s stated intention to be a “high-level assembly language”) – and it certainly is still obscure. That square-root code also fails badly in the “commenting” department, which, of course, is equally “legal” C.

Not new-fangled at all. Every block-structured language I have ever seen, all the way back to early Algol versions, has allowed block-scoped variables like this. I’m pretty sure that pre-dates C by several years. Didn’t Algol come out in the early 60s?

All pontificating aside about obscure details of C and C++, Blakeyrat is right of course. C and especially C++ have come under much criticism for being too ramshackle for modern serious coding. (Array bounds checking? We don’t need no steenkin array bounds checking!) C++ in particular fails to live up to modern needs by inheriting too much shabby baggage from C.

See the link I gave in post #108 above. Here it is again:

See, in particular, post #4 there.

As Dr. Strangelove says in post #135 above: “It’s a bit of a shame that so few college grads have any insight into computer internals (though it does make for good job security).”

Granted, performance, down to shaving off every last machine cycle, matters in some contexts. Another case is this: You are doing extremely heavy-duty mathematical number-crunching, like simulating major events down to the atomic level, and you have programs that run literally days on end.

I worked at Lawrence Livermore Labs once upon a time. They did work like that. They always had the newest, greatest, fastest super-computers there were. And they ran simulations that really did run for multiple days at a time. And the programmers really were obsessed with shaving off every machine cycle they could. Their mathematics library was largely hand-coded in assembly language (and re-written every time they got a newer, faster, different super-computer), to take every advantage they could of coding density, known machine quirks (like the hardware instruction stack), and whatever else they could.

Well, were the basic tenets of structured programming strictly adhered to, you would never jump into the middle of a block…

Ok, let me put it this way. It’s a mess. C, that is. Not the language, but the expectations of it. It is not a high-level language. It is a very low-level language, much like a portable assembler. However, its use as a high-level language, and its variant optimized versions, have created a stew of higher-level functionality within the very low-level framework. So trying to characterize it one way or the other doesn’t work very well.

On the other hand, the way you use it can be adapted to the application. For systems programming and other high-efficiency uses, C can be used as an efficient low-level language to create portable code. Bit fiddling and non-structured coding can be used, and it’s common to see optimization turned off to control memory precisely. This is very useful for operating-system functions and physical interfaces, as well as for managing shared memory with independent applications.

C can also be used in a purely structured manner acting like a much higher level language, with a high degree of operating efficiency and source level portability. Nothing stops programmers from employing strict typing rules.

Now, Senegoid, I must strongly disagree with your caution against using localization in code blocks. It is imperative in well-structured code to localize all variables at the lowest level. They are intended to exist only within the scope of the structure, and using a more globally defined variable for that purpose is an invitation to bugs. The compiler will notice any use of a variable out of scope and prevent that from happening.

As far as strong typing goes, its absence is only a problem where two conditions exist: first, that the inefficiency associated with weak typing matters, and second, that the language does not allow the use of strong typing.

I believe Blakeyrat has pointed out some of the key factors in software development that extend beyond the language itself, but all of them could be applied to most languages, including C. Any language could include a pre-processing step in the development environment that enforces a subset of the language’s features so that code conforms to a programming standard. This can apply to the syntax as well, so Perl code and even ‘write-only’ languages like APL could retain usefulness through modification of the syntax (assuming there was justification for maintaining the code in the first place).

All languages have strengths and weaknesses, but as long as they have a set of features which conform to the needs of the application they will do the job if used properly. Languages don’t create bugs. Poor programming practices and bad* programmers create bugs.

  • The board rules, state and federal laws, the Geneva Conventions, and the Constitution of the United Federation of Planets prevent me from using the literal description of this type of programmer. But we all know them when we see their code.

On a slightly lighter note, I highly recommend that everyone read In the Beginning Was the Command Line if they haven’t already. Specifically, the section labeled “THE HOLE HAWG OF OPERATING SYSTEMS” (although the whole essay is entertaining, and only slightly dated).

C is the Hole Hawg of programming languages. This should come as no surprise since it came from the same development team as Unix.

Stephenson gets to the root of the problem here:
*But I never blamed the Hole Hawg; I blamed myself. The Hole Hawg is dangerous because **it does exactly what you tell it to**. It is not bound by the physical limitations that are inherent in a cheap drill, and neither is it limited by safety interlocks that might be built into a homeowner’s product by a liability-conscious manufacturer. The danger lies not in the machine itself but in the user’s failure to envision the full consequences of the instructions he gives to it.

A smaller tool is dangerous too, but for a completely different reason: it tries to do what you tell it to, and fails in some way that is unpredictable and almost always undesirable. But the Hole Hawg is like the genie of the ancient fairy tales, who carries out his master’s instructions literally and precisely and with unlimited power, often with disastrous, unforeseen consequences.*

Bolding mine. I see this as the primary defect of some languages (even if they do have their uses).