In fact, the issues you have with teaching C to beginners re: memory bugs are quite analogous to those one might have with teaching a dynamically typed language to beginners. Granted, the behavior of runtime type mismatch errors may be more deterministic than that of memory bugs, but in both cases it’s far easier to work with a language design that prevents such bugs from being writable in the first place (or flags them at compile time) than one which makes you hunt down avoidable bugs at run time.
With dynamically typed languages, it’s clearer when there is an error. If you pass in parameters as user,path instead of path,user, it is immediately clear what the bug is and where it happened.
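A hypothetical Python example of what I mean (read_log and the path are made up for illustration):
def read_log(path, user):
    # open the log at 'path' and keep only the lines mentioning 'user'
    with open(path) as f:
        return [line for line in f if user in line]
# Swapping the arguments blows up right away, with a traceback that
# names the bad call and the bogus filename:
read_log("alice", "/var/log/app.log")
# FileNotFoundError: [Errno 2] No such file or directory: 'alice'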
With memory problems, the cause and the effect may be in completely different places, and the effect may only show up intermittently. If you write an extra byte past the end of a char array, it may or may not have any effect on the program. If it does affect the program, it may be much later and in a completely different part of the code. Debugging memory problems is an essential skill to have, but I feel it’s more appropriate once the student has more skill at programming.
I think a good progression would be Python, C, and then Java. Python is a fancy scripting language: the more motivated students can create innovative solutions, while the casual people will still be able to get things done. Then move to C to introduce stricter type constraints, a compiler, and memory bugs. Then move to Java so they learn objects and can be thankful they don’t have to worry about tracking down memory problems.
It’s unclear to me what you are talking about. Show me an example of code in a statically typed language and corresponding code in a dynamically typed language, illustrating this distinction. Because as far as I can tell, statically typed languages make it clearer when there is a type mismatch error, pointing out the error at compile-time rather than in some random fashion when something happens to go wrong at run-time. And if you are not talking about some kind of type mismatch error, then I don’t understand why the distinction between statically and dynamically typed languages is relevant.
Your point about “it may or may not have any effect on the program. If it does affect the program, it may be much later and in a completely different part of the code” is precisely the way in which type mismatch errors progress in dynamically typed languages. There’s no static checking that the type of the argument passed to a function matches the type it expects; things will only blow up later when you try to operate on that data in some way it cannot handle, which may be at some point in the code far removed from the point at which it was originally mispassed.
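A contrived Python sketch of that deferred failure (set_timeout, next_retry, and the config dict are invented for illustration):
config = {}
def set_timeout(config, seconds):
    config["timeout"] = seconds    # caller mistakenly passes the string "30"
def next_retry(config, now):
    return now + config["timeout"]
set_timeout(config, "30")          # nothing blows up here...
print(next_retry(config, 100))     # ...TypeError: unsupported operand type(s)
                                   # for +: 'int' and 'str', far from the real mistake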
Indistinguishable: thanks for that explanation! I didn’t even know the “nonlocal” keyword existed. That is indeed pretty horrible, though I don’t think I’ve had those difficulties myself in many years of Python. I usually try to keep my scopes as straightforward as possible, especially when closures are involved.
What do you mean by loosely typed? I would tend to think of “loose” as a synonym for “weak” in typing disciplines, but Python is (more or less) strongly typed.
Which can complicate and blur debugging to the point where it ceases to be much of a learning experience, because you don’t know what you’re trying to fix: is it some unexpected implicit conversion causing the problem, or is it something screwy with your code?
A buddy described learning to program on Python as being similar to handing a toddler a bag of broken glass: it’ll be educational, but it’ll also be painful and bloody.
You could argue the same with C, though. Is my algorithm screwy, or did I just have a dangling pointer / forget to null-terminate that string / accidentally use “=” instead of “==” / whatever? I frequently compare using C to fighting a bear while juggling flaming chainsaws. I mean, you have the perfect tools to kill the bear, but one misstep…
In a language like Python, you might write a function like:
def get_size(file):
    return len(file.read())   # assuming 'file' is an open file object
There is no meta information about what the function returns or what the type of the parameter is. This means the programmer has to know what the input and output will be. They can’t depend on the compiler to get it right. If they get it wrong, they can easily see the problem in the debugger or with some print statements. If the ‘file’ parameter is supposed to be a file object and they pass in a string, they’ll quickly know and it will be relatively easy to spot the error. Having to keep this type of meta information in their head is a good programming skill to develop.
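To make that concrete, here’s the kind of session I mean (building on the sketch above; the strings are arbitrary):
import io
f = io.StringIO("some text")     # stands in for an open file object
print(get_size(f))               # 9, as intended
print(get_size("some text"))     # AttributeError: 'str' object has no
                                 # attribute 'read' -- the traceback points
                                 # straight at get_size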
In C, memory problems like this:
char *buffer = malloc(5);
buffer[5] = '5'; /* oops: this writes the 6th byte of a 5-byte buffer */
Can be very difficult to debug. The program may or may not crash, and it usually won’t crash at the point of the bad write. Instead, the crash may come at some point in the future, when that byte is read by some unrelated part of the program. Beginners may spend hours spinning their wheels trying to debug memory problems like this and never find the bug. Programmers still need this skill, but it’s something better left until they have more experience.
According to the TIOBE ratings, C and Java have a yearly battle for the top programming language. C++ comes in fourth, which is more than it deserves. It’s an ugly hybrid of a language and should have died out long ago.
I learned programming with Pascal, because I learned it back when that still made sense. Today I would start with Java, or maybe even Python or Ruby. C# is a reasonably good facsimile of Java, so that would be fine if you don’t mind being limited to Windows.
Now, if you’re going to be a systems programmer, i.e., you want to create embedded software for circuit boards and other devices, then C is the only way to go.
I’ll probably hear a lot of arguments about this, but here goes:
Typing shouldn’t exist in high level languages. The different internal representations of data are a machine level function, and programmers shouldn’t be concerned with them unless they are working at that level. It’s an unfortunate artifact of many high level languages that were originally designed in the age of horse-drawn computers, when processor speed was severely limited.
C is not really a high level language, it’s a fancy assembler, so it’s reasonable for typing to matter there. And every programmer should get some exposure to that, but the vast majority of programmers shouldn’t need to be concerned about data types.
I’d argue that typing is superfluous and gets in the way in languages like Haskell or Prolog; with Python I can see an argument either way, since it’s supposed to be a scripting language. But if you’re doing scientific computing (or even things like geometry) in any language, having types exposed is, while not strictly necessary, very good. You at least need to be able to make an informed decision about whether you want to deal with the bullshit the IEEE floating point representation makes you deal with (and how to deal with it if you must). Sometimes it truly doesn’t matter: if your A* is taking a slightly non-optimal path because the path cost error accumulated, then either the difference is so minute that it doesn’t really matter, or you should just multiply all your costs by a constant amount. But sometimes it really does matter.
At least, unless the language designers do black magic so you never have to deal with bad float behavior (symbolic computation or something), I’d prefer to have tight control over when and how I use floating point, without worrying about my ints suddenly becoming floats and ruining everything forever.
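The float complaint is easy to demonstrate in plain Python (nothing here is specific to A*; the names are just illustrative):
cost = 0.0
for _ in range(10):
    cost += 0.1                # ten hops, each costing 0.1
print(cost)                    # 0.9999999999999999, not 1.0
print(cost == 1.0)             # False
steps = 3                      # starts life as an int
steps += 0.5                   # silently becomes a float; exact comparisons
                               # are now off the table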
Of course there are cases where typing matters, but those represent a small percentage of all programs written. Something like an A* function isn’t likely to be coded too many times. It’s a low level function, just like the intrinsic functions available in languages, and the OS functions. IEEE floating point is an affront to humanity, but any numeric processing done at a high level should only need the bounds and precision defined.
Maybe I’m confused about the point you’re trying to make, but I feel like you have this exactly backwards. Typing is a high-level concept, an abstraction that places constraints on unorganized low-level representations. Look at assembly, which generally has no typing at all except what is implemented in the hardware; then at C, which has a thin and flimsy weakly-typed system; then at C++, which has a much stronger type system capable of greater abstraction. Or look at Haskell, which is extremely high level, and whose type system is its defining abstraction.
In fact, in most cases if you take away the sophisticated type system from a high-level language, what you get is a low-level language.
You are talking about compiler type detection and enforcement. I am saying the languages should be typeless. A programmer should not have any concern about how data is represented. The number 123 should not be something different as an integer or a float, there should be no such distinction, nor should it be distinguishable from the string “123”. The purpose of a high level language is to remove those machine level considerations from code.
These aren’t machine-level considerations, they’re the exact opposite. Only at the lowest machine level is there no meaningful distinction between the bytes that form a number or a string or a database record or an image. Building those distinctions is one of the main purposes of high level languages.
What you’re describing is a language that permits implicit lexical conversions in a few special cases, specifically between strings and numbers in certain contexts. I think this idea breaks down when you consider the other kinds of data that programs must be able to manipulate, specifically aggregates like lists, maps, tuples, or objects.
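A quick Python illustration of where the lexical-conversion idea runs out of road (the particular values are arbitrary):
print(int("123") + 1)            # 124: the scalar case has an obvious rule
print(("123", 4) == ["123", 4])  # False: same contents, different aggregate shapes
print([1, 2, 3] + {"a": 1})      # TypeError: can only concatenate list
                                 # (not "dict") to list -- what should a
                                 # typeless language do here instead?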
They are machine level considerations. Nobody spent much time considering them before the advent of computing machinery. High level languages shouldn’t be concerned with machine level representation like bits and bytes. Aggregates are just that, aggregates. They are aggregates of other aggregates and scalar values, and neither should have type distinctions.
:dubious: That last sentence is only true if you count Babbage’s difference engine as computing machinery. Type theory dates back to the early 20th century. The simply typed lambda calculus, which has influenced the type systems of many modern programming languages, was formulated in 1940 by Alonzo Church (the other half of Church-Turing!). Types are definitely a high-level concept in computer science.
And they usually aren’t. There are vast gulfs of abstraction between a raw vector of bytes you might manipulate in assembly or C, and, for example, a high-level type like “Ord a => [a] -> [a]” in Haskell, which is still critically different from a type like “Ord a => [a] -> a”. The two are neither reconcilable nor convertible, and the type system must distinguish between them either by requiring such things to resolve statically at compile time, or by detecting errors at runtime. There are no other sensible solutions.
How do you propose to obliterate the distinction between an aggregate and a scalar, or two aggregates with different structures? Remove aggregates from your language altogether? That’s how you create a low-level language, by forcing users to build abstraction themselves from a set of molecular primitives.
QBASIC is sweet, though it runs in an MS-DOS window. Liberty BASIC is much the same and Windows-based, but it is heavily extended and has a user community that keeps adding libraries. Stay away from line-numbered BASICs, no matter how retro-cool they are, because even those of us who like them hate them. I’m no great programmer, but I took a class in C and we were supposed to write this program for the final project. QBASIC and QuickBASIC are very similar to pseudocode (I mean, what good is pseudocode if you can’t see if it works?), so I first knocked it out in QB. With some bells and whistles it took me 35 minutes (yes, I timed it). A week later I still had no idea why it would not compile in any of the three C compilers I tried, failing at three different places. I hate C.
How long would that be? I’m guessing 200 lines of code. I was going to suggest COBOL because I’m a sadist. Car discussion with Wife the other day:
Me: We shoulda gotten into Data Processing when we were young.
Her: Why?
Me: Because…um…because we would’ve learned COBOL and be as under-employed now as we are. Never mind.
I’ve seen core, gramps, but since I refused to touch one of those infernal machines until they got rid of the cards, I’ve never used it. Mentioned to a guy who used to sell software back in the day that my high school had an IBM 360 back in 1970. He said it was massive overkill. I said the high school was in a rich town and we sorta expected it.
Logo is awesome! It’s infinitely extensible and with a good version you can do a lot with it. DL-Logo for OS-9 on the Tandy Color Computer was the best I tried and was not a toy; I think Dale Lear was trying to turn Logo back into LISP.
I’ll stick to the original goalposts. In computer programming languages typing was introduced as a run-time efficiency technique, which is not a high level consideration of the language interface.
That is not a sensible solution; it is a complex construct which is unnecessary for the vast majority of all programming requirements. I have no objection to the creation of high level typing; I am talking about types as data limitations imposed by machine requirements.
I don’t. I propose that scalars and collections are different. However, the method of referencing elements of a collection is not inherent except at a low level.
If it’s not simple enough for you yet, then consider this: why should a high level programmer have any concern about how data is represented in memory?
Not even a little bit true. Can you provide a cite?
The basic requirement holds. You must either resolve types at compile time, or check for type errors at runtime, or discard structured data altogether. Otherwise what do you do when your programmer tries to, say, add a list and a function together?
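Concretely, in Python (which picks the runtime-check option), the answer is an error at the moment it happens:
print([1, 2, 3] + len)
# TypeError: can only concatenate list (not "builtin_function_or_method") to list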
Types have nothing to do with how data is represented in memory. They were introduced in programming languages (among other reasons) specifically so that programmers would not have to worry about exactly how data is represented in memory. Consider my original example of Ord a => [a] -> [a] – a function mapping a list of an orderable type to a list of the same type. Nowhere is there any hint of what the orderable type is, or how it’s represented, or even how the basic structure of a list is represented in memory. In fact, the only information present at all is high level, the information that the programmer cares about and needs to know in order to do his or her job: namely, that this type maps lists to lists, and the things in the lists must be orderable.
That’s what types fundamentally are in programming languages – classifications that hide low-lying bits and bytes by imposing a set of rules about what certain things mean and which operations make sense on which pieces of data. All very high-level stuff.
I don’t even know how to answer something like that. You don’t even seem to understand what a data type is. Data types were created as a way to allow the differing representations of numeric values in memory. Declaration was needed so you didn’t end up with nonsense results from applying operations to different data representations.
A function is not a data type, nor is a list. Again, your understanding of how computers work is deficient. And I am proposing that there should be no differing data types that could produce an error by their usage.
Again, you don’t understand that you are talking about high level structures created as an artificial mechanism, unnecessary for the vast majority of all code written. The very fact that you use Haskell as an example shows how far your concepts are removed from reality. Once again, aggregates are not data types; they represent the structure of a set of data types. Further, concepts like orderability are a characteristic of a particular aggregate definition. I have no problem with the creation of high level structures or objects, but they should not be a requirement of the language, because they are unnecessary limitations; they are instead application issues.
As long as data types are needed to distinguish between differing physical representations, you are placing those requirements on the programmer instead of the language. If you want to create a high level paradigm in a language, there’s nothing wrong with that. But if the internal representations of the data make a difference, then you’ve left the problem of data typing intact. The problem of mismatched types should not exist and can be eliminated from any language that does not have to operate at the machine level. You don’t realize it, but you are agreeing with me in principle. Like so many others, you have become so accustomed to the idea of data typing that you think it is a necessary component of a language operating at its simplest level.