What's the best language to learn programming with?

We should probably be clear when we’re talking about static vs. dynamic typing as opposed to strong vs. weak typing. Implicit conversions like PHP’s death swamp of an == operator are an example of weak typing; Python’s insistence that “2” cannot be added to 2 to get 4 (or “4”) is an example of strong typing. Static vs. dynamic typing simply refers to whether an expression’s type must resolve at compile time. There are strongly typed dynamic languages and weakly typed static languages (C being perhaps a good example of the latter).
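
To make the distinction concrete, here’s a quick sketch in Python (which is both strongly and dynamically typed); the snippet is just illustrative, not taken from any post above:

```python
# Dynamic typing: no declared types; a name's type is whatever its current value's type is.
x = 2
print(x + 2)        # 4

# Strong typing: Python refuses to silently coerce between str and int.
try:
    print("2" + 2)
except TypeError as e:
    print("TypeError:", e)   # can only concatenate str (not "int") to str

# You have to convert explicitly, one way or the other:
print("2" + str(2))   # 22 (string concatenation)
print(int("2") + 2)   # 4  (integer addition)
```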

Personally, I would consider “== is not transitive” to be a fatal issue just by itself, but the “equality” operator is made thoroughly worthless by insane coercion rules. For example, your claim that strings are interpreted as numeric in numeric contexts is true, but excessively narrow. Strings are interpreted as numeric whenever possible, leading to such absurdities as “11” == “011” and “119” == “0119”. Which, incidentally, also shows that “interpreted as numeric” is itself a bizarrely inconsistent idea in PHP. Can you guess what 119 == 0119 evaluates to? Heck, can you guess what 0119 evaluates to?
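
(For contrast, here’s roughly what those same strings do in Python - shown only to make the PHP behavior stand out, not as a transcript of any PHP session:)

```python
print("11" == "011")     # False: strings compare as strings; no hidden numeric coercion
print(int("011"))        # 11: a leading zero in a *string* is just a character, not octal
print(int("0119"))       # 119: conversion happens only when you explicitly ask for it
print(11 == int("011"))  # True, but only because the programmer chose to convert
```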

The === operator is better but only slightly. It has its own nasty inconsistency, in that it means “strict equality” for non-objects but it’s the identity operator for objects. For objects, “==” is strict equality.
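
For contrast - and since Python keeps coming up in this thread - here’s a quick sketch of how it keeps value equality and identity as two separate operators that behave the same way for every type (the snippet is mine, purely for illustration):

```python
a = [1, 2, 3]
b = [1, 2, 3]
c = a

print(a == b)   # True: same value
print(a is b)   # False: distinct objects
print(a is c)   # True: the very same object
```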

The idea of implicit conversion is to aid programmers by relaxing the need to think about types at the cost of a bit of sloppiness. Whether this is ever a good idea is debatable, but in PHP’s case it so clearly goes beyond reasonable that it actually makes things harder – you need to constantly reason about PHP’s demented coercions or you’re setting yourself up for serious bugs. Since we’re talking about learning languages, I’d argue that this sort of behavior is a pedagogical disaster for actually learning programming, although perhaps it has value in teaching youngsters that programming is often painful, and sometimes you have to contend with systems designed by people barely competent to operate an electric toaster. But, even as much as I am usually in favor of a healthy dose of cruelty to students, PHP is just too much. It’s the splintered broomstick sodomizing where a simple paddle hazing was called for.

(And type coercion is just the tip of the broomstick. That “should have their toaster license revoked” theme runs through PHP from top to bottom. There are mistakes and deep fundamental failures of understanding in its design and implementation that I’d have a hard time forgiving a CS undergrad for, much less working “engineers”. Can you tell I don’t like PHP very much? :o)

Python is currently the best choice for a first programming language. It is widely used, it can do everything a first language needs to, and it’s easy to read and to explain, for both very simple programs and more complex beginner-level ones.

The next best choice would be Ruby, I think, or maybe Logo: Logo if your students have a sense of play and don’t need to do businessy things, and Ruby if they really want to do practical stuff rather than play around to learn.

Basic and Pascal are old school, but workable choices. C# isn’t a terrible choice, nor is JavaScript. Scheme and Smalltalk are edge cases but could both be very successful if the teacher is completely fluent in the language chosen. Scheme will work best with a class that has a very theoretical mindset and is willing to use SICP (Structure and Interpretation of Computer Programs, the classic Scheme textbook). Smalltalk is a good choice for a first language for teachers.

There’s a building-block language (you construct programs out of bricks that fit together) called Scratch. My son’s high school introductory programming class used it. http://scratch.mit.edu/ It’s a great choice for less technical students to start with.

I would humbly suggest the answer depends on what one’s goals in learning programming are… For some purposes, being close-to-the-metal may be desirable; for others, a high level of abstraction is great. [And, of course, there are other dimensions along which languages vary than this simplified one…]

There are certainly students out there who will find C a perfect choice for their purposes. There are others who will much prefer, e.g., not having to manually manage memory allocation. It depends on what you’re interested in.

The way I feel about it right now is that some languages are better designed than others, but still, most any language will do for a start; the thing to be careful about is just to emphasize that it is but one way of doing things, and some of what is learnt in the context of that language is not The One Way of Programming, but simply that language’s particular way of doing things.

[I will say, though, that having taught programming using Python, I absolutely hate Python’s system of not syntactically distinguishing between variable declarations and reassignments and instead relying on stupid voodoo rules to figure out which it thinks you mean (and all the other design decisions they were forced into as a result of this wart; e.g., the need for and behavior of the scoping keywords). This is such a pain to explain, so everyone just assumes they can ignore it, which works great until it suddenly doesn’t]

Yeah, even as a bit of a Python evangelist I have to agree; it’s a wart on an otherwise excellent language. But at least Python handles this better than JavaScript does…

… 0119 doesn’t throw an error? In most languages an octal literal with an 8 or 9 in it will simply refuse to compile or interpret. (As a side note, I have always hated and always will hate the leading-0-for-octal convention; use 0o or something.)
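
(Python 3, for what it’s worth, does exactly what this post asks for - a quick check, with the failing literals left in comments since they wouldn’t even parse:)

```python
print(0o17)   # 15: octal literals take the 0o prefix

# 0119   -> SyntaxError: leading zeros aren't allowed in decimal integer literals
# 0o19   -> SyntaxError: 9 isn't a valid octal digit
```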

Even more fun in PHP (which I’ve only played with): “0x11” == 0x11, but “011” != 011. So string-to-number conversion works correctly for decimal and hexadecimal… but not octal. Granted, this is sort of tangential to the issue at hand; the number of rank beginners who are going to be using octal (for anything more than calling a file-creation function, at least) is tiny.

Despite only having played with PHP, I do rather enjoy reading this essay: PHP: a fractal of bad design / fuzzy notepad

:smiley:

ETA: Python uses voodoo rules? I’m curious now, I’ve never run into the problem; but the most I’ve ever written in Python is some short machine learning scripts (numpy and pyplot stuff; also reinforcement learning and graph searches), and one fairly involved OpenCV project that apparently never tickled this problem.

The first language a student learns should NOT be “close to the metal” no matter what the student would prefer. Since all computer languages run on the “metal” anyway, this prejudices the learner’s mind in a way that is difficult to fix afterward. (This is the same issue Gagundathar mentioned above about C!)

I think C is an excellent SECOND language to learn for programmers. But it should not be the first one. Java is also a poor choice for a first language, because if you think the issue with Python’s variable scoping is bad, Java has about twenty worse compromises in it - though Java’s variable scoping isn’t one of them. If your class is theoretical enough that you want to teach them in depth about scoping, you could go with Scheme - it’s the cleanest theoretically, and SICP presents all the concepts needed to thoroughly understand dynamic vs. static scoping, closures and all. It’s just harder to read with all the parentheses, and so demandingly theoretical in approach that many students just don’t “get” Scheme – and the students who do comprehend Scheme, in all its conceptual purity, will be a bit taken aback by how flawed many of the other languages they have to use are. That’s another reason perhaps Python is the best choice - it does have a flaw (the variable scoping thing) and therefore won’t make the students who learn it first into snobs like Scheme might :slight_smile:

Wait, back to the topic I’d planned for this post - bad choices. PHP is a bad choice because it’s unsystematic. It will enable the practical types to get things done without actually learning much of anything about programming (yes, that’s bad for a first language!), and it’ll stymie the more theoretical learners.

That’s the point though. As a student, you should learn to manage memory manually, so you can understand how it is managed automatically.
C exposes that functionality to any degree the programmer desires. It’s hidden to a degree in other languages.
As the poster above said, C is a great second language, and this is part of why: as an intermediate student, you’ll be exploring memory management and other technical issues.

There are people who don’t? I always do that unless I’m doing something like writing a SQL query where I intimately know the data and the tables concerned.

I think there may be two competing approaches to learning programming here: a more theoretical “art of programming” approach, and a more trade-schoolish approach of learning a language that’s useful to actually do things with.

The theoretical approach isn’t teaching practical programming- it’s more concerned with teaching computer science concepts and programming via the vehicle of a learning language like Pascal that’s pretty abstract and intentionally limited/constrained.

The nuts and bolts approach champions something like Python because it’s actually useful in a practical sense once you’ve learned it, not because it’s particularly good for learning on.

Nuts and bolts aren’t practical if you don’t know how engines work.
Who do you want working on your car, the kid who tinkers, or the SAE-certified mechanic?
Anyone with any intention of doing anything serious with programming needs to learn “software engineering”.

I found this out the hard way, by being into hardware design.
Once you build a thing, you have to program it - and that’s a separate thing.

If you want to tinker around, Python or any other language is fine.
When you want to learn how to program computers, something like Pascal to C to C++ or Java is the standard route because it’s highly effective.

“should” depends on your goals.

No, C exposes that functionality to only a very particular degree. If you would like to work at a more abstract level, conveniently manipulating complex data structures without constantly fiddling with malloc and free, C will not afford you the degree of abstraction you care for.

Indeed, for many students, C will be a great second language. And for some, it won’t. It depends on what your goals are.

After considering this carefully I have decided that the worst possible language is the best one to learn on. The student will learn how to deal with the idiosyncrasies that exist in every language, and once he advances to other languages he’ll have an appreciation for the differences.

Yeah, my pithy expression now would be that the best first language would be two quite different languages.

Or, even: “No language is a good first language; thus, the best first language would be two different languages.”

Throwing some kind of error would make sense, which explains why PHP doesn’t do that. Instead, the nonconforming digit is silently ignored and 0119 evaluates to decimal 9.

That’s one of my favorites as well. Just pick out any handful of the litany of errors in that article and it becomes clear that PHP is the language for non-programmers (i.e. people who wish to avoid becoming programmers at all costs), by non-programmers.

Not quite voodoo: it’s just that a statement like “foo = 1” could be either a declaration or an assignment; there’s no way to tell just from the syntax. So Python has the “global” keyword to disambiguate in the case where you want to assign to a variable that was declared globally.
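
A minimal sketch of what that looks like in practice (the variable and function names are just made up for illustration):

```python
counter = 0

def bump_wrong():
    counter = 1          # creates a brand-new local; the module-level counter is untouched

def bump_right():
    global counter       # without this line, the assignment below would create a local
    counter = counter + 1

bump_wrong()
print(counter)   # 0
bump_right()
print(counter)   # 1
```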

This is awkward, but it’s much better than the way JavaScript handled the problem, which was with the rather poorly-named “var” keyword. Any statement of the form “var foo = value” is taken to mean a declaration and initialization of a variable named “foo” in the current scope. In practice, the “var” keyword can be left off (and often is in examples) and the code may appear to work the same way. However, it’s extremely dangerous because if there’s no variable named “foo” already in scope, when JavaScript sees “foo = value” it will assume you meant to declare a global.

The best solution, of course, would be to make declarations syntactically distinct from assignments, but for some reason this seems to be unpopular in dynamic languages. Squirrel is one dynamically typed language that does this, with its “local” keyword for stack variables, and the “new slot” operator (<-) for tables (the basic kind of object in Squirrel).

This might explain why so many schools teach Java. rimshot

Let’s be clear: it’s worse than just that the “=” symbol is used for both declarations and reassignments. The “voodoo” is the system by which Python determines which it assumes you mean.

Names mentioned in function definitions without explicit scoping refer to the lexically innermost existing binding of that name (if no such binding is found, the name is construed as bound at the outermost level, an awfulness of its own), unless there is any line anywhere in the function (but not within a further nested function) which assigns to that name, in which case this creates a new binding. It’s not possible to determine the scope of a particular instance of a name simply by reading up to that instance; one must also check whether there are any assignments referring to that same name later on. Thus, “def foo(): return x” and “def foo(): return x; x = x” have very different behavior, with the latter guaranteed to throw UnboundLocalError no matter the environment.
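
Those two definitions, written out as real code (assuming a module-level x = 5 so the first one has something to return):

```python
x = 5

def foo_a():
    return x    # no assignment to x anywhere in the body, so x is the outer binding

def foo_b():
    return x    # the (unreachable!) assignment below makes x local for the whole body,
    x = x       # so the line above blows up

print(foo_a())  # 5
try:
    foo_b()
except UnboundLocalError as e:
    print(e)    # "local variable 'x' referenced before assignment" (wording varies by version)
```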

This is particularly egregious since, if the first mention of a name within a function (again, not counting mentions within further nested functions) reads it, then the mention can only make sense as a reference to an external binding. Thus, even if one is committed to a voodoo system of inferring scoping from use, there is an infinitely better rule available: just check whether the first mention reads the variable before or without assigning to it. This gets rid of the lookahead problem. (Here, “first” might as well be interpreted syntactically rather than in terms of execution order, even if the latter is more directly relevant; it’s better for voodoo to be predictable than clever)

For that matter, there’s a second major awfulness in the explicit scoping rules: The “nonlocal” keyword can be used to suppress the creation of a new binding for a given name, even in the presence of assignment statements, unless this would cause that name to refer to a binding at the very outermost level, in which case this is a syntax error, and one must use the “global” keyword instead (which can always be used to make a name refer to bindings at the outermost level, regardless of the presence of innermore bindings).

Thus, when writing “x = 5; def foo(): {SCOPEWORD x; return x}”*, with the intent that all the instances of the name ‘x’ refer to the same variable, one must know whether this code will be written at the outermost level or within a function definition to know whether SCOPEWORD should be “global” or “nonlocal”. When moving such code between the outermost level and the inside of a function definition, one must hunt down all the relevant instances of “global” and change them to “nonlocal”.
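
Rendered as actual Python (placeholder values are mine, and I’ve added an assignment so the choice of keyword actually matters):

```python
# At the outermost level: only "global" is legal here.
# Writing "nonlocal x" instead is a SyntaxError ("no binding for nonlocal 'x' found").
x = 5
def foo():
    global x
    x = x + 1
    return x
print(foo())       # 6

# The same code moved inside a function: now "global" points at the wrong x,
# and every occurrence has to be hunted down and changed to "nonlocal".
def wrapper():
    x = 50
    def foo():
        nonlocal x
        x = x + 1
        return x
    return foo()
print(wrapper())   # 51
```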

Why? Why on Earth shouldn’t “nonlocal” simply and naturally cover the global case as well?! What in God’s name is the point of enforcing a distinct keyword for this special case?

Related to this, a minor awfulness: the “global” keyword allows reference from arbitrarily nested functions to bindings at the outermost level. However, there is no way to refer from arbitrarily nested functions to bindings at any level in between the outermost level and the innermost level with a matching binding. Which means the code “x = 5; def foo(): {x = 6; def foo2(): {SCOPEWORD x; return x}; return foo2()}”*, with the intent that foo() returns 5, can be written at the top level using “global” for SCOPEWORD, but it is impossible to find an appropriate substitution for SCOPEWORD at any other level. When moving such code between the top level and the inside of a function definition, one must hunt down all the relevant instances of “global” and then… sigh loudly and give up, because the desired keyword doesn’t exist.
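
Again in real syntax (placeholder values mine) - the top-level version works with “global”, and there is simply no keyword that does the same job once this whole block is itself nested inside another function:

```python
x = 5

def foo():
    x = 6
    def foo2():
        global x     # reaches the module-level x = 5, skipping over foo's x = 6
        return x
    return foo2()

print(foo())   # 5

# With "nonlocal x" in foo2, foo() would return 6 (the innermost enclosing binding);
# there is no "second-innermost-binding" keyword that skips past it.
```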

Instead of all this: Either all names should refer to the innermost existing binding by default and there should be a special syntax to declare new variables, or all names should refer to local variables by default and there should be a keyword to suppress this behavior and use the innermost matching binding instead. I prefer the former, but I understand that Python intentionally shied away from this to make initialization of variables indistinguishable from later assignments to them. There oughtn’t be a keyword specifically for reference to bindings at the outermost level, but it would be acceptable to have means of referring to second-innermost-binding, third-innermost-binding, etc. Finally, I would prefer that it not be possible to refer to uninitialized variables; every variable should be initialized at declaration time. However, even if this was not adopted, a reference to a variable for which no binding exists should be considered a syntax error, not an opportunity to pretend there is such an uninitialized binding automatically at the outermost level.

These are basically the biggest mistakes of Python’s design. Were they fixed, I would have an order of magnitude fewer quibbles with the language (much like Kernighan re: Pascal’s arrays). Yes, eventually you get used to it… but getting used to a terrible design doesn’t make it not terrible. The fact that Python seems to keep having to change its scoping system between versions (I am referring to Python 3 in the above) bolsters my confidence that it has never been well thought out, and that this is all just poor language design on their part rather than simply a failure of appreciation on my part.

(A talented programmer with decades of experience once said to me, about the Python scope-disambiguation system, “There are rules? I just fiddle with the code until it works.” I suspect they are not alone in having adopted that attitude…)

It was particularly shitty for my intro programming students, who did not yet have any experience with more sanely designed programming systems to draw on, as they would largely either be confused and overwhelmed at random intervals by this nonsense, or, worse, grow to accept it as the natural, proper way of things. (Yes, I sought to explain the details of this system to them early on, even as fellow teachers adopted the “Eh, let’s just not mention it and hope it never comes up” attitude, but it was unfortunate to see that the students had difficulty appreciating the distinction between the general concepts of scope of variables, environments, etc., and the particular complications which arise in determining which scope a given name gets in Python, due to its particular boneheaded design. Students stymied by the latter would find that it frustrated their understanding of the former as well.) Thus, the need for early exposure to other languages…

[*: I am using semicolons and braces here not as valid Python syntax, but simply to indicate how the newlines and indentation would go in valid Python syntax, because I’m too lazy to bother with “code” tags right now]

Resolved, then: INTERCAL it is!

Overview. Reference Manual.

One problem with teaching C to beginners is that memory bugs will be exposed randomly and in random patterns. It can be extremely frustrating and difficult even for experienced programmers to track down those types of bugs. It’s better if students don’t have to deal with those issues until they are more developed.

I would favor having them learn a loosely typed language like Python. Because it’s loosely typed, they have to think about where values are coming from and going to. They can’t just throw parameters into a method call and let the compiler find any bugs. If they make an error, its cause can be easily tracked down to its origin. A loosely typed language builds good habits that make better programmers. Then when they go to a strongly typed language, they appreciate what it does for them.

I find that an odd attitude. A strongly, statically typed language will just as much force them to think about where values are coming from and going to, with the cause of the error easily being trackable down to its origin (that’s precisely what the compile-time error messages noting type mismatches amount to!); they just will be forced to confront these issues immediately, with relatively clear and to-the-point notices of what they’ve done incorrectly, rather than waiting for entirely avoidable bugs to get silently accepted and then pop up in random, inscrutable ways at runtime.
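
(Incidentally, you can get a taste of this even in Python these days with type annotations plus an external checker such as mypy - neither is required by the language, and this is just a sketch of the idea:)

```python
def total_price(quantity: int, unit_price: int) -> int:
    return quantity * unit_price

print(total_price("3", 2))   # plain Python happily prints 33 - string repetition, a silently
                             # wrong result - while a static checker flags the call up front,
                             # roughly: Argument 1 to "total_price" has incompatible
                             # type "str"; expected "int"
```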