Explain the concept of "computer language" to me.

Lemme try that again …

Instead of “prescriptive”, insert “imperative”. I thought of the sentence that way and by the time my fingers got to that part, the word had completely escaped me and prescriptive was the best substitute I could come up with. Human thinking is a strange and wonderful, but not terribly reliable, phenomenon.

Be careful with those terms. Many people speaking exclusively of computer languages will use “imperative” in a different sense than you intend here, to distinguish between languages like C, Java, etc., and “declarative” languages like SQL, HTML, etc. In the sense you intend, the “declarative” languages are still recipes telling some application running on the computer to DO something, but using the word “imperative” for them could cause confusion when someone reads something about SQL NOT being “imperative” (I think SQL skates dangerously close to imperative, given the way people think about transactions and the sort of procedure-like extensions that most vendors have added, but most descriptions will insist on calling it “declarative”).

In terms of the OP’s original question, one important distinction is that virtually all computer languages strive to be context-free and parsable by an LR parser (in practice, most actual languages have some contextual “warts” when you get down to writing a compiler for them, but the intent is that they be context-free). Human languages are contextual as hell (interesting invented things like Loglan aside). Part of the reason computer languages are this way is the purpose they serve, and part of it is to make them practical to process.

Now that I dragged that in, I’m going to have to translate it. Most of the points have been touched upon already. The “LR parsable” buzzword has to do with how you write a program to walk through the source and extract the directions for the computer. It essentially means that your program is going to read “left to right” through the input source and, with a finite amount of “lookahead”, make sense out of what it is looking at, at every point, strictly by using its grammar rules and the statements you’ve already made. For instance, let’s take a few simple statements in some garden variety procedural programming language (for our purposes, it doesn’t matter what it actually is, and in fact, it isn’t anything that I know of):

integer x;
integer y;
x = input();
y = x + 7;

When I design a compiler (or interpreter or whatever) that is going to understand that, it can walk through it and make sense out of it strictly from grammatical rules - “integer” is probably a keyword, so it can apply some sort of declaration rule and “know” that “x” and “y” are names I have now chosen to use for integer values. “input” may be a keyword or known as some library function - the rules the compiler has let it realize that it makes sense to assign its result to one of my integer variables. “7” is something it can recognize as an explicit integer value (we call such things “literals”). The grammar rules it has built in concerning expressions allow it to determine what to do with “x + 7” and verify that I may indeed assign it to the variable I called “y”. All unambiguous and not dependent on some unstated nature of things named “x”, “y” or “7”.
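To make that concrete, here’s a minimal sketch in Python (purely illustrative - the toy language above isn’t real, and these “rules” are invented for the example) of the kind of bookkeeping a compiler does on those four lines: record declared names in a symbol table, then check each assignment against purely local rules.

# The four statements from the example above, pre-split into tokens.
program = [
    ["integer", "x"],           # declaration
    ["integer", "y"],           # declaration
    ["x", "=", "input()"],      # assignment from a known function
    ["y", "=", "x", "+", "7"],  # assignment from an expression
]

symbols = {}  # names declared so far, mapped to their type

for tokens in program:
    if tokens[0] == "integer":                    # declaration rule
        symbols[tokens[1]] = "integer"
    elif tokens[1] == "=":                        # assignment rule
        target = tokens[0]
        assert target in symbols, f"{target} was never declared"
        for tok in tokens[2:]:                    # operands must be known
            if tok in ("+", "input()") or tok.isdigit():
                continue
            assert tok in symbols, f"unknown name {tok}"
    print("ok:", " ".join(tokens))

Nothing in that walk needed to know what “x” or “y” mean out in the world; the grammar rules and the declarations already made were enough.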

Contrast that with a couple of statements in English:

“One should not drive a car with faulty brakes”
“One should not ride a motorcycle without a helmet”

You don’t even think about it, but those two sentences are horribly ambiguous - to determine that “with faulty brakes” modifies “car” in the first example, and “without a helmet” modifies “one” in the second sentence, you HAVE to rely on things other than the nearly identical grammatical structure, namely, your knowledge that people don’t have brakes and motorcycles don’t wear helmets. Most people get too “cute” in constructing examples to illustrate the ambiguity of natural language, and provide something that hinges on a play on words. That obscures the point, I think. The issue is more fundamental than that. Go through a newspaper article sometime being deliberately obtuse about the attachment of modifiers, and you’ll see what I mean.

I’m not a professional programmer, Pookah, but layperson to layperson, I figure I’d tackle the explanation of how some computer languages might be better at certain tasks than others. I’m sure the coders here will correct me where I’m wrong, or provide concrete examples where I’m just giving vague generalities.

When you are running a webserver that helps thousands of visitors per second, you want a language capable of handling multiple simultaneous users discretely, one which uses a minimum of run-time and memory space, and one which effectively keeps users restricted only to the parts of the information they’re entitled to. To keep the server’s resource consumption to a minimum, you’d use a language lightweight and nimble enough to process its script quickly.

When you are running a stand-alone program on a computer (a game, for instance, or a word processor) you have the entire computer to yourself, so it needn’t be small or nimble, necessarily. What you want here is that the program be robust: powerful enough to handle what users want, and secure enough that the users can’t foul it up with a few unthinking mouse clicks.

Some languages enable you to define certain terms or objects or functions or processes that you plan to repeat a lot. You might say: “Computer, I want to build a blueprint for all the buttons on this program. They’ll all be round and gray and when you click 'em, they’ll play click.noise. They’ll have text in 'em that’s in Palatino Bold 12 point and they’ll be 40 pixels tall.” Later, you tell the computer, “Okay, I want four Buttons here, just like I said before. One will say OKAY, one will say CANCEL, one will say BACK, one will say HELP.” The computer can then apply the blueprint Button to each instance of that blueprint that you create. You might then want to create a class of items called RedButton, and you say, “Computer, this will be exactly like Button, except it’ll be red.” You can then create some instances of RedButton objects which have the same traits as your blueprint. This kind of computer language is useful for specific kinds of data handling, but not all kinds.
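In object-oriented terms, that “blueprint” is a class and each button is an instance. Here’s a rough sketch in Python (the names Button, RedButton and click.noise are just the hypothetical ones from the description above):

class Button:
    """Blueprint for every button: shape, colour, font, click sound."""
    def __init__(self, label):
        self.label = label
        self.shape = "round"
        self.colour = "gray"
        self.font = ("Palatino Bold", 12)
        self.height_px = 40

    def click(self):
        print(f"[{self.label}] plays click.noise")

class RedButton(Button):
    """Exactly like Button, except it is red."""
    def __init__(self, label):
        super().__init__(label)
        self.colour = "red"

# "Okay, I want four Buttons here, just like I said before."
buttons = [Button(text) for text in ("OKAY", "CANCEL", "BACK", "HELP")]
panic = RedButton("PANIC")   # inherits every trait, overrides only the colour
buttons[0].click()

The computer applies the blueprint to each instance you create; RedButton only has to say what is different about it.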

Another language might be particularly good at dissecting strings of text. Your job might be to typeset someone’s manuscript, and you want to go through it and find every “…” that you can and replace it with “. . .” instead, except that when you find “…” you want to leave it completely alone. A text-parsing language would be easiest here, because it has commands written that do exactly this. In another language you might have to program a routine from scratch that will perform this same function.
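Here is a minimal sketch of that in Python, assuming the job is turning a bare run of three periods into spaced periods while leaving text that is already spaced (or part of a longer run) alone; the manuscript string and the exact rule are just assumptions for illustration:

import re

manuscript = 'He paused... then spoke. She had already typed ". . ." correctly.'

# Replace exactly three unspaced periods with ". . .", but leave anything
# that is already spaced, or part of a longer run of periods, untouched.
fixed = re.sub(r'(?<!\.)\.\.\.(?!\.)', '. . .', manuscript)
print(fixed)

In a language without built-in pattern matching, you would be writing that scanning loop yourself.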

Another language might excel at handling multiple users. Such a language might give every user a particular level of security, and built into that language is some kind of permission scheme: you can’t get into the /secrets/stuff/passwords directory because you don’t have access! Before access is granted, the computer checks who is doing the asking and whether it’s okay to tell them. Other languages may not have this feature built in, and the programmer must build the security permissions manually.
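As a sketch of that idea in Python (the paths, user names, and the PERMISSIONS table here are all invented), the check boils down to a lookup that asks who is doing the asking before granting access:

# Hypothetical permission table: path -> users allowed to read it.
PERMISSIONS = {
    "/secrets/stuff/passwords": {"root", "admin"},
    "/public/readme.txt": {"root", "admin", "guest"},
}

def can_read(user, path):
    """Before access is granted, check who is doing the asking."""
    return user in PERMISSIONS.get(path, set())

print(can_read("guest", "/secrets/stuff/passwords"))  # False - no access!
print(can_read("admin", "/secrets/stuff/passwords"))  # True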

Last, there are various ways that languages can execute. One is line-by-line, in order, starting at the top. For instance, written in pseudocode:


check recipe
if recipe.needs-sugar then import sugar
if recipe.needs-eggs then import eggs
if recipe.needs-flour then import flour
if recipe.needs-salt then import salt
if recipe.stove-temperature > 0 then openburner = getopenburner(stove) and
     set openburner to recipe.stove-temperature

In this program, the computer executes all the lines of code even if you’re making cold cereal. Some languages are slightly more efficient: once they figure out that the recipe does not need sugar, they skip to the next line (because everything after that is irrelevant).

A different language may handle it this way, again in pseudocode:


blueprint for Recipe:
All recipes will have a list of ingredients called list-of-ingredients.
All recipes will say true/false if they need to be cooked, a
     variable called is-stove-on.
All recipes will have cooking-temp and cooking-time variables.
end Recipe

cereal = a Recipe
list-of-ingredients = [captain_crunch, milk]
is-stove-on = false
cooking-temp = 0
cooking-time = 0
end cereal

friedeggs = a Recipe
list-of-ingredients = [eggs, salt]
is-stove-on = true
cooking-temp = Medium
cooking-time = 360
end friedeggs

Main:
check recipe
what-we-need = get list-of-ingredients(cereal)
import what-we-need
if is-stove-on(cereal) then setstove(cereal.cooking-temp)
cook-for(cereal.cooking-time)
eat!
end Main

In this style of writing, it can be slightly more efficient because the “fried eggs” part won’t actually run unless it is called upon to do so. It has been defined but the computer need never actually execute that code until it’s needed.
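The same idea in ordinary code: both recipes get defined up front, but only the one the main routine actually asks for ever runs. A rough Python equivalent of the pseudocode above (all names invented to mirror it):

from dataclasses import dataclass, field

@dataclass
class Recipe:
    ingredients: list = field(default_factory=list)
    stove_on: bool = False
    cooking_temp: str = "off"
    cooking_time: int = 0          # seconds

cereal = Recipe(ingredients=["captain_crunch", "milk"])
fried_eggs = Recipe(ingredients=["eggs", "salt"],
                    stove_on=True, cooking_temp="medium", cooking_time=360)

def main():
    recipe = cereal                # fried_eggs is defined but never touched
    print("need:", recipe.ingredients)
    if recipe.stove_on:            # skipped entirely for cold cereal
        print("set stove to", recipe.cooking_temp)
    print("eat!")

main()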

What you’re describing here is really the operating system rather than anything to do with computer languages. All modern OSs will cope with multiple processes, even those from Microsoft :). The programs running on a webserver can be written in anything, most likely Perl, C or PHP.

Every language I can think of allows (if not demands) this. What you go on to describe is the object-oriented approach, where everything in a program is an object. A string of text is an object, a button is an object, a database connection is an object. If you want a red button, what you do is take an existing basic button object (class, actually, but this is technical enough already) and extend it by adding the red-coloured attribute. This saves you from writing all the code to make a button work, because you can just pick the existing button object off the shelf and bolt extra bits on to it.

And if you wanted to, you could even write the TCP/IP stack & webserver in PHP. :slight_smile:

Yabob,

Yup, I understand your point. I was trying to keep the discussion at the OP’s level, using terms in their plain-English sense.

HTML, at least before the advent of DHTML, could legitimately be described as almost entirely declarative. DHTML, advanced CSS and of course the embedded script functionality all muddy the waters in the direction of imperativity (new word!).

For SQL, I have a problem with calling it declarative. This


INSERT INTO TableName (Field1, Field2) VALUES (Value1, Value2)

is 100% imperative, no less than x:=y+z is imperative. Certainly the DDL portions of SQL (i.e. CREATE …) are declarative, but every imperative language still needs a declarative component to define the “nouns” that the imperative “verbs” of the imperative part will act on.

We’re on the same page. Nonetheless, many people will insist on typifying SQL as “declarative”, including the people who write primers on SQL, and I was addressing what people would find if they poke around the literature. And I edited out something about not considering such things as embedded scripts for HTML.

I don’t believe anyone has yet linked directly to this Wikipedia page of Esoteric Programming Languages, so I will do the honors.

Thank you, SmallClanger. I knew I was specifically describing an object-oriented approach but I wanted to keep the discussion in layman’s terms and didn’t want to get into too-technical terms like subroutines or defined functions that exist in other languages. I also didn’t want to have to get into the concept of the pre-designed top-level classes that come standard in a particular language.

Which did actually help a lot. :slight_smile:

Yabob, sorry I mixed up terms. Before this discussion I really knew nothing about the whole topic and I thought they were the same thing. I know exactly what you mean from other types of discussion, though. I’ve seen many discussions derail because people are talking about two different things using the same word.

To all. I’m still reading. It’s a lot to get my head around and I’m a bit confused about some of the finer points. I do think, however, that I understand at least the basic idea now, which was the point of the thread. Funny, but threads actually ending up doing what you started them for is a rarer event than you’d expect. Well, that’s my experience anyway. :wink:

Fish, it sounds like the language you need for your demonstration project is Chef. :slight_smile:

As far as computer languages being “context independent”, what about operator overloading? In some languages (C++, I think, for example) an operator can mean different things depending on the type of objects it’s acting on. So 12 + 25 (both integers) equals 37, but “12” + “25” (both strings) might be “1225”. Isn’t this a sort of context dependence?

“Context-free” is a technical term, meaning that whether the syntax of a block of code is valid depends only on the contents of that block, and not the code around it.

Also, you might observe that the way you would typically write a parser for a language like C/C++ would not take types into account at the level of the grammar rules. Your rule for “expression” would include a rule that said an “expression” could be “expression + expression” (with precedence handling, but let’s leave that alone for now). It would be inside the action taken for that rule where you would generate different code based on the types of the left and right expressions, or generate a compiler error if they were inappropriate for the operator.

This could rapidly become far more technical than the spirit of the original question. The point I wanted to make with the parsability stuff was simply that there is a gross difference between natural languages and computer languages in the ability to interpret them mechanistically without running into ambiguity that has to be resolved by some intrinsic knowledge of the things represented by the language. Good thing natural languages work that way, too, or some of the best humor would be impossible.

C++ fails to be context-free for all sorts of reasons, but I don’t think that’s one of them.

For a better example, the piece of syntax Thing f(); can be either (1) a declaration of a function named f that takes no arguments and returns a Thing; or (2) the definition of an object named f, of type Thing, which is initialized using the default constructor function for Things. Here you have a piece of syntax which can be resolved (“reduced”) by either of two different rules, but the compiler can’t decide which rule is correct without non-local information. Even then it might not be able to decide, if both rules happened to be valid.

Context-free is a syntactic, not a semantic, statement. Operator overloading is a semantic artifact. The “+” in a + b can refer to an integer add or floating point add (generating different machine code) in most languages, and is thus overloaded.
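A quick way to see “same grammar rule, different meaning” is Python, where the example from the question works exactly as described and a class can even overload + itself (this is Python rather than the C++ the question mentioned, but the idea is the same):

print(12 + 25)        # 37   - integer addition
print("12" + "25")    # 1225 - string concatenation, same "+" syntax

class Vector:
    def __init__(self, x, y):
        self.x, self.y = x, y
    def __add__(self, other):          # user-defined overload of "+"
        return Vector(self.x + other.x, self.y + other.y)

v = Vector(1, 2) + Vector(3, 4)
print(v.x, v.y)       # 4 6

The parser handles every one of those with the same “expression + expression” rule; only the meaning attached afterwards differs.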

I’ve never looked at the grammar of C++, but I’d say the grammar of C is context-free except for bugs. I once tried to figure out how to write code for an array of function pointers from the syntax - every hypothesis I came up with turned out to be syntactically legal! (I figured it out by looking at the code for emacs.)

Does anyone learn BNF these days? That stands for Backus Normal (or Backus-Naur) Form, and describes the syntax of a programming language. Some of the BNF for Pascal, from Jensen & Wirth, is

<expression> ::= <simple expression> | <simple expression> <relational operator> <simple expression>

which means an expression is either a simple expression, or two simple expressions around a relational operator.

<relational operator> ::= = | <> | < | <= | > | >= | in

<simple expression> ::= <term> | <sign> <term> | <simple expression> <adding operator> <term>

and so forth. Parser generators, like yacc, can take a BNF and generate code to parse it, with hooks for semantic operations for each production. When I was in grad school, SIGPLAN Notices would have a new language just about every issue, but I suspect language creation is not so hot these days.

IIRC, grammar types from Chomsky are used for classification of programming languages - but as mentioned, real languages are essentially context-free.
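To show how productions like those turn into code, here is a hand-written recursive-descent sketch in Python for just the <expression> and <relational operator> rules quoted above; the <simple expression> side is stubbed out to keep it short, so read it as an illustration rather than a real Pascal parser.

RELATIONAL_OPS = {"=", "<>", "<", "<=", ">", ">=", "in"}

def parse_simple_expression(tokens, pos):
    # Stub: a real parser would implement <simple expression> ::= <term> | ...
    return ("simple", tokens[pos]), pos + 1

def parse_expression(tokens, pos=0):
    # <expression> ::= <simple expression>
    #               | <simple expression> <relational operator> <simple expression>
    left, pos = parse_simple_expression(tokens, pos)
    if pos < len(tokens) and tokens[pos] in RELATIONAL_OPS:
        op = tokens[pos]
        right, pos = parse_simple_expression(tokens, pos + 1)
        return ("relation", op, left, right), pos
    return left, pos

tree, _ = parse_expression(["x", "<=", "y"])
print(tree)   # ('relation', '<=', ('simple', 'x'), ('simple', 'y'))

A parser generator like yacc writes this kind of walking code for you from the grammar, which is the whole attraction.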

It annoys me how BNF is not taught in every 1st year CS course. It took me literally 20 minutes to learn, yet it’s invaluable whenever I need to specify a custom communication/data storage protocol. Whenever I’m in a team, I’m the only one who knows BNF, so trying to communicate stuff is a pain in the ass.

Here’s a basic example of a non-context-free C statement:


foo * bar;

That can be parsed either as an expression (multiply foo times bar, then throw away the result) or a declaration (declare bar as a pointer to foo). Before you can parse it correctly, you have to know whether ‘foo’ is a typedef or a variable.

It is possible to use a context-free parser generator to parse C, but you have to use a hack in the lexer. The lexer would see the identifier “foo”, look it up in a symbol table, and feed it to the parser as either a type identifier or a regular identifier, depending on how it’s been declared.
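A toy version of that lexer hack in Python, rather than a real lex/yacc setup: the lexer consults the symbol table before deciding which kind of token to hand to the parser, so the grammar itself can stay context-free (the token names and the two-way branch are simplifications).

# Names already seen in a typedef; a real compiler's parser updates this
# table as declarations are reduced.
typedef_names = {"foo"}

def lex_identifier(name):
    """The 'lexer hack': classify an identifier by consulting the symbol table."""
    return ("TYPE_NAME", name) if name in typedef_names else ("IDENTIFIER", name)

def parse_statement(first_word):
    kind, name = lex_identifier(first_word)
    if kind == "TYPE_NAME":
        return f"declaration: 'bar' is a pointer to {name}"
    return f"expression: multiply {name} by bar, then throw away the result"

print(parse_statement("foo"))   # foo is a typedef here -> parsed as a declaration
typedef_names.clear()           # pretend foo was an ordinary variable instead
print(parse_statement("foo"))   # now the same text parses as an expression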

Okay, here’s a concrete example for you to see the actual syntax differences in two different languages. The routine will be a single function that will check a user’s password against the user file. It will do the following tasks:

  1. Receive the user’s name and password (as entered by the user somewhere else)
  2. Open the user file
  3. See if the name and password match an existing user
  4. Close the user file
  5. Return a “true” or “false”; either both name and password were found or not

I will use the same logic for both examples. The only difference will be the syntax of the language. (That is, I would never check a file this way in Access…)

In Visual FoxPro (VFP):


FUNCTION CheckPassword
PARAMETER pusername, ppassword
  LOCAL bfound

  bfound = .f.
  USE USERS INDEX USER     && open the user table with its index
  SEEK pusername           && look up the name we were handed
  IF FOUND() THEN
    IF password = ppassword THEN
      bfound = .t.
    ENDIF
  ENDIF
  USE                      && close the table
RETURN (bfound)

In Microsoft Access using DAO (VB):


Private Function CheckPassword(pstrUser As String, pstrPW As String) As Boolean
  Dim blnFound As Boolean
  Dim db As DAO.Database
  Dim rst As DAO.Recordset

  blnFound = False
  Set db = CurrentDb                  ' the database this code is running in
  Set rst = db.OpenRecordset("tblUser", dbOpenSnapshot)
  rst.FindFirst "User = '" & pstrUser & "'"
  If Not rst.NoMatch Then             ' FindFirst sets NoMatch if the user wasn't found
    If rst!password = pstrPW Then
      blnFound = True
    End If
  End If
  rst.Close
  Set rst = Nothing
  CheckPassword = blnFound
End Function

The differences number about as many as the similarities. First, VFP is hardly typed at all, which drives me insane. Look at the two function definitions:

FUNCTION CheckPassword
PARAMETER pusername, ppassword

In VFP, we know almost nothing. Does it return a value? (Probably. Part of the definition of a “Function” is that it returns a value.) Do we know anything about its return value? We know it receives 2 parameters, but what kind of variables are they? Numbers? Strings? Arrays? User defined types? We don’t know anything.

Private Function CheckPassword(pstrUser As String, pstrPW As String) As Boolean

In VB, we will get compiler errors if we don’t explicitly give much more information. The “Private” tells us this function can only be seen by the current object, such as a login dialog box (called a Form).

The two parameters (pstrUser and pstrPW) are defined as Strings, so we’ll get runtime errors (“Type Mismatch” ) if we try to send anything other than 2 strings to it. We’ll also get error messages if we don’t send exactly 2 parameters (“Argument not optional.”)

And finally, the function itself is defined as returning either true or false (“As Boolean”).

Which language is better? Well, neither is particularly good to begin with, but if you had to choose, you’d need to know a few things:

  1. Who will write and maintain the system? VFP programmers are much harder to find.

  2. How big is my data? Under 20,000 records, Access is fine. Between 20,000 and 2,000,000, you need VFP. Over that? Other languages are better.

I think I’ve forgotten what my point, if any, was supposed to be.