A couple basic (hah) computer programming questions

In the mildly abstract, a compiler is a program that translates programs in one language to programs in another, so Monstre is right in that respect.

In common usage, a compiler is a program that makes machine code out of a high-level language, so Metacom is right in that sense.

Actually, what I said was:

Note the use of the word “and” – in my statement, I’m using it in the boolean sense. I was stating that Java code is to be compiled AND interpreted. (I did not say “compiled OR interpreted”).

Since I clearly explain that point to my Java students, along with the two J2SDK programs that do these tasks (javac and java), they don’t seem to have any trouble understanding what is meant when I talk about compiling Java source code.

  1. Does each language require a special editor, or can I use, say, Notepad for any of them? If not, what sort of file would I save the work in? IIRC, Visual Basic has its own GUI, but I’m not sure if that’s necessary to write programs in.

Generally you can use any old editor to program. Once upon a time, I took a course where we used - I think - APL, and it had characters in it that most card punch machines couldn’t represent, so we had to use one of the two card punch machines that had those characters. And some languages can use more characters than many editors can write. Forth, for instance, can use any “8 bit ASCII” characters wherever you want including function and variable names - but doesn’t need to if you don’t want to. If you want to check out a truly professional level text editor that is popular with programmers (and is itself programmable), look at The Semware Editor, made by the folks who brought us the Q editor (from which it evolved). One of the niftiest $49 I have ever spent!

  1. I assume that some languages are more efficient than others. Are there some languages that are designed for specific areas and won’t work well for other applications? (Uh, poor choice of words. That is, is there a “financial” language that won’t make good games, and vice versa?)
    Absolutely! The C language is great for breaking things down into modules and assigning different modules to different programmers. Forth is famous for hardware control (it’s what the control system for the Space Shuttle cargo bay arm is written in). BASIC is not good at anything but is easy to understand (“Beginner’s All-purpose Symbolic Instruction Code”). Scheme is good for programming Fibonacci number generators, and it’s funny to watch people writing Scheme books search for another example besides that one. Assembly is the way to go if you want to write the very fastest code, or the very shortest program - a few dozen bytes even.

  2. How does one write a programming language?
    Wow, good question - you might check out C. H. Ting’s book about eForth. He lays out a few pages of assembly to create the start of a Forth system, then develops the rest of a full featured package by writing in what Forth he’s gotten so far. It’s a short book. If you have an assembler like MASM, you can create this programming language just by typing what’s in his book. On a good day I imagine I can picture everything happening inside the computer with this system.
    Do they need to know Binary?
    Depends what they are doing, but generally they need some idea, especially if it’s a very “low level” language they’re working on.
    Do they use another language as a base?
    Somehow, yes, though maybe just assembly. At one time people used “opcodes”, the bytes used by the microprocessor itself. Look up “opcode table” to get an idea about this.

It’s pretty simple, though I rarely feel the need to do it. (I just had to do it earlier today for the first time in years, but that’s beside the point.)

In VB, simply open the form’s *.frm file with notepad. Quite an eye opener.

I had to change some form name references, because I had several with the same internal name (don’t ask), and the project wouldn’t let me load them concurrently. So I opened them in Notepad, fixed things up, and they loaded into the project fine.

For Visual C++, open the .rc file in notepad. You should be able to figure out the syntax very easily.
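For the curious, a dialog in a .rc file looks something like this (the dialog name and the 1001 control ID here are made up, and coordinates are in dialog units, not pixels):

```rc
// Hypothetical dialog; a real project defines the IDs in resource.h
MYDIALOG DIALOG 20, 20, 200, 100
STYLE WS_CAPTION | WS_SYSMENU
CAPTION "Example"
BEGIN
    LTEXT      "Name:", -1,   10, 10, 40, 12
    EDITTEXT   1001,          55, 10, 130, 12
    PUSHBUTTON "OK",    IDOK, 75, 75, 50, 14
END
```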

A rather amusing urban legend: early in gcc development, the NSA inserted a tiny bit of code into gcc which did the following two things:

If it detected it was compiling gcc, it would secretly insert itself into the binary.

If it detected it was compiling a *nix kernel, it would insert a tiny backdoor into the binary.

Because almost everyone uses gcc to compile gcc itself, this trojan is self-propagating and gives the NSA backdoor access to any *nix system on the internet. It would furthermore lull security-paranoid freaks into a false sense of security, because the source code is open and clean but the binaries aren’t.

What about LOGO?

::: looks around for any takers, and then ducks and runs :::

Interesting. How would you create visual widgets programmatically? For example, if you wanted to create 30 text boxes, all exactly 10 pixels apart from each other? In VB/C#, it’s relatively simple, since all the code is in the same file and you can just look at the MS-generated code. I find it odd that MS keeps it separate in VC++.

The “NSA” part is urban legend, but the rest of it is (allegedly*) true. The actual culprit was Ken Thompson; he owned up to it in his Turing Award lecture as “the cutest program I ever wrote” (here’s a PDF version). I highly recommend this article to anyone who likes reading about cool hacks, for both edification and amusement.

*I say “allegedly” only because I’ve not heard anyone independently verify this trojan’s presence; but I don’t have any reason to question Thompson’s statements, either.

All the .NET languages, including the newer versions of Visual C++, do the object positioning with code in your form’s source file. It’s just collapsed in the IDE, so you don’t have to see it all the time. Looks like this:


this.button1.Location = new System.Drawing.Point(96, 72);
this.button1.Name = "button1";
this.button1.TabIndex = 0;
this.button1.Text = "button1";

Older versions of Visual Basic used a text file to describe their forms. I stopped using VB after version 3 or so, but IIRC, the format was something like Delphi’s.

Delphi and C++Builder store their forms in a text file (*.dfm) that you can edit in the IDE by viewing the form as text. Looks like this:


object Button1: TButton
  Left = 168
  Top = 208
  Width = 75
  Height = 25
  Caption = 'Button1'
end

You’d use a Windows API function like CreateWindowEx. (If you had to create a lot of controls programmatically, you’d probably write some helper functions to make it easier, and pretty soon you’d wish you had a standard class library like the ones in Java, Delphi, or .NET.)

Versions of Visual C++ before .NET didn’t have anything like the “forms” in VB, VC#, or Delphi. The closest you could get was a dialog box, which used a dialog resource file that limited you to a few predefined types of controls and wasn’t as closely linked to your code. You could edit the resource file with a text editor, or use a visual dialog editor.

IMO, VC++ didn’t deserve the name “Visual” before .NET and Windows Forms came along. Any Windows C++ compiler could be used to make a program with dialog boxes, using the very same resource files.

Some examples of more specialized languages that I haven’t seen mentioned (I’ll use the modern definition of “good game” and exclude text adventures and such):
[ul]
[li]TeX, Postscript, and *roff - these are all full-blown languages with control structures, etc., but they’re oriented towards specifying printed matter. Can’t make a good game with them.[/li]
[li]shells (both the Unix shells and DCL on VMS, WSH on Windows, etc.) - these languages are good for doing simple operating system maintenance tasks, manipulating files, controlling other programs, etc., but get hairy pretty quickly if you push them beyond that. Can’t make a good game with them.[/li]
[li]matlab - this language is oriented towards numeric calculations, especially those involving matrices. Can’t make a good game with it.[/li]
[li]CNC languages - these are languages for controlling metal fabrication equipment: lathes and mills and such. Can’t make a good game with them.[/li]
[/ul]
There are a LOT more out there–usually created for specialized applications. I’ve even made a few myself.

And when you get into languages that have an extensible syntax, like Lisp, you can even argue that every program that extends the syntax creates a new language that’s specialized in whatever that program does.

But please don’t make the mistake of lumping Common Lisp in with Scheme. It’s an industrial-strength language that can run about as fast as C (with a hell of a lot more abstraction features) and that can do an awful lot more than generate Fibonacci sequences. I say that as someone whose first out-of-college job was developing an application in Common Lisp. :slight_smile:

Even with your more specific definition of what “compiled” means, Java is still both compiled and interpreted.

Java compiles to bytecode (not the native language of any preexisting CPU, but there are chips now that can execute Java bytecode), which can be interpreted by a virtual machine, but usually it’s translated into native machine code before it runs. That’s just-in-time compilation, or JIT.

.NET languages also compile to bytecode, but it’s always JITted before it runs, never interpreted directly.

And if you still don’t think that counts as “compiled”, the GNU Compiler for Java can compile Java source code directly to native machine code.

Wish I’d thought of that. :stuck_out_tongue:

Look, I’m not trying to define the word “compile,” just explaining how I’m used to hearing it used. I would never say “Java is a compiled language,” because the canonical implementation is byte-compiled and interpreted on a virtual machine, and I’d consider it misleading to imply otherwise. In my opinion, when people ask if a language is compiled or not, what they’re really asking is whether the compiler outputs a binary executable that the operating system can natively execute.

Disagree? Use the phrase “compiled language” differently? That’s fine. Maybe the OP will encounter your usage too, and now he knows that there’s disagreement about the term.

The default implementation of all modern Java VMs uses HotSpot: http://mindprod.com/jgloss/hotspot.html which compiles directly to native code.

Right. But it does so at runtime, and it does so by compiling bytecode, not source code. It’s really more of a technique for implementing a fast interpreter: you have one program (the Java VM) analyzing data and acting accordingly.

I’d feel comfortable saying “C is a compiled language” because pretty much all implementations are compiled.

I’d feel comfortable saying “awk is an interpreted language” because I don’t know of any compiled implementations, and all the major implementations are interpreted.

I’d never say “Lisp is a compiled language,” because with the wide variety of implementations, and a big mix of interpreters and compilers (sometimes in the same product), the answer is way more complicated than that.

I’d never say “Java is a compiled language,” because, as this discussion has made clear, the answer is fairly complex, and it doesn’t really fit into the traditional molds.

You can write code in any text editor (Notepad, vi, emacs, what have you). You will have to save the file with whatever filename extension the compiler you are using demands (check whatever documentation you have for it). In Windows, this is a simple matter of hitting “Save” in Notepad, changing the “Save As Type” line in the dialog box to “All Files”, and typing the proper extension in the “File Name” blank after whatever name you give the file.

The GUI you refer to is more properly termed an “Integrated Development Environment” (IDE). IDEs offer all sorts of bells and whistles that make it easier to create properly formatted code and to get at whatever resources the language makes available, so they are sometimes preferable depending on what you’re trying to do. But the idea that you can’t possibly code in a particular language without a particular IDE is just something the Marketing Department wants you to believe.

Given that I have seen, with my own eyes, a 3-D Tetris-like game produced with Microsoft Excel and VBA, I will refrain from suggesting that there is anything that can’t be done with anything.

At the base level of the computer, you have a series of higher or lower voltage electrical signals. They can be used as on/off signals for controlling various parts of the computer’s circuitry, or they can be interpreted as 1’s and 0’s. We can then think of various states of the computer circuitry as representing either the inputs and results of various functions in classic predicate logic as performed on Boolean algebraic entities, or as the inputs and results of mathematical functions performed on numbers represented in binary (base 2) form.

Once we have abstracted ourselves to this level, we can use the interpreted numbers in various coding schemes. We can interpret banks of the electrical signals as text characters, in something very akin to, but more intricate than, the old 1=A, 2=B scheme, or as coded instructions for various rudimentary activities the microprocessor circuitry is capable of carrying out (which amount to shifting, storing, and manipulating data in the form of voltage signals).

If one knows the numeric codes for the entire instruction set of the chip, and is algorithmically clever, one can organize the instructions in such a way as to carry out an astonishingly vast array of tasks, especially when one connects various peripheral devices to the circuitry.

If one is even smarter, then, rather than feverishly trying to remember or look up numbered codes for chip instructions, one creates a scheme for assigning some mnemonic codes to each of the instructions (such as ‘ADD A’ to mean “Add the number I’m about to give you to the value currently in processor storage register A”). Then, the last time one deals with plugging the numbers directly into the machine is when one creates (and presumably stores in some non-volatile place, handily accessible by the computer) a program called an Assembler that reads the characters (or rather, the numeric codes for each of the characters) in your mnemonic code, and generates the proper numbered instructions for you. Ahhhh!

If one’s name is John Backus, and one is therefore a genius, one realizes that, numbers or mnemonics, screwing around with a big pile of chip-level tasks to create more complex tasks, especially tasks that might need to be done a number of times, is the chump way of doing things. One creates FORTRAN, a set of mnemonic codes, each of which can stand for, not one, but a whole commonly-used sequence of processor instructions.

If one cottons to this notion, but one does not find the grouped instructions of FORTRAN to be quite what one is looking for, one creates one’s own codes to group together one’s own custom set of instruction sequences, and creates yet another in the vast panoply of higher level computer languages, each with its strengths and weaknesses for various sorts of applications.

Designing a computer language is the same as designing any written language. You figure out what sorts of structures you would like in the language, and work from there.

For example, if one were designing English from the ground up, one might decide that one wants a structure known as a simple declarative sentence. This beast shall consist of a subject, a predicate, and possibly a variety of prepositional phrases. One then defines what one means by a subject, a predicate, and a prepositional phrase, what order they can go in, and so on, on down to the words one can use and the letters that can make up those words. Eventually one would end up with more or less a system for determining what was a properly constructed and valid simple declarative sentence in English and what was not. The design of programming languages is not dissimilar.
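Language designers usually write such a specification down in a grammar notation like BNF (Backus-Naur Form, named for the same John Backus). A very simplified sketch of the sentence example might look like:

```
sentence     ::= subject predicate prep-phrase*
subject      ::= article? noun
predicate    ::= verb object?
prep-phrase  ::= preposition article? noun
```

Here `?` marks an optional part and `*` marks a part that can repeat; a grammar for a real programming language is built from the same kind of rules, just many more of them.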

One then wants something that understands one’s language. In the case of English, one has one or more nights of fun, followed by an 8-9 month wait, after which one has something which, given 18 or so years of nurturing and instruction, may or may not be able to understand standard written English. In the case of a computer language, one creates a program called a compiler, which, at its most abstract level, can be understood as a set of instructions written in language A for taking instructions written in language B and turning them into instructions written in language C.

Although some will object to this, one could see my earlier assembler example as being a set of instructions written in machine code for taking instructions written in assembly mnemonics and turning them into instructions in machine code.

In similar fashion, I could create a program in assembly for taking instructions written in C and turning them into machine code. Once I have that, I could then create a new compiler written in C that takes C and turns it into machine code. Or one that turns C into assembly, which is then turned into machine code. You then compile your C compiler written in C using the C compiler written in assembly. One can go on like this indefinitely. (Bjarne Stroustrup’s original C++ compiler took C++ code and compiled it into C code, which would then need to be given to a C compiler. If that C compiler created assembly code, you’d need to then run it through an assembler program to get machine code that would actually run on some chip.) But one can always create a compiler, in just about any language, that will take a language and compile it to machine code.

Of course, once you have machine code, it only works on chips that understand that code. Once your end-user buys a new computer with a new chip, and you want them to be able to run your program, you’ll need to write a compiler for the new chip. Pain in the ass.

Fortunately for you, there’s a company that makes operating systems that’s in the same boat as you every time a new chip comes out. So you put on something revealing, take a stiff drink and sidle up to them real slow and sultry-like, and say “Tell you what, Bill. No point in us both having to redesign. Let’s get hitched, I tell ya. You provide interfaces for the user to the resources of the system. Make a couple for me, called Application Program Interfaces (APIs), and instead of compiling for the chip, I’ll compile for them. You make sure you keep them the same for each chip, and you’ll always have my software to make your OS attractive.” Which is how it generally works today.

The term you’re looking for is “Turing Complete”. In computer science, a language is simply a collection of strings of symbols with a set of rules for determining which strings are in the language and which are not. But not every language is Turing Complete.

A computer is a machine analog of a mathematical concept known as a Universal Turing Machine. A Turing Machine is basically a device that manipulates strings of information, and a Universal Turing Machine is one that is sophisticated enough to imitate other Turing Machines, by taking a description of a machine and reacting the way that machine would react if given the same input that you give the simulation. A pocket calculator is a TM, and the Calculator accessory in Windows is simply the computer imitating a pocket calculator according to instructions on how to do so provided to it. You give it the same sort of input that you would to an actual pocket calculator, and it responds the way a pocket calculator would to that input. Every program you write is a description of some sort of Turing machine, even if such a machine has never existed in the real world.

A Turing-Complete (compiled) language is one that can create self-contained descriptions of Turing Machines for a UTM. Other languages (interpreted), such as Javascript, can only do so in some other context. HTML and XML are not even that far along, in that they describe document content and structure rather than machines.

Java is an interesting case. Java is a Turing-Complete language. You can create standalone programs in Java, which is all that’s required. However, Java compilers generally don’t compile to the machine code of your chip. They compile to bytecode, which is the native language of a computer called the Java Virtual Machine. No one has built a JVM out of hardware that I’m aware of, but that’s not really the point. The point was a way around the problem of running the same program on more than one chip without recompiling. What you do is, you give your computer a description of a Java Virtual Machine, and then give it the bytecode of your compiled program. Then your computer is supposed to react to the bytecode the same way an actual, physical Java Machine would. Microsoft’s .NET platform is the same idea.

XML and HTML are odd cases, though. Traditionally, they’re not programming languages, but as I stated above, I’ve written an XML-based language, and although it wasn’t Turing complete, it wouldn’t have been all that hard to make it so.

I guess you could write an interpreter for HTML that makes it Turing complete, but that’s really more of a curiosity than anything else.