My first computers were the IBM 1401 and 1410. We started the computer by feeding in a two-card bootstrap program that started the hard-coded core program, which then read more cards containing the program we wanted to run. The 1410 had a better system, a hard disk that held many programs. But programs still had to run in 20K of memory.
I’d always argue, particularly from personal experience, that understanding the layers under yours lets you write code faster than people who don’t, and that what you write is more maintainable and runs faster than what they write.
The idea that you can ignore the sublayers is something of a myth, created by the grayhairs who do have that understanding. They intended it to be true when they created those layers, but they hurt the young’uns by teaching it as if they had succeeded 100%. Certainly it’s faster to develop code in modern languages than in assembler, but like I said, if the end goal is a product that works correctly and bug-free, you’re better off having the base understanding, or you’re going to spend more time in development and waaaay more time in debugging and re-coding.
I taught Computer Architecture out of a 1979 edition, and that one was pretty good, so I agree.
I think this strongly depends on the type of code you’re writing. I haven’t needed to look at Sparc architecture manuals in ages, and our environment is such that you don’t even know what particular processor or computer you’re on. Back 15 years ago, when we were writing some very compute-intensive EDA software, knowing something about the cache helped a little, but designing good algorithms and data structures was several orders of magnitude more important. Speed increases have also made optimization less important. I wrote something to process a few million lines of data. I wrote it fast, in Perl, to test it out, and was a bit surprised that it ran in a few seconds, plenty fast enough without porting to C++. Then we decided not to do it at all, so I really improved my efficiency.
If you’re writing embedded code, then it’s a different story. But in many cases your interface to the real hardware is so distant that it is pointless to worry about it. Plus there is virtualization.
Yes, but you understand the general principles of what’s beneath you. Take a 22-year-old straight out of school, and he’s just as likely to write that same Perl script in such a way that his application takes four days to process the data. I rewrote a C++ program in C++ and gained a 2000X speed improvement because I understand the basics of what’s going on, where the person who originated it didn’t. I could have rewritten it in Perl and it still would have gone faster than the implementation I was replacing, because the person who wrote it had no idea of the relative costs of things like database accesses, launching subprocesses, detecting character encodings, different methods of text parsing, constantly freeing and re-requesting memory, proper thread use and locking methods, etc.
2000X speed increase, I kid you not.
Perhaps the best answer to the question, “How do computers work?” is, “Remarkably well, all things considered.”
Oops, that’s right. To make up for my inattention to detail, I did mention, “Compilers and assemblers do all the work of generating the machine language for you.”
It gets more complicated still in Very Long Instruction Word (VLIW) CPUs, such as the Intel Itanium (aka the Itanic, aka the biggest mistake since the iAPX 432), where multiple operations are encoded into the same instruction and all of them are executed at the same time. Because of this, the human* has to figure out which operations are capable of running at the same time without stepping on each other’s toes. The alternative is a superscalar architecture, where each instruction has one operation and the hardware figures out which of them can run at the same time.
*(Either the human writing in assembly language or the human writing the compiler to generate the assembly language.)
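To make the distinction concrete, here’s a rough sketch in the same Intel-syntax (NASM-style) x86 used later in the thread; the labels, registers, and values are all made up, and the comments spell out what superscalar hardware figures out on its own versus what a VLIW machine pushes onto the human or the compiler.

        section .data
mem1:   dd 7                    ; made-up data just so the loads have targets
mem2:   dd 35

        section .text
        global scheduling_demo
scheduling_demo:
        mov eax, [mem1]         ; independent of the next load...
        mov ecx, [mem2]         ; ...so a superscalar core can issue both in the
                                ; same cycle, detecting that fact in hardware
        add eax, ecx            ; needs both loads, so it has to wait for them
        add eax, 5              ; needs the previous add, so it can't pair with it
        ret

On a VLIW machine none of that detection happens at run time: the two independent loads have to be packed into one wide instruction word up front, and the dependent adds put into later words, which is exactly the scheduling burden the human (or the compiler) takes on.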
This is emphasized when you have computers like the Three Rivers PERQ with loadable microcode: You could write your own opcodes in a microcode assembly language, translate that into a loadable form, load it up, and run programs using your special opcodes.
On the other end of the spectrum, some instruction sets defined as part of a virtual machine (a computer that exists only in software) have been implemented in hardware. picoJava implements the Java Virtual Machine in hardware, for example.
I believe you, all right. As part of one of our programs we flattened a hierarchical netlist, which was taking an ungodly long time to do. Once we looked at the algorithm used, we found out why. I don’t think we got a 2K improvement, but we got a couple of hundred times improvement.
I was thinking of lower level resources. A lot of the stuff you mention is more or less in the Perl environment, or the OS environment. Back in the good old days people thought they needed to code critical blocks in assembler. But indeed there are infinite ways of screwing up code.
Honestly, I got most of this stuff (though not to the level of expertise shown in this thread) when I took Assembler in college. It was quite the revelation - it just made computers make so much sense. (It didn’t hurt that I loved programming in assembler so much!)
So for people who are truly interested in the topic and are somewhat technically inclined, I recommend finding a class.
I worked on Merced, the first version of Itanic, for a year and a quarter, and you didn’t need to be Nostradamus to see where it was going. I got the hell out of there before I got too far along for my sabbatical, but people were leaving before their sabbaticals.
VLIW is a direct descendant of horizontal microprogramming. When I was in grad school, before I did my real dissertation research, we worked on microcode compaction for basic blocks, which was a big issue in the ’70s. We finished it off in two papers, one for IEEE Trans. Comput. and one for Computing Surveys. At the same time Josh Fisher, first at Courant and then at Yale, was working on compaction across basic blocks. He moved from this to VLIW, which he talked about at a very early stage at the Microprogramming Workshop in Cape Cod.
RISC is also descended from microcode, vertical microcode in this case. Dave Patterson also did his dissertation on microcode, on verification. The real inventors were at IBM, which had more and better microprogramming tools than anyone. I only know what they published, but my advisor consulted for them, so I heard some news.
picoJava was basically open sourced long before open sourcing became popular. I tried to get some people to use it for test cases, with little success.
I’m surprised that people are still playing with writable control stores (WCS). Burroughs, with the D-machine, was the biggest commercial player in this kind of market. My dissertation was on a high level microprogramming language to help port across platforms and keep efficiency.
Today, since most processors which could be customized are implemented in Systems on a Chip, there is a big market in customizing processor hardware. Tensilica is a big player in this field. I just reviewed a book about this. Seeing our dreams of user microprogramming crash and burn in the mid-80s, I have my doubts. But there were lots and lots of papers on the subject.
This is all way too complicated for the OP, of course!
Assemblers did a bit more than this. The biggest thing was allowing labels and making the code relocatable, which pure machine language does not allow. I wrote my own assembler because I was tired of writing spaghetti code just to add a few new instructions without going back and changing all the jump locations. Not fun, I assure you, even on a machine with 4K of memory.
The second thing was macros, which are blocks of code placed in-line, not jumped to like subroutines. You could disassemble into labels, but it would be harder to regenerate macros.
The third big thing assemblers do is allow address calculations to be written in an algebraic form, which requires at least a minimal parser to understand the infix expressions involved. This is dependent on the machine code supporting relative addressing.
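Here’s a minimal sketch of those three features in NASM-style x86 (all the names and values are invented): labels, a macro that gets pasted in-line, and an address expression the assembler folds to a constant.

%macro  STORE_RESULT 0          ; macro: its body is dropped in at each use,
        mov [result], eax       ; not jumped to like a subroutine
%endmacro

        section .data
table:  dd 11, 22, 33, 44       ; 'table' and 'result' are labels; the assembler
result: dd 0                    ; assigns the addresses, so inserting code or data
                                ; never means re-doing jump locations by hand
        section .text
        global demo
demo:
        mov eax, [table + 4*2]  ; algebraic address expression, evaluated at
                                ; assembly time to table-plus-8
        STORE_RESULT            ; expands to the mov inside the macro
.loop:                          ; jump targets are labels, not hand-kept numbers
        dec eax
        jnz .loop
        ret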
The fourth big thing assemblers do is only relevant on relatively advanced (modern, large) systems where all programs run under an OS: the assembler constructs the whole file in which the machine-code program and its data will reside. The ‘extra’ parts that don’t correspond to machine code or program data are used by loaders, to get the machine code into RAM the right way, and by linkers, to figure out what other code the program needs (modern programs are spread out all over the filesystem in the form of libraries that can be reused by other programs on the system). In Linux, the ELF binary format is complex enough to let 32-bit programs run seamlessly alongside 64-bit programs on x86-64 computers (which can execute 32-bit x86 opcodes as well).
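As a sketch of that symbol bookkeeping (x86-64 Linux and NASM-style syntax assumed, names invented): ‘extern’ marks code that lives in some other library, ‘global’ exports a symbol, and the ‘extra’ parts of the resulting ELF file are largely the symbol and relocation tables recording how to patch it all together.

        default rel
        extern  printf          ; lives in libc; the linker/loader fill in the address
        global  main            ; exported so the C runtime can find our entry point

        section .rodata
msg:    db "hello from the linker", 10, 0

        section .text
main:
        sub     rsp, 8          ; keep the stack 16-byte aligned per the SysV ABI
        lea     rdi, [msg]      ; first argument register
        xor     eax, eax        ; variadic call: no vector registers in use
        call    printf          ; destination resolved at link/load time, not here
        add     rsp, 8
        xor     eax, eax        ; return 0
        ret

Assemble with something like nasm -f elf64 and link with gcc, and roughly everything in the output file beyond the raw opcodes and data exists for the benefit of the linker and loader.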
In smaller systems, the assembler just generates a simple dump of the code and data, which is burned into ROM (as in an embedded system) or written to disk for loading into RAM later (as in MS-DOS).
(Not to pick on you, Voyager, but your posts are very easy to build on. :))
Really? I’ve seen calculations done based on index registers, but the additions have to be resolved at assembly time, and the relative addressing has to be directly supported by the instruction’s addressing mode. I admit I haven’t played with an assembler since I taught PDP-11 and Cyber assembly language 30 years ago. The biggest distinction between an assembler and a compiler is that an assembler never adds to, subtracts from, or reorders your code.
I’ve never been much of an expert on linkers and loaders, since I mostly used assembler on really simple machines, like the PDP-1.
I know this is only the surface, but I’m actually surprised I could keep up with everything in this thread. I didn’t even have to Wikipedia anything!
Apparently I mostly had it, but it required a bit of clarification on some key points like the program counter and the stack. My Comp Sci course referred to them but never actually DEFINED them in a useful way. (I dropped that course because I ended up having to pretty much teach my peers, since my teacher was juggling 6 classes in the same room at once, so I had no time to do my OWN labs >.< It’s okay, I knew Java anyway.) This helps me understand a little better why I found some methods faster than others, and the information actually gives me ideas on how to program better just by understanding a few of these (somewhat) simple concepts.
I’ll look at those books a bit and see what I can learn. I like the layers: every time I’ve learned about a different layer, whether a lower-level language or the machine itself, I’ve come away a little more confident at optimizing and coding. I know I’ve barely scratched the surface of how this really works, but the information here is tremendous (and I’m slowly but surely reading and digesting your linked post, Sage Rat, thanks :)), so thank you. (Yes, it took me three paragraphs to say thank you, deal with it.)
They don’t teach this? sigh
Well, I guess I’ve at least found something useful to teach myself that’s NOT on the course list.
I’m self-taught, so I don’t know what is taught if you get a CS degree in college (I studied literature). I suspect that they theoretically go through the lower levels of everything, but do it the way they teach Spanish to high school students in the US: you don’t need to learn or comprehend anything, you just need to be able to pass a crib-sheetable test. And then they immediately jump to Java and tell everyone that all the low-level stuff they taught is irrelevant thanks to modern abstractions.
Right. That’s what I was talking about. I mean things like this, for the IA-32 ISA:
mov eax, [ebx-1234]
That is translated directly into machine code that adds -1234 to the contents of ebx and uses the result of that ALU operation as an address to load a word into eax. The lea opcode (Load Effective Address) is a good way to do three-operand adds and subtracts since the address can be computed from two registers, not just a register and a constant.
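A few made-up examples of lea used that way; the ‘address’ is never dereferenced, just computed and kept:

        lea eax, [ebx + ecx]            ; eax = ebx + ecx, two registers, flags untouched
        lea eax, [ebx + ecx*4 + 12]     ; eax = ebx + 4*ecx + 12 in one instruction
        lea eax, [ebx - 1234]           ; same addressing form as the mov above, but it
                                        ; keeps the computed address instead of loading from it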
The PDP-11 had a form of this: Displacement and displacement deferred, both of them only using a constant and a register as opposed to allowing a pair of registers to be added. I don’t know about the Cyber.
Hm. This is almost always true, but if spim (a MIPS simulator that includes its own assembler, debugging system, and simple OS in ‘ROM’) is to be believed, some assemblers will expand nonexistent opcodes (pseudo-ops) into strings of real opcodes. (Further research reveals this is a known property of the canonical MIPS assemblers.) So ‘no expansion’ is wrong, at least in one case, but it’s usually correct. Assemblers in general do very little to massage what they’re fed.
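Since that one is MIPS-specific, here’s a sketch in MIPS assembly rather than x86 (the registers, constant, and label are just examples): the programmer writes the pseudo-op and the assembler quietly emits the real instructions shown in the comments.

        li   $t0, 0x12345678    # becomes: lui $at, 0x1234
                                #          ori $t0, $at, 0x5678
        move $t0, $t1           # becomes: addu $t0, $zero, $t1
        blt  $t0, $t2, target   # becomes: slt $at, $t0, $t2
                                #          bne $at, $zero, target
target:
        nop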
There’s a book on the subject, appropriately titled Linkers and Loaders, by John Levine. The ACCU (Association of C and C++ Users) thinks highly of it. (It’s sad the subject isn’t better-known. Some of the crockishness of C++ is due to the horrifically dumb linkers and loaders available on standard systems. Name mangling comes to mind.)
I used to be in the “why do I need to care about the hardware layer” camp until I got to the Computer Architecture and Operating Systems modules in my Computer Science course. Then programming began to make more sense. I won’t program in assembly for any real projects, but at least I appreciate the difference between a struct and a class in C#, for example, just by knowing that one lives on the stack and one on the heap.