Every now and then I read a story about this or that programmer who supposedly wrote this or that piece of software “directly in machine code”. As I understand those stories, “machine code” here refers not to the actual machine instructions, which are strings of ones and zeroes (even though they can be represented in decimal or hexadecimal), but rather to assembly language, which mirrors the structure of machine code but replaces the numeric instructions with easier-to-memorise mnemonics borrowed from natural language.
AIUI, programming in assembly does still take place, though less than it used to. But is there actual programming done in machine code, understood as the lowest-level ones and zeroes that flow through the processor’s circuits? My best guess is the very early assemblers themselves; somebody had to write those, though I suppose that once the first assemblers became available, programming in machine code became obsolete, and even new assemblers would be written in assembly and then turned into machine code using already existing assemblers. Maybe the front-panel-controlled computers of yore were programmed in machine code - though AIUI, the switches on the panel were used to tell the computer to load a program from some medium into memory, not to manually enter the ones and zeroes that would make up the program?
I do know a couple of people who wrote straight to the CPU in hexadecimal. A friend of mine I’m still in touch with is one: I recall he would write mostly in C, with a hand-written, interrupt-driven machine-language segment wherever he absolutely had to wring the most performance out of his code. This was on a 6809E microprocessor.
I imagine some of the authors for the “Compute!” magazine of yore also wrote serious machine language. A number of the programs in that magazine took the form of a long series of DATA statements with a BASIC loader that POKEd the code directly into RAM and then executed it. I expect a good proportion of these were written in assembly and then “PEEKed” out, but some were probably written directly in machine language. I’ve got nothing but a gut feeling on that one, though.
When I was a teenager (in the early 80s), I had a Timex Sinclair 1000 and dabbled with programming it in machine code, though I guess none of what I did amounted to “serious programming.” I think it’s fair to call what I did machine code rather than assembly language, since I had no assembler program and had to come up with the machine code to put into the computer myself (in hexadecimal; but it’s trivial to translate between hexadecimal and “ones and zeroes”).
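To show just how trivial that translation is: each hex digit is shorthand for exactly four bits. A quick Python sketch, using a Z80 opcode byte as the example since the Timex Sinclair 1000 was Z80-based:

    op = 0x78                          # a Z80 opcode byte
    print(f"{op:08b}")                 # -> 01111000
    print(hex(int("01111000", 2)))     # -> 0x78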
I don’t know if anyone regularly uses it these days (I assume C is the most common), but I learned it back in high school. In a world of BASIC, assembly gave you a whole lot more power and a better working knowledge of what the computer’s doing internally WRT registers and interrupt/ESC codes. It also forced us to learn hex and binary.
IIRC, my final project for my AP Comp Sci class was doing a screen dump to a color printer in assembly (essentially, writing a print driver). What a fucking pain in the ass that was.
It seems to me that these days, you’d only be using assembly if you needed something more powerful or you needed to drastically cut down on the size of the file.
Your understanding doesn’t reflect the common use of the term “machine code” in the computer and programming industries. From the earliest practical commercial computers (as opposed, say, to “toy” computers like the earliest Altairs), “machine language” was typically synonymous with “assembly language”, and it marked the important distinction from the first high-level languages like FORTRAN and COBOL.
The significant thing about assembly language is that it fundamentally bears a one-to-one correspondence with the underlying machine instructions, simply replacing instruction codes and their address fields and modifiers with mnemonics, symbolic location tags, and a defined assembler syntax. So the programmer was literally writing in “the language of the machine”, albeit using mnemonics and symbolic addresses. Whereas high-level languages are oriented to the requirements of the problem space and generally have no obvious relationship to the underlying machine code.
Real-world assembly languages were actually a bit more complex than my simple summary, in that they usually provided features like macros (pseudo-instructions that translated into a whole block of machine instructions) and symbolic operating system calls that requested the OS to perform all sorts of utility functions, system library subroutines, and so forth, but fundamentally the one-to-one mapping to machine code is the essence of assembler programming. And yes, in earlier times, it was often the only way to write complex programs, especially if you wanted them to be efficient. Early operating systems and major utility programs were invariably all written in assembler.
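To make the macro idea concrete, here is a toy sketch in Python of what a macro facility does conceptually - the macro name and the PSH mnemonics are invented for illustration, not taken from any real assembler:

    # Toy macro expansion: one pseudo-instruction standing in for a block
    # of real instructions. "SAVEREGS" and the PSH mnemonics are made up.
    MACROS = {
        "SAVEREGS": ["PSH A", "PSH B", "PSH X"],
    }

    def expand_macros(source_lines):
        """Replace each macro invocation with the instructions it stands for."""
        out = []
        for line in source_lines:
            out.extend(MACROS.get(line.strip(), [line]))
        return out

    print(expand_macros(["SAVEREGS", "LDA #$01", "STA $0200"]))
    # -> ['PSH A', 'PSH B', 'PSH X', 'LDA #$01', 'STA $0200']

Every line in the expanded output then assembles one-for-one into a machine instruction, which is the essential property described above.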
And just in case it wasn’t clear, I explicitly mean machine language, not assembly. There were only, what, ten instructions or so on the old 6809E (maybe 13?), so once you had them committed to memory, the assembler was largely just getting in the way anyway.
As a programmer, I rarely hear people talk about either machine code or assembly these days, but in CS school they definitely made a distinction between machine code and assembly. That may have changed in the 20+ years since I was in school, but if anything I’d think they’d be talking about machine code even less these days.
I would do it occasionally where I had limited ability to assemble anything but could poke a few bytes into memory. Most of the time I would start by assembling the code and then just copy the binary. Many instruction sets pack multiple options into one op code: the basic operation is determined by the first few bits, and the remaining bits in the word specify registers or other parameters. Putting those together by hand wasn’t worth it when there was any alternative.
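As a concrete example of that bit-splitting (from memory, so worth checking against a real manual): the 8080/Z80 register-to-register moves are a single byte of the form 01 ddd sss, where ddd and sss are three-bit register numbers. A rough Python sketch:

    # 8080/Z80 register-to-register moves: one byte, 01 ddd sss,
    # where ddd = destination register, sss = source register.
    REG = {"B": 0b000, "C": 0b001, "D": 0b010, "E": 0b011,
           "H": 0b100, "L": 0b101, "A": 0b111}

    def mov(dst, src):
        """Pack the register fields into the opcode byte by hand."""
        return 0b01_000_000 | (REG[dst] << 3) | REG[src]

    print(hex(mov("A", "B")))   # -> 0x78 (MOV A,B / LD A,B)
    print(hex(mov("B", "A")))   # -> 0x47 (MOV B,A / LD B,A)

Doing that shift-and-OR in your head for every single instruction is exactly the busywork an assembler takes off your hands.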
Way more instructions than that. The name of the instruction isn’t enough either; as I mention above, there are many variants of the same basic instruction that have to be coded.
It depends on the context of the conversation. In CS it may often be necessary to make that distinction for teaching purposes – how else are you going to describe the output of an assembler? But when someone said “I wrote this in machine language”, it almost invariably meant that they wrote it in assembler as opposed to a high-level language. Except for small patches or some other very limited special purpose, raw machine code in the sense of octal or hex numbers was not something a programmer typically ever wrote anything in.
That said, the assembly listing – the assembler’s printout of your program – showed the resultant machine code alongside the symbolic instructions that generated it. This was important for debugging purposes, especially with hands-on computers where you were interpreting console lights and manipulating switches.
Well, there’s always Mel Kaye, who wrote machine code programs in hexadecimal back in the 50s/60s. It sounds to me like all the programmers at Royal McBee Computer Corp. did; he was just especially good at it.
I have a friend who wrote some programs in hex back in the late 70s/early 80s or so (when we were in high school, but I don’t think they were “serious” programs) and once got a job by knowing the “most interesting command in Assembly”. At least I think it was Assembly.
So yes, there was, but probably not for a really long time now?
The first assembler has to be written in machine code, although I imagine nowadays it would be cross-assembled, under simulation, on another machine. Back around 1956, I knew some people who were writing machine code for the Univac 1. However, the machine directly understood some letters, and addresses were actual memory locations. So if you wrote A 546, the decimal code for the letter A (the machine used binary-coded decimal internally) was the add instruction, and your instruction was to add the contents of memory location 546 (there were only 1000 of them) to the ADD register. Each word held two instructions and was actually 72 bits long. The program my acquaintances wrote turned out to have 80,000 such instructions, about half of which were used to read the appropriate instructions off the mag tape into memory at the right time. The memory was shared between instructions and data, a very modern feature for the time.
Sidebar: I’m trying to follow the discussion but would really appreciate it if you folks would explain the difference between these two terms, for someone who took only one CS class, a long time ago, using punchcards.
Machine code is just numbers - in hexadecimal, maybe something like ‘FD30’. Assembly code uses text-based mnemonics, like ‘MOV A,B’ to copy the contents of the B register into the A register (destination first). Those instructions might mean exactly the same thing, but obviously the text-based instructions are easier to work with. You might still use some literal numbers in assembler, but the assembler can also represent them with names, much like variable names in high-level programming languages.
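Put another way, the simplest possible “assembler” is nothing but a crib sheet mapping mnemonics to numbers. A rough Python sketch (the opcode values are 8080/Z80 ones from memory, so treat them as illustrative):

    # A hand-assembler's "crib sheet": mnemonic -> machine code byte(s).
    CRIB = {
        "MOV A,B": [0x78],
        "MOV B,A": [0x47],
        "NOP":     [0x00],
        "RET":     [0xC9],
    }

    source = ["MOV A,B", "NOP", "RET"]
    machine_code = bytes(b for line in source for b in CRIB[line])
    print(machine_code.hex().upper())   # -> 7800C9

The hex string at the end is the machine code; the mnemonics above it are the assembly. Same program, two notations.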
In high school ('99-'03) I owned a TI-86 graphing calculator. It had programming capabilities, as was (and still is) the norm for such devices. When I searched around online for programs I could enter into the calculator, I read that it was possible to write programs for the device in assembly code, but I never really delved into that. The purported benefits were that such programs ran more quickly and had access to more of the calculator’s abilities than the BASIC programs you could run.
I think it is questionable whether any significant work was done directly in machine code unless there was a very simple instruction set, or unless every possible opcode variant had been written out in a list next to a text mnemonic. In that case the coding was being done in hex only as the last phase of creating the program.
I do recall a bootstrap I had to manually code in 8080 instructions. Maybe 24 bytes or so. A real pain in the butt even for something small like that.
You could probably memorize the 6502 instruction set pretty easily. It’s only 151 valid opcodes, and they’re laid out in a fairly logical way. I know I had the decimal equivalents of the most-used opcodes memorized as a kid. (Decimal because I never got a proper assembler for my Commodore, so for me to experiment in machine language, I had to POKE all the data into the correct memory locations.) Do enough mnemonic → decimal conversion to POKE in your code, and you learn pretty quickly that LDA# is 169 (0xA9) in the instruction set. Working in decimal is stupid, though. I never bothered with coding up a hex-to-dec routine, which would have been easy enough, but as an 11 or 12 year-old with no one to guide me, I wasn’t quite sharp enough to realize how much sense that would make. Anyhow, I did not embark on a computer science career, so all is good.
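Just to illustrate the chore, here are a few common 6502 opcodes and their decimal POKE values, sketched in Python (LDA immediate really is $A9/169; the other opcodes are from memory, so take them as illustrative):

    # A few 6502 opcodes and the decimal values you'd POKE for them.
    opcodes = {"LDA #": 0xA9, "STA abs": 0x8D, "JMP abs": 0x4C, "RTS": 0x60}

    for mnemonic, op in opcodes.items():
        print(f"{mnemonic:8s} ${op:02X} = {op}")
    # LDA #    $A9 = 169
    # STA abs  $8D = 141
    # JMP abs  $4C = 76
    # RTS      $60 = 96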
The Altair 8800 is programmed directly in machine code via the front panel - literally, you choose a memory address by selecting it in binary using the switches, then you toggle the switches to represent the binary value you want to put into that address, then toggle a switch to store it, and move to the next address and so on.
Of course, the first things that people did were to use that process to write bootstrap loaders for more complex programs to be loaded from paper tape, disk, or whatever, but if you just have the bare machine, you’re programming it in machine code.
The byte values you are entering in this operation have a direct, one-to-one relationship with assembly language instructions, as well as with numeric operands. I’m not sure anyone would write the code without first thinking of it as a series of assembly language instructions, because those instructions are the operations the processor carries out in response to those specific byte values.
In that sense, machine code is an intermediate way of entering and storing assembly language instructions for the processor to execute as operations. Assembly is like the spoken language of the processor - machine code is like the written form of that language.
I think it’s one of those distinction without a difference things.
I mean, let’s say I wanted to copy a value from register 1 to register 2, and the machine instruction to do this is the value 01001110 - or, say, CPW in assembler.
I’m going to probably need to have the reference manual or a crib sheet next to me that tells me CPW = 01001110. So what’s the important qualitative difference between me typing in that binary value versus typing “CPW”, apart from the former including more busywork?
There is a critical difference between assembly code and machine code. An assembler presents addresses as abstract entities and, as part of its operation, will back-patch target addresses. So you don’t code the targets of jump instructions as hard values but as simple symbolic labels. Even then you are not down to the final, as-executed machine code: that leaves out the linker/loader step. A program is usually not a monolithic piece of standalone code. It uses other libraries and makes operating system calls. Your code must be linked with other code it uses, and loaded into memory. Loading into memory alongside other code means even the assembler can’t know the absolute addresses needed, and these are filled in at load time.
A modern assembler looks more like a programming language than banging the bits about. (By modern I mean anything from the mid-’70s onwards.)
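To illustrate the back-patching point, here is a toy two-pass “assembler” in Python; the three-instruction machine it targets is entirely made up, but the shape - collect label addresses on the first pass, patch jump targets on the second - is the essence of what a real assembler does for you:

    # Toy two-pass "assembler": the instruction set (NOP=0x00, DEC=0x01,
    # JNZ=0x02 followed by a 1-byte address) is invented for illustration.
    def assemble(lines):
        labels, size = {}, 0
        # Pass 1: note the address of every label, so forward references resolve.
        for line in lines:
            if line.endswith(":"):
                labels[line[:-1]] = size
            else:
                size += 2 if line.startswith("JNZ") else 1
        # Pass 2: emit bytes, back-patching symbolic jump targets.
        code = []
        for line in lines:
            if line.endswith(":"):
                continue
            elif line == "NOP":
                code.append(0x00)
            elif line == "DEC":
                code.append(0x01)
            elif line.startswith("JNZ "):
                code += [0x02, labels[line.split()[1]]]
        return bytes(code)

    print(assemble(["JNZ done", "DEC", "NOP", "done:", "NOP"]).hex())
    # -> '0204010000' - the 0x04 jump target was filled in from the label table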
I have written code that dynamically generates machine code. Huge fun it was. But the nearest I have got to actually hand-crafting machine code was patching a single instruction or two.
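For the curious, dynamically generating machine code can look something like this today - a minimal Python sketch for x86-64 Linux that hand-assembles “mov eax, 42; ret” and calls it (some hardened systems refuse writable-and-executable pages, so treat it as illustrative, not portable):

    import ctypes, mmap

    # Hand-assembled x86-64 machine code: mov eax, 42 ; ret
    code = bytes([0xB8, 0x2A, 0x00, 0x00, 0x00,   # B8 imm32 -> mov eax, 42
                  0xC3])                          # C3       -> ret

    # Grab a page we can both write to and execute from.
    buf = mmap.mmap(-1, mmap.PAGESIZE,
                    prot=mmap.PROT_READ | mmap.PROT_WRITE | mmap.PROT_EXEC)
    buf.write(code)

    # Treat the start of that page as a C function returning int, and call it.
    fn = ctypes.CFUNCTYPE(ctypes.c_int)(
        ctypes.addressof(ctypes.c_char.from_buffer(buf)))
    print(fn())   # -> 42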
Back in the day, I knew people who were so familiar with the CDC 6600/Cyber series machines that they could and did write short snippets of machine code directly into memory without any reference material.