You’re rather hilariously missing the point of what “one-to-one correspondence” means. This image of a trivial PDP-11 assembler code listing demonstrates the point:
You can see the one-to-one correspondence between the instructions and the series of memory locations and their contents on the left. Where there are two number sets instead of one adjacent to a memory location, it’s because some PDP-11 instructions are single-word (two bytes) and some are double-word (four bytes).
The answer to your question is that the assembler replaces “loop” with a specific memory address. That number will of course change if you edit the code by adding or deleting instructions, but that has nothing to do with the one-to-one correspondence between the instruction example you cited and the corresponding machine instruction + address. That address may be further mapped to an absolute address at run time by the machine’s memory management hardware. But that has nothing to do with the one-to-one correspondence that I, and others, have referred to. It’s absolutely fundamental to what assembly language is.
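To make that concrete, here’s a hand-worked sketch in the same spirit (octal, assuming the code is assembled starting at address 001000 - an illustration, not the listing from the image):

001000  012700 000010   start:  mov #10, r0   ; double-word: opcode + immediate operand
001004  005300          loop:   dec r0        ; single-word
001006  001376                  bne loop      ; single-word; “loop” resolves to 001004, encoded as a PC-relative offset of -2 words

Add or delete an instruction above “loop” and the assembler simply recomputes that address; the correspondence between each mnemonic and its machine word(s) never changes.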
Along with the story of Mel, I used to read the stories about Seymour Cray, the father of supercomputing. Before setting up on his own to build Cray Supercomputers, he designed computers (as in CPU and architecture) for Control Data Corporation (CDC).
Some of these stories involved Cray toggling the bootloader binary into the front panel of a new CDC computer, and pressing run to start the system and load the OS from tape, without the benefit of notes. Or reciting the 4kB bootloader for an obsolete system from memory over the phone because the documentation was lost. He was a Real Programmer™.
I have no doubt that Cray actually wrote those bootloaders. Whether he did so as machine code or using an assembler, I don’t know. But he wasn’t toggling in code he was writing on the fly - he was entering a data stream that happened to be a functioning bootloader that he had written. It was a feat of memory and data entry, but the actual programming had occurred much earlier, at some point between defining the architecture and instruction set, and having physical hardware to execute it on.
I’ll also comment that for an Electronics course at university in the late 80s, we had to program a ROM to implement a complex boolean expression. This involved reducing the expression via a Karnaugh map, mapping the inputs and outputs, and then, using switches, programming an EEPROM address by address to produce the desired result - exactly the same way a bootloader would be toggled into memory to load a computer. I have no doubt that the same exercise today would be implemented in a microcontroller or FPGA programmed over USB, but where is the soul in that …
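The trick, for anyone who hasn’t done it: the inputs become the ROM’s address lines and the output becomes the data. A made-up example (not the expression from the course):

; f(a, b, c) = a AND (b OR c), with address bits a b c (a = MSB)
rom:    .byte 0, 0, 0, 0    ; addresses 000-011: a = 0, so f = 0
        .byte 0, 1, 1, 1    ; addresses 100-111: a = 1, so f = b OR c

You set an address on the switches, burn the data bit, and repeat for all eight locations.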
Believe it or not … people actually programmed the video game Doom on one of those, and from what I’ve seen of it, it’s pretty good for what it is … in fact there’s a list of games you can write on them … brings me back to when I was 10 years old and typing in games from books on a TI-99/4A
Like you said, your experience was “largely […], where one rarely dealt directly with octal or hex machine code”.
My experience is with Z80, 8086, PIC and AVR microcontrollers, where there is not exactly a one-to-one correspondence between mnemonics and machine code. For example, how is NOP implemented on your machines? Is there a NOP instruction? Frequently there is not, and NOP is implemented as something like “add immediate zero”: so that’s two different assembly instructions for the same machine code. Or “INC” (increment) may assemble to two different machine codes, depending on where the target is: two different codes for the same instruction.
Speaking as a person who actually codes in assembler and debugs in machine code, many microcontrollers do not have a one-to-one correspondence between machine code and assembler, even at the most basic level.
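To illustrate on the 8086 (byte values per the Intel manuals):

inc ax        ; 40     - short one-byte form
inc ax        ; FF C0  - long ModRM form; different code, same instruction
nop           ; 90
xchg ax, ax   ; 90     - and here, one code with two different mnemonics

An assembler is free to emit either encoding of INC, and a disassembler has to pick one mnemonic for 90.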
It’s been over 50 years, but I think I did this using my simple assembler. Mostly if you code in assembler you don’t need to know the opcodes, but when I wrote assembler for my PDP-11 I had my little card which told me if I ever needed to know. Plus, my LGP-21 was designed so that the mnemonic for an operation contained its opcode - so B for Bring has its last four bits as the opcode for the Bring instruction. So even when programming in machine language you didn’t need to know the opcode. Only if you were hacking it.
But for most purposes Assembler and machine language are nearly identical, since you don’t really care where your code goes. The important thing is that there is a 1-1 mapping between assembler instructions and machine language instructions.
And no one has even brought up macros so far.
It sounds like, for your machines, NOP is a simple macro. Macros look like instructions but expand to several real ones. But I think we should be talking straight assembly here.
wolfpup’s code looks good to me. If you just wrote what you have there, any decent assembler wouldn’t barf. Even simple ones need two passes: one to define symbols and a second to write the object code. Otherwise you can’t do forward references.
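A forward reference is a case like this (a made-up fragment):

        cmp ax, 0
        je  done      ; “done” isn’t defined yet - pass 1 records the symbol, pass 2 fills in its address
        dec ax
done:   ret

A strictly one-pass assembler has no address to put in the JE when it first sees it.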
The amazing C. H. Ting is (or was?) a big fan of FORTH and, in the ’80s, did a stage performance of Bach on an IBM PC by wiring resistors of various values to the pins of LPT1 and feeding the common tie point through an audio amplifier - essentially a crude resistor DAC. His program was in FORTH.
He wrote (and maybe published) various books on various FORTHs. One of them, I think maybe eFORTH, included some pages of code for the Microsoft assembler, MASM, followed by more pages of code in FORTH. This let the reader create their own FORTH system without any other source.
I was very fond of FORTH and wrote an industrial control system that operated for many years using FORTH with the most critical sections written in assembly language, which many FORTHs let you do in-line. That is, in one FORTH code text file, you can drop down into a sort of assembly language mode, and then come back up into the FORTH context. A lot of what my system did involved data acquisition and hardware control, using bus cards in the PC. The user accesses IO or memory space, peeking and poking to get things to happen.
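The critical sections compiled to the sort of thing below (a hypothetical 8086-era fragment with made-up port addresses, not my actual code):

wait:   mov dx, 300h   ; status register of the bus card
        in  al, dx
        test al, 1     ; bit 0 = “sample ready”
        jz  wait
        mov dx, 301h   ; data register
        in  al, dx     ; grab the sample

That IN/OUT level of access is exactly the peeking and poking FORTH made so easy.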
When Windows started to become common, I think a lot of this kind of thing just died off. There was the whole attitude in FORTH that it’s your computer and you can instruct it to do whatever you feel like trying, so very little safety built into the system. If you want to pack the return stack with random integers, have fun! With Windows, not so much.
Not sure if this goes beyond the premise of the OP, but I also worked on a digital computer that toggled back and forth between two different industrial machine states, spending an adjustable amount of time in each of the two states, as defined by a clock signal generated by a magnet and reed switch on a machine driveshaft. For each state there was a multideck rotary switch for the ones digit, and another for the tens digit, so you could set each state to last anywhere from 1 to 99 revolutions.
The way this was accomplished was through individual transistors and diodes soldered directly onto the rotary switch deck lugs, both to connect and mount them. Somebody had built these things I don’t know when – they were years old in the early 1970s when I first saw one.
My job was to repair them when transistors or diodes had failed, diagnosing which components had failed by watching how the erroneous counts varied with the knob settings.
NOP is an assembly language instruction of the assembly languages of several microprocessors. It may be implemented by the same code as JMP 1, but it’s still just a straight translation from a mnemonic to a code. There isn’t anything ‘macro’ about it: it’s just assembler, and assemblers don’t require that codes are uniquely associated with mnemonics. Neither do disassemblers for that matter: you often get different mnemonics when you disassemble.
Assemblers that translate mnemonics to one of any of several equivalent codes are no more complex, and are just as common.
Assemblers that translate mnemonics to multiple alternative codes may not use a simple translation table, because the choice of code can depend on the operands, which makes the table multi-dimensional. But the mapping is prescribed by the architecture of the machine, and implementing it is exactly what assemblers do. Some machines require more complex assemblers for some of the instructions. That’s a reason for using an assembler.
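The x86 JMP is a handy example: one mnemonic, and the assembler picks the code by how far away the target is (byte values per the Intel manuals):

jmp close_by       ; EB xx     - two-byte short form, target within -128..+127 bytes
jmp further_away   ; E9 xx xx  - longer near form otherwise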
The fact is that, in real life, there is not a one-to-one correspondence between all the assembly mnemonics and all the machine codes on all the architectures. That’s just the way it is.
Using ‘one-to-one correspondence’ as the definition of ‘assembly language’ is a ‘no true Scotsman’ definition. But I don’t think you’ll get much traction saying that assembly language instructions like NOP aren’t ‘real’ assembly language, and it’s not a definition that works for all microprocessors.
I know what a NOP is. The way you described it, it is a built-in inline macro. That’s the term for instructions you can use in assembler that are not directly implemented by the target machine.
NOPs are not trivial. You can code an effective NOP on many ISAs in many ways, but the devil is in the details. The CPU needs to decode the instruction; that takes time, and on a nasty CISC ISA it can take variable amounts of time. Next, your NOP needs to not just do nothing to the exposed state of the machine, but also avoid causing time-consuming changes to the internal state. JMP 1 could use up resources that a different null operation would not.
Intel provide specific advice on NOP sequences of different lengths that perform the best.
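From the Intel optimization manual, the recommended forms look like this (a sampling):

nop                       ; 90           - 1 byte
xchg ax, ax               ; 66 90        - 2 bytes (operand-size prefix + NOP)
nop dword ptr [eax]       ; 0F 1F 00     - 3 bytes
nop dword ptr [eax + 0]   ; 0F 1F 40 00  - 4 bytes

The 0F 1F forms exist purely so the decoder can swallow padding of any length in one go.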
I will also note that the Thinking Machines CM-5 vector processors used a really evil trick. They ran next to a SPARC processor, sitting on the same bus. The SPARC has a huge slew of null instructions. The vector processors used those null instructions to operate. All instructions ran in one cycle. So the SPARC could run the local OS and the scalar code of the user program. It would then run into a long sequence of null instructions that it just clocked through, with the side effect that the vector processors would all be chomping through their tasks. The vector units were 64 bits wide with an 8-deep pipeline. Once they got going, the performance, for the time, was just plain amazing.
Assemblers take care of all the grief of addressing modes and the variable length of instructions that comes with them. Not to mention the grief of how ISAs manage extended instructions bolted onto the design. Simple ISAs usually have simple instruction coding. Things like the x86-64 have become utterly dreadful with extension on extension.
As to the difference between assembler and machine code: we talk of writing in assembly language because we expect the assembler to do work for us, based not just on the instruction codes but on managing lots of symbolic grief in the background. Different assemblers provide different capabilities even for the same ISA. Compilers like gcc let you write symbolic machine code in-line, and the compiler can take care of a lot of the details, like working out register assignments and memory addresses.
In the end, IMHO, machine code is what the CPU executes. Anything that still requires massaging before it can be executed is not yet proper machine code.
Now we’re going out the other side. What’s needed to integrate inline code with a runtime environment is way beyond what I’d call an assembler. Ditto rearranging and optimizing code for a specific architecture. Generating code for a VLIW machine is way beyond what a simple assembler can do.
So to get back to the OP, I’d say that today no serious programming is done in machine code for advanced processors since things are too damn complicated. Some simpler embedded processors, maybe.
However, I just had an idea about how to draw the distinction. Anything that needs to understand the semantics of a program isn’t an assembler. Understand in a very loose sense, of course. But it includes data dependency graphs for register assignments, for instance. And all kinds of optimization.
This isn’t correct and it’s incorrect in meaningful ways.
First: Machine code is indeed the raw numerical values, most often in hexadecimal these days and octal in the past (and decimal in the distant past); anyone who says otherwise isn’t using the language the way the rest of us are. Wikipedia is an acceptable cite for common usage.
Second: Assembly doesn’t have the same syntax as machine code. I know this because I program on x86 machines, which have two separate yet equally important syntaxes for assembly language programs: Intel for people who have good taste, and AT&T for syntactic offenders. These are some examples:
Intel syntax:
mov rax, [rcx + rbx * 4]
AT&T syntax:
movq (%rcx, %rbx, 4), %rax
Notice how the source and destination flipped position from Intel to AT&T. They can’t both accurately reflect machine code, now, can they?
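For the record, both lines assemble to the same four bytes (worked out from the Intel manuals):

48 8B 04 99
; 48 = REX.W prefix (64-bit operand size)
; 8B = MOV r64, r/m64
; 04 = ModRM: rax <- [SIB follows]
; 99 = SIB: base rcx, index rbx, scale 4

The bytes have exactly one layout; the two syntaxes are just competing human notations for them.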
My best guess is the very early assemblers themselves; somebody had to write those in machine code. But I suppose that once the first assemblers became available, programming in machine code became obsolete, and even new assemblers would be written in assembly and then assembled into machine code using the already-existing assemblers.
Yes, that’s how it’s done, except modern assemblers are more likely written in C, not assembly.
Maybe the front panel-controlled computers of yore were programmed in machine code - though AIUI, the switches on the panel were used to tell the computer to load a program from some medium into the memory, not to manually enter ones and zeroes that would make up the program?
They were used for both, but the second was less common.
I’m getting confused now as well. What would programming in machine code mean? I recall programming my ZX Spectrum (Z80 processor) by writing assembly on paper, then converting it manually to hex, then typing the hex into a BASIC program (with a series of POKEs). I didn’t use an assembler (didn’t have one at the time), but used assembly. Is that programming in machine code?
Or would it only have been programming in machine code if I would not have written out assembly? What if I had the symbolic instructions in my mind and only wrote out the hex values on paper, but in my mind still though in terms of assembly (which is actually unavoidable, because how else would you understand what the hex values meant?)?
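For the record, the paper stage looked something like this (a made-up fragment, not my actual program; Z80 encodings per the standard opcode tables):

        ld a, 2      ; 3E 02
        ld b, 10     ; 06 0A
loop:   add a, b     ; 80
        djnz loop    ; 10 FD
        ret          ; C9

The left column is what I thought in; the right column, 3E 02 06 0A 80 10 FD C9, is what actually went into the POKEs.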
Yeah, I think this thread needs a glossary, but I’m not sure there is agreement on the terminology. I don’t think I have helped, as I’m coming from the same place as you - I had accepted that there was a thing called ‘assembly language’ which existed as a literal set of human-readable instructions, directly equivalent to the instruction set of the processor (i.e. this stuff ), and that an assembler program could (optionally) be used to organise it, as well as providing useful shortcuts and utilities.
But I think I’ve been wrong to call that Assembly Language. I believe that’s properly called the processor instruction set.
There appears to be disagreement as to whether this instruction set (used without an assembler environment) is also called ‘machine code’, or if the term ‘machine code’ only applies to the numeric values of those instructions, without considering their mnemonics.