I have no idea of the relevance of any of those questions.
The OP asks about serious programming done in machine code, and makes a distinction between machine code and assembler. I stated my opinion that there is no meaningful distinction between these two, as there is a one-to-one mapping of instructions.
Literally typing out binary would be useless busywork, doing exactly the same function, in the same way, as typing out 3-6 character commands.
Or, to put it another way, we could hypothetically make a CPU whose instructions were all human-meaningful ASCII character words like jmp. If I write for that CPU, is it assembler or machine code? Tomato, potato.
I think @Francis_Vaughan pointed out an important distinction here: using an assembler (an IDE that takes care of memory addresses and some other things that a machine code programmer would have to do for themselves) definitely means that one is a step away from the other.
Using an assembler is like a superset of writing assembly language.
What about stack processors? There are microprocessors that use a FORTH-like stack language as their machine code. These are widely used in industrial settings and are easy to program.
Of course, today the chip itself is microprogrammable and microprogrammed to use the stack language, but there used to be chips that had it in hardware.
I don’t understand that… Using an assembler is what you do when you’re done writing code in assembly language to convert it into object code, possibly directly into machine code, but I think you understand the process is more complicated than that.
If you answered them you might figure it out. People who enter machine codes into a machine, or type in a program in assembly language, or any other language are doing data entry. You can call that programming if you want when the data entered happens to be machine codes, but I don’t. I don’t consider my disk drive to be programming when object code is transferred from the disk to memory, it’s just moving data from one place to another. I consider programming to be the process of creating code for a purpose. That requires intelligence. Maybe an intelligent human, maybe intelligent code, but it’s not a simple mechanical process.
There was a microcoded processor that ran P-code (the Pascal intermediate stack engine) way back when. (I think it was actually an LSI-11, but I am probably wrong.) A stack based abstract machine is pretty much the default.
Even with a stack based abstract architecture something needs to keep track of the stack offsets. In most languages that is the compiler; in an abstract intermediate machine like LLVM, it works out the mapping from an infinite register stack machine to a real machine. FORTH is a neat way to code, but even then the intermediate code is not something a human can generate with any ease.
Then you have architectures like the SPARC, with a register stack.
Coding in assembler means you get to write all the verbs out easily. But the nouns - the subjects of those verbs - require the assembler, linker and loader to sort out. All well and good being proud that you know how to write MOVAB, but that is the easy bit. The address and offset calculations are where the assembler earns its keep.
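As a rough illustration of the offset arithmetic, here is a minimal Python sketch. It uses the Z80 relative jump JR NZ (opcode 0x20) rather than a VAX instruction, purely because the encoding is simple; the function name and the example addresses are made up for the illustration.

```python
def jr_nz(instr_addr, target_addr):
    """Encode a Z80 'JR NZ,target' that sits at instr_addr.

    The displacement is a signed byte measured from the address of the
    instruction that follows the 2-byte jump.
    """
    offset = target_addr - (instr_addr + 2)
    if not -128 <= offset <= 127:
        raise ValueError("target out of range for a relative jump")
    return bytes([0x20, offset & 0xFF])


# Jump back from 0x8003 to a label at 0x8002: displacement is -3 (0xFD).
print(jr_nz(0x8003, 0x8002).hex(" "))   # 20 fd
# Jump forward from 0x8000 to 0x8010: displacement is +14 (0x0E).
print(jr_nz(0x8000, 0x8010).hex(" "))   # 20 0e
```

Insert one instruction between the jump and its target and the displacement byte changes. That is the bookkeeping you do by hand without an assembler.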
I’ll have to disagree there. All of the people I know that use machine code and assembler know the difference, and they all work in the computer and programming industries.
We still build using machine code, but I haven’t written anything in machine code in 30 years: if required, we build in assembler, output in hex, and include the machine code as an image.
Regarding the original question: it depends what you mean by “serious”. Anything very complex, and it’s probably worth the cost to pay for an assembler. I wrote in machine code when that was easier and less effort than adding another tool to the toolchain.
I don’t agree with those who have suggested that writing machine code is the same as writing assembler. The first high-level languages I used merely translated from “programming language” to machine code: when doing this stuff, I could look at FORTRAN or Pascal and tell you what machine code numbers were generated, but that’s like saying that English and Chinese are ‘the same’, or that written text and spoken text are ‘the same’. We use names and mnemonics because they match the way our brain works, and a lot of real programming is already at the limit of what the brain is capable of.
In any case, CISC assemblers generally didn’t have a one-to-one correspondence between assembly mnemonics and machine code, and most assemblers are of the type more specifically known as ‘macro assemblers’.
That’s an interesting discussion but I’m not sure if it is the obvious interpretation of the OP. In any programming language the developer has the choice of using an IDE or just text editor + compiler.
Heck, you could have an IDE for writing machine code directly, if there were any point to doing that over writing assembly.
So in comparing two forms of instruction, I don’t know why we’d compare using an IDE in one versus not with the other.
The OP is specifically asking about programming, and I have of course assumed that whether we’re talking machine code, assembly or Brainfuck, we’re talking about a software engineer writing code for some purpose.
Where has your data entry tangent come from?
Not mentioned in this discussion is one major group of people who still do a significant amount of work in machine code and that’s reverse engineers and vulnerability researchers. Because often the only form of the program they have access to is the raw machine code, they still remain literate in how to read and manipulate raw machine instructions. Often they will use tools to help with comprehension like decompilers and disassemblers but sometimes, nothing beats just staring at the raw hex codes or binary to truly understand the nitty gritty of how the code works.
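For a feel of what that reading involves, here is a minimal disassembler sketch in Python covering only five Intel 8080 opcodes. The table layout, the function name and the sample bytes are assumptions for the illustration; real tools cover the whole instruction set and a great deal more.

```python
# opcode -> (format string, number of operand bytes); a tiny 8080 subset
TABLE = {
    0x00: ("NOP", 0),
    0x3D: ("DCR A", 0),
    0x3E: ("MVI A,{:#04x}", 1),
    0xC2: ("JNZ {:#06x}", 2),   # 16-bit address, little-endian
    0xC9: ("RET", 0),
}


def disassemble(code, origin=0):
    """Return one text line per decoded instruction."""
    lines, i = [], 0
    while i < len(code):
        fmt, n = TABLE.get(code[i], (None, 0))
        if fmt is None:
            # Unknown byte: show it as raw data and move on.
            lines.append(f"{origin + i:04x}: db {code[i]:#04x}")
            i += 1
            continue
        operand = int.from_bytes(code[i + 1:i + 1 + n], "little")
        lines.append(f"{origin + i:04x}: " + fmt.format(operand))
        i += 1 + n
    return lines


raw = bytes.fromhex("3e033dc20201c9")
listing = disassemble(raw, origin=0x0100)
print("\n".join(listing))
```

The output shows `JNZ 0x0102`, not `JNZ loop`. The label name is gone from the raw bytes, so the reverse engineer has to reconstruct what that address means.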
I’ll try one more time. If I write this post in English, and then have it translated word for word into Spanish did I write the post in English or Spanish?
The one for one correspondence doesn’t exist and even if it did it wouldn’t matter. Here is a simple instruction. What is the one for one correspondence between this line of assembler and machine code? Pick any instruction set, just tell me the number that replaces ‘loop’.
You might be able to write code with that effect in an assembly language, but I would still maintain that it’s fundamentally programming in machine code, not programming in assembly. I can envision two different computers, with the same set of available fundamental operations, but which assign different numerical codes to those operations (yes, this is a rare situation, but it’s possible). You could use the same assembly-language code for those two computers, just by running it through different assemblers, to turn your mnemonics into two different sets of numbers. You could even write assembly code without knowing at all what the numerical codes are that correspond to your mnemonics. But you can’t do self-modifying code of that sort without knowing what all of the numeric machine codes are for those operations.
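A small Python sketch of that thought experiment. Both machines, their mnemonics and their opcode numbers are entirely hypothetical.

```python
# Two imaginary machines: same operations, different opcode numbers.
MACHINE_A = {"LOAD": 0x01, "ADD": 0x02, "STORE": 0x03, "HALT": 0xFF}
MACHINE_B = {"LOAD": 0x10, "ADD": 0x20, "STORE": 0x30, "HALT": 0x00}

SOURCE = """
LOAD 7
ADD 5
STORE 9
HALT
"""


def assemble(source, opcodes):
    """Turn each mnemonic (plus optional one-byte operand) into numbers."""
    out = []
    for line in source.strip().splitlines():
        parts = line.split()
        out.append(opcodes[parts[0]])
        out.extend(int(p) for p in parts[1:])
    return bytes(out)


code_a = assemble(SOURCE, MACHINE_A)
code_b = assemble(SOURCE, MACHINE_B)
print(code_a.hex(" "))   # 01 07 02 05 03 09 ff
print(code_b.hex(" "))   # 10 07 20 05 30 09 00
```

The source text is identical for both machines, and you never have to look at the opcode tables to write it. A program that patches its own opcode bytes at run time, say by overwriting an ADD with something else, has to have the right number for the right machine written into it.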
Neither, because you didn’t do that. You can’t translate English into Spanish word-for-word. Very few things that get called distinct languages can be translated word-for-word, because the grammar is different. They can still be translated, of course, but you need to look at how words relate to each other, not just one word at a time. Occasionally, with closely-related languages, you can find a sentence that does translate word-for-word from one to the other, or even a sentence that looks identical in both, but for longer works, that eventually breaks down.
The analogy carries over fairly well to programming languages. You can translate Fortran to C. You can even write C with a Fortran accent. But you can’t just directly transform Fortran into C one word at a time, because the grammar is different, and so you have to look at higher-level structures.
In the cases where you can directly convert from one language to another, one word at a time, as with simple assemblers, there’s a strong case to be made that those aren’t actually different languages.
Yes you can. It may not be good Spanish, but I may not be a good programmer either. It also doesn’t matter whether it’s a word for word translation. I wrote the post in English, not Spanish. A tool turned it into Spanish, not me. Source code and object code are not the same thing.
" JNZ loop"
Tell me what number replaces “loop”. What is the one to one correspondence between “loop” and the number that replaces it in machine code? That’s assembly language. Symbolic definitions are part of the language and don’t have one to one correspondences.
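To make that concrete, here is a minimal two-pass assembler sketch in Python for four Intel 8080 instructions. The opcodes are the real 8080 encodings, but the parser, the sample program and the two load addresses are invented for the illustration.

```python
# mnemonic -> (opcode, instruction size in bytes); tiny 8080 subset
OPCODES = {
    "MVI A": (0x3E, 2),   # MVI A,d8
    "DCR A": (0x3D, 1),
    "JNZ":   (0xC2, 3),   # JNZ a16, address stored little-endian
    "RET":   (0xC9, 1),
}

SOURCE = """
        MVI A,3
loop:   DCR A
        JNZ loop
        RET
"""


def parse(line):
    """Split a source line into (label, mnemonic, operand)."""
    label, _, rest = line.rpartition(":")
    label, rest = label.strip(), rest.strip()
    if rest.startswith("MVI A,"):
        return label, "MVI A", rest[len("MVI A,"):].strip()
    if rest.startswith("JNZ "):
        return label, "JNZ", rest[4:].strip()
    return label, rest, None


def assemble(source, origin):
    lines = [parse(line) for line in source.splitlines() if line.strip()]
    # Pass 1: walk the program and record the address of every label.
    symbols, addr = {}, origin
    for label, mnemonic, _ in lines:
        if label:
            symbols[label] = addr
        addr += OPCODES[mnemonic][1]
    # Pass 2: emit bytes, substituting label addresses.
    out = bytearray()
    for _, mnemonic, operand in lines:
        out.append(OPCODES[mnemonic][0])
        if mnemonic == "MVI A":
            out.append(int(operand) & 0xFF)
        elif mnemonic == "JNZ":
            out += symbols[operand].to_bytes(2, "little")
    return bytes(out), symbols


low, low_syms = assemble(SOURCE, origin=0x0100)
high, high_syms = assemble(SOURCE, origin=0x8000)
print(low.hex(" "))    # 3e 03 3d c2 02 01 c9
print(high.hex(" "))   # 3e 03 3d c2 02 80 c9
```

JNZ becomes 0xC2 in both runs. “loop” becomes 02 01 in one and 02 80 in the other, so no fixed number stands for it.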
What I am trying to say is that whilst the language instructions that you use in an assembler are directly representative of the instructions understood by the processor, the assembler environment (as @Francis_Vaughan pointed out) includes additional tools to make the process of writing a program easier - such as abstraction of memory addresses as labels.
It’s possible to write the processor instructions on paper and translate them into bytecodes manually by looking them up (I’ve done it, for very simple Z80 code), but you have to care about things like exact memory addresses - if you want to do an absolute jump or a call or read or write etc.
Exactly. Because assembly language is not just a list of the text representations of opcodes. It includes all the features of the assembly language, which in reality will include macros and environmental instructions, along with compile time, link time, and load time directives.
This was the distinction that wasn’t at the front of my mind when I joined the thread. Probably because, although I have programmed a bit using an assembler (years ago on my ZX Spectrum), I did not actually realise at the time that it had these features and I did the other stuff like addresses and jumps the hard way, so I guess I was just using the assembler to write machine code - I think the only advantage I made use of was the automatic translation of decimal numbers.
I was told many years ago that the early PC (read DOS 3.1) spreadsheet program Lotus 1-2-3 was written in assembly. No higher-level language was used. It was a killer app at the time and sold tens of thousands of copies. So if Lotus 1-2-3 was written entirely in assembly then yes, serious programming was done in assembly.
Lotus 1-2-3 was fast and fit within the DOS memory limitation. But it was so hard to maintain and improve that it fell behind later spreadsheet programs.
P.S. I do make a distinction between machine code and assembly code but, IMHO, I can’t imagine how anyone could write even a smallish useful program in true machine code.
For those who are interested, this guy takes a 6502 processor, some RAM, and an Arduino for I/O; and executes machine language programs. A revelation on how computers really work at the most basic level.
It’s not a matter of “knowing the difference” – the difference is taught in Computer Science 101. It’s really just a matter of convention that may vary among different technology cultures. My IT experience was largely with mainframes and higher-end minicomputers where one rarely dealt directly with octal or hex machine code, but frequently dealt with high-level language compilers. So “machine language” was usually synonymous with “assembly language” in that culture, to distinguish it from a high-level application-oriented language. YMMV.
This is just wrong. Assembly language at its core is basically just a symbolic way of specifying machine instructions. The one-to-one correspondence between each symbolic specification and the resultant machine instruction is inherent to what assembler fundamentally is. Any assembly listing immediately reveals this one-to-one correspondence.
Of course this excludes specifications that are merely directives to the assembler. Macros are just a shorthand way of specifying a commonly used block of code. And CISC instruction sets have nothing whatsoever to do with any of this.
This is just wrong. Converting opcodes expressed as text into numbers is the simplest part, and only a small part, of what an assembler does.
" JNZ loop"
Tell me what number replaces “loop”. What is the one to one correspondence between “loop” and the number that replaces it in machine code?
And just to be clear, only simplistic assemblers produce executable machine code. They may convert static parts of the program into their binary representations, but that’s not even necessary. Assemblers produce object code and a symbol table, generally in the same form as other language compilers do. That symbol table allows object code produced from multiple sources to be converted into machine code through the use of a linker. And even a linker may not produce executable machine code, it may still have to pass through a loader to finalize address resolution.
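A toy model of that pipeline in Python. The `ObjectFile` layout, the relocation format and the load address are invented for the sketch; real object formats carry far more information. The bytes are 8080 encodings (CALL is 0xCD).

```python
from dataclasses import dataclass


@dataclass
class ObjectFile:
    code: bytearray   # machine code with placeholder address fields
    defines: dict     # symbol name -> offset inside this object's code
    relocs: list      # (offset of a 2-byte address field, symbol name)


def link(objects, base):
    """Lay objects out back to back and patch in final addresses."""
    symbols, starts, addr = {}, [], base
    for obj in objects:
        starts.append(addr)
        for name, offset in obj.defines.items():
            symbols[name] = addr + offset
        addr += len(obj.code)
    image = bytearray()
    for obj, start in zip(objects, starts):
        code = bytearray(obj.code)
        for offset, name in obj.relocs:
            # The address was unknown at assembly time; fill it in now.
            code[offset:offset + 2] = symbols[name].to_bytes(2, "little")
        image += code
    return bytes(image), symbols


# "main" calls a routine it cannot see: CALL ????, RET
main_obj = ObjectFile(bytearray([0xCD, 0x00, 0x00, 0xC9]),
                      {"main": 0}, [(1, "helper")])
# "helper" lives in a different source file: MVI A,1 ; RET
helper_obj = ObjectFile(bytearray([0x3E, 0x01, 0xC9]),
                        {"helper": 0}, [])

image, symbols = link([main_obj, helper_obj], base=0x0100)
print(image.hex(" "))   # cd 04 01 c9 3e 01 c9
```

The assembler output for `main` contains 00 00 where the address of `helper` belongs, so it is not yet runnable machine code. It only becomes runnable once the linker, or in other setups the loader, decides where everything lives.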