Although I have not had to program professionally in assembler (I did do some while in college), I did have to look at actual 1s and 0s while fixing a bug.
I worked for a company that built solid state devices as well as the software to extract the data they collected. I worked on the software side and was presented, in my first full week there, with a problem: the new Windows-based program would, in certain cases and with certain models of our devices, return different data than the old DOS program. We wanted to deprecate the older software, especially since it would not work with our newer devices, so I had to figure out why this was happening.
Both were written in C++ and were very similar when it came to communicating with the device, to the point that I could not really figure out what caused the discrepancy. I decided to make sure they were actually sending and receiving the same data by having each version print out the 0s and 1s being sent back and forth through the serial port. And I found that yes, they were the same: the same instructions, the same acks, etc. So something else was the culprit.
It turned out it had to do with the logic used by the device to compress data and one question posed to the engineer who designed the device was all it took to know exactly where the bug in the code was and fix it within minutes.
If I had to do it again today, I would probably take a different tack than printing out 1s and 0s, but at least I was able to confirm that that is in fact how machines communicate.
I remember talking to one co-worker who had taken electronics technician training and was playing with his TRS-80 (those were the days). He disassembled the ROM and while looking at it, found several instances where routines would jump into the second byte of two-byte instructions. What he noted was that the sequence of bytes was coincidentally also what the ROM programmers needed for another very short subroutine, so they must have scanned existing code for a sequence of the correct bytes and then done a subroutine jump to that location. Basically, ROM contents doing double duty.
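To make that trick concrete, here is a small illustrative sketch in C, not the actual TRS-80 ROM; the three byte values are simply real Z80 opcodes (3E = LD A,n; AF = XOR A; C9 = RET) chosen to show how one run of bytes decodes as two different, equally useful routines depending on where you jump into it.

#include <stdio.h>

/* Decode the same byte string from two different entry points.
   A routine that jumps to byte 1 gets a perfectly good "XOR A / RET"
   out of bytes that are really the tail of "LD A,0AFh / RET". */

static const unsigned char rom[] = { 0x3E, 0xAF, 0xC9 };

static void disasm_from(int pc) {
    while (pc < (int)sizeof rom) {
        unsigned char op = rom[pc];
        if (op == 0x3E)      { printf("  %d: LD A,%02Xh\n", pc, rom[pc + 1]); pc += 2; }
        else if (op == 0xAF) { printf("  %d: XOR A\n", pc); pc += 1; }
        else if (op == 0xC9) { printf("  %d: RET\n", pc); return; }
        else                 { printf("  %d: ??\n", pc); return; }
    }
}

int main(void) {
    printf("entered at byte 0:\n"); disasm_from(0);   /* LD A,0AFh / RET */
    printf("entered at byte 1:\n"); disasm_from(1);   /* XOR A / RET     */
    return 0;
}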
DNA does that too. Evolution can find optimisations in the most evil places.
Some of the above comments exemplify why understanding, and an ability to hack, machine code remains part of the toolbox of many programmers. There are times when you simply must dive right past the abstractions most code works at.
My favourite bug was one where, on an MC68020, a MOVC3 instruction would drop a character when copying a string.
Right in the middle of a copied string there would be a missing character, not null, but actually missing. So
“ABCDEFG” copied as “ABCDEGx”, where x was whatever had been in the receiving buffer before. It took me a solid week to find and fix that. Given that this broke bcopy, it was a remarkable thing to find and then have to fix.
I should also point out, there’s the story that Bill Gates, while at college, wrote the original BASIC in machine code. Whether he wrote it in assembler and translated it or wrote it directly in machine code, he did not have access to a computer to write it on, so he wrote it by hand using the manual for an 8088. He then translated it into bytes and typed the bytes onto a teletype paper tape, one byte at a time. The first time he got a chance to try it was at a computer show, and when it was loaded, it ran the first time.
The concept of “bootstrapping” a compiler in its own language: basically, write a very minimal version of the language that can be turned into a runtime, either by translating it into assembler or into an existing high-level language for the computer. Once that is working, use the minimal language to add extensions, and more extensions… and compile each expanded version with the previous version.
If you’re familiar with the Atari 2600 game Yars’ Revenge, that colorful neutral zone that divides the playfield is actual code being used as data, doing double duty as a graphical element. Not quite as impressive as jumping into the operand of an instruction and treating it as an opcode, but still pretty neat.
No, that’s not correct. Gates did not write BASIC without the aid of a computer. Something of that complexity would have been impossible to write that way. Bill Gates in fact used the DoD-funded PDP-10 at Harvard for that purpose, running an 8080 emulator for developing his Altair BASIC interpreter, and got into trouble for doing so, and in even more trouble for giving Paul Allen – who wasn’t even a student – access to the system. What Gates did write from scratch – on the flight down to visit MITS in Albuquerque – was a machine-code bootstrap to load the paper tape of his BASIC interpreter.
As I mentioned above, I changed Adds to Subtracts and back to Adds using arithmetic on instructions stored in memory.
I had a short initialization routine at the beginning of my code to make sure all the Adds were Adds so I wouldn’t have to reload the program from paper tape.
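For anyone who hasn’t seen the trick, here is a rough C sketch of the idea, not the original machine: the instruction encoding (opcode * 1000 + address) and the opcode numbers are invented purely for illustration. Because ADD and SUB differ only by a constant in the opcode field, plain arithmetic on the stored instruction word flips one into the other, and an initialization step can flip them back.

#include <stdio.h>

/* A toy single-accumulator machine whose program shares memory with its
   data, so arithmetic on a stored instruction word changes its meaning. */

enum { OP_HALT = 0, OP_ADD = 1, OP_SUB = 2 };
#define INSTR(op, addr) ((op) * 1000 + (addr))

int main(void) {
    int mem[16] = {
        INSTR(OP_ADD, 10),        /* word 0: acc += mem[10]             */
        INSTR(OP_HALT, 0),        /* word 1: stop                       */
        [10] = 7,                 /* word 10: the data being added      */
    };
    int acc = 0, pc = 0, running = 1;

    /* the "initialization routine": force word 0 back to an ADD in case
       a previous run left it flipped, so the tape needn't be reloaded  */
    if (mem[0] / 1000 == OP_SUB)
        mem[0] -= 1000;           /* SUB -> ADD by ordinary subtraction */

    while (running) {
        int instr = mem[pc++], op = instr / 1000, addr = instr % 1000;
        if (op == OP_ADD)      acc += mem[addr];
        else if (op == OP_SUB) acc -= mem[addr];
        else                   running = 0;
    }
    printf("acc = %d\n", acc);    /* prints 7 */

    mem[0] += 1000;               /* and this flips the ADD back into a SUB */
    return 0;
}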
That’s hilarious. Totally believable, but hilarious.
Yeah, but the idea that people do that whilst thinking only of the numeric values of the code, rather than of their action as instructions, is just absurd. Why would you do such a thing unless you had an understanding of what it means?
One would hardly imagine an entire compiler written in assembly code, let alone in machine code, but it may have been done. (Can you tell already this is going to be another Seymour Cray story?)
I worked at UC Berkeley in the early 1970s, at which time we had two CDC-6400s. My job was to do local maintenance on the FORTRAN compiler and associated run-time library. That compiler must have been about 1000 pages of un-modular assembly spaghetti code. (Mostly, all I did was install weekly patches from CDC, which consisted of patching the source code and reassembling the whole thing, which took hours.)
All the legends about Seymour Cray held that he (or someone?) had originally written the whole thing in bare numeric machine code. The assembly code we had seemed to have some hints that this might be true.
All the statement branch-labels were like AAA, AAB, AAC, AAD, etc., all the way through ZZZ or however far they got. And being utter spaghetti code, every third statement was a branch of some kind to some place 200 pages away. And these were mostly just plain conditional branches, NOT mostly subroutine calls! Most of the text messages in the code were gathered together near the end, regardless of where in the code they were referenced – these were mostly error and warning messages – and they were all coded as octal numeric digit strings.
The whole massive thing did show some signs of having been originally written in raw numeric code, and then back-translated later into assembly language.
The entire COBOL compiler was likewise written (or at least delivered to us) in assembly language. Now there’s a massive program! I know nothing of its history though.
All the other utility programs, including the source code maintenance program, were likewise assembly language programs. Not to mention the entire operating system – initial versions of which, at least, Seymour Cray was widely claimed to have written in raw machine code.
That makes more sense - and really, it’s only definitions of ‘machine code’ that include the meaning of the code in terms of instructions/operations that make any sense of the OP’s question, at least for conventional processors. I guess if someone built a CPU for which the lowest-level input language was Brainfuck, all bets would be off.
Speaking of Brainfuck – I’m not sure why this strange language has such a derogatory name, nor why the Wikipedia page says it was first created in 1993.
Okay, maybe that detail is true. But there’s more bona fide history than that. Way back in 1969, in a compiler construction class, we discussed a minimal type of language that was substantially just what Brainfuck is. The whole point was to demonstrate the concept of a Turing-complete language, and what a language needed to have (or really, how little a language needed to have, as long as it had the right stuff) to be Turing-complete. It was a completely theoretical construct, for discussion only, never intended to really be built. But of course, people then went and wrote simulators for the language, and then programs in it to do stuff. So what I see as Brainfuck in all those “esoteric language” forums is really nothing particularly new or original.
And there are techniques for writing actual code in languages like that, or even in more minimal languages than that. The trick is to define small snippets of code to do the more usual higher-level things (like “add two numbers”). Then define assembler-language-level macros that expand to those snippets of code. Then you can write whole programs with instructions like LOAD X, ADD Y, STORE Z and run that through your assembler to produce the entire working Brainfuck program!
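Here is a minimal C sketch of that macro idea, under a couple of assumptions of my own: the “variables” are fixed cell numbers on the Brainfuck tape, every snippet leaves the tape pointer back at cell 0 so snippets can be chained, and the helper names (emit_set, emit_add_to) are invented for the example.

#include <stdio.h>

/* Each pseudo-instruction expands to a canned Brainfuck snippet and is
   simply printed out; the concatenated output is the final program. */

static void moves(int from, int to) {        /* emit > or < to walk the tape */
    for (; from < to; from++) putchar('>');
    for (; from > to; from--) putchar('<');
}

/* SET cell, value  --  zero the cell, then bump it 'value' times */
static void emit_set(int cell, int value) {
    moves(0, cell);
    printf("[-]");
    while (value--) putchar('+');
    moves(cell, 0);
    putchar('\n');
}

/* ADDTO src, dst  --  destructive add: dst += src, src ends up zero */
static void emit_add_to(int src, int dst) {
    moves(0, src);
    printf("[-");
    moves(src, dst);
    putchar('+');
    moves(dst, src);
    putchar(']');
    moves(src, 0);
    putchar('\n');
}

int main(void) {
    /* a LOAD/ADD/STORE flavoured "program": cell 2 = 5 + 7 */
    emit_set(1, 5);
    emit_set(2, 7);
    emit_add_to(1, 2);   /* cell 2 now holds 12, cell 1 is zeroed */
    return 0;
}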
Back in the 8 bit days, many computers came with a rudimentary debugger, but did not come with a proper assembler. I did a lot of programming back then by poking bytes into memory, mostly on Apple II and Commodore 64 computers. If you did any serious programming on the Commodore it was almost a necessity. The BASIC that came with the machine was very limited in what it could do.
I write a lot of assembly code, but I haven’t done much machine coding since those days. However, at work many years ago I did run into a situation where I needed to do a write-back and invalidate cache instruction (WBINVD for you x86 assembly heads out there), but the assembler didn’t recognize anything beyond a 386 CPU.
So I did this:
; WBINVD you stupid assembler
db 0Fh
db 09h
For those of you who don’t understand assembly code, the first line is a comment (anything after a semicolon is a comment), just so anyone who looked at it in the future could see what it was. Then it literally pokes in the two hex values 0F and 09, which together are the opcode for the WBINVD instruction. It’s a tiny piece of code, but it is technically machine code.
There is another language even more minimal than Brainfuck that is alleged to be Turing complete. A friend of mine claims to have invented it. But I’ve seen a near-equivalent on one of these esoteric language forums.
The language has exactly ONE instruction: “Reverse subtract and skip if negative”. It takes a single memory address operand. What it does: Subtract the contents of the memory address from the accumulator, store that (I forget where already – I think BOTH in the accumulator and in the named memory address maybe), and skip the next instruction if that result is negative. There was one additional proviso: Memory location 0 (zero) was the P-counter, and any operation that stored a number there produced an unconditional jump to that address.
Somehow, that was argued to be enough to be a Turing-complete machine. And the friend I had (and still have) who claimed to have invented that or something similar, pointed out that you could write a set of higher-level macros that would expand into the suitable (not necessarily most efficient) code to perform all the usual operations that computers do.
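Here is a rough C sketch of that machine as described, with the fuzzy details filled in by assumption: the result of the reverse subtract goes back into both the accumulator and the addressed cell, cell 0 is the program counter (so storing there is a jump), and the “skip” simply bumps cell 0 one extra time. The memory size and halting rule are my own additions.

#include <stdio.h>

#define MEMSIZE 256

/* One-instruction machine: every program word is just an address. */

int main(void) {
    long mem[MEMSIZE] = {0};      /* mem[0] doubles as the program counter */
    long acc = 0;

    /* ...load a program into mem[] here... */
    mem[0] = 1;                   /* begin executing at cell 1 */

    while (mem[0] > 0 && mem[0] < MEMSIZE) {
        long pc   = mem[0];
        long addr = mem[pc];              /* the instruction is just an address */
        if (addr < 0 || addr >= MEMSIZE)  /* treat a bad address as halt */
            break;

        long result = acc - mem[addr];    /* reverse subtract */

        mem[0] = pc + 1;                  /* default: fall through to next word */
        acc = result;
        mem[addr] = result;               /* if addr == 0 this overwrites the
                                             program counter, i.e. a jump       */
        if (result < 0)                   /* skip-if-negative */
            mem[0] += 1;
    }
    printf("halted with acc = %ld\n", acc);
    return 0;
}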
Many later CPUs, especially during the 8-bit era, could not multiply or divide. But if you can add, subtract, and shift, you can multiply and divide using fairly simple add/shift or subtract/shift loops. It’s a slow way to do it, since if you are multiplying 8-bit numbers you need to go through your loop 8 times (16 times through the loop for 16-bit, etc).
It was very common back then to have canned subroutines for multiply and divide since the algorithm was very widely known and copying the existing algorithm was a lot faster than coding up your own.
Many early CISC processors did the same sort of thing, going through add-and-shift loops to multiply and subtract-and-shift loops to divide. It’s just that they did it with a single opcode that made the ALU cycle through the loop on its own (one pass through the loop per clock cycle) instead of forcing you to code it yourself.
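For the curious, a small C sketch of the classic shift-and-add multiply loop that those canned subroutines (and later the microcoded multiply instructions) implemented: one pass per multiplier bit, so eight passes for 8-bit operands.

#include <stdint.h>
#include <stdio.h>

/* Multiply two 8-bit values the way a CPU without a MUL instruction does:
   examine the multiplier one bit at a time, adding the shifted
   multiplicand into the product whenever the bit is set. */
static uint16_t mul8(uint8_t a, uint8_t b) {
    uint16_t product = 0;
    uint16_t addend = a;          /* gets shifted left each pass */
    for (int bit = 0; bit < 8; bit++) {
        if (b & 1)                /* low multiplier bit set: add in addend */
            product += addend;
        addend <<= 1;             /* shift the multiplicand left */
        b >>= 1;                  /* move on to the next multiplier bit */
    }
    return product;
}

int main(void) {
    printf("13 * 21 = %d\n", mul8(13, 21));   /* prints 273 */
    return 0;
}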
Well, hacks like that were common enough. The Zilog Z-80 processor was a more-or-less clone of the Intel 8080, but with a few extra instructions. I don’t know if there was a Z-80 assembler around, but I had never seen one.
So Z-80 programs were assembled with any of the 8080 assemblers that were kicking around. If one really wanted to use the extra instructions (all the better to guarantee your code would NOT be portable back to the 8080, of course), tricks like this were common.
In those long-gone days, if you programmed your (now-)archaic computer by sitting at the console poking digits into memory (I did plenty of that), then debugged it at the console with the blinkenlights and the single-step button, we called that “hands-on programming”.
We did even better than that. At Berkeley, we had an archaic Univac SS-90 machine (info link coming when I get a chance to dig it up) that was donated by a lumber company when they upgraded to an IBM 360.
This had logic cards with two or maybe four logic gates on each card (this was early solid state days), with all the larger logic implemented by the way they were all connected up on the wire-wrapped backplane. There were a whole lot of empty card slots. And we had boxes of spare cards of various sorts – a good thing too, considering how many of them we blew out. We also had the full set of logic diagram blueprints.
And we had guys in the club who were more into hardware than software. They invented a few new program instructions that they thought would be useful. (IIRC, they were conditional jumps of various kinds.) Then they drew up the logic diagrams. Then they plugged in the requisite logic cards and wired up the logic on the backplane.
We called this hands-IN programming.
Of course, the assembler wouldn’t assemble those. But that was fine, since we never used the assembler except to try it out a few times. Those of us on the programming side did substantially ALL our coding in absolute numeric coding.
The only suitable input device we had for initially entering such code was a ten-key keypad and a few buttons on the console, plus the register blinkenlights. I did pages of code that way. Once entered, the code could be punched out on cards so it could be reloaded a little more easily later.
I don’t remember who that driver was, but I was on that expedition! I rode up in the back of that flat-bed truck, along with Hal S., under a tarp because it was cold and drizzly. I rode back in one of the other chase cars that came along. This was in summer or fall of 1969.
Yes, you are using the commands that are the language of the machine. But those are just the verbs. The important nouns you can’t name directly, because in machine language they are specific addresses, and as you develop code you won’t know what those addresses are until you are done and go back to perform an assembly process. This is why I say that writing code in assembler is not the same thing as writing code in machine language. You are going to use higher-level constructs to do your programming. And of course you can use much higher constructs than the machine instructions.
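A toy sketch of that point in C, with a made-up handful of mnemonics: the programmer writes the nouns as labels, and only the assembly step turns them into the numeric addresses that the machine code itself contains.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Source lines name locations with labels; the lookup plays the role of
   a real assembler's symbol table, and the loop in main() plays the
   role of the pass that emits the final numeric code. */

struct line { const char *label; const char *op; const char *arg; };

static const struct line prog[] = {
    { "start", "LOAD", "count" },
    { "",      "SUB",  "one"   },
    { "",      "JNZ",  "start" },   /* the operand is a name, not a number */
    { "",      "HALT", ""      },
    { "count", "DATA", "3"     },
    { "one",   "DATA", "1"     },
};
enum { N = sizeof prog / sizeof prog[0] };

static int addr_of(const char *name) {       /* look a label up by name */
    for (int i = 0; i < N; i++)
        if (name[0] != '\0' && strcmp(prog[i].label, name) == 0) return i;
    return -1;
}

int main(void) {                             /* emit the numeric code */
    for (int i = 0; i < N; i++) {
        int operand = addr_of(prog[i].arg);
        if (operand < 0)                     /* not a label: a literal or blank */
            operand = (int)strtol(prog[i].arg, NULL, 10);
        printf("%02d: %-4s %d\n", i, prog[i].op, operand);
    }
    return 0;
}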
Anyway, I’ve made my points. We are getting down to blurry lines between definitions, and the modern world of computers barely resembles the simple von Neumann models these concepts were developed on. You obviously understand how machine code and assembler work. It has been fun to visit this subject again after many years.