I’m getting into two distinct types of programming these days: *nix hacking, on my Linux box with its full complement of GNU tools, and DOS hacking, within dosemu emulation on my Linux box, with a lot of freeware and shareware tools I’ve downloaded.
It’s the toolkits I will describe here. Each kind of programming has its own style, reflected in large part by the tools its programmers deem essential to the task. *nix programming is focused on producing stable, secure products that will run in an OS that provides true multitasking, true memory protection, and a wide range of basic functionality. DOS programming is focused on producing stable (but not so secure) products that will run in a single-tasking OS that provides no memory protection and only the most absurdly simple functionality.
Another point to consider is the kind of machine each culture comes from. *nix OSes were originally used on relatively high-end business and academic machines that routinely served multiple users (both local and remote through a modem-teletype link). Disk space wasn’t especially scarce, and everyone usually got more than enough clock cycles to do what they wanted. DOSes, on the other hand, have always been used on low-end personal computers. Disk space was traditionally at a premium and shaving clock cycles was essential to get something to run at an acceptable speed.
Now, for the lists.
DOS:
[ul]
[li]An editor of some type: This is so simple it seems almost not worth mentioning, but I’ll mention it anyway because it does matter. A simple editor is easy enough to obtain: One came free with MS-DOS, called EDIT, and there is a plethora of EDIT clones free to download. What would I want if I could only have a single editor? A hex editor of some kind. The ability to edit bytes directly means the ability to write programs with nothing else. With a hex editor and a knowledge of how Intel x86 opcodes are encoded, a dedicated enough hacker can write his way into a more user-friendly system. This is akin to climbing Everest backwards on a unicycle, but it is possible, and some of the more extreme survivalists find it oddly appealing.[/li]
The dark side to this is the ability to crack. There has always been a vibrant, active program-cracking scene in the low-end culture, dating from the pre-PC Commodore era and continuing to the present day. Cracking mainly consists of examining the binary of the program you wish to crack, either directly in the hex editor or via a debugger (more on that later). The motive is usually a copy-protection scheme embedded in the program, or maybe the wish to add a few cheat codes to a game’s binary. You then use the hex editor to replace the offending opcodes (the ones that make up the copy-protection routine) with no-ops, null instructions that do nothing (Intel opcode 0x90 :D).
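The byte-level editing a hex editor offers can be sketched in a few lines. This is only an illustration on a made-up five-byte buffer: `hexdump` and `nop_out` are hypothetical helper names, not part of any real tool, and the "program" is invented for the example.

```python
NOP = 0x90  # the single-byte x86 no-op mentioned above


def hexdump(data: bytes, width: int = 16) -> str:
    """Format bytes as offset, hex columns, and printable ASCII, hex-editor style."""
    lines = []
    for off in range(0, len(data), width):
        chunk = data[off:off + width]
        hex_part = " ".join(f"{b:02X}" for b in chunk)
        text = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{off:08X}  {hex_part:<{width * 3 - 1}}  {text}")
    return "\n".join(lines)


def nop_out(image: bytearray, start: int, length: int) -> None:
    """Overwrite `length` bytes beginning at `start` with NOPs, in place."""
    if start < 0 or length < 0 or start + length > len(image):
        raise ValueError("span falls outside the image")
    image[start:start + length] = bytes([NOP]) * length


# Invented 16-bit snippet: mov ax,0x1234 / inc ax / ret
image = bytearray.fromhex("B8 34 12 40 C3")
nop_out(image, 3, 1)  # blank out the one-byte `inc ax`
print(hexdump(image))
```

Filling with NOPs rather than deleting bytes keeps every later offset where it was, which is why the patch is done in place.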
[li]Disassembler: This is a program that brings to mind the truism “There’s no such thing as closed-source. After all, what do you think the CPU’s executing?” Disassemblers translate binary into assembly code, essentially reversing the process of the assembler. But unlike the assembly process, disassembly isn’t always exact: Everything looks the same in a binary, because code and data are both just bytes. A smart programmer knows enough about assembly to read what the disassembler gives him, and better disassemblers have options to skip a given number of bytes to allow for nonexecutable header information and such. Disassembly is a great way to see what that compiler’s code generator is doing with your precious sources, and to see places where the damned thing is so hopeless you’d do better to write parts in assembly and inline them with the rest.[/li]
A disassembler can be useful in cracking, but (as I said above) it is not essential. It is actually a bit of mental work to read a disassembled program: You have to know what the program is accomplishing, and how that would probably be done in assembly.
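To see why disassembly is inexact, here is a toy decoder for a handful of one-byte and three-byte 16-bit x86 opcodes. It is a sketch, not a real disassembler: `disassemble` is a hypothetical name, the opcode coverage is tiny on purpose, and the input bytes are invented.

```python
# 16-bit register numbering used by the x86 opcode encodings below.
REG16 = ["ax", "cx", "dx", "bx", "sp", "bp", "si", "di"]


def disassemble(code: bytes) -> list[str]:
    """Decode a very small subset of 16-bit x86; anything else becomes a `db` byte."""
    out = []
    i = 0
    while i < len(code):
        op = code[i]
        if op == 0x90:
            out.append("nop")
            i += 1
        elif op == 0xC3:
            out.append("ret")
            i += 1
        elif 0x40 <= op <= 0x47:          # inc r16
            out.append(f"inc {REG16[op - 0x40]}")
            i += 1
        elif 0x48 <= op <= 0x4F:          # dec r16
            out.append(f"dec {REG16[op - 0x48]}")
            i += 1
        elif 0xB8 <= op <= 0xBF and i + 2 < len(code):   # mov r16, imm16
            imm = code[i + 1] | (code[i + 2] << 8)       # little-endian immediate
            out.append(f"mov {REG16[op - 0xB8]},0x{imm:04x}")
            i += 3
        else:
            out.append(f"db 0x{op:02x}")  # unknown byte: emit it as data
            i += 1
    return out


# Real code, then the ASCII string "HAL" stored as data, then a ret.
listing = disassemble(bytes.fromhex("B83412") + b"HAL" + b"\xC3")
print("\n".join(listing))
```

The three data bytes come out as `dec ax`, `inc cx`, `dec sp`: perfectly valid instructions that the program never meant to execute. Telling the two apart is the mental work described above.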
Skill with a disassembler and a taste for assembly make the point of closed-source software moot. Who knows how many curious hackers know the code of MS-DOS better than Gates himself because they were skilled with a disassembler? That’s why I laugh when I hear about ‘secure initiatives’ focused on keeping source code secret.
[li]Assembler: What makes a good assembler? Speed and simplicity. What makes a popular assembler? It being free to download. MASM was certainly the second, and so most assemblers in the DOS world are MASM clones. MASM is the Microsoft Assembler, and it is still free to download from Microsoft’s site. MASM’s popularity forced a certain syntax on the DOS world: Intel syntax.[/li]
For those of you who didn’t understand a word past ‘certain,’ I’ll explain. What does this line of code do?
mov ax,bx
Well, it’s impossible to know unless you know the syntax of the assembler: Intel or AT&T? Intel syntax, followed by MASM and its clones, puts the destination first, so the value of bx is moved into ax. AT&T syntax, followed by gas (the GNU assembler), puts the source first, so the same operand order would move ax into bx (gas would also want the registers written as %ax and %bx). Quite a quandary for the itinerant hacker, moving between the *nix and the DOS worlds. I solve it by going Intel all the way: NASM, the free (and very good) Netwide Assembler, follows Intel syntax and is available in source form for *nix and in binary form for DOS.
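The operand-order difference can be made concrete with a tiny translator. This is a sketch under narrow assumptions: `intel_to_att` is a hypothetical helper that handles only two-operand instructions whose operands are both 16-bit registers, and it always appends the `w` size suffix that AT&T syntax commonly uses for word operations.

```python
REG16 = {"ax", "bx", "cx", "dx", "si", "di", "bp", "sp"}


def intel_to_att(line: str) -> str:
    """Rewrite `op dest,src` (Intel order) as `opw %src,%dest` (AT&T order)."""
    mnemonic, operands = line.strip().split(None, 1)
    dest, src = (part.strip().lower() for part in operands.split(","))
    if dest not in REG16 or src not in REG16:
        raise ValueError("this sketch only handles 16-bit register operands")
    return f"{mnemonic.lower()}w %{src},%{dest}"


print(intel_to_att("mov ax,bx"))
```

Both spellings describe the same machine instruction (copy bx into ax); only the notation is flipped.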
[li]Compiler: It is only very recently that the DOS world has had compilers of any quality available to the non-professional user. Shelling out hundreds for Borland C simply didn’t resonate with the average DOS hacker, so the scene was limited to assembly for serious hacking for most of the early days.[/li]
Now there are a lot of C compilers available for free to the DOS hacker. Many of them conform only to the older K&R standard, but the determined person can find a good ANSI C compiler at a download site.
Non-C compilers? There aren’t many, and they aren’t of as high a quality. Basic, while mostly an interpreted language these days, has a few good compilers for it (including the Basic-like MoonRock language). (Interestingly, Basic was once purely interpreted for the simple reason that most low-end machines had Basic interpreters burned into ROM. Why write what was in the hardware to begin with?)
[/ul]
*nix:
[ul]
[li]Editor: Again, I have a reason for listing this. There are almost no hex editors in wide use in the *nix world. The editors that are in use fall into two main categories: ed-descended and TECO-descended. ed and TECO? Yep. The two iron-age editors dating right back to paper tape and teletypes.[/li]
TECO is the Tape Editor and COrrector, a name that implies the old paper tape method of storage and input. It is a modeless editor, in that it only has one mode: Command input. By default, all keystrokes are interpreted as commands and one has to input a command to append text. TECO is based around the input point, the equivalent of the modern cursor. TECO encourages the writing of macros in its own terse, nearly unreadable language. Obviously, TECO is the ancestor of Emacs. Emacs’ name derives from Editor MACroS, recalling when it was simply a package of macros sitting on top of TECO. Emacs, even though it doesn’t look like it, still operates in TECO’s single-mode style. The difference is that most keys (those not combined with a modifier key like Control or Alt) are bound to a function that simply inserts the character for that key. You can rebind any key to do anything, reflecting the gonzo macrology of TECO while at the same time extending it with the elisp language.
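The "every key is a command" idea can be sketched as a lookup table. This is a toy model, not Emacs code: the `Buffer` class, the `keymap` dictionary, and the key names like `"C-b"` are assumptions made for illustration, loosely modeled on the way Emacs binds ordinary keys to its `self-insert-command`.

```python
class Buffer:
    """Text plus an insertion point, like TECO's input point."""

    def __init__(self):
        self.chars = []
        self.point = 0

    def text(self) -> str:
        return "".join(self.chars)


def self_insert(buf: Buffer, key: str) -> None:
    """The command bound to ordinary keys: insert the key's own character."""
    buf.chars.insert(buf.point, key)
    buf.point += 1


def backward_char(buf: Buffer, key: str) -> None:
    buf.point = max(0, buf.point - 1)


def forward_char(buf: Buffer, key: str) -> None:
    buf.point = min(len(buf.chars), buf.point + 1)


# One mode only: every key, printable or not, looks up a command.
keymap = {chr(c): self_insert for c in range(32, 127)}
keymap["C-b"] = backward_char
keymap["C-f"] = forward_char


def feed(buf: Buffer, keys) -> None:
    for key in keys:
        keymap[key](buf, key)


buf = Buffer()
feed(buf, ["a", "c", "C-b", "b"])
print(buf.text())  # abc
```

Rebinding is just another dictionary assignment, for example `keymap["x"] = backward_char`, which is the whole trick behind "you can rebind any key to do anything".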
ed is the editor. For a long time, it was the default editor on *nix machines. It is a two-mode editor: Text entry and command entry are distinct from each other. This makes most ed commands accessible through single keystrokes, and removes the need for modifier keys. ed is line-oriented: All commands work on single lines or collections of lines. This means that ed can work on a printing terminal such as a teletype, a necessity in the days before computers had screens attached to them. Obviously, ed is the ancestor of vi. vi’s name is short for visual: it began as the visual, full-screen mode of ex, itself an extended descendant of ed. vi retains the two-mode scheme, and most vi commands are still single keystrokes. vi, reflecting ed, has little of the macrology Emacs is heir to: Macros are possible, but they tend to be less elaborate and are certainly not in anything as complex as elisp.
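The two-mode, line-oriented scheme is easy to model. The sketch below is a hypothetical `MiniEd` class that imitates a few ed commands (`a`, `.`, `d`, `,p`, and a bare line number); it is an approximation for illustration and does not claim to match real ed in every detail.

```python
class MiniEd:
    """A toy two-mode line editor: command mode and text-entry mode."""

    def __init__(self):
        self.lines = []
        self.current = 0        # 1-based current line; 0 means an empty buffer
        self.inserting = False  # True while in text-entry mode
        self.output = []        # what would be printed on the terminal

    def feed(self, line: str) -> None:
        if self.inserting:
            if line == ".":                 # a lone dot returns to command mode
                self.inserting = False
            else:                           # anything else is text, even "d" or "a"
                self.lines.insert(self.current, line)
                self.current += 1
        elif line == "a":                   # append after the current line
            self.inserting = True
        elif line == ",p":                  # print the whole buffer
            self.output.extend(self.lines)
        elif line == "d" and self.current > 0:
            del self.lines[self.current - 1]
            self.current = min(self.current, len(self.lines))
        elif line.isdigit() and 1 <= int(line) <= len(self.lines):
            self.current = int(line)        # go to that line and print it
            self.output.append(self.lines[self.current - 1])
        else:
            self.output.append("?")         # terse error reply


ed = MiniEd()
for cmd in ["a", "hello", "world", ".", "1", "d", ",p"]:
    ed.feed(cmd)
print(ed.output)  # ['hello', 'world']
```

The same input line means different things depending on the mode: `d` typed in command mode deletes a line, while `d` typed after `a` is just a line of text.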
[li]Compiler: *nix machines have always had good C compilers, for the simple reason that UNIX mainly spread thanks to C’s popularity. UNIX was written almost entirely in C in an age when most OSes were written in assembly, making UNIX portable while others were confined to a single hardware platform. C was developed on an early version of UNIX written in PDP-11 assembly, so C is sometimes accused of being too tightly bound to the rather complex assembly language of the PDP-11. (The PDP-11 had the most programmer-friendly assembly language in the world, to hear CISC people tell it. It was made before CISC chipmakers went gonzo in bridging the ‘semantic gap’ and implementing a separate opcode for every possible programmer need. The RISC revolution happened soon after that, and brought with it the dreamy MIPS R6000. :))[/li]
Nearly every *nix machine has access to at least one good compiler: gcc. gcc isn’t just the GNU C Compiler, although that was its original name. gcc is now more properly called the GNU Compiler Collection, for the simple reason that it now compiles more than just C. gcc is an interesting compiler in that it isn’t one entity: It’s a machine-specific backend married to a language-specific frontend. There are gcc frontends for C, C++, Objective-C (the other object-oriented C used in the NeXT world and, now, Mac OS X), FORTRAN 77, and probably some others I don’t know about yet.
[li]Debugger: gcc is very good, but it can’t help PBCK (Problem Between Chair and Keyboard). That’s why Stallman invented gdb, the GNU Debugger. gdb can slice, dice, and renice (well, maybe not the last one :)) your program to your heart’s desire. Basically, it runs the program under its own control. You can set breakpoints, step through the code line by line, check the values of variables, and inspect the call stack, all from a terse, fast command line. gdb can help save the most hopeless programs, and is a thing of beauty in the hands of an experienced person.[/li][/ul]
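The breakpoint-and-inspect idea can be imitated in miniature. To be clear, this is not how gdb is implemented (gdb works on compiled programs at the machine level); it is a sketch that uses Python’s own `sys.settrace` hook to run a function under supervision. `run_with_breakpoint` and `total` are hypothetical names invented for the example.

```python
import sys


def run_with_breakpoint(func, args, break_offset, watch):
    """Run func(*args); whenever execution reaches the line `break_offset`
    lines below the `def`, record the watched local variables."""
    target_line = func.__code__.co_firstlineno + break_offset
    hits = []

    def tracer(frame, event, arg):
        if frame.f_code is not func.__code__:
            return None                      # ignore every other function
        if event == "line" and frame.f_lineno == target_line:
            # "Breakpoint" reached: snapshot the variables before the line runs.
            hits.append({name: frame.f_locals[name]
                         for name in watch if name in frame.f_locals})
        return tracer                        # keep tracing line by line

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        result = func(*args)
    finally:
        sys.settrace(previous)               # always restore the old hook
    return result, hits


def total(n):
    acc = 0
    for i in range(n):
        acc += i
    return acc


# Break on `acc += i` (three lines below `def total`) and watch i and acc.
result, hits = run_with_breakpoint(total, (3,), break_offset=3, watch=("i", "acc"))
print(result, hits)
```

Each entry in `hits` is the state just before the watched line executes, which is roughly what you see when gdb stops at a breakpoint and you ask it to print a variable.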
Well, now y’all know what goes through my head when I’m bored…