1950s computers: question #1

In 1953 it seems the IBM 701 could add/subtract two ten-digit decimal numbers in 60 microseconds, so the magazine reported it could do 16666 additions per second. Which isn’t much help to people like me who wonder what such a computer could do, and how long it took to do it. Hoping to find out more about that.

The 701 was aimed at people that needed the 1953 equivalent of a 1980 scientific calculator. Aside from the fact that the latter was vastly easier to use, I wonder how a 701 compared in speed with, say, an HP41CX on a given problem, once the 701 had been programmed.

For starters: if a calculation involves trig functions, how did the 701 get them? Had to calculate them, didn’t it? No way they could store a table of ten-digit trig functions, except on tape, where the lookup time would be prohibitive. So where did it store the program to calculate them – on the drum? How much of the drum storage would such a program fill?

Or maybe it stored a table of something like five-digit trig functions, on the assumption that that much accuracy would be enough for a worthwhile fraction of the world’s problems? If you need ten digits, then get to work programming.

The common prediction is what’s called Moore’s Law. It says that computers roughly double in speed and memory every two years (i.e. comparing the machines built in a given year with those built two years later). Others have said that it’s more like one and a half times over two years. Still others have said that it’s more like two and a half times over two years. It’s been 72 years since 1953, which is 36 two-year periods, so by the doubling rule computers have increased in speed and memory by about 2 to the 36th power, which is roughly 7 followed by 10 zeros. Those who think it’s slower than that get a factor of about 22 followed by 5 zeros, while those who think it’s faster get about 21 followed by 13 zeros. (I hope I did the calculations right.)
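For anyone who wants to check the arithmetic, here is a quick back-of-the-envelope calculation in Python (36 two-year periods, using the 1.5x, 2x, and 2.5x-per-two-years figures above):

    # 72 years since 1953 = 36 two-year doubling periods
    periods = 72 / 2

    for factor in (1.5, 2.0, 2.5):
        growth = factor ** periods
        print(f"{factor}x every two years -> overall factor of about {growth:.1e}")

    # 1.5x -> ~2.2e6, 2.0x -> ~6.9e10, 2.5x -> ~2.1e14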

There are efficient algorithms for calculating basic trigonometric and hyperbolic functions by rotations (and others via similar approaches):
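Not that the 701 worked this way, but for illustration, here is a minimal rotation-mode CORDIC sketch in Python: sine and cosine from nothing but shifts, adds/subtracts, and a small table of arctangents.

    import math

    # Precomputed arctan(2^-i) table and the accumulated CORDIC gain
    N = 32
    ANGLES = [math.atan(2.0 ** -i) for i in range(N)]
    K = 1.0
    for i in range(N):
        K /= math.sqrt(1.0 + 2.0 ** (-2 * i))   # K ~ 0.6073

    def cordic_sin_cos(theta):
        """Rotation mode: drive the residual angle to zero.
        Converges for |theta| up to about 1.74 radians."""
        x, y, z = K, 0.0, theta                 # start pre-scaled by K
        for i in range(N):
            d = 1.0 if z >= 0 else -1.0
            x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
            z -= d * ANGLES[i]
        return y, x                             # (sin(theta), cos(theta))

    print(cordic_sin_cos(0.5))                  # ~ (0.4794, 0.8776)
    print(math.sin(0.5), math.cos(0.5))

Each iteration is just a shift plus an add or subtract per coordinate, which is why it maps so naturally onto simple (or heavily pipelined) hardware.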

Although people frequently (and incorrectly) try to ascribe Moore’s Law to all kinds of technological development trends both within and without computing (and Niklaus Wirth used it, somewhat in jest, as an inverse to the efficiency of software, given performance-enabled ‘bloat’), what has become termed “Moore’s Law” was an observation by Gordon Moore that the number of transistors in an integrated circuit doubles about every two years. While that does very roughly correspond to “an increase in speed and memory”, he was really making an observation about the scale of manufacturing process improvements for integrated circuits, something that was in question because people assumed an electrical or mechanical lower limit on the resolution and reliability of ICs at smaller scales. In fact it has more or less held true right down to near the thermodynamic physical limit.

Stranger

Ninja’d on CORDIC. Need to type faster.

The CORDIC Wikipedia page mentions another shift/add/subtract method out of IBM. Method described here (sorry, pdf):

Since the paper is from IBM it’s possible the method was used on IBM computers prior to the publication date of 1962.

Aside: I’ve designed CORDIC hardware for applications requiring a lot of trig calculations, such as backprojection in CT machines (lots of arctangent calculations) and digital downconverters (simultaneous calculation of A sin(wt) and A cos(wt)) in HD radio chips. The algorithm is exceptionally well suited to getting blazing speeds via pipelining.
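For the arctangent side of that, CORDIC runs in "vectoring" mode: instead of driving the angle to zero, it drives y to zero and accumulates the rotations. A rough floating-point sketch in Python (real hardware would be fixed point, pipelined, and would fold in the other quadrants first):

    import math

    N = 32
    ANGLES = [math.atan(2.0 ** -i) for i in range(N)]

    def cordic_atan(y, x):
        """Vectoring mode: rotate (x, y) onto the positive x-axis,
        summing the rotation angles. Assumes x > 0."""
        z = 0.0
        for i in range(N):
            d = -1.0 if y >= 0 else 1.0
            x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
            z -= d * ANGLES[i]
        return z        # ~ atan(y0/x0); x now holds the magnitude times ~1.647

    print(cordic_atan(1.0, 2.0), math.atan2(1.0, 2.0))   # both ~ 0.4636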

My dad told me about a computer he worked on in the late 50s. In lieu of RAM, it had a magnetic drum and a row of heads, and the drum was broken into cells. Programming consisted of a giant paper spreadsheet into which you put each operation. Each operation included the cells of the two operands, where the result was to be written, and the cell address of the next instruction. Part of the fun of programming was working out the rotational delay of the drum versus how long the operation took, so as to position the next instruction just far enough along the drum that it did not have to wait too long for that cell to roll around to one of the heads.

I don’t recall the model number, but the rotational speed was obviously a big deal. Still, it was faster than tape. RAM was virtually non-existent and expensive; for bigger computers, core memory was wired by hand, three wires for each bit.

Somewhere in my basement I have a copy of the first microcomputer plans published by Radio Electronics, using an 8008 processor. It included an add-on board for 256 bytes of RAM built from flip-flops. By 1973 you could, IIRC, buy an IC containing eight flip-flops.

I recall reading about one of the first vacuum tube computers in the late 1940s: grad students kept a shopping cart full of vacuum tubes and would go back and forth replacing any tubes they found burned out, and calculations were run three times (taking the average) because of the risk of dropped bits.

Since most floating point operations before math co-processor chips involved math on segments of the floating point number, a simple floating point operation could take a while.

(I.e. a float number might be laid out as first byte-second byte-third byte-fourth byte, with the first byte being the exponent and the rest the mantissa, though real formats were more complex than that.) Most computers had an arithmetic module that could receive two numbers (bytes) and do math on them, add or multiply. Also note that “byte” meaning 8 bits is a fairly recent standard.

So if we represent a floating point number as, say, 3 digits, multiplying 123 x 456 means we multiply 1x4, 1x5, 1x6, then 2x4, 2x5, 2x6, and so on, allowing for carries, and then add the partial products with offsets, much like grade 4 multiplication. Then, for floating point, reconcile the exponents. For addition you need to offset one operand with respect to the other based on the exponents. After each full float calculation, the number has to be normalized: scientific notation requires the decimal point right after the first significant digit, so shift to get rid of leading zeroes and adjust the exponent. Math modules that could expedite floating point operations, 16 or more bits at a time, were valuable but required complex logic.
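To make that concrete, here is a toy three-digit decimal multiply in Python, doing the grade-school digit products and the normalization explicitly (purely illustrative, not any particular machine's format):

    def multiply_digits(a_digits, b_digits):
        """Grade-school multiply of two digit lists (most significant first),
        the way a narrow arithmetic unit would: one digit product at a time,
        propagating carries."""
        result = [0] * (len(a_digits) + len(b_digits))
        for i, a in enumerate(reversed(a_digits)):
            carry = 0
            for j, b in enumerate(reversed(b_digits)):
                total = result[-(i + j + 1)] + a * b + carry
                result[-(i + j + 1)] = total % 10
                carry = total // 10
            result[-(i + len(b_digits) + 1)] += carry
        return result

    def fp_multiply(m1, e1, m2, e2):
        """Toy decimal float: value = digits * 10**exp, 3-digit mantissas."""
        digits = multiply_digits(m1, m2)
        exp = e1 + e2
        while digits and digits[0] == 0:          # strip leading zeroes
            digits.pop(0)
        exp += len(digits) - 3                    # keep 3 significant digits
        return digits[:3], exp                    # low digits are truncated

    # 123 * 0.456 = 56.088
    print(fp_multiply([1, 2, 3], 0, [4, 5, 6], -3))   # ([5, 6, 0], -1) = 56.0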

There’s a whole field of computer math devoted to designing algorithms that keep rounding errors from creeping in and becoming significant. But you can see that one simple math operation on floating point (science’s favourite) involves a large number of byte-by-byte and logic operations. Plus, with the variety of technology in the 50s, there’s no simple answer as to speed.
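A classic example from that field is compensated (Kahan) summation, which carries along an estimate of the low-order bits each addition throws away. A short Python sketch:

    def kahan_sum(values):
        """Compensated summation: recover the roundoff lost at each step."""
        total = 0.0
        compensation = 0.0                       # running error estimate
        for v in values:
            y = v - compensation
            t = total + y                        # low-order part of y is lost here
            compensation = (t - total) - y       # ...and recovered here
            total = t
        return total

    values = [0.1] * 10_000_000
    print(sum(values))         # naive: roundoff accumulates, lands a bit short of 1,000,000
    print(kahan_sum(values))   # compensated: 1000000.0 (or within a bit of it)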

(IBM’s 360 series, for example, had decimal arithmetic in which each byte held 2 decimal digits, and an arbitrary-length string of bytes could be used to do decimal math - ideal for business calculations, avoiding those pesky decimal-to-binary rounding errors that could lose a few cents in translation.)
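The packing itself is easy to sketch; this toy Python version stores two digits per byte, though the real S/360 packed-decimal format also carried a sign nibble and did its arithmetic directly on the packed bytes:

    def pack_decimal(number_str):
        """Pack a decimal digit string two digits per byte (high/low nibbles)."""
        if len(number_str) % 2:
            number_str = "0" + number_str
        return bytes(
            (int(number_str[i]) << 4) | int(number_str[i + 1])
            for i in range(0, len(number_str), 2)
        )

    def unpack_decimal(packed):
        return "".join(f"{b >> 4}{b & 0x0F}" for b in packed)

    def add_packed(a, b):
        """Exact decimal addition (Python's arbitrary-precision ints stand in
        for the byte-at-a-time carry logic the hardware would use)."""
        return str(int(unpack_decimal(a)) + int(unpack_decimal(b)))

    a = pack_decimal("1234567890123456")    # 16 digits in 8 bytes
    b = pack_decimal("0000000000000001")
    print(a.hex())                          # 1234567890123456
    print(add_packed(a, b))                 # 1234567890123457, no binary rounding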

Binary floating point (the only kind that exists these days) uses a neat trick: since the first bit of a normalized mantissa is always a 1, you don’t have to actually store it. So you get a free bit out of it.
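You can see the hidden bit by pulling an IEEE 754 double apart, e.g. in Python:

    import struct

    def decompose(x):
        """Split an IEEE 754 double into sign, unbiased exponent, and mantissa."""
        (bits,) = struct.unpack(">Q", struct.pack(">d", x))
        sign = bits >> 63
        exponent = (bits >> 52) & 0x7FF         # stored with a bias of 1023
        fraction = bits & ((1 << 52) - 1)       # only 52 bits are stored
        # For normal numbers the mantissa is 1.fraction: the leading 1 is
        # implied, so you get 53 bits of precision out of 52 stored bits.
        mantissa = (1 << 52) | fraction
        return sign, exponent - 1023, mantissa

    sign, exp, mant = decompose(6.0)
    print(sign, exp, bin(mant)[:6])   # 0 2 0b1100  ->  6.0 = +1.1 (binary) * 2**2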

The x87 architecture attempted to reduce floating point error by always doing math with 80 bits internally. You could load some values into registers, do some math on them in high precision, and then write out the lower-precision values.

Mostly it was a bad idea, because you couldn’t predict which values would stay in the registers, and thus the extra precision wasn’t dependable (unless you stored all values in the extended precision or you wrote the assembly by hand). I think the feature was usually turned off.

Now, with vector instruction sets like SSE, you only ever get a max of 64 bit precision. Which is good enough most of the time (and if not, you can always emulate higher precision).

Because this is the Dope, I do have to make a correction here. I was literally just helping a budding programmer and we encountered the decimal type in C-sharp, which is in fact floating point decimal, not binary.

A fair nitpick. Binary floating point being the only kind natively implemented in hardware today is still accurate, I think–though who knows, maybe there’s some crazy implementation out there. Probably someone did it with an FPGA somewhere.

A while back I implemented an arbitrary precision library using base 256. Basically the same as binary except I couldn’t use the leading 1 trick. A little easier to handle adding two numbers since it didn’t require any bitshifting across bytes.
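Not the poster's actual library, obviously, but the addition loop in a base-256 scheme looks something like this:

    def add_base256(a, b):
        """Add two magnitudes stored as little-endian lists of base-256
        'digits' (one byte each). No bit shifting across byte boundaries,
        just a byte add and a carry at each step."""
        result, carry = [], 0
        for i in range(max(len(a), len(b))):
            total = (a[i] if i < len(a) else 0) + (b[i] if i < len(b) else 0) + carry
            result.append(total & 0xFF)
            carry = total >> 8
        if carry:
            result.append(carry)
        return result

    x = [0xFF, 0xFF, 0x01]      # little-endian bytes of 0x01FFFF = 131071
    y = [0x01]
    print(add_base256(x, y))    # [0, 0, 2] = 0x020000 = 131072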

The decimal type seems like a weird hybrid thing, actually. The mantissa is still in binary, but the exponent is in decimal space. Basically an integer (encoded in binary) that you can multiply/divide by some power of 10.
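Python's decimal module is built on the same scaled-integer idea (a coefficient times a power of ten), which makes the contrast with binary floats easy to demonstrate:

    from decimal import Decimal

    # Binary floats cannot represent 0.1 exactly, so the error leaks out:
    print(0.1 + 0.2)                        # 0.30000000000000004

    # A decimal float is (sign, coefficient, exponent), with the value being
    # coefficient * 10**exponent, so 0.1 and 0.2 (and their sum) are exact:
    print(Decimal("0.1") + Decimal("0.2"))  # 0.3
    print(Decimal("0.3").as_tuple())        # sign=0, digits=(3,), exponent=-1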

I think that the HP calculators did all of their calculations natively in a form of binary-coded decimal. Though that’s a long stretch for “nowadays”.

The 6502 processor also had a BCD (binary coded decimal) mode, but I don’t think it did any floating point work with it.

Addendum to that: It looks like the TI-84, which certainly qualifies as “nowadays” given that nearly every high schooler has one, also uses BCD internally for floats. Though unsurprisingly, it uses the naive representation, not the overly-complicated one that HP used (HP engineers loved re-inventing the wheel).

That may have been the IBM 650. IBM offered a core memory option for it but it was very small, and main memory was indeed a rotating drum. But at some point in its evolution the optimal placement of instructions on the drum became an automated function of the assembler – later on, that sort of function would come to be the job of a program called the “loader”, but back then, the assembler did all the work.

The IBM 650 assembler was called “SOAP”, for “Symbolic Optimizing Assembly Program”, and the “optimizing” part referred to the fact that when it translated the symbolic instructions into machine code, it placed them on the rotating drum in positions that accounted for their execution time, so that when the instruction finished, the next one would be just coming up on the drum.
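The arithmetic behind that placement is simple enough to sketch. A toy Python model, with made-up word counts and instruction timings rather than real 650 figures:

    # Toy model of optimized drum placement: put the next instruction at the
    # cell that rotates under the head just as the current one finishes.
    WORDS_PER_TRACK = 50        # cells passing the head per revolution (made up)

    def best_next_address(current_addr, execute_words):
        """If an instruction takes `execute_words` word-times to execute
        (plus one word-time to read it), the ideal home for its successor
        is that many cells further around the drum."""
        return (current_addr + execute_words + 1) % WORDS_PER_TRACK

    addr = 0
    for op_time in (2, 5, 1, 7):            # fake execution times, in word-times
        nxt = best_next_address(addr, op_time)
        print(f"instruction at cell {addr:2d}, runs {op_time} word-times -> next at cell {nxt:2d}")
        addr = nxt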

It is important to realise how tight the design of early computers was. Up until really the mid 70s, computers were designed for their commercial role. IBM machines lived up to the name of the company: International Business Machines. Upstarts like CDC and then Cray wiped the floor with IBM big iron for scientific purposes. Back in the 50s, machine designs were so tight that even the width of the data in bits was trimmed to suit. Every machine had its own instruction set architecture and usually highly bespoke support infrastructure.

One of my favourite examples is the first Naval Tactical Data System computer, the AN/USQ-17, designed by a young Seymour Cray just before he left UNIVAC for the newly founded Control Data Corporation. It was 30 bits wide, built from discrete transistors, and required to run reliably shipboard. Its successors are still with us. The word width was partly driven by limitations on power and space for the hardware. But it provided a real-time shipboard combat management system integrating radar and weapons control.

But for the OP. An IBM 701 likely never computed a sine function in its lifetime. Its entire use case was directed at arithmetic that likely never got more complicated than simple interest.

The design of the processors could be surprisingly simple as well. Bit-serial evaluation of operands works fine, and your ALU is only one or two bits wide. It takes a while to complete an operation, but your computer can be vastly cheaper, smaller, and lower power. If every bit is made from hand-soldered discrete transistors, this can be the difference between a viable design and a pipe dream.
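A bit-serial adder really is just one full adder clocked once per bit. In Python-as-pseudo-hardware:

    def serial_add(a, b, width=8):
        """Add two integers the way a bit-serial ALU does: a single one-bit
        full adder, applied least significant bit first, once per clock."""
        result, carry = 0, 0
        for i in range(width):
            bit_a = (a >> i) & 1
            bit_b = (b >> i) & 1
            s = bit_a ^ bit_b ^ carry                              # sum bit
            carry = (bit_a & bit_b) | (carry & (bit_a ^ bit_b))    # carry out
            result |= s << i
        return result       # the leftover carry would be the overflow flag

    print(serial_add(200, 100))        # 8-bit word: (200 + 100) mod 256 = 44
    print(serial_add(200, 100, 16))    # 16-bit word: 300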

Computation in BCD was important. It provides a precise definition of how integers and fixed point fractions behave. Not always in a good way. But if you had a bit of COBOL that read

Date PIC 999999

You had some very specific (and unfortunate) expectations about how things were going to work. Back when I was an undergraduate I took a subject in COBOL. The final assignment was to write a simple library management package. One of the things one needed was code to manage dates, and to do things like calculate due dates and overdue fines. So I actually wrote a date handling system in COBOL with PIC 99 fields expressing dates. I very much doubt my code was Y2K safe. But it did work.

We tend to focus on modern high performance computer architectures: Arm, x86, RISC-V, SPARC. But our world is filled with tiny low-power, low-performance devices, architectures like the PIC. They run the insides of all manner of cheap and ubiquitous electronic devices. They are usually cheaper than a bespoke logic design, and come in packages so small you need a magnifying glass. Sometimes all they do is monitor a switch and turn on an LED and maybe a transistor. Others run the insides of things like brushless motors in power tools. These are the underappreciated computers filling our lives.

The very earliest IBM mainframes like the 650 certainly were intended for commercial purposes. But the IBM 7000 series – the 7040, 7044, 7090, and 7094 – was absolutely targeted at the scientific and academic marketplace, dating back to the late 50s.

And IBM even offered very attractive discounts to major universities, the idea being that students who regarded “computer” as synonymous with “IBM” would have their decision-making so directed when they someday headed up a major corporation.

A dropped bit in a calculation could easily give you a result that is wildly off (potentially by orders of magnitude), so “taking the average” makes no sense to me.

It seems instead that you would want to run the calculations three times, and use the result in which at least two of the three results match—because it’s unlikely that a dropped bit or other random error would occur exactly the same way more than once.
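In software terms that's a majority vote rather than an average; a trivial Python sketch:

    def vote(run1, run2, run3):
        """Accept a result only if at least two of the three runs agree."""
        if run1 == run2 or run1 == run3:
            return run1
        if run2 == run3:
            return run2
        raise RuntimeError("all three runs disagree, rerun the job")

    print(vote(42, 42, 42))                 # 42
    print(vote(42, 42 + 2**40, 42))         # a flipped bit in run 2 is outvoted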

The IBM 701 cited in the OP used an even earlier form of RAM, the Williams tube:

Essentially storing bits in the afterglow of the phosphor in a CRT.

According to Youtube, the 701 was intended to be the Defense Calculator; it was used by Boeing, Douglas, Lockheed, Northrop, Convair, Los Alamos, Livermore, etc. The 704 was the business counterpart.

Since we’re talking about weird early technology, here’s another early memory device: acoustic delay lines with magnetostrictive transducers. Pulses were sent down a wire long enough that they arrived at the far end with a useful delay, were read there, and were then fed back into the coil again; thus data was stored on the wire. I have a vintage Friden EC-132 calculator from 1964, with a CRT display and coiled memory wire; it can even calculate square roots. Remarkably, mine still works.

Friden EC-132

Absolutely wrong. The IBM 704 was the first mass-produced computer with floating point hardware for scientific computations; it was the basis for the development of the science-oriented programming languages FORTRAN and LISP, and the precursor of IBM’s solid-state line of scientific computers, the 7040/7044 and 7090/7094, which were mainstays of the scientific community at the time. I’m old enough to have personally used both a 7040 and a 7044, both installed at university computing centers. A modified 7094 at MIT was the basis of the world’s first timesharing system.

You’re right, of course. The 702 was what I was thinking of?