Please explain binary to me. How can a 1 and a 0 = a command, or idea?

Computers are really just glorified state machines. They aren’t all that complex.

There are basically four states to a computer:

  1. Fetch
  2. Decode
  3. Execute
  4. Write Back

FETCH:

The first thing a computer does is fetch the instruction to execute. The instruction is going to be a fixed number of bits. One particular bit pattern may be an ADD instruction. Another bit pattern may be SUBTRACT. There are math functions, logic functions, and flow control functions. An example of a flow control function would be “if the negative bit is set, jump to the instruction at memory location XYZ”.

DECODE:

A lot of instructions need more data than just what is in the instruction. For example, ADD A + B means you now have to go get A and B. The computer will now do memory fetches to get the necessary data.

EXECUTE:

After the instruction has been fetched and decoded, all of the necessary numbers are now inside the CPU, so if the instruction is an ADD, a piece of the CPU called the arithmetic logic unit (ALU) will add them together.

WRITE BACK:

Once you’ve executed the instruction, you now have the answer sitting inside the CPU. Now you need to write the answer out somewhere, either to an internal CPU register or out to a memory location. Flow control operations write to a special register called the “program counter”, which is the register that tells the CPU where to fetch its next instruction from during the “fetch” state.

Once the write back is finished, the computer goes back into the fetch state, and the whole thing starts over. It just repeats these states over and over and over and over.
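
To make the loop concrete, here is a minimal sketch of it in Python. The instruction names, register names, and little program are all invented for illustration; no real CPU works on Python tuples, but the fetch / decode / execute / write back rhythm is the same.

```python
# Toy CPU: fetch / decode / execute / write back, repeated until HALT.
# Everything here is made up for illustration; it is not a real instruction set.

memory = [
    ("LOAD", "A", 100),    # A <- memory[100]
    ("LOAD", "B", 101),    # B <- memory[101]
    ("ADD",  "A", "B"),    # A <- A + B
    ("STORE", "A", 102),   # memory[102] <- A
    ("HALT",),
] + [0] * 95 + [7, 5, 0]   # data lives at addresses 100-102

registers = {"A": 0, "B": 0, "PC": 0}   # PC = program counter

while True:
    # FETCH: read the instruction the program counter points at
    instruction = memory[registers["PC"]]
    registers["PC"] += 1

    # DECODE: pull the operation and its operands apart
    op, *operands = instruction

    # EXECUTE and WRITE BACK
    if op == "LOAD":
        reg, addr = operands
        registers[reg] = memory[addr]
    elif op == "ADD":
        dst, src = operands
        registers[dst] = registers[dst] + registers[src]   # the "ALU"
    elif op == "STORE":
        reg, addr = operands
        memory[addr] = registers[reg]
    elif op == "HALT":
        break

print(memory[102])   # 12
```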

Actual computers are a bit more complicated than that, but that’s the basic idea. Modern computers do a thing called “pipelining” where they actually have dedicated hardware doing each of the four states, and they will process multiple instructions at once. For example, the first instruction is fetched, then while it is being decoded, a second instruction is fetched. While the first instruction is executed, the second is decoded, and a third is fetched, etc. Since your decode and fetch stages both want to access the same memory, caches are needed to stop them from stomping on each other. A Pentium CPU has a very large instruction cache, a dual execution pipeline (it will execute 2 instructions side by side) and a separate pipeline for floating point operations.
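
If it helps to picture the overlap, here is a tiny sketch (instruction names invented, timing idealized, no caches or stalls) that just prints which instruction is sitting in which pipeline stage on each clock tick:

```python
# Print which instruction occupies each pipeline stage on each tick.
# The instruction names are invented; this only illustrates the overlap.

stages = ["Fetch", "Decode", "Execute", "Write back"]
program = ["ADD", "SUB", "JMP", "AND", "OR"]

pipeline = [None] * len(stages)
for tick in range(len(program) + len(stages) - 1):
    # Everything shifts one stage along; a new instruction enters Fetch.
    pipeline = [program[tick] if tick < len(program) else None] + pipeline[:-1]
    slots = ", ".join(f"{stage}: {instr or '-'}"
                      for stage, instr in zip(stages, pipeline))
    print(f"tick {tick}:  {slots}")
```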

Here is a page that shows the instruction set for a PC:
http://www.arl.wustl.edu/~lockwood/class/ece291/class-resources/opcodes.html

If you look at it, you’ll see that “ADD” isn’t just one instruction. There are 8 different types of ADD instructions. “Reg” means register, which is one of the internal CPU registers. “Imm” means immediate, which means the number is stored as part of the instruction. “Add 2 to register AL” would be an example of an add immediate to register instruction. “Mem” means memory.
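
To give a feel for why one mnemonic fans out into several encodings, here is a toy sketch in Python where the kinds of operands pick which flavour of ADD gets emitted. The names are made up, not real x86 encodings, but the reg/imm/mem split is the same idea as in that table. (The real immediate-to-AL form is, if I recall right, just a one-byte opcode followed by the immediate byte, so “Add 2 to register AL” fits in two bytes.)

```python
# Toy assembler fragment: one "ADD" mnemonic, several encodings,
# chosen by what kind of operands it was given.  The encoding names
# are invented for illustration; they are NOT real x86 opcodes.

REGISTERS = {"AL", "AX", "BX"}

def encode_add(dest, src):
    if dest in REGISTERS and isinstance(src, int):
        return ("ADD_REG_IMM", dest, src)      # e.g. ADD AL, 2
    if dest in REGISTERS and src in REGISTERS:
        return ("ADD_REG_REG", dest, src)      # e.g. ADD AX, BX
    if dest in REGISTERS:
        return ("ADD_REG_MEM", dest, src)      # e.g. ADD AX, [address]
    return ("ADD_MEM_REG", dest, src)          # e.g. ADD [address], AX

print(encode_add("AL", 2))            # ('ADD_REG_IMM', 'AL', 2)
print(encode_add("AX", "BX"))         # ('ADD_REG_REG', 'AX', 'BX')
print(encode_add("AX", "[0x1234]"))   # ('ADD_REG_MEM', 'AX', '[0x1234]')
```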

For a much simpler processor, here is the instruction set for the old 6502 CPU (used in the Apple II and other computers of the time):
http://www2.asw.cz/~kubecj/aopc.htm

This is excellent ! Please keep it going, I am reading and re-reading.

And yes, my apologies- it was an article about Turing. Hence, it was a Turning Machine I was referring to up there.

-Cough- I even Previewed that. It should have read " Turing Machine ".

:smack:

As has been noted, the variety of the icons isn’t important - what matters is the way they are combined.

With 26 distinct symbols rather than 2, it’s quite reasonable to expect that a given idea (command, instruction, request, etc.) could be expressed with fewer symbols. But it doesn’t follow that few symbols restrict you to expressing simple ideas.

Well, I’d consider that to be cheating. Why? Simple: you’re speaking about an additional encoding mechanism which isn’t “obvious”. So, like someone already mentioned, on your “rod” you wouldn’t have the entire information; you’d have to add some kind of decoding information (meta-information).

… there’s a reason for all them there computer architectures…

well, to a certain extent, all context is arbitrary. If you try to measure the ‘decoding information’… how does the source machine know how to decode the decoding information? And how to decode the decode of the decoding information, etcetera??

The only possible answer is that we have established a convention about how certain ‘starting pieces of information’ should be interpreted, which can then be used to provide datatype information about other data, and so on, and so on. The ‘common context’ which is assumed to be known can be small or it can be great, but it must be known on both ends.

Either will work, theoretically, with the ‘notch on the rod’ system. I described a scenario where the context information was all completely known on both ends. Alternatively, it could be a schema where the first few numerical digits are used to ‘bootstrap’ the rest, as long as a language for reading the bootstrapping information is agreed on. You don’t need ‘a separate source’ of data for the meta-information. You just need to include MORE data.

If there is no format for meta-data or meta-meta-data or any higher step (or just for data) that is agreed upon, then ANY amount of information will be meaningless to a machine. It will probably be meaningless to us too, unless we can find a context pattern that matches something in our shared experience. I don’t think there’s any pattern for exchanging data that can be worked out from pure reason alone.

Agreed. However, I was objecting to the concept of “storing the library of congress with a single notch on a rod”. To me that is gross oversimplification.

However in “Contact” Carl Sagan seemed to imply that God embedded proof of his existence in one of the universal constants (pi, if I recall correctly).

Hijack concluded. Please carry on with the OP.

With today’s technology I don’t think you could reasonably encode a sentence, let alone a library, on the notch of a rod. It is a little complicated to come up with a good estimate for the number of possible English sentences, but there are roughly 20,000 words in the average person’s vocabulary. So a goodish lower bound for a 5-word sentence would be to say that for each word you can choose from about 1,000 words. That gives 1,000,000,000,000,000 possible sentences. So on a rod one meter long you need to be able to resolve down to fempto meters. A hydrogen atom is about 100,000 times that size.
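
Putting the same estimate into a few lines of Python (the hydrogen figure is just the usual rough one-angstrom diameter, so treat it as approximate):

```python
# Back-of-the-envelope version of the same estimate.

choices_per_word = 1_000            # rough lower bound per word position
words_per_sentence = 5
sentences = choices_per_word ** words_per_sentence
print(f"{sentences:,} possible sentences")       # 1,000,000,000,000,000

rod_length_m = 1.0
notch_resolution_m = rod_length_m / sentences    # 1e-15 m, i.e. one femtometer

hydrogen_diameter_m = 1e-10                      # assumed: roughly one angstrom
print(hydrogen_diameter_m / notch_resolution_m)  # ~100,000 times the needed precision
```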

Oh, I don’t think anybody’s arguing that it could be done ‘with today’s technology’ or even in this universe. It’s one of these odd little games that mathematicians play in an idealized theoretical world, nothing more. :slight_smile:

[Nitpick]There’s no ‘p’ in ‘femto’.[/nitpick]

I stand corrected. I went and checked the most authoritative site I know, Googlefight. Femto beat fempto 764,000 to 2,320.

But isn’t that true of any number base? You need some kind of “out-of-band” marker to indicate breaks between data groups? In written English we use a “space” character, which, by convention, doesn’t occur within a single word. When writing numbers, we use ‘space’ or ‘comma’ (to separate thousands).

Binary streams can use start sequences to determine the beginning and end of numbers or words. After all, when I send this message it will be ones and zeros only going to the SDMB.

Well, that’s where math and computer science part company.

Mathematically, you must have some kind of “out-of-band” value, because any other sequence of characters from the “alphabet” could equally well be data. However unlikely the sequence you choose, logically it could be data.

In computing, we take the more pragmatic route: we choose suitable “rogue” values that aren’t likely to occur by accident, or we dedicate one character from our “alphabet” to be a delimiter. An example of the latter is the use of ASCII NUL to delimit strings in “C”; we agree by convention that strings Shall Not Contain NULs and Bad Things happen if they do. By agreeing to this, we can reserve NUL for terminating strings.
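
Here is a small sketch of both conventions side by side, with Python bytes standing in for the C idea and byte 0 (NUL) playing the reserved delimiter:

```python
# Two ways to mark where one message ends and the next begins,
# using nothing but the symbols already in the stream.

# 1. A reserved delimiter: byte 0 (NUL) is promised never to appear
#    inside a message, so it can safely mark the boundaries.
stream = b"GET\x00IT\x00"
print(stream.split(b"\x00")[:-1])        # [b'GET', b'IT']

# 2. Fixed-length fields: no delimiter at all; the convention
#    "every message is exactly 4 bytes" does the framing.
stream = b"GET IT  "
print([stream[i:i + 4] for i in range(0, len(stream), 4)])   # [b'GET ', b'IT  ']
```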

This is interesting stuff, but I think we’re getting on a bit of a tangent from Cartooniverse’s OP so I’ll leave it there.

All of the talk of binary and other encoding mechanisms, as well as the fact that 00101011 means “jump to subroutine” (or whatever), is ignoring the base question of how it works.

Electronically (think solid state or even relays; not processors) one can construct logic circuits. A critical component of these gates is a transistor, which can pass current or not depending on whether the transistor is gated or not. In the most basic form, a supply voltage is on the supply side of the transistor. There’s nothing on the “output side” – the default output is +0VDC. On the “gate side,” there can be either +0VDC or +5VDC (example voltages only). If there’s +0VDC on the gate, the output remains at +0VDC. If you apply +5VDC to the gate, the output is +5VDC. In essence, you’ve built a simple AND circuit, because you require +5VDC on both the gate and the “supply” to the transistor.

Using multiple transistors, we can build an abstract device that we can call a logic gate, and this is what makes a computer microprocessor work. A single gate takes exactly two inputs and delivers a single output. An input can be defined logically as “high” or “low”; or “1” or “0”; or “on” or “off”; or +3.3VDC or +0VDC, or any two contrary states whatsoever. Because logic decisions are inherently binary and because we’re talking microprocessors, let’s stick to “1” and “0” even though our gate circuit is really using different voltages as inputs.

Logic gates come in four forms: AND, OR, XOR (exclusive disjunction), and NOT (oops; NOT only takes a single input and reverses it). NOT is often applied inline with one of the other gates to make NAND, NOR, and (I guess) XNOR. The inside of a microprocessor is nothing but a massive, huge, gargantuan, incredibly complex electrical circuit, just like the wiring in your house, comprised of millions and millions of nothing but these logic gates.
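
If it helps, here is the same idea in a few lines of Python: the basic gates on single bits, plus (to tie it back to the ALU mentioned earlier) a half adder that adds two bits using nothing more than an XOR and an AND. This is only a sketch of the logic, of course, not of the transistors.

```python
# Single-bit logic gates, plus a half adder built out of two of them.

def NOT(a):      return 1 - a
def AND(a, b):   return a & b
def OR(a, b):    return a | b
def XOR(a, b):   return a ^ b
def NAND(a, b):  return NOT(AND(a, b))   # NOT applied inline with AND, as above

def half_adder(a, b):
    # Adding two one-bit numbers: the sum bit is XOR, the carry bit is AND.
    return XOR(a, b), AND(a, b)

print(half_adder(1, 1))   # (0, 1)  ->  1 + 1 = binary 10
```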

At this point, we’re working with single bits of information, which is really, really constraining. It’s better to work in bytes (8 bits) or words (16 bits) or long words (32 bits) – these are conventional descriptions, although “long word” may be 64 bits on some systems, etc. Put these logic gates next to each other in parallel, and now you can work with a byte at a time (an 8-bit processor like the old Commodore 64). Put 16 or 32 or 64 bits in parallel, and you have a 16-, 32-, or 64-bit processor. You can see that in a single operation it’s a lot quicker to process 64 bits than 8.
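
A quick way to see the “parallel” part: one bitwise operation on an 8-bit value is really eight one-bit gates working side by side (and likewise 64 of them for a 64-bit value). A rough illustration, nothing more:

```python
# One bitwise AND on two 8-bit numbers is eight AND gates working in parallel.
a = 0b10110101
b = 0b11001100
print(format(a & b, "08b"))   # 10000100 -- each bit position is computed independently
```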

I’m having some trouble verbalizing the massive parallel array of circuits and how they work inside of a processor, i.e., the processor design (not the machine language). Anyone care to take a stab at it?

Yes- which is why when I tried to look at what scotandrson wrote, which was this:

I went looking for a space between every 5 numbers. Lacking that, to be honest, I did not try to decode it. ( I will now, because I suspect it’s his/her name but who can say till we decode it? )

00111 = G
00101 = E
10100 = T
00000 = space
01001 = I
10100 = T
11111 = ?
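
A machine would do exactly that by counting off fives. Here is a quick sketch, assuming the scheme above (00000 for the space, 00001 through 11010 for A through Z, anything else shown as ‘?’):

```python
# Chop a spaceless bit stream into groups of five and decode each group.
# Assumes 00000 = space, 00001-11010 = A-Z, anything else printed as '?'.

bits = "00111001011010000000010011010011111"

decoded = ""
for i in range(0, len(bits), 5):
    value = int(bits[i:i + 5], 2)              # e.g. "00111" -> 7
    if value == 0:
        decoded += " "
    elif 1 <= value <= 26:
        decoded += chr(ord("A") + value - 1)   # 1 -> A, 26 -> Z
    else:
        decoded += "?"

print(decoded)   # GET IT?
```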

Yes…lol. Thanks. But- tell us- why did you not use spaces between the sets of 5? We are visually cued animals. I believe that even Cuneiform used spaces between the triangular cuts in the wet clay. ( I am emailing an expert now to find out.)

A computing machine would count 5 easily enough and therefore decode quickly. I would WAG that were I to memorize the 5-number coding you presented up there, after a while I would be able to read it, but with greater ease if the numbers were broken up by spaces. Then again, perhaps not? Would my brain eventually group large blocks of 0’s and 1’s and decode as I scanned them? Hmmm.

When I dated a gal who was hearing impaired, her lip-reading abilities were around 60%. She still relied upon sign. I couldn’t sign, but I could spell. After a while, my brain wasn’t using much steam to make *each* *individual* *letter* - I simply spelled out entire words at a time without any thought at all.

Am I to believe that if you presented me with a binary translation of this sentence, and I spent enough time learning the 5-number coding, that I could read a string of numbers without spaces and decode this sentence? I struggle with that idea.

I swear, I am following this stuff. Just barely, but I am- and please, DarrenS, don’t regard any of these interesting tangents as undesirable hijacks- this thread by its very nature is going to go off on tangents. Ok?

He was giving an example of how to code using just two symbols. If he had used a space, like this:

001110010110100 010011010011111

then he would have been using three symbols: ‘0’, ‘1’, and ‘ ’ (the space).

See what I’m saying?

-FrL-

–Grumble-- Yeah. I do, actually.

If you’re looking for something that human brains can decode easily – you shouldn’t be looking at binary in the first place. That’s not binary’s strong suit. Putting spaces in between the blocks of binary digits every so often at regular intervals, simply as a human cue that this is where one logical unit ends and another begins, helps if you absolutely have to read binary. What helps a lot more would be converting the logical units into a more rich and varied alphabet of symbols - such as decoding that back into english, or using assembly-language mnemonics instead of machine language, etcetera.

Human brains tend to work very well with rich alphabets of symbols (somewhere between 15 and 400 basic symbols seems to be about right), and a lot of somewhat vague contextual information, which somehow we are good at turning into meaning. The electronic devices that we’ve built so far work best with just two basic symbols, high-current and low-current or the equivalent, repeated in different patterns millions or billions of times over. That’s just the best way we’ve found to build them.

There are a lot of similarities in the way information can be encoded in these relatively different ways. But they won’t be equally good for all purposes.

Does that help at all??

You don’t need it in the computer because there’s a fixed length for messages (32 bits on most modern machines). In unary, there’s no use for fixed length messages.