If what little I know about the binary system is correct, then 01000001 is equal to 65. While looking through a book today I saw two references to uppercase “A” as being 01000001 in binary code. If both statements are true, how does the computer differentiate between the two?
Inside the computer the letter “A” is represented by the ASCII code 01000001, which happens to equate to 65. When the computer wants to display “65” as text, it uses the ASCII codes for “6” (00110110) and “5” (00110101).
The computer program knows whether it should be treating the byte 01000001 as a number or the letter A by where it is used. If it is expecting ASCII text, then it must be an A. If it is expecting a number, then it must be 65. If it was expecting a date, then it is in trouble.
Giving the wrong type of data to a program that is expecting something else is the source of many bugs.
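If it helps to see that in action, here’s a tiny C sketch of the idea (just an illustration, assuming an ASCII machine):

#include <stdio.h>

int main(void) {
    unsigned char b = 0x41;                 /* the bit pattern 01000001 */

    printf("%c\n", b);                      /* treated as ASCII text: A  */
    printf("%d\n", b);                      /* treated as a number:   65 */

    char text[] = "65";                     /* the TEXT "65" is two separate bytes */
    printf("%d %d\n", text[0], text[1]);    /* 54 53 -- the ASCII codes for '6' and '5' */
    return 0;
}

Same byte, two completely different meanings, and the only thing that changed was what we asked for.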
Actually, no. The computer only differentiates between the two when it comes time to display them, or do an operation with them. That is, if there is a variable with a value of 01000001, and you ask the computer to display that as ascii, you get ‘A’; if you ask for it in decimal, you get ‘65’. If you should want it in hexadecimal, you’d get 81, and so on.
If you look at a byte of memory, there’s no way to tell what it means. There’s no differentiation between letters, numbers, or parts of programs. It’s all about the interpretation.
Shouldn’t that be 41? 01000001 = 65 decimal = 41 hex
Yes, in hex it is 41. In octal, it is 101. In machine code, it is…who knows, depends upon the machine. Whatever it is, if you try to run it, you may be executing your data.
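If you have a C compiler handy, you can watch the same byte come out four different ways (a rough sketch, nothing fancy):

#include <stdio.h>

int main(void) {
    unsigned char b = 0x41;          /* 01000001 again */

    printf("decimal: %d\n", b);      /* 65  */
    printf("hex:     %x\n", b);      /* 41  */
    printf("octal:   %o\n", b);      /* 101 */
    printf("ASCII:   %c\n", b);      /* A   */
    return 0;
}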
it’s stored somewhere…
you say “hey computer, give me a letter!”
it says “I got a 01000001, that’s an A”
then you say “hey computer, give me a number”
it says “I got a 01000001, that’s a 65”
you ask it for something, it makes up what it should be as it goes along
Data is all stored in computers as patterns of magnetic charge: either a charge is there or it isn’t (or either a current can flow through that area of a ROM card or it can’t). That is commonly represented mathematically as ones (it’s there) and zeroes (it’s snot :)). That is because the logic of computers is theoretically described and mathematically reasoned about as Boolean algebra (a subset of algebra dealing entirely with statements which are either true (1) or false (0)). The computer knows nothing about numbers or text: All it sees are charges or the lack of charges.
Programs impose a certain interpretation on these patterns, like you impose a certain interpretation on the letters of a book when you read. If the computer is told to read a certain portion of its memory or disk storage and display the results as text, it will do just that, based on the rules it has for reading charge-patterns as text (which nowadays is commonly either the ASCII (American Standard Code for Information Interchange) rules or the ASCII-like but much larger Unicode rules, though at one time EBCDIC (Extended Binary Coded Decimal Interchange Code) was somewhat common, as it came from IBM (International Business Machines) (aren’t abbreviations (and parentheses) fun? :))). What the computer feeds you is your problem: if you ask it to convert to text charge-patterns that weren’t intended to be read as text, you’ll get gibberish. GIGO (Garbage In, Garbage Out).
How machines store the patterns is a different matter entirely: some are big-endian, some are little-endian, and some odd ducks are middle-endian. Ask if you want me to expand on those terms.
(ASCII is pronounced ‘ass-key’. EBCDIC is commonly pronounced ‘ebbs-dick’. CS nerds have to have some fun.)
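If you want to see GIGO for yourself, here’s a rough C sketch: it takes the eight bytes of a floating-point number, which were never meant to be text, and displays them as characters (unprintable ones shown as dots).

#include <stdio.h>
#include <string.h>
#include <ctype.h>

int main(void) {
    double pi = 3.14159265358979;       /* eight bytes that were never meant to be text */
    unsigned char bytes[sizeof pi];
    memcpy(bytes, &pi, sizeof pi);      /* copy out the raw charge-pattern, so to speak */

    for (size_t i = 0; i < sizeof bytes; i++)
        putchar(isprint(bytes[i]) ? bytes[i] : '.');   /* unprintable bytes become '.' */
    putchar('\n');                      /* out comes gibberish, not "3.14159..." */
    return 0;
}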
Yes! Derleth discovers the long sought-after magnetic monopole!
Middle-endian?
Depends on the context; in xBase files, one of the date parameters in the header is the number of years since 1900 (or the number of years since the turn of the last century; this difference in interpretation leads to Y2K+1 incompatibilities between MS Access and dBase). (I know the fix method, in case anyone is interested…)
There are 10 types of people in this world; those who understand binary and those who do not.
The Jargon File: middle-endian
Simply means that the bytes of a long (double) word don’t start at either end, but follow a mixed-up pattern. The PDP-11 (not that I’ve ever worked with one) was middle-endian: it stored a 32-bit value as two 16-bit little-endian words, with the more significant word first.
One could argue, from the origin of the terms, that since raw eggs are only ever broken in the middle, ‘middle-endian’ is quite logical.
It’s apparently also used for the American style of writing the date (e.g. 1/31/03: month, day, year) compared to the ‘little-endian’ style (day, month, year) used elsewhere.
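If anyone wants to know which flavour their own machine is, here’s a rough C sketch; the output depends entirely on the machine you run it on:

#include <stdio.h>
#include <stdint.h>

int main(void) {
    uint32_t n = 0x01020304;                   /* one 32-bit value              */
    unsigned char *p = (unsigned char *)&n;    /* look at it one byte at a time */

    for (size_t i = 0; i < sizeof n; i++)
        printf("%02x ", p[i]);
    printf("\n");
    /* little-endian (e.g. a PC): 04 03 02 01
       big-endian:                01 02 03 04
       middle-endian (PDP-ish):   02 01 04 03 */
    return 0;
}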
Minor nitpick:
You are sort of describing RAM (Random Access Memory), not ROM (Read Only Memory). RAM is typically composed of rows of tiny capacitors, which store charge in electric fields. ROMs typically use a couple of transistors and some fancy semiconductor whiz-bangery which will make the cell either conduct or not.
Reading a RAM element of this sort actually destroys its data. The data has to be written back into the RAM as part of the read process. Also, since it’s just a capacitor, and capacitors leak, the data has to periodically be read and written back into the memory elements or else it will decay down to all zeros.
ROMs, on the other hand, have to be zapped (exposed to UV light in the case of a UV EPROM, or an electric field in the case of a flash memory) in order to be erased.
Your computer boots from ROM (the BIOS). Once the operating system starts, it’s pretty much only using RAM after that.
And yes we engineers are quite aware that a memory that you can erase and re-write to isn’t really a read only memory, but we’re sticking with the acronyms.
A PC can only differentiate between a byte (8 bits), a word (16 bits), a doubleword (32 bits) or a float (also 32 bits, but the bits take on a different meaning). Floats are fairly specialized (basically it’s the equivalent of writing something like 3.2 times 10 to the fourth power, but in binary instead of decimal), but the rest of the data types are just integer numbers. How they are interpreted (ascii text, video data, the first number of a launch sequence to start World War III) is completely up to the programmer. If you accidentally feed ascii text into a mathematical calculation, the computer will not know or care. It will just do the mathematical calculation you told it to do. In fact, if you make a really bad programming error, the computer can even misinterpret data as executable computer instructions (which will usually cause the computer, or at least that program, to crash).
If you want to figure out something using another type of data structure that doesn’t fit neatly into 8, 16, or 32 bits (like pi to the umpteen-millionth digit, or a picture of your grandmother), then it’s up to the programmer, not the computer, to use the limited set of instructions and data types that the computer can handle to create the desired result.
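Here’s a rough C sketch of the float-versus-integer point, assuming the 32-bit IEEE floats that PCs use; it reads the same 32 bits two ways:

#include <stdio.h>
#include <stdint.h>

int main(void) {
    union {
        uint32_t i;
        float    f;
    } u;                                   /* one set of 32 bits, two labels for it */

    u.i = 0x41000000;                      /* note the 0x41 (01000001) leading byte */
    printf("as an integer: %lu\n", (unsigned long)u.i);   /* 1090519040 */
    printf("as a float:    %f\n", u.f);    /* 8.000000 -- same bits, different meaning */
    return 0;
}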
“If you accidentally feed ascii text into a mathematical calculation, the computer will not know or care. It will just do the mathematical calculation you told it to do.”
Or indeed if you deliberately feed ascii into a math calc. For example, to make text all uppercase, you handle each character’s ascii number like this:
if 96 < x < 123 then x = x - 32
and then, to alphabetize a list, you sort on the first character of each entry in ascending numerical order, and so forth.
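For the curious, here’s that same trick as a small, compilable C sketch. (The standard library’s toupper() in <ctype.h> will do this for you portably, but this spells out the raw ascii arithmetic.)

#include <stdio.h>

int main(void) {
    char text[] = "Hello, world!";

    for (int i = 0; text[i] != '\0'; i++) {
        int x = (unsigned char)text[i];
        /* ascii lowercase letters are 97..122; subtracting 32 lands on 65..90, the uppercase letters */
        if (96 < x && x < 123)
            text[i] = (char)(x - 32);
    }
    printf("%s\n", text);                  /* prints: HELLO, WORLD! */
    return 0;
}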
It’s similar to the way the word “binary” can refer to computer data, double stars, or chemical weapons.
That’s the problem we would face if we received a binary transmission from outer space.
01000001 = a number, a symbol, the date of invasion?