Attempting a full explanation - computers currently work using binary logic, where “signals” (or “voltages” or “currents” - pick whichever term you like; the distinction doesn’t matter for this explanation) are either on or off. (It’s entirely possible to use ternary logic; I read an article within the past two years advocating a switch to a three-valued logic system because it’s a more space-efficient encoding, but I can’t find the reference. It was an analysis using Shannon’s information theory, but I digress.)
Now, one bit doesn’t tell you all that much. Either on or off. To represent a greater range of values, we can group bits together. So, using 2 bits as a unit, we can represent 4 values - 00, 01, 10, and 11. Further increasing the number of bits in a “group” allows the expression of an even larger range. For instance, characters are commonly stored in 8-bit bytes, which yields 256 possible “values” (ASCII itself is actually a 7-bit code, defining 128 of them). As far as representations go, dealing with strings of 1s and 0s is not very human-friendly, so we tend to use more compact representations such as octal (groups of 3 bits, e.g. 000 = 0, 001 = 1, 010 = 2, 011 = 3, etc. up to 111 = 7) or hexadecimal (groups of 4 bits, e.g., 0000 = 0, …, 1001 = 9, 1010 = A, etc., up to 1111 = F).
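To make that concrete, here’s a quick sketch in Python (Python just for illustration - any language with the equivalent built-ins would do) showing how many values n bits can hold, and the same number written out in binary, octal, and hexadecimal:

```python
# Number of distinct values representable with n bits.
for n in (1, 2, 4, 8):
    print(n, "bits ->", 2**n, "values")   # 2, 4, 16, 256

x = 0b10110101             # an 8-bit value written in binary
print(bin(x))              # 0b10110101
print(oct(x))              # 0o265 (binary digits grouped in threes: 10 110 101)
print(hex(x))              # 0xb5  (binary digits grouped in fours: 1011 0101)
```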
Octal and hexadecimal are nice because they map cleanly onto powers of 2, which allows representations with a wider range of values to be easily converted back to the basic binary representation. Now, in a computer, information is carried between components over a group of wires called the “bus”. At the risk of oversimplifying or misstating the rationale, I’ll just say that for a general purpose computing device, it’s easier, more flexible, and more efficient to process information in groups of wires whose widths are powers of 2 (e.g., the Intel 4004 used 4 bits, the 8088 used 8 bits, the current crop of Intel-compatible processors use 32 bits, some current processors use 64 bits, and 64-bit personal computer processors are on the horizon).
The point of the above is that for “general purposes”, settling on a number of bits is a tradeoff. Custom hardware would use the minimum number of bits necessary to fully represent the information required. So, if one were to make a circuit for representing decimal digits (0 through 9), it would have to use 4 wires. Of course, this would “waste” some of the possible representations; the fourth bit, required to represent anything beyond 7, would only be used for 8 and 9, leaving six of the sixteen possible patterns unused. But if we’re not sure what data we’re going to have to deal with beforehand, we just supply a standard number of bits and use it as we see fit. There’s a tradeoff between space efficiency and the flexibility of the system.
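For the curious, here’s a minimal Python sketch of that decimal-digits-in-4-bits idea - this is essentially what’s known as binary-coded decimal, and the function name to_bcd is just my own label for it:

```python
def to_bcd(n):
    """Encode a non-negative integer, one decimal digit per 4-bit group."""
    return " ".join(format(int(d), "04b") for d in str(n))

print(to_bcd(493))   # 0100 1001 0011
# 4 bits give 16 patterns, but only 10 are used (1010-1111 are "wasted").
```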
Now, your question was about efficiency of representation. As others have pointed out, Huffman encoding can be used to determine an efficient representation of an arbitrary range of values such that any string of encoded values decodes uniquely - as Shalmanese points out, it is important to have a string of values be unambiguous. The way such codes achieve this is that no codeword is allowed to be a prefix of any other, so there is only ever one way to carve a stream of bits back into symbols.
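To illustrate the unambiguity point, here’s a toy Python sketch using a made-up prefix-free code (the particular codewords are just ones I picked for the example):

```python
# A made-up prefix-free code: no codeword is a prefix of another,
# so a concatenated bit string has exactly one possible decoding.
code = {"e": "0", "t": "10", "a": "110", "o": "111"}
decode = {bits: sym for sym, bits in code.items()}

def decode_bits(bits):
    out, buf = [], ""
    for b in bits:
        buf += b
        if buf in decode:        # only one codeword can ever match
            out.append(decode[buf])
            buf = ""
    return "".join(out)

encoded = "".join(code[c] for c in "tea")
print(encoded)               # 100110
print(decode_bits(encoded))  # tea
```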
However, the most efficient encoding of information is dependent on what is being represented. In the representation you present, the codes are assigned in alphabetical order. This isn’t the most efficient representation for English, since the letter “e” occurs more often than any other letter in a typical text, and you want to use the shortest representation for the letter that occurs most often. It’s best (that is, most efficient) to assign codes for the English language according to its occurrence distribution. (Of course, any particular text might show a different distribution - so its most efficient representation would use a different encoding.)
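If you want to see the frequency idea in action, here’s a rough Python sketch of Huffman coding driven by the symbol counts of whatever text you feed it (a sketch, not production code):

```python
import heapq
from collections import Counter

def huffman_code(text):
    """Build a Huffman code from the symbol frequencies in text."""
    freq = Counter(text)
    # Heap entries: (frequency, tiebreaker, {symbol: code-so-far}).
    # The tiebreaker keeps heapq from ever trying to compare the dicts.
    heap = [(f, i, {sym: ""}) for i, (sym, f) in enumerate(freq.items())]
    heapq.heapify(heap)
    tiebreak = len(heap)
    while len(heap) > 1:
        f1, _, lo = heapq.heappop(heap)   # the two least-frequent subtrees
        f2, _, hi = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in lo.items()}
        merged.update({s: "1" + c for s, c in hi.items()})
        heapq.heappush(heap, (f1 + f2, tiebreak, merged))
        tiebreak += 1
    return heap[0][2]

text = "the quick brown fox jumps over the lazy dog"
for sym, bits in sorted(huffman_code(text).items(), key=lambda kv: len(kv[1])):
    print(repr(sym), bits)
# The space and other frequent symbols come out with the shortest codes.
```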
LZW compression (explanation at http://dogma.net/markn/articles/lzw/lzw.htm) exploits this fact by building its encoding dynamically from the particular data being encoded. A truly nifty algorithm, if you don’t mind me saying so.
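The core of LZW fits in surprisingly few lines; here’s a rough Python sketch of the encoding side (a simplified version - real implementations also worry about code widths, table resets, and so on):

```python
def lzw_encode(data):
    """Minimal LZW encoder: the string table grows as the data is read."""
    table = {chr(i): i for i in range(256)}   # start with single characters
    next_code = 256
    out, w = [], ""
    for c in data:
        wc = w + c
        if wc in table:
            w = wc                    # keep extending the current match
        else:
            out.append(table[w])      # emit the longest string we knew
            table[wc] = next_code     # ...and remember the new, longer one
            next_code += 1
            w = c
    if w:
        out.append(table[w])
    return out

print(lzw_encode("TOBEORNOTTOBEORTOBEORNOT"))
# Repeats like "TOBEOR" collapse into single codes as the table learns them.
```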
Hope that helps with understanding binary and some issues involved in discussing “efficiency”…
Kramer