Is the password hash function one-to-one? I thought hashes are several-to-one, like those used for verifying torrents.
Generally they’re many-to-one, but it’s vanishingly unlikely to be any easier to determine the other acceptable values than it is the initial password.
The “pigeonhole principle” virtually guarantees that password hashes are many-to-one.
The hash algorithm creates a bit string of finite length. The range of possible raw text passwords is greater than a one-to-one mapping to the hash bit string. Windows, for example, uses the md5 hash method, which produces a 32 digit (hex) number. But a password in Windows can be longer than 32 ASCII characters. By definition, then, different raw passwords can produce the same md5 hash – known as a collision.
Hmm so since hashes are all that’s compared, they only need to guess all the hashes - 2^32? There’s no point using password strings with more than 2^32 combinations then?
I encountered this regularly logging into various Unix shells. I also noticed it on the login line. It seemed that Backspace was regularly sent as Control-H in the terminal software (i.e. the software on my computer), but that Control-H meaning backspace was a feature of the shell, and not the login prompt.
Actually, it depends on the hash.
How unlikely it is for any given hash is referred to as that hash’s “collision resistance.”
For md5, your comment is wrong. Here is a link to a 2009 (PDF) paper that shows a variety of methods to find a second value that produces the same md5 hash as an arbitrarily selected original text.
Yes. And no.
Yes to your first question. They are comparing a stolen hash value to a list of all possible hash values. So as long as they have at least one raw value to make each possible hash value, they can determine a password that will work.
No to your second. Because if you make a list of all possible password strings in the 2[sup]32[/sup] range, it may not fill all possible hash spaces. Some of those password strings may map to the same hash value. A longer raw password may map to a hash value that isn’t reachable with the smaller range.
For instance, imagine a really bad hash function that uses the number of characters in the string as the hash value (did I mention this is a REALLY BAD hash function)? This is a legitimate function, but getting every string in the 2^32 range is rather obviously not going to get you what you want.
It seems to me a hacker’s best tactic would be to figure out the smallest set of strings that would generate all the possible hashes, and use that instead.
And mine would be to find the shortest string which hashes to a hash with the least possible other strings.
The thing is, that’s a non-trivial problem, especially since every hash function is different (in fact, the same algorithm can produce different collisions depending on the specific constants used). In fact, I’d wager the problem of determining the minimal set would take longer* and produce worse results than just dictionary attacking (assuming you don’t care WHOSE account you hacked, just that you hacked AN account).
Which itself isn’t nearly as good as calling the secretary and pretending to be from the Password Security Council.
- If it’s a problem that’s even generally solvable before the heat death of the universe.
That’s not very easy. Even for something like MD5 (which is actually a terrible choice for password hashing, since its hash size is small and it’s actually designed to be fast) you have 2[sup]128[/sup] possible hash values. That’s 340,282,366,920,938,463,463,374,607,431,768,211,456 hashes. And you’d have to test a lot more than that number of strings to hit every one.
Of course, the set of actual passwords in common use is much smaller. And there are dictionary databases out there which contain maps of common passwords to their known hashes.
You would have to test an infinite number of strings to find that out, though.
Ooh, interesting. Thanks for the replies. So it’s 16^32, not 2^32.
A requirement for a good cryptographic hash function is that finding the hash value for a given plain text should be easy, finding a plain text to hash to a given hash value should be very very hard. This would make the above “solution” very hard.
For a hash function that produces 128 bit values there are 2^128 possible values which, by the magic of Windows Calculator, gives ~ 340,000,000,000,000,000,000,000,000,000,000,000,000 possible hashes you have to find, each of which is a significant task in itself.
ETA - Doh, beaten to it - several times.
Right. 128 bits, each with two possible values, equals 2[sup]128[/sup] (or 16[sup]32[/sup]) possible permutations.
I’ve seen a few websites that store the passwords and send it to you instead of doing a password reset. They weren’t serious sites with any e-commerce capabilities.
There are also some systems where the userid and password aren’t encrypted at the browser. They rely on HTTPS to keep eavesdroppers out.
You would not believe what I’ve seen after more than a decade in the web applications business. Let’s just say that there are a great many people out there for whom basic security isn’t even a backseat priority – it just simply doesn’t occur to them at all.
Well, if they’re using HTTPS then it’s encrypted at the browser, just at the protocol level instead of within the application.
You really need to read Slashdot.
They post a story every month or two about yet another web site/application that stores plain text passwords and has been hit.
Right now, a lot of smart phone apps are being discovered as keeping/transmitting passwords in plain text. But there are some major web sites, especially gaming ones, that have lousy password security.
It is a really good idea to assume that your password at site X is going to be found out and take precautions that this has no affect on your use of site Y. (As the recent LinkedIn case illustrates.)
There is virtually no practical method of protecting your own password in a case like the LinkedIn incident. If someone wants to expend a couple weeks finding it from the hash, they will. But it’s like the old joke: You don’t need to out run the bear, you just need to outrun the other guy. If your password is harder to crack than 95% of the others, then the casual crackers will cherry pick the easy ones and hopefully leave you alone. Assuming you have a low profile and there is no reason for random people to target you.