Cryptographic Zero Knowledge Proof Question

Aye, but the OP’s stipulations seem to include that the hash itself can also be brute-forced back into the original secret.

Would this work, or am I missing something important?

Alice creates an algorithm for converting the number to: (1) A date, (2) a time, (3) a location (maybe precise coordinates), (4) an outfit.

Alice shares the algorithm with Bob. Both secretly convert their numbers.

If they ever meet in the same place, at the same time, wearing the same outfit, they’ll know their numbers match. If it’s likely that Alice and Bob are going to meet under those conditions often, then other conditions can be added. (For example: type of haircut; type of food or drink being consumed; particular greeting or comment; etc.)

The trick is to make sure that all the conditions are things that someone might commonly do anyway. If the algorithm converts part of the number to “wear a giant sombrero”, then Bob could have Alice followed and if he sees her wearing a giant sombrero he knows it’s likely that her number contains certain digits.

But if care is taken, then even if Bob has Alice followed for an indefinite period of time (if their numbers differ, the date will differ), recording her every move and outfit, he still won’t know which of those moments or locations are the correct ones.

There may be, of course, certain practical problems if time is of the essence. :slight_smile:

OK, how about this. Alice treats her number as a one time pad. She chooses a random string the same length as her number and XORs the two. She sends the ciphertext to Bob. He uses his number to decrypt the string and sends the results to Alice. If Alice agrees that the sting Bob sent matches her random string then the keys match.

No dice; Alice has the original random string. When Bob sends her (Bob’s secret XOR random string), Alice will be able to recover Bob’s secret from that by XORing it with the random string. Thus, Alice finds out Bob’s secret, regardless of whether it matches hers.

No. No matter what algorithm you chose, it’s going to be brute forcible, because the opposite party only has to evaluate 99 other numbers to correctly figure out your yours is.

Just to elaborate:

Alice picks a number between 1-100. She does super-secret algorithm f(x) on it to transform it into a huge amount of data. Bob does this as well.

Alice sends the second half of the data to Bob. It doesn’t match Bob’s. Bob then starts calculating f(x) for all numbers between 1-100. He stops when he gets one that matches what Alice sent him. He knows Alice picked ‘88’.

There’s really one defense; Make f(x) take a long time. It might take a year, making it feasible for Bob to calculate it once, but not for all 100 numbers. But he can get around that by enlisting the help of 100 friends, or upgrading his computer. So that doesn’t necessarily help. Even if f(x) is exponentially harder to calculate as as the numbers get bigger, it only helps a little bit. If Alice can calculate it once in a reasonable amount of time, than so can Bruce, who can enlist a reasonable amount of friends to help.

Well, there are other possible defenses. Make the algorithm non-deterministic; i.e., instead of using f(x), use f(x, y), where y is a random input, the number of possibilities for which are very large. At the moment, I can’t prove that something like this couldn’t end up working.

But then how does Bob verify whether they have the same number?

I don’t know; the idea is that it may be mathematically possible to come up with such an f(x, y) such that some predicate p(f(x, y), x’) has high correlation with the predicate x = x’. I’m just saying this is a possible way in which a protocol could exist satisfying the OP’s requirements which is not addressed by Sinaijoin’s ostensible nonexistence proof, though I don’t know if it can actually be carried through to a full-fledged design or not.

ETA: Oh, but I see the flaw now… you could just run through p(f(x, y), x’) for all values of x’ still. Very well, comment retracted red-facedly… though I’m still not convinced Sinaijoin’s proof covers all possibilities. You could imagine a kind of interactive protocol, which would prevent Alice from simply running through the cases for all possibilities at once, as she could only send out the data corresponding to one secret choice of hers.

(As in the “Query for random digits in alternate turns, stopping once you get a mismatch” leak-limiting protocol)

So this isn’t a zero-information protocol (or I guess, more accurately, a one-bit of information protocol, since at least that much has to leak between the two), but what about this?

Choose a large set of possible keys, say N of them. Alice XORs her string with a randomly chosen one of those keys (some other reversible hash would be feasible, too) to produce cypher-text. Bob then XORs the cypher-text with his string. If he gets back one of the possible keys, then he assumes the strings match and sends his data back to Alice.

This leaks information in two ways. First, Bob immediately knows that Alice has one of N possible strings, so some information has leaked. Second, Bob will reveal his secret to Alice either if they have the same string, or Bob’s string happens to accidentally decrypt the cypher text to one of the other keys, so information can leak that way, too.

This protocol has a nice property, though, which is that by changing N you can change the balance of information that each party leaks. If N is small, then Alice leaks more information (because the set of possible strings that Bob calculates is equally small), while if N is large, Bob leaks more information (because the chance of Bob’s string colliding with a key even though his string doesn’t match Alice’s is higher).

One other feature of the protocol (which, upon reflection is essentially the same as kferr’s*) is that there is actually a maximum amount of information that can be leaked, regardless of the computational power brought to bear or the strings Alice and Bob have (the “query until you get a mismatch” protocol can potentially leak a lot of information if the two strings are almost the same). Bob can do no better than finding a set of strings, one of which is Alice’s, and Alice can do no better than finding out Bob’s information if there’s a collision.

  • I think the point that both kferr and later Indistinguishable missed about kferr’s original protocol is that just because Bob gets something readable from brute-forcing the cypher-text is no guarantee that he produced the same paragraph that Alice originally encrypted. Essentially this is equivalent to my protocol where the number of keys, N, is the number of readable paragraphs of text of a given length.

This is why I suspect that there’s a solution in quantum cryptography. You can’t make copies of quantum information, so if Alice’s encrypted information is quantum mechanical, and she transmits it to Bob, then she doesn’t still have the original.

Meanwhile, I seem to recall an article in Scientific American some years ago on the subject of secure communication, which included a discussion on this very topic. Unfortunately, I can’t pin down the date of the article more precisely than “some time between 9 and 150 years ago”. Is there a searchable index of all SciAm articles available somewhere?

Aye, that’s another problem with kferr’s method as well.

Your method, incidentally, is information-theoretically equivalent to Alice sending out just part of her secret (all but the last log(N) bits, basically), and Bob thinking there’s a hit if that matches the first part of his secret as well [in which case, he’ll send back his last log(N) bits for confirmation]. Refining this so that information can exchange in less coarse chunks with more damage control, one gets the bit-for-bit/digit-for-digit interactive protocol I’ve mentioned several times.

Was this the article, by any chance? It seems, from the preview, to be about a related, but slightly different, problem, though I haven’t found any closer matches yet.

ETA: Oh, but the date is too recent, I suppose. Alas.

Could you use public/private key encryption for this?

If Alice has a number a, and Bob has b. Both numbers range from 1 to N.

Bob creates N public / private key pairs - call them all Pub[sub]x[/sub] and Pri[sub]x[/sub] - and publishes the public keys.

Alice generates a random text S, and encrypts it with Pub[sub]a[/sub] and sends it to Bob as Pub[sub]a/sub. Bob decrypts it with Pri[sub]b[/sub] and sends it back to Alice.

If Pri[sub]b/sub = S then Alice knows that she and Bob have the same number.

If they are different, Alice can’t find b because she can’t generate the private keys (assuming Bob did something more clever than making the private keys ‘1’, ‘2’, etc)
And Bob can’t find a because he doesn’t know what the plaintext S was, so wouldn’t know which Pub[sub]a[/sub] was used.

One problem with this is that now Alice knows whether or not a = b but Bob doesn’t.

You could run the protocol in reverse so that Bob can learn, but I can’t think of a way for them to both learn at the same time.

(After I thought of this it sounded kind of familiar… similar to Kerberos or other authentication schemes?)

Let me be more clear about the protocol I laid out.

Say Alice has the number 3 and Bob has the number 3.

Alice generates a SALT or thequickbrownfox and appends the number 3 to get thequickbrownfox3.

Alice then generates a MD5 Hash of that to get 2a9a06d5de8c17fd4d9af393c8f2396c.

Alice sends 2a9a06d5de8c17fd4d9af393c8f2396c to Bob who then appends it his salt of jumpsoverthelazydog to get jumpsoverthelazydog2a9a06d5de8c17fd4d9af393c8f2396c.

Bob then generates the the MD5 sum of 4413ad3dee17f9d5e92aa013952c625e and publishes it.

They then repeat that process in the reverse order and… they get completely different answers because MD5(“jumpsoverthelazydog” + MD5(“thequickbrownfox” + 3)) != MD5(“thequickbrownfox” + MD5(“jumpsoverthelazydog” + 3))

But say we did have some hashing function such that h(g(x)) = g(h(x)), then Alice and Bob could compare the two hashes for equality.

Bob cannot reverse engineer Alice’s number because he doesn’t know what Alice’s salt is and vice versa and it would be infeasible to brute force if the salt was sufficiently long. This fulfils all the requirements of my question except that I don’t know if the desired hash function exists.

I think Kinbote has it. The only challenge is if the plaintext is long, generating all those pub/pri keypairs will be very exensive.

You could use an agreed 1-way hash to reduce the plaintext to much smaller MaxA & MaxB values, and therefore have to generate many fewer pub/pri pairs, but then you introduce the problem that any two plaintexts which hash to the same value will appear equal under the protocol when they are not.

The hash itself can even be public without affecting the secuity of Alice & Bob’s numbers versus eavesdroppers.

But if that’s the case, then when Bob is given Alices hash result h(g(x)), all he has to do is run his hash function(g(h(x)) for all numbers between 1-100. The one that matches will give him Alice’s number. Its the same as my example, with f(x) = h(g(x)) = g(h(x)).

But Bob has both the encrypted result and S. So he just runs ‘S’ through all his public keys (there is only 100) until he finds one that matches the encrypted result he originally got from alice. He would then know Alice’s number.

Basically, he finds ‘x’ such that Pub[sub]x/sub = Pub[sub]a/sub