Blog - Zero Knowledge

I work in cryptography.

When I tell people that, they usually ask for which spy agency I crack codes for.

(which, to be fair, wouldn't be that much of an unreasonable assumption half a century earlier. But a few things have changed since the 70s)

As it often occurs, what happens in an academic discipline and what the public perceives aren't quite the same thing. Archaeology isn't much like Indiana Jones, history isn't about knowing the birth and death dates of every person who has ever lived, and literature isn't about wondering around in tweed while occasionally heatedly debating about Wordsworth's position in regards to the French Revolution.


One can still dream, though.
(source: https://www.shutterstock.com/image-photo/little-child-tweed-jacket-around-encyclopedias-226512925)

And most of cryptography isn't really about code-breaking nowadays.

My cryptography professor gave us the following definition of the subject in the very first lesson: "Cryptography is communication in the presence of an adversary." Which at first does sound similar to you talking to your sibling on the phone and the NSA eavesdropping. However, the key aspect here is that your adversary might be the very person you are communicating with. (maybe your sibling is lying about it not being his turn to do the dishes when you clearly remember agreeing that it was, dammit)

This opens up a huge can of worms that is what a lot of cryptography deals with. A basic construction of cryptography is a protocol - I do A, you do B, I do C, and so on, and if we both do what we should (or all, there might be more parties than just us two), we end up with the result that we both want to. For example, consider e-voting. People submit their votes, these are verified, mixed and shuffled (so that nobody knows which vote belongs to who), perhaps added together, opened – lots of stuff happens.

All of this is based on the assumption of people doing what they should, and (at least judging based on my own behaviour) people are notoriously bad at that. Moreover, people are also keen in finding holes in existing structures and using them to their own benefit. Maybe, for example, the servers that were supposed to shuffle the votes actually threw some away and replaced them with other votes for a party that they liked more.

For example.
(source: "The Simpsons")

The world was much more simpler when we could at least assume that the people we were running the protocol with were on our side, wasn't it?

 So, we would want some guarantee that everybody exactly follows the protocol – a proof, if you will. You want to prove to the other person that you did everything correctly. Suppose you are the vote mixing server and you were good and you want others to be convinced about it. One possibility is showing the votes that came in and the votes that you output and that they are the same set. However, if you do that, then that sort of defeats your point   - your whole purpose was to anonymize the votes and if you now tell somebody what the incoming votes were, that flies out of the window.

There seems to be a very uncomfortable tension in the heart of proving that you behaved correctly in a cryptographic protocol. The other parties should learn that you behaved without learning any of your specifics, really. It's like proving to your parents that you were a good boy all the day long without them knowing nothing about how you spent the day. 

Solving this tension is what Zero Knowledge proofs is all about. It allows you to prove a fact without leaking anything extra. On one hand, the proof should be a honest no-cheating kind of thing – it shouldn't work when the thing that the proof aims to prove isn't in fact true. On the other hand, remember that the information about which the proof is, can be sensitive in nature. This means that the proof should be zero-knowledge - meaning that it should not give any more information than the fact that you are proving.

And, fortunately for us, such proofs exist.

Let's take the Rubik's Cube example. If you are a master of the Rubik's Cube and want to prove it to me, one option would be just me giving you an unsolved cube and asking you to solve it. But then I could take the solved cube and say to everybody that I solved it and thus get all the praise that people so often give to people who solve arbitrary and useless puzzles.

The thing I could do is to take an unsolved Rubik's Cube, specify a random legal position and ask you to get the cube into that state. Then you have showed to me that you are indeed a master of the Rubik's cube and I am left with a basically random cube that I can't do anything with.

(except procrastination, of course)

Another example would be you claiming that you have a large quantum computer in your basement. You can't show it to me, because maybe I would steal the design. However, I could pick two large random primes p and q and multiply them together and give the product p*q to you. Finding p and q from p*q using normal computers is very hard, provided that the primes are large enough, but can be done in reasonable time with quantum computers. If you gave me p and q back, then it would be quite good evidence of you having such a computer. *

(*Nitpick: this example isn't, formally speaking, a zero-knowledge proof, because here a bad verifier could take some number he found in the wild and give it to the prover to obtain the factors. He would still learn nothing about the quantum computer, but might learn useful information about other things in the world. This shows that we have to be careful when we do cryptography – often there are many aspects we don't think of at first.)

So, ignoring nitpicks, what is the catch?

Why don't we just scoop ZK on top everything and retire as cryptographers?

Notice one common thing in these two examples – the first one started with me giving you a Rubik's cube and a target. The second one started with me giving you the product of two primes.

One of the big catches with ZK proofs is that you usually have to do it interactively. And interaction is difficult.

Especially in some cultures.
(source: https://imgur.com/gallery/fcwGS0c) 

Interaction often takes resources and coordination and requires people being online simultaneously. Moreover, this means that the proof only works for the verifier you interact with. It would be kind of neat if you could just do something and attach the proof of its correctness with it. Then everybody could see the proof and be convinced.

This goal is in fact studied, and is called Non-Interactive Zero Knowledge, or NIZK for short.

However, there is a problem with NIZKs in that they are not possible without us having to assume some stuff that are a bit, uh, controversial. You see, ZK proofs often function in a way that the verifier gives the prover a random hoop to jump through and if the statement that the verifier wants to prove is incorrect, then with good probability the verifier fails. However, if the hoop is not random then the verifier might be able to prepare himself just for this task even when the statement he wants to prove is incorrect. So, the question is – where do you get this random hoop in the non-interactive setting?

Pictured: a typical prover -  a good boy!
(source: https://www.shutterstock.com/image-photo/beautiful-dog-motion-rough-collie-jump-589747424)

There are two main ideas for this. First one is called the random oracle model. It essentially says that you have deterministic things that behave seemingly so much like it has no structure whatsoever that you can use them for dice. So you apply this 'random oracle' to the specific data of what you are trying to prove and obtain the necessary random hoop. The problem with this is that random oracles are not believed to exist. Hash functions are often used as random oracles. The question whether it practically makes sense to consider hash functions as random oracles is hotly debated.  

The other idea is to use some trusted setup. Essentially, somebody has to provide some tools from which you can build the random hoop yourself. (these tools are called a Common Reference String, or CRS for short in crypto-parlance) You can get quite effective results with this, however, here you run into the problem that cryptographers have chronic trust issues.

Maybe it has something to do with all the stuff you had to prove to your parents and siblings in your childhood?
(source: https://www.shutterstock.com/image-photo/fears-scared-woman-on-street-422925040)

You have to have some trust in the CRS in that setup, and who can say that it hasn't been cleverly rigged to work for the other party? We have done some research to alleviate this problem. First, we have built protocols where you can check whether the CRS has been generated correctly. Second, we have studied methods how the CRS can be generated by many parties so that it would be secure if at least one of them behaves correctly.

Other research is also at hand about how to build these NIZKs more efficiently and with better security, but I think that we have talked enough for today. 

So, hopefully you have learned something from this post about zero knowledge. (Or, in the more general sense, what kind of problems cryptographers study) Also let's hope that you can have more trust in your everyday life than in the crypto world, and that you don't have to prove everything to the people around you, and that, maybe, you could even trust some of them with your secrets. 

The next post in this blog should appear in March and will probably talk about something completely different. 

Until next time,

Toomas