Feb 13 JDN 2459620
I said in last week’s post that most of the world’s online security rests upon public-key encryption. It’s how we do our shopping, our banking, and paying our taxes.
Yet public-key encryption has an Achilles’ Heel. It relies entirely on the assumption that, even knowing someone’s public key, you can’t possibly figure out what their private key is. Yet obviously the two must be deeply connected: In order for my private key to decrypt all messages that are encrypted using my public key, they must, in a deep sense, contain the same information. There must be a mathematical operation that will translate from one to the other—and that mathematical operation must be invertible.
What we have been relying on to keep public-key encryption secure is the notion of a one-way function: A function that is easy to compute, but hard to invert. A typical example is multiplying two numbers: Multiplication is a basic computing operation that is extremely fast, even for numbers with thousands of digits; but factoring a number into its prime factors is far more difficult, and currently cannot be done in any reasonable amount of time for numbers that are more than a hundred digits long.
“Easy” and “hard” in what sense? The usual criterion is in polynomial time.
Say you have an input that is n bits long—i.e. n digits, when expressed as a binary number, all 0s and 1s. A function that can be computed in time proportional to n is linear time; if it can only be done in time proportional to n2, that is quadratic time; n3 would be cubic time. All of these are examples of polynomial time.
But if instead the time required were 2n, that would be exponential time. 3n and 1.5n would also be exponential time.
This is significant because of how much faster exponential functions grow relative to polynomial functions, for large values of n. For example, let’s compare n3 with2n. When n=3, the polynomial is actually larger: n3=27 but 2n=8. At n=10 they are nearly equal: n3=1000 but 2n=1024. But by n=20, n3 is only 8000 while 2n is over 1 million. At n=100, n3is a manageable (for a modern computer) 1 million, while 2nis a staggering 1030; that’s a million trillion trillion.
You may see that there is already something a bit fishy about this: There are lots of different ways to be polynomial and lots of different ways to be exponential. Linear time n is clearly fast, and for many types of problems it seems unlikely one could do any better. But is n100 time really all that fast? It’s still polynomial. It doesn’t take a large exponential base to make for very fast growth—2 doesn’t seem that big, after all, and when dealing with binary digits it shows up quite naturally. But while 2n grows very fast even for reasonably-sized n, 1.0000001n grows slower than most polynomials—even linear!—for quite a long range before eventually becoming very fast growth when n is in the hundreds of millions. Yet it is still exponential.
So, why do we use these categories? Well, computer scientists and mathematicians have discovered that many types of problems that seem different can in fact be translated into one another, so that solving one would solve the other. For instance, you can easily convert between the Boolean satisfiability problem and the subset-sum problem or the travelling salesman problem. These conversions always take time that is a polynomial in n(usually somewhere between linear and quadratic, as it turns out). This has allowed to build complexity classes, classes of problem such that any problem can be converted to any other in polynomial time or better.
Problems that can be solved in polynomial timeare in class P, for polynomial.
Problems that can be checked—but not necessarily solved—in polynomial time are in class NP, which actually stands for “non-deterministic polynomial” (not a great name, to be honest). Given a problem in NP, you may not be able to come up with a valid answer in polynomial time. But if someone gave you an answer, you could tell in polynomial time whether or not that answer was valid.
Boolean satisfiability (often abbreviated SAT) is the paradigmatic NP problem: Given a Boolean formula like (A OR B OR C) AND (¬A OR D OR E) AND (¬D OR ¬C OR B) and so on, it isn’t a simple task to determine if there’s some assignment of the variables A, B, C, D, E that makes it all true. But if someone handed you such an assignment, say (¬A, B, ¬C, D, E), you could easily check that it does in fact satisfy the expression. It turns out that in fact SAT is what’s called NP-complete: Any NP problem can be converted into SAT in polynomial time.
This is important because in order to be useful as an encryption system, we need our one-way function to be in class P (otherwise, we couldn’t compute it quickly). Yet, by definition, this means its inverse must be in class NP.
Thus, simply because it is easy to multiply two numbers, I know for sure that factoring numbers must be in NP: All I have to do to verify that a factorization is correct is multiply the numbers. Since the way to get a public key from a private key is (essentially) to multiply two numbers, this means that getting a private key from a public key is equivalent to factorization—which means it must be in NP.
This would be fine if we knew some problems in NP that could never, ever be solved in polynomial time. We could just pick one of those and make it the basis of our encryption system. Yet in fact, we do not know any such problems—indeed, we are not even certain they exist.
One of the biggest unsolved problems in mathematics is P versus NP, which asks the seemingly-simple question: “Are P and NP really different classes?” It certainly seems like they are—there are problems like multiplying numbers, or even finding out whether a number is prime, that are clearly in P, and there are other problems, like SAT, that are definitely in NP but seem to not be in P. But in fact no one has ever been able to prove that P ≠ NP. Despite decades of attempts, no one has managed it.
To be clear, no one has managed to prove that P = NP, either. (Doing either one would win you a Clay Millennium Prize.) But since the conventional wisdom among most mathematicians is that P ≠ NP (99% of experts polled in 2019 agreed), I actually think this possibility has not been as thoroughly considered.
Vague heuristic arguments are often advanced for why P ≠ NP, such as this one by Scott Aaronson: “If P = NP, then the world would be a profoundly different place than we usually assume it to be. There would be no special value in “creative leaps,” no fundamental gap between solving a problem and recognizing the solution once it’s found.”
That really doesn’t follow at all. Doing something in polynomial time is not the same thing as doing it instantly.
Say for instance someone finds an algorithm to solve SAT in n6 time. Such an algorithm would conclusively prove P = NP. n6; that’s a polynomial, all right. But it’s a big polynomial. The time required to check a SAT solution is linear in the number of terms in the Boolean formula—just check each one, see if it works. But if it turns out we could generate such a solution in time proportional to the sixth power of the number of terms, that would still mean it’s a lot easier to check than it is to solve. A lot easier.
I guess if your notion of a “fundamental gap” rests upon the polynomial/exponential distinction, you could say that’s not “fundamental”. But this is a weird notion to say the least. If n = 1 million can be checked in 1 million processor cycles (that is, milliseconds, or with some overhead, seconds), but only solved in 1036 processor cycles (that is, over a million trillion years), that sounds like a pretty big difference to me.
Even an n2 algorithm wouldn’t show there’s no difference. The difference between n and n2, is, well, a factor of n. So finding the answer could still take far longer than verifying it. This would be worrisome for encryption, however: Even a million times as long isn’t really that great actually. It means that if something would work in a few seconds for an ordinary computer (the timescale we want for our online shopping and banking), then, say, the Russian government with a supercomputer a thousand times better could spend half an hour on it. That’s… a problem. I guess if breaking our encryption was only feasible for superpower national intelligence agencies, it wouldn’t be a complete disaster. (Indeed, many people suspect that the NSA and FSB have already broken most of our encryption, and I wouldn’t be surprised to learn that’s true.)
But what I really want to say here is that since it may be true that P=NP—we don’t know it isn’t, even if most people strongly suspect as much—we should be trying to find methods of encryption that would remain secure even if that turns out to be the case. (There’s another reason as well: Quantum computers are known to be able to factor numbers in polynomial time—though it may be awhile before they get good enough to do so usefully.)
We do know two such methods, as a matter of fact. There is quantum encryption, which, like most things quantum, is very esoteric and hard to explain. (Maybe I’ll get to that in another post.) It also requires sophisticated, expensive hardware that most people are unlikely to be able to get.
And then there is onetime pad encryption, which is shockingly easy to explain and can be implemented on any home computer.
The problem with substitution ciphers is that you can look for patterns. You can do this because the key ultimately contains only so much information, based on how long it is. If the key contains 100 bits and the message contains 10,000 bits, at some point you’re going to have to repeat some kind of pattern—even if it’s a very complex, sophisticated one like the Enigma machine.
Well, what if the key were as long as the message? What if a 10,000 bit message used a 10,000 bit key? Then you could substitute every single letter for a different symbol each time. What if, on its first occurrence, E is D, but then it’s Q, and then it’s T—and each of these was generated randomly and independently each time? Then it can’t be broken by searching for patterns—because there are no patterns to be found.
Mathematically, it would look like this: Take each bit of the plaintext, and randomly generate another bit for the key. Add the key bit to the plaintext bit (technically you want to use bitwise XOR, but that’s basically adding), and you’ve got the ciphertext bit. At the other end, subtracting out each key bit will give back each plaintext bit. Provided you can generate random numbers efficiently, this will be fast to encrypt and decrypt—but literally impossible to break without the key.
Indeed, onetime-pad encryption is so secure that it is a proven mathematical theorem that there is no way to break it. Even if you had such staggering computing power that you could try every possible key, you wouldn’t even know when you got the right one—because every possible message can be generated from a given ciphertext, using some key. Even if you knew some parts of the message already, you would have no way to figure out any of the rest—because there are no patterns linking the two.
The downside is that you need to somehow send the keys. As I said in last week’s post, if you have a safe way to send the key, why can’t you send the message that way? Well, there is still an advantage, actually, and that’s speed.
If there is a slow, secure way to send information (e.g. deliver it physically by armed courier), and a fast, insecure way (e.g. send it over the Internet), then you can send the keys in advance by the slow, safe way and then send ciphertexts later the fast, risky way. Indeed, this kind of courier-based onetime-pad encryption is how the “red phone” (really a fax line) linking the White House to the Kremlin works.
Now, for online banking, we’re not going to be able to use couriers. But here’s something we could do. When you open a bank account, the bank could give you a, say, 128 GB flash drive of onetime-pad keys for you to use in your online banking. You plug that into your computer every time you want to log in, and it grabs the next part of key each time (there are some tricky technical details with synchronizing this that could, in practice, create some risk—but, done right, the risk would be small). If you are sending 10 megabytes of encrypted data each time (and that’s surely enough to encode a bank statement, though they might want to use a format other than PDF), you’ll get over 10,000 uses out of that flash drive. If you’ve been sending a lot of data and your key starts to run low, you can physically show up at the bank branch and get a new one.
Similarly, you could have onetime-pad keys on flash drives (more literal flash keys)given to you by the US government for tax filing, and another from each of your credit card issuers. For online purchases, the sellers would probably need to have their own onetime-pad keys set up with the banks and credit card companies, so that you send the info to VISA encrypted one way and they send it to the seller encrypted another way. Businesses with large sales volume would go through keys very quickly—but then, they can afford to keep buying new flash drives. Since each transaction should only take a few kilobytes, the cost of additional onetime-pad should be small compared to the cost of packing, shipping, and the items themselves. For larger purchases, business could even get in the habit of sending you a free flash key with each purchase so that future purchases are easier.
This would render paywalls very difficult to implement, but good riddance. Cryptocurrency would die, but even better riddance.It would be most inconvenient to deal with things like, well, writing a blog like this; needing to get a physical key from WordPress sounds like quite a hassle. People might actually just tolerate having their blogs hacked on occasion, because… who is going to hack your blog, and who really cares if your blog gets hacked?
Yes, this system is awkward and inconvenient compared to our current system. But unlike our current system, it is provably secure. Right now, it may seem like a remote possibility that someone would find an algorithm to prove P=NP and break encryption. But it could definitely happen, and if it did happen, it could happen quite suddenly. It would be far better to prepare for the worst than be unprepared when it’s too late.