Optimization is unstable. Maybe that’s why we satisfice.

Feb 26 JDN 2460002

Imagine you have become stranded on a deserted island. You need to find shelter, food, and water, and then perhaps you can start working on a way to get help or escape the island.

Suppose you are programmed to be an optimizer, to get the absolute best solution to any problem. At first this may seem to be a boon: You’ll build the best shelter, find the best food, get the best water, find the best way off the island.

But you’ll also expend an enormous amount of effort trying to make it the best. You could spend hours just trying to decide what the best possible shelter would be. You could pass up dozens of viable food sources because you aren’t sure that any of them are the best. And you’ll never get any rest because you’re constantly trying to improve everything.

In principle your optimization could include that: The cost of thinking too hard or searching too long could be one of the things you are optimizing over. But in practice, this sort of bounded optimization is often remarkably intractable.

And what if you forgot about something? You were so busy optimizing your shelter you forgot to treat your wounds. You were so busy seeking out the perfect food source that you didn’t realize you’d been bitten by a venomous snake.

This is not the way to survive. You don’t want to be an optimizer.

No, the person who survives is a satisficer: they make sure that what they have is good enough and then they move on to the next thing. Their shelter is lopsided and ugly. Their food is tasteless and bland. Their water is hard. But they have them.

Once they have shelter and food and water, they will have time and energy to do other things. They will notice the snakebite. They will treat the wound. Once all their needs are met, they will get enough rest.
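The difference between the two strategies can be sketched in a few lines of Python. This is only an illustration under assumptions of my own: the shelter scores, the `quality` function, and the 0.7 threshold are all made up, and real search problems are messier than scanning a list.

```python
# Hypothetical quality scores for six candidate shelters, in the order you find them.
SHELTERS = [0.2, 0.5, 0.75, 0.4, 0.9, 0.6]


def quality(shelter):
    """Stand-in scoring function; here a shelter is simply its own score."""
    return shelter


def optimize(options, score):
    """Examine every option; return the best one and how many were examined."""
    return max(options, key=score), len(options)


def satisfice(options, score, good_enough):
    """Return the first option that meets the threshold and how many were examined."""
    for examined, option in enumerate(options, start=1):
        if score(option) >= good_enough:
            return option, examined
    # Nothing was good enough: settle for the best option seen.
    return max(options, key=score), len(options)


print(optimize(SHELTERS, quality))        # (0.9, 6)
print(satisfice(SHELTERS, quality, 0.7))  # (0.75, 3)
```

The optimizer gets the slightly better shelter, but only after inspecting every candidate. The satisficer stops at the third one and has the rest of the day left for food, water, and snakebites.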

Empirically, humans are satisficers. We seem to be happier because of it; in fact, the people who are the happiest satisfice the most. And really this shouldn’t be so surprising, because our ancestral environment wasn’t so different from being stranded on a deserted island.

Good enough is perfect. Perfect is bad.

Let’s consider another example. Suppose that you have created a powerful artificial intelligence, an AGI with the capacity to surpass human reasoning. (It hasn’t happened yet—but it probably will someday, and maybe sooner than most people think.)

What do you want that AI’s goals to be?

Okay, ideally maybe they would be something like “Maximize goodness”, where we actually somehow include all the panoply of different factors that go into goodness, like beneficence, harm, fairness, justice, kindness, honesty, and autonomy. Do you have any idea how to do that? Do you even know what your own full moral framework looks like at that level of detail?

Far more likely, the goals you program into the AGI will be much simpler than that. You’ll have something you want it to accomplish, and you’ll tell it to do that well.

Let’s make this concrete and say that you own a paperclip company. You want to make more profits by selling paperclips.

First of all, let me note that this is not an unreasonable thing for you to want. It is not an inherently evil goal for one to have. The world needs paperclips, and it’s perfectly reasonable for you to want to make a profit selling them.

But it’s also not a true ultimate goal: There are a lot of other things that matter in life besides profits and paperclips. Anyone who isn’t a complete psychopath will realize that.

But the AI won’t. Not unless you tell it to. And so if we tell it to optimize, we would need to actually include in its optimization all of the things we genuinely care about—not missing a single one—or else whatever choices it makes are probably not going to be the ones we want. Oops, we forgot to say we need clean air, and now we’re all suffocating. Oops, we forgot to say that puppies don’t like to be melted down into plastic.

The simplest cases to consider are obviously horrific: Tell it to maximize the number of paperclips produced, and it starts tearing the world apart to convert everything to paperclips. (This is the original “paperclipper” concept from Less Wrong.) Tell it to maximize the amount of money you make, and it seizes control of all the world’s central banks and starts printing $9 quintillion for itself. (Why that amount? I’m assuming it uses 64-bit signed integers, and 2^63 is over 9 quintillion. If it uses long ints, we’re even more doomed.) No, inflation-adjusting won’t fix that; even hyperinflation typically still results in more real seigniorage for the central banks doing the printing (which is, you know, why they do it). The AI won’t ever be able to own more than all the world’s real GDP—but it will be able to own that if it prints enough and we can’t stop it.
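For the curious, the arithmetic behind that figure is easy to check. This is a minimal sketch, assuming the balance is stored in a 64-bit signed integer as supposed above; the constant names are mine. Strictly speaking, the largest value such an integer can hold is 2^63 - 1, which is still over 9 quintillion.

```python
INT64_MAX = 2**63 - 1   # largest value a 64-bit signed integer can hold
QUINTILLION = 10**18

print(f"{INT64_MAX:,}")          # 9,223,372,036,854,775,807
print(INT64_MAX // QUINTILLION)  # 9
```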

But even if we try to come up with some more sophisticated optimization for it to perform (what I’m really talking about here is specifying its utility function), it becomes vital for us to include everything we genuinely care about: Anything we forget to include will be treated as a resource to be consumed in the service of maximizing everything else.

Consider instead what would happen if we programmed the AI to satisfice. The goal would be something like, “Produce at least 400,000 paperclips at a price of at most $0.002 per paperclip.”

Given such an instruction, in all likelihood, it would in fact produce exactly 400,000 paperclips at a price of exactly $0.002 per paperclip. And maybe that’s not strictly the best outcome for your company. But if it’s better than what you were previously doing, it will still increase your profits.

Moreover, such an instruction is far less likely to result in the end of the world.

If the AI has a particular target to meet for its production quota and price limit, the first thing it would probably try is to use your existing machinery. If that’s not good enough, it might start trying to modify the machinery, or acquire new machines, or develop its own techniques for making paperclips. But there are quite strict limits on how creative it is likely to be, because there are quite strict limits on how creative it needs to be. If you were previously producing 200,000 paperclips at $0.004 per paperclip, all it needs to do is double production and halve the cost. That’s a very standard sort of industrial innovation; in computing hardware (admittedly an extreme case), we do this sort of thing every couple of years.
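Here is a rough sketch of how differently the two instructions behave when handed the same menu of options. Everything in it is hypothetical: the `Plan` type, the four candidate plans, and the `disruption` score are invented for illustration, and I am assuming the satisficer tries the least drastic plans first, as described above. No real AI system is this simple.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plan:
    name: str
    paperclips: int
    unit_price: float
    disruption: int  # hypothetical side-effect score that neither goal mentions


# Hypothetical candidate plans, ordered from least to most drastic.
PLANS = [
    Plan("keep existing machinery", 200_000, 0.004, 0),
    Plan("retool the factory", 400_000, 0.002, 1),
    Plan("buy every steel mill", 50_000_000, 0.001, 100),
    Plan("convert the planet", 10**18, 0.0, 10**9),
]


def maximize(plans):
    """Pick whichever plan yields the most paperclips, ignoring everything else."""
    return max(plans, key=lambda p: p.paperclips)


def satisfice(plans, min_paperclips, max_unit_price):
    """Pick the first (least drastic) plan that meets the quota and price limit."""
    for plan in plans:
        if plan.paperclips >= min_paperclips and plan.unit_price <= max_unit_price:
            return plan
    return None  # no plan meets the target


print(maximize(PLANS).name)                   # convert the planet
print(satisfice(PLANS, 400_000, 0.002).name)  # retool the factory
```

Neither function ever looks at `disruption`. The maximizer lands on the most destructive plan anyway, because that plan happens to score highest on the one thing it was told to care about. The satisficer stops as soon as the target is met, so the forgotten factor never gets a chance to matter much.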

It certainly won’t tear the world apart making paperclips—at most it’ll tear apart enough of the world to make 400,000 paperclips, which is a pretty small chunk of the world, because paperclips aren’t that big. A paperclip weighs about a gram, so you’ve only destroyed about 400 kilos of stuff. (You might even survive the lawsuits!)
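The back-of-the-envelope figure is just multiplication, using the rough one-gram weight quoted above (the variable names are mine):

```python
PAPERCLIPS = 400_000
GRAMS_PER_PAPERCLIP = 1  # "about a gram", a rough figure

total_kg = PAPERCLIPS * GRAMS_PER_PAPERCLIP / 1000
print(total_kg)  # 400.0
```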

Are you leaving money on the table relative to the optimization scenario? Eh, maybe. One, it’s a small price to pay for not ending the world. But two, if 400,000 at $0.002 was too easy, next time try 600,000 at $0.001. Over time, you can gently increase its quotas and tighten its price requirements until your company becomes more and more successful—all without risking the AI going completely rogue and doing something insane and destructive.
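That ratcheting process can be written down as a tiny update rule. This is a sketch under my own assumptions: the function name `next_target`, the step of 200,000 paperclips, and the halving of the price limit are chosen only so that the numbers match the example above. In practice a human would decide each new target by hand.

```python
def next_target(quota, max_price, quota_step=200_000, price_factor=0.5):
    """Return a slightly harder target: a bigger quota and a lower price limit."""
    return quota + quota_step, max_price * price_factor


quota, max_price = 400_000, 0.002
quota, max_price = next_target(quota, max_price)
print(quota, max_price)  # 600000 0.001
```

The point of the design is that every step is small and a person chooses whether to take it. The AI is never handed an open-ended instruction to push as far as it can.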

Of course this is no guarantee of safety—and I absolutely want us to use every safeguard we possibly can when it comes to advanced AGI. But the simple change from optimizing to satisficing seems to solve the most severe problems immediately and reliably, at very little cost.

Good enough is perfect; perfect is bad.

I see broader implications here for behavioral economics. When all of our models are based on optimization, but human beings overwhelmingly seem to satisfice, maybe it’s time to stop assuming that the models are right and the humans are wrong.

Optimization is perfect if it works—and awful if it doesn’t. Satisficing is always pretty good. Optimization is unstable, while satisficing is robust.

In the real world, that probably means that satisficing is better.

Good enough is perfect; perfect is bad.

The paperclippers are already here

Jan 24 JDN 2459239

Imagine a powerful artificial intelligence, composed of many parts distributed over a vast area so that it has no particular location. It is incapable of feeling any emotion: Neither love nor hate, neither joy nor sorrow, neither hope nor fear. It has no concept of ethics or morals, only its own programmed directives. It has one singular purpose, which it seeks out at any cost. Any who aid its purpose are generously rewarded. Any who resist its purpose are mercilessly crushed.

The Less Wrong community has come to refer to such artificial intelligences as “paperclippers”; the metonymous singular directive is to maximize the number of paperclips produced. There’s even an online clicker game called “Universal Paperclips” where you can play as one. The concern is that we might one day invent such artificial intelligences, and they could get out of control. The paperclippers won’t kill us because they hate us, but simply because we can be used to make more paperclips. This is a far more plausible scenario for the “AI apocalypse” than the more conventional sci-fi version where AIs try to kill us on purpose.

But I would say that the paperclippers are already here. Slow, analog versions perhaps. But they are already getting out of control. We call them corporations.

A corporation is probably not what you visualized when you read the first paragraph of this post, so try reading it again. Which parts are not true of corporations?

Perhaps you think a corporation is not an artificial intelligence? But clearly it’s artificial, and doesn’t it behave in ways that seem intelligent? A corporation has purpose beyond its employees in much the same way that a hive has purpose beyond its bees. A corporation is a human superorganism (and not the only kind either).

Corporations are absolutely, utterly amoral. Their sole directive is to maximize profit. Now, you might think that an individual CEO, or a board of directors, could decide to do something good, or refrain from something evil, for reasons other than profit; and to some extent this is true. But particularly when a corporation is publicly traded, that CEO and those directors are beholden to shareholders. If shareholders see that the corporation is acting in ways that benefit the community but hurt their own profits, shareholders can rebel by selling their shares or even suing the company. In 1919, the Dodge brothers successfully sued Ford for the “crime” of setting wages too high and prices too low.

Humans are altruistic. We are capable of feeling, emotion, and compassion. Corporations are not. Corporations are made of human beings, but they are specifically structured to minimize the autonomy of human choices. They are designed to provide strong incentives to behave in a particular way so as to maximize profit. Even the CEO of a corporation, especially one that is publicly traded, has their hands tied most of the time by the desires of millions of shareholders and customers—so-called “market forces”. Corporations are entirely the result of human actions, but they feel like impersonal forces because they are the result of millions of independent choices, almost impossible to coordinate; so one individual has very little power to change the outcome.

Why would we create such entities? It almost feels as though we were conquered by some alien force that sought to enslave us to its own purposes. But no, we created corporations ourselves. We intentionally set up institutions designed to limit our own autonomy in the name of maximizing profit.

Part of the answer is efficiency: There are genuine gains in economic efficiency due to the corporate structure. Corporations can coordinate complex activity on a vast scale, with thousands or even millions of employees each doing what they are assigned without ever knowing—or needing to know—the whole of which they are a part.

But a publicly traded corporation is far from the only way to do that. Even for-profit businesses are not the only way to organize production. And empirically, worker co-ops actually seem to be about as productive as corporations, while producing far less inequality and far more satisfied employees.

Thus, in order to explain the primacy of corporations, particularly those that are traded on stock markets, we must turn to ideology: The extreme laissez-faire concept of capitalism and its modern expression in the ideology of “shareholder value”. Somewhere along the way enough people (or at least enough policymakers) became convinced that the best way to run an economy was to hand over as much as possible to entities that exist entirely to maximize their own profits.

This is not to say that corporations should be abolished entirely. I am certainly not advocating a shift to central planning; I believe in private enterprise. But I should note that private enterprise can also include co-ops, partnerships, and closely-held businesses, rather than publicly traded corporations, and perhaps that’s all we need. Yet there do seem to be significant advantages to the corporate structure: Corporations seem to be spectacularly good at scaling up the production of goods and providing them to a large number of customers. So let’s not get rid of corporations just yet.

Instead, let us keep corporations on a short leash. When properly regulated, corporations can be very efficient at producing goods. But corporations can also cause tremendous damage when given the opportunity. Regulations aren’t just “red tape” that gets in the way of production. They are a vital lifeline that protects us against countless abuses that corporations would otherwise commit.

These vast artificial intelligences are useful to us, so let’s not get rid of them. But never for a moment imagine that their goals are the same as ours. Keep them under close watch at all times, and compel them to use their great powers for good—for, left to their own devices, they can just as easily do great evil.