# Pascal’s Mugging

Nov 10 JDN 2458798

In the Singularitarian community there is a paradox known as “Pascal’s Mugging”. The name is an intentional reference to Pascal’s Wager (and the link is quite apt, for reasons I’ll discuss in a later post).

There are a few different versions of the argument; Yudkowsky’s original argument in which he came up with the name “Pascal’s Mugging” relies upon the concept of the universe as a simulation and an understanding of esoteric mathematical notation. So here is a more intuitive version:

A strange man in a dark hood comes up to you on the street. “Give me five dollars,” he says, “or I will destroy an entire planet filled with ten billion innocent people. I cannot prove to you that I have this power, but how much is an innocent life worth to you? Even if it is as little as \$5,000, are you really willing to bet on ten trillion to one odds that I am lying?”

Do you give him the five dollars? I suspect that you do not. Indeed, I suspect that you’d be less likely to give him the five dollars than if he had merely said he was homeless and asked for five dollars to help pay for food. (Also, you may have objected that you value innocent lives, even faraway strangers you’ll never meet, at more than \$5,000 each—but if that’s the case, you should probably be donating more, because the world’s best charities can save a life for about \$3,000.)

But therein lies the paradox: Are you really willing to bet on ten trillion to one odds?

This argument gives me much the same feeling as the Ontological Argument; as Russell said of the latter, “it is much easier to be persuaded that ontological arguments are no good than it is to say exactly what is wrong with them.” It wasn’t until I read this post on GiveWell that I could really formulate the answer clearly enough to explain it.

The apparent force of Pascal’s Mugging comes from the idea of expected utility: Even if the probability of an event is very small, if it has a sufficiently great impact, the expected utility can still be large.
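To see where the mugger's "ten trillion to one" comes from, here is a minimal sketch of the expected-value arithmetic, using only the numbers in his speech. The function names are my own, and treating the choice as a pure expected-dollar calculation is an assumption made for illustration.

```python
def stakes_in_dollars(lives, value_per_life):
    """Total dollar value the mugger claims to be threatening."""
    return lives * value_per_life


def break_even_probability(lives, value_per_life, cost):
    """Smallest probability of the threat being real at which paying
    `cost` has non-negative expected value (pure expected-value view)."""
    return cost / stakes_in_dollars(lives, value_per_life)


def odds_against(lives, value_per_life, cost):
    """The 'N to one' odds you are implicitly betting on if you refuse."""
    return stakes_in_dollars(lives, value_per_life) // cost


# Numbers taken from the mugger's speech: ten billion lives,
# $5,000 per life, and a demand of $5.
LIVES = 10_000_000_000
VALUE_PER_LIFE = 5_000
COST = 5

print(break_even_probability(LIVES, VALUE_PER_LIFE, COST))
print(odds_against(LIVES, VALUE_PER_LIFE, COST))
```

On this naive view you should pay whenever you think the threat is more likely than the break-even probability, which is exactly the step the rest of the argument challenges.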

The problem with this argument is that extraordinary claims require extraordinary evidence. If a man held a gun to your head and said he’d shoot you if you didn’t give him five dollars, you’d give him five dollars. This is a plausible claim and he has provided ample evidence. If he were instead wearing a bomb vest (or even just really puffy clothing that could conceal a bomb vest), and he threatened to blow up a building unless you gave him five dollars, you’d probably do the same. This is less plausible (what kind of terrorist only demands five dollars?), but it’s not worth taking the chance.

But when he claims to have a Death Star parked in orbit of some distant planet, primed to make another Alderaan, you are right to be extremely skeptical. And if he claims to be a being from beyond our universe, primed to destroy so many lives that we couldn’t even write the number down with all the atoms in our universe (which was actually Yudkowsky’s original argument), to say that you are extremely skeptical seems a grievous understatement.

That GiveWell post provides a way to make this intuition mathematically precise in terms of Bayesian logic. If you have a normal prior with mean 0 and standard deviation 1, and you are presented with a likelihood with mean X and standard deviation X, what should you make your posterior distribution?

Normal priors are quite convenient; they conjugate nicely. The precision (inverse variance) of the posterior distribution is the sum of the two precisions, and the mean is a weighted average of the two means, weighted by their precision.

So the posterior variance is 1/(1 + 1/X^2).

The posterior mean is 1/(1+1/X^2)*(0) + (1/X^2)/(1+1/X^2)*(X) = X/(X^2+1).

That is, for large X the mean of the posterior distribution is just barely higher than zero—and in fact, it is decreasing in X, if X > 1.
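The update above can be checked numerically. This is a small sketch of the standard normal-normal conjugate update, with the prior fixed at mean 0 and standard deviation 1 and the claim at mean X and standard deviation X as in the text. The function names are hypothetical.

```python
def normal_posterior(prior_mean, prior_sd, claim_mean, claim_sd):
    """Combine a normal prior with a normal likelihood.

    Precisions (inverse variances) add, and the posterior mean is the
    precision-weighted average of the two means.
    Returns (posterior_mean, posterior_variance).
    """
    prior_precision = 1.0 / prior_sd ** 2
    claim_precision = 1.0 / claim_sd ** 2
    posterior_precision = prior_precision + claim_precision
    posterior_mean = (
        prior_precision * prior_mean + claim_precision * claim_mean
    ) / posterior_precision
    return posterior_mean, 1.0 / posterior_precision


def posterior_mean_for_claim(x):
    """Posterior mean when the claim has mean x and standard deviation x."""
    mean, _variance = normal_posterior(0.0, 1.0, x, x)
    return mean


# The bigger the claimed effect, the smaller the posterior mean once X > 1.
for x in (1, 2, 10, 1000):
    print(x, posterior_mean_for_claim(x), x / (x ** 2 + 1))
```

The two printed columns should agree, since the second is just the closed form X/(X^2+1) derived above.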

For those who don’t speak Bayesian: If someone says he’s going to have an effect of magnitude X, you should be less likely to believe him the larger that X is. And indeed this is precisely what our intuition said before: If he says he’s going to kill one person, believe him. If he says he’s going to destroy a planet, don’t believe him, unless he provides some really extraordinary evidence.

What sort of extraordinary evidence? To his credit, Yudkowsky imagined the sort of evidence that might actually be convincing:

If a poorly-dressed street person offers to save 10^(10^100) lives (googolplex lives) for \$5 using their Matrix Lord powers, and you claim to assign this scenario less than 10^-(10^100) probability, then apparently you should continue to believe absolutely that their offer is bogus even after they snap their fingers and cause a giant silhouette of themselves to appear in the sky.

This post he called “Pascal’s Muggle”, after the term from the Harry Potter series, since some of the solutions that had been proposed for dealing with Pascal’s Mugging had resulted in a situation almost as absurd, in which the mugger could exhibit powers beyond our imagining and yet nevertheless we’d never have sufficient evidence to believe him.

So, let me go on record as saying this: Yes, if someone snaps his fingers and causes the sky to rip open and reveal a silhouette of himself, I’ll do whatever that person says. The odds are still higher that I’m dreaming or hallucinating than that this is really a being from beyond our universe, but if I’m dreaming, it makes no difference, and if someone can make me hallucinate that vividly he can probably cajole the money out of me in other ways. And there might be just enough chance that this could be real that I’m willing to give up that five bucks.

These seem like really strange thought experiments, because they are. But like many good thought experiments, they can provide us with some important insights. In this case, I think they are telling us something about the way human reasoning can fail when faced with impacts beyond our normal experience: We are in danger of both over-estimating and under-estimating their effects, because our brains aren’t equipped to deal with magnitudes and probabilities on that scale. This has made me realize something rather important about both Singularitarianism and religion, but I’ll save that for next week’s post.

# The winner-takes-all effect

JDN 2457054 PST 14:06.

As I write there is some sort of mariachi band playing on my front lawn. It is actually rather odd that I have a front lawn, since my apartment is set back from the road; yet there is the patch of grass, and there is the band playing upon it. This sort of thing is part of the excitement of living in a large city (and Long Beach would seem like a large city were it not right next to the sprawling immensity that is Los Angeles—there are more people in Long Beach than in Cleveland, but there are more people in greater Los Angeles than in Sweden); with a certain critical mass of human beings come unexpected pieces of culture.

The fact that people agglomerate in this way is actually relevant to today’s topic, which is what I will call the winner-takes-all effect. I actually just finished reading a book called The Winner-Take-All Society, which is particularly horrifying to read because it came out in 1996. That’s almost twenty years ago, and things were already bad; and since then everything it describes has only gotten worse.

What is the winner-takes-all effect? It is the simple fact that in competitive capitalist markets, a small difference in quality can yield an enormous difference in return. The third most popular soda drink company probably still makes drinks that are pretty good, but do you have any idea what it is? There’s Coke, there’s Pepsi, and then there’s… uh… Dr. Pepper, apparently! But I didn’t know that before today and I bet you didn’t either. Now think about what it must be like to be the 15th most popular soda drink company, or the 37th. That’s the winner-takes-all effect.

I don’t generally follow football, but since tomorrow is the Super Bowl I feel some obligation to use that example as well. The highest-paid quarterback is Russell Wilson of the Seattle Seahawks, who is signing onto a five-year contract worth \$110 million (\$22 million a year). In annual income that will make him pass Jay Cutler of the Chicago Bears who has a seven-year contract worth \$127 million (about \$18.1 million a year). This shift may have something to do with the fact that the Seahawks are in the Super Bowl this year and the Bears are not (they haven’t been since 2007). Now consider what life is like for most football players; the median income of football players is most likely zero (at least as far as football-related income), and the median income of NFL players—the cream of the crop already—is \$770,000; that’s still very good money of course (more than Krugman makes, actually! But he could make more, if he were willing to sell out to Wall Street), but it’s barely 1/30 of what Wilson is going to be making. To make that million-dollar salary, you need to be the best, of the best, of the best (sir!). That’s the winner-takes-all effect.

To go back to the example of cities, it is for similar reasons that the largest cities (New York, Los Angeles, London, Tokyo, Shanghai, Hong Kong, Delhi) become packed with tens of millions of people while others (Long Beach, Ann Arbor, Cleveland) get hundreds of thousands and most (Petoskey, Ketchikan, Heber City, and hundreds of others you’ve never heard of) get only a few thousand. Beyond that there are thousands of tiny little hamlets that many don’t even consider cities. The median city probably has about 10,000 people in it, and that only because we’d stop calling it a city if it fell below 1,000. If we include every tiny little village, the median town size is probably about 20 people. Meanwhile the largest city in the world is Tokyo, with a greater metropolitan area that holds almost 38 million people—or to put it another way almost exactly as many people as California. Huh, LA doesn’t seem so big now does it? How big is a typical town? Well, that’s the thing about this sort of power-law distribution; the concept of “typical” or “average” doesn’t really apply anymore. Each little piece of the distribution has basically the same shape as the whole distribution, so there isn’t a “typical” size or scale. That’s the winner-takes-all effect.
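The "no typical scale" property can be made concrete with a small sketch. It assumes city sizes follow a Pareto (power-law) distribution with a minimum size of 1,000 and an exponent of 1. Both parameters are illustrative assumptions, not figures from the book, and the function names are my own.

```python
def pareto_survival(size, min_size=1_000.0, alpha=1.0):
    """P(city size > size) under an assumed Pareto distribution."""
    if size <= min_size:
        return 1.0
    return (min_size / size) ** alpha


def tail_ratio(size, factor=10.0, min_size=1_000.0, alpha=1.0):
    """Of the cities larger than `size`, the fraction that are also
    larger than `factor * size`."""
    return pareto_survival(factor * size, min_size, alpha) / pareto_survival(
        size, min_size, alpha
    )


# The ratio is the same whether we start from towns or from megacities:
# zooming in on any part of the tail shows the same shape.
for size in (10_000, 1_000_000, 10_000_000):
    print(size, tail_ratio(size))
```

For a distribution with a typical scale, such as a normal distribution, this ratio would collapse toward zero as the starting size grows. Here it stays fixed at `factor ** -alpha`, which is what self-similarity means in practice.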

As they freely admit in the book, it isn’t literally that a single winner takes everything. That is the theoretical maximum level of wealth inequality, and fortunately no society has ever quite reached it. The closest we get in today’s society is probably Saudi Arabia, which recently lost its king—and yes I do mean king in the fullest sense of the word, a man of virtually unlimited riches and near-absolute power. His net wealth was estimated at \$18 billion, which frankly sounds low; still even if that’s half the true amount it’s oddly comforting to know that he is still not quite as rich as Bill Gates (\$78 billion), who earned his wealth at least semi-legitimately in a basically free society. Say what you will about intellectual property rents and market manipulation—and you know I do—but they are worlds away from the wealth of Abdullah’s family, which was literally and directly robbed from millions of people by the power of the sword. Mostly he just inherited all that, and he did implement some minor reforms, but make no mistake: He was ruthless and by no means willing to give up his absolute power—he beheaded dozens of political dissidents, for example. Saudi Arabia does spread its wealth around a little, such that basically no one is below the UN poverty lines of \$1.25 and \$2 per day, but about a fourth of the population is below the national poverty line—which is just about the same distribution of wealth as what we have in the US, which actually makes me wonder just how free and legitimate our markets really are.

The winner-takes-all effect would really be more accurately described as the “top small fraction takes the vast majority” effect, but that isn’t nearly as catchy, now is it?

There are several different causes that can all lead to this same result. In the book, Robert Frank and Philip Cook argue that we should not attribute the cause to market manipulation, but in fact to the natural functioning of competitive markets. There’s something to be said for this—I used to buy the whole idea that competitive markets are the best, but increasingly I’ve been seeing ways that less competitive markets can make better overall outcomes.

Where they lose me is in arguing that the skyrocketing compensation packages for CEOs are due to their superior performance, and corporations are just being rational in competing for the best CEOs. If that were true, we wouldn’t find that the rank correlation between the CEO’s pay and the company’s stock performance is statistically indistinguishable from zero. Actually even a small positive correlation wouldn’t prove that the CEOs are actually performing well; it could just be that companies that perform well are willing to pay their CEOs more—and stock option compensation will do this automatically. But in fact the correlation is so tiny as to be negligible; corporations would be better off hiring a random person off the street and paying them \$50,000 for all the CEO does for their stock performance. If you adjust for the size of the company, you find that having a higher-paid CEO is positively related to performance for small startups, but negatively correlated for large well-established corporations. No, clearly there’s something going on here besides competitive pay for high performance—corruption comes to mind, which you’ll remember was the subject of my master’s thesis.

But in some cases there isn’t any apparent corruption, and yet we still see these enormously unequal distributions of income. Another good example of this is the publishing industry, in which J.K. Rowling can make over \$1 billion (she donated enough to charity to officially lose her billionaire status) but most authors make little or nothing, particularly those who can’t get published in the first place. I have no reason to believe that J.K. Rowling acquired this massive wealth by corruption; she just sold an awful lot of books: over 100 million of the first Harry Potter book alone.

But why would she be able to sell 100 million while thousands of authors who write books that are probably just as good, or nearly so, make nothing? Am I just bitter and envious, as Mitt Romney would say? Is J.K. Rowling actually a million times as good an author as I am?

Obviously not, right? She may be better, but she’s not that much better. So how is it that she ends up making a million times as much as I do from writing? It feels like squaring the circle: How can markets be efficient and competitive, yet some people are being paid millions of times as much as others despite being only slightly more productive?

The answer is simple but enormously powerful: positive feedback. Once you start doing well, it’s easier to do better. You have what economists call an economy of scale. The first 10,000 books sold is the hardest; then the next 10,000 is a little easier; the next 10,000 a little easier still. In fact I suspect that in many cases the first 10% growth is harder than the second 10% growth and so on—which is actually a much stronger claim. For my sales to grow 10% I’d need to add like 20 people. For J.K. Rowling’s sales to grow 10% she’d need to add 10 million. Yet it might actually be easier for J.K. Rowling to add 10 million than for me to add 20. If not, it isn’t much harder. Suppose we tried by just sending out enticing tweets. I have about 100 Twitter followers, so I’d need 0.2 sales per follower; she has about 4 million, so she’d need an average of 2.5 sales per follower. That’s an advantage for me, percentage-wise—but if we have the same uptake rate I sell 20 books and she sells 800,000.
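The tweet arithmetic is easy to check. This sketch uses only the rough figures quoted above (100 followers versus about 4 million, 20 extra sales versus 10 million), and the helper names are hypothetical.

```python
def sales_needed_per_follower(extra_sales, followers):
    """Sales each follower would have to generate to hit the growth target."""
    return extra_sales / followers


def sales_at_uptake(followers, uptake_rate):
    """Books sold if every follower buys at the same average rate."""
    return followers * uptake_rate


MY_FOLLOWERS, MY_TARGET = 100, 20
HER_FOLLOWERS, HER_TARGET = 4_000_000, 10_000_000

my_rate = sales_needed_per_follower(MY_TARGET, MY_FOLLOWERS)
her_rate = sales_needed_per_follower(HER_TARGET, HER_FOLLOWERS)
print(my_rate, her_rate)

# Same uptake rate for both of us: the absolute gap is enormous.
print(round(sales_at_uptake(MY_FOLLOWERS, my_rate)))
print(round(sales_at_uptake(HER_FOLLOWERS, my_rate)))
```

The percentage target favors the small author, but at equal uptake the absolute sales scale with audience size, which is the feedback loop in miniature.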

Languages are also like this, which is why I can write this post in English and yet people can still read it around the world. English is the winner of the language competition (we call it the lingua franca, as weird as that is—French is not the lingua franca anymore). The losers are those hundreds of New Guinean languages you’ve never heard of, many of which are dying. And their distribution obeys, once again, a power-law. (Individual words actually obey a power-law as well, which makes this whole fractal business delightfully ever more so.)

Network externalities are not the only way that the winner-takes-all effect can occur, though I think they are the most common. You can also have economies of scale from the supply side, particularly in the case of information: Recording a song is a lot of time and effort, but once you record a song, it’s trivial to make more copies of it. So that first recording costs a great deal, while every subsequent recording costs next to nothing. This is probably also at work in the case of J.K. Rowling and the NFL; the two phenomena are by no means mutually exclusive. But clearly the sizes of cities are due to network externalities: It’s quite expensive to live in a big city—no supply-side economy of scale—but you want to live in a city where other people live because that’s where friends and family and opportunities are.

The most worrisome kind of winner-takes-all effect is what Frank and Cook call deep pockets: Once you have concentration of wealth in a few hands, those few individuals can now choose their own winners in a much more literal sense: the rich can commission works of art from their favorite artists, exacerbating the inequality among artists; worse yet they can use their money to influence politicians (as the Kochs are planning on spending \$900 million—\$3 for every person in America—to do in 2016) and exacerbate the inequality in the whole system. That gives us even more positive feedback on top of all the other positive feedbacks.

Sure enough, if you run the standard neoclassical economic models of competition and just insert the assumption of economies of scale, the result is concentration of wealth—in fact, if nothing about the rules prevents it, the result is a complete monopoly. Nor is this result in any sense economically efficient; it’s just what naturally happens in the presence of economies of scale.
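Here is a toy illustration of that dynamic. It is not one of the standard neoclassical models, just a deliberately simple sketch: each round, a firm's new market share is assumed proportional to its old share raised to a power, where a power above 1 stands in for economies of scale. The update rule, the starting shares, and the names are all assumptions made for illustration.

```python
def next_shares(shares, returns_to_scale):
    """One round of competition: bigger firms grow disproportionately
    when returns_to_scale > 1 (assumed toy rule)."""
    weights = [share ** returns_to_scale for share in shares]
    total = sum(weights)
    return [weight / total for weight in weights]


def simulate(shares, returns_to_scale, rounds=20):
    """Apply the update rule repeatedly and return the final shares."""
    for _ in range(rounds):
        shares = next_shares(shares, returns_to_scale)
    return shares


# Four nearly identical firms; one has a slight head start.
START = [0.26, 0.25, 0.25, 0.24]

print(simulate(START, returns_to_scale=1.0))  # constant returns
print(simulate(START, returns_to_scale=1.5))  # increasing returns
```

With constant returns the shares stay where they started. With increasing returns the one-point head start compounds every round, and the leading firm ends up with essentially the whole market.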

Frank and Cook seem most concerned about the fact that these winner-take-all incomes will tend to push too many people to seek those careers, leaving millions of would-be artists, musicians and quarterbacks with dashed dreams when they might have been perfectly happy as electrical engineers or high school teachers. While this may be true—next week I’ll go into detail about prospect theory and why human beings are terrible at making judgments based on probability—it isn’t really what I’m most concerned about. For all the cost of frustrated ambition there is also a good deal of benefit; striving for greatness does not just make the world better if we succeed, it can make ourselves better even if we fail. I’d strongly encourage people to have backup plans; but I’m not going to tell people to stop painting, singing, writing, or playing football just because they’re unlikely to make a living at it. The one concern I do have is that the competition is so fierce that we are pressured to go all in, to not have backup plans, to use performance-enhancing drugs—they may carry awful risks, but they also work. And it’s probably true, actually, that you’re a bit more likely to make it all the way to the top if you don’t have a backup plan. You’re also vastly more likely to end up at the bottom. Is raising your probability of being a bestselling author from 0.00011% to 0.00012% worth giving up all other career options? Skipping chemistry class to practice football may improve your chances of being an NFL quarterback from 0.000013% to 0.000014%, but it will also drop your chances of being a chemical engineer from 95% (a degree in chemical engineering almost guarantees you a job eventually) to more like 5% (it’s hard to get a degree when you flunk all your classes).
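The football-versus-chemistry trade can be turned into a break-even calculation using only the probabilities quoted above. The sketch assumes a pure expected-value comparison in which every other outcome is worth nothing, which is a simplification of mine, and the names are hypothetical.

```python
def percent_to_probability(percent):
    """Convert a percentage such as 95 into a probability such as 0.95."""
    return percent / 100.0


# Figures from the text: skipping chemistry class nudges the quarterback
# odds up slightly and slashes the chemical-engineering odds.
quarterback_gain = percent_to_probability(0.000014) - percent_to_probability(0.000013)
engineer_loss = percent_to_probability(95) - percent_to_probability(5)

# Skipping class only pays off in expectation if a quarterback career is
# worth at least this many times an engineering career.
break_even_ratio = engineer_loss / quarterback_gain

print(f"{break_even_ratio:.3g}")
```

The ratio comes out on the order of tens of millions, so even enormous quarterback salaries fall far short of justifying the gamble on these numbers.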

Frank and Cook offer a solution that I think is basically right; they call it positional arms control agreements. By analogy with arms control agreements between nations—and what is war, if not the ultimate winner-takes-all contest?—they propose that we use taxation and regulation policy to provide incentives to make people compete less fiercely for the top positions. Some of these we already do: Performance-enhancing drugs are banned in professional sports, for instance. Even where there are no regulations, we can use social norms: That’s why it’s actually a good thing that your parents rarely support your decision to drop out of school and become a movie star.

That’s yet another reason why progressive taxation is a good idea, as if we needed another; by paring down those top incomes it makes the prospect of winning big less enticing. If NFL quarterbacks only made 10 times what chemical engineers make instead of 300 times, people would be a lot more hesitant to give up on chemical engineering to become a quarterback. If top Wall Street executives only made 50 times what normal people make instead of 5000, people with physics degrees might go back to actually being physicists instead of speculating on stock markets.

There is one case where we might not want fewer people to try, and that is entrepreneurship. Most startups fail, and only a handful go on to make mind-bogglingly huge amounts of money (often for no apparent reason, like the Snuggie and Flappy Bird), yet entrepreneurship is what drives the dynamism of a capitalist economy. We need people to start new businesses, and right now they do that mainly because of a tiny chance of a huge benefit. Yet we don’t want them to be too unrealistic in their expectations: Entrepreneurs are much more optimistic than the general population, but the most successful entrepreneurs are a bit less optimistic than other entrepreneurs. The most successful strategy is to be optimistic but realistic; this outperforms both unrealistic optimism and pessimism. That seems pretty intuitive; you have to be confident you’ll succeed, but you can’t be totally delusional. Yet it’s precisely the realistic optimists who are most likely to be disincentivized by a reduction in the top prizes.

Here’s my solution: Let’s change it from a tiny chance of a huge benefit to a large chance of a moderately large benefit. Let’s reward entrepreneurs for trying—with standards for what constitutes a really serious, good attempt rather than something frivolous that was guaranteed to fail. Use part of the funds from the progressive tax as a fund for angel grants, provided to a large number of the most promising entrepreneurs. It can’t be a million-dollar prize for the top 100. It needs to be more like a \$50,000 prize for the top 100,000 (which would cost \$5 billion a year, affordable for the US government). It should be paid at the proposal phase; the top 100,000 business plans receive the funding and are under no obligation to repay it. It has to be enough money that someone can rationally commit themselves to years of dedicated work without throwing themselves into poverty, and it has to be confirmed money so that they don’t have to worry about throwing themselves into debt. As for the upper limit, it only needs to be small enough that there is still an incentive for the business to succeed; but even with a 99% tax Mark Zuckerberg would still be a millionaire, so the rewards for success are high indeed.

The good news is that we actually have such a system to some extent. For research scientists rather than entrepreneurs, NSF grants are pretty close to what I have in mind, but at present they are a bit too competitive: 8,000 research grants with a median of \$130,000 each and a 20% acceptance rate isn’t quite enough people—the acceptance rate should be higher, since most of these proposals are quite worthy. Still, it’s close, and definitely a much better incentive system than what we have for entrepreneurs; there are almost 12 million entrepreneurs in the United States, starting 6 million businesses a year, 75% of which fail before they can return their venture capital. Those that succeed have incomes higher than the general population, with a median income of around \$70,000 per year, but most of this is accounted for by the fact that entrepreneurs are more educated and talented than the general population. Once you factor that in, successful entrepreneurs have about 50% more income on average, but their standard deviation of income is also 60% higher—so some are getting a lot and some are getting very little. Since 75% fail, we’re talking about a 25% chance of entering an income distribution that’s higher on average but much more variable, and a 75% chance of going through a period with little or no income at all—is it worth it? Maybe, maybe not. But if you could get a guaranteed \$50,000 for having a good idea—and let me be clear, only serious proposals that have a good chance of success should qualify—that deal sounds an awful lot better.