Bundling the stakes to recalibrate ourselves

Mar 31 JDN 2460402

In a previous post I reflected on how our minds evolved for an environment of immediate return: An immediate threat with high chance of success and life-or-death stakes. But the world we live in is one of delayed return: delayed consequences with low chance of success and minimal stakes.

We evolved for a world where you need to either jump that ravine right now or you’ll die; but we live in a world where you’ll submit a hundred job applications before finally getting a good offer.

Thus, our anxiety system is miscalibrated for our modern world, and this miscalibration causes us to have deep, chronic anxiety which is pathological, instead of brief, intense anxiety that would protect us from harm.

I had an idea for how we might try to jury-rig this system and recalibrate ourselves:

Bundle the stakes.

Consider job applications.

The obvious way to think about it is to consider each application, and decide whether it’s worth the effort.

Any particular job application in today’s market probably costs you 30 minutes, but you won’t hear back for 2 weeks, and you have maybe a 2% chance of success. But if you fail, all you lost was that 30 minutes. This is the exact opposite of what our brains evolved to handle.

So now suppose you think of it in terms of sending 100 job applications.

That will cost you 30 times 100 minutes = 50 hours. You still won’t hear back for weeks, but you’ve spent weeks, so that won’t feel as strange. And your chances of success after 100 applications are something like 1-(0.98)^100 = 87%.

Even losing 50 hours over a few weeks is not the disaster that falling down a ravine is. But it still feels a lot more reasonable to be anxious about that than to be anxious about losing 30 minutes.

More importantly, we have radically changed the chances of success.

Each individual application will almost certainly fail, but all 100 together will probably succeed.
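A minimal sketch of that arithmetic, assuming every application succeeds independently with the same 2% chance (a simplification; the function name is made up for illustration):

```python
def chance_of_at_least_one_offer(p_single: float, n_applications: int) -> float:
    """Probability that at least one of n independent applications succeeds."""
    # All n applications fail with probability (1 - p)^n;
    # the complement is "at least one of them works".
    return 1.0 - (1.0 - p_single) ** n_applications


# Compare a single application with bundles of 100, 150, and 200.
for n in (1, 100, 150, 200):
    print(f"{n:>3} applications: {chance_of_at_least_one_offer(0.02, n):.0%}")
```

The per-application odds never change; only the size of the bundle does.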

If we were optimally rational, these two methods would lead to the same outcomes, by a rather deep mathematical law, the linearity of expectation:
E[nX] = n E[X]

Thus, the expected utility of doing something n times is precisely n times the expected utility of doing it once (all other things equal); and so, it doesn’t matter which way you look at it.
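One hedged way to check this numerically: treat each application as an independent success-or-failure trial (an assumption, as are the names below), compute the expected number of offers from the full binomial distribution, and compare it with n times the single-application expectation.

```python
from math import comb, isclose


def expected_offers_by_enumeration(n: int, p: float) -> float:
    """Expected number of successes, summed over every possible count k."""
    return sum(k * comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1))


n, p = 100, 0.02
bundled = expected_offers_by_enumeration(n, p)  # expectation of the whole bundle
one_at_a_time = n * p                           # n times the expectation of one
print(bundled, one_at_a_time, isclose(bundled, one_at_a_time))
```

Both views give the same expectation. What bundling changes is the probability you are looking at: "this one works" versus "at least one of these works".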

But of course we aren’t perfectly rational. We don’t actually respond to the expected utility. It’s still not entirely clear how we do assess probability in our minds (prospect theory seems to be onto something, but it’s computationally harder than rational probability, which means it makes absolutely no sense to evolve it).

If instead we are trying to match up our decisions with a much simpler heuristic that evolved for things like jumping over ravines, our representation of probability may be very simple indeed, something like “definitely”, “probably”, “maybe”, “probably not”, “definitely not”. (This is essentially my categorical prospect theory, which, like the stochastic overload model, is a half-baked theory that I haven’t published and at this point probably never will.)

2% chance of success is solidly “probably not” (or maybe something even stronger, like “almost definitely not”). Then, outcomes that are in that category are presumably weighted pretty low, because they generally don’t happen. Unless they are really good or really bad, it’s probably safest to ignore them—and in this case, they are neither.

But 87% chance of success is a clear “probably”; and outcomes in that category deserve our attention, even if their stakes aren’t especially high. And in fact, by bundling them, we have even made the stakes a bit higher—likely making the outcome a bit more salient.

The goal is to change “this will never work” to “this is going to work”.

For an individual application, there’s really no way to do that (without self-delusion); maybe you can make the odds a little better than 2%, but you surely can’t make them so high they deserve to go all the way up to “probably”. (At best you might manage a “maybe”, if you’ve got the right contacts or something.)

But for the whole set of 100 applications, this is in fact the correct assessment. It will probably work. And if 100 doesn’t, 150 might; if 150 doesn’t, 200 might. At no point do you need to delude yourself into over-estimating the odds, because the actual odds are in your favor.

This isn’t perfect, though.

There’s a glaring problem with this technique that I still can’t resolve: It feels overwhelming.

Doing one job application is really not that big a deal. It accomplishes very little, but also costs very little.

Doing 100 job applications is an enormous undertaking that will take up most of your time for multiple weeks.

So if you are feeling demotivated, asking you to bundle the stakes is asking you to take on a huge, overwhelming task that surely feels utterly beyond you.

Also, when it comes to this particular example, I even managed to do 100 job applications and still get a pretty bad outcome: My only offer was Edinburgh, and I ended up being miserable there. I have reason to believe that these were exceptional circumstances (due to COVID), but it has still been hard to shake the feeling of helplessness I learned from that ordeal.

Maybe there’s some additional reframing that can help here. If so, I haven’t found it yet.

But maybe stakes bundling can help you, or someone out there, even if it can’t help me.

Statisticacy

Jun 11 JDN 2460107

I wasn’t able to find a dictionary that includes the word “statisticacy”, but it doesn’t trigger my spell-check, and it does seem to have the same form as “numeracy”: numeric, numerical, numeracy, numerate; statistic, statistical, statisticacy, statisticate. It definitely still sounds very odd to my ears. Perhaps repetition will eventually make it familiar.

For the concept is clearly a very important one. Illiteracy and innumeracy are no longer serious problems in the First World; basically every adult at this point knows how to read and do addition. Even worldwide, 90% of men and 83% of women can read, at least at a basic level, which is an astonishing feat of our civilization by the way, well worthy of celebration.

But I have noticed a disturbing lack of, well, statisticacy. Even intelligent, educated people seem… pretty bad at understanding statistics.

I’m not talking about sophisticated econometrics here; of course most people don’t know that, and don’t need to. (Most economists don’t know that!) I mean quite basic statistical knowledge.

A few years ago I wrote a post called “Statistics you should have been taught in high school, but probably weren’t”; that’s the kind of stuff I’m talking about.

As part of being a good citizen in a modern society, every adult should understand the following:

1. The difference between a mean and a median, and why average income (mean) can increase even though most people are no richer (median).

2. The difference between increasing by X% and increasing by X percentage points: If inflation goes from 4% to 5%, that is an increase of 25% ((5/4-1)*100%), but only 1 percentage point (5%-4%).

3. The meaning of standard error, and how to interpret error bars on a graph—and why it’s a huge red flag if there aren’t any error bars on a graph.

4. Basic probabilistic reasoning: Given some scratch paper, a pen, and a calculator, everyone should be able to work out the odds of drawing a given blackjack hand, or rolling a particular number on a pair of dice. (If that’s too easy, make it a poker hand and four dice. But mostly that’s just more calculation effort, not fundamentally different.)

5. The meaning of exponential growth rates, and how they apply to economic growth and compound interest. (The difference between 3% interest and 6% interest over 30 years is more than double the total amount paid.)
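A few of these can be checked in a handful of lines. This is only a sketch: the income figures are invented for illustration, the helper names are hypothetical, and the compound-growth comparison is one reading of the interest example (a balance growing at a fixed annual rate).

```python
from statistics import mean, median

# Item 1: hypothetical incomes where only the top earner gets richer.
incomes_before = [30_000, 40_000, 50_000, 60_000, 120_000]
incomes_after = [30_000, 40_000, 50_000, 60_000, 320_000]
print("mean:  ", mean(incomes_before), "->", mean(incomes_after))
print("median:", median(incomes_before), "->", median(incomes_after))


# Item 2: percent change versus percentage-point change.
def percent_change(old: float, new: float) -> float:
    return (new - old) / old * 100.0


def percentage_point_change(old_rate_pct: float, new_rate_pct: float) -> float:
    return new_rate_pct - old_rate_pct


print("inflation 4% -> 5%:", percent_change(4, 5), "percent,",
      percentage_point_change(4, 5), "percentage point")


# Item 5: compound growth of a balance at a fixed annual rate.
def growth_factor(annual_rate: float, years: int) -> float:
    return (1.0 + annual_rate) ** years


print("30 years at 3%:", round(growth_factor(0.03, 30), 2))
print("30 years at 6%:", round(growth_factor(0.06, 30), 2))
```

In the made-up income data the mean rises while the median stays put, which is the pattern item 1 warns about.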

I see people making errors about this sort of thing all the time.

Economic news that celebrates rising GDP but wonders why people aren’t happier (when real median income has been falling since 2019 and is only 7% higher than it was in 1999, an annual growth rate of roughly 0.3%).

Reports on inflation, interest rates, or poll numbers that don’t clearly specify whether they are dealing with percentages or percentage points. (XKCD made fun of this.)

Speaking of poll numbers, any reporting on a change in the polls when that change isn’t at least twice the margin of error of the polls in question. (There’s also a comic for this; this time it’s PhD Comics.)

People misunderstanding interest rates and gravely underestimating how much they’ll pay for their debt (then again, this is probably the result of strategic choices on the part of banks—so maybe the real failure is regulatory).

And, perhaps worst of all, the plague of science news articles about “New study says X”. Things causing and/or curing cancer, things correlated with personality types, tiny psychological nudges that supposedly have profound effects on behavior.

Some of these things will even turn out to be true; actually I think this one on fibromyalgia, this one on smoking, and this one on body image are probably accurate. But even if it’s a properly randomized experiment—and especially if it’s just a regression analysis—a single study ultimately tells us very little, and it’s irresponsible to report on them instead of telling people the extensive body of established scientific knowledge that most people still aren’t aware of.

Basically any time an article is published saying “New study says X”, a statisticate person should ignore it and treat it as random noise. This is especially true if the finding seems weird or shocking; such findings are far more likely to be random flukes than genuine discoveries. Yes, they could be true, but one study just doesn’t move the needle that much.

I don’t remember where it came from, but there is a saying about this: “What is in the textbooks is 90% true. What is in the published literature is 50% true. What is in the press releases is 90% false.” These figures are approximately correct.

If their goal is to advance public knowledge of science, science journalists would accomplish a lot more if they just opened to a random page in a mainstream science textbook and started reading it on air. Admittedly, I can see how that would be less interesting to watch; but then, their job should be to find a way to make it interesting, not to take individual studies out of context and hype them up far beyond what they deserve. (Bill Nye did this much better than most science journalists.)

I’m not sure how much to blame people for lacking this knowledge. On the one hand, they could easily look it up on Wikipedia, and apparently choose not to. On the other hand, they probably don’t even realize how important it is, and were never properly taught it in school even though they should have been. Many of these things may even be unknown unknowns; people simply don’t realize how poorly they understand. Maybe the most useful thing we could do right now is simply point out to people that these things are important, and if they don’t understand them, they should get on that Wikipedia binge as soon as possible.

And one last thing: Maybe this is asking too much, but I think that a truly statisticate person should be able to solve the Monty Hall Problem and not be confused by the result. (Hint: It’s very important that Monty Hall knows which door the car is behind, and would never open that one. If he’s guessing at random and simply happens to pick a goat, the correct answer is 1/2, not 2/3. Then again, it’s never a bad choice to switch.)
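For anyone who would rather verify the hint than trust it, here is a small exact enumeration. It is a sketch under stated assumptions: the player always picks door 0, the car is equally likely to be behind any door, and a host with a free choice picks uniformly at random. The function name is hypothetical.

```python
from fractions import Fraction


def switch_win_probability(host_knows: bool) -> Fraction:
    """P(switching wins | the host opened a door and revealed a goat)."""
    pick = 0
    others = [door for door in range(3) if door != pick]
    win_weight = Fraction(0)
    total_weight = Fraction(0)
    for car in range(3):
        p_car = Fraction(1, 3)
        # An informed host never opens the car's door; a guessing host might.
        openable = [d for d in others if d != car] if host_knows else others
        for opened in openable:
            if opened == car:
                continue  # the car was revealed; ruled out by the conditioning
            weight = p_car * Fraction(1, len(openable))
            total_weight += weight
            remaining = next(d for d in others if d != opened)
            if remaining == car:
                win_weight += weight
    return win_weight / total_weight


print("host knows:  ", switch_win_probability(True))
print("host guesses:", switch_win_probability(False))
```

Either way, switching never does worse than staying.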

When maximizing utility doesn’t

Jun 4 JDN 2460100

Expected utility theory behaves quite strangely when you consider questions involving mortality.

Nick Beckstead and Teruji Thomas recently published a paper on this: All well-defined utility functions are either reckless in that they make you take crazy risks, or timid in that they tell you not to take even very small risks. It’s starting to make me wonder if utility theory is even the right way to make decisions after all.

Consider a game of Russian roulette where the prize is $1 million. The revolver has 6 chambers, 3 with a bullet. So that’s a 1/2 chance of $1 million, and a 1/2 chance of dying. Should you play?

I think it’s probably a bad idea to play. But the prize does matter; if it were $100 million, or $1 billion, maybe you should play after all. And if it were $10,000, you clearly shouldn’t.

And lest you think that there is no chance of dying you should be willing to accept for any amount of money, consider this: Do you drive a car? Do you cross the street? Do you do anything that could ever have any risk of shortening your lifespan in exchange for some other gain? I don’t see how you could live a remotely normal life without doing so. It might be a very small risk, but it’s still there.

This raises the question: Suppose we have some utility function over wealth; ln(x) is a quite plausible one. What utility should we assign to dying?

The fact that the prize matters means that we can’t assign death a utility of negative infinity. It must be some finite value.

But suppose we choose some value, -V, (so V is positive), for the utility of dying. Then we can find some amount of money that will make you willing to play: ln(x) = V, x = e^(V).
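A rough sketch of that break-even calculation. It assumes the 50/50 game described above and utility normalized so that declining to play is worth 0; the value V = 20 is purely illustrative and the function names are hypothetical.

```python
import math


def expected_utility_of_playing(prize: float, death_disutility: float,
                                p_death: float = 0.5) -> float:
    """Expected utility with ln(prize) for winning and -V for dying."""
    return (1 - p_death) * math.log(prize) - p_death * death_disutility


def break_even_prize(death_disutility: float) -> float:
    """Prize x solving ln(x) = V, i.e. x = e^V (the 50/50 case)."""
    return math.exp(death_disutility)


V = 20.0  # illustrative only; nobody knows the right number
x = break_even_prize(V)
print(f"break-even prize: {x:,.0f}")
print(f"expected utility at that prize: {expected_utility_of_playing(x, V):.6f}")
```

Any finite V gives a finite break-even prize, which is the point: only an infinite V would make every prize too small.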

Now, suppose that you have the chance to play this game over and over again. Your marginal utility of wealth will change each time you win, so we may need to increase the prize to keep you playing; but we could do that. The prizes could keep scaling up as needed to make you willing to play. So then, you will keep playing, over and over—and then, sooner or later, you’ll die. So, at each step you maximized utility—but at the end, you didn’t get any utility.
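To see how fast "sooner or later" arrives, here is a short sketch of the chance of still being alive after repeated rounds, assuming each round is an independent 1/2 chance of death as in the game above (the function name is made up):

```python
from fractions import Fraction


def survival_probability(rounds: int, p_death: Fraction = Fraction(1, 2)) -> Fraction:
    """Chance of surviving every one of `rounds` independent plays."""
    return (1 - p_death) ** rounds


for rounds in (1, 5, 10, 20):
    print(f"{rounds:>2} rounds: {survival_probability(rounds)}")
```

Every round can look worthwhile on its own while the chance of surviving all of them shrinks toward zero.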

Well, at that point your heirs will be rich, right? So maybe you’re actually okay with that. Maybe there is some amount of money ($1 billion?) that you’d be willing to die in order to ensure your heirs have.

But what if you don’t have any heirs? Or, what if we consider making such a decision as a civilization? What if death means not only the destruction of you, but also the destruction of everything you care about?

As a civilization, are there choices before us that would result in some chance of a glorious, wonderful future, but also some chance of total annihilation? I think it’s pretty clear that there are. Nuclear technology, biotechnology, artificial intelligence. For about the last century, humanity has been at a unique epoch: We are being forced to make this kind of existential decision, to face this kind of existential risk.

It’s not that we were immune to being wiped out before; an asteroid could have taken us out at any time (as happened to the dinosaurs), and a volcanic eruption nearly did. But this is the first time in humanity’s existence that we have had the power to destroy ourselves. This is the first time we have a decision to make about it.

One possible answer would be to say we should never be willing to take any kind of existential risk. Unlike the case of an individual, when we are speaking about an entire civilization, it no longer seems obvious that we shouldn’t set the utility of death at negative infinity. But if we really did this, it would require shutting down whole industries: definitely halting all research in AI and biotechnology, probably disarming all nuclear weapons and destroying all their blueprints, and quite possibly even shutting down the coal and oil industries. It would be an utterly radical change, and it would require bearing great costs.

On the other hand, if we should decide that it is sometimes worth the risk, we will need to know when it is worth the risk. We currently don’t know that.

Even worse, we will need some mechanism for ensuring that we don’t take the risk when it isn’t worth it. And we have nothing like such a mechanism. In fact, most of our process of research in AI and biotechnology is widely dispersed, with no central governing authority and regulations that are inconsistent between countries. I think it’s quite apparent that right now, there are research projects going on somewhere in the world that aren’t worth the existential risk they pose for humanity—but the people doing them are convinced that they are worth it because they so greatly advance their national interest—or simply because they could be so very profitable.

In other words, humanity finally has the power to make a decision about our survival, and we’re not doing it. We aren’t making a decision at all. We’re letting that responsibility fall upon more or less randomly-chosen individuals in government and corporate labs around the world. We may be careening toward an abyss, and we don’t even know who has the steering wheel.

Terrible but not likely, likely but not terrible

May 17 JDN 2458985

The human brain is a remarkably awkward machine. It’s really quite bad at organizing data, relying on associations rather than formal categories.

It is particularly bad at negation. For instance, if I tell you that right now, no matter what, you must not think about a yellow submarine, the first thing you will do is think about a yellow submarine. (You may even get the Beatles song stuck in your head, especially now that I’ve mentioned it.) A computer would never make such a grievous error.

The human brain is also quite bad at separation. Daniel Dennett coined a word “deepity” for a particular kind of deep-sounding but ultimately trivial aphorism that seems to be quite common, which relies upon this feature of the brain. A deepity has at least two possible readings: On one reading, it is true, but utterly trivial. On another, it would be profound if true, but it simply isn’t true. But if you experience both at once, your brain is triggered for both “true” and “profound” and yields “profound truth”. The example he likes to use is “Love is just a word”. Well, yes, “love” is in fact just a word, but who cares? Yeah, words are words. But love, the underlying concept it describes, is not just a word—though if it were that would change a lot.

One thing I’ve come to realize about my own anxiety is that it involves a wide variety of different scenarios I imagine in my mind, and broadly speaking these can be sorted into two categories: Those that are likely but not terrible, and those that are terrible but not likely.

In the former category we have things like taking an extra year to finish my dissertation; the mean time to completion for a PhD is over 8 years, so finishing in 6 instead of 5 can hardly be considered catastrophic.

In the latter category we have things like dying from COVID-19. Yes, I’m a male with type A blood and asthma living in a high-risk county; but I’m also a young, healthy nonsmoker living under lockdown. Even without knowing the true fatality rate of the virus, my chances of actually dying from it are surely less than 1%.

But when both of those scenarios are running through my brain at the same time, the first triggers a reaction for “likely” and the second triggers a reaction for “terrible”, and I get this feeling that something terrible is actually likely to happen. And indeed if my probability of dying were as high as my probability of needing a 6th year to finish my PhD, that would be catastrophic.

I suppose it’s a bit strange that the opposite doesn’t happen: I never seem to get the improbability of dying attached to the mildness of needing an extra year. The confusion never seems to trigger “neither terrible nor likely”. Or perhaps it does, and my brain immediately disregards that as not worthy of consideration? It makes a certain sort of sense: An event that is neither probable nor severe doesn’t seem to merit much anxiety.

I suspect that many other people’s brains work the same way, eliding distinctions between different outcomes and ending up with a sort of maximal product of probability and severity.

The solution to this is not an easy one: It requires deliberate effort and extensive practice, and benefits greatly from formal training by a therapist. Counter-intuitively, you need to actually focus more on the scenarios that cause you anxiety, and accept the anxiety that such focus triggers in you. I find that it helps to actually write down the details of each scenario as vividly as possible, and review what I have written later. After doing this enough times, you can build up a greater separation in your mind, and more clearly categorize: this one is likely but not terrible, that one is terrible but not likely. It isn’t a cure, but it definitely helps me a great deal. Perhaps it could help you.

Pascal’s Mugging

Nov 10 JDN 2458798

In the Singularitarian community there is a paradox known as “Pascal’s Mugging”. The name is an intentional reference to Pascal’s Wager (and the link is quite apt, for reasons I’ll discuss in a later post).

There are a few different versions of the argument; Yudkowsky’s original argument in which he came up with the name “Pascal’s Mugging” relies upon the concept of the universe as a simulation and an understanding of esoteric mathematical notation. So here is a more intuitive version:

A strange man in a dark hood comes up to you on the street. “Give me five dollars,” he says, “or I will destroy an entire planet filled with ten billion innocent people. I cannot prove to you that I have this power, but how much is an innocent life worth to you? Even if it is as little as $5,000, are you really willing to bet on ten trillion to one odds that I am lying?”

Do you give him the five dollars? I suspect that you do not. Indeed, I suspect that you’d be less likely to give him the five dollars than if he had merely said he was homeless and asked for five dollars to help pay for food. (Also, you may have objected that you value innocent lives, even faraway strangers you’ll never meet, at more than $5,000 each; but if that’s the case, you should probably be donating more, because the world’s best charities can save a life for about $3,000.)

But therein lies the paradox: Are you really willing to bet on ten trillion to one odds?
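The mugger's "ten trillion to one" can be reproduced from his own numbers. This sketch only restates the arithmetic of a naive expected-value calculation, which is the reasoning the paradox trades on, not an endorsement of it; the variable names are made up.

```python
lives_threatened = 10_000_000_000  # ten billion people
value_per_life = 5_000             # dollars, the mugger's lowball figure
demand = 5                         # dollars he is asking for

# Naive expected value: paying looks "worth it" whenever
#   P(he is telling the truth) * (total value at stake) > demand.
total_value_at_stake = lives_threatened * value_per_life
break_even_odds = total_value_at_stake // demand

print(f"value at stake:  ${total_value_at_stake:,}")
print(f"break-even odds: {break_even_odds:,} to one")
```

By that naive logic, refusing means you are confident at better than ten trillion to one that he is lying.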

This argument gives me much the same feeling as the Ontological Argument; as Russell said of the latter, “it is much easier to be persuaded that ontological arguments are no good than it is to say exactly what is wrong with them.” It wasn’t until I read this post on GiveWell that I could really formulate the answer clearly enough to explain it.

The apparent force of Pascal’s Mugging comes from the idea of expected utility: Even if the probability of an event is very small, if it has a sufficiently great impact, the expected utility can still be large.

The problem with this argument is that extraordinary claims require extraordinary evidence. If a man held a gun to your head and said he’d shoot you if you didn’t give him five dollars, you’d give him five dollars. This is a plausible claim and he has provided ample evidence. If he were instead wearing a bomb vest (or even just really puffy clothing that could conceal a bomb vest), and he threatened to blow up a building unless you gave him five dollars, you’d probably do the same. This is less plausible (what kind of terrorist only demands five dollars?), but it’s not worth taking the chance.

But when he claims to have a Death Star parked in orbit of some distant planet, primed to make another Alderaan, you are right to be extremely skeptical. And if he claims to be a being from beyond our universe, primed to destroy so many lives that we couldn’t even write the number down with all the atoms in our universe (which was actually Yudkowsky’s original argument), to say that you are extremely skeptical seems a grievous understatement.

That GiveWell post provides a way to make this intuition mathematically precise in terms of Bayesian logic. If you have a normal prior with mean 0 and standard deviation 1, and you are presented with a likelihood with mean X and standard deviation X, what should you make your posterior distribution?

Normal priors are quite convenient; they conjugate nicely. The precision (inverse variance) of the posterior distribution is the sum of the two precisions, and the mean is a weighted average of the two means, weighted by their precision.

So the posterior variance is 1/(1 + 1/X^2).

The posterior mean is 1/(1+1/X^2)*(0) + (1/X^2)/(1+1/X^2)*(X) = X/(X^2+1).

That is, the mean of the posterior distribution is just barely higher than zero—and in fact, it is decreasing in X, if X > 1.
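The same update can be sketched in code, using the precision-weighting rule described above with the prior (mean 0, standard deviation 1) and a claim of mean X and standard deviation X. The function name is hypothetical.

```python
def posterior_mean_and_variance(x: float) -> tuple[float, float]:
    """Normal-normal update: prior N(0, 1), claim with mean x and sd x."""
    prior_mean, prior_precision = 0.0, 1.0
    claim_mean, claim_precision = x, 1.0 / x**2
    # Precisions add; the posterior mean is the precision-weighted average.
    posterior_precision = prior_precision + claim_precision
    posterior_mean = (prior_precision * prior_mean
                      + claim_precision * claim_mean) / posterior_precision
    return posterior_mean, 1.0 / posterior_precision


for x in (1, 10, 100, 1_000_000):
    mean, variance = posterior_mean_and_variance(x)
    print(f"claimed effect {x:>9,}: posterior mean {mean:.6f}, variance {variance:.6f}")
```

The posterior mean matches X/(X^2+1): it peaks at X = 1 and then shrinks as the claim grows.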

For those who don’t speak Bayesian: If someone says he’s going to have an effect of magnitude X, you should be less likely to believe him the larger that X is. And indeed this is precisely what our intuition said before: If he says he’s going to kill one person, believe him. If he says he’s going to destroy a planet, don’t believe him, unless he provides some really extraordinary evidence.

What sort of extraordinary evidence? To his credit, Yudkowsky imagined the sort of evidence that might actually be convincing:

If a poorly-dressed street person offers to save 10^(10^100) lives (googolplex lives) for $5 using their Matrix Lord powers, and you claim to assign this scenario less than 10^-(10^100) probability, then apparently you should continue to believe absolutely that their offer is bogus even after they snap their fingers and cause a giant silhouette of themselves to appear in the sky.

This post he called “Pascal’s Muggle”, after the term from the Harry Potter series, since some of the solutions that had been proposed for dealing with Pascal’s Mugging had resulted in a situation almost as absurd, in which the mugger could exhibit powers beyond our imagining and yet nevertheless we’d never have sufficient evidence to believe him.

So, let me go on record as saying this: Yes, if someone snaps his fingers and causes the sky to rip open and reveal a silhouette of himself, I’ll do whatever that person says. The odds are still higher that I’m dreaming or hallucinating than that this is really a being from beyond our universe, but if I’m dreaming, it makes no difference, and if someone can make me hallucinate that vividly he can probably cajole the money out of me in other ways. And there might be just enough chance that this could be real that I’m willing to give up that five bucks.

These seem like really strange thought experiments, because they are. But like many good thought experiments, they can provide us with some important insights. In this case, I think they are telling us something about the way human reasoning can fail when faced with impacts beyond our normal experience: We are in danger of both over-estimating and under-estimating their effects, because our brains aren’t equipped to deal with magnitudes and probabilities on that scale. This has made me realize something rather important about both Singularitarianism and religion, but I’ll save that for next week’s post.

Why are humans so bad with probability?

Apr 29 JDN 2458238

In previous posts on deviations from expected utility and cumulative prospect theory, I’ve detailed some of the myriad ways in which human beings deviate from optimal rational behavior when it comes to probability.

This post is going to be a bit different: Yes, we behave irrationally when it comes to probability. Why?

Why aren’t we optimal expected utility maximizers?

This question is not as simple as it sounds. Some of the ways that human beings deviate from neoclassical behavior are simply because neoclassical theory requires levels of knowledge and intelligence far beyond what human beings are capable of; basically anything requiring “perfect information” qualifies, as does any game theory prediction that involves solving extensive-form games with infinite strategy spaces by backward induction. (Don’t feel bad if you have no idea what that means; that’s kind of my point. Solving infinite extensive-form games by backward induction is an unsolved problem in game theory; just this past week I saw a new paper presented that offered a partial potential solution, and yet we expect people to do it optimally every time?)

I’m also not going to include questions of fundamental uncertainty, like “Will Apple stock rise or fall tomorrow?” or “Will the US go to war with North Korea in the next ten years?” where it isn’t even clear how we would assign a probability. (Though I will get back to them, for reasons that will become clear.)

No, let’s just look at the absolute simplest cases, where the probabilities are all well-defined and completely transparent: Lotteries and casino games. Why are we so bad at that?

Lotteries are not a computationally complex problem. You figure out how much the prize is worth to you, multiply it by the probability of winning—which is clearly spelled out for you—and compare that to how much the ticket price is worth to you. The most challenging part lies in specifying your marginal utility of wealth—the “how much it’s worth to you” part—but that’s something you basically had to do anyway, to make any kind of trade-offs on how to spend your time and money. Maybe you didn’t need to compute it quite so precisely over that particular range of parameters, but you need at least some idea how much $1 versus $10,000 is worth to you in order to get by in a market economy.
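As a sketch of how little computation this takes, here is the comparison under a logarithmic utility of wealth. Every number below (wealth, ticket price, prize, odds) is invented for illustration, and log utility is just one plausible choice; the function name is hypothetical.

```python
import math


def expected_log_utility_of_ticket(wealth: float, ticket_price: float,
                                   prize: float, p_win: float) -> float:
    """Expected ln(wealth) after buying one ticket."""
    after_ticket = wealth - ticket_price
    return (p_win * math.log(after_ticket + prize)
            + (1 - p_win) * math.log(after_ticket))


# Hypothetical numbers, not taken from any real lottery.
wealth = 50_000.0
ticket_price = 2.0
prize = 100_000_000.0
p_win = 1 / 300_000_000

utility_if_you_buy = expected_log_utility_of_ticket(wealth, ticket_price, prize, p_win)
utility_if_you_skip = math.log(wealth)
print(f"buy:  {utility_if_you_buy:.8f}")
print(f"skip: {utility_if_you_skip:.8f}")
```

With these made-up numbers the ticket comes out slightly behind, and doing the whole calculation is one multiplication and one comparison.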

Casino games are a bit more complicated, but not much, and most of the work has been done for you; you can look on the Internet and find tables of probability calculations for poker, blackjack, roulette, craps and more. Memorizing all those probabilities might take some doing, but human memory is astonishingly capacious, and part of being an expert card player, especially in blackjack, seems to involve memorizing a lot of those probabilities.

Furthermore, by any plausible expected utility calculation, lotteries and casino games are a bad deal. Unless you’re an expert poker player or blackjack card-counter, your expected income from playing at a casino is always negative—and the casino set it up that way on purpose.

Why, then, can lotteries and casinos stay in business? Why are we so bad at such a simple problem?

Clearly we are using some sort of heuristic judgment in order to save computing power, and the people who make lotteries and casinos have designed formal models that can exploit those heuristics to pump money from us. (Shame on them, really; I don’t fully understand why this sort of thing is legal.)

In another previous post I proposed what I call “categorical prospect theory”, which I think is a decently accurate description of the heuristics people use when assessing probability (though I’ve not yet had the chance to test it experimentally).

But why use this particular heuristic? Indeed, why use a heuristic at all for such a simple problem?

I think it’s helpful to keep in mind that these simple problems are weird; they are absolutely not the sort of thing a tribe of hunter-gatherers is likely to encounter on the savannah. It doesn’t make sense for our brains to be optimized to solve poker or roulette.

The sort of problems that our ancestors encountered—indeed, the sort of problems that we encounter, most of the time—were not problems of calculable probability risk; they were problems of fundamental uncertainty. And they were frequently matters of life or death (which is why we’d expect them to be highly evolutionarily optimized): “Was that sound a lion, or just the wind?” “Is this mushroom safe to eat?” “Is that meat spoiled?”

In fact, many of the uncertainties most important to our ancestors are still important today: “Will these new strangers be friendly, or dangerous?” “Is that person attracted to me, or am I just projecting my own feelings?” “Can I trust you to keep your promise?” These sorts of social uncertainties are even deeper; it’s not clear that any finite being could ever totally resolve its uncertainty surrounding the behavior of other beings with the same level of intelligence, as the cognitive arms race continues indefinitely. The better I understand you, the better you understand me—and if you’re trying to deceive me, as I get better at detecting deception, you’ll get better at deceiving.

Personally, I think that it was precisely this sort of feedback loop that resulted in human beings getting such ridiculously huge brains in the first place. Chimpanzees are pretty good at dealing with the natural environment, maybe even better than we are; but even young children can outsmart them in social tasks any day. And once you start evolving for social cognition, it’s very hard to stop; basically you need to be constrained by something very fundamental, like, say, maximum caloric intake or the shape of the birth canal. Where chimpanzees look like their brains were what we call an “interior solution”, where evolution optimized toward a particular balance between cost and benefit, human brains look more like a “corner solution”, where the evolutionary pressure was entirely in one direction until we hit up against a hard constraint. That’s exactly what one would expect to happen if we were caught in a cognitive arms race.

What sort of heuristic makes sense for dealing with fundamental uncertainty—as opposed to precisely calculable probability? Well, you don’t want to compute a utility function and multiply it by a probability, because that adds all sorts of extra computation and you have no idea what probability to assign. But you’ve got to do something like that in some sense, because that really is the optimal way to respond.

So here’s a heuristic you might try: Separate events into some broad categories based on how frequently they seem to occur, and what sort of response would be necessary.

Some things, like the sun rising each morning, seem to always happen. So you should act as if those things are going to happen pretty much always, because they do happen… pretty much always.

Other things, like rain, seem to happen frequently but not always. So you should look for signs that those things might happen, and prepare for them when the signs point in that direction.

Still other things, like being attacked by lions, happen very rarely, but are a really big deal when they do. You can’t go around expecting those to happen all the time, that would be crazy; but you need to be vigilant, and if you see any sign that they might be happening, even if you’re pretty sure they’re not, you may need to respond as if they were actually happening, just in case. The cost of a false positive is much lower than the cost of a false negative.

And still other things, like people sprouting wings and flying, never seem to happen. So you should act as if those things are never going to happen, and you don’t have to worry about them.

This heuristic is quite simple to apply once set up: It can simply slot in memories of when things did and didn’t happen in order to decide which category they go in—i.e. the availability heuristic. If you can remember a lot of examples of “almost never”, maybe you should move it to “unlikely” instead. If you get a really big number of examples, you might even want to move it all the way to “likely”.

Another large advantage of this heuristic is that by combining utility and probability into one metric—we might call it “importance”, though Bayesian econometricians might complain about that—we can save on memory space and computing power. I don’t need to separately compute a utility and a probability; I just need to figure out how much effort I should put into dealing with this situation. A high probability of a small cost and a low probability of a large cost may be equally worth my time.

How might these heuristics go wrong? Well, if your environment changes sufficiently, the probabilities could shift and what seemed certain no longer is. For most of human history, “people walking on the Moon” would seem about as plausible as sprouting wings and flying away, and yet it has happened. Being attacked by lions is now exceedingly rare except in very specific places, but we still harbor a certain awe and fear before lions. And of course the availability heuristic can be greatly distorted by mass media, which makes people feel like terrorist attacks and nuclear meltdowns are common and deaths by car accidents and influenza are rare—when exactly the opposite is true.

How many categories should you set, and what frequencies should they be associated with? This part I’m still struggling with, and it’s an important piece of the puzzle I will need before I can take this theory to experiment. There is probably a trade-off between more categories giving you more precision in tailoring your optimal behavior, but costing more cognitive resources to maintain. Is the optimal number 3? 4? 7? 10? I really don’t know. Even if I could specify the number of categories, I’d still need to figure out precisely what categories to assign.

Experimentally testing categorical prospect theory

Dec 4, JDN 2457727

In last week’s post I presented a new theory of probability judgments, which doesn’t rely upon people performing complicated math even subconsciously. Instead, I hypothesize that people try to assign categories to their subjective probabilities, and throw away all the information that wasn’t used to assign that category.

The way to most clearly distinguish this from cumulative prospect theory is to show discontinuity. Kahneman’s smooth, continuous function places fairly strong bounds on just how much a shift from 0% to 0.000001% can really affect your behavior. In particular, if you want to explain the fact that people do seem to behave differently around 10% compared to 1% probabilities, you can’t allow the slope of the smooth function to get much higher than 10 at any point, even near 0 and 1. (It does depend on the precise form of the function, but the more complicated you make it, the more free parameters you add to the model. In the most parsimonious form, which is a cubic polynomial, the maximum slope is actually much smaller than this—only 2.)

If that’s the case, then switching from 0% to 0.0001% should have no more effect in reality than a switch from 0% to 0.00001% would to a rational expected utility optimizer. But in fact I think I can set up scenarios where it would have a larger effect than a switch from 0.001% to 0.01%.

Indeed, these games are already quite profitable for the majority of US states, and they are called lotteries.

Rationally, it should make very little difference to you whether your odds of winning the Powerball are 0 (you bought no ticket) or 0.0000001% (you bought a ticket), even when the prize is $100 million. This is because your utility of $100 million is nowhere near 100 million times as large as your marginal utility of $1. A good guess would be that your lifetime income is about $2 million, your utility is logarithmic, the units of utility are hectoQALY, and the baseline level is about 100,000.

I apologize for the extremely large number of decimals, but I had to do that in order to show any difference at all. I have bolded where the decimals first deviate from the baseline.

Your utility if you don’t have a ticket is ln(20) = 2.9957322736 hQALY.

Your utility if you have a ticket is (1-10^-9) ln(20) + 10^-9 ln(1020) = 2.9957322775 hQALY.

You gain a whopping 0.4 microQALY over your whole lifetime. I highly doubt you could even perceive such a difference.

And yet, people are willing to pay nontrivial sums for the chance to play such lotteries. Powerball tickets sell for about $2 each, and some people buy tickets every week. If you do that and live to be 80, you will spend some $8,000 on lottery tickets during your lifetime, which results in this expected utility: (1-4*10^-6) ln(20-0.08) + 4*10^-6 ln(1020) = 2.9917399955 hQALY.

You have now sacrificed 0.004 hectoQALY, which is to say 0.4 QALY—that’s months of happiness you’ve given up to play this stupid pointless game.

Which shouldn’t be surprising, as (with 99.9996% probability) you have given up four months of your lifetime income with nothing to show for it. Lifetime income of $2 million / lifespan of 80 years = $25,000 per year; $8,000 / $25,000 = 0.32. You’ve actually sacrificed slightly more than this, which comes from your risk aversion.
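For anyone who wants to check the arithmetic, here is a minimal Python sketch of the calculation. It assumes the same setup as above: logarithmic utility, a $2 million lifetime income, a $100,000 baseline, a $100 million prize, and a win probability of 10^-9 per ticket, with about 4,000 tickets bought over a lifetime. The function and variable names are mine, chosen only for illustration.

```python
import math

BASELINE = 100_000            # baseline income level in dollars (from the text)
LIFETIME_INCOME = 2_000_000   # assumed lifetime income in dollars
PRIZE = 100_000_000           # jackpot in dollars
P_WIN = 1e-9                  # per-ticket win probability used in the text


def utility(wealth):
    """Logarithmic utility in hectoQALY, measured relative to the baseline."""
    return math.log(wealth / BASELINE)


def expected_utility(p_win, cost):
    """Expected utility of spending `cost` on tickets with total win chance `p_win`.

    As in the text, the ticket cost is only subtracted in the losing branch.
    """
    lose = utility(LIFETIME_INCOME - cost)
    win = utility(LIFETIME_INCOME + PRIZE)
    return (1 - p_win) * lose + p_win * win


no_ticket = utility(LIFETIME_INCOME)
one_ticket = expected_utility(P_WIN, cost=0)          # the $2 price is ignored here
weekly = expected_utility(4000 * P_WIN, cost=8000)    # roughly 4,000 tickets at $2

print(f"no ticket:      {no_ticket:.10f} hQALY")
print(f"one ticket:     {one_ticket:.10f} hQALY")
print(f"weekly tickets: {weekly:.10f} hQALY")
print(f"lifetime loss:  {(no_ticket - weekly) * 100:.2f} QALY")
```

Multiplying by 100 in the last line converts hectoQALY to QALY. Changing the baseline or the utility function will move the numbers, but under any strongly concave utility the single ticket remains a rounding error and the weekly habit remains a real loss.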

Why would anyone do such a thing? Because while the difference between 0 and 10^-9 may be trivial, the difference between “impossible” and “almost impossible” feels enormous. “You can’t win if you don’t play!” they say, but they might as well say “You can’t win if you do play either.” Indeed, the probability of winning without playing isn’t zero; you could find a winning ticket lying on the ground, or win due to an error that is then upheld in court, or be given the winnings bequeathed by a dying family member or gifted by an anonymous donor. These are of course vanishingly unlikely—but so was winning in the first place. You’re talking about the difference between 10^-9 and 10^-12, which in proportional terms sounds like a lot—but in absolute terms is nothing. If you drive to a drug store every week to buy a ticket, you are more likely to die in a car accident on the way to the drug store than you are to win the lottery.

Of course, these are not experimental conditions. So I need to devise a similar game, with smaller stakes but still large enough for people’s brains to care about the “almost impossible” category; maybe thousands? It’s not uncommon for an economics experiment to cost thousands, it’s just usually paid out to many people instead of randomly to one person or nobody. Conducting the experiment in an underdeveloped country like India would also effectively amplify the amounts paid, but at the fixed cost of transporting the research team to India.

But I think in general terms the experiment could look something like this. You are given $20 for participating in the experiment (we treat it as already given to you, to maximize your loss aversion and endowment effect and thereby give us more bang for our buck). You then have a chance to play a game, where you pay $X to get a P probability of $Y*X, and we vary these numbers.

The actual participants wouldn’t see the variables, just the numbers and possibly the rules: “You can pay $2 for a 1% chance of winning $200. You can also play multiple times if you wish.” “You can pay $10 for a 5% chance of winning $250. You can only play once or not at all.”

So I think the first step is to find some dilemmas, cases where people feel ambivalent, and different people differ in their choices. That’s a good role for a pilot study.

Then we take these dilemmas and start varying their probabilities slightly.

In particular, we try to vary them at the edge of where people have mental categories. If subjective probability is continuous, a slight change in actual probability should never result in a large change in behavior, and furthermore the effect of a change shouldn’t vary too much depending on where the change starts.

But if subjective probability is categorical, these categories should have edges. Then, when I present you with two dilemmas that are on opposite sides of one of the edges, your behavior should radically shift; while if I change it in a different way, I can make a large change without changing the result.

Based solely on my own intuition, I guessed that the categories roughly follow this pattern:

Impossible: 0%

Almost impossible: 0.1%

Very unlikely: 1%

Unlikely: 10%

Fairly unlikely: 20%

Roughly even odds: 50%

Fairly likely: 80%

Likely: 90%

Very likely: 99%

Almost certain: 99.9%

Certain: 100%

So for example, if I switch from 0% to 0.01%, it should have a very large effect, because I’ve moved you out of your “impossible” category (indeed, I think the “impossible” category is almost completely sharp; literally anything above zero seems to be enough for most people, even 10^-9 or 10^-10). But if I move from 1% to 2%, it should have a small effect, because I’m still well within the “very unlikely” category. Yet the latter change is literally one hundred times larger than the former. It is possible to define continuous functions that would behave this way to an arbitrary level of approximation—but they get a lot less parsimonious very fast.
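To make the idea concrete, here is a rough Python sketch of what such a category assignment might look like. The anchor values are the ones I guessed above. The assignment rule is purely an assumption for illustration: exactly 0 and exactly 1 get their own sharp categories, and everything in between goes to whichever anchor is nearest in log-odds. I have no evidence yet that this is the rule people actually use.

```python
import math

# Category anchors guessed in the text (as probabilities, not percentages).
CATEGORIES = [
    ("impossible", 0.0),
    ("almost impossible", 0.001),
    ("very unlikely", 0.01),
    ("unlikely", 0.10),
    ("fairly unlikely", 0.20),
    ("roughly even odds", 0.50),
    ("fairly likely", 0.80),
    ("likely", 0.90),
    ("very likely", 0.99),
    ("almost certain", 0.999),
    ("certain", 1.0),
]


def log_odds(p):
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1 - p))


def categorize(p):
    """Assign a probability to a category.

    Assumption (hypothetical rule): 0 and 1 are sharp categories of their own;
    any other probability goes to the interior anchor nearest in log-odds.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if p == 0.0:
        return "impossible"
    if p == 1.0:
        return "certain"
    interior = CATEGORIES[1:-1]
    name, _ = min(interior, key=lambda c: abs(log_odds(c[1]) - log_odds(p)))
    return name


for before, after in [(0.0, 0.0001), (0.01, 0.02), (0.4, 0.6)]:
    print(f"{before:.4%} -> {after:.4%}: {categorize(before)} -> {categorize(after)}")
```

Under this rule the tiny shift from 0% to 0.01% changes the category, while the much larger shifts from 1% to 2% and from 40% to 60% do not. A different distance measure or different anchors would move the edges, which is exactly what the experiment would need to pin down.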

Now, immediately I run into a problem, because I’m not even sure those are my categories, much less that they are everyone else’s. If I knew precisely which categories to look for, I could tell whether or not I had found it. But the process of both finding the categories and determining if their edges are truly sharp is much more complicated, and requires a lot more statistical degrees of freedom to get beyond the noise.

One thing I’m considering is assigning these values as a prior, and then conducting a series of experiments which would adjust that prior. In effect I would be using optimal Bayesian probability reasoning to show that human beings do not use optimal Bayesian probability reasoning. Still, I think that actually pinning down the categories would require a large number of participants or a long series of experiments (in frequentist statistics this distinction is vital; in Bayesian statistics it is basically irrelevant—one of the simplest reasons to be Bayesian is that it no longer bothers you whether someone did 2 experiments of 100 people or 1 experiment of 200 people, provided they were the same experiment of course). And of course there’s always the possibility that my theory is totally off-base, and I find nothing; a dissertation replicating cumulative prospect theory is a lot less exciting (and, sadly, less publishable) than one refuting it.

Still, I think something like this is worth exploring. I highly doubt that people are doing very much math when they make most probabilistic judgments, and using categories would provide a very good way for people to make judgments usefully with no math at all.

How do people think about probability?

Nov 27, JDN 2457690

(This topic was chosen by vote of my Patreons.)

In neoclassical theory, it is assumed (explicitly or implicitly) that human beings judge probability in something like the optimal Bayesian way: We assign prior probabilities to events, and then when confronted with evidence we infer using the observed data to update our prior probabilities to posterior probabilities. Then, when we have to make decisions, we maximize our expected utility subject to our posterior probabilities.

This, of course, is nothing like how human beings actually think. Even very intelligent, rational, numerate people only engage in a vague approximation of this behavior, and only when dealing with major decisions likely to affect the course of their lives. (Yes, I literally decide which universities to attend based upon formal expected utility models. Thus far, I’ve never been dissatisfied with a decision made that way.) No one decides what to eat for lunch or what to do this weekend based on formal expected utility models—or at least I hope they don’t, because at that point the computational cost far exceeds the expected benefit.

So how do human beings actually think about probability? Well, a good place to start is to look at ways in which we systematically deviate from expected utility theory.

A classic example is the Allais paradox. See if it applies to you.

In game A, you get $1 million, guaranteed.
In game B, you have a 10% chance of getting $5 million, an 89% chance of getting $1 million, but now you have a 1% chance of getting nothing.

Which do you prefer, game A or game B?

In game C, you have an 11% chance of getting $1 million, and an 89% chance of getting nothing.

In game D, you have a 10% chance of getting $5 million, and a 90% chance of getting nothing.

Which do you prefer, game C or game D?

I have to think about it for a little bit and do some calculations, and it’s still very hard because it depends crucially on my projected lifetime income (which could easily exceed $3 million with a PhD, especially in economics) and the precise form of my marginal utility (I think I have constant relative risk aversion, but I’m not sure what parameter to use precisely), but in general I think I want to choose game A and game C, but I actually feel really ambivalent, because it’s not hard to find plausible parameters for my utility where I should go for the gamble.

But if you’re like most people, you choose game A and game D.

There is no coherent expected utility by which you would do this.

Why? Either a 10% chance of $5 million instead of $1 million is worth risking a 1% chance of nothing, or it isn’t. If it is, you should play B and D. If it’s not, you should play A and C. I can’t tell you for sure whether it is worth it—I can’t even fully decide for myself—but it either is or it isn’t.
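You can verify this numerically. The following Python sketch computes the expected-utility gap between A and B and between C and D for a handful of utility functions. The utility functions and wealth levels are arbitrary choices of mine, just to illustrate the point; the games are exactly the ones above.

```python
import math

# Each game is a list of (probability, payoff in dollars) pairs, as in the text.
GAME_A = [(1.00, 1_000_000)]
GAME_B = [(0.10, 5_000_000), (0.89, 1_000_000), (0.01, 0)]
GAME_C = [(0.11, 1_000_000), (0.89, 0)]
GAME_D = [(0.10, 5_000_000), (0.90, 0)]

# Hypothetical utility functions of the payoff x; wealth levels are arbitrary.
UTILITIES = {
    "linear": lambda x: float(x),
    "square root": lambda x: math.sqrt(x),
    "log, $3M wealth": lambda x: math.log(3_000_000 + x),
    "CRRA (eta = 2), $50k wealth": lambda x: -1.0 / (50_000 + x),
}


def expected_utility(game, u):
    """Probability-weighted average of utility over a game's outcomes."""
    return sum(p * u(x) for p, x in game)


def preference_gaps(u):
    """Return (EU(A) - EU(B), EU(C) - EU(D)) for utility function u."""
    a_minus_b = expected_utility(GAME_A, u) - expected_utility(GAME_B, u)
    c_minus_d = expected_utility(GAME_C, u) - expected_utility(GAME_D, u)
    return a_minus_b, c_minus_d


for name, u in UTILITIES.items():
    a_minus_b, c_minus_d = preference_gaps(u)
    # Both gaps reduce to 0.11*u($1M) - 0.10*u($5M) - 0.01*u($0).
    choice = "A and C" if a_minus_b > 0 else "B and D"
    print(f"{name}: A-B = {a_minus_b:.3g}, C-D = {c_minus_d:.3g}, so {choice}")
```

The two gaps come out identical for every utility function, because both reduce to the same expression. So whatever your utility function is, it recommends either A and C or B and D, and never A and D.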

Yet most people have a strong intuition that they should take game A but game D. Why? What does this say about how we judge probability?

The leading theory in behavioral economics right now is cumulative prospect theory, developed by the great Kahneman and Tversky, who essentially founded the field of behavioral economics. It’s quite intimidating to try to go up against them—which is probably why we should force ourselves to do it. Fear of challenging the favorite theories of the great scientists before us is how science stagnates.

I wrote about it more in a previous post, but as a brief review, cumulative prospect theory says that instead of judging based on a well-defined utility function, we instead consider gains and losses as fundamentally different sorts of thing, and in three specific ways:

First, we are loss-averse; we feel a loss about twice as intensely as a gain of the same amount.

Second, we are risk-averse for gains, but risk-seeking for losses; we assume that gaining twice as much isn’t actually twice as good (which is almost certainly true), but we also assume that losing twice as much isn’t actually twice as bad (which is almost certainly false and indeed contradictory with the previous).

Third, we judge probabilities as more important when they are close to certainty. We make a large distinction between a 0% probability and a 0.0000001% probability, but almost no distinction at all between a 41% probability and a 43% probability.

That last part is what I want to focus on for today. In Kahneman’s model, this is a continuous, monotonic function that maps 0 to 0 and 1 to 1, but systematically overestimates probabilities below but near 1/2 and systematically underestimates probabilities above but near 1/2.

It looks something like this, where red is true probability and blue is subjective probability:

[Figure: cumulative_prospect (true probability in red, subjective probability in blue)]

I don’t believe this is actually how humans think, for two reasons:

  1. It’s too hard. Humans are astonishingly innumerate creatures, given the enormous processing power of our brains. It’s true that we have some intuitive capacity for “solving” very complex equations, but that’s almost all within our motor system—we can “solve a differential equation” when we catch a ball, but we have no idea how we’re doing it. But probability judgments are often made consciously, especially in experiments like the Allais paradox; and the conscious brain is terrible at math. It’s actually really amazing how bad we are at math. Any model of normal human judgment should assume from the start that we will not do complicated math at any point in the process. Maybe you can hypothesize that we do so subconsciously, but you’d better have a good reason for assuming that.
  2. There is no reason to do this. Why in the world would any kind of optimization system function this way? You start with perfectly good probabilities, and then instead of using them, you subject them to some bizarre, unmotivated transformation that makes them less accurate and costs computing power? You may as well hit yourself in the head with a brick.

So, why might it look like we are doing this? Well, my proposal, admittedly still rather half-baked, is that human beings don’t assign probabilities numerically at all; we assign them categorically.

You may call this, for lack of a better term, categorical prospect theory.

My theory is that people don’t actually have in their head “there is an 11% chance of rain today” (unless they specifically heard that from a weather report this morning); they have in their head “it’s fairly unlikely that it will rain today”.

That is, we assign some small number of discrete categories of probability, and fit things into them. I’m not sure what exactly the categories are, and part of what makes my job difficult here is that they may be fuzzy-edged and vary from person to person, but roughly speaking, I think they correspond to the sort of things psychologists usually put on Likert scales in surveys: Impossible, almost impossible, very unlikely, unlikely, fairly unlikely, roughly even odds, fairly likely, likely, very likely, almost certain, certain. If I’m putting numbers on these probability categories, they go something like this: 0, 0.001, 0.01, 0.10, 0.20, 0.50, 0.8, 0.9, 0.99, 0.999, 1.

Notice that this would preserve the same basic effect as cumulative prospect theory: You care a lot more about differences in probability when they are near 0 or 1, because those are much more likely to actually shift your category. Indeed, as written, you wouldn’t care about a shift from 0.4 to 0.6 at all, despite caring a great deal about a shift from 0.001 to 0.01.

How does this solve the above problems?

  1. It’s easy. Not only don’t you compute a probability and then recompute it for no reason; you never even have to compute it precisely. Just get it within some vague error bounds and that will tell you what box it goes in. Instead of computing an approximation to a continuous function, you just slot things into a small number of discrete boxes, a dozen at the most.
  2. That explains why we would do it: It’s easy. Our brains need to conserve their capacity, and they did especially in our ancestral environment when we struggled to survive. Rather than having to iterate your approximation to arbitrary precision, you just get within 0.1 or so and call it a day. That saves time and computing power, which saves energy, which could save your life.

What new problems have I introduced?

  1. It’s very hard to know exactly where people’s categories are, if they vary between individuals or even between situations, and whether they are fuzzy-edged.
  2. If you take the model I just gave literally, even quite large probability changes will have absolutely no effect as long as they remain within a category such as “roughly even odds”.

With regard to 2, I think Kahneman may himself be able to save me, with his dual process theory concept of System 1 and System 2. What I’m really asserting is that System 1, the fast, intuitive judgment system, operates on these categories. System 2, on the other hand, the careful, rational thought system, can actually make use of proper numerical probabilities; it’s just very costly to boot up System 2 in the first place, much less ensure that it actually gets the right answer.

How might we test this? Well, I think that people are more likely to use System 1 when any of the following are true:

  1. They are under harsh time-pressure
  2. The decision isn’t very important
  3. The intuitive judgment is fast and obvious

And conversely they are likely to use System 2 when the following are true:

  1. They have plenty of time to think
  2. The decision is very important
  3. The intuitive judgment is difficult or unclear

So, it should be possible to arrange an experiment varying these parameters, such that in one treatment people almost always use System 1, and in another they almost always use System 2. And then, my prediction is that in the System 1 treatment, people will in fact not change their behavior at all when you change the probability from 15% to 25% (fairly unlikely) or 40% to 60% (roughly even odds).

To be clear, you can’t just present people with this choice between game E and game F:

Game E: You get a 60% chance of $50, and a 40% chance of nothing.

Game F: You get a 40% chance of $50, and a 60% chance of nothing.

People will obviously choose game E. If you can directly compare the numbers and one game is strictly better in every way, I think even without much effort people will be able to choose correctly.

Instead, what I’m saying is that if you make the following offers to two completely different sets of people, you will observe little difference in their choices, even though under expected utility theory you should.

Group I receives a choice between game E and game G:

Game E: You get a 60% chance of $50, and a 40% chance of nothing.

Game G: You get a 100% chance of $20.

Group II receives a choice between game F and game G:

Game F: You get a 40% chance of $50, and a 60% chance of nothing.

Game G: You get a 100% chance of $20.

Under two very plausible assumptions about marginal utility of wealth, I can fix what the rational judgment should be in each game.

The first assumption is that marginal utility of wealth is decreasing, so people are risk-averse (at least for gains, which these are). The second assumption is that most people’s lifetime income is at least two orders of magnitude higher than $50.

By the first assumption, group II should choose game G. The expected income is precisely the same, and being even ever so slightly risk-averse should make you go for the guaranteed $20.

By the second assumption, group I should choose game E. Yes, there is some risk, but because $50 should not be a huge sum to you, your risk aversion should be small and the higher expected income of $30 should sway you.

But I predict that most people will choose game G in both cases, and (within statistical error) the same proportion will choose F as chose E—thus showing that the difference between a 40% chance and a 60% chance was in fact negligible to their intuitive judgments.
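Here is a small Python sketch of what the rational benchmark looks like under those two assumptions. I use logarithmic utility as a stand-in for “decreasing marginal utility”, and the wealth levels are hypothetical values I picked to span a plausible range; neither is essential to the argument.

```python
import math

# Games as (probability, payoff in dollars) pairs, taken from the text.
GAME_E = [(0.60, 50), (0.40, 0)]
GAME_F = [(0.40, 50), (0.60, 0)]
GAME_G = [(1.00, 20)]


def expected_income(game):
    """Expected dollar payoff of a game."""
    return sum(p * x for p, x in game)


def expected_utility_gain(game, wealth):
    """Expected gain in log utility for someone whose wealth is `wealth`.

    log1p(x / wealth) equals log(wealth + x) - log(wealth), but stays accurate
    when the payoff is tiny relative to wealth.
    """
    return sum(p * math.log1p(x / wealth) for p, x in game)


for wealth in (5_000, 100_000, 2_000_000):  # hypothetical wealth levels
    e, f, g = (expected_utility_gain(game, wealth)
               for game in (GAME_E, GAME_F, GAME_G))
    print(f"wealth ${wealth:,}: group I should pick {'E' if e > g else 'G'}, "
          f"group II should pick {'F' if f > g else 'G'}")
```

At every wealth level tried, the rational benchmark is E for group I and G for group II. If the observed proportions choosing the gamble are about the same in both groups, that would be the sign that 40% and 60% landed in the same mental category.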

However, this doesn’t actually disprove Kahneman’s theory; perhaps that part of the subjective probability function is just that flat. For that, I need to set up an experiment where I show discontinuity. I need to find the edge of a category and get people to switch categories sharply. Next week I’ll talk about how we might pull that off.

What do we mean by “risk”?

JDN 2457118 EDT 20:50.

In an earlier post I talked about how, empirically, expected utility theory can’t explain the fact that we buy both insurance and lottery tickets, and how, normatively, it really doesn’t make a lot of sense to buy lottery tickets precisely because of what expected utility theory says about them.

But today I’d like to talk about one of the major problems with expected utility theory, which I consider one of the major unexplored frontiers of economics: Expected utility theory treats all kinds of risk exactly the same.

In reality there are three kinds of risk: The first is what I’ll call classical risk, which is like the game of roulette; the odds are well-defined and known in advance, and you can play the game a large number of times and average out the results. This is where expected utility theory really shines; if you are dealing with classical risk, expected utility is obviously the way to go and Von Neumann and Morgenstern quite literally proved mathematically that anything else is irrational.

The second is uncertainty, a distinction which was most famously expounded by Frank Knight, an economist at the University of Chicago. (Chicago is a funny place; on the one hand they are a haven for the madness that is Austrian economics; on the other hand they have led the charge in behavioral and cognitive economics. Knight was a perfect fit, because he was a little of both.) Uncertainty is risk under ill-defined or unknown probabilities, where there is no way to play the game twice. Most real-world “risk” is actually uncertainty: Will the People’s Republic of China collapse in the 21st century? How many deaths will global warming cause? Will human beings ever colonize Mars? Is P = NP? None of those questions have known answers, but nor can we clearly assign probabilities either; either P = NP or not, as a mathematical theorem (or, like the continuum hypothesis, it’s independent of ZFC, the most bizarre possibility of all), and it’s not as if someone is rolling dice to decide how many people global warming will kill. You can think of this in terms of “possible worlds”, though actually most modal theorists would tell you that we can’t even say that P=NP is possible (nor can we say it isn’t possible!) because, as a necessary statement, it can only be possible if it is actually true; this follows from the S5 axiom of modal logic, and you know what, even I am already bored with that sentence. Clearly there is some sense in which P=NP is possible, and if that’s not what modal logic says then so much the worse for modal logic. I am not a modal realist (not to be confused with a moral realist, which I am); I don’t think that possible worlds are real things out there somewhere. I think possibility is ultimately a statement about ignorance, and since we don’t know that P=NP is false then I contend that it is possible that it is true.

Put another way, it would not be obviously irrational to place a bet that P=NP will be proved true by 2100; but if we can’t even say that it is possible, how can that be?

Anyway, that’s the mess that uncertainty puts us in, and almost everything is made of uncertainty. Expected utility theory basically falls apart under uncertainty; it doesn’t even know how to give an answer, let alone one that is correct. In reality what we usually end up doing is waving our hands and trying to assign a probability anyway—because we simply don’t know what else to do.

The third one is not one that’s usually talked about, yet I think it’s quite important; I will call it one-shot risk. The probabilities are known or at least reasonably well approximated, but you only get to play the game once. You can also generalize to few-shot risk, where you can play a small number of times, where “small” is defined relative to the probabilities involved; this is a little vaguer, but basically what I have in mind is that even though you can play more than once, you can’t play enough times to realistically expect the rarest outcomes to occur. Expected utility theory almost works on one-shot and few-shot risk, but you have to be very careful about taking it literally.

I think an example will make things clearer: Playing the lottery is a few-shot risk. You can play the lottery multiple times, yes; potentially hundreds of times in fact. But hundreds of times is nothing compared to the 1 in 400 million chance you have of actually winning. You know that probability; it can be computed exactly from the rules of the game. But nonetheless expected utility theory runs into some serious problems here.

If we were playing a classical risk game, expected utility would obviously be right. So for example, suppose you know that you will live one billion years, and you are offered the chance to play a game (somehow compensating for the mind-boggling levels of inflation, economic growth, transhuman transcendence, and/or total extinction that will occur during that vast expanse of time) in which each year you can either have a guaranteed $40,000 of inflation-adjusted income or a 99.999,999,75% chance of $39,999 of inflation-adjusted income and a 0.000,000,25% chance of $100 million in inflation-adjusted income—which will disappear at the end of the year, along with everything you bought with it, so that each year you start afresh. Should you take the second option? Absolutely not, and expected utility theory explains why; that one or two years where you’ll experience 8 QALY per year isn’t worth dropping from 4.602060 QALY per year to 4.602049 QALY per year for the other nine hundred and ninety-eight million years. (Can you even fathom how long that is? From here, one billion years is all the way back to the Mesoproterozoic Era, which we think is when single-celled organisms first began to reproduce sexually. The gain is to be Mitt Romney for a year or two; the loss is the value of a dollar each year over and over again for the entire time that has elapsed since the existence of gamete meiosis.) I think it goes without saying that this whole situation is almost unimaginably bizarre. Yet that is implicitly what we’re assuming when we use expected utility theory to assess whether you should buy lottery tickets.
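The comparison in the billion-year scenario can be sketched in a few lines of Python. The QALY figures quoted are consistent with utility being the base-10 logarithm of annual income, so that is what this sketch assumes; the function name is my own.

```python
import math


def qaly_per_year(income):
    """Assumed utility: log base 10 of annual income in dollars."""
    return math.log10(income)


P_JACKPOT = 2.5e-9        # 0.000,000,25% expressed as a probability
YEARS = 1_000_000_000     # the billion-year lifespan from the thought experiment

safe = qaly_per_year(40_000)
gamble = ((1 - P_JACKPOT) * qaly_per_year(39_999)
          + P_JACKPOT * qaly_per_year(100_000_000))

print(f"guaranteed income: {safe:.6f} QALY/year")
print(f"gamble:            {gamble:.6f} QALY/year")
print(f"expected jackpot years: {YEARS * P_JACKPOT:.1f}")
print(f"expected cost of gambling over the whole span: {(safe - gamble) * YEARS:,.0f} QALY")
```

The per-year difference is tiny, but summed over a billion years it dwarfs the few extra QALY from the handful of jackpot years. That summation is only meaningful because in this scenario you really do live through all of those years.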

The real situation is more like this: There’s one world you can end up in, and almost certainly will, in which you buy lottery tickets every year and end up with an income of $39,999 instead of $40,000. There is another world, so unlikely as to be barely worth considering, yet not totally impossible, in which you get $100 million and you are completely set for life and able to live however you want for the rest of your life. Averaging over those two worlds is a really weird thing to do; what do we even mean by doing that? You don’t experience one world 0.000,000,25% as much as the other (whereas in the billion-year scenario, that is exactly what you do); you only experience one world or the other.

In fact, it’s worse than this, because if a classical risk game is such that you can play it as many times as you want as quickly as you want, we don’t even need expected utility theory; expected money theory will do. If you can play a game where you have a 50% chance of winning $200,000 and a 50% chance of losing $50,000, which you can play up to once an hour for the next 48 hours, and you will be extended any credit necessary to cover any losses, you’d be insane not to play; your 99.9% confidence interval for wealth at the end of the two days is from $850,000 to $6,350,000. While you may lose money for a while, it is vanishingly unlikely that you will end up losing more than you gain.
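The 48-play game can be checked with an exact binomial calculation. This is only a sketch: it assumes you play all 48 rounds, that each round is an independent fair coin flip, and that the "99.9% range" means the central range with at most 0.05% probability in each tail. The function names are mine.

```python
from math import comb

PLAYS = 48
WIN, LOSS = 200_000, -50_000

def wealth(wins):
    # Net winnings after PLAYS rounds, given the number of winning rounds.
    return wins * WIN + (PLAYS - wins) * LOSS

def prob(wins):
    # Exact binomial probability of that many wins with fair coin flips.
    return comb(PLAYS, wins) / 2 ** PLAYS

def central_range(coverage):
    # Smallest win count whose cumulative probability exceeds the tail mass.
    tail = (1 - coverage) / 2
    cumulative = 0.0
    low = 0
    for wins in range(PLAYS + 1):
        cumulative += prob(wins)
        if cumulative > tail:
            low = wins
            break
    high = PLAYS - low  # the distribution is symmetric for a fair coin
    return wealth(low), wealth(high)

expected = sum(prob(k) * wealth(k) for k in range(PLAYS + 1))
p_net_loss = sum(prob(k) for k in range(PLAYS + 1) if wealth(k) < 0)
low_end, high_end = central_range(0.999)

print(f"expected winnings: ${expected:,.0f}")
print(f"99.9% central range: ${low_end:,} to ${high_end:,}")
print(f"chance of ending with a net loss: {p_net_loss:.2e}")
```

Ending with a net loss requires winning 9 or fewer of the 48 rounds, which is why the chance of that is so small.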

Yet if you are offered the chance to play this game only once, you probably should not take it, and the reason why then comes back to expected utility. If you have good access to credit you might consider it, because going $50,000 into debt is bad but not unbearably so (I did, going to college) and gaining $200,000 might actually be enough of an improvement to justify the risk. Then the effect can be averaged over your lifetime; let’s say you make $50,000 per year over 40 years. Losing $50,000 costs you $1,250 per year, making your average income $48,750, while gaining $200,000 adds $5,000 per year, making your average income $55,000; so your QALY per year go from a guaranteed 4.70 to a 50% chance of 4.69 and a 50% chance of 4.74; that raises your expected utility from 4.70 to 4.715.

But if you don’t have good access to credit and your income for this year is $50,000, then losing $50,000 means losing everything you have and living in poverty or even starving to death. The benefits of raising your income by $200,000 this year aren’t nearly great enough to take that chance. Your utility goes from a guaranteed 4.70 to a 50% chance of 5.40 and a 50% chance of zero, for an expected utility of about 2.70.
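The two one-shot cases can be compared side by side. As before, this sketch assumes QALY per year is the base-10 logarithm of annual income, and it treats losing everything as a utility of zero, as in the example; the function names are hypothetical.

```python
import math

def utility(income):
    # Assumption: QALY per year = log10(annual income); ruin counts as zero.
    return math.log10(income) if income > 0 else 0.0

INCOME = 50_000
WIN, LOSS = 200_000, -50_000
WORKING_YEARS = 40

def play_once_with_credit():
    # With credit, the gain or loss is spread evenly over 40 years of income.
    win_income = INCOME + WIN / WORKING_YEARS
    loss_income = INCOME + LOSS / WORKING_YEARS
    return 0.5 * utility(win_income) + 0.5 * utility(loss_income)

def play_once_without_credit():
    # Without credit, the whole gain or loss lands on this year's income.
    return 0.5 * utility(INCOME + WIN) + 0.5 * utility(INCOME + LOSS)

baseline = utility(INCOME)
with_credit = play_once_with_credit()
without_credit = play_once_without_credit()

print(f"don't play:             {baseline:.3f}")
print(f"play once, with credit: {with_credit:.3f}")
print(f"play once, no credit:   {without_credit:.3f}")
```

The same bet raises expected utility slightly when the loss can be smoothed over a lifetime, and lowers it drastically when it cannot.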

So expected utility theory only seems to properly apply if we can play the game enough times that the improbable events are likely to happen a few times, but not so many times that we can be sure our money will approach the average. And that’s assuming we know the odds and we aren’t just stuck with uncertainty.

Unfortunately, I don’t have a good alternative; so far expected utility theory may actually be the best we have. But it remains deeply unsatisfying, and I like to think we’ll one day come up with something better.