Locked donation boxes and moral variation

Aug 8 JDN 2459435

I haven’t been able to find the quote, but I think it was Kahneman who once remarked: “Putting locks on donation boxes shows that you have the correct view of human nature.”

I consider this a deep insight. Allow me to explain.

Some people think that human beings are basically good. Rousseau is commonly associated with this view, a notion that, left to our own devices, human beings would naturally gravitate toward an anarchic but peaceful society.

The question for people who think this needs to be: Why haven’t we? If your answer is “government holds us back”, you still need to explain why we have government. Government was not imposed upon us from On High in time immemorial. We were fairly anarchic (though not especially peaceful) in hunter-gatherer tribes for nearly 200,000 years before we established governments. How did that happen?

And if your answer to that is “a small number of tyrannical psychopaths forced government on everyone else”, you may not be wrong about that—but it already breaks your original theory, because we’ve just shown that human society cannot maintain a peaceful anarchy indefinitely.

Other people think that human beings are basically evil. Hobbes is most commonly associated with this view, that humans are innately greedy, violent, and selfish, and only by the overwhelming force of a government can civilization be maintained.

This view more accurately predicts the level of violence and death that generally accompanies anarchy, and can at least explain why we’d want to establish government—but it still has trouble explaining how we would establish government. It’s not as if we’re ruled by a single ubermensch with superpowers, or an army of robots created by a mad scientist in a secret underground laboratory. Running a government involves cooperation on an absolutely massive scale—thousands or even millions of unrelated, largely anonymous individuals—and this cooperation is not maintained entirely by force: Yes, there is some force involved, but most of what a government does most of the time is mediated by norms and customs, and if a government did ever try to organize itself entirely by force—not paying any of the workers, not relying on any notion of patriotism or civic duty—it would immediately and catastrophically collapse.

What is the right answer? Humans aren’t basically good or basically evil. Humans are basically varied.

I would even go so far as to say that most human beings are basically good. They follow a moral code, they care about other people, they work hard to support others, they try not to break the rules. Nobody is perfect, and we all make various mistakes. We disagree about what is right and wrong, and sometimes we even engage in actions that we ourselves would recognize as morally wrong. But most people, most of the time, try to do the right thing.

But some people are better than others. There are great humanitarians, and then there are ordinary folks. There are people who are kind and compassionate, and people who are selfish jerks.

And at the very opposite extreme from the great humanitarians are the roughly 1% of people who are outright psychopaths. About 5-10% of people have significant psychopathic traits, but about 1% are really full-blown psychopaths.

I believe it is fair to say that psychopaths are in fact basically evil. They are incapable of empathy or compassion. Morality is meaningless to them—they literally cannot distinguish moral rules from other rules. Other people’s suffering—even their very lives—means nothing to them except insofar as it is instrumentally useful. To a psychopath, other people are nothing more than tools, resources to be exploited—or obstacles to be removed.

Some philosophers have argued that this means that psychopaths are incapable of moral responsibility. I think this is wrong. I think it relies on a naive, pre-scientific notion of what “moral responsibility” is supposed to mean—one that was inevitably going to be destroyed once we had a greater understanding of the brain. Do psychopaths understand the consequences of their actions? Yes. Do rewards motivate psychopaths to behave better? Yes. Does the threat of punishment motivate them? Not really, but it was never that effective on anyone else, either. What kind of “moral responsibility” are we still missing? And how would our optimal action change if we decided that they do or don’t have moral responsibility? Would you still imprison them for crimes either way? Maybe it doesn’t matter whether or not it’s really a blegg.

Psychopaths are a small portion of our population, but are responsible for a large proportion of violent crimes. They are also overrepresented in top government positions and among police officers, and it’s pretty safe to say that nearly every murderous dictator was a psychopath of one shade or another.

The vast majority of people are not psychopaths, and most people don’t even have any significant psychopathic traits. Yet psychopaths have an enormously disproportionate impact on society—nearly all of it harmful. If psychopaths did not exist, Rousseau might be right after all; we wouldn’t need government. If most people were psychopaths, Hobbes would be right; we’d long for the stability and security of government, but we could never actually cooperate enough to create it.

This brings me back to the matter of locked donation boxes.

Having a donation box is only worthwhile if most people are basically good: Asking people to give money freely in order to achieve some good only makes any sense if people are capable of altruism, empathy, cooperation. And it can’t be just a few, because you’d never raise enough money to be useful that way. It doesn’t have to be everyone, or maybe even a majority; but it has to be a large fraction. 90% is more than enough.

But locking things is only worthwhile if some people are basically evil: For a lock to make sense, there must be at least a few people who would be willing to break in and steal the money, even if it was earmarked for a very worthy cause. It doesn’t take a huge fraction of people, but it must be more than a negligible one. 1% to 10% is just about the right sort of range.

Hence, locked donation boxes are a phenomenon that would only exist in a world where most people are basically good—but some people are basically evil.

And this is in fact the world in which we live. It is a world where the Holocaust could happen but then be followed by the founding of the United Nations, a world where nuclear weapons would be invented and used to devastate cities, but then be followed by an era of nearly unprecedented peace. It is a world where governments are necessary to rein in violence, but also a world where governments can function (reasonably well) even in countries with hundreds of millions of people. It is a world with crushing poverty and people who work tirelessly to end it. It is a world where Exxon and BP despoil the planet for riches while WWF and Greenpeace fight back. It is a world where religions unite millions of people under a banner of peace and justice, and then go on crusades to murder thousands of other people who united under a different banner of peace and justice. It is a world of richness, complexity, uncertainty, conflict—variance.

It is not clear how much of this moral variance is innate versus acquired. If we somehow rewound the film of history and started it again with a few minor changes, it is not clear how many of us would end up the same and how many would be far better or far worse than we are. Maybe psychopaths were born the way they are, or maybe they were made that way by culture or trauma or lead poisoning. Maybe with the right upbringing or brain damage, we, too, could be axe murderers. Yet the fact remains—there are axe murderers, but we, and most people, are not like them.

So, are people good, or evil? Was Rousseau right, or Hobbes? Yes. Both. Neither. There is no one human nature; there are many human natures. We are capable of great good and great evil.

When we plan how to run a society, we must make it work the best we can with that in mind: We can assume that most people will be good most of the time—but we know that some people won’t, and we’d better be prepared for them as well.

Set out your donation boxes with confidence. But make sure they are locked.

“DSGE or GTFO”: Macroeconomics took a wrong turn somewhere

Dec 31, JDN 2458119

“The state of macro is good,” wrote Olivier Blanchard—in August 2008. This is rather like the turkey who is so pleased with how the farmer has been feeding him lately, the day before Thanksgiving.

It’s not easy to say exactly where macroeconomics went wrong, but I think Paul Romer is right when he makes the analogy between DSGE (dynamic stochastic general equilibrium) models and string theory. They are mathematically complex and difficult to understand, and people can make their careers by being the only ones who grasp them; therefore they must be right! Never mind if they have no empirical support whatsoever.

To be fair, DSGE models are at least a little better than string theory; they can at least be fit to real-world data, which is better than string theory can say. But being fit to data and actually predicting data are fundamentally different things, and DSGE models typically forecast no better than far simpler models without their bold assumptions. You don’t need to assume all this stuff about a “representative agent” maximizing a well-defined utility function, or an Euler equation (that doesn’t even fit the data), or this ever-proliferating list of “random shocks” that end up taking up all the degrees of freedom your model was supposed to explain. Just regressing the variables on a few years of previous values of each other (a “vector autoregression” or VAR) generally gives you an equally-good forecast. The fact that these models can be made to fit the data well if you add enough degrees of freedom doesn’t actually make them good models. As Von Neumann warned us, with enough free parameters, you can fit an elephant.

But really what bothers me is not the DSGE but the GTFO (“get the [expletive] out”); it’s not that DSGE models are used, but that it’s almost impossible to get published as a macroeconomic theorist using anything else. Defenders of DSGE typically don’t even argue anymore that it is good; they argue that there are no credible alternatives. They characterize their opponents as “dilettantes” who aren’t opposing DSGE because we disagree with it; no, it must be because we don’t understand it. (Also, regarding that post, I’d just like to note that I now officially satisfy the Athreya Axiom of Absolute Arrogance: I have passed my qualifying exams in a top-50 economics PhD program. Yet my enmity toward DSGE has, if anything, only intensified.)

Of course, that argument only makes sense if you haven’t been actively suppressing all attempts to formulate an alternative, which is precisely what DSGE macroeconomists have been doing for the last two or three decades. And yet despite this suppression, there are alternatives emerging, particularly from the empirical side. There are now empirical approaches to macroeconomics that don’t use DSGE models. Regression discontinuity methods and other “natural experiment” designs—not to mention actual experiments—are quickly rising in popularity as economists realize that these methods allow us to actually empirically test our models instead of just adding more and more mathematical complexity to them.

But there still seems to be a lingering attitude that there is no other way to do macro theory. This is very frustrating for me personally, because deep down I think what I would like to do as a career is macro theory: By temperament I have always viewed the world through a very abstract, theoretical lens, and the issues I care most about—particularly inequality, development, and unemployment—are all fundamentally “macro” issues. I left physics when I realized I would be expected to do string theory. I don’t want to leave economics now that I’m expected to do DSGE. But I also definitely don’t want to do DSGE.

Fortunately with economics I have a backup plan: I can always be an “applied microeconomist” (rather the opposite of a theoretical macroeconomist I suppose), directly attached to the data in the form of empirical analyses or even direct, randomized controlled experiments. And there certainly is plenty of work to be done along the lines of Akerlof and Roth and Shiller and Kahneman and Thaler in cognitive and behavioral economics, which is also generally considered applied micro. I was never going to be an experimental physicist, but I can be an experimental economist. And I do get to use at least some theory: In particular, there’s an awful lot of game theory in experimental economics these days. Some of the most exciting stuff is actually in showing how human beings don’t behave the way classical game theory predicts (particularly in the Ultimatum Game and the Prisoner’s Dilemma), and trying to extend game theory into something that would fit our actual behavior. Cognitive science suggests that the result is going to end up looking quite different from game theory as we know it, and with my cognitive science background I may be particularly well-positioned to lead that charge.

Still, I don’t think I’ll be entirely satisfied if I can’t somehow bring my career back around to macroeconomic issues, and particularly the great elephant in the room of all economics, which is inequality. Underlying everything from Marxism to Trumpism, from the surging rents in Silicon Valley to the crushing poverty of Burkina Faso to the Great Recession itself, is inequality. It is, in my view, the central question of economics: Who gets what, and why?

That is a fundamentally macro question, but you can’t even talk about that issue in DSGE as we know it; a “representative agent” inherently smooths over all inequality in the economy as though total GDP were all that mattered. A fundamentally new approach to macroeconomics is needed. Hopefully I can be part of that, but from my current position I don’t feel much empowered to fight this status quo. Maybe I need to spend at least a few more years doing something else, making a name for myself, and then I’ll be able to come back to this fight with a stronger position.

In the meantime, I guess there’s plenty of work to be done on cognitive biases and deviations from game theory.

Why New Year’s resolutions fail

Jan 1, JDN 2457755

Last week’s post was on Christmas, so by construction this week’s post will be on New Year’s Day.

It is a tradition in many cultures, especially in the US and Europe, to start every new year with a New Year’s resolution, a promise to ourselves to change our behavior in some positive way.

Yet, over 80% of these resolutions fail. Why is this?

If we are honest, most of us would agree that there is something about our own behavior that could stand to be improved. So why do we so rarely succeed in actually making such improvements?

One possibility, which I’m guessing most neoclassical economists would favor, is to say that we don’t actually want to. We may pretend that we do in order to appease others, but ultimately our rational optimization has already chosen that we won’t actually bear the cost to make the improvement.

I think this is actually quite rare. I’ve seen too many people with resolutions they didn’t share with anyone, for example, to think that it’s all about social pressure. And I’ve seen far too many people try very hard to achieve their resolutions, day after day, and yet still fail.

Sometimes we make resolutions that are not entirely within our control, such as “get a better job” or “find a girlfriend” (last year I made a resolution to publish a work of commercial fiction or a peer-reviewed article—and alas, failed at that task, unless I somehow manage it in the next few days). Such resolutions may actually be unwise to make in the first place, as it can feel like breaking a promise to yourself when you’ve actually done all you possibly could.

So let’s set those aside and talk only about things we should be in control over, like “lose weight” or “save more money”. Even these kinds of resolutions typically fail; why? What is this “weakness of will”? How is it possible to really want something that you are in full control over, and yet still fail to accomplish it?

Well, first of all, I should be clear what I mean by “in full control over”. In some sense you’re not in full control, which is exactly the problem. Your conscious mind is not actually an absolute tyrant over your entire body; you’re more like an elected president who has to deal with a legislature in order to enact policy.

You do have a great deal of power over your own behavior, and you can learn to improve this control (much as real executive power in presidential democracies has expanded over the last century!); but there are fundamental limits to just how well you can actually consciously will your body to do anything, limits imposed by billions of years of evolution that established most of the traits of your body and nervous system millions of generations before there even was such a thing as rational conscious reasoning.

One thing that makes a surprisingly large difference lies in whether your goals are reduced to specific, actionable objectives. “Lose weight” is almost guaranteed to fail. “Lose 30 pounds” is still unlikely to succeed. “Work out for 2 hours per week,” on the other hand, might have a chance. “Save money” is never going to make it, but “move to a smaller apartment and set aside $200 per month” just might.

I think the government metaphor is helpful here; if you are President of the United States and you want something done, do you state some vague, broad goal like “Improve the economy”? No, you make a specific, actionable demand that allows you to enforce compliance, like “increase infrastructure spending by 24% over the next 5 years”. Even then it is possible to fail if you can’t push it through the legislature (in the metaphor, the “legislature” is your habits, instincts and other subconscious processes), but you’re much more likely to succeed if you have a detailed plan.

Another technique that helps is to visualize the benefits of succeeding and the costs of failing, and keep these in your mind. This counteracts the tendency for the costs of succeeding and the benefits of giving up to be more salient—losing 30 pounds sounds nice in theory, but that treadmill is so much work right now!

This salience effect has a lot to do with the fact that human beings are terrible at dealing with the future.

Rationally, we are supposed to use exponential discounting; each successive moment is supposed to be worth less to us than the previous by a fixed proportion, say 5% per year. This is actually a mathematical theorem; if you don’t discount this way, your decisions will be systematically irrational.

And yet… we don’t discount that way. Some behavioral economists argue that we use hyperbolic discounting, in which instead of discounting time by a fixed proportion, we use a different formula that drops off too quickly early on and not quickly enough later on.

But I am increasingly convinced that human beings don’t actually use discounting at all. We have a series of rough-and-ready heuristics for making future judgments, which can sort of act like discounting, but require far less computation than actually calculating a proper discount rate. (Recent empirical evidence seems to be tilting this direction.)

In any case, whatever we do is clearly not a proper rational discount rate. And this means that our behavior can be time-inconsistent; a choice that seems rational at one time can not seem rational at a later time. When we’re planning out our year and saying we will hit the treadmill more, it seems like a good idea; but when we actually get to the gym and feel our legs ache as we start running, we begin to regret our decision.
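The time-inconsistency point can be made concrete with a small sketch. The hyperbolic weight used here, 1 / (1 + k * delay), is one common textbook form and is an assumption, as are the steepness k = 1 and the payoffs of 100 and 120; the 5% annual rate is the one mentioned above. The choice is between a smaller payoff at some date and a larger payoff one year after that date.

```python
def exponential_weight(delay_years, rate=0.05):
    """Weight on a payoff delay_years away, shrinking by a fixed proportion each year."""
    return (1 - rate) ** delay_years


def hyperbolic_weight(delay_years, k=1.0):
    """Assumed hyperbolic form 1 / (1 + k * delay); k is a hypothetical steepness."""
    return 1 / (1 + k * delay_years)


def preferred(weight, wait_years):
    """Choose between 100 after wait_years and 120 one year after that."""
    sooner = 100 * weight(wait_years)
    later = 120 * weight(wait_years + 1)
    return "sooner" if sooner > later else "later"


# Compare the plan made ten years ahead with the choice made on the day itself.
for wait in (10, 0):
    print(wait, preferred(exponential_weight, wait), preferred(hyperbolic_weight, wait))
```

Under the exponential weight the ranking of the two payoffs is the same at every distance, because both sides are scaled by the same factor. Under the hyperbolic weight the plan made ten years ahead favors waiting, while the choice on the day favors the immediate payoff, which is the same reversal as the treadmill that looked like a good idea in January.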

The challenge, really, is determining which “version” of us is correct! A priori, we don’t actually know whether the view of our distant self contemplating the future or the view of our current self making the choice in the moment is the right one. Actually, when I frame it this way, it almost seems like the self that’s closer to the choice should have better information—and yet typically we think the exact opposite, that it is our past self making plans that really knows what’s best for us.

So where does that come from? Why do we think, at least in most cases, that the “me” which makes a plan a year in advance is the smart one, and the “me” that actually decides in the moment is untrustworthy?

Kahneman has a good explanation for this, in his model of System 1 and System 2. System 1 is simple and fast, but often gets the wrong answer. System 2 usually gets the right answer, but it is complex and slow. When we are making plans, we have a lot of time to think, and we can afford to expend the extra effort to engage the full power of System 2. But when we are living in the moment, choosing what to do right now, we don’t have that luxury of time, and we are forced to fall back on System 1. System 1 is easier—but it’s also much more likely to be wrong.

How, then, do we resolve this conflict? Commitment. (Perhaps that’s why it’s called a New Year’s resolution!)

We make promises to ourselves, commitments that we will feel bad about not following through.

If we rationally discounted, this would be a baffling thing to do; we’re just imposing costs on ourselves for no reason. But because we don’t discount rationally, commitments allow us to change the calculation for our future selves.

This brings me to one last strategy to use when making your resolutions: Include punishment.

“I will work out at least 2 hours per week, and if I don’t, I’m not allowed to watch TV all weekend.” Now that is a resolution you are actually likely to keep.

To see why, consider the decision problem for your System 2 self today versus your System 1 self throughout the year.

Your System 2 self has done the cost-benefit analysis and ruled that working out 2 hours per week is worthwhile for its health benefits.

If you left it at that, your System 1 self would each day find an excuse to procrastinate the workouts, because at least from where they’re sitting, working out for 2 hours looks a lot more painful than the marginal loss in health from missing just this one week. And of course this will keep happening, week after week—and then 52 weeks go by and you’ve had few if any workouts.

But by adding the punishment of “no TV”, you have imposed an additional cost on your System 1 self, something that they care about. Suddenly the calculation changes; it’s not just 2 hours of workout weighed against vague long-run health benefits, but 2 hours of workout weighed against no TV all weekend. That punishment is surely too much to bear; so you’d best do the workout after all.

Do it right, and you will rarely if ever have to impose the punishment. But don’t make it too large, or then it will seem unreasonable and you won’t want to enforce it if you ever actually need to. Your System 1 self will then know this, and treat the punishment as nonexistent. (Formally the equilibrium is not subgame perfect; I am gravely concerned that our nuclear deterrence policy suffers from precisely this flaw.) “If I don’t work out, I’ll kill myself” is a recipe for depression, not healthy exercise habits.
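Here is a rough sketch of that decision problem, with everything measured in made-up "pain" units. The function name, the numbers, and the hard credibility cutoff are all illustrative assumptions; the only ideas taken from the text are that System 1 weighs the workout against what skipping costs right now, and that a punishment too large to enforce is treated as nonexistent.

```python
def system1_works_out(workout_pain, felt_health_loss, punishment, max_credible_punishment):
    """Return True if skipping feels worse than working out, as judged in the moment.

    A punishment above max_credible_punishment is assumed to be one you would
    never actually enforce, so System 1 counts it as zero.
    """
    enforced = punishment if punishment <= max_credible_punishment else 0.0
    cost_of_skipping = felt_health_loss + enforced
    return cost_of_skipping > workout_pain


# Hypothetical values: the workout hurts 5, one missed week of health feels like 1.
print(system1_works_out(5.0, 1.0, 0.0, 20.0))     # no punishment attached
print(system1_works_out(5.0, 1.0, 8.0, 20.0))     # "no TV all weekend"
print(system1_works_out(5.0, 1.0, 1000.0, 20.0))  # absurdly large, so not credible
```

With these numbers only the middle case gets the workout done: no punishment leaves skipping cheap, and an absurd punishment is discounted to nothing.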

But if you set clear, actionable objectives and sufficient but reasonable punishments, there’s at least a good chance you will be in the minority of people who actually succeed in keeping their New Year’s resolution.

And if not, there’s always next year.

Experimentally testing categorical prospect theory

Dec 4, JDN 2457727

In last week’s post I presented a new theory of probability judgments, which doesn’t rely upon people performing complicated math even subconsciously. Instead, I hypothesize that people try to assign categories to their subjective probabilities, and throw away all the information that wasn’t used to assign that category.

The way to most clearly distinguish this from cumulative prospect theory is to show discontinuity. Kahneman’s smooth, continuous function places fairly strong bounds on just how much a shift from 0% to 0.000001% can really affect your behavior. In particular, if you want to explain the fact that people do seem to behave differently around 10% compared to 1% probabilities, you can’t allow the slope of the smooth function to get much higher than 10 at any point, even near 0 and 1. (It does depend on the precise form of the function, but the more complicated you make it, the more free parameters you add to the model. In the most parsimonious form, which is a cubic polynomial, the maximum slope is actually much smaller than this—only 2.)

If that’s the case, then switching from 0% to 0.0001% should have no more effect in reality than a switch from 0% to 0.00001% would to a rational expected utility optimizer. But in fact I think I can set up scenarios where it would have a larger effect than a switch from 0.001% to 0.01%.

Indeed, these games are already quite profitable for the majority of US states, and they are called lotteries.

Rationally, it should make very little difference to you whether your odds of winning the Powerball are 0 (you bought no ticket) or 0.000000001% (you bought a ticket), even when the prize is $100 million. This is because your utility of $100 million is nowhere near 100 million times as large as your marginal utility of $1. A good guess would be that your lifetime income is about $2 million, your utility is logarithmic, the units of utility are hectoQALY, and the baseline level is about 100,000.

I apologize for the extremely large number of decimals, but I had to do that in order to show any difference at all. I have bolded where the decimals first deviate from the baseline.

Your utility if you don’t have a ticket is ln(20) = 2.9957322736 hQALY.

Your utility if you have a ticket is (1-10^-9) ln(20) + 10^-9 ln(1020) = 2.9957322775 hQALY.

You gain a whopping 0.4 microQALY over your whole lifetime (about 4*10^-9 hectoQALY). I highly doubt you could even perceive such a difference.

And yet, people are willing to pay nontrivial sums for the chance to play such lotteries. Powerball tickets sell for about $2 each, and some people buy tickets every week. If you do that and live to be 80, you will spend some $8,000 on lottery tickets during your lifetime, which results in this expected utility: (1-4*10^-6) ln(20-0.08) + 4*10^-6 ln(1020) = 2.9917399955 hQALY.
You have now sacrificed 0.004 hectoQALY, which is to say 0.4 QALY—that’s months of happiness you’ve given up to play this stupid pointless game.

Which shouldn’t be surprising, as (with 99.9996% probability) you have given up four months of your lifetime income with nothing to show for it. Lifetime income of $2 million / lifespan of 80 years = $25,000 per year; $8,000 / $25,000 = 0.32. You’ve actually sacrificed slightly more than this, which comes from your risk aversion.
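The lottery arithmetic above can be reproduced with a few lines of Python. This sketch uses the stated assumptions (logarithmic utility in hQALY, a $100,000 baseline, $2 million lifetime income, a $100 million prize) and, like the formulas above, ignores the ticket cost in the winning branch. The function and constant names are mine.

```python
import math

BASELINE = 100_000           # dollars, the baseline level assumed in the text
LIFETIME_INCOME = 2_000_000  # dollars
JACKPOT = 100_000_000        # dollars


def utility(wealth):
    """Logarithmic utility in hQALY, measured relative to the baseline."""
    return math.log(wealth / BASELINE)


def expected_utility(p_win, spent):
    """Expected utility after spending `spent` dollars for a p_win chance at the jackpot."""
    lose = utility(LIFETIME_INCOME - spent)
    win = utility(LIFETIME_INCOME + JACKPOT)  # ln(1020), ticket cost ignored as in the text
    return (1 - p_win) * lose + p_win * win


no_ticket = expected_utility(0.0, 0)
one_ticket = expected_utility(1e-9, 0)
weekly_tickets = expected_utility(4e-6, 8_000)

print(f"no ticket:      {no_ticket:.10f} hQALY")
print(f"one ticket:     {one_ticket:.10f} hQALY")
print(f"weekly tickets: {weekly_tickets:.10f} hQALY")
print(f"lifetime loss:  {no_ticket - weekly_tickets:.4f} hQALY")
```

The gain from a single ticket only shows up around the ninth decimal place, while a lifetime of weekly tickets costs about 0.004 hQALY.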

Why would anyone do such a thing? Because while the difference between 0 and 10^-9 may be trivial, the difference between “impossible” and “almost impossible” feels enormous. “You can’t win if you don’t play!” they say, but they might as well say “You can’t win if you do play either.” Indeed, the probability of winning without playing isn’t zero; you could find a winning ticket lying on the ground, or win due to an error that is then upheld in court, or be given the winnings bequeathed by a dying family member or gifted by an anonymous donor. These are of course vanishingly unlikely—but so was winning in the first place. You’re talking about the difference between 10^-9 and 10^-12, which in proportional terms sounds like a lot—but in absolute terms is nothing. If you drive to a drug store every week to buy a ticket, you are more likely to die in a car accident on the way to the drug store than you are to win the lottery.

Of course, these are not experimental conditions. So I need to devise a similar game, with smaller stakes but still large enough for people’s brains to care about the “almost impossible” category; maybe thousands? It’s not uncommon for an economics experiment to cost thousands, it’s just usually paid out to many people instead of randomly to one person or nobody. Conducting the experiment in an underdeveloped country like India would also effectively amplify the amounts paid, but at the fixed cost of transporting the research team to India.

But I think in general terms the experiment could look something like this. You are given $20 for participating in the experiment (we treat it as already given to you, to maximize your loss aversion and endowment effect and thereby give us more bang for our buck). You then have a chance to play a game, where you pay $X to get a P probability of $Y*X, and we vary these numbers.

The actual participants wouldn’t see the variables, just the numbers and possibly the rules: “You can pay $2 for a 1% chance of winning $200. You can also play multiple times if you wish.” “You can pay $10 for a 5% chance of winning $250. You can only play once or not at all.”
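For designing the offers, it helps to know what a risk-neutral expected-value maximizer would make of each one. A minimal sketch, using the two example offers above; the function name is hypothetical, and the prize is entered directly rather than as Y times X.

```python
def expected_net_gain(price, p_win, prize):
    """Expected dollar gain from playing once: win `prize` with probability p_win, pay `price`."""
    return p_win * prize - price


offers = [
    ("$2 for a 1% chance of $200", 2, 0.01, 200),
    ("$10 for a 5% chance of $250", 10, 0.05, 250),
]

for label, price, p_win, prize in offers:
    print(f"{label}: expected net gain ${expected_net_gain(price, p_win, prize):.2f}")
```

The first offer is actuarially fair and the second has a positive expected value of $2.50, so any systematic difference in how often people accept them has to come from something other than expected dollars.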

So I think the first step is to find some dilemmas, cases where people feel ambivalent, and different people differ in their choices. That’s a good role for a pilot study.

Then we take these dilemmas and start varying their probabilities slightly.

In particular, we try to vary them at the edge of where people have mental categories. If subjective probability is continuous, a slight change in actual probability should never result in a large change in behavior, and furthermore the effect of a change shouldn’t vary too much depending on where the change starts.

But if subjective probability is categorical, these categories should have edges. Then, when I present you with two dilemmas that are on opposite sides of one of the edges, your behavior should radically shift; while if I change it in a different way, I can make a large change without changing the result.

Based solely on my own intuition, I guessed that the categories roughly follow this pattern:

Impossible: 0%

Almost impossible: 0.1%

Very unlikely: 1%

Unlikely: 10%

Fairly unlikely: 20%

Roughly even odds: 50%

Fairly likely: 80%

Likely: 90%

Very likely: 99%

Almost certain: 99.9%

Certain: 100%

So for example, if I switch from 0% to 0.01%, it should have a very large effect, because I’ve moved you out of your “impossible” category (indeed, I think the “impossible” category is almost completely sharp; literally anything above zero seems to be enough for most people, even 10^-9 or 10^-10). But if I move from 1% to 2%, it should have a small effect, because I’m still well within the “very unlikely” category. Yet the latter change is literally one hundred times larger than the former. It is possible to define continuous functions that would behave this way to an arbitrary level of approximation—but they get a lot less parsimonious very fast.
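To make the categorical idea concrete, here is one possible way to turn the list above into a function. The category names and prototype values come from the list; the rule for where the edges fall (nearest prototype on a log-odds scale, with exactly 0 and exactly 1 as their own sharp categories) is purely an assumption of this sketch, since the actual edges are what the experiment would have to find.

```python
import math

# Prototype probabilities from the guessed list; "Impossible" and "Certain" are handled separately.
PROTOTYPES = [
    ("Almost impossible", 0.001),
    ("Very unlikely", 0.01),
    ("Unlikely", 0.10),
    ("Fairly unlikely", 0.20),
    ("Roughly even odds", 0.50),
    ("Fairly likely", 0.80),
    ("Likely", 0.90),
    ("Very likely", 0.99),
    ("Almost certain", 0.999),
]


def log_odds(p):
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1 - p))


def categorize(p):
    """Assign a probability to a category (assumed rule: nearest prototype in log-odds)."""
    if p == 0:
        return "Impossible"
    if p == 1:
        return "Certain"
    nearest = min(PROTOTYPES, key=lambda item: abs(log_odds(item[1]) - log_odds(p)))
    return nearest[0]


for p in (0.0, 0.0001, 0.01, 0.02):
    print(f"{p:.4%} -> {categorize(p)}")
```

Under this rule the move from 0% to 0.01% crosses a category edge while the move from 1% to 2% does not, even though the second change is a hundred times larger.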

Now, immediately I run into a problem, because I’m not even sure those are my categories, much less that they are everyone else’s. If I knew precisely which categories to look for, I could tell whether or not I had found it. But the process of both finding the categories and determining if their edges are truly sharp is much more complicated, and requires a lot more statistical degrees of freedom to get beyond the noise.

One thing I’m considering is assigning these values as a prior, and then conducting a series of experiments which would adjust that prior. In effect I would be using optimal Bayesian probability reasoning to show that human beings do not use optimal Bayesian probability reasoning. Still, I think that actually pinning down the categories would require a large number of participants or a long series of experiments (in frequentist statistics this distinction is vital; in Bayesian statistics it is basically irrelevant—one of the simplest reasons to be Bayesian is that it no longer bothers you whether someone did 2 experiments of 100 people or 1 experiment of 200 people, provided they were the same experiment of course). And of course there’s always the possibility that my theory is totally off-base, and I find nothing; a dissertation replicating cumulative prospect theory is a lot less exciting (and, sadly, less publishable) than one refuting it.
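The point about 2 experiments of 100 versus 1 experiment of 200 can be illustrated with the simplest Bayesian model, a Beta prior on a binomial proportion. This is only a sketch: the uniform Beta(1, 1) prior and the counts of participants who accept an offer are invented for illustration.

```python
def update(prior, successes, failures):
    """Beta-binomial update: add observed counts to the (alpha, beta) parameters."""
    alpha, beta = prior
    return (alpha + successes, beta + failures)


prior = (1, 1)  # uniform Beta(1, 1) prior, an assumption of this sketch

# Two experiments of 100 participants each (hypothetical counts of who accepted the offer).
two_experiments = update(update(prior, 37, 63), 41, 59)

# One experiment of 200 participants with the same total counts.
one_experiment = update(prior, 78, 122)

print(two_experiments, one_experiment)
```

Both routes end at the same posterior, because the update depends only on the total counts and not on how the sessions were split.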

Still, I think something like this is worth exploring. I highly doubt that people are doing very much math when they make most probabilistic judgments, and using categories would provide a very good way for people to make judgments usefully with no math at all.

How do people think about probability?

Nov 27, JDN 2457690

(This topic was chosen by vote of my Patreons.)

In neoclassical theory, it is assumed (explicitly or implicitly) that human beings judge probability in something like the optimal Bayesian way: We assign prior probabilities to events, and then when confronted with evidence we use the observed data to update our prior probabilities to posterior probabilities. Then, when we have to make decisions, we maximize our expected utility subject to our posterior probabilities.

This, of course, is nothing like how human beings actually think. Even very intelligent, rational, numerate people only engage in a vague approximation of this behavior, and only when dealing with major decisions likely to affect the course of their lives. (Yes, I literally decide which universities to attend based upon formal expected utility models. Thus far, I’ve never been dissatisfied with a decision made that way.) No one decides what to eat for lunch or what to do this weekend based on formal expected utility models (or at least I hope they don’t), because at that point the computational cost far exceeds the expected benefit.

So how do human beings actually think about probability? Well, a good place to start is to look at ways in which we systematically deviate from expected utility theory.

A classic example is the Allais paradox. See if it applies to you.

In game A, you get $1 million, guaranteed.
In game B, you have a 10% chance of getting $5 million, an 89% chance of getting $1 million, but now you have a 1% chance of getting nothing.

Which do you prefer, game A or game B?

In game C, you have an 11% chance of getting $1 million, and an 89% chance of getting nothing.

In game D, you have a 10% chance of getting $5 million, and a 90% chance of getting nothing.

Which do you prefer, game C or game D?

I have to think about it for a little bit and do some calculations, and it’s still very hard because it depends crucially on my projected lifetime income (which could easily exceed $3 million with a PhD, especially in economics) and the precise form of my marginal utility (I think I have constant relative risk aversion, but I’m not sure what parameter to use precisely), but in general I think I want to choose game A and game C, but I actually feel really ambivalent, because it’s not hard to find plausible parameters for my utility where I should go for the gamble.

But if you’re like most people, you choose game A and game D.

There is no coherent expected utility by which you would do this.

Why? Either a 10% chance of $5 million instead of $1 million is worth risking a 1% chance of nothing, or it isn’t. If it is, you should play B and D. If it’s not, you should play A and C. I can’t tell you for sure whether it is worth it—I can’t even fully decide for myself—but it either is or it isn’t.
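The claim that no expected utility function can favor both A and D can be checked numerically. What follows is only a sketch of my own: the helper names and the sample utility functions are hypothetical, chosen purely for illustration. The point it tests is that the gap EU(A) - EU(B) is always identical to the gap EU(C) - EU(D), so the two choices must go the same way.

```python
import math

# Each game is a list of (probability, payoff in millions of dollars),
# taken directly from the four games described above.
GAMES = {
    "A": [(1.00, 1)],
    "B": [(0.10, 5), (0.89, 1), (0.01, 0)],
    "C": [(0.11, 1), (0.89, 0)],
    "D": [(0.10, 5), (0.90, 0)],
}

# Hypothetical utility functions, purely for illustration.
UTILITIES = {
    "linear": lambda x: x,
    "square root": lambda x: math.sqrt(x),
    "log(1 + x)": lambda x: math.log1p(x),
    "very risk-averse": lambda x: 1 - math.exp(-5 * x),
}


def expected_utility(game, u):
    """Probability-weighted average of u over the outcomes of a game."""
    return sum(p * u(x) for p, x in game)


def preference_gaps(u):
    """Return (EU(A) - EU(B), EU(C) - EU(D)) for utility function u."""
    a_minus_b = expected_utility(GAMES["A"], u) - expected_utility(GAMES["B"], u)
    c_minus_d = expected_utility(GAMES["C"], u) - expected_utility(GAMES["D"], u)
    return a_minus_b, c_minus_d


for name, u in UTILITIES.items():
    ab, cd = preference_gaps(u)
    first = "A" if ab > 0 else "B"
    second = "C" if cd > 0 else "D"
    print(f"{name:>16}: A-B = {ab:+.5f}, C-D = {cd:+.5f} -> {first} and {second}")
```

Whatever utility function you plug in, the two gaps agree up to floating-point rounding, so you get either A with C or B with D, never A with D.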

Yet most people have a strong intuition that they should take game A but game D. Why? What does this say about how we judge probability?

The leading theory in behavioral economics right now is cumulative prospect theory, developed by the great Kahneman and Tversky, who essentially founded the field of behavioral economics. It’s quite intimidating to try to go up against them, which is probably why we should force ourselves to do it. Fear of challenging the favorite theories of the great scientists before us is how science stagnates.

I wrote about it more in a previous post, but as a brief review, cumulative prospect theory says that instead of judging based on a well-defined utility function, we instead consider gains and losses as fundamentally different sorts of thing, and in three specific ways:

First, we are loss-averse; we feel a loss about twice as intensely as a gain of the same amount.

Second, we are risk-averse for gains, but risk-seeking for losses; we assume that gaining twice as much isn’t actually twice as good (which is almost certainly true), but we also assume that losing twice as much isn’t actually twice as bad (which is almost certainly false and indeed contradictory with the previous).

Third, we judge probabilities as more important when they are close to certainty. We make a large distinction between a 0% probability and a 0.0000001% probability, but almost no distinction at all between a 41% probability and a 43% probability.

That last part is what I want to focus on for today. In Kahneman’s model, this is a continuous, monotonic function that maps 0 to 0 and 1 to 1, but systematically overestimates probabilities below but near 1/2 and systematically underestimates probabilities above but near 1/2.

It looks something like this, where red is true probability and blue is subjective probability:

[Figure: cumulative prospect theory, true probability (red) versus subjective probability (blue)]

I don’t believe this is actually how humans think, for two reasons:

  1. It’s too hard. Humans are astonishingly innumerate creatures, given the enormous processing power of our brains. It’s true that we have some intuitive capacity for “solving” very complex equations, but that’s almost all within our motor system—we can “solve a differential equation” when we catch a ball, but we have no idea how we’re doing it. But probability judgments are often made consciously, especially in experiments like the Allais paradox; and the conscious brain is terrible at math. It’s actually really amazing how bad we are at math. Any model of normal human judgment should assume from the start that we will not do complicated math at any point in the process. Maybe you can hypothesize that we do so subconsciously, but you’d better have a good reason for assuming that.
  2. There is no reason to do this. Why in the world would any kind of optimization system function this way? You start with perfectly good probabilities, and then instead of using them, you subject them to some bizarre, unmotivated transformation that makes them less accurate and costs computing power? You may as well hit yourself in the head with a brick.

So, why might it look like we are doing this? Well, my proposal, admittedly still rather half-baked, is that human beings don’t assign probabilities numerically at all; we assign them categorically.

You may call this, for lack of a better term, categorical prospect theory.

My theory is that people don’t actually have in their head “there is an 11% chance of rain today” (unless they specifically heard that from a weather report this morning); they have in their head “it’s fairly unlikely that it will rain today”.

That is, we assign some small number of discrete categories of probability, and fit things into them. I’m not sure what exactly the categories are, and part of what makes my job difficult here is that they may be fuzzy-edged and vary from person to person, but roughly speaking, I think they correspond to the sort of things psychologists usually put on Likert scales in surveys: Impossible, almost impossible, very unlikely, unlikely, fairly unlikely, roughly even odds, fairly likely, likely, very likely, almost certain, certain. If I’m putting numbers on these probability categories, they go something like this: 0, 0.001, 0.01, 0.10, 0.20, 0.50, 0.8, 0.9, 0.99, 0.999, 1.

Notice that this would preserve the same basic effect as cumulative prospect theory: You care a lot more about differences in probability when they are near 0 or 1, because those are much more likely to actually shift your category. Indeed, as written, you wouldn’t care about a shift from 0.4 to 0.6 at all, despite caring a great deal about a shift from 0.001 to 0.01.
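To make the idea concrete, here is a minimal sketch of what such a categorizer might look like. The labels and representative values are the ones I listed above; everything else is an assumption made for illustration. In particular, the rule that a probability goes to the nearest representative value in log-odds, and that exactly 0 and exactly 1 are sharp categories of their own, is just one plausible choice and not a claim about where people’s boundaries really are.

```python
import math

# Category labels and representative probabilities as listed in the text.
CATEGORIES = [
    ("impossible", 0.0),
    ("almost impossible", 0.001),
    ("very unlikely", 0.01),
    ("unlikely", 0.10),
    ("fairly unlikely", 0.20),
    ("roughly even odds", 0.50),
    ("fairly likely", 0.80),
    ("likely", 0.90),
    ("very likely", 0.99),
    ("almost certain", 0.999),
    ("certain", 1.0),
]


def log_odds(p):
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1 - p))


def categorize(p):
    """Return the label of the category that probability p falls into.

    Assumption (hypothetical boundary rule): exactly 0 and exactly 1 are
    sharp categories; anything in between goes to the interior category
    whose representative value is nearest in log-odds.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    if p == 0.0:
        return "impossible"
    if p == 1.0:
        return "certain"
    interior = CATEGORIES[1:-1]
    label, _ = min(interior, key=lambda c: abs(log_odds(c[1]) - log_odds(p)))
    return label


for p in (0.0, 1e-9, 0.001, 0.01, 0.4, 0.6, 1.0):
    print(p, "->", categorize(p))
```

Under this rule 0.4 and 0.6 land in the same box, while 0.001 and 0.01 land in different ones, which is exactly the pattern described above.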

How does this solve the above problems?

  1. It’s easy. Not only don’t you compute a probability and then recompute it for no reason; you never even have to compute it precisely. Just get it within some vague error bounds and that will tell you what box it goes in. Instead of computing an approximation to a continuous function, you just slot things into a small number of discrete boxes, a dozen at the most.
  2. That explains why we would do it: It’s easy. Our brains need to conserve their capacity, and they did especially in our ancestral environment when we struggled to survive. Rather than having to iterate your approximation to arbitrary precision, you just get within 0.1 or so and call it a day. That saves time and computing power, which saves energy, which could save your life.

What new problems have I introduced?

  1. It’s very hard to know exactly where people’s categories are, if they vary between individuals or even between situations, and whether they are fuzzy-edged.
  2. If you take the model I just gave literally, even quite large probability changes will have absolutely no effect as long as they remain within a category such as “roughly even odds”.

With regard to 2, I think Kahneman may himself be able to save me, with his dual process theory concept of System 1 and System 2. What I’m really asserting is that System 1, the fast, intuitive judgment system, operates on these categories. System 2, on the other hand, the careful, rational thought system, can actually make use of proper numerical probabilities; it’s just very costly to boot up System 2 in the first place, much less ensure that it actually gets the right answer.

How might we test this? Well, I think that people are more likely to use System 1 when any of the following are true:

  1. They are under harsh time-pressure
  2. The decision isn’t very important
  3. The intuitive judgment is fast and obvious

And conversely they are likely to use System 2 when the following are true:

  1. They have plenty of time to think
  2. The decision is very important
  3. The intuitive judgment is difficult or unclear

So, it should be possible to arrange an experiment varying these parameters, such that in one treatment people almost always use System 1, and in another they almost always use System 2. And then, my prediction is that in the System 1 treatment, people will in fact not change their behavior at all when you change the probability from 15% to 25% (fairly unlikely) or 40% to 60% (roughly even odds).

To be clear, you can’t just present people with this choice between game E and game F:

Game E: You get a 60% chance of $50, and a 40% chance of nothing.

Game F: You get a 40% chance of $50, and a 60% chance of nothing.

People will obviously choose game E. If you can directly compare the numbers and one game is strictly better in every way, I think even without much effort people will be able to choose correctly.

Instead, what I’m saying is that if you make the following offers to two completely different sets of people, you will observe little difference in their choices, even though under expected utility theory you should.

Group I receives a choice between game E and game G:

Game E: You get a 60% chance of $50, and a 40% chance of nothing.

Game G: You get a 100% chance of $20.

Group II receives a choice between game F and game G:

Game F: You get a 40% chance of $50, and a 60% chance of nothing.

Game G: You get a 100% chance of $20.

Under two very plausible assumptions about marginal utility of wealth, I can fix what the rational judgment should be in each game.

The first assumption is that marginal utility of wealth is decreasing, so people are risk-averse (at least for gains, which these are). The second assumption is that most people’s lifetime income is at least two orders of magnitude higher than $50.

By the first assumption, group II should choose game G. The expected income is precisely the same, and being even ever so slightly risk-averse should make you go for the guaranteed $20.

By the second assumption, group I should choose game E. Yes, there is some risk, but because $50 should not be a huge sum to you, your risk aversion should be small and the higher expected income of $30 should sway you.
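These two predictions are easy to check with a small calculation. The sketch below makes assumptions that are not in the argument above: it uses logarithmic utility as one example of decreasing marginal utility, and a hypothetical background wealth of $5,000 (one hundred times the prize). Other risk-averse utility functions would give different numbers but, for stakes this small, the same ranking.

```python
import math

# Games as lists of (probability, prize in dollars), as described above.
GAME_E = [(0.6, 50), (0.4, 0)]
GAME_F = [(0.4, 50), (0.6, 0)]
GAME_G = [(1.0, 20)]

# Hypothetical background wealth: two orders of magnitude above the $50 prize.
WEALTH = 5_000


def expected_value(game):
    """Expected dollar payoff of a game."""
    return sum(p * x for p, x in game)


def expected_log_utility(game, wealth):
    """Expected utility under the assumed utility u(w) = ln(w)."""
    return sum(p * math.log(wealth + x) for p, x in game)


for name, game in (("E", GAME_E), ("F", GAME_F), ("G", GAME_G)):
    ev = expected_value(game)
    eu = expected_log_utility(game, WEALTH)
    print(f"Game {name}: expected value ${ev:.2f}, expected log utility {eu:.6f}")

# Group I should prefer E to G; group II should prefer G to F.
print("E beats G:", expected_log_utility(GAME_E, WEALTH) > expected_log_utility(GAME_G, WEALTH))
print("G beats F:", expected_log_utility(GAME_G, WEALTH) > expected_log_utility(GAME_F, WEALTH))
```

The margin between F and G is tiny, which is the point: at these stakes risk aversion barely matters, so it only decides the case where the expected values are tied.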

But I predict that most people will choose game G in both cases, and (within statistical error) the same proportion will choose F as chose E—thus showing that the difference between a 40% chance and a 60% chance was in fact negligible to their intuitive judgments.

However, this doesn’t actually disprove Kahneman’s theory; perhaps that part of the subjective probability function is just that flat. For that, I need to set up an experiment where I show discontinuity. I need to find the edge of a category and get people to switch categories sharply. Next week I’ll talk about how we might pull that off.

What is the price of time?

JDN 2457562

If you asked them outright, “What is the price of time?” most people would find the question nonsensical, as if you had asked “What is the diameter of calculus?” or “What is the electric charge of justice?” (It’s interesting that we generally try to assign meaning to such nonsensical questions, and they often seem strangely profound when we do; a good deal of what passes for “profound wisdom” is really better explained as this sort of reaction to nonsense. Deepak Chopra, for instance.)

But there is actually a quite sensible economic meaning of this question, and answering it turns out to have many important implications for how we should run our countries and how we should live our lives.

What we are really asking for is temporal discounting; we want to know how much more money today is worth compared to tomorrow, and how much more money tomorrow is worth compared to two days from now.

If you say that they are exactly the same, your discount rate (your “price of time”) is zero; if that is indeed how you feel, may I please borrow your entire net wealth at 0% interest for the next thirty years? If you like we can even inflation-index the interest rate so it always produces a real interest rate of zero, thus protecting you from potential inflation risk.

What? You don’t like my deal? You say you need that money sooner? Then your discount rate is not zero. Similarly, it can’t be negative; if you actually valued money tomorrow more than money today, you’d gladly give me my loan.

Money today is worth more to you than money tomorrow—the only question is how much more.

There’s a very simple theorem which says that as long as your temporal discounting doesn’t change over time, so it is dynamically consistent, it must have a very specific form. I don’t normally use math this advanced in my blog, but this one is so elegant I couldn’t resist. I’ll encase it in blockquotes so you can skim over it if you must.

The value of $1 today relative to… today is of course 1; f(0) = 1.

If you are dynamically consistent, at any time t you should discount tomorrow relative to today the same as you discounted today relative to yesterday, so for all t, f(t+1)/f(t) = f(t)/f(t-1).

Thus, f(t+1)/f(t) is independent of t, and therefore equal to some constant, which we can call r:

f(t+1)/f(t) = r, which implies f(t+1) = r f(t).

Starting at f(0) = 1, we have:

f(0) = 1, f(1) = r, f(2) = r^2

We can prove that this pattern continues to hold by mathematical induction.

Suppose the following is true for some integer k; we already know it works for k = 1:

f(k) = r^k

Let t = k:

f(k+1) = r f(k)

Therefore:

f(k+1) = r^(k+1)

Which by induction proves that for all integers n:

f(n) = r^n

The name of the variable doesn’t matter. Therefore:

f(t) = r^t

Whether you agree with me that this is beautiful, or you have no idea what I just said, the take-away is the same: If your discount rate is consistent over time, it must be exponential. There must be some constant number 0 < r < 1 such that each successive time period is worth r times as much as the previous. (You can also generalize this to the case of continuous time, where instead of r^t you get e^(-r t). This requires even more advanced math, so I’ll spare you.)

Most neoclassical economists would stop right there. But there are two very big problems with this argument:

(1) It doesn’t tell us the value r should actually be, only that it should be a constant.

(2) No actual human being thinks of time this way.

There is still ongoing research as to exactly how real human beings discount time, but one thing is quite clear from the experiments: It certainly isn’t exponential.

From about 2000 to 2010, the consensus among cognitive economists was that humans discount time hyperbolically; that is, our discount function looks like this:

f(t) = 1/(1 + r t)

In the 1990s there were a couple of experiments supporting hyperbolic discounting. There is even some theoretical work trying to show that this is actually optimal, given a certain kind of uncertainty about the future, and the argument for exponential discounting relies upon certainty we don’t actually have. Hyperbolic discounting could also result if we were reasoning as though we are given a simple interest rate, rather than a compound interest rate.
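The practical difference between the two functional forms is easiest to see in a small example. In the sketch below the parameter values (a daily factor of 0.9 for the exponential case, a daily rate of 0.2 for the hyperbolic case) and the $100 versus $110 choice are hypothetical, picked only to make the effect visible. It asks the same question twice: would you take the smaller reward one day sooner, first when the choice is immediate and then when both rewards are a year away?

```python
def exponential(t, r=0.9):
    """Exponential discounting f(t) = r**t, with t in days (r is hypothetical)."""
    return r ** t


def hyperbolic(t, r=0.2):
    """Hyperbolic discounting f(t) = 1/(1 + r*t), with t in days (r is hypothetical)."""
    return 1.0 / (1.0 + r * t)


def prefers_sooner(discount, delay, sooner=100, later=110, gap=1):
    """True if `sooner` dollars at `delay` days beats `later` dollars at `delay + gap` days."""
    return sooner * discount(delay) > later * discount(delay + gap)


for name, discount in (("exponential", exponential), ("hyperbolic", hyperbolic)):
    now = prefers_sooner(discount, delay=0)
    in_a_year = prefers_sooner(discount, delay=365)
    print(f"{name}: takes $100 today? {now}; takes $100 at day 365? {in_a_year}")
```

With the exponential function the answer is the same at both horizons, because only the ratio f(t+1)/f(t) matters and it is constant. With the hyperbolic function the answer flips: impatient today, patient about the identical trade-off a year out. That flip is what dynamic inconsistency means.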

But even that doesn’t really seem like how humans think, now does it? It’s already weird enough for someone to say “Should I take out this loan at 5%? Well, my discount rate is 7%, so yes.” But I can at least imagine that happening when people are comparing two different interest rates (“Should I pay down my student loans, or my credit cards?”). But I can’t imagine anyone thinking, “Should I take out this loan at 5% APR which I’d need to repay after 5 years? Well, let’s check my discount function, 1/(1+0.05 (5)) = 0.8, multiplied by 1.05^5 = 1.28, the product of which is 1.02, greater than 1, so no, I shouldn’t.” That isn’t how human brains function.

Moreover, recent experiments have shown that people often don’t seem to behave according to what hyperbolic discounting would predict.

Therefore I am very much in the other camp of cognitive economists, who say that we don’t have a well-defined discount function. It’s not exponential, it’s not hyperbolic, it’s not “quasi-hyperbolic” (yes that is a thing); we just don’t have one. We reason about time by simple heuristics. You can’t make a coherent function out of it because human beings… don’t always reason coherently.

Some economists seem to have an incredible amount of trouble accepting that; here we have one from the University of Chicago arguing that hyperbolic discounting can’t possibly exist, because then people could be Dutch-booked out of all their money; but this amounts to saying that human behavior cannot ever be irrational, lest all our money magically disappear. Yes, we know hyperbolic discounting (and heuristics) allow for Dutch-booking; that’s why they’re irrational. If you really want to know the formal assumption this paper makes that is wrong, it assumes that we have complete markets—and yes, complete markets essentially force you to be perfectly rational or die, because the slightest inconsistency in your reasoning results in someone convincing you to bet all your money on a sure loss. Why was it that we wanted complete markets, again? (Oh, yes, the fanciful Arrow-Debreu model, the magical fairy land where everyone is perfectly rational and all markets are complete and we all have perfect information and the same amount of wealth and skills and the same preferences, where everything automatically achieves a perfect equilibrium.)

There was a very good experiment on this, showing that rather than discount hyperbolically, behavior is better explained by a heuristic that people judge which of two options is better by a weighted sum of the absolute distance in time plus the relative distance in time. Now that sounds like something human beings might actually do. “$100 today or $110 tomorrow? That’s only 1 day away, but it’s also twice as long. I’m not waiting.” “$100 next year, or $110 in a year and a day? It’s only 1 day apart, and it’s only slightly longer, so I’ll wait.”

That might not actually be the precise heuristic we use, but it at least seems like one that people could use.

John Duffy, whom I hope to work with at UCI starting this fall, has been working on another experiment to test a different heuristic, based on the work of Daniel Kahneman, saying essentially that we have a fast, impulsive, System 1 reasoning layer and a slow, deliberative, System 2 reasoning layer; the result is that our judgments combine both “hand to mouth” where our System 1 essentially tries to get everything immediately and spend whatever we can get our hands on, and a more rational assessment by System 2 that might actually resemble an exponential discount rate. In the 5-minute judgment, System 1’s voice is overwhelming; but if we’re already planning a year out, System 1 doesn’t even care anymore and System 2 can take over. This model also has the nice feature of explaining why people with better self-control seem to behave more like they use exponential discounting, and why people do on occasion reason more or less exponentially, while I have literally never heard anyone try to reason hyperbolically, only economic theorists trying to use hyperbolic models to explain behavior.

Another theory is that discounting is “subadditive”, that is, if you break up a long time interval into many short intervals, people will discount it more, because it feels longer that way. Imagine a century. Now imagine a year, another year, another year, all the way up to 100 years. Now imagine a day, another day, another day, all the way up to 365 days for the first year, and then 365 days for the second year, and so on, up to 100 years. It feels longer, doesn’t it? It is of course exactly the same. This can account for some weird anomalies in choice behavior, but I’m not convinced it’s as good as the two-system model.

Another theory is that we simply have a “present bias”, which we treat as a sort of fixed cost that we incur regardless of what the payments are. I like this because it is so supremely simple, but there’s something very fishy about it, because in this experiment it was just fixed at $4, and that can’t be right. It must be fixed at some proportion of the rewards, or something like that; or else we would always exhibit near-perfect exponential discounting for large amounts of money, which is more expensive to test (quite directly), but still seems rather unlikely.

Why is this important? This post is getting long, so I’ll save it for future posts, but in short, the ways that we value future costs and benefits, both as we actually do, and as we ought to, have far-reaching implications for everything from inflation to saving to environmental sustainability.

Prospect Theory: Why we buy insurance and lottery tickets

JDN 2457061 PST 14:18.

Today’s topic is called prospect theory. Prospect theory is basically what put cognitive economics on the map; it was the knock-down argument that Kahneman used to show that human beings are not completely rational in their economic decisions. It all goes back to a 1979 paper by Kahneman and Tversky that now has 34000 citations (yes, we’ve been having this argument for a rather long time now). In the 1990s it was refined into cumulative prospect theory, which is more mathematically precise but basically the same idea.

What was that argument? People buy both insurance and lottery tickets.

The “both” is very important. Buying insurance can definitely be rational—indeed, typically is. Buying lottery tickets could theoretically be rational, under very particular circumstances. But they cannot both be rational at the same time.

To see why, let’s talk some more about marginal utility of wealth. Recall that a dollar is not worth the same to everyone; to a billionaire a dollar is a rounding error, to most of us it is a bottle of Coke, but to a starving child in Ghana it could be life itself. We typically observe diminishing marginal utility of wealth—the more money you have, the less another dollar is worth to you.

If we sketch a graph of your utility versus wealth it would look something like this:

[Figure: utility versus wealth, showing diminishing marginal utility]

Notice how it increases as your wealth increases, but at a rapidly diminishing rate.

If you have diminishing marginal utility of wealth, you are what we call risk-averse. If you are risk-averse, you’ll (sometimes) want to buy insurance. Let’s suppose the units on that graph are tens of thousands of dollars. Suppose you currently have an income of $50,000. You are offered the chance to pay $10,000 a year to buy unemployment insurance, so that if you lose your job, instead of making $10,000 on welfare you’ll make $30,000 on unemployment. You think you have about a 20% chance of losing your job.

If you had constant marginal utility of wealth, this would not be a good deal for you. Your expected value of money would be reduced if you buy the insurance: Before you had an 80% chance of $50,000 and a 20% chance of $10,000 so your expected amount of money is $42,000. With the insurance you have an 80% chance of $40,000 and a 20% chance of $30,000 so your expected amount of money is $38,000. Why would you take such a deal? That’s like giving up $4,000 isn’t it?

Well, let’s look back at that utility graph. At $50,000 your utility is 1.80, uh… units, er… let’s say QALY. 1.80 QALY per year, meaning you live 80% better than the average human. Maybe, I guess? Doesn’t seem too far off. In any case, the units of measurement aren’t that important.

[Figure: insurance options marked on the utility curve]

By buying insurance your effective income goes down to $40,000 per year, which lowers your utility to 1.70 QALY. That’s a fairly significant hit, but it’s not unbearable. If you lose your job (20% chance), you’ll fall down to $30,000 and have a utility of 1.55 QALY. Again, noticeable, but bearable. Your overall expected utility with insurance is therefore 1.67 QALY.

But what if you don’t buy insurance? Well then you have a 20% chance of taking a big hit and falling all the way down to $10,000 where your utility is only 1.00 QALY. Your expected utility is therefore only 1.64 QALY. You’re better off going with the insurance.
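The arithmetic of the last few paragraphs can be laid out in a few lines. This is just a sketch that re-does the calculation with the income and utility figures quoted above; the variable and function names are my own.

```python
# Utility (QALY per year) at each income level, as read off the graph above.
UTILITY = {50_000: 1.80, 40_000: 1.70, 30_000: 1.55, 10_000: 1.00}

P_JOB_LOSS = 0.20

# Each option is a list of (probability, income) outcomes.
uninsured = [(1 - P_JOB_LOSS, 50_000), (P_JOB_LOSS, 10_000)]
insured = [(1 - P_JOB_LOSS, 40_000), (P_JOB_LOSS, 30_000)]


def expected_income(outcomes):
    """Probability-weighted average income."""
    return sum(p * income for p, income in outcomes)


def expected_utility(outcomes):
    """Probability-weighted average utility, using the table above."""
    return sum(p * UTILITY[income] for p, income in outcomes)


for name, option in (("uninsured", uninsured), ("insured", insured)):
    print(f"{name}: expected income ${expected_income(option):,.0f}, "
          f"expected utility {expected_utility(option):.2f} QALY")
```

Insurance lowers your expected income by $4,000 but raises your expected utility by 0.03 QALY, which is the whole trade in miniature.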

And this is how insurance companies make a profit (well; the legitimate way anyway; they also like to gouge people and deny cancer patients of course); on average, they make more from each customer than they pay out, but customers are still better off because they are protected against big losses. In this case, the insurance company profits $4,000 per customer per year, customers each get 30 milliQALY per year (about the same utility as an extra $2,000 more or less), everyone is happy.

But if this is your marginal utility of wealth—and it most likely is, approximately—then you would never want to buy a lottery ticket. Let’s suppose you actually have pretty good odds; it’s a 1 in 1 million chance of $1 million for a ticket that costs $2. This means that the state is going to take in about $2 million for every $1 million they pay out to a winner.

That’s about as good as your odds for a lottery are ever going to get; usually it’s more like a 1 in 400 million chance of $150 million for $1, which is an even bigger difference than it sounds, because $150 million is nowhere near 150 times as good as $1 million. It’s a bit better from the state’s perspective though, because they get to receive $400 million for every $150 million they pay out.

For your convenience I have zoomed out the graph so that you can see 100, which is an income of $1 million (which you’ll have this year if you win; to get it next year, you’ll have to play again). You’ll notice I did not have to zoom out the vertical axis, because 20 times as much money only ends up being about 2 times as much utility. I’ve marked with lines the utility of $50,000 (1.80, as we said before) versus $1 million (3.30).

[Figure: utility curve zoomed out to an income of $1 million]

What about the utility of $49,998 which is what you’ll have if you buy the ticket and lose? At this number of decimal places you can’t see the difference, so I’ll need to go out a few more. At $50,000 you have 1.80472 QALY. At $49,998 you have 1.80470 QALY. That $2 only costs you 0.00002 QALY, 20 microQALY. Not much, really; but of course not, it’s only $2.

How much does the 1 in 1 million chance of $1 million give you? Even less than that. Remember, the utility gain for going from $50,000 to $1 million is only 1.50 QALY. So you’re adding one one-millionth of that in expected utility, which is of course 1.5 microQALY, or 0.0000015 QALY.

That $2 may not seem like it’s worth much, but that 1 in 1 million chance of $1 million is worth less than one tenth as much. Again, I’ve tried to make these figures fairly realistic; they are by no means exact (I don’t actually think $49,998 corresponds to exactly 1.804699 QALY), but the order of magnitude difference is right. You gain about ten times as much utility from spending that $2 on something you want as you do on taking the chance at $1 million.
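Here is the same comparison as a short calculation. One loud assumption: the utility function below, 1 + 0.5 ln(income / $10,000), is not stated anywhere above; it is simply a curve that reproduces the quoted figures (1.00 at $10,000, 1.80472 at $50,000, about 3.30 at $1 million), so treat it as an illustrative stand-in for the graph.

```python
import math


def utility(income):
    """Assumed utility curve (QALY per year): 1 + 0.5 * ln(income / 10,000).

    This form is an assumption chosen because it matches the numbers quoted
    in the text; the text itself only shows the curve as a graph.
    """
    return 1 + 0.5 * math.log(income / 10_000)


INCOME = 50_000
TICKET_PRICE = 2
JACKPOT_INCOME = 1_000_000
P_WIN = 1 / 1_000_000

# Utility given up by paying for the ticket.
cost = utility(INCOME) - utility(INCOME - TICKET_PRICE)

# Expected utility gained from the chance of winning.
gain = P_WIN * (utility(JACKPOT_INCOME) - utility(INCOME))

print(f"cost of ticket: {cost * 1e6:.1f} microQALY")
print(f"expected gain:  {gain * 1e6:.1f} microQALY")
print(f"cost / gain:    {cost / gain:.1f}")
```

The ticket costs about 20 microQALY and returns about 1.5 microQALY in expectation, so the ratio comes out a little above ten.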

I said before that it is theoretically possible for you to have a utility function for which the lottery would be rational. For that you’d need to have increasing marginal utility of wealth, so that you could be what we call risk-seeking. Your utility function would have to look like this:

[Figure: a utility function with increasing marginal utility of wealth]

There’s no way marginal utility of wealth looks like that. This would be saying that it would hurt Bill Gates more to lose $1 than it would hurt a starving child in Ghana, which makes no sense at all. (It certainly would make you wonder why he’s so willing to give it to them.) So frankly even if we didn’t buy insurance the fact that we buy lottery tickets would already look pretty irrational.

But in order for it to be rational to buy both lottery tickets and insurance, our utility function would have to be totally nonsensical. Maybe it could look like this or something; marginal utility decreases normally for a while, and then suddenly starts going upward again for no apparent reason:

[Figure: a utility function whose marginal utility first decreases and then increases]

Clearly it does not actually look like that. Not only would this mean that Bill Gates is hurt more by losing $1 than the child in Ghana, we have this bizarre situation where the middle class are the people who have the lowest marginal utility of wealth in the world. Both the rich and the poor would need to have higher marginal utility of wealth than we do. This would mean that apparently yachts are just amazing and we have no idea. Riding a yacht is the pinnacle of human experience, a transcendence beyond our wildest imaginings; and riding a slightly bigger yacht is even more amazing and transcendent. Love and the joy of a life well-lived pale in comparison to the ecstasy of adding just one more layer of gold plate to your Ferrari collection.

Where increasing marginal utility is ridiculous, this is outright special pleading. You’re just making up bizarre utility functions that perfectly line up with whatever behavior people happen to have so that you can still call it rational. It’s like saying, “It could be perfectly rational! Maybe he enjoys banging his head against the wall!”

Kahneman and Tversky had a better idea. They realized that human beings aren’t so great at assessing probability, and furthermore tend not to think in terms of total amounts of wealth or annual income at all, but in terms of losses and gains. Through a series of clever experiments they showed that we are not so much risk-averse as we are loss-averse; we are actually willing to take more risk if it means that we will be able to avoid a loss.

In effect, we seem to be acting as if our utility function looks like this, where the zero no longer means “zero income”, it means “whatever we have right now”:

[Figure: prospect theory value function, with gains and losses measured from the current position]

We tend to weight losses about twice as much as gains, and we tend to assume that losses also diminish in their marginal effect the same way that gains do. That is, we would only take a 50% chance to lose $1000 if it meant a 50% chance to gain $2000; but we’d take a 10% chance at losing $10,000 to save ourselves from a guaranteed loss of $1000.

This can explain why we buy insurance, provided that you frame it correctly. One of the things about prospect theory—and about human behavior in general—is that it exhibits framing effects: The answer we give depends upon the way you ask the question. That’s so totally obviously irrational it’s honestly hard to believe that we do it; but we do, and sometimes in really important situations. Doctors—doctors—will decide a moral dilemma differently based on whether you describe it as “saving 400 out of 600 patients” or “letting 200 out of 600 patients die”.

In this case, you need to frame insurance as the default option, and not buying insurance as an extra risk you are taking. Then saving money by not buying insurance is a gain, and therefore less important, while a higher risk of a bad outcome is a loss, and therefore important.

If you frame it the other way, with not buying insurance as the default option, then buying insurance is taking a loss by making insurance payments, only to get a gain if the insurance pays out. Suddenly the exact same insurance policy looks less attractive. This is a big part of why Obamacare has been effective but unpopular. It was set up as a fine—a loss—if you don’t buy insurance, rather than as a bonus—a gain—if you do buy insurance. The latter would be more expensive, but we could just make it up by taxing something else; and it might have made Obamacare more popular, because people would see the government as giving them something instead of taking something away. But the fine does a better job of framing insurance as the default option, so it motivates more people to actually buy insurance.

But even that would still not be enough to explain how it is rational to buy lottery tickets (Have I mentioned how it’s really not a good idea to buy lottery tickets?), because buying a ticket is a loss and winning the lottery is a gain. You actually have to get people to somehow frame not winning the lottery as a loss, making winning the default option despite the fact that it is absurdly unlikely. But I have definitely heard people say things like this: “Well if my numbers come up and I didn’t play that week, how would I feel then?” Pretty bad, I’ll grant you. But how much you wanna bet that never happens? (They’ll bet… the price of the ticket, apparently.)

In order for that to work, people either need to dramatically overestimate the probability of winning, or else ignore it entirely. Both of those things totally happen.

First, we overestimate the probability of rare events and underestimate the probability of common events—this is actually the part that makes it cumulative prospect theory instead of just regular prospect theory. If you make a graph of perceived probability versus actual probability, it looks like this:

[Figure: cumulative prospect theory, perceived probability versus actual probability]

We don’t make much distinction between 40% and 60%, even though that’s actually pretty big; but we make a huge distinction between 0% and 0.00001% even though that’s actually really tiny. I think we basically have categories in our heads: “Never, almost never, rarely, sometimes, often, usually, almost always, always.” Moving from 0% to 0.00001% is going from “never” to “almost never”, but going from 40% to 60% is still in “often”. (And that for some reason reminded me of “Well, hardly ever!”)
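Here is a rough sketch of a curve with that shape. The functional form is the one-parameter weighting function commonly used with cumulative prospect theory; the exponent of 0.61 is an assumed value for illustration, and the name `perceived` is mine.

```python
GAMMA = 0.61  # assumed exponent; smaller values bend the curve more


def perceived(p, gamma=GAMMA):
    """Inverse-S weighting: w(p) = p^g / (p^g + (1 - p)^g)^(1/g)."""
    numerator = p ** gamma
    return numerator / (numerator + (1 - p) ** gamma) ** (1 / gamma)


tiny = 0.00001 / 100  # 0.00001% written as a probability

print(f"actual {tiny:.7f} is perceived as {perceived(tiny):.7f}")
print(f"actual 0.40 is perceived as {perceived(0.4):.2f}")
print(f"actual 0.60 is perceived as {perceived(0.6):.2f}")
```

With this assumed exponent, the tiny probability is inflated by a factor of several hundred, while the gap between 40% and 60% shrinks to roughly half its true size.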

But that’s not even the worst of it. After all that work to explain how we can make sense of people’s behavior in terms of something like a utility function (albeit a distorted one), I think there’s often a simpler explanation still: Regret aversion under total neglect of probability.

Neglect of probability is self-explanatory: You totally ignore the probability. But what’s regret aversion, exactly? Unfortunately I’ve had trouble finding any good popular sources on the topic; it’s all scholarly stuff. (Maybe I’m more cutting-edge than I thought!)

The basic idea is that you minimize regret, where regret can be formalized as the difference in utility between the outcome you got and the best outcome you could have gotten. In effect, it doesn’t matter whether something is likely or unlikely; you only care how bad it is.

This explains insurance and lottery tickets in one fell swoop: With insurance, you have the choice of risking a big loss (big regret) which you can avoid by paying a small amount (small regret). You take the small regret, and buy insurance. With lottery tickets, you have the chance of getting a large gain (big regret if you don’t) which you can get by paying a small amount (small regret). You take the small regret, and buy the ticket.
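One way to formalize this is the minimax regret rule: for each choice, find the worst regret it could leave you with, then pick the choice whose worst case is smallest. The sketch below does that with no probabilities anywhere. All the dollar amounts (a $1,000 premium, a $100,000 disaster, a $2 ticket, a $10 million jackpot) are made-up numbers for illustration, and I am measuring regret in dollars rather than utility to keep it simple.

```python
def max_regret(payoffs):
    """Map each action to its worst-case regret across states.

    payoffs[action][state] is the outcome in dollars. Regret in a state is
    the best payoff any action achieves there minus what this action gets.
    """
    states = next(iter(payoffs.values())).keys()
    best = {s: max(row[s] for row in payoffs.values()) for s in states}
    return {a: max(best[s] - row[s] for s in states) for a, row in payoffs.items()}


def choose(payoffs):
    """Pick the action with the smallest worst-case regret."""
    regrets = max_regret(payoffs)
    return min(regrets, key=regrets.get)


# Hypothetical numbers; note that no probabilities appear anywhere.
insurance = {
    "buy":  {"disaster": -1_000, "no disaster": -1_000},
    "skip": {"disaster": -100_000, "no disaster": 0},
}
lottery = {
    "play": {"numbers come up": 10_000_000 - 2, "numbers miss": -2},
    "skip": {"numbers come up": 0, "numbers miss": 0},
}

print(max_regret(insurance), "->", choose(insurance))
print(max_regret(lottery), "->", choose(lottery))
```

The same rule buys the insurance and the lottery ticket, because in both cases skipping leaves open one enormous regret while paying leaves only a small one.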

This can also explain why a typical American’s fears go in the order terrorists > Ebola > sharks >> cars > cheeseburgers, while the actual risk of dying goes in almost the opposite order, cheeseburgers > cars >> terrorists > sharks > Ebola. (Terrorists are scarier than sharks and Ebola and actually do kill more Americans! Yay, we got something right! Other than that it is literally reversed.)

Dying from a terrorist attack would be horrible; in addition to your own death you have all the other likely deaths and injuries, and the sheer horror and evil of the terrorist attack itself. Dying from Ebola would be almost as bad, with gruesome and agonizing symptoms. Dying of a shark attack would still be pretty awful, as you get dismembered alive. But dying in a car accident isn’t so bad; it’s usually over pretty quick and the event seems tragic but ordinary. And dying of heart disease and diabetes from your cheeseburger overdose will happen slowly over many years; you’ll barely even notice it coming, and you’ll probably die rapidly from a heart attack or comfortably in your sleep. (Wasn’t that a pleasant paragraph? But there’s really no other way to make the point.)

If we try to estimate the probability at all—and I don’t think most people even bother—it isn’t by rigorous scientific research; it’s usually by availability heuristic: How many examples can you think of in which that event happened? If you can think of a lot, you assume that it happens a lot.

And that might even be reasonable, if we still lived in hunter-gatherer tribes or small farming villages and the 150 or so people you knew were the only people you ever heard about. But now that we have live TV and the Internet, news can get to us from all around the world, and the news isn’t trying to give us an accurate assessment of risk, it’s trying to get our attention by talking about the biggest, scariest, most exciting things that are happening around the world. The amount of news attention an item receives is in fact in inverse proportion to the probability of its occurrence, because things are more exciting if they are rare and unusual. Which means that if we are estimating how likely something is based on how many times we heard about it on the news, our estimates are going to be almost exactly reversed from reality. Ironically it is the very fact that we have more information that makes our estimates less accurate, because of the way that information is presented.

It would be a pretty boring news channel that spent all day saying things like this: “82 people died in car accidents today, and 1,657 people had fatal heart attacks, 11.8 million had migraines, and 127 million played the lottery and lost; in world news, 214 countries did not go to war, and 6,147 children starved to death in Africa…” This would, however, be vastly more informative.

In the meantime, here are a couple of counter-heuristics I recommend to you: Don’t think about losses and gains, think about where you are and where you might be. Don’t say, “I’ll gain $1,000”; say “I’ll raise my income this year to $41,000.” Definitely do not think in terms of the percentage price of things; think in terms of absolute amounts of money. “Cheap expensive things, expensive cheap things” is a motto of mine; go ahead and buy the $5 toothbrush instead of the $1, because that’s only $4. But be very hesitant to buy the $22,000 car instead of the $21,000, because that’s $1,000.

If you need to estimate the probability of something, actually look it up; don’t try to guess based on what it feels like the probability should be. Make this unprecedented access to information work for you instead of against you. If you want to know how many people die in car accidents each year, you can literally ask Google and it will tell you that (I tried it; it’s 1.3 million worldwide). The fatality rate of a given disease versus the risk of its vaccine, the safety rating of a particular brand of car, the number of airplane crash deaths last month, the total number of terrorist attacks, the probability of becoming a university professor, the average functional lifespan of a new television: all these things and more await you at the click of a button. Even if you think you’re pretty sure, why not look it up anyway?

Perhaps then we can make prospect theory wrong by making ourselves more rational.