Why are humans so bad with probability?

Apr 29 JDN 2458238

In previous posts on deviations from expected utility and cumulative prospect theory, I’ve detailed some of the myriad ways in which human beings deviate from optimal rational behavior when it comes to probability.

This post is going to be a bit different: Yes, we behave irrationally when it comes to probability. Why?

Why aren’t we optimal expected utility maximizers?
This question is not as simple as it sounds. Some of the ways that human beings deviate from neoclassical behavior are simply because neoclassical theory requires levels of knowledge and intelligence far beyond what human beings are capable of; basically anything requiring “perfect information” qualifies, as does any game theory prediction that involves solving extensive-form games with infinite strategy spaces by backward induction. (Don’t feel bad if you have no idea what that means; that’s kind of my point. Solving infinite extensive-form games by backward induction is an unsolved problem in game theory; just this past week I saw a new paper presented that offered a partial potential solution. And yet we expect people to do it optimally every time?)

I’m also not going to include questions of fundamental uncertainty, like “Will Apple stock rise or fall tomorrow?” or “Will the US go to war with North Korea in the next ten years?” where it isn’t even clear how we would assign a probability. (Though I will get back to them, for reasons that will become clear.)

No, let’s just look at the absolute simplest cases, where the probabilities are all well-defined and completely transparent: Lotteries and casino games. Why are we so bad at that?

Lotteries are not a computationally complex problem. You figure out how much the prize is worth to you, multiply it by the probability of winning—which is clearly spelled out for you—and compare that to how much the ticket price is worth to you. The most challenging part lies in specifying your marginal utility of wealth—the “how much it’s worth to you” part—but that’s something you basically had to do anyway, to make any kind of trade-offs on how to spend your time and money. Maybe you didn’t need to compute it quite so precisely over that particular range of parameters, but you need at least some idea how much $1 versus $10,000 is worth to you in order to get by in a market economy.
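
The arithmetic really is that small. Here is a minimal sketch in Python of the calculation described above, assuming (purely for illustration) logarithmic utility of wealth; the wealth, price, prize, and odds are made-up numbers, not data:

```python
import math

def expected_utility_of_ticket(wealth, price, prize, p_win):
    """Compare the expected utility of buying a lottery ticket against
    skipping it, under an assumed logarithmic utility of wealth."""
    u = math.log  # diminishing marginal utility; an illustrative assumption
    eu_buy = p_win * u(wealth - price + prize) + (1 - p_win) * u(wealth - price)
    eu_skip = u(wealth)
    return eu_buy, eu_skip

# A $2 ticket with a 1-in-10-million shot at $10,000,000, starting from $20,000:
eu_buy, eu_skip = expected_utility_of_ticket(20_000, 2, 10_000_000, 1e-7)
print(eu_buy < eu_skip)  # True: the ticket is a bad deal under log utility
```

Under these (hypothetical) numbers, the tiny chance at the prize doesn't come close to compensating for the ticket price, which is exactly the expected-utility verdict on most real lotteries.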

Casino games are a bit more complicated, but not much, and most of the work has been done for you; you can look on the Internet and find tables of probability calculations for poker, blackjack, roulette, craps and more. Memorizing all those probabilities might take some doing, but human memory is astonishingly capacious, and part of being an expert card player, especially in blackjack, seems to involve memorizing a lot of those probabilities.

Furthermore, by any plausible expected utility calculation, lotteries and casino games are a bad deal. Unless you’re an expert poker player or blackjack card-counter, your expected income from playing at a casino is always negative—and the casino set it up that way on purpose.
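
For a concrete instance of that negative expectation, the house edge in American roulette is a one-line calculation: 38 pockets, and a single-number bet pays 35 to 1.

```python
# Expected value of a $1 single-number bet in American roulette:
# win 35 with probability 1/38, lose 1 otherwise.
p_win = 1 / 38
ev = p_win * 35 - (1 - p_win) * 1
print(round(ev, 4))  # -0.0526: you lose about 5.3 cents per dollar on average
```

Every bet on the table has an expected value in this neighborhood or worse; the casino's profit is that gap, multiplied by volume.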

Why, then, can lotteries and casinos stay in business? Why are we so bad at such a simple problem?

Clearly we are using some sort of heuristic judgment in order to save computing power, and the people who make lotteries and casinos have designed formal models that can exploit those heuristics to pump money from us. (Shame on them, really; I don’t fully understand why this sort of thing is legal.)

In another previous post I proposed what I call “categorical prospect theory”, which I think is a decently accurate description of the heuristics people use when assessing probability (though I’ve not yet had the chance to test it experimentally).

But why use this particular heuristic? Indeed, why use a heuristic at all for such a simple problem?

I think it’s helpful to keep in mind that these simple problems are weird; they are absolutely not the sort of thing a tribe of hunter-gatherers is likely to encounter on the savannah. It doesn’t make sense for our brains to be optimized to solve poker or roulette.

The sort of problems that our ancestors encountered—indeed, the sort of problems that we encounter, most of the time—were not problems of calculable probability risk; they were problems of fundamental uncertainty. And they were frequently matters of life or death (which is why we’d expect them to be highly evolutionarily optimized): “Was that sound a lion, or just the wind?” “Is this mushroom safe to eat?” “Is that meat spoiled?”

In fact, many of the uncertainties most important to our ancestors are still important today: “Will these new strangers be friendly, or dangerous?” “Is that person attracted to me, or am I just projecting my own feelings?” “Can I trust you to keep your promise?” These sorts of social uncertainties are even deeper; it’s not clear that any finite being could ever totally resolve its uncertainty surrounding the behavior of other beings with the same level of intelligence, as the cognitive arms race continues indefinitely. The better I understand you, the better you understand me—and if you’re trying to deceive me, as I get better at detecting deception, you’ll get better at deceiving.

Personally, I think that it was precisely this sort of feedback loop that resulted in human beings getting such ridiculously huge brains in the first place. Chimpanzees are pretty good at dealing with the natural environment, maybe even better than we are; but even young children can outsmart them in social tasks any day. And once you start evolving for social cognition, it’s very hard to stop; basically you need to be constrained by something very fundamental, like, say, maximum caloric intake or the shape of the birth canal. Where chimpanzees look like their brains were what we call an “interior solution”, where evolution optimized toward a particular balance between cost and benefit, human brains look more like a “corner solution”, where the evolutionary pressure was entirely in one direction until we hit up against a hard constraint. That’s exactly what one would expect to happen if we were caught in a cognitive arms race.

What sort of heuristic makes sense for dealing with fundamental uncertainty—as opposed to precisely calculable probability? Well, you don’t want to compute a utility function and multiply by it, because that adds all sorts of extra computation and you have no idea what probability to assign. But you’ve got to do something like that in some sense, because that really is the optimal way to respond.

So here’s a heuristic you might try: Separate events into some broad categories based on how frequently they seem to occur, and what sort of response would be necessary.

Some things, like the sun rising each morning, seem to always happen. So you should act as if those things are going to happen pretty much always, because they do happen… pretty much always.

Other things, like rain, seem to happen frequently but not always. So you should look for signs that those things might happen, and prepare for them when the signs point in that direction.

Still other things, like being attacked by lions, happen very rarely, but are a really big deal when they do. You can’t go around expecting those to happen all the time, that would be crazy; but you need to be vigilant, and if you see any sign that they might be happening, even if you’re pretty sure they’re not, you may need to respond as if they were actually happening, just in case. The cost of a false positive is much lower than the cost of a false negative.

And still other things, like people sprouting wings and flying, never seem to happen. So you should act as if those things are never going to happen, and you don’t have to worry about them.

This heuristic is quite simple to apply once set up: It can simply slot in memories of when things did and didn’t happen in order to decide which category they go in—that is, the availability heuristic. If you can remember a lot of examples of “almost never”, maybe you should move it to “unlikely” instead. If you get a really big number of examples, you might even want to move it all the way to “likely”.

Another large advantage of this heuristic is that by combining utility and probability into one metric—we might call it “importance”, though Bayesian econometricians might complain about that—we can save on memory space and computing power. I don’t need to separately compute a utility and a probability; I just need to figure out how much effort I should put into dealing with this situation. A high probability of a small cost and a low probability of a large cost may be equally worth my time.
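
Here is a toy sketch of what such a heuristic might look like in code. The category names, thresholds, and “importance” weights are all hypothetical choices of mine for illustration, not part of the theory itself:

```python
# Illustrative sketch of the frequency-category heuristic described above.
# Thresholds and weights are arbitrary stand-ins, not empirical claims.

def categorize(remembered_occurrences, remembered_opportunities):
    """Slot an event into a coarse frequency category from memory counts
    (a crude stand-in for the availability heuristic)."""
    rate = remembered_occurrences / max(remembered_opportunities, 1)
    if rate > 0.95:
        return "always"
    elif rate > 0.3:
        return "likely"
    elif rate > 0.0:
        return "unlikely"
    else:
        return "never"

def importance(category, stakes):
    """One number standing in for probability times utility: how much
    effort to spend preparing for this event."""
    weight = {"always": 1.0, "likely": 0.5, "unlikely": 0.05, "never": 0.0}
    return weight[category] * stakes

# Rain: common, modest stakes. Lion attack: very rare, enormous stakes.
rain = importance(categorize(40, 100), stakes=100)
lion = importance(categorize(1, 1000), stakes=1000)
print(rain, lion)  # 50.0 50.0: equal "importance" despite wildly different frequencies
```

Note how the single importance score captures the point in the paragraph above: a likely small cost and an unlikely large cost can come out equally worth your attention, without ever computing a probability and a utility separately.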

How might these heuristics go wrong? Well, if your environment changes sufficiently, the probabilities could shift and what seemed certain no longer is. For most of human history, “people walking on the Moon” would seem about as plausible as sprouting wings and flying away, and yet it has happened. Being attacked by lions is now exceedingly rare except in very specific places, but we still harbor a certain awe and fear before lions. And of course availability heuristic can be greatly distorted by mass media, which makes people feel like terrorist attacks and nuclear meltdowns are common and deaths by car accidents and influenza are rare—when exactly the opposite is true.

How many categories should you set, and what frequencies should they be associated with? This part I’m still struggling with, and it’s an important piece of the puzzle I will need before I can take this theory to experiment. There is probably a trade-off: more categories give you more precision in tailoring your optimal behavior, but cost more cognitive resources to maintain. Is the optimal number 3? 4? 7? 10? I really don’t know. Even if I could specify the number of categories, I’d still need to figure out precisely what categories to assign.

Are some ideas too ridiculous to bother with?

Apr 22 JDN 2458231

Flat Earth. Young-Earth Creationism. Reptilians. 9/11 “Truth”. Rothschild conspiracies.

There are an astonishing number of ideas that satisfy two apparently-contrary conditions:

  1. They are so obviously ridiculous that even a few minutes of honest, rational consideration of evidence that is almost universally available will immediately refute them;
  2. They are believed by tens or hundreds of millions of otherwise-intelligent people.

Young-Earth Creationism is probably the most alarming, seeing as it grips the minds of some 38% of Americans.

What should we do when faced with such ideas? This is something I’ve struggled with before.

I’ve spent a lot of time and effort trying to actively address and refute them—but I don’t think I’ve even once actually persuaded someone who believes these ideas to change their mind. This doesn’t mean my time and effort were entirely wasted; it’s possible that I managed to convince bystanders, or gained some useful understanding, or simply improved my argumentation skills. But it does seem likely that my time and effort were mostly wasted.

It’s tempting, therefore, to give up entirely, and just let people go on believing whatever nonsense they want to believe. But there’s a rather serious downside to that as well: Thirty-eight percent of Americans.

These people vote. They participate in community decisions. They make choices that affect the rest of our lives. Nearly all of those Creationists are Evangelical Christians—and White Evangelical Christians voted overwhelmingly in favor of Donald Trump. I can’t be sure that changing their minds about the age of the Earth would also change their minds about voting for Trump, but I can say this: If all the Creationists in the US had simply not voted, Hillary Clinton would have won the election.

And let’s not leave the left wing off the hook either. Jill Stein is a 9/11 “Truther”, and pulled a lot of fellow “Truthers” to her cause in the election as well. Had all of Jill Stein’s votes gone to Hillary Clinton instead, again Hillary would have won, even if all the votes for Trump had remained the same. (That said, there is reason to think that if Stein had dropped out, most of those folks wouldn’t have voted at all.)

Therefore, I don’t think it is safe to simply ignore these ridiculous beliefs. We need to do something; the question is what.

We could try to censor them, but first of all that violates basic human rights—which should be a sufficient reason not to do it—and second, it probably wouldn’t even work. Censorship typically leads to radicalization, not assimilation.

We could try to argue against them. Ideally this would be the best option, but it has not shown much effect so far. The kind of person who sincerely believes that the Earth is 6,000 years old (let alone that governments are secretly ruled by reptilian alien invaders) isn’t the kind of person who is highly responsive to evidence and rational argument.

In fact, there is reason to think that these people don’t actually believe what they say the same way that you and I believe things. I’m not saying they’re lying, exactly. They think they believe it; they want to believe it. They believe in believing it. But they don’t actually believe it—not the way that I believe that cyanide is poisonous or the way I believe the sun will rise tomorrow. It isn’t fully integrated into the way that they anticipate outcomes and choose behaviors. It’s more of a free-floating sort of belief, where professing a particular belief allows them to feel good about themselves, or represent their status in a community.

To be clear, it isn’t that these beliefs are unimportant to them; on the contrary, they are in some sense more important. Creationism isn’t really about the age of the Earth; it’s about who you are and where you belong. A conventional belief can be changed by evidence about the world because it is about the world; a belief-in-belief can’t be changed by evidence because it was never really about that.

But if someone’s ridiculous belief is really about their identity, how do we deal with that? I can’t refute an identity. If your identity is tied to a particular social group, maybe they could ostracize you and cause you to lose the identity; but an outsider has no power to do that. (Even then, I strongly suspect that, for instance, most excommunicated Catholics still see themselves as Catholic.) And if it’s a personal identity not tied to a particular group, even that option is unavailable.

Where, then, does that leave us? It would seem that we can’t change their minds—but we also can’t afford not to change their minds. We are caught in a terrible dilemma.

I think there might be a way out. It’s a bit counter-intuitive, but I think what we need to do is stop taking them seriously as beliefs, and start treating them purely as announcements of identity.

So when someone says something like, “The Rothschilds run everything!”, instead of responding as though this were a coherent proposition being asserted, treat it as if someone had announced, “Boo! I hate the Red Sox!” Belief in the Rothschild conspiracies isn’t a well-defined set of propositions about the world; it’s an assertion of membership in a particular sort of political sect that is vaguely left-wing and anarchist. You don’t really think the Rothschilds rule everything. You just want to express your (quite justifiable) anger at how our current political system privileges the rich.

Likewise, when someone says they think the Earth is 6,000 years old, you could try to present the overwhelming scientific evidence that they are wrong—but it might be more productive, and it is certainly easier, to just think of this as a funny way of saying “I’m an Evangelical Christian”.

Will this eliminate the ridiculous beliefs? Not immediately. But it might ultimately do so, in the following way: By openly acknowledging the belief-in-belief as a signaling mechanism, we can open opportunities for people to develop new, less pathological methods of signaling. (Instead of saying you think the Earth is 6,000 years old, maybe you could wear a funny hat, like Orthodox Jews do. Funny hats don’t hurt anybody. Everyone loves funny hats.) People will always want to signal their identity, and there are fundamental reasons why such signals will typically be costly for those who use them; but we can try to make them not so costly for everyone else.

This also makes arguments a lot less frustrating, at least at your end. It might make them more frustrating at the other end, because people want their belief-in-belief to be treated like proper belief, and you’ll be refusing them that opportunity. But this is not such a bad thing; if we make it more frustrating to express ridiculous beliefs in public, we might manage to reduce the frequency of such expression.

Reasonableness and public goods games

Apr 1 JDN 2458210

There’s a very common economics experiment called a public goods game, often used to study cooperation and altruistic behavior. I’m actually planning on running a variant of such an experiment for my second-year paper.

The game is quite simple, which is part of why it is used so frequently: You are placed into a group of people (usually about four), and given a little bit of money (say $10). Then you are offered a choice: You can keep the money, or you can donate some of it to a group fund. Money in the group fund will be multiplied by some factor (usually about two) and then redistributed evenly to everyone in the group. So for example if you donate $5, that will become $10, split four ways, so you’ll get back $2.50.

Donating more to the group will benefit everyone else, but at a cost to yourself. The game is usually set up so that the best outcome for everyone is if everyone donates the maximum amount, but the best outcome for you, holding everyone else’s choices constant, is to donate nothing and keep it all.
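
The payoff rule is simple enough to write down directly. Here is a minimal sketch in Python reproducing the example above, where you donate $5 in a group of four and nobody else gives anything:

```python
def payoffs(donations, endowment=10.0, multiplier=2.0):
    """Payoffs in a one-shot public goods game: keep whatever you didn't
    donate, plus an even share of the multiplied group fund."""
    share = multiplier * sum(donations) / len(donations)
    return [endowment - d + share for d in donations]

# You donate $5; it doubles to $10, split four ways returns $2.50 to you.
print(payoffs([5, 0, 0, 0]))  # [7.5, 12.5, 12.5, 12.5]
```

You end up with $7.50 while the free-riders get $12.50 each, which is exactly the tension the game is built around: your donation raised the group total but lowered your own payoff.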

Yet it is a very robust finding that most people do neither of those things. There’s still a good deal of uncertainty surrounding what motivates people to donate what they do, but certain patterns have emerged:

  1. Most people donate something, but hardly anyone donates everything.
  2. Increasing the multiplier tends to smoothly increase how much people donate.
  3. The number of people in the group isn’t very important, though very small groups (e.g. 2) behave differently from very large groups (e.g. 50).
  4. Letting people talk to each other tends to increase the rate of donations.
  5. Repetition of the game, or experience from previous games, tends to result in decreasing donation over time.
  6. Economists donate less than other people.

Number 6 is unfortunate, but easy to explain: Indoctrination into game theory and neoclassical economics has taught economists that selfish behavior is efficient and optimal, so they behave selfishly.

Number 3 is also fairly easy to explain: Very small groups allow opportunities for punishment and coordination that don’t exist in large groups. Think about how you would respond when faced with 2 defectors in a group of 4 as opposed to 10 defectors in a group of 50. You could punish the 2 by giving less next round; but punishing the 10 would also end up punishing the 39 others who had contributed like they were supposed to.

Number 4 is a very interesting finding. Game theory says that communication shouldn’t matter, because there is a unique Nash equilibrium: Donate nothing. All the promises in the world can’t change what is the optimal response in the game. But in fact, human beings don’t like to break their promises, and so when you get a bunch of people together and they all agree to donate, most of them will carry through on that agreement most of the time.

Number 5 is on the frontier of research right now. There are various theoretical accounts for why it might occur, but none of the models proposed so far have much predictive power.

But my focus today will be on findings 1 and 2.

If you’re not familiar with the underlying game theory, finding 2 may seem obvious to you: Well, of course if you increase the payoff for donating, people will donate more! It’s precisely that sense of obviousness which I am going to appeal to in a moment.

In fact, the game theory makes a very sharp prediction: For N players, if the multiplier is less than N, you should always contribute nothing. Only if the multiplier becomes larger than N should you donate—and at that point you should donate everything. The game theory prediction is not a smooth increase; it’s all-or-nothing. The only time game theory predicts intermediate amounts is on the knife-edge where the multiplier exactly equals N, at which point each player is indifferent between donating and not donating.
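
That all-or-nothing best response fits in one line; the function below is just an illustration of the threshold logic (the name is mine), since each $1 you donate returns only multiplier/N to you personally:

```python
def selfish_best_donation(endowment, multiplier, n_players):
    """Game-theoretic (selfish) best response in a public goods game:
    each donated dollar returns multiplier/n to the donor, so donate
    everything iff multiplier > n, otherwise nothing."""
    return float(endowment) if multiplier > n_players else 0.0

print(selfish_best_donation(10, 2, 4))  # 0.0  (the typical experimental setup)
print(selfish_best_donation(10, 5, 4))  # 10.0 (multiplier exceeds group size)
```

Actual subjects, as findings 1 and 2 say, do nothing of the sort: they donate intermediate amounts that rise smoothly with the multiplier.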

But it feels reasonable that increasing the multiplier should increase donation, doesn’t it? It’s a “safer bet” in some sense to donate $1 if the payoff to everyone is $3 and the payoff to yourself is $0.75 than if the payoff to everyone is $1.04 and the payoff to yourself is $0.26. The cost-benefit analysis comes out better: In the former case, you can gain up to $2 if everyone donates, but would only lose $0.25 if you donate alone; but in the latter case, you would only gain $0.04 if everyone donates, and would lose $0.74 if you donate alone.
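
Those cost-benefit figures are easy to verify; the helper below (the function name is mine) recomputes all four numbers from the paragraph above for a four-player group:

```python
def net_gain(my_donation, others_donations, multiplier=3.0, endowment=1.0):
    """Net change versus keeping everything, in a public goods game."""
    group = [my_donation] + others_donations
    share = multiplier * sum(group) / len(group)
    return (endowment - my_donation + share) - endowment

print(round(net_gain(1, [1, 1, 1], 3.0), 2))   # 2.0   if everyone donates
print(round(net_gain(1, [0, 0, 0], 3.0), 2))   # -0.25 if you donate alone
print(round(net_gain(1, [1, 1, 1], 1.04), 2))  # 0.04
print(round(net_gain(1, [0, 0, 0], 1.04), 2))  # -0.74
```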

I think this notion of “reasonableness” is a deep principle that underlies a great deal of human thought. This is something that is sorely lacking from artificial intelligence: The same AI that tells you the precise width of the English Channel to the nearest foot may also tell you that the Earth is 14 feet in diameter, because the former was in its database and the latter wasn’t. Yes, Watson may have won on Jeopardy!, but it (he?) also made a nonsensical response to the Final Jeopardy question.

Human beings like to “sanity-check” our results against prior knowledge, making sure that everything fits together. And, of particular note for public goods games, human beings like to “hedge our bets”; we don’t like to over-commit to a single belief in the face of uncertainty.

I think this is what best explains findings 1 and 2. We don’t donate everything, because that requires committing totally to the belief that contributing is always better. We also don’t donate nothing, because that requires committing totally to the belief that contributing is always worse.

And of course we donate more as the payoffs to donating more increase; that also just seems reasonable. If something is better, you do more of it!

These choices could be modeled formally by assigning some sort of probability distribution over other’s choices, but in a rather unconventional way. We can’t simply assume that other people will randomly choose some decision and then optimize accordingly—that just gives you back the game theory prediction. We have to assume that our behavior and the behavior of others is in some sense correlated; if we decide to donate, we reason that others are more likely to donate as well.

Stated like that, this sounds irrational; some economists have taken to calling it “magical thinking”. Yet, as I always like to point out to such economists: On average, people who do that make more money in the games. Economists playing other economists always make very little money in these games, because they turn on each other immediately. So who is “irrational” now?

Indeed, if you ask people to predict how others will behave in these games, they generally do better than the game theory prediction: They say, correctly, that some people will give nothing, most will give something, and hardly any will give everything. The same “reasonableness” that they use to motivate their own decisions, they also accurately apply to forecasting the decisions of others.

Of course, to say that something is “reasonable” may be ultimately to say that it conforms to our heuristics well. To really have a theory, I need to specify exactly what those heuristics are.

“Don’t put all your eggs in one basket” seems to be one, but it’s probably not the only one that matters; my guess is that there are circumstances in which people would actually choose all-or-nothing, like if we said that the multiplier was 0.5 (so everyone giving to the group would make everyone worse off) or 10 (so that giving to the group makes you and everyone else way better off).

“Higher payoffs are better” is probably one as well, but precisely formulating that is actually surprisingly difficult. Higher payoffs for you? For the group? Conditional on what? Do you hold others’ behavior constant, or assume it is somehow affected by your own choices?

And of course, the theory wouldn’t be much good if it only worked on public goods games (though even that would be a substantial advance at this point). We want a theory that explains a broad class of human behavior; we can start with simple economics experiments, but ultimately we want to extend it to real-world choices.

Hyperbolic discounting: Why we procrastinate

Mar 25 JDN 2458203

Lately I’ve been so occupied by Trump and politics and various ideas from environmentalists that I haven’t really written much about the cognitive economics that was originally planned to be the core of this blog. So, I thought that this week I would take a step out of the political fray and go back to those core topics.

Why do we procrastinate? Why do we overeat? Why do we fail to exercise? It’s quite mysterious, from the perspective of neoclassical economic theory. We know these things are bad for us in the long run, and yet we do them anyway.

The reason has to do with the way our brains deal with time. We value the future less than the present—but that’s not actually the problem. The problem is that we do so inconsistently.

A perfectly-rational neoclassical agent would use time-consistent discounting: the discount applied to a given delay depends only on the length of the delay, not on the stakes or on when you are asked. If having $100 in 2019 is as good as having $110 in 2020, then having $1000 in 2019 is as good as having $1100 in 2020; and if I ask you again in 2019, you’ll still agree that having $100 in 2019 is as good as having $110 in 2020. A perfectly-rational individual would have a certain discount rate (in this case, 10% per year), and would apply it consistently at all times on all things.

This is of course not how human beings behave at all.

A much more likely pattern is that you would agree, in 2018, that having $100 in 2019 is as good as having $110 in 2020 (a discount rate of 10%). But then if I wait until 2019, and then offer you the choice between $100 immediately and $120 in a year, you’ll probably take the $100 immediately—even though a year ago, you told me you wouldn’t. Your discount rate rose from 10% to at least 20% in the intervening time.

The leading model in cognitive economics right now to explain this is called hyperbolic discounting. The precise functional form of a hyperbola has been called into question by recent research, but the general pattern is definitely right: We act as though time matters a great deal when discussing time intervals that are close to us, but treat time as unimportant when discussing time intervals that are far away.
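
One common way to formalize this pattern is the quasi-hyperbolic (“beta-delta”) approximation, in which anything in the future gets hit with an extra one-time present-bias penalty. The parameters below are arbitrary, chosen only to reproduce the $100/$110/$120 reversal described above, not estimated from anything:

```python
def value(amount, years_away, beta=0.9, delta=0.91):
    """Quasi-hyperbolic ("beta-delta") present value: future amounts get
    the usual exponential discount delta**t, plus an extra present-bias
    factor beta just for not being immediate."""
    if years_away == 0:
        return amount
    return beta * (delta ** years_away) * amount

# Viewed from 2018, both options are in the future, so you patiently
# prefer $110 in 2020 over $100 in 2019...
print(value(100, 1) < value(110, 2))  # True

# ...but once 2019 arrives, $100 is available *now*, and present bias
# makes you grab it even over $120 a year later.
print(value(100, 0) > value(120, 1))  # True
```

The reversal is exactly the signature of inconsistent discounting: nothing about the options changed, only the vantage point.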

How does this explain procrastination and other failures of self-control over time? Let’s try an example.

Let’s say that you have a project you need to finish by the end of the day Friday, which has a benefit to you, received on Saturday, that I will arbitrarily scale at 1000 utilons.

Then, let’s say it’s Monday. You have five days to work on it, and each day of work costs you 100 utilons. If you work all five days, the project will get done.

If you skip a day of work, you will need to work so much harder that one of the following days your cost of work will be 300 utilons instead of 100. If you skip two days, you’ll have to pay 300 utilons twice. And if you skip three or more days, the project will not be finished and it will all be for naught.

If you don’t discount time at all (which, over a week, is probably close to optimal), the answer is obvious: Work all five days. Pay 100+100+100+100+100 = 500, receive 1000. Net benefit: 500.

But even if you discount time, as long as you do so consistently, you still wouldn’t procrastinate.

Let’s say your discount rate is extremely high (maybe you’re dying or something), so that each day is only worth 80% as much as the previous. Benefit that’s worth 1 on Monday is worth 0.8 if it comes on Tuesday, 0.64 if it comes on Wednesday, 0.512 if it comes on Thursday, 0.4096 if it comes on Friday, and 0.32768 if it comes on Saturday. Then instead of paying 100+100+100+100+100 to get 1000, you’re paying 100+80+64+51+41=336 to get 328. It’s not worth doing the project; you should just enjoy your last few days on Earth. That’s not procrastinating; that’s rationally choosing not to undertake a project that isn’t worthwhile under your circumstances.
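
The discounting arithmetic above is quick to check:

```python
# With each day worth 80% of the previous one, even working all five days
# isn't worthwhile: discounted costs exceed the discounted Saturday payoff.
DISCOUNT = 0.8
costs = [100 * DISCOUNT ** day for day in range(5)]  # Mon..Fri work
benefit = 1000 * DISCOUNT ** 5                       # Sat payoff
print(round(sum(costs)), round(benefit))  # 336 328
```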

Procrastinating would look more like this: You skip the first two days, then work 100 the third day, then work 300 each of the last two days, finishing the project. If you didn’t discount at all, you would pay 100+300+300=700 to get 1000, so your net benefit has been reduced to 300.

There’s no consistent discount rate that would make this rational. If it was worth giving up 200 on Thursday and Friday to get 100 on Monday and Tuesday, you must be discounting at least 21% per day. But if you’re discounting that much, you shouldn’t bother with the project at all.

There is however an inconsistent discounting by which it makes perfect sense. Suppose that instead of consistently discounting some percentage each day, psychologically it feels like this: Value falls off as the inverse of the delay (that’s what it means to be hyperbolic). So the same amount of benefit which is worth 1 on Monday is only worth 1/2 if it comes on Tuesday, 1/3 if on Wednesday, 1/4 if on Thursday, and 1/5 if on Friday.

So, when thinking about your weekly schedule, you realize that by pushing back Monday’s work to Thursday, you can gain 100 today at an extra cost of only 200/4 = 50, since Thursday is 4 days away. And by pushing back Tuesday’s work to Friday, you can gain 100/2 = 50 today at an extra cost of only 200/5 = 40. So now it makes perfect sense to have fun on Monday and Tuesday, start working on Wednesday, and cram the biggest work into Thursday and Friday. And from Monday’s vantage point the project still seems just about worth doing: the benefit of 1000/6 = 166 is almost exactly the 100/3+300/4+300/5 = 168 the work will appear to cost.

But now think about what happens when you come to Wednesday. The work today costs 100. The work on Thursday costs 300/2 = 150. The work on Friday costs 300/3 = 100. The benefit of completing the project will be 1000/4 = 250. So you are paying 100+150+100=350 to get a benefit of only 250. It’s not worth it anymore! You’ve changed your mind. So you don’t work Wednesday.

At that point, it’s too late, so you don’t work Thursday, you don’t work Friday, and the project doesn’t get done. You have procrastinated away the benefits you could have gotten from doing this project. If only you could have done the work on Monday and Tuesday, then on Wednesday it would have been worthwhile to continue: 100/1+100/2+100/3 = 183 is less than the benefit of 250.

What went wrong? The key event was the preference reversal: While on Monday you preferred having fun on Monday and working on Thursday to working on both days, when the time came you changed your mind. Someone with time-consistent discounting would never do that; they would either prefer one or the other, and never change their mind.

One way to think about this is to imagine future versions of yourself as different people, who agree with you on most things, but not on everything. They’re like friends or family; you want the best for them, but you don’t always see eye-to-eye.

Generally we find that our future selves are less rational about choices than we are. To be clear, this doesn’t mean that we’re all declining in rationality over time. Rather, it comes from the fact that future decisions are inherently closer to our future selves than they are to our current selves, and the closer a decision gets the more likely we are to use irrational time discounting.

This is why it’s useful to plan and make commitments. If starting on Monday you committed yourself to working every single day, you’d get the project done on time and everything would work out fine. Better yet, if you committed yourself last week to starting work on Monday, you wouldn’t even feel conflicted; you would be entirely willing to pay a cost of 100/8+100/9+100/10+100/11+100/12=51 to get a benefit of 1000/13=77. So you could set up some sort of scheme where you tell your friends ahead of time that you can’t go out that week, or you turn off access to social media sites (there are apps that will do this for you), or you set up a donation to an “anti-charity” you don’t like that will trigger if you fail to complete the project on time (there are websites to do that for you).
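Again using the hyperbolic value/(1+delay) rule, a quick script confirms the week-ahead numbers (delays of 7 through 11 days for the five work days, 12 for the Saturday payoff):

```python
def discounted(value, delay):
    """Hyperbolic discounting: value / (1 + days of delay)."""
    return value / (1 + delay)

# Viewed from the previous week, working Monday through Friday costs little,
# and the project's payoff comfortably exceeds it.
cost = sum(discounted(100, d) for d in range(7, 12))
benefit = discounted(1000, 12)
print(round(cost), round(benefit))   # 51 and 77
```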

There is even a simpler way: Make a promise to yourself. This one can be tricky to follow through on, but if you can train yourself to do it, it is extraordinarily powerful and doesn’t come with the additional costs that a lot of other commitment devices involve. If you can really make yourself feel as bad about breaking a promise to yourself as you would about breaking a promise to someone else, then you can dramatically increase your own self-control with very little cost. The challenge lies in actually cultivating that sort of attitude, and then in following through with making only promises you can keep and actually keeping them. This, too, can be a delicate balance; it is dangerous to over-commit to promises to yourself and feel too much pain when you fail to meet them.
But given the strong correlations between self-control and long-term success, trying to train yourself to be a little better at it can provide enormous benefits.
If you ever get around to it, that is.

Is grade inflation a real problem?

Mar 4 JDN 2458182

You can’t spend much time teaching at the university level and not hear someone complain about “grade inflation”. Almost every professor seems to believe in it, and yet they must all be participating in it, if it’s really such a widespread problem.

This could be explained as a collective action problem, a Tragedy of the Commons: If the incentives are always to have the students with the highest grades—perhaps because of administrative pressure, or in order to get better reviews from students—then even if all professors would prefer a harsher grading scheme, no individual professor can afford to deviate from the prevailing norms.

But in fact I think there is a much simpler explanation: Grade inflation doesn’t exist.

In economic growth theory, economists make a sharp distinction between inflation—increase in prices without change in underlying fundamentals—and growth—increase in the real value of output. I contend that there is no such thing as grade inflation—what we are in fact observing is grade growth.
Am I saying that students are actually smarter now than they were 30 years ago?

Yes. That’s exactly what I’m saying.

But don’t take it from me. Take it from the decades of research on the Flynn Effect: IQ scores have been rising worldwide at a rate of about 0.3 IQ points per year for as long as we’ve been keeping good records. Students today are about 10 IQ points smarter than students 30 years ago—a 2018 IQ score of 95 is equivalent to a 1988 score of 105, which is equivalent to a 1958 score of 115. There is reason to think this trend won’t continue indefinitely, since the effect is mainly concentrated at the bottom end of the distribution; but it has continued for quite some time already.
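At the stated 0.3 points per year, the conversion is just a linear shift; a quick sketch (the exact values come out a shade under the rounded "about 10 points" per 30 years):

```python
FLYNN_RATE = 0.3  # IQ points per year, the rate cited above

def equivalent_score(score, from_year, to_year):
    """Re-norm an IQ score from one year's norms to an earlier year's."""
    return score + FLYNN_RATE * (from_year - to_year)

print(equivalent_score(95, 2018, 1988))   # 104.0
print(equivalent_score(95, 2018, 1958))   # 113.0
```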

This by itself would probably be enough to explain the observed increase in grades, but there’s more: College students are also a self-selected sample, admitted precisely because they were believed to be the smartest individuals in the application pool. Rising grades at top institutions are easily explained by rising selectivity at top schools: Harvard now accepts 5.6% of applicants. In 1942, Harvard accepted 92% of applicants. The odds of getting in have fallen from about 11:1 in favor to about 17:1 against. Today, you need a 4.0 GPA, a 36 ACT in every category, glowing letters of recommendation, and hundreds of hours of extracurricular activities (or a family member who donated millions of dollars, of course) to get into Harvard. In the 1940s, you needed a high school diploma and a B average.
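Converting those acceptance rates into odds is a one-liner; the exact values implied by the percentages are about 11.5:1 in favor then and about 16.9:1 against now:

```python
def odds(p):
    """Convert a probability into odds: for the event if p >= 0.5, against it otherwise."""
    return p / (1 - p) if p >= 0.5 else (1 - p) / p

# Harvard then and now (acceptance rates from the text above):
print(round(odds(0.92), 1))   # in favor: ~11.5 to 1
print(round(odds(0.056), 1))  # against: ~16.9 to 1
```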

In fact, when educational researchers have tried to quantitatively study the phenomenon of “grade inflation”, they usually come back with the result that they simply can’t find it. The US Department of Education conducted a study in 1995 showing that average university grades had declined since 1965. Given that the Flynn effect raised IQ by almost 10 points during that time, maybe we should be panicking about grade deflation.

It really wouldn’t be hard to make that case: “Back in my day, you could get an A just by knowing basic algebra! Now they want these kids to take partial derivatives?” “We used to just memorize facts to ace the exam; but now teachers keep asking for reasoning and critical thinking?”

More recently, a study in 2013 found that grades rose at the high school level, but fell at the college level, and showed no evidence of losing any informativeness as a signaling mechanism. The only recent study I could find showing genuinely compelling evidence for grade inflation was a 2017 study of UK students estimating that grades are growing about twice as fast as the Flynn effect alone would predict. Most studies don’t even consider the possibility that students are smarter than they used to be—they just take it for granted that any increase in average grades constitutes grade inflation. Many of them don’t even control for the increase in selectivity—here’s one using the fact that Harvard’s average rose from 2.7 to 3.4 from 1960 to 2000 as evidence of “grade inflation” when Harvard’s acceptance rate fell from almost 30% to only 10% during that period.

Indeed, the real mystery is why so many professors believe in grade inflation, when the evidence for it is so astonishingly weak.

I think it’s availability heuristic. Who are professors? They are the cream of the crop. They aced their way through high school, college, and graduate school, then got hired and earned tenure—they were one of a handful of individuals who won a fierce competition with hundreds of competitors at each stage. There are over 320 million people in the US, and only 1.3 million college faculty. This means that college professors represent about the top 0.4% of high-scoring students.

Combine that with the fact that human beings assort positively (we like to spend time with people who are similar to us) and use availability heuristic (we judge how likely something is based on how many times we have seen it).

Thus, when a professor compares to her own experience of college, she is remembering her fellow top-scoring students at elite educational institutions. She is recalling the extreme intellectual demands she had to meet to get where she is today, and erroneously assuming that these are representative of most of the population of her generation. She probably went to school at one of a handful of elite institutions, even if she now teaches at a mid-level community college: three quarters of college faculty come from the top one quarter of graduate schools.

And now she compares to the students she has to teach, most of whom would not be able to meet such demands—but of course most people in her generation couldn’t either. She frets for the future of humanity only because not everyone is a genius like her.

Throw in the Curse of Knowledge: The professor doesn’t remember how hard it was to learn what she has learned so far, and so the fact that it seems easy now makes her think it was easy all along. “How can they not know how to take partial derivatives!?” Well, let’s see… were you born knowing how to take partial derivatives?

Giving a student an A for work far inferior to what you’d have done in their place isn’t unfair. Indeed, it would clearly be unfair to do anything less. You have years if not decades more education than they do, and you come from a self-selected elite sample of highly intelligent individuals. Expecting everyone to perform as well as you would is simply setting up most of the population for failure.

There are potential incentives for grade inflation that do concern me: In particular, a lot of international student visas and scholarship programs insist upon maintaining a B or even A- average to continue. Professors are understandably loath to condemn a student to having to drop out or return to their home country just because they scored 81% instead of 84% on the final exam. If we really intend to make C the average score, then students shouldn’t lose funding or visas just for scoring a B-. Indeed, I have trouble defending any threshold above outright failing—which is to say, a minimum score of D-. If you pass your classes, that should be good enough to keep your funding.

Yet apparently even this isn’t creating too much upward bias, as students who are 10 IQ points smarter are still getting about the same scores as their forebears. We should be celebrating that our population is getting smarter, but instead we’re panicking over “easy grading”.

But kids these days, am I right?

“DSGE or GTFO”: Macroeconomics took a wrong turn somewhere

Dec 31, JDN 2458119

“The state of macro is good,” wrote Olivier Blanchard—in August 2008. This is rather like the turkey who is so pleased with how the farmer has been feeding him lately, the day before Thanksgiving.

It’s not easy to say exactly where macroeconomics went wrong, but I think Paul Romer is right when he makes the analogy between DSGE (dynamic stochastic general equilibrium) models and string theory. They are mathematically complex and difficult to understand, and people can make their careers by being the only ones who grasp them; therefore they must be right! Never mind if they have no empirical support whatsoever.

To be fair, DSGE models are at least a little better than string theory; they can at least be fit to real-world data, which is more than string theory can say. But being fit to data and actually predicting data are fundamentally different things, and DSGE models typically forecast no better than far simpler models without their bold assumptions. You don’t need to assume all this stuff about a “representative agent” maximizing a well-defined utility function, or an Euler equation (that doesn’t even fit the data), or this ever-proliferating list of “random shocks” that end up taking up all the degrees of freedom your model was supposed to explain. Just regressing the variables on a few years of previous values of each other (a “vector autoregression” or VAR) generally gives you an equally good forecast. The fact that these models can be made to fit the data well if you add enough degrees of freedom doesn’t actually make them good models. As Von Neumann warned us, with enough free parameters, you can fit an elephant.

But really what bothers me is not the DSGE but the GTFO (“get the [expletive] out”); it’s not that DSGE models are used, but that it’s almost impossible to get published as a macroeconomic theorist using anything else. Defenders of DSGE typically don’t even argue anymore that it is good; they argue that there are no credible alternatives. They characterize their opponents as “dilettantes” who aren’t opposing DSGE because we disagree with it; no, it must be because we don’t understand it. (Also, regarding that post, I’d just like to note that I now officially satisfy the Athreya Axiom of Absolute Arrogance: I have passed my qualifying exams in a top-50 economics PhD program. Yet my enmity toward DSGE has, if anything, only intensified.)

Of course, that argument only makes sense if you haven’t been actively suppressing all attempts to formulate an alternative, which is precisely what DSGE macroeconomists have been doing for the last two or three decades. And yet despite this suppression, there are alternatives emerging, particularly from the empirical side. There are now empirical approaches to macroeconomics that don’t use DSGE models. Regression discontinuity methods and other “natural experiment” designs—not to mention actual experiments—are quickly rising in popularity as economists realize that these methods allow us to actually empirically test our models instead of just adding more and more mathematical complexity to them.

But there still seems to be a lingering attitude that there is no other way to do macro theory. This is very frustrating for me personally, because deep down I think what I would like to do as a career is macro theory: By temperament I have always viewed the world through a very abstract, theoretical lens, and the issues I care most about—particularly inequality, development, and unemployment—are all fundamentally “macro” issues. I left physics when I realized I would be expected to do string theory. I don’t want to leave economics now that I’m expected to do DSGE. But I also definitely don’t want to do DSGE.

Fortunately with economics I have a backup plan: I can always be an “applied microeconomist” (rather the opposite of a theoretical macroeconomist I suppose), directly attached to the data in the form of empirical analyses or even direct, randomized controlled experiments. And there certainly is plenty of work to be done along the lines of Akerlof and Roth and Shiller and Kahneman and Thaler in cognitive and behavioral economics, which is also generally considered applied micro. I was never going to be an experimental physicist, but I can be an experimental economist. And I do get to use at least some theory: In particular, there’s an awful lot of game theory in experimental economics these days. Some of the most exciting stuff is actually in showing how human beings don’t behave the way classical game theory predicts (particularly in the Ultimatum Game and the Prisoner’s Dilemma), and trying to extend game theory into something that would fit our actual behavior. Cognitive science suggests that the result is going to end up looking quite different from game theory as we know it, and with my cognitive science background I may be particularly well-positioned to lead that charge.

Still, I don’t think I’ll be entirely satisfied if I can’t somehow bring my career back around to macroeconomic issues, and particularly the great elephant in the room of all economics, which is inequality. Underlying everything from Marxism to Trumpism, from the surging rents in Silicon Valley and the crushing poverty of Burkina Faso, to the Great Recession itself, is inequality. It is, in my view, the central question of economics: Who gets what, and why?

That is a fundamentally macro question, but you can’t even talk about that issue in DSGE as we know it; a “representative agent” inherently smooths over all inequality in the economy as though total GDP were all that mattered. A fundamentally new approach to macroeconomics is needed. Hopefully I can be part of that, but from my current position I don’t feel much empowered to fight this status quo. Maybe I need to spend at least a few more years doing something else, making a name for myself, and then I’ll be able to come back to this fight with a stronger position.

In the meantime, I guess there’s plenty of work to be done on cognitive biases and deviations from game theory.

I think I know what the Great Filter is now

Sep 3, JDN 2458000

One of the most plausible solutions to the Fermi Paradox of why we have not found any other intelligent life in the universe is called the Great Filter: Somewhere in the process of evolving from unicellular prokaryotes to becoming an interstellar civilization, there is some highly-probable event that breaks the process, a “filter” that screens out all but the luckiest species—or perhaps literally all of them.

I previously thought that this filter was the invention of nuclear weapons; I now realize that this theory is incomplete. Nuclear weapons by themselves are only an existential threat because they co-exist with widespread irrationality and bigotry. The Great Filter is the combination of the two.

Yet there is a deep reason why we would expect that this is precisely the combination that would emerge in most species (as it has certainly emerged in our own): The rationality of a species is not uniform. Some individuals in a species will always be more rational than others, so as a species increases its level of rationality, it does not do so all at once.

Indeed, the processes of economic development and scientific advancement that make a species more rational are unlikely to be spread evenly; some cultures will develop faster than others, and some individuals within a given culture will be further along than others. While the mean level of rationality increases, the variance will also tend to increase.
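A toy simulation makes the point: if levels grow at uneven rates (the rates here are invented for illustration), the mean and the variance rise together.

```python
from statistics import mean, pstdev

# Start everyone between 1 and 3, the hunter-gatherer band described below.
population = [1.0, 1.5, 2.0, 2.5, 3.0]
# Uneven advancement: some segments of the population gain faster than others.
growth_rates = [0.05, 0.10, 0.15, 0.20, 0.25]

for _ in range(80):  # 80 rounds of uneven progress
    population = [level + rate for level, rate in zip(population, growth_rates)]

print(f"mean {mean(population):.1f}, spread {pstdev(population):.1f}")
# The mean rises (2.0 -> 14.0), but the spread rises with it (0.7 -> 6.4).
```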

On some arbitrary and oversimplified scale where 1 is the level of rationality needed to maintain a hunter-gatherer tribe, and 20 is the level of rationality needed to invent nuclear weapons, the distribution of rationality in a population starts something like this:

[Figure: Great_Filter_1]

Most of the population is between levels 1 and 3, which we might think of as lying between the bare minimum for a tribe to survive and the level at which one can start to make advances in knowledge and culture.

Then, as the society advances, it goes through a phase like this:

[Figure: Great_Filter_2]

This is about where we were in Periclean Athens. Most of the population is between levels 2 and 8. Level 2 used to be the average level of rationality back when we were hunter-gatherers. Level 8 is the level of philosophers like Archimedes and Pythagoras.

Today, our society looks like this:
[Figure: Great_Filter_3]

Most of the society is between levels 4 and 20. As I said, level 20 is the point at which it becomes feasible to develop nuclear weapons. Some of the world’s people are extremely intelligent and rational, and almost everyone is more rational than even the smartest people in hunter-gatherer times, but now there is enormous variation.

Where on this chart are racism and nationalism? Importantly, I think they are above the level of rationality that most people had in ancient times. Even Greek philosophers had attitudes toward slaves and other cultures that the modern KKK would find repulsive. I think on this scale racism is about a 10 and nationalism is about a 12.

If we had managed to uniformly increase the rationality of our society, with everyone gaining at the same rate, our distribution would instead look like this:
[Figure: Great_Filter_4]

If that were the case, we’d be fine. The lowest level of rationality widespread in the population would be 14, which is already beyond racism and nationalism. (Maybe it’s about the level of humanities professors today? That makes them substantially below quantum physicists who are 20 by construction… but hey, still almost twice as good as the Greek philosophers they revere.) We would have our nuclear technology, but it would not endanger our future—we wouldn’t even use it for weapons, we’d use it for power generation and space travel. Indeed, this lower-variance high-rationality state seems to be about what they have in the Star Trek universe.

But since we didn’t, a large chunk of our population is between 10 and 12—that is, still racist or nationalist. We have the nuclear weapons, and we have people who might actually be willing to use them.

[Figure: Great_Filter_5]

I think this is what happens to most advanced civilizations around the galaxy. By the time they invent space travel, they have also invented nuclear weapons—but they still have their equivalent of racism and nationalism. And most of the time, the two combine into a volatile mix that results in the destruction or regression of their entire civilization.

If this is right, then we may be living at the most important moment in human history. It may be right here, right now, that we have the only chance we’ll ever get to turn the tide. We have to find a way to reduce the variance, to raise the rest of the world’s population past nationalism to a cosmopolitan morality. And we may have very little time.