Valuing harm without devaluing the harmed

June 9, JDN 2458644

In last week’s post I talked about the matter of “putting a value on a human life”. I explained how we don’t actually need to make a transparently absurd statement like “a human life is worth $5 million” to do cost-benefit analysis; we simply need to ask ourselves what else we could do with any given amount of money. We don’t actually need to put a dollar value on human lives; we need only value them in terms of other lives.

But there is a deeper problem to face here, which is how we ought to value not simply life, but quality of life. The notion is built into the concept of quality-adjusted life-years (QALY), but how exactly do we make such a quality adjustment?

Indeed, much like cost-benefit analysis in general or the value of a statistical life, the very concept of QALY can be repugnant to many people. The problem seems to be that it violates our deeply-held belief that all lives are of equal value: If I say that saving one person adds 2.5 QALY and saving another adds 68 QALY, I seem to be saying that the second person is worth more than the first.

But this is not really true. QALY aren’t associated with a particular individual. They are associated with the duration and quality of life.

It should be fairly easy to convince yourself that duration matters: Saving a newborn baby who will go on to live to be 84 years old adds an awful lot more in terms of human happiness than extending the life of a dying person by a single hour. To call each of these things “saving a life” is actually very unequal: It’s implying that 1 hour for the second person is worth 84 years for the first.

Quality, on the other hand, poses much thornier problems. Presumably, we’d like to be able to say that being wheelchair-bound is a bad thing, and if we can make people able to walk we should want to do that. But this means that we need to assign some sort of QALY cost to being in a wheelchair, which then seems to imply that people in wheelchairs are worth less than people who can walk.

And the same goes for any disability or disorder: Assigning a QALY cost to depression, or migraine, or cystic fibrosis, or diabetes, or blindness, or pneumonia, always seems to imply that people with the condition are worth less than people without. This is a deeply unsettling result.

Yet I think the mistake is in how we are using the concept of “worth”. We are not saying that the happiness of someone with depression is less important than the happiness of someone without; we are saying that the person with depression experiences less happiness, which, in the case of depression especially, is basically true by construction.

Does this imply, however, that if we are given the choice between saving two people, one of whom has a disability, we should save the one without?

Well, here’s an extreme example: Suppose there is a plague which kills 50% of its victims within one year. There are two people in a burning building. One of them has the plague, the other does not. You only have time to save one: Which do you save? I think it’s quite obvious you save the person who doesn’t have the plague.

But that only relies upon duration, which wasn’t so difficult. All right, fine; say the plague doesn’t kill you. Instead, it renders you paralyzed and in constant pain for the rest of your life. Is it really that far-fetched to say that we should save the person who won’t have that experience?

We really shouldn’t think of it as valuing people; we should think of it as valuing actions. QALY are a way of deciding which actions we should take, not which people are more important or more worthy. “Is a person who can walk worth more than a person who needs a wheelchair?” is a fundamentally bizarre and ultimately useless question. ‘Worth more’ in what sense? “Should we spend $100 million developing this technology that will allow people who use wheelchairs to walk?” is the question we should be asking. The QALY cost we assign to a condition isn’t about how much people with that condition are worth; it’s about what resources we should be willing to commit in order to treat that condition. If you have a given condition, you should want us to assign a high QALY cost to it, to motivate us to find better treatments.

I think it’s also important to consider which individuals are having QALY added or subtracted. In last week’s post I talked about how some people read “the value of a statistical life is $5 million” to mean “it’s okay to kill someone as long as you profit at least $5 million”; but this doesn’t follow at all. We don’t say that it’s all right to steal $1,000 from someone just because they lose $1,000 and you gain $1,000. We wouldn’t say it was all right if you had a better investment strategy and would end up with $1,100 afterward. We probably wouldn’t even say it was all right if you were much poorer and desperate for the money (though then we might at least be tempted). If a billionaire kills people to make $10 million each (sadly I’m quite sure that oil executives have killed for far less), that’s still killing people. And in fact since he is a billionaire, his marginal utility of wealth is so low that his value of a statistical life isn’t $5 million; it’s got to be in the billions. So the net happiness of the world has not increased, in fact.

Above all, it’s vital to appreciate the benefits of doing good cost-benefit analysis. Cost-benefit analysis tells us to stop fighting wars. It tells us to focus our spending on medical research and foreign aid instead of yet more corporate subsidies or aircraft carriers. It tells us how to allocate our public health resources so as to save the most lives. It emphasizes how vital our environmental regulations are in making our lives better and longer.

Could we do all these things without QALY? Maybe—but I suspect we would not do them as well, and when millions of lives are on the line, “not as well” is thousands of innocent people dead. Sometimes we really are faced with two choices for a public health intervention, and we need to decide which one will help the most people. Sometimes we really do have to set a pollution target, and decide just what amount of risk is worth accepting for the economic benefits of industry. These are very difficult questions, and without good cost-benefit analysis we could get the answers dangerously wrong.

Markets value rich people more

Feb 26, JDN 2457811

Competitive markets are optimal at maximizing utility, as long as you value rich people more.

That is literally a theorem in neoclassical economics. I had previously thought that this was something most economists didn’t realize; I had delusions of grandeur that maybe I could finally convince them that this is the case. But no, it turns out this is actually a well-known finding; it’s just that somehow nobody seems to care. Or if they do care, they never talk about it. For all the thousands of papers and articles about the distortions created by minimum wage and capital gains tax, you’d think someone could spare the time to talk about the vastly larger fundamental distortions created by the structure of the market itself.

It’s not as if this is something completely hopeless we could never deal with. A basic income would go a long way toward correcting this distortion, especially if coupled with highly progressive taxes. By creating a hard floor and a soft ceiling on income, you can reduce the inequality that makes these distortions so large.

The basics of the theorem are quite straightforward, so I think it’s worth explaining them here. It’s extremely general; it applies anywhere that goods are allocated by market prices and different individuals have wildly different amounts of wealth.

Suppose that each person has a certain amount of wealth W to spend. Person 1 has W1, person 2 has W2, and so on. They all have some amount of happiness, defined by a utility function, which I’ll assume is only dependent on wealth; this is a massive oversimplification of course, but it wouldn’t substantially change my conclusions to include other factors—it would just make everything more complicated. (In fact, including altruistic motives would make the whole argument stronger, not weaker.) Thus I can write each person’s utility as a function U(W). The rate of change of this utility as wealth increases, the marginal utility of wealth, is denoted U'(W).

By the law of diminishing marginal utility, the marginal utility of wealth U'(W) is decreasing. That is, the more wealth you have, the less each new dollar is worth to you.

Now suppose people are buying goods. Each good C provides some amount of marginal utility U'(C) to the person who buys it. This can vary across individuals; some people like Pepsi, others Coke. This marginal utility is also decreasing; a house is worth a lot more to you if you are living in the street than if you already have a mansion. Ideally we would want the goods to go to the people who want them the most—but as you’ll see in a moment, markets systematically fail to do this.

If people are making their purchases rationally, each person’s willingness-to-pay P for a given good C will be equal to their marginal utility of that good, divided by their marginal utility of wealth:

P = U'(C)/U'(W)

Now consider this from the perspective of society as a whole. If you wanted to maximize utility, you’d equalize marginal utility across individuals (this is the first-order condition for a maximum). The idea is that if marginal utility is higher for one person, you should give that person more, because the benefit of what you give them will be larger that way; and if marginal utility is lower for another person, you should give that person less, because the benefit of what you give them will be smaller. When everyone is equal, you are at the maximum.

But market prices don’t actually do this. Instead they equalize over willingness-to-pay. So if you’ve got two individuals 1 and 2, instead of having this:

U'(C1) = U'(C2)

you have this:

P1 = P2

which translates to:

U'(C1)/U'(W1) = U'(C2)/U'(W2)

If the marginal utilities were the same, U'(W1) = U'(W2), we’d be fine; these would give the same results. But that would only happen if W1 = W2, that is, if the two individuals had the same amount of wealth.

Now suppose we were instead maximizing weighted utility, where each person gets a weighting factor A based on how “important” they are or something. If your A is higher, your utility matters more. If we maximized this new weighted utility, we would end up like this:

A1*U'(C1) = A2*U'(C2)

Because person 1’s utility counts for more, their marginal utility also counts for more. This seems very strange; why are we valuing some people more than others? On what grounds?

Yet this is effectively what we’ve already done by using market prices. Just set:

A = 1/U'(W)

Since marginal utility of wealth is decreasing, 1/U'(W) is higher precisely when W is higher.
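
To see the mechanism in miniature, here is a small Python sketch. It assumes logarithmic utility of wealth (so U'(W) = 1/W), and the two buyers and all their numbers are purely hypothetical; none of them come from real data.

```python
def marginal_utility_of_wealth(wealth):
    """Assumed logarithmic utility U(W) = ln(W), so U'(W) = 1/W."""
    return 1.0 / wealth


def willingness_to_pay(marginal_utility_of_good, wealth):
    """P = U'(C) / U'(W), the willingness-to-pay formula from the text."""
    return marginal_utility_of_good / marginal_utility_of_wealth(wealth)


# Hypothetical buyers: (name, wealth in dollars, marginal utility of the good)
buyers = [
    ("rich", 1_000_000, 1.0),   # barely wants the good
    ("poor", 1_000, 100.0),     # gets 100 times as much utility from it
]

# The market hands the good to whoever will pay the most for it.
market_winner = max(buyers, key=lambda b: willingness_to_pay(b[2], b[1]))
# An unweighted utility maximizer hands it to whoever benefits the most.
utility_winner = max(buyers, key=lambda b: b[2])

print("Market allocates the good to:", market_winner[0])
print("Utility maximization allocates it to:", utility_winner[0])
```

In this toy case the poor buyer values the good a hundred times as much, but the rich buyer has a thousand times the wealth, so the rich buyer's willingness-to-pay is still ten times higher.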

How much higher? Well, that depends on the utility function. The two utility functions I find most plausible are logarithmic and harmonic. (Actually I think both apply, one to other-directed spending and the other to self-directed spending.)

If utility is logarithmic:

U(W) = ln(W)

Then marginal utility is inversely proportional:

U'(W) = 1/W

In that case, your value as a human being, as spoken by the One True Market, is precisely equal to your wealth:

A = 1/U'(W) = W

If utility is harmonic, matters are even more severe.

U(W) = 1-1/W

Marginal utility goes as the inverse square of wealth:

U'(W) = 1/W^2

And thus your value, according to the market, is equal to the square of your wealth:

A = 1/U'(W) = W^2

What are we really saying here? Hopefully no one actually believes that Bill Gates is really morally worth 400 trillion times as much as a starving child in Malawi, as the calculation from harmonic utility would imply. (Bill Gates himself certainly doesn’t!) Even the logarithmic utility estimate saying that he’s worth 20 million times as much is pretty hard to believe.

But implicitly, the market “believes” that, because when it decides how to allocate resources, something that is worth 1 microQALY to Bill Gates (about the value of a nickel dropped on the floor to you or me) but worth 20 QALY (twenty years of life!) to the Malawian child, will in either case be priced at $8,000, and since the child doesn’t have $8,000, it will probably go to Mr. Gates. Perhaps a middle-class American could purchase it, provided it was worth some 0.3 QALY to them.

Now consider that this is happening in every transaction, for every good, in every market. Goods are not being sold to the people who get the most value out of them; they are being sold to the people who have the most money.

And suddenly, the entire edifice of “market efficiency” comes crashing down like a house of cards. A global market that quite efficiently maximizes willingness-to-pay is so thoroughly out of whack when it comes to actually maximizing utility that massive redistribution of wealth could enormously increase human welfare, even if it turned out to cut our total output in half—if utility is harmonic, even if it cut our total output to one-tenth its current value.
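
As a rough illustration of that claim, the following Python sketch compares total utility before and after an equalizing redistribution that also shrinks total output. The ten-person economy is a hypothetical toy, not real data, and the harmonic result in particular depends on measuring wealth in dollars; treat it as a sanity check on the direction of the effect, not as an estimate of its size.

```python
import math


def log_utility(w):
    return math.log(w)


def harmonic_utility(w):
    return 1.0 - 1.0 / w


def total_utility(wealths, utility):
    return sum(utility(w) for w in wealths)


def equalize(wealths, output_fraction):
    """Shrink total wealth to output_fraction of its size, then split it equally."""
    share = output_fraction * sum(wealths) / len(wealths)
    return [share] * len(wealths)


# Hypothetical toy economy: nine people with $1,000 and one with $1,000,000.
economy = [1_000] * 9 + [1_000_000]

# Logarithmic utility, with output cut in half.
log_before = total_utility(economy, log_utility)
log_after = total_utility(equalize(economy, 0.5), log_utility)

# Harmonic utility, with output cut to one-tenth.
harmonic_before = total_utility(economy, harmonic_utility)
harmonic_after = total_utility(equalize(economy, 0.1), harmonic_utility)

print(f"log utility:      before {log_before:.3f}, after {log_after:.3f}")
print(f"harmonic utility: before {harmonic_before:.6f}, after {harmonic_after:.6f}")
```

In this toy economy total utility rises in both cases, even though total output has fallen.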

The only way to escape this is to argue that marginal utility of wealth is not decreasing, or at least decreasing very, very slowly. Suppose for instance that utility goes as the 0.9 power of wealth:

U(W) = W^0.9

Then marginal utility goes as the -0.1 power of wealth:

U'(W) = 0.9 W^(-0.1)

On this scale, Bill Gates is only worth about 5 times as much as the Malawian child, which in his particular case might actually be too small—if a trolley is about to kill either Bill Gates or 5 Malawian children, I think I save Bill Gates, because he’ll go on to save many more than 5 Malawian children. (Of course, substitute Donald Trump or Charles Koch and I’d let the trolley run over him without a second thought if even a single child is at stake, so it’s not actually a function of wealth.) In any case, a 5 to 1 range across the whole range of human wealth is really not that big a deal. It would introduce some distortions, but not enough to justify any redistribution that would meaningfully reduce overall output.

Of course, that commits you to saying that $1 to a Malawian child is only worth about $1.50 to you or me and $5 to Bill Gates. If you can truly believe this, then perhaps you can sleep at night accepting the outcomes of neoclassical economics. But can you, really, believe that? If you had the choice between an intervention that would give $100 to each of 10,000 children in Malawi, and another that would give $50,000 to each of 100 billionaires, would you really choose the billionaires? Do you really think that the world would be better off if you did?

We don’t have precise measurements of marginal utility of wealth, unfortunately. At the moment, I think logarithmic utility is the safest assumption; it’s about the slowest decrease that is consistent with the data we have and it is very intuitive and mathematically tractable. Perhaps I’m wrong and the decrease is even slower than that, say W^(-0.5) (then the market only values billionaires as worth thousands of times as much as starving children). But there’s no way you can go as far as it would take to justify our current distribution of wealth. W^(-0.1) is simply not a plausible value.
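
The figures quoted in this post can be checked with a few lines of Python. The sketch assumes marginal utility is proportional to W^(-eta), so the implicit weight A = 1/U'(W) scales as W^eta and constant factors (like the 0.9 above) cancel in the ratio. The wealth ratio of 20 million is inferred from the logarithmic figure quoted earlier; the function name is just illustrative.

```python
def market_weight_ratio(wealth_ratio, eta):
    """Ratio of implicit market weights A = 1/U'(W) between two people,
    assuming marginal utility of wealth is proportional to W**(-eta)."""
    return wealth_ratio ** eta


# Assumption: Bill Gates has about 20 million times the wealth of the child.
WEALTH_RATIO = 20_000_000

cases = [
    ("harmonic utility, U'(W) = 1/W^2", 2.0),
    ("logarithmic utility, U'(W) = 1/W", 1.0),
    ("U'(W) proportional to W^(-0.5)", 0.5),
    ("U(W) = W^0.9, U'(W) proportional to W^(-0.1)", 0.1),
]

for label, eta in cases:
    ratio = market_weight_ratio(WEALTH_RATIO, eta)
    print(f"{label}: weight ratio of about {ratio:,.1f}")
```

The four cases come out to roughly 400 trillion, 20 million, a few thousand, and a little over 5, matching the numbers used in the argument.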

And this means that free markets, left to their own devices, will systematically fail to maximize human welfare. We need redistribution—a lot of redistribution. Don’t take my word for it; the math says so.

How do people think about probability?

Nov 27, JDN 2457690

(This topic was chosen by vote of my Patreons.)

In neoclassical theory, it is assumed (explicitly or implicitly) that human beings judge probability in something like the optimal Bayesian way: We assign prior probabilities to events, and then when confronted with evidence we infer using the observed data to update our prior probabilities to posterior probabilities. Then, when we have to make decisions, we maximize our expected utility subject to our posterior probabilities.

This, of course, is nothing like how human beings actually think. Even very intelligent, rational, numerate people only engage in a vague approximation of this behavior, and only when dealing with major decisions likely to affect the course of their lives. (Yes, I literally decide which universities to attend based upon formal expected utility models. Thus far, I’ve never been dissatisfied with a decision made that way.) No one decides what to eat for lunch or what to do this weekend based on formal expected utility models (or at least I hope they don’t, because at that point the computational cost far exceeds the expected benefit).

So how do human beings actually think about probability? Well, a good place to start is to look at ways in which we systematically deviate from expected utility theory.

A classic example is the Allais paradox. See if it applies to you.

In game A, you get $1 million, guaranteed.
In game B, you have a 10% chance of getting $5 million, an 89% chance of getting $1 million, but now you have a 1% chance of getting nothing.

Which do you prefer, game A or game B?

In game C, you have an 11% chance of getting $1 million, and an 89% chance of getting nothing.

In game D, you have a 10% chance of getting $5 million, and a 90% chance of getting nothing.

Which do you prefer, game C or game D?

I have to think about it for a little bit and do some calculations, and it’s still very hard. It depends crucially on my projected lifetime income (which could easily exceed $3 million with a PhD, especially in economics) and the precise form of my marginal utility (I think I have constant relative risk aversion, but I’m not sure what parameter to use precisely). In general I think I want to choose game A and game C, but I actually feel really ambivalent, because it’s not hard to find plausible parameters for my utility where I should go for the gamble.

But if you’re like most people, you choose game A and game D.

There is no coherent expected utility by which you would do this.

Why? Either a 10% chance of $5 million instead of $1 million is worth risking a 1% chance of nothing, or it isn’t. If it is, you should play B and D. If it’s not, you should play A and C. I can’t tell you for sure whether it is worth it—I can’t even fully decide for myself—but it either is or it isn’t.
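
This can be checked numerically. The Python sketch below computes the expected-utility gap between A and B and between C and D for a few utility functions. The utility functions themselves, and the baseline wealth of $100,000, are arbitrary assumptions chosen for illustration; the point is that the two gaps coincide for any utility function you plug in.

```python
import math

# Each game is a list of (probability, prize in dollars) pairs, as described above.
GAME_A = [(1.00, 1_000_000)]
GAME_B = [(0.10, 5_000_000), (0.89, 1_000_000), (0.01, 0)]
GAME_C = [(0.11, 1_000_000), (0.89, 0)]
GAME_D = [(0.10, 5_000_000), (0.90, 0)]


def expected_utility(game, utility):
    return sum(p * utility(prize) for p, prize in game)


# Hypothetical utility functions; BASELINE is an assumed starting wealth.
BASELINE = 100_000
utilities = {
    "linear": lambda x: float(x),
    "square root": lambda x: math.sqrt(x),
    "log of total wealth": lambda x: math.log(BASELINE + x),
}

for name, u in utilities.items():
    a_minus_b = expected_utility(GAME_A, u) - expected_utility(GAME_B, u)
    c_minus_d = expected_utility(GAME_C, u) - expected_utility(GAME_D, u)
    print(f"{name}: EU(A) - EU(B) = {a_minus_b:+.6f}, EU(C) - EU(D) = {c_minus_d:+.6f}")
```

Both gaps reduce to 0.11 u($1 million) - 0.10 u($5 million) - 0.01 u($0), so they always share the same sign: whoever prefers A to B must also prefer C to D.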

Yet most people have a strong intuition that they should take game A but game D. Why? What does this say about how we judge probability?

The leading theory in behavioral economics right now is cumulative prospect theory, developed by the great Kahneman and Tversky, who essentially founded the field of behavioral economics. It’s quite intimidating to try to go up against them, which is probably why we should force ourselves to do it. Fear of challenging the favorite theories of the great scientists before us is how science stagnates.

I wrote about it more in a previous post, but as a brief review, cumulative prospect theory says that instead of judging based on a well-defined utility function, we instead consider gains and losses as fundamentally different sorts of thing, and in three specific ways:

First, we are loss-averse; we feel a loss about twice as intensely as a gain of the same amount.

Second, we are risk-averse for gains, but risk-seeking for losses; we assume that gaining twice as much isn’t actually twice as good (which is almost certainly true), but we also assume that losing twice as much isn’t actually twice as bad (which is almost certainly false and indeed contradictory with the previous).

Third, we judge probabilities as more important when they are close to certainty. We make a large distinction between a 0% probability and a 0.0000001% probability, but almost no distinction at all between a 41% probability and a 43% probability.

That last part is what I want to focus on for today. In Kahneman’s model, this is a continuous, monotonic function that maps 0 to 0 and 1 to 1, but systematically overestimates probabilities below but near 1/2 and systematically underestimates probabilities above but near 1/2.

It looks something like this, where red is true probability and blue is subjective probability:

I don’t believe this is actually how humans think, for two reasons:

  1. It’s too hard. Humans are astonishingly innumerate creatures, given the enormous processing power of our brains. It’s true that we have some intuitive capacity for “solving” very complex equations, but that’s almost all within our motor system—we can “solve a differential equation” when we catch a ball, but we have no idea how we’re doing it. But probability judgments are often made consciously, especially in experiments like the Allais paradox; and the conscious brain is terrible at math. It’s actually really amazing how bad we are at math. Any model of normal human judgment should assume from the start that we will not do complicated math at any point in the process. Maybe you can hypothesize that we do so subconsciously, but you’d better have a good reason for assuming that.
  2. There is no reason to do this. Why in the world would any kind of optimization system function this way? You start with perfectly good probabilities, and then instead of using them, you subject them to some bizarre, unmotivated transformation that makes them less accurate and costs computing power? You may as well hit yourself in the head with a brick.

So, why might it look like we are doing this? Well, my proposal, admittedly still rather half-baked, is that human beings don’t assign probabilities numerically at all; we assign them categorically.

You may call this, for lack of a better term, categorical prospect theory.

My theory is that people don’t actually have in their head “there is an 11% chance of rain today” (unless they specifically heard that from a weather report this morning); they have in their head “it’s fairly unlikely that it will rain today”.

That is, we assign some small number of discrete categories of probability, and fit things into them. I’m not sure what exactly the categories are, and part of what makes my job difficult here is that they may be fuzzy-edged and vary from person to person, but roughly speaking, I think they correspond to the sort of things psychologists usually put on Likert scales in surveys: Impossible, almost impossible, very unlikely, unlikely, fairly unlikely, roughly even odds, fairly likely, likely, very likely, almost certain, certain. If I’m putting numbers on these probability categories, they go something like this: 0, 0.001, 0.01, 0.10, 0.20, 0.50, 0.8, 0.9, 0.99, 0.999, 1.

Notice that this would preserve the same basic effect as cumulative prospect theory: You care a lot more about differences in probability when they are near 0 or 1, because those are much more likely to actually shift your category. Indeed, as written, you wouldn’t care about a shift from 0.4 to 0.6 at all, despite caring a great deal about a shift from 0.001 to 0.01.
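
Here is a minimal Python sketch of the categorical idea, using the labels and representative values listed above. The snapping rule is my own assumption for illustration: exactly 0 and exactly 1 get their own categories, and everything else goes to the interior category with the nearest representative value. The real boundaries, if they exist, may be fuzzier and differ between people.

```python
# Category labels and representative probabilities, taken from the list above.
CATEGORIES = [
    ("impossible", 0.0),
    ("almost impossible", 0.001),
    ("very unlikely", 0.01),
    ("unlikely", 0.10),
    ("fairly unlikely", 0.20),
    ("roughly even odds", 0.50),
    ("fairly likely", 0.8),
    ("likely", 0.9),
    ("very likely", 0.99),
    ("almost certain", 0.999),
    ("certain", 1.0),
]


def categorize(p):
    """Assumed rule: 0 and 1 are their own categories; any other p snaps to
    the interior category whose representative value is nearest."""
    if p == 0.0:
        return "impossible"
    if p == 1.0:
        return "certain"
    interior = CATEGORIES[1:-1]
    return min(interior, key=lambda c: abs(c[1] - p))[0]


for p in (0.001, 0.01, 0.25, 0.4, 0.6):
    print(f"{p}: {categorize(p)}")
```

Under this rule 0.4 and 0.6 land in the same box, while 0.001 and 0.01 land in different ones.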

How does this solve the above problems?

  1. It’s easy. Not only don’t you compute a probability and then recompute it for no reason; you never even have to compute it precisely. Just get it within some vague error bounds and that will tell you what box it goes in. Instead of computing an approximation to a continuous function, you just slot things into a small number of discrete boxes, a dozen at the most.
  2. That explains why we would do it: It’s easy. Our brains need to conserve their capacity, and they did especially in our ancestral environment when we struggled to survive. Rather than having to iterate your approximation to arbitrary precision, you just get within 0.1 or so and call it a day. That saves time and computing power, which saves energy, which could save your life.

What new problems have I introduced?

  1. It’s very hard to know exactly where people’s categories are, if they vary between individuals or even between situations, and whether they are fuzzy-edged.
  2. If you take the model I just gave literally, even quite large probability changes will have absolutely no effect as long as they remain within a category such as “roughly even odds”.

With regard to 2, I think Kahneman may himself be able to save me, with his dual process theory concept of System 1 and System 2. What I’m really asserting is that System 1, the fast, intuitive judgment system, operates on these categories. System 2, on the other hand, the careful, rational thought system, can actually make use of proper numerical probabilities; it’s just very costly to boot up System 2 in the first place, much less ensure that it actually gets the right answer.

How might we test this? Well, I think that people are more likely to use System 1 when any of the following are true:

  1. They are under harsh time-pressure
  2. The decision isn’t very important
  3. The intuitive judgment is fast and obvious

And conversely they are likely to use System 2 when the following are true:

  1. They have plenty of time to think
  2. The decision is very important
  3. The intuitive judgment is difficult or unclear

So, it should be possible to arrange an experiment varying these parameters, such that in one treatment people almost always use System 1, and in another they almost always use System 2. And then, my prediction is that in the System 1 treatment, people will in fact not change their behavior at all when you change the probability from 15% to 25% (fairly unlikely) or 40% to 60% (roughly even odds).

To be clear, you can’t just present people with this choice between game E and game F:

Game E: You get a 60% chance of $50, and a 40% chance of nothing.

Game F: You get a 40% chance of $50, and a 60% chance of nothing.

People will obviously choose game E. If you can directly compare the numbers and one game is strictly better in every way, I think even without much effort people will be able to choose correctly.

Instead, what I’m saying is that if you make the following offers to two completely different sets of people, you will observe little difference in their choices, even though under expected utility theory you should.

Group I receives a choice between game E and game G:

Game E: You get a 60% chance of $50, and a 40% chance of nothing.

Game G: You get a 100% chance of $20.

Group II receives a choice between game F and game G:

Game F: You get a 40% chance of $50, and a 60% chance of nothing.

Game G: You get a 100% chance of $20.

Under two very plausible assumptions about marginal utility of wealth, I can fix what the rational judgment should be in each game.

The first assumption is that marginal utility of wealth is decreasing, so people are risk-averse (at least for gains, which these are). The second assumption is that most people’s lifetime income is at least two orders of magnitude higher than $50.

By the first assumption, group II should choose game G. The expected income is precisely the same, and being even ever so slightly risk-averse should make you go for the guaranteed $20.

By the second assumption, group I should choose game E. Yes, there is some risk, but because $50 should not be a huge sum to you, your risk aversion should be small and the higher expected income of $30 should sway you.
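
The two rational judgments can be sketched in Python. The utility function here is an assumption for illustration: logarithmic utility over total wealth, with a baseline of $5,000, which is the smallest wealth satisfying the "two orders of magnitude" condition. Other concave utility functions and larger baselines should give the same ordering.

```python
import math

# Each game is a list of (probability, prize in dollars) pairs, as described above.
GAME_E = [(0.6, 50), (0.4, 0)]
GAME_F = [(0.4, 50), (0.6, 0)]
GAME_G = [(1.0, 20)]

# Assumption: baseline wealth two orders of magnitude above the $50 prize.
BASELINE_WEALTH = 5_000


def expected_value(game):
    return sum(p * prize for p, prize in game)


def expected_utility(game, baseline=BASELINE_WEALTH):
    """Expected utility under an assumed log utility of total wealth."""
    return sum(p * math.log(baseline + prize) for p, prize in game)


for name, game in (("E", GAME_E), ("F", GAME_F), ("G", GAME_G)):
    print(f"Game {name}: expected value ${expected_value(game):.2f}, "
          f"expected utility {expected_utility(game):.6f}")
```

With these assumptions game E beats game G, and game G narrowly beats game F, which is the pattern expected utility theory predicts for groups I and II.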

But I predict that most people will choose game G in both cases, and (within statistical error) the same proportion will choose F as chose E—thus showing that the difference between a 40% chance and a 60% chance was in fact negligible to their intuitive judgments.

However, this doesn’t actually disprove Kahneman’s theory; perhaps that part of the subjective probability function is just that flat. For that, I need to set up an experiment where I show discontinuity. I need to find the edge of a category and get people to switch categories sharply. Next week I’ll talk about how we might pull that off.

Bigotry is more powerful than the market

Nov 20, JDN 2457683

If there’s one message we can take from the election of Donald Trump, it is that bigotry remains a powerful force in our society. A lot of autoflagellating liberals have been trying to explain how this election result really reflects our failure to help people displaced by technology and globalization (despite the fact that personal income and local unemployment had negligible correlation with voting for Trump), or Hillary Clinton’s “bad campaign” that nonetheless managed the same proportion of Democrat turnout that re-elected her husband in 1996.

No, overwhelmingly, the strongest predictor of voting for Trump was being White, and living in an area where most people are White. (Well, actually, that’s if you exclude authoritarianism as an explanatory variable—but really I think that’s part of what we’re trying to explain.) Trump voters were actually concentrated in areas less affected by immigration and globalization. Indeed, there is evidence that these people aren’t racist because they have anxiety about the economy—they are anxious about the economy because they are racist. How does that work? Obama. They can’t believe that the economy is doing well when a Black man is in charge. So all the statistics and even personal experiences mean nothing to them. They know in their hearts that unemployment is rising, even as the BLS data clearly shows it’s falling.

The wide prevalence and enormous power of bigotry should be obvious. But economists rarely talk about it, and I think I know why: Their models say it shouldn’t exist. The free market is supposed to automatically eliminate all forms of bigotry, because they are inefficient.

The argument for why this is supposed to happen actually makes a great deal of sense: If a company has the choice of hiring a White man or a Black woman to do the same job, but they know that the market wage for Black women is lower than the market wage for White men (which it most certainly is), and they will do the same quality and quantity of work, why wouldn’t they hire the Black woman? And indeed, if human beings were rational profit-maximizers, this is probably how they would think.

More recently some neoclassical models have been developed to try to “explain” this behavior, but always without daring to give up the precious assumption of perfect rationality. So instead we get the two leading neoclassical theories of discrimination, which are statistical discrimination and taste-based discrimination.

Statistical discrimination is the idea that under asymmetric information (and we surely have that), features such as race and gender can act as signals of quality because they are correlated with actual quality for various reasons (usually left unspecified), so it is not irrational after all to choose based upon them, since they’re the best you have.

Taste-based discrimination is the idea that people are rationally maximizing preferences that simply aren’t oriented toward maximizing profit or well-being. Instead, they have this extra term in their utility function that says they should also treat White men better than women or Black people. It’s just this extra thing they have.

A small number of studies have been done trying to discern which of these is at work.

The correct answer, of course, is neither.

Statistical discrimination, at least, could be part of what’s going on. Knowing that Black people are less likely to be highly educated than Asians (as they definitely are) might actually be useful information in some circumstances… then again, you list your degree on your resume, don’t you? Knowing that women are more likely to drop out of the workforce after having a child could rationally (if coldly) affect your assessment of future productivity. But shouldn’t the fact that women CEOs outperform men CEOs be incentivizing shareholders to elect women CEOs? Yet that doesn’t seem to happen. Also, in general, people seem to be pretty bad at statistics.

The bigger problem with statistical discrimination as a theory is that it’s really only part of a theory. It explains why not all of the discrimination has to be irrational, but some of it still does. You need to explain why there are these huge disparities between groups in the first place, and statistical discrimination is unable to do that. In order for the statistics to differ this much, you need a past history of discrimination that wasn’t purely statistical.

Taste-based discrimination, on the other hand, is not a theory at all. It’s special pleading. Rather than admit that people are failing to rationally maximize their utility, we just redefine their utility so that whatever they happen to be doing now “maximizes” it.

This is really what makes the Axiom of Revealed Preference so insidious; if you really take it seriously, it says that whatever you do, must by definition be what you preferred. You can’t possibly be irrational, you can’t possibly be making mistakes of judgment, because by definition whatever you did must be what you wanted. Maybe you enjoy bashing your head into a wall, who am I to judge?

I mean, on some level taste-based discrimination is what’s happening; people think that the world is a better place if they put women and Black people in their place. So in that sense, they are trying to “maximize” some “utility function”. (By the way, most human beings behave in ways that are provably inconsistent with maximizing any well-defined utility function—the Allais Paradox is a classic example.) But the whole framework of calling it “taste-based” is a way of running away from the real explanation. If it’s just “taste”, well, it’s an unexplainable brute fact of the universe, and we just need to accept it. If people are happier being racist, what can you do, eh?

So I think it’s high time to start calling it what it is. This is not a question of taste. This is a question of tribal instinct. This is the product of millions of years of evolution optimizing the human brain to act in the perceived interest of whatever it defines as its “tribe”. It could be yourself, your family, your village, your town, your religion, your nation, your race, your gender, or even the whole of humanity or beyond into all sentient beings. But whatever it is, the fundamental tribe is the one thing you care most about. It is what you would sacrifice anything else for.

And what we learned on November 9 this year is that an awful lot of Americans define their tribe in very narrow terms. Nationalistic and xenophobic at best, racist and misogynistic at worst.

But I suppose this really isn’t so surprising, if you look at the history of our nation and the world. Segregation was not outlawed in US schools until 1954, and there are women who voted in this election who were born before American women got the right to vote in 1920. The nationalistic backlash against sending jobs to China (which was one of the chief ways that we reduced global poverty to its lowest level ever, by the way) really shouldn’t seem so strange when we remember that over 100,000 Japanese-Americans were literally forcibly relocated into camps as recently as 1942. The fact that so many White Americans seem all right with the biases against Black people in our justice system may not seem so strange when we recall that systemic lynching of Black people in the US didn’t end until the 1960s.

The wonder, in fact, is that we have made as much progress as we have. Tribal instinct is not a strange aberration of human behavior; it is our evolutionary default setting.

Indeed, perhaps it is unreasonable of me to ask humanity to change its ways so fast! We had millions of years to learn how to live the wrong way, and I’m giving you only a few centuries to learn the right way?

The problem, of course, is that the pace of technological change leaves us with no choice. It might be better if we could wait a thousand years for people to gradually adjust to globalization and become cosmopolitan; but climate change won’t wait a hundred, and nuclear weapons won’t wait at all. We are thrust into a world that is changing very fast indeed, and I understand that it is hard to keep up; but there is no way to turn back that tide of change.

Yet “turn back the tide” does seem to be part of the core message of the Trump voter, once you get past the racial slurs and sexist slogans. People are afraid of what the world is becoming. They feel that it is leaving them behind. Coal miners fret that we are leaving them behind by cutting coal consumption. Factory workers fear that we are leaving them behind by moving the factory to China or inventing robots to do the work in half the time for half the price.

And truth be told, they are not wrong about this. We are leaving them behind. Because we have to. Because coal is polluting our air and destroying our climate, we must stop using it. Moving the factories to China has raised them out of the most dire poverty, and given us a fighting chance toward ending world hunger. Inventing the robots is only the next logical step in the process that has carried humanity forward from the squalor and suffering of primitive life to the security and prosperity of modern society—and it is a step we must take, for the progress of civilization is not yet complete.

They wouldn’t have to let themselves be left behind, if they were willing to accept our help and learn to adapt. That carbon tax that closes your coal mine could also pay for your basic income and your job-matching program. The increased efficiency from the automated factories could provide an abundance of wealth that we could redistribute and share with you.

But this would require them to rethink their view of the world. They would have to accept that climate change is a real threat, and not a hoax created by… uh… never was clear on that point actually… the Chinese maybe? But 45% of Trump supporters don’t believe in climate change (and that’s actually not as bad as I’d have thought). They would have to accept that what they call “socialism” (which really is more precisely described as social democracy, or tax-and-transfer redistribution of wealth) is actually something they themselves need, and will need even more in the future. But despite rising inequality, redistribution of wealth remains fairly unpopular in the US, especially among Republicans.

Above all, it would require them to redefine their tribe, and start listening to—and valuing the lives of—people that they currently do not.

Perhaps we need to redefine our tribe as well; many liberals have argued that we mistakenly—and dangerously—did not include people like Trump voters in our tribe. But to be honest, that rings a little hollow to me: We aren’t the ones threatening to deport people or ban them from entering our borders. We aren’t the ones who want to build a wall (though some have in fact joked about building a wall to separate the West Coast from the rest of the country, I don’t think many people really want to do that). Perhaps we live in a bubble of liberal media? But I make a point of reading outlets like The American Conservative and The National Review for other perspectives (I usually disagree, but I do at least read them); how many Trump voters do you think have ever read the New York Times, let alone Huffington Post? Cosmopolitans almost by definition have the more inclusive tribe, the more open perspective on the world (in fact, do I even need the “almost”?).

Nor do I think we are actually ignoring their interests. We want to help them. We offer to help them. In fact, I want to give these people free money—that’s what a basic income would do, it would take money from people like me and give it to people like them—and they won’t let us, because that’s “socialism”! Rather, we are simply refusing to accept their offered solutions, because those so-called “solutions” are beyond unworkable; they are absurd, immoral and insane. We can’t bring back the coal mining jobs, unless we want Florida underwater in 50 years. We can’t reinstate the trade tariffs, unless we want millions of people in China to starve. We can’t tear down all the robots and force factories to use manual labor, unless we want to trigger a national—and then global—economic collapse. We can’t do it their way. So we’re trying to offer them another way, a better way, and they’re refusing to take it. So who here is ignoring the concerns of whom?

Of course, the fact that it’s really their fault doesn’t solve the problem. We do need to take it upon ourselves to do whatever we can, because, regardless of whose fault it is, the world will still suffer if we fail. And that presents us with our most difficult task of all, a task that I fully expect to spend a career trying to do and yet still probably failing: We must understand the human tribal instinct well enough that we can finally begin to change it. We must know enough about how human beings form their mental tribes that we can actually begin to shift those parameters. We must, in other words, cure bigotry—and we must do it now, for we are running out of time.

“The cake is a lie”: The fundamental distortions of inequality

July 13, JDN 2457583

Inequality of wealth and income, especially when it is very large, fundamentally and radically distorts outcomes in a capitalist market. I’ve already alluded to this matter in previous posts on externalities and marginal utility of wealth, but it is so important I think it deserves to have its own post. In many ways this marks a paradigm shift: You can’t think about economics the same way once you realize it is true.

To motivate what I’m getting at, I’ll expand upon an example from a previous post.

Suppose there are only two goods in the world; let’s call them “cake” (K) and “money” (M). Then suppose there are three people, Baker, who makes cakes, Richie, who is very rich, and Hungry, who is very poor. Furthermore, suppose that Baker, Richie and Hungry all have exactly the same utility function, which exhibits diminishing marginal utility in cake and money. To make it more concrete, let’s suppose that this utility function is logarithmic, specifically: U = 10*ln(K+1) + ln(M+1)

The only difference between them is in their initial endowments: Baker starts with 10 cakes, Richie starts with $100,000, and Hungry starts with $10.

Therefore their starting utilities are:

U(B) = 10*ln(10+1)= 23.98

U(R) = ln(100,000+1) = 11.51

U(H) = ln(10+1) = 2.40

Thus, the total happiness is the sum of these: U = 37.89
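As a quick check on the arithmetic, here is a minimal Python sketch of the starting utilities. The names `utility` and `endowments` are just labels chosen for this illustration; the only thing taken from the post is the utility function and the three endowments.

```python
import math

def utility(cake, money):
    """U = 10*ln(K+1) + ln(M+1), the utility function assumed above."""
    return 10 * math.log(cake + 1) + math.log(money + 1)

# Initial endowments as (cakes, dollars)
endowments = {
    "Baker": (10, 0),
    "Richie": (0, 100_000),
    "Hungry": (0, 10),
}

start = {name: utility(k, m) for name, (k, m) in endowments.items()}
total_start = sum(start.values())

for name, u in start.items():
    print(f"U({name}) = {u:.2f}")
print(f"Total = {total_start:.2f}")
```

Rounded to two decimals, this should reproduce the figures listed above.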

Now let’s ask two very simple questions:

1. What redistribution would maximize overall happiness?
2. What redistribution will actually occur if the three agents trade rationally?

If multiple agents have the same diminishing marginal utility function, it’s actually a simple and deep theorem that the total will be maximized if they split the wealth exactly evenly. In the following blockquote I’ll prove the simplest case, which is two agents and one good; it’s an incredibly elegant proof:

Given: for all x, f(x) > 0, f'(x) > 0, f''(x) < 0.

Maximize: f(x) + f(A-x) for fixed A

f'(x) – f'(A – x) = 0

f'(x) = f'(A – x)

Since f''(x) < 0, this is a maximum.

Since f''(x) < 0, f' is strictly decreasing; therefore f' is injective.

x = A – x
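The proof can also be sanity-checked numerically. The sketch below assumes one particular concave function, f(x) = ln(x+1), and a fixed total A = 10 (both are my choices for illustration, not part of the proof), then searches a grid of possible splits for the one with the highest total.

```python
import math

def f(x):
    """An example utility with f' > 0 and f'' < 0 (assumed for illustration)."""
    return math.log(x + 1)

A = 10.0       # total amount to be divided between the two agents
steps = 1000   # grid resolution

# Candidate allocations x for agent 1; agent 2 receives A - x
grid = [A * i / steps for i in range(steps + 1)]
best_x = max(grid, key=lambda x: f(x) + f(A - x))

print(f"Best split found: x = {best_x}, A - x = {A - best_x}")
```

On this grid the best split comes out at x = A/2, as the proof says it should. Any other increasing, strictly concave f could be swapped in.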


This can be generalized to any number of agents, and for multiple goods. Thus, in this case overall happiness is maximized if the cakes and money are both evenly distributed, so that each person gets 3 1/3 cakes and $33,336.67.

The total utility in that case is:

3 * (10 ln(10/3+1) + ln(33,336.67+1)) = 3 * (14.66 + 10.414) = 3 * 25.074 = 75.22
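Here is a small Python sketch of the equal-split calculation, using the same utility function as above. The variable names are my own, and the result may differ from the hand calculation in the last decimal place because of rounding in the intermediate steps.

```python
import math

def utility(cake, money):
    """U = 10*ln(K+1) + ln(M+1), as defined earlier in the post."""
    return 10 * math.log(cake + 1) + math.log(money + 1)

total_cake = 10                 # Baker's cakes
total_money = 100_000 + 10      # Richie's money plus Hungry's money
n = 3                           # number of people

per_person = utility(total_cake / n, total_money / n)
total_equal = n * per_person

print(f"Per person: {per_person:.3f}")
print(f"Total:      {total_equal:.2f}")
```

The total should land at roughly 75.2, within rounding error of the figure above.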

That’s considerably better than our initial distribution (almost twice as good). Now, how close do we get by rational trade?

Each person is willing to trade up until the point where their marginal utility of cake is equal to their marginal utility of money. The price of cake will be set by the respective marginal utilities.

In particular, let’s look at the trade that will occur between Baker and Richie. They will trade until their marginal rate of substitution is the same.

The actual algebra involved is obnoxious (if you’re really curious, here are some solved exercises of similar trade problems), so let’s just skip to the end. (I rushed through, so I’m not actually totally sure I got it right, but to make my point the precise numbers aren’t important.)

Basically what happens is that Richie pays an exorbitant price of $10,000 per cake, buying half the cakes with half of his money.

Baker’s new utility and Richie’s new utility are thus the same:

U(R) = U(B) = 10*ln(5+1) + ln(50,000+1) = 17.92 + 10.82 = 28.74

What about Hungry? Yeah, well, he doesn’t have $10,000. If cakes are infinitely divisible, he can buy up to 1/1000 of a cake. But it turns out that even that isn’t worth doing (it would cost too much for what he gains from it), so he may as well buy nothing, and his utility remains 2.40.
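To see why buying even a sliver of cake is not worth it for Hungry, here is a rough Python sketch. It assumes the $10,000 price from the trade above and simply tries spending 1%, 2%, and so on up to 100% of his $10 on cake; the names `PRICE`, `BUDGET` and `candidates` are hypothetical labels for this example.

```python
import math

def utility(cake, money):
    """U = 10*ln(K+1) + ln(M+1), as defined earlier in the post."""
    return 10 * math.log(cake + 1) + math.log(money + 1)

PRICE = 10_000   # dollars per cake, taken from the Baker-Richie trade above
BUDGET = 10      # Hungry's entire endowment in dollars

u_no_trade = utility(0, BUDGET)

# Utility if Hungry spends 1%, 2%, ..., 100% of his budget on cake
candidates = []
for pct in range(1, 101):
    spend = BUDGET * pct / 100
    candidates.append((spend, utility(spend / PRICE, BUDGET - spend)))

best_spend, best_u = max(candidates, key=lambda c: c[1])

print(f"No trade:      U = {u_no_trade:.4f}")
print(f"Best purchase: U = {best_u:.4f} (spending ${best_spend:.2f})")
```

Every purchase leaves him below his no-trade utility, and the least bad option is the smallest purchase tried, which is another way of saying he should buy nothing.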

Hungry wanted cake just as much as Richie, and because Richie has so much more, Hungry would have gotten more happiness from each new bite. Neoclassical economists promised him that markets were efficient and optimal, and so he thought he’d get the cake he needs—but the cake is a lie.

The total utility is therefore:

U = U(B) + U(R) + U(H)

U = 28.74 + 28.74 + 2.40

U = 59.88

Note three things about this result: First, it is more than where we started at 37.89—trade increases utility. Second, both Richie and Baker are better off than they were—trade is Pareto-improving. Third, the total is less than the optimal value of 75.22—trade is not utility-maximizing in the presence of inequality. This is a general theorem that I could prove formally, if I wanted to bore and confuse all my readers. (Perhaps someday I will try to publish a paper doing that.)
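The three observations can be checked with a short Python sketch. It takes the post-trade allocation described above as given (Baker and Richie each holding 5 cakes and $50,000, Hungry unchanged) rather than solving for the equilibrium; the variable names are mine.

```python
import math

def utility(cake, money):
    """U = 10*ln(K+1) + ln(M+1), as defined earlier in the post."""
    return 10 * math.log(cake + 1) + math.log(money + 1)

# Before trade
u_baker_start = utility(10, 0)
u_richie_start = utility(0, 100_000)
u_hungry = utility(0, 10)
total_start = u_baker_start + u_richie_start + u_hungry

# After trade (allocation assumed from the text, not derived here)
u_trader = utility(5, 50_000)
total_trade = 2 * u_trader + u_hungry

# Equal split of all cake and money among the three
total_equal = 3 * utility(10 / 3, 100_010 / 3)

print(f"Start: {total_start:.2f}  Trade: {total_trade:.2f}  Equal: {total_equal:.2f}")
print("Trade raises total utility:", total_trade > total_start)
print("Baker and Richie both gain:", u_trader > u_baker_start and u_trader > u_richie_start)
print("Trade falls short of equal split:", total_trade < total_equal)
```

The totals may differ from the hand-rounded figures in the last decimal place, but the ordering of the three scenarios is what matters.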

This result is incredibly radical—it basically goes against the core of neoclassical welfare theory, or at least of all its applications to real-world policy—so let me be absolutely clear about what I’m saying, and what assumptions I had to make to get there.

I am saying that if people start with different amounts of wealth, the trades they would willingly engage in, acting purely in their own self-interest, would not maximize the total happiness of the population. Redistribution of wealth toward equality would increase total happiness.

First, I had to assume that we could simply redistribute goods however we like without affecting the total amount of goods. This is wildly unrealistic, which is why I’m not actually saying we should reduce inequality to zero (as would follow if you took this result completely literally). Ironically, this is an assumption that most neoclassical welfare theory agrees with—the Second Welfare Theorem only makes any sense in a world where wealth can be magically redistributed between people without any harmful economic effects. If you weaken this assumption, what you find is basically that we should redistribute wealth toward equality, but beware of the tradeoff between too much redistribution and too little.

Second, I had to assume that there’s such a thing as “utility”—specifically, interpersonally comparable cardinal utility. In other words, I had to assume that there’s some way of measuring how much happiness each person has, and meaningfully comparing them so that I can say whether taking something from one person and giving it to someone else is good or bad in any given circumstance.

This is the assumption neoclassical welfare theory generally does not accept; instead they use ordinal utility, on which we can only say whether things are better or worse, but never by how much. Thus, their only way of determining whether a situation is better or worse is Pareto efficiency, which I discussed in a post a couple years ago. The change from the situation where Baker and Richie trade and Hungry is left in the lurch to the situation where all share cake and money equally in socialist utopia is not a Pareto-improvement. Richie and Baker are slightly worse off with 25.07 utilons in the latter scenario, while they had 28.74 utilons in the former.

Third, I had to assume selfishness—which is again fairly unrealistic, but again not something neoclassical theory disagrees with. If you weaken this assumption and say that people are at least partially altruistic, you can get the result where instead of buying things for themselves, people donate money to help others out, and eventually the whole system achieves optimal utility by willful actions. (It depends just how altruistic people are, as well as how unequal the initial endowments are.) This actually is basically what I’m trying to make happen in the real world—I want to show people that markets won’t do it on their own, but we have the chance to do it ourselves. But even then, it would go a lot faster if we used the power of government instead of waiting on private donations.

Also, I’m ignoring externalities, which are a different type of market failure which in no way conflicts with this type of failure. Indeed, there are three basic functions of government in my view: One is to maintain security. The second is to cancel externalities. The third is to redistribute wealth. The DOD, the EPA, and the SSA, basically. One could also add macroeconomic stability as a fourth core function—the Fed.

One way to escape my theorem would be to deny interpersonally comparable utility, but this makes measuring welfare in any way (including the usual methods of consumer surplus and GDP) meaningless, and furthermore results in the ridiculous claim that we have no way of being sure whether Bill Gates is happier than a child starving and dying of malaria in Burkina Faso, because they are two different people and we can’t compare different people. Far more reasonable is not to believe in cardinal utility, meaning that we can say an extra dollar makes you better off, but we can’t put a number on how much.

And indeed, the difficulty of even finding a unit of measure for utility would seem to support this view: Should I use QALY? DALY? A Likert scale from 0 to 10? There is no known measure of utility that is without serious flaws and limitations.

But it’s important to understand just how strong your denial of cardinal utility needs to be in order for this theorem to fail. It’s not enough that we can’t measure precisely; it’s not even enough that we can’t measure with current knowledge and technology. It must be fundamentally impossible to measure. It must be literally meaningless to say that taking a dollar from Bill Gates and giving it to the starving Burkinabe would do more good than harm, as if you were asserting that triangles are greener than schadenfreude.

Indeed, the whole project of welfare theory doesn’t make a whole lot of sense if all you have to work with is ordinal utility. Yes, in principle there are policy changes that could make absolutely everyone better off, or make some better off while harming absolutely no one; and the Pareto criterion can indeed tell you that those would be good things to do.

But in reality, such policies almost never exist. In the real world, almost anything you do is going to harm someone. The Nuremberg trials harmed Nazi war criminals. The invention of the automobile harmed horse trainers. The discovery of scientific medicine took jobs away from witch doctors. Inversely, almost any policy is going to benefit someone. The Great Leap Forward was a pretty good deal for Mao. The purges advanced the self-interest of Stalin. Slavery was profitable for plantation owners. So if you can only evaluate policy outcomes based on the Pareto criterion, you are literally committed to saying that there is no difference in welfare between the Great Leap Forward and the invention of the polio vaccine.

One way around it (that might actually be a good kludge for now, until we get better at measuring utility) is to broaden the Pareto criterion: We could use a majoritarian criterion, where you care about the number of people benefited versus harmed, without worrying about magnitudes—but this can lead to Tyranny of the Majority. Or you could use the Difference Principle developed by Rawls: find an ordering where we can say that some people are better or worse off than others, and then make the system so that the worst-off people are benefited as much as possible. I can think of a few cases where I wouldn’t want to apply this criterion (essentially they are circumstances where autonomy and consent are vital), but in general it’s a very good approach.

Neither of these depends upon cardinal utility, so have you escaped my theorem? Well, no, actually. You’ve weakened it, to be sure—it is no longer a statement about the fundamental impossibility of welfare-maximizing markets. But applied to the real world, people in Third World poverty are obviously the worst off, and therefore worthy of our help by the Difference Principle; and there are an awful lot of them and very few billionaires, so majority rule says take from the billionaires. The basic conclusion that it is a moral imperative to dramatically reduce global inequality remains—as does the realization that the “efficiency” and “optimality” of unregulated capitalism is a chimera.

Two terms in marginal utility of wealth

JDN 2457569

This post is going to be a little wonkier than most; I’m actually trying to sort out my thoughts and draw some public comment on a theory that has been dancing around my head for a while. The original idea of separating terms in marginal utility of wealth was actually suggested by my boyfriend, and from there I’ve been trying to give it some more mathematical precision to see if I can come up with a way to test it experimentally. My thinking is also influenced by a paper Miles Kimball wrote about the distinction between happiness and utility.

There are lots of ways one could conceivably spend money—everything from watching football games to buying refrigerators to building museums to inventing vaccines. But insofar as we are rational (and we are after all about 90% rational), we’re going to try to spend our money in such a way that its marginal utility is approximately equal across various activities. You’ll buy one refrigerator, maybe two, but not seven, because the marginal utility of refrigerators drops off pretty fast; instead you’ll spend that money elsewhere. You probably won’t buy a house that’s twice as large if it means you can’t afford groceries anymore. I don’t think our spending is truly optimal at maximizing utility, but I think it’s fairly good.

Therefore, it doesn’t make much sense to break down marginal utility of wealth into all these different categories—cars, refrigerators, football games, shoes, and so on—because we already do a fairly good job of equalizing marginal utility across all those different categories. I could see breaking it down into a few specific categories, such as food, housing, transportation, medicine, and entertainment (and this definitely seems useful for making your own household budget); but even then, I don’t get the impression that most people routinely spend too much on one of these categories and not enough on the others.

However, I can think of two quite different fundamental motives behind spending money, which I think are distinct enough to be worth separating.

One way to spend money is on yourself, raising your own standard of living, making yourself more comfortable. This would include both football games and refrigerators, really anything that makes your life better. We could call this the consumption motive, or maybe simply the self-directed motive.

The other way is to spend it on other people, which, depending on your personality, can take the form either of philanthropy to help others or of self-aggrandizement to raise your own relative status. It’s also possible to do both at the same time in various combinations; while the Gates Foundation is almost entirely philanthropic and Trump Tower is almost entirely self-aggrandizing, Carnegie Hall falls somewhere in between, being at once a significant contribution to our society and an obvious attempt by Carnegie to bring praise and adulation to himself. I would also include in this category spending on Veblen goods that are mainly meant to show off your own wealth and status. We can call this spending the philanthropic/status motive, or simply the other-directed motive.

There is some spending which combines both motives: A car is surely useful, but a Ferrari is mainly for show—but then, a Lexus or a BMW could be either to show off or really because you like the car better. Some form of housing is a basic human need, and bigger, fancier houses are often better, but the main reason one builds mansions in Beverly Hills is to demonstrate to the world that one is fabulously rich. This complicates the theory somewhat, but basically I think the best approach is to try to separate a sort of “spending proportion” on such goods, so that say $20,000 of the Lexus is for usefulness and $15,000 is for show. Empirically this might be hard to do, but theoretically it makes sense.

One of the central mysteries in cognitive economics right now is that self-reported happiness rises very little, if at all, as income increases (a finding which was recently replicated even in poor countries where we might not expect it to be true), while self-reported satisfaction nonetheless continues to rise indefinitely. A number of theories have been proposed to explain this apparent paradox.

This model might just be able to account for that, if by “happiness” we’re really talking about the self-directed motive, and by “satisfaction” we’re talking about the other-directed motive. Self-reported happiness seems to obey a rule that $100 is worth as much to someone with $10,000 as $25 is to someone with $5,000, or $400 to someone with $20,000.

Self-reported satisfaction seems to obey a different rule, such that each unit of additional satisfaction requires a roughly equal proportional increase in income.

By having a utility function with two terms, we can account for both of these effects. Total utility will be u(x), happiness h(x), and satisfaction s(x).

u(x) = h(x) + s(x)

To obey the above rule, happiness must obey harmonic utility, like this, for some constants h0 and r:

h(x) = h0 – r/x

Proof of this is straightforward, though to keep it simple I’ve hand-waved why it’s a power law:


h'(2x) = 1/4 h'(x)


h'(x) = r x^n

h'(2x) = r (2x)^n

r (2x)^n = 1/4 r x^n

n = -2

h'(x) = r/x^2

h(x) = – r x^(-1) + C

h(x) = h0 – r/x
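A quick numerical check that this form really does obey the rule stated earlier ($100 at $10,000 is worth the same as $25 at $5,000 or $400 at $20,000). This sketch approximates the value of a small sum as the sum times the marginal utility h'(x) = r/x^2, with r = 1 chosen arbitrarily; the function name `h_marginal` is my own.

```python
import math

def h_marginal(x, r=1.0):
    """Marginal harmonic utility h'(x) = r / x^2."""
    return r / x**2

# (extra dollars, current wealth) pairs from the rule above
cases = [(100, 10_000), (25, 5_000), (400, 20_000)]

# First-order approximation: value of a small sum = sum * marginal utility
gains = [amount * h_marginal(wealth) for amount, wealth in cases]

for (amount, wealth), gain in zip(cases, gains):
    print(f"${amount} at ${wealth}: {gain:.3e}")
```

All three gains come out equal up to floating-point error, which is the defining property: doubling wealth cuts marginal utility to one quarter.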

Miles Kimball also has some more discussion on his blog about how a utility function of this form works. (His statement about redistribution at the end is kind of baffling though; sure, dollar for dollar, redistributing wealth from the middle class to the poor would produce a higher gain in utility than redistributing wealth from the rich to the middle class. But neither is as good as redistributing from the rich to the poor, and the rich have a lot more dollars to redistribute.)

Satisfaction, however, must obey logarithmic utility, like this, for some constants s0 and k:

s(x) = s0 + k ln(x)

(Below I replace x with x+1. That means it takes slightly less proportionally to have the same effect as your wealth increases, but it allows the function to be equal to s0 at x=0 instead of going to negative infinity.)

Proof of this is very simple, almost trivial:


s'(x) = k/x

s(x) = k ln(x) + s0

Both of these functions actually have a serious problem that as x approaches zero, they go to negative infinity. For self-directed utility this almost makes sense (if your real consumption goes to zero, you die), but it makes no sense at all for other-directed utility, and since there are causes most of us would willingly die for, the disutility of dying should be large, but not infinite.

Therefore I think it’s probably better to use x +1 in place of x:

h(x) = h0 – r/(x+1)

s(x) = s0 + k ln(x+1)

This makes s0 the baseline satisfaction of having no other-directed spending, though the baseline happiness of zero self-directed spending is actually h0 – r rather than just h0. If we want it to be h0, we could use this form instead:

h(x) = h0 + r x/(x+1)

This looks quite different, but actually only differs by a constant.

Therefore, my final answer for the utility of wealth (or possibly income, or spending? I’m not sure which interpretation is best just yet) is actually this:

u(x) = h(x) + s(x)

h(x) = h0 + r x/(x+1)

s(x) = s0 + k ln(x+1)

Marginal utility is then the derivatives of these:

h'(x) = r/(x+1)^2

s'(x) = k/(x+1)
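These derivatives, and the claim that the two forms of h(x) differ only by a constant, can be checked with a small Python sketch. The constants are set to h0 = s0 = 0 and r = k = 1 purely for illustration, and `numeric_derivative` is a helper I made up for this check (a plain central difference).

```python
import math

def h(x, h0=0.0, r=1.0):
    """Self-directed utility in the form h0 + r*x/(x+1)."""
    return h0 + r * x / (x + 1)

def h_alt(x, h0=0.0, r=1.0):
    """The earlier form h0 - r/(x+1)."""
    return h0 - r / (x + 1)

def s(x, s0=0.0, k=1.0):
    """Other-directed utility s0 + k*ln(x+1)."""
    return s0 + k * math.log(x + 1)

def h_prime(x, r=1.0):
    return r / (x + 1) ** 2

def s_prime(x, k=1.0):
    return k / (x + 1)

def numeric_derivative(func, x, eps=1e-6):
    """Central-difference approximation to func'(x)."""
    return (func(x + eps) - func(x - eps)) / (2 * eps)

for x in [0.5, 1.0, 10.0, 100.0]:
    print(x,
          h(x) - h_alt(x),                     # should be the constant r
          numeric_derivative(h, x), h_prime(x),
          numeric_derivative(s, x), s_prime(x))
```

The difference h(x) - h_alt(x) stays fixed at r for every x, and the finite-difference slopes agree with the closed-form derivatives to several digits.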

Let’s assign some values to the constants so that we can actually graph these.

Let h0 = s0 = 0, so our baseline is just zero.

Furthermore, let r = k = 1, which would mean that the value of $1 is the same whether spent either on yourself or on others, if $1 is all you have. (This is probably wrong, actually, but it’s the simplest to start with. Shortly I’ll discuss what happens as you vary the ratio k/r.)

Here is the result graphed on a linear scale:


And now, graphed with wealth on a logarithmic scale:


As you can see, self-directed marginal utility drops off much faster than other-directed marginal utility, so the amount you spend on others relative to yourself rapidly increases as your wealth increases. If that doesn’t sound right, remember that I’m including Veblen goods as “other-directed”; when you buy a Ferrari, it’s not really for yourself. While proportional rates of charitable donation do not increase as wealth increases (it’s actually a U-shaped pattern, largely driven by poor people giving to religious institutions), they probably should (people should really stop giving to religious institutions! Even the good ones aren’t cost-effective, and some are very, very bad.). Furthermore, if you include spending on relative power and status as the other-directed motive, that kind of spending clearly does proportionally increase as wealth increases—gotta keep up with those Joneses.

If r/k = 1, that basically means you value others exactly as much as yourself, which I think is implausible (maybe some extreme altruists do that, and Peter Singer seems to think this would be morally optimal). r/k < 1 would mean you should never spend anything on yourself, which not even Peter Singer believes. I think r/k = 10 is a more reasonable estimate.

For any given value of r/k, there is an optimal ratio of self-directed versus other-directed spending, which can vary based on your total wealth.

Actually deriving what the optimal proportion would be requires a whole lot of algebra in a post that probably already has too much algebra, but the point is, there is one, and it will depend strongly on the ratio r/k, that is, the overall relative importance of self-directed versus other-directed motivation.

Take a look at this graph, which uses r/k = 10.


If you only have 2 to spend, you should spend it entirely on yourself, because up to that point the marginal utility of self-directed spending is always higher. If you have 3 to spend, you should spend most of it on yourself, but a little bit on other people, because after you’ve spent about 2.2 on yourself there is more marginal utility for spending on others than on yourself.
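The "about 2.2" figure can be recovered directly. The first unit spent on others has marginal utility s'(0) = k, so the crossover is where h'(x) = r/(x+1)^2 falls to k, which gives x = sqrt(r/k) - 1. That closed form is my own rearrangement rather than something stated in the post; the sketch below just evaluates it for r/k = 10.

```python
import math

r, k = 10.0, 1.0   # r/k = 10, as in the graph discussed above

def h_prime(x):
    """Marginal utility of self-directed spending."""
    return r / (x + 1) ** 2

def s_prime(y):
    """Marginal utility of other-directed spending."""
    return k / (y + 1)

# Self-directed spending level at which the first unit given to others
# becomes exactly as valuable as the next unit spent on yourself
crossover = math.sqrt(r / k) - 1

print(f"Crossover at x = {crossover:.3f}")
print(f"h'(crossover) = {h_prime(crossover):.3f}, s'(0) = {s_prime(0):.3f}")
```

Below the crossover, h'(x) exceeds s'(0), so every unit goes to yourself; above it, some spending on others starts to pay off.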

If your available wealth is W, you would spend some amount x on yourself, and then W-x on others:

u(x) = h(x) + s(W-x)

u(x) = r x/(x+1) + k ln(W – x + 1)

Then you take the derivative and set it equal to zero to find the local maximum. I’ll spare you the algebra, but this is the result of that optimization:

x = – 1 – r/(2k) + sqrt(r/k) sqrt(2 + W + r/(4k))
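For anyone who would rather not redo the algebra, here is a rough numerical check of that formula. It assumes r/k = 10 and a total wealth of W = 20 (both arbitrary choices for illustration), and compares the closed form against a brute-force grid search over x; the function names are hypothetical.

```python
import math

def optimal_self_spending(W, r=10.0, k=1.0):
    """Closed-form optimum from the formula above."""
    ratio = r / k
    return -1 - ratio / 2 + math.sqrt(ratio) * math.sqrt(2 + W + ratio / 4)

def total_utility(x, W, r=10.0, k=1.0):
    """u(x) = r*x/(x+1) + k*ln(W - x + 1), with h0 = s0 = 0."""
    return r * x / (x + 1) + k * math.log(W - x + 1)

W = 20.0
x_closed = optimal_self_spending(W)

# Brute-force search over x in [0, W] as an independent check
steps = 20_000
x_grid = max((W * i / steps for i in range(steps + 1)),
             key=lambda x: total_utility(x, W))

print(f"Closed form: x = {x_closed:.4f}")
print(f"Grid search: x = {x_grid:.4f}")
```

The two answers agree to within the grid spacing, and at the optimum the marginal utilities r/(x+1)^2 and k/(W-x+1) are equal, as the first-order condition requires.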

As long as k <= r (which more or less means that you care at least as much about yourself as about others—I think this is true of basically everyone) then as long as W > 0 (as long as you have some money to spend) we also have x > 0 (you will spend at least something on yourself).

Below a certain threshold (depending on r/k), the optimal value of x is greater than W, which means that, if possible, you should be receiving donations from other people and spending them on yourself. (Otherwise, just spend everything on yourself). After that, x < W, which means that you should be donating to others. The proportion that you should be donating smoothly increases as W increases, as you can see on this graph (which uses r/k = 10, a figure I find fairly plausible):


While I’m sure no one literally does this calculation, most people do seem to have an intuitive sense that you should donate an increasing proportion of your income to others as your income increases, and similarly that you should pay a higher proportion in taxes. This utility function would justify that—which is something that most proposed utility functions cannot do. In most models there is a hard cutoff where you should donate nothing up to the point where your marginal utility is equal to the marginal utility of donating, and then from that point forward you should donate absolutely everything. Maybe a case can be made for that ethically, but psychologically I think it’s a non-starter.

I’m still not sure exactly how to test this empirically. It’s already quite difficult to get people to answer questions about marginal utility in a way that is meaningful and coherent (people just don’t think about questions like “Which is worth more? $4 to me now or $10 if I had twice as much wealth?” on a regular basis). I’m thinking maybe they could play some sort of game where they have the opportunity to make money at the game, but must perform tasks or bear risks to do so, and can then keep the money or donate it to charity. The biggest problem I see with that is that the amounts would probably be too small to really cover a significant part of anyone’s total wealth, and therefore couldn’t cover much of their marginal utility of wealth function either. (This is actually a big problem with a lot of experiments that use risk aversion to try to tease out marginal utility of wealth.) But maybe with a variety of experimental participants, all of whom we get income figures on?

The difference between price, cost, and value

JDN 2457559

This topic has been on the voting list for my Patreons for several months, but it never quite seems to win the vote. Well, this time it did. I’m glad, because I was tempted to do it anyway.

“Price”, “cost”, and “value”; the words are often used more or less interchangeably, not only by regular people but even by economists. I’ve read papers that talked about “rising labor costs” when what they clearly meant was rising wages—rising labor prices. I’ve read papers that tried to assess the projected “cost” of climate change by using the prices of different commodity futures. And hardly a day goes by that I don’t see a TV commercial listing one (purely theoretical) price, cutting it in half (to the actual price), and saying they’re now giving you “more value”.

As I’ll get to, there are reasons to think they would be approximately the same for some purposes. Indeed, they would be equal, at the margin, in a perfectly efficient market—that may be why so many economists use them this way, because they implicitly or explicitly assume efficient markets. But they are fundamentally different concepts, and it’s dangerous to equate them casually.


Price is exactly what you think it is: The number of dollars you must pay to purchase something. Most of the time when we talk about “cost” or “value” and then give a dollar figure, we’re actually talking about some notion of price.

Generally we speak in terms of nominal prices, which are the usual concept of prices in actual dollars paid, but sometimes we do also speak in terms of real prices, which are relative prices of different things once you’ve adjusted for overall inflation. “Inflation-adjusted price” can be a somewhat counter-intuitive concept; if a good’s (nominal) price rises, but by less than most other prices have risen, its real price has actually fallen.
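To make the real-versus-nominal point concrete, here is a small sketch with made-up numbers: a good whose nominal price rises 10% while the overall price level rises 20%. The function name and all the figures are hypothetical.

```python
def real_price_change(nominal_old, nominal_new, price_level_old, price_level_new):
    """Proportional change in a good's real (inflation-adjusted) price."""
    real_old = nominal_old / price_level_old
    real_new = nominal_new / price_level_new
    return real_new / real_old - 1

# Hypothetical: the good goes from $10 to $11 while a price index goes from 100 to 120.
change = real_price_change(10.0, 11.0, 100.0, 120.0)
print(f"real price change: {change:.1%}")
```

The nominal price went up, but the real price change comes out negative, which is exactly the counter-intuitive case described above.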

You also need to be careful about just what price you’re looking at. When we look at labor prices, for example, we need to consider not only cash wages, but also fringe benefits and other compensation such as stock options. But other than that, prices are fairly straightforward.


Cost is probably not at all what you think it is. The real cost of something has nothing to do with money; saying that a candy bar “costs $2” or a computer “costs $2,000” is at best a somewhat sloppy shorthand and at worst a fundamental distortion of what cost is and why it matters. No, those are prices. The cost of a candy bar is the toil of children in cocoa farms in Cote d’Ivoire. The cost of a computer is the ecological damage and displaced indigenous people caused by coltan mining in Congo.

The cost of something is the harm that it does to human well-being (or for that matter to the well-being of any sentient being). It is not measured in money but in “the sweat of our laborers, the genius of our scientists, the hopes of our children” (to quote Eisenhower, who understood real cost better than most economists). There is also opportunity cost, the real cost we pay not by what we did, but by what we didn’t do—what we could have done instead.

This is important precisely because while costs should always be reduced when possible, prices can in fact be too low—and indeed, artificially low prices of goods due to externalities are probably the leading reason why humanity bears so many excess real costs. If the price of that chocolate bar accurately reflected the suffering of those African children (perhaps by—Gasp! Paying them a fair wage?), and the price of that computer accurately reflected the ecological damage of those coltan mines (a carbon tax, at least?), you might not want to buy them anymore; in which case, you should not have bought them. In fact, as I’ll get to once I discuss value, there is reason to think that even if you would buy them at a price that accurately reflected the dollar value of the real cost to their producers, we would still buy more than we should.

There is a point at which we should still buy things even though people get hurt making them; if you deny this, stop buying literally anything ever again. We don’t like to think about it, but any product we buy did cause some person, in some place, some degree of discomfort or unpleasantness in production. And many quite useful products will in fact cause death to a nonzero number of human beings.

For some products this is only barely true—it’s hard to feel bad for bestselling authors and artists who sell their work for millions. Whatever toil they may put into their work, and whatever their elevated suicide rate (which is clearly endogenous; people aren’t randomly assigned to be writers), they also surely enjoy it a good deal of the time, and even if they didn’t, their work sells for millions. But for many products it is quite obviously true: A certain proportion of roofers, steelworkers, and truck drivers will die doing their jobs. We can either accept that, recognizing that it’s worth it to have roofs, steel, and trucking—and by extension, industrial capitalism, and its whole “babies not dying” thing—or we can give up on the entire project of human civilization, and go back to hunting and gathering; even if we somehow managed to avoid the direct homicide most hunter-gatherers engage in, far more people would simply die of disease or get eaten by predators.

Of course, we should have safety standards; but the benefits of higher safety must be carefully weighed against the potential costs of inefficiency, unemployment, and poverty. Safety regulations can reduce some real costs and increase others, even if they almost always increase prices. A good balance is struck when real cost is minimized, where any additional regulation would increase inefficiency more than it improves safety.

Actually OSHA are unsung heroes for their excellent performance at striking this balance, just as EPA are unsung heroes for their balance in environmental regulations (and that whole cutting crime in half business). If activists are mad at you for not banning everything bad and business owners are mad at you for not letting them do whatever they want, you’re probably doing it right. Would you rather people saved from fires, or fires prevented by good safety procedures? Would you rather murderers imprisoned, or boys who grow up healthy and never become murderers? If an ounce of prevention is worth a pound of cure, why does everyone love firefighters and hate safety regulators? So let me take this opportunity to say thank you, OSHA and EPA, for doing the jobs of firefighters and police way better than they do, and unlike them, never expecting to be lauded for it.

And now back to our regularly scheduled programming. Markets are supposed to reflect costs in prices, which is why it’s not totally nonsensical to say “cost” when you mean “price”; but in fact they aren’t very good at that, for reasons I’ll get to in a moment.


Value is how much something is worth—not to sell it (that’s the price again), but to use it. One of the core principles of economics is that trade is nonzero-sum, because people can exchange goods that they value differently and thereby make everyone better off. They can’t price them differently—the buyer and the seller must agree upon a price to make the trade. But they can value them differently.

To see how this works, let’s look at a very simple toy model, the simplest essence of trade: Alice likes chocolate ice cream, but all she has is a gallon of vanilla ice cream. Bob likes vanilla ice cream, but all he has is a gallon of chocolate ice cream. So Alice and Bob agree to trade their ice cream, and both of them are happier.

We can measure value in “willingness-to-pay” (WTP), the highest price you’d willingly pay for something. That makes value look more like a price; but there are several reasons we must be careful when we do that. The obvious reason is that WTP is going to vary based on overall inflation; since $5 isn’t worth as much in 2016 as it was in 1956, something with a WTP of $5 in 1956 would have a much higher WTP in 2016. The not-so-obvious reason is that money is worth less to you the more you have, so we also need to take into account the effect of wealth, and the marginal utility of wealth. The more money you have, the more money you’ll be willing to pay in order to get the same amount of real benefit. (This actually creates some very serious market distortions in the presence of high income inequality, which I may make the subject of a post or even a paper at some point.) Similarly there is “willingness-to-accept” (WTA), the lowest price you’d willingly accept for it. In theory these should be equal; in practice, WTA is usually slightly higher than WTP in what’s called the endowment effect.

So to make our model a bit more quantitative, we could suppose that Alice values vanilla at $5 per gallon and chocolate at $10 per gallon, while Bob also values vanilla at $5 per gallon but only values chocolate at $4 per gallon. (I’m using these numbers to point out that not all the valuations have to be different for trade to be beneficial, as long as some are.) Therefore, if Alice sells her vanilla ice cream to Bob for $5, both will (just barely) accept that deal; and then Alice can buy chocolate ice cream from Bob for anywhere between $4 and $10 and still make both people better off. Let’s say they agree on $5 for the chocolate as well, so that no net money is exchanged and it is effectively the same as just trading ice cream for ice cream. In that case, Alice has gained $5 in consumer surplus (her WTP of $10 minus the $5 she paid) while Bob has gained $1 in producer surplus (the $5 he received minus his $4 WTA). The total surplus will be $6 no matter what price they choose, which we can compute directly from Alice’s WTP of $10 minus Bob’s WTA of $4. The price ultimately decides how that total surplus is distributed between the two parties, and in the real world it would very likely be the result of which one is the better negotiator.
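The claim that the price only divides the surplus, without changing its total, is easy to check by trying a few prices for the chocolate. This is just a sketch; the function name is mine, and the valuations are the ones from the example.

```python
def surplus_split(buyer_wtp, seller_wta, price):
    """Return (consumer surplus, producer surplus) for one trade at `price`."""
    if not seller_wta <= price <= buyer_wtp:
        raise ValueError("no mutually acceptable trade at this price")
    return buyer_wtp - price, price - seller_wta

# Valuations of the gallon of chocolate, from the example: Alice's WTP and Bob's WTA.
alice_wtp, bob_wta = 10, 4
for price in (4, 5, 7, 10):
    cs, ps = surplus_split(alice_wtp, bob_wta, price)
    print(f"price ${price}: Alice gains ${cs}, Bob gains ${ps}, total ${cs + ps}")
```

Every acceptable price gives the same total of $6; only the split between Alice and Bob moves.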

The enormous cost of our distorted understanding

(See what I did there?) If markets were perfectly efficient, prices would automatically adjust so that, at the margin, value is equal to price is equal to cost. What I mean by “at the margin” might be clearer with an example: Suppose we’re selling apples. How many apples do you decide to buy? Well, the value of each successive apple to you is lower, the more apples you have (the law of diminishing marginal utility, which unlike most “laws” in economics is actually almost always true). At some point, the value of the next apple will be just barely above what you have to pay for it, so you’ll stop there. By a similar argument, the cost of producing apples increases the more apples you produce (the law of diminishing returns, which is a lot less reliable, more like the Pirate Code), and the producers of apples will keep selling them until the price they can get is only just barely larger than the cost of production. Thus, in the theoretical limit of infinitely-divisible apples and perfect rationality, marginal value = price = marginal cost. In such a world, markets are perfectly efficient and they maximize surplus, which is the difference between value and cost.
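The marginal argument can be sketched with a handful of indivisible apples. The value and cost schedules below are invented purely for illustration, as is the price; the point is only that trading until marginal value and marginal cost straddle the price also happens to maximize surplus.

```python
# Hypothetical schedules (dollars per apple): value falls and cost rises with quantity.
marginal_value = [1.00, 0.80, 0.60, 0.40, 0.20]  # buyer's value of the 1st, 2nd, ... apple
marginal_cost = [0.10, 0.20, 0.30, 0.50, 0.80]   # seller's cost of the 1st, 2nd, ... apple

def units_traded_at(price):
    """Apples both sides are willing to trade at a given price."""
    demanded = sum(1 for v in marginal_value if v >= price)
    supplied = sum(1 for c in marginal_cost if c <= price)
    return min(demanded, supplied)

def total_surplus(quantity):
    """Sum of (value - cost) over the apples actually traded."""
    return sum(v - c for v, c in zip(marginal_value[:quantity], marginal_cost[:quantity]))

price = 0.45  # hypothetical market price
q = units_traded_at(price)
best_q = max(range(len(marginal_value) + 1), key=total_surplus)
print(f"traded at ${price:.2f}: {q} apples; surplus-maximizing quantity: {best_q}")
```

With apples this lumpy the three quantities are never exactly equal, only as close as the next whole apple allows, which is the divisibility caveat in action.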

But in the real world of course, none of those assumptions are true. No product is infinitely divisible (though the gasoline in a car is obviously a lot more divisible than the car itself). No one is perfectly rational. And worst of all, we’re not measuring value in the same units. As a result, there is basically no reason to think that markets are optimizing anything; their optimization mechanism is setting two things equal that aren’t measured the same way, like trying to achieve thermal equilibrium by matching the temperature of one thing in Celsius to the temperature of other things in Fahrenheit.

An implicit assumption of the above argument that didn’t even seem worth mentioning was that when I set value equal to price and set price equal to cost, I’m setting value equal to cost; transitive property of equality, right? Wrong. The value is equal to the price, as measured by the buyer. The cost is equal to the price, as measured by the seller.

If the buyer and seller have the same marginal utility of wealth, no problem; they are measuring in the same units. But if not, we convert from utility to money and then back to utility, using a different function to convert each time. In the real world, wealth inequality is massive, so it’s wildly implausible that we all have anything close to the same marginal utility of wealth. Maybe that’s close enough if you restrict yourself to middle-class people in the First World; so when a tutoring client pays me, we might really be getting close to setting marginal value equal to marginal cost. But once you include corporations that are owned by billionaires and people who live on $2 per day, there’s simply no way that those price-to-utility conversions are the same at each end. For Bill Gates, a million dollars is a rounding error. For me, it would buy a house, give me more flexible work options, and keep me out of debt, but not radically change the course of my life. For a child on a cocoa farm in Cote d’Ivoire, it could change her life in ways she can probably not even comprehend.
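To get a feel for how different those conversions can be, here is a rough sketch that assumes logarithmic utility of wealth, u(w) = ln(w). That functional form is my assumption for illustration, and the wealth levels are hypothetical round numbers, not estimates of anyone's actual finances.

```python
import math

def utility_gain(wealth, windfall):
    """Utility gain from a windfall under log utility, u(w) = ln(w) (an assumption)."""
    return math.log((wealth + windfall) / wealth)

# Hypothetical resources in dollars, chosen only to span the range discussed above.
people = {"billionaire": 50e9, "middle class": 100e3, "$2 per day": 730.0}
gains = {name: utility_gain(w, 1e6) for name, w in people.items()}

for name, gain in gains.items():
    print(f"{name:>12}: utility gain from $1 million = {gain:.5f}")
```

Under these assumptions the same million dollars is worth several orders of magnitude more utility at the bottom than at the top, so a price that balances value against cost in dollars need not balance them in utility.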

The market distortions created by this are huge; indeed, most of the fundamental flaws in capitalism as we know it are ultimately traceable to this. Why do Americans throw away enough food to feed all the starving children in Africa? Marginal utility of wealth. Why are Silicon Valley programmers driving the prices for homes in San Francisco higher than most Americans will make in their lifetimes? Marginal utility of wealth. Why are the Koch brothers spending more on this year’s elections than the nominal GDP of the Gambia? Marginal utility of wealth. It’s the sort of pattern that once you see it suddenly seems obvious and undeniable, a paradigm shift a bit like the heliocentric model of the solar system. Forget trade barriers, immigration laws, and taxes; the most important market distortions around the world are all created by wealth inequality. Indeed, the wonder is that markets work as well as they do.

The real challenge is what to do about it, how to reduce this huge inequality of wealth and therefore marginal utility of wealth, without giving up entirely on the undeniable successes of free market capitalism. My hope is that once more people fully appreciate the difference between price, cost, and value, this paradigm shift will be much easier to make; and then perhaps we can all work together to find a solution.

Why it matters that torture is ineffective

JDN 2457531

Like “longest-ever-serving Speaker of the House sexually abuses teenagers” and “NSA spy program is trying to monitor the entire telephone and email system”, the news that the US government systematically tortures suspects is an egregious violation that goes to the highest levels of our government—that for some reason most Americans don’t particularly seem to care about.

The good news is that President Obama signed an executive order in 2009 banning torture domestically, reversing official policy under the Bush Administration, and then better yet in 2014 expanded the order to apply to all US interests worldwide. If this is properly enforced, perhaps our history of hypocrisy will finally be at its end. (Well, not if Trump wins…)

Yet as often seems to happen, there are two extremes in this debate and I think they’re both wrong.

The really disturbing side is “Torture works and we have to use it!” The preferred mode of argumentation for this is the “ticking time bomb scenario”, in which we have some urgent disaster to prevent (such as a nuclear bomb about to go off) and torture is the only way to stop it from happening. Surely then torture is justified? This argument may sound plausible, but as I’ll get to below, this is a lot like saying, “If aliens were attacking from outer space trying to wipe out humanity, nuclear bombs would probably be justified against them; therefore nuclear bombs are always justified and we can use them whenever we want.” If you can’t wait for my explanation, The Atlantic skewers the argument nicely.

Yet the opponents of torture have brought this sort of argument on themselves, by staking out a position so extreme as “It doesn’t matter if torture works! It’s wrong, wrong, wrong!” This kind of simplistic deontological reasoning is very appealing and intuitive to humans, because it casts the world into simple black-and-white categories. To show that this is not a strawman, here are several different people all making this same basic argument, that since torture is illegal and wrong it doesn’t matter if it works and there should be no further debate.

But the truth is, if it really were true that the only way to stop a nuclear bomb from leveling Los Angeles was to torture someone, it would be entirely justified—indeed obligatory—to torture that suspect and stop that nuclear bomb.

The problem with that argument is not just that this is not our usual scenario (though it certainly isn’t); it goes much deeper than that:

That scenario makes no sense. It wouldn’t happen.

To use the example the late Antonin Scalia used from an episode of 24 (perhaps the most egregious Fictional Evidence Fallacy ever committed), if there ever is a nuclear bomb planted in Los Angeles, that would literally be one of the worst things that ever happened in the history of the human race—literally a Holocaust in the blink of an eye. We should be prepared to cause extreme suffering and death in order to prevent it. But not only is that event (fortunately) very unlikely, torture would not help us.

Why? Because torture just doesn’t work that well.

It would be too strong to say that it doesn’t work at all; it’s possible that it could produce some valuable intelligence—though clear examples of such results are amazingly hard to come by. There are some social scientists who have found empirical results showing some effectiveness of torture, however. We can’t say with any certainty that it is completely useless. (For obvious reasons, a randomized controlled experiment in torture is wildly unethical, so none have ever been attempted.) But to justify torture it isn’t enough that it could work sometimes; it has to work vastly better than any other method we have.

And our empirical data is in fact reliable enough to show that that is not the case. Torture often produces unreliable information, as we would expect from the game theory involved—your incentive is to stop the pain, not provide accurate intel; the psychological trauma that torture causes actually distorts memory and reasoning; and as a matter of fact basically all the useful intelligence obtained in the War on Terror was obtained through humane interrogation methods. As interrogation experts agree, torture just isn’t that effective.

In principle, there are four basic cases to consider:

1. Torture is vastly more effective than the best humane interrogation methods.

2. Torture is slightly more effective than the best humane interrogation methods.

3. Torture is as effective as the best humane interrogation methods.

4. Torture is less effective than the best humane interrogation methods.

The evidence points most strongly to case 4, which would make rejecting torture a no-brainer; if it doesn’t even work as well as other methods, it’s absurd to use it. You’re basically kicking puppies at that point—purely sadistic violence that accomplishes nothing. But the data isn’t clear enough for us to rule out case 3 or even case 2. There is only one case we can strictly rule out, and that is case 1.

But it was only in case 1 that torture could ever be justified!

If you’re trying to justify doing something intrinsically horrible, it’s not enough that it has some slight benefit.

People seem to have this bizarre notion that we have only two choices in morality:

Either we are strict deontologists, and wrong actions can never be justified by good outcomes ever, in which case apparently vaccines are morally wrong, because stabbing children with needles is wrong. To be fair, some people seem to actually believe this; but then, some people believe the Earth is less than 10,000 years old.

Or alternatively we are the bizarre strawman concept most people seem to have of utilitarianism, under which any wrong action can be justified by even the slightest good outcome, in which case all you need to do to justify slavery is show that it would lead to a 1% increase in per-capita GDP. Sadly, there honestly do seem to be economists who believe this sort of thing. Here’s one arguing that US chattel slavery was economically efficient, and some of the more extreme arguments for why sweatshops are good can take on this character. Sweatshops may be a necessary evil for the time being, but they are still an evil.

But what utilitarianism actually says (and I consider myself some form of nuanced rule-utilitarian, though actually I sometimes call it “deontological consequentialism” to emphasize that I mean to synthesize the best parts of the two extremes) is not that the ends always justify the means, but that the ends can justify the means—that it can be morally good or even obligatory to do something intrinsically bad (like stabbing children with needles) if it is the best way to accomplish some greater good (like saving them from measles and polio). But the good actually has to be greater, and it has to be the best way to accomplish that good.

To see why this latter proviso is important, consider the real-world ethical issues involved in psychology experiments. The benefits of psychology experiments are already quite large, and poised to grow as the science improves; one day the benefits of cognitive science to humanity may be even larger than the benefits of physics and biology are today. Imagine a world without mood disorders or mental illness of any kind; a world without psychopathy, where everyone is compassionate; a world where everyone is achieving their full potential for happiness and self-actualization. Cognitive science may yet make that world possible—and I haven’t even gotten into its applications in artificial intelligence.

To achieve that world, we will need a great many psychology experiments. But does that mean we can just corral people off the street and throw them into psychology experiments without their consent—or perhaps even their knowledge? That we can do whatever we want in those experiments, as long as it’s scientifically useful? No, it does not. We have ethical standards in psychology experiments for a very good reason, and while those ethical standards do slightly reduce the efficiency of the research process, the reduction is small enough that the moral choice is obviously to retain the ethics committees and accept the slight reduction in research efficiency. Yes, randomly throwing people into psychology experiments might actually be slightly better in purely scientific terms (larger and more random samples)—but it would be terrible in moral terms.

Along similar lines, even if torture works about as well as or even slightly better than other methods, that’s simply not enough to justify it morally. Making a successful interrogation take 16 days instead of 17 simply wouldn’t be enough benefit to justify the psychological trauma to the suspect (and perhaps the interrogator!), the risk of harm to the falsely accused, or the violation of international human rights law. And in fact a number of terrorism suspects were waterboarded for months, so even the idea that it could shorten the interrogation is pretty implausible. If anything, torture seems to make interrogations take longer and give less reliable information—case 4.

A lot of people seem to have this impression that torture is amazingly, wildly effective, that a suspect who won’t crack after hours of humane interrogation can be tortured for just a few minutes and give you all the information you need. This is exactly what we do not find empirically; if he didn’t crack after hours of talk, he won’t crack after hours of torture. If you literally only have 30 minutes to find the nuke in Los Angeles, I’m sorry; you’re not going to find the nuke in Los Angeles. No adversarial interrogation is ever going to be completed that quickly, no matter what technique you use. Evacuate as many people to safe distances or underground shelters as you can in the time you have left.

This is why the “ticking time-bomb” scenario is so ridiculous (and so insidious); that’s simply not how interrogation works. The best methods we have for “rapid” interrogation of hostile suspects take hours or even days, and they are humane—building trust and rapport is the most important step. The goal is to get the suspect to want to give you accurate information.

For the purposes of the thought experiment, okay, you can stipulate that it would work (this is what the Stanford Encyclopedia of Philosophy does). But now all you’ve done is made the thought experiment more distant from the real-world moral question. The closest real-world examples we’ve ever had involved individual crimes, probably too small to justify the torture (as bad as a murdered child is, think about what you’re doing if you let the police torture people). But by the time the terrorism to be prevented is large enough to really be sufficient justification, it (1) hasn’t happened in the real world and (2) surely involves terrorists who are sufficiently ideologically committed that they’ll be able to resist the torture. If such a situation arises, of course we should try to get information from the suspects—but what we try should be our best methods, the ones that work most consistently, not the ones that “feel right” and maybe happen to work on occasion.

Indeed, the best explanation I have for why people use torture at all, given its horrible effects and mediocre effectiveness at best, is that it feels right.

When someone does something terrible (such as an act of terrorism), we rightfully reduce our moral valuation of them relative to everyone else. If you are even tempted to deny this, suppose a terrorist and a random civilian are both inside a burning building and you only have time to save one. Of course you save the civilian and not the terrorist. And that’s still true even if you know that once the terrorist was rescued he’d go to prison and never be a threat to anyone else. He’s just not worth as much.

In the most extreme circumstances, a person can be so terrible that their moral valuation should be effectively zero: If the only person in a burning building is Stalin, I’m not sure you should save him even if you easily could. But it is a grave moral mistake to think that a person’s moral valuation should ever go negative, and I think this is exactly the mistake people make when confronted with someone they truly hate. The federal agents torturing those terrorists didn’t merely think of them as worthless—they thought of them as having negative worth. They felt it was a positive good to harm them. But this is fundamentally wrong; no sentient being has negative worth. Some may be so terrible as to have essentially zero worth; and we are often justified in causing harm to some in order to save others. It would have been entirely justified to kill Stalin (as a matter of fact he died of heart disease and old age), to remove the continued threat he posed; but to torture him would not have made the world a better place, and actually might well have made it worse.

Yet I can see how psychologically it could be useful to have a mechanism in our brains that makes us hate someone so much we view them as having negative worth. It makes it a lot easier to harm them when necessary, makes us feel a lot better about ourselves when we do. The idea that any act of homicide is a tragedy but some of them are necessary tragedies is a lot harder to deal with than the idea that some people are just so evil that killing or even torturing them is intrinsically good. But some of the worst things human beings have ever done ultimately came from that place in our brains—and torture is one of them.

Do we always want to internalize externalities?

JDN 2457437

I often talk about the importance of externalities; there is a full discussion in this earlier post, and one of their important implications, the tragedy of the commons, is covered in another. Briefly, externalities are consequences of actions incurred upon people who did not perform those actions. Anything I do affecting you that you had no say in is an externality.

Usually I’m talking about how we want to internalize externalities, meaning that we set up a system of incentives to make it so that the consequences fall upon the people who chose the actions instead of anyone else. If you pollute a river, you should have to pay to clean it up. If you assault someone, you should serve jail time as punishment. If you invent a new technology, you should be rewarded for it. These are all attempts to internalize externalities.

But today I’m going to push back a little, and ask whether we really always want to internalize externalities. If you think carefully, it’s not hard to come up with scenarios where it actually seems fairer to leave the externality in place, or perhaps reduce it somewhat without eliminating it.

For example, suppose indeed that someone invents a great new technology. To be specific, let’s think about Jonas Salk, inventing the polio vaccine. This vaccine saved the lives of thousands of people and saved millions more from pain and suffering. Its value to society is enormous, and of course Salk deserved to be rewarded for it.

But we did not actually fully internalize the externality. If we had, every family whose child was saved from polio would have had to pay Jonas Salk an amount equal to what they saved on medical treatments as a result, or even an amount somehow equal to the value of their child’s life (imagine how offended people would get if you asked that on a survey!). Those millions of people spared from suffering would need to each pay, at minimum, thousands of dollars to Jonas Salk, making him of course a billionaire.

And indeed this is more or less what would have happened, if he had been willing and able to enforce a patent on the vaccine. The inability of some to pay for the vaccine at its monopoly prices would add some deadweight loss, but even that could be removed if Salk Industries had found a way to offer targeted price vouchers that let them precisely price-discriminate so that every single customer paid exactly what they could afford to pay. If that had happened, we would have fully internalized the externality and therefore maximized economic efficiency.

But doesn’t that sound awful? Doesn’t it sound much worse than what we actually did, where Jonas Salk received a great deal of funding and support from governments and universities, and lived out his life comfortably upper-middle class as a tenured university professor?

Now, perhaps he should have been awarded a Nobel Prize—I take that back, there’s no “perhaps” about it, he definitely should have been awarded a Nobel Prize in Medicine, it’s absurd that he did not—which means that I at least do feel the externality should have been internalized a bit more than it was. But a Nobel Prize is only 10 million SEK, about $1.1 million. That’s about enough to be independently wealthy and live comfortably for the rest of your life; but it’s a small fraction of the roughly $7 billion he could have gotten if he had patented the vaccine. Yet while the possible world in which he wins a Nobel is better than this one, I’m fairly well convinced that the possible world in which he patents the vaccine and becomes a billionaire is considerably worse.

Internalizing externalities makes sense if your goal is to maximize total surplus (a concept I explain further in the linked post), but total surplus is actually a terrible measure of human welfare.

Total surplus counts every dollar of willingness-to-pay exactly the same across different people, regardless of whether they live on $400 per year or $4 billion.

It also takes no account whatsoever of how wealth is distributed. Suppose a new technology adds $10 billion in wealth to the world. As far as total surplus, it makes no difference whether that $10 billion is spread evenly across the entire planet, distributed among a city of a million people, concentrated in a small town of 2,000, or even held entirely in the bank account of a single man.

Particularly a propos of the Salk example, total surplus makes no distinction between these two scenarios: a perfectly-competitive market where everything is sold at a fair price, and a perfectly price-discriminating monopoly, where everything is sold at the very highest possible price each person would be willing to pay.

This is a perfectly-competitive market, where the benefits are split more or less equally (in this case exactly equally, but that need not be true in real life) between sellers and buyers:


This is a perfectly price-discriminating monopoly, where the benefits accrue entirely to the corporation selling the good:


In the former case, the company profits, consumers are better off, everyone is happy. In the latter case, the company reaps all the benefits and everyone else is left exactly as they were. In real terms those are obviously very different outcomes: the former is what we want, and the latter is the cyberpunk dystopia we seem to be hurtling mercilessly toward. But in terms of total surplus, and therefore the kind of “efficiency” that is maximized by internalizing all externalities, they are indistinguishable.
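To make the comparison concrete, here is a minimal sketch with made-up numbers. The linear demand curve (willingness-to-pay of 100 minus quantity) and the linear marginal cost (equal to quantity) are assumptions chosen purely for illustration; they are not taken from the missing figures. The function name `surplus_split` is likewise hypothetical.

```python
# Hypothetical market (numbers are illustrative, not from the post):
# willingness-to-pay is P = 100 - Q, marginal cost is P = Q.

def surplus_split(choke_price=100.0, steps=1000):
    """Return (competitive, discriminating) surplus splits via a midpoint sum."""
    q_star = choke_price / 2        # quantity where willingness-to-pay equals marginal cost
    p_star = choke_price - q_star   # the single market-clearing price
    dq = q_star / steps
    competitive = {"consumers": 0.0, "sellers": 0.0}
    discriminating = {"consumers": 0.0, "sellers": 0.0}
    for i in range(steps):
        q = (i + 0.5) * dq          # midpoint of this slice of buyers
        wtp = choke_price - q       # what this buyer would pay at most
        cost = q                    # what this unit costs to produce
        # One price for everyone: buyers keep (wtp - price), sellers keep (price - cost).
        competitive["consumers"] += (wtp - p_star) * dq
        competitive["sellers"] += (p_star - cost) * dq
        # Perfect price discrimination: each buyer pays exactly their wtp.
        discriminating["sellers"] += (wtp - cost) * dq
    return competitive, discriminating

competitive, discriminating = surplus_split()
for name, split in (("competitive", competitive), ("price-discriminating", discriminating)):
    rounded = {k: round(v, 2) for k, v in split.items()}
    print(name, rounded, "total:", round(sum(split.values()), 2))
```

Under these assumptions both markets produce the same total surplus; the only thing that changes is who ends up holding it.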

In fact (as I hope to publish a paper about at some point), the way willingness-to-pay works, it weights rich people more. Redistributing goods from the poor to the rich will typically increase total surplus.

Here’s an example. Suppose there is a cake, which is sufficiently delicious that it offers 2 milliQALY in utility to whoever consumes it (this is a truly fabulous cake). Suppose there are two people to whom we might give this cake: Richie, who has $10 million in annual income, and Hungry, who has only $1,000 in annual income. How much will each of them be willing to pay?

Well, assuming logarithmic utility of wealth (which is itself probably biasing slightly in favor of the rich), marginal utility of wealth is inversely proportional to income, so the dollar value of a milliQALY scales with income. 1 milliQALY is about $1 to Hungry, so Hungry will be willing to pay $2 for the cake. To Richie, however, 1 milliQALY is about $10,000; so he will be willing to pay a whopping $20,000 for this cake.

What this means is that the cake will almost certainly be sold to Richie; and if we proposed a policy to redistribute the cake from Richie to Hungry, economists would emerge to tell us that we have just reduced total surplus by $19,998 and thereby committed a great sin against economic efficiency. They will cajole us into returning the cake to Richie and thus raising total surplus by $19,998 once more.

This despite the fact that I stipulated that the cake is worth just as much in real terms to Hungry as it is to Richie; the difference is due to their wildly differing marginal utility of wealth.

Indeed, it gets worse, because even if we suppose that the cake is worth much more in real utility to Hungry—because he is in fact hungry—it can still easily turn out that Richie’s willingness-to-pay is substantially higher. Suppose that Hungry actually gets 20 milliQALY out of eating the cake, while Richie still only gets 2 milliQALY. Hungry’s willingness-to-pay is now $20, but Richie is still going to end up with the cake.
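The arithmetic of the cake example can be sketched in a few lines. This assumes logarithmic utility of wealth, so that the dollar value of a milliQALY is proportional to income, and it takes the calibration of about $1 per milliQALY at $1,000 of annual income from the example itself. The function and parameter names are hypothetical.

```python
def willingness_to_pay(benefit_milliqaly, annual_income,
                       dollars_per_milliqaly_at_1000=1.0):
    """Dollar willingness-to-pay for a benefit measured in milliQALY.

    Assumes log utility of wealth: marginal utility of a dollar is
    proportional to 1/income, so the dollar value of one milliQALY is
    proportional to income.
    """
    dollars_per_milliqaly = dollars_per_milliqaly_at_1000 * annual_income / 1000
    return benefit_milliqaly * dollars_per_milliqaly

richie_income = 10_000_000
hungry_income = 1_000

# Same real benefit (2 milliQALY) to both people.
richie_wtp = willingness_to_pay(2, richie_income)
hungry_wtp = willingness_to_pay(2, hungry_income)
surplus_lost_by_redistribution = richie_wtp - hungry_wtp

# Hungry gets ten times the real benefit and is still outbid.
hungry_wtp_when_starving = willingness_to_pay(20, hungry_income)

print(richie_wtp, hungry_wtp, surplus_lost_by_redistribution, hungry_wtp_when_starving)
```

Even when the real benefit to Hungry is ten times larger, the dollar figure that total surplus sees is a thousand times smaller than Richie’s.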

Now, you might be thinking, “Why would Richie pay $20,000, when he can go to another store and get another cake that’s just as good for $20?” Well, he wouldn’t. But in the sense we mean for total surplus, willingness-to-pay isn’t what you’d actually be willing to pay given the actual prices of the goods; it is the absolute maximum price you’d be willing to pay to get that good under any circumstances, which is the marginal utility of the good divided by your marginal utility of wealth. In this sense the cake is “worth” $20,000 to Richie, and “worth” substantially less to Hungry, not because it’s actually worth less in real terms, but simply because Richie has so much more money.

Even economists often equate these two, implicitly assuming that we are spending our money up to the point where our marginal willingness-to-pay is the actual price we choose to pay; but in general our willingness-to-pay is higher than the price if we are willing to buy the good at all. The consumer surplus we get from goods is in fact equal to the difference between willingness-to-pay and actual price paid, summed up over all the goods we have purchased.

Internalizing all externalities would definitely maximize total surplus—but would it actually maximize happiness? Probably not.

If you asked most people what their marginal utility of wealth is, they’d have no idea what you’re talking about. But most people do actually have an intuitive sense that a dollar is worth more to a homeless person than it is to a millionaire, and that’s really all we mean by diminishing marginal utility of wealth.

I think the reason we’re uncomfortable with the idea of Jonas Salk getting $7 billion from selling the polio vaccine, rather than the same number of people getting the polio vaccine and Jonas Salk only getting the $1.1 million from a Nobel Prize, is that we intuitively grasp that after that $1.1 million makes him independently wealthy, the rest of the money is just going to sit in some stock account and continue making even more money, while if we’d let the families keep it they would have put it to much better use raising their children who are now protected from polio. We do want to reward Salk for his great accomplishment, but we don’t see why we should keep throwing cash at him when it could obviously be spent in better ways.

And indeed I think this intuition is correct; great accomplishments—which is to say, large positive externalities—should be rewarded, but not in direct proportion. Maybe there should be some threshold above which we say, “You know what? You’re rich enough now; we can stop giving you money.” Or maybe it should simply damp down very quickly, so that a contribution which is worth $10 billion to the world pays only slightly more than one that is worth $100 million, but a contribution that is worth $100,000 pays considerably more than one which is only worth $10,000.
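One way to picture a reward that “damps down very quickly” is a saturating curve. The sketch below is purely hypothetical: the functional form, the $10 million cap, and the $1 million half-way point are all assumptions invented for illustration, not a proposal made in this post.

```python
def damped_reward(social_value, cap=10_000_000, half_point=1_000_000):
    """Hypothetical saturating reward: rises with social value, never exceeds `cap`.

    `half_point` is the social value at which the reward reaches half the cap.
    Both parameters are made-up illustrative numbers.
    """
    return cap * social_value / (social_value + half_point)

for value in (10_000, 100_000, 100_000_000, 10_000_000_000):
    print(f"worth ${value:,} to the world -> pays ${damped_reward(value):,.0f}")

# How much more does the bigger contribution pay in each pair?
small_ratio = damped_reward(100_000) / damped_reward(10_000)
large_ratio = damped_reward(10_000_000_000) / damped_reward(100_000_000)
print(round(small_ratio, 2), round(large_ratio, 2))
```

With these made-up parameters, going from $10,000 to $100,000 of social value multiplies the reward roughly ninefold, while going from $100 million to $10 billion raises it by about one percent.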

What it ultimately comes down to is that if we make all the benefits accrue to the person who produced them, there aren’t any benefits left for the rest of us. The whole point of Jonas Salk inventing the polio vaccine (or Einstein discovering relativity, or Darwin figuring out natural selection, or any great achievement) is that it will benefit the rest of humanity, preferably on to future generations. If you managed to fully internalize that externality, this would no longer be true; Salk and Einstein and Darwin would have become fabulously wealthy, and then somehow we’d all have to continue paying into their estates or something an amount equal to the benefits we received from their discoveries. (Every time you use your GPS, pay a royalty to the Einsteins. Every time you take a pill, pay a royalty to the Darwins.) At some point we’d probably get fed up and decide we’re no better off with them than without them, which is exactly, by construction, how we should feel if the externality were fully internalized.

Internalizing negative externalities is much less problematic—it’s your mess, clean it up. We don’t want other people to be harmed by your actions, and if we can pull that off that’s fantastic. (In reality, we usually can’t fully internalize negative externalities, but we can at least try.)

But maybe internalizing positive externalities really isn’t so great after all.

What do we mean by “risk”?

JDN 2457118 EDT 20:50.

In an earlier post I talked about how, empirically, expected utility theory can’t explain the fact that we buy both insurance and lottery tickets, and how, normatively, it really doesn’t make a lot of sense to buy lottery tickets precisely because of what expected utility theory says about them.

But today I’d like to talk about one of the major problems with expected utility theory, which I consider one of the major unexplored frontiers of economics: Expected utility theory treats all kinds of risk exactly the same.

In reality there are three kinds of risk: The first is what I’ll call classical risk, which is like the game of roulette; the odds are well-defined and known in advance, and you can play the game a large number of times and average out the results. This is where expected utility theory really shines; if you are dealing with classical risk, expected utility is obviously the way to go and Von Neumann and Morgenstern quite literally proved mathematically that anything else is irrational.

The second is uncertainty, a distinction which was most famously expounded by Frank Knight, an economist at the University of Chicago. (Chicago is a funny place; on the one hand they are a haven for the madness that is Austrian economics; on the other hand they have led the charge in behavioral and cognitive economics. Knight was a perfect fit, because he was a little of both.) Uncertainty is risk under ill-defined or unknown probabilities, where there is no way to play the game twice. Most real-world “risk” is actually uncertainty: Will the People’s Republic of China collapse in the 21st century? How many deaths will global warming cause? Will human beings ever colonize Mars? Is P = NP? None of those questions have known answers, nor can we clearly assign probabilities to them. Either P = NP or not, as a mathematical theorem (or, like the continuum hypothesis, it’s independent of ZFC, the most bizarre possibility of all), and it’s not as if someone is rolling dice to decide how many people global warming will kill.

You can think of this in terms of “possible worlds”, though actually most modal theorists would tell you that we can’t even say that P = NP is possible (nor can we say it isn’t possible!) because, as a necessary statement, it can only be possible if it is actually true; this follows from the S5 axiom of modal logic, and you know what, even I am already bored with that sentence. Clearly there is some sense in which P = NP is possible, and if that’s not what modal logic says then so much the worse for modal logic. I am not a modal realist (not to be confused with a moral realist, which I am); I don’t think that possible worlds are real things out there somewhere. I think possibility is ultimately a statement about ignorance, and since we don’t know that P = NP is false then I contend that it is possible that it is true. Put another way, it would not be obviously irrational to place a bet that P = NP will be proved true by 2100; but if we can’t even say that it is possible, how can that be?

Anyway, that’s the mess that uncertainty puts us in, and almost everything is made of uncertainty. Expected utility theory basically falls apart under uncertainty; it doesn’t even know how to give an answer, let alone one that is correct. In reality what we usually end up doing is waving our hands and trying to assign a probability anyway—because we simply don’t know what else to do.

The third one is not one that’s usually talked about, yet I think it’s quite important; I will call it one-shot risk. The probabilities are known or at least reasonably well approximated, but you only get to play the game once. You can also generalize to few-shot risk, where you can play a small number of times, where “small” is defined relative to the probabilities involved; this is a little vaguer, but basically what I have in mind is that even though you can play more than once, you can’t play enough times to realistically expect the rarest outcomes to occur. Expected utility theory almost works on one-shot and few-shot risk, but you have to be very careful about taking it literally.

I think an example will make things clearer: Playing the lottery is a few-shot risk. You can play the lottery multiple times, yes; potentially hundreds of times in fact. But hundreds of times is nothing compared to the 1 in 400 million chance you have of actually winning. You know that probability; it can be computed exactly from the rules of the game. But nonetheless expected utility theory runs into some serious problems here.

If we were playing a classical risk game, expected utility would obviously be right. So for example, suppose you know that you will live one billion years, and you are offered the chance to play a game (somehow compensating for the mind-boggling levels of inflation, economic growth, transhuman transcendence, and/or total extinction that will occur during that vast expanse of time) in which each year you can have either a guaranteed $40,000 of inflation-adjusted income, or a 99.999,999,75% chance of $39,999 of inflation-adjusted income and a 0.000,000,25% chance of $100 million in inflation-adjusted income. The winnings disappear at the end of the year, along with everything you bought with them, so that each year you start afresh. Should you take the second option? Absolutely not, and expected utility theory explains why; that one or two years where you’ll experience 8 QALY per year isn’t worth dropping from 4.602060 QALY per year to 4.602049 QALY per year for all the rest of the billion years. (Can you even fathom how long that is? From here, one billion years is all the way back to the Mesoproterozoic Era, which we think is when single-celled organisms first began to reproduce sexually. The gain is to be Mitt Romney for a year or two; the loss is the value of a dollar each year over and over again for the entire time that has elapsed since the existence of gamete meiosis.) I think it goes without saying that this whole situation is almost unimaginably bizarre. Yet that is implicitly what we’re assuming when we use expected utility theory to assess whether you should buy lottery tickets.
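The comparison can be checked with a short sketch. It assumes that QALY per year is the base-10 logarithm of annual income; that is an inference from the numbers quoted here ($100 million maps to exactly 8), not something stated explicitly in this post. The variable names are hypothetical.

```python
import math

def qaly_per_year(income):
    # Assumption: utility per year is log10 of inflation-adjusted income.
    return math.log10(income)

p_win = 2.5e-9                      # 0.000,000,25% chance each year
years = 1_000_000_000

safe = qaly_per_year(40_000)        # the guaranteed option
ticket_loses = qaly_per_year(39_999)
ticket_wins = qaly_per_year(100_000_000)
gamble = (1 - p_win) * ticket_loses + p_win * ticket_wins

print(f"guaranteed: {safe:.6f} QALY per year")
print(f"gamble:     {gamble:.6f} QALY per year")
print(f"cost of playing in a losing year: {safe - ticket_loses:.2e}")
print(f"expected gain from the jackpot:   {p_win * (ticket_wins - ticket_loses):.2e}")
print(f"expected winning years in a billion: {p_win * years:.1f}")
```

Under that assumption, what you give up in every losing year is about a thousand times larger than the expected gain from the jackpot, which is why the guaranteed income wins.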

The real situation is more like this: There’s one world you can end up in, and almost certainly will, in which you buy lottery tickets every year and end up with an income of $39,999 instead of $40,000. There is another world, so unlikely as to be barely worth considering, yet not totally impossible, in which you get $100 million and you are completely set for life and able to live however you want for the rest of your life. Averaging over those two worlds is a really weird thing to do; what do we even mean by doing that? You don’t experience one world 0.000,000,25% as much as the other (whereas in the billion-year scenario, that is exactly what you do); you only experience one world or the other.

In fact, it’s worse than this, because if a classical risk game is such that you can play it as many times as you want as quickly as you want, we don’t even need expected utility theory; expected money theory will do. If you can play a game where you have a 50% chance of winning $200,000 and a 50% chance of losing $50,000, which you can play up to once an hour for the next 48 hours, and you will be extended any credit necessary to cover any losses, you’d be insane not to play; your 99.9% confidence interval for wealth at the end of the two days is from $850,000 to $6,350,000. While you may lose money for a while, it is vanishingly unlikely that you will end up losing more than you gain.
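Here is a rough sketch of how such an interval can be computed exactly. It assumes all 48 plays are taken, that each is an independent fair coin flip, and that “wealth” means net winnings starting from zero; the interval is taken symmetrically around 24 wins. The function names are hypothetical.

```python
from math import comb

PLAYS, GAIN, LOSS = 48, 200_000, 50_000

def net_winnings(wins, plays=PLAYS):
    # Every win pays GAIN, every loss costs LOSS.
    return wins * GAIN - (plays - wins) * LOSS

def central_interval(plays=PLAYS, coverage=0.999):
    """Smallest symmetric range of win counts with at least `coverage` probability."""
    total = 2 ** plays                       # number of equally likely sequences
    for half_width in range(plays // 2 + 1):
        lo, hi = plays // 2 - half_width, plays // 2 + half_width
        mass = sum(comb(plays, k) for k in range(lo, hi + 1))
        if mass / total >= coverage:
            return lo, hi

def probability_of_net_loss(plays=PLAYS):
    losing = sum(comb(plays, k) for k in range(plays + 1) if net_winnings(k, plays) < 0)
    return losing / 2 ** plays

lo_wins, hi_wins = central_interval()
print(f"{lo_wins} to {hi_wins} wins: ${net_winnings(lo_wins):,} to ${net_winnings(hi_wins):,}")
print(f"chance of ending up behind: {probability_of_net_loss():.2e}")
```

Under these assumptions the interval runs from 13 to 35 wins, and the chance of finishing behind is less than one in a hundred thousand.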

Yet if you are offered the chance to play this game only once, you probably should not take it, and the reason why then comes back to expected utility. If you have good access to credit you might consider it, because going $50,000 into debt is bad but not unbearably so (I did, going to college) and gaining $200,000 might actually be enough better to justify the risk. Then the effect can be averaged over your lifetime; let’s say you make $50,000 per year over 40 years. Losing $50,000 means making your average income $48,750, while gaining $200,000 means making your average income $55,000; so your QALY per year go from a guaranteed 4.70 to a 50% chance of 4.69 and a 50% chance of 4.74; that raises your expected utility from 4.70 to 4.715.

But if you don’t have good access to credit and your income for this year is $50,000, then losing $50,000 means losing everything you have and living in poverty or even starving to death. The benefits of raising your income by $200,000 this year aren’t nearly great enough to take that chance. Your utility goes from a guaranteed 4.70 to a 50% chance of 5.40 and a 50% chance of zero, for an expected utility of only 2.70.
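The two one-shot cases can be put side by side in a small sketch. It assumes utility is the base-10 logarithm of annual income (inferred from the figures above), treats an income of zero as zero utility as the example does, and takes a win without credit to raise this year’s income to $250,000. The names are hypothetical.

```python
import math

def utility(income):
    # Assumption: log10 of annual income, with zero income scored as zero utility.
    return math.log10(income) if income > 0 else 0.0

INCOME, GAIN, LOSS, WORKING_YEARS = 50_000, 200_000, 50_000, 40

baseline = utility(INCOME)

# With good credit, the win or loss is spread evenly over 40 years of income.
with_credit = 0.5 * utility(INCOME - LOSS / WORKING_YEARS) \
            + 0.5 * utility(INCOME + GAIN / WORKING_YEARS)

# Without credit, the whole win or loss lands on this year's income.
without_credit = 0.5 * utility(INCOME - LOSS) + 0.5 * utility(INCOME + GAIN)

print(f"baseline:       {baseline:.3f}")
print(f"with credit:    {with_credit:.3f}")
print(f"without credit: {without_credit:.3f}")
```

The same bet nudges expected utility up when the loss can be smoothed over a lifetime, and cuts it nearly in half when it cannot.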

So expected utility theory only seems to properly apply if we can play the game enough times that the improbable events are likely to happen a few times, but not so many times that we can be sure our money will approach the average. And that’s assuming we know the odds and we aren’t just stuck with uncertainty.

Unfortunately, I don’t have a good alternative; so far expected utility theory may actually be the best we have. But it remains deeply unsatisfying, and I like to think we’ll one day come up with something better.