On labor theories of value

May 3 JDN 246164

I got into an argument a little while ago with an acquaintance of mine who is an avowed Marxist. He posted something that’s been going around Marxist social media about the “irony” that Marx’s labor theory of value is based on Smith and Ricardo’s labor theories of value (plural; they’re not the same), and thus when defenders of capitalism criticize the labor theory of value, they are in effect betraying their founding figures.

The first point I made in response to this was basically, “Yeah. So?” I think one thing that Marxists—at least this flavor of Marxist; I am prepared to exempt more serious Marxian economists—don’t really understand is that mainstream economists don’t have a founding figure that they worship and consider infallible. There is no inerrant text. I am fully prepared to acknowledge—and did, in fact, in that conversation, acknowledge—that Adam Smith made errors and his labor theory of value was one of them. And quite frankly, any defender of capitalism who worships Milton Friedman or Ayn Rand isn’t a mainstream economist, or is at best a very bad one.

My interlocutor then challenged me to describe these different labor theories of value, and I was foolish enough to take the bait, and then the whole conversation devolved into him playing this smug game of “That’s not what Marx really meant” and “clearly you haven’t read Das Kapital” (even though I have, but I admit it was several years ago; I did call up a PDF copy to refresh my memory during the conversation).

But it got me thinking about labor theories of value, and trying to understand why so many people find them seductive when it really doesn’t take much thought to show that they can’t possibly be right. (This post turned out to be a bit long, but I promise I won’t be as long-winded as Marx.)

So what’s wrong with labor theories of value?

If objects are valued based on the labor put into them, the following four propositions should hold:

  1. A project you spend 100 hours on which ultimately failed and produced nothing useful was extremely valuable.
  2. Everything in the Garden of Eden is worthless, because it doesn’t require labor to access.
  3. If you come up with a cure for cancer in a random stroke of insight, it’s worthless because you didn’t put any labor into it, even though both its utility (the lives it will save) and its price (the money you could make off of it) are surely astronomical.
  4. Increased productivity is worthless, because all it does is make our goods worthless as we get better at making them.

All four of these propositions are clearly preposterous, and yet they all seem to follow directly from the basic concept of valuing things by the labor that goes into them. Mainstream economists eventually realized this, and gave up on labor theories of value in favor of the now-consensus utility theory of value.

To be fair, Marx was no idiot, and he did try to address concerns like these in Das Kapital. (Well, the first three, anyway; I’ll talk about the fourth one in a moment.) But the way he does so is by continually re-defining his terms in contradictory ways, so that by the time you get through the book, you realize he doesn’t even have a labor theory of value. He has many labor theories of value, and he substitutes them ad hoc whenever they seem to yield the conclusions he’s looking for.

For example: Sometimes he says that it’s the actual labor that goes in which matters. Other times that it’s the “usual” or “socially necessary” amount of labor. Other times that it’s the average amount of labor that would be required for this production across the whole economy. These are not the same thing! They yield radically different results in many cases!

Marx tries to distinguish use-value (approximately utility) from exchange-value (approximately price), which is good; those two things are different. It’s very important to distinguish price from value.

But then he doesn’t even use these concepts consistently! At one point, he gives us this absolute howler:

The use-value of the money-commodity becomes two-fold. In addition to its special use-value as a commodity (gold, for instance, serving to stop teeth, to form the raw material of articles of luxury, &c.), it acquires a formal use-value, originating in its specific social function.

– Das Kapital, Volume 1, Chapter 2, p. 63

No, dude. That is exchange-value. That is paradigmatic exchange-value. People mainly want gold because they can sell it at a high price to buy stuff that’s actually useful. If this is use-value, then the distinction between use-value and exchange-value collapses to, well, useless.

I think what Marx is doing here is that he wants use-value to always be higher than exchange-value, so that surplus-value can be the difference between them and always be positive. But gold is a very clear example of a good for which the price greatly exceeds the marginal utility, as you can convince yourself by imagining being stranded alone on a desert island with a crate full of gold. If that crate had contained non-perishable food, or water purification equipment, or tools and materials for building shelter, or best of all, a satellite phone and some solar panels, you’d be overjoyed to have it. Even a crate full of books, plushies, or underwear would have some use to you. (Plushies make better friends even than Wilson!) But gold? You have nothing to do but laugh—or cry—at the cruel irony. (And cash would be the same way, though maybe you could use the linen for something.)

But we actually do have a good explanation for how assets such as gold (and Bitcoin) can have prices far exceeding their marginal utility: expectations. If you expect that you’ll be able to sell an asset for more than you paid for it, you have reason to buy that asset, even if it’s useless to you. And for gold, that’s actually been a pretty smart gamble most of the time (with Bitcoin, it very much depends on when you bought it). This could be a non-stationary equilibrium in rational expectations, or it could just be an ever-replenishing array of Greater Fools; but one way or another, the reason gold has a high price is that people expect it to have an even higher price in the future.

In fact, this seems like a deep flaw in capitalism! Marx could have spent a whole chapter on why gold is stupid and financial markets are basically a casino—he would have beaten out Keynes on that by decades. (If I were going to worship an economist, it would be Keynes. But again, I still don’t think his work is inerrant. Just very, very good.) But instead, Marx accepted that gold is priced the way it should be, and contorted his already-tortured theory of value into accommodating that.

I really don’t know why Marx was so insistent that all goods had to be valued based on labor. Marx actually had a lot of good insights about capitalism, and he wasn’t entirely wrong that capitalism as we know it breeds exploitation and ever-growing inequality. I believe that relatively simple reforms (like antitrust enforcement, co-ops, and progressive taxation) can solve, or at least mitigate, these problems, and allow us to enjoy the fruits of higher productivity that capitalism provides. But I recognize that I could be wrong about that; maybe some more radical change is genuinely needed. Yet this in no way vindicates Marx’s theory of value, which was simply wrongheaded from the start.

Indeed, why was he so insistent about it?

Why not simply give up on it, and adopt a new theory, or state it as an unsolved problem?

I have a hypothesis about that. Let me reprise proposition 4:

  4. Increased productivity is worthless, because all it does is make our goods worthless as we get better at making them.

This proposition is preposterous, as I’ve already said: A technology that allows you to make 100 cars with the same labor previously required to make 1 car does not make cars less useful. It simply makes them available to more people at lower prices, and this is generally a good thing.

But I think that Marx did not regard it as preposterous; in fact, I think he regarded it as true.

Consider this paragraph:

In proportion as capitalist production is developed in a country, in the same proportion do the national intensity and productivity of labour there rise above the international level. The different quantities of commodities of the same kind, produced in different countries in the same working-time, have, therefore, unequal international values, which are expressed in different prices, i.e., in sums of money varying according to international values. The relative value of money will, therefore, be less in the nation with more developed capitalist mode of production than in the nation with less developed. It follows, then, that the nominal wages, the equivalent of labour-power expressed in money, will also be higher in the first nation than in the second; which does not at all prove that this holds also for the real wages, i.e., for the means of subsistence placed at the disposal of the labourer.

– Das Kapital, Volume 1, Chapter 22, p. 394

So he does get one qualitative fact right here: Nominal prices are higher in rich countries, for goods and services that are not traded across international borders. This is why we use purchasing power parity.

But he then goes on to say that real wages aren’t higher in rich countries. This… is just clearly false. By any reasonable measure, real wages are higher in the United States or France than they are in Congo or Haiti.

One can quibble with the particular measure used; I in fact happen to believe that we do overestimate real wages in the US by using the CPI instead of an index that better reflects the price of necessities. But there’s just no plausible way to say that a laborer in Malawi who makes $600 a year is at the same standard of living as a laborer in the US who makes $20,000. They might both be legitimately considered poor; but saying that real wages aren’t better here just isn’t plausible.

And Marx’s views on wages get weirder from there:

But hand-in-hand with the increasing productivity of labour, goes, as we have seen, the cheapening of the labourer, therefore a higher rate of surplus-value, even when the real wages are rising. The latter never rise proportionally to the productive power of labour. The same value in variable capital therefore sets in movement more labour-power, and, therefore, more labour.

– Das Kapital, Volume 1, Chapter 24, p. 421

I’d in particular like to draw your attention to these two clauses: “the cheapening of the labourer, […] even when the real wages are rising.” What in the world does that mean? How can labor simultaneously get cheaper and more expensive? How can I be “cheapened” even as I am better off?

A bit later, he gets close to acknowledging that higher productivity increases value, but he characterizes it in a very strange way:

Labour transmits to its product the value of the means of production consumed by it. On the other hand, the value and mass of the means of production set in motion by a given quantity of labour increase as the labour becomes more productive. Though the same quantity of labour adds always to its products only the same sum of new value, still the old capital value, transmitted by the labour to the products, increases with the growing productivity of labour.

– Das Kapital, Volume 1, Chapter 24, p. 422

So what he seems to be saying here is that the value added from capital is itself denominated in terms of the labor that was used to create that capital. Yet this is a very strange accounting indeed, as I think a simple model will help you see.

Consider a productivity-enhancing technology.

Suppose that, initially, one can make 1 widget per person-hour. So, Marx says, the value of 1 widget is precisely 1 person-hour.

And suppose there are enough laborers to do 20 person-hours of work. Then we make 20 widgets, and we get value equal to 20 person-hours. Okay, seems reasonable so far.

Then, an engineer comes along, spending 100 hours to invent a machine that costs 10 person-hours to build, and can produce 1000 widgets using 10 person-hours of labor.

So the value of that machine, according to Marx as I understand him, is 10+X person-hours, where X is some amortized fraction of the 100 person-hours involved in inventing it. It’s unclear how to do this amortization; what time frame should we be using? Once invented, the machine can be built many times. But I guess we could maybe make sense of it as the patent duration—the price of the machine will surely be higher during the time the patent is still valid, and I guess we could say that is somehow reflected in its value. (Notice how this is already getting pretty weird.)

Now, let’s go ahead and make 1000 widgets with the machine.

We have spent 10 person-hours of labor running the machine, another 10 building it, and we’re supposed to count in X from inventing it in the first place. X ranges somewhere between 0 and 100.

So at the low end, when X=0, these 1000 widgets have only cost us 20 person-hours to make, increasing productivity 50-fold. This is sort of where we expect to end up after the machine goes out of patent and becomes commonplace.

But at the high end, when X=100, these 1000 widgets have cost us 120 person-hours to make, increasing productivity a lesser, but still substantial, 8-fold. This might be where we find ourselves when the very first machine comes online and it’s still an experimental prototype.

Under the utility theory of value (which, again, virtually all mainstream economists, including neoclassical, behavioral, and even Marxian economists, accept), the value of widgets has increased from U(20) to U(1000); exactly what this value is depends on how many consumers there are and what their utility functions are, but several things we can say for sure:

  • This is definitely much higher than before. (Probably more than 10 but less than 50 times higher.)
  • The value is the same regardless of how we account for the person-hours that went into inventing the machine.
  • The cost gets lower over time, as the technology becomes established.
  • Thus the value added should increase over time. (Whether or not profit does depends upon additional factors we haven’t modeled.)

But as Marx seems to be saying here (again, he may say differently elsewhere, but that’s kind of my point; he doesn’t have a coherent theory), we are to value these 1000 widgets as follows:

When the technology is new, X=100, and so the value of the 1000 widgets is 120 person-hours, the labor that went into inventing, producing, and using the machine. So this productivity enhancement has increased value somewhat—a 6-fold increase—but not all that much. And the value of each widget has been radically reduced: It is now only 0.12 person-hours, or about 7 person-minutes.

Yet once the technology becomes established, X=0, and so the value of the 1000 widgets is 20 person-hours, the labor that went into producing and using the machine. So now this productivity enhancement has not increased value at all. The value of each widget has fallen even further: It is now a mere 0.02 person-hours, or just over 1 person-minute.
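The arithmetic in this toy model is easy to check with a quick sketch (Python here; all the numbers are the hypothetical ones from the example above, not empirical data):

```python
# Toy model: valuing 1000 widgets by embodied labor, per the reading
# of Marx discussed above. All figures are the example's made-up numbers.

RUN_HOURS = 10     # labor to operate the machine
BUILD_HOURS = 10   # labor to build the machine
WIDGETS = 1000     # widgets produced per run

def labor_value(invention_share_hours):
    """Total 'labor value' of the widget batch, in person-hours,
    counting some amortized share X of the 100 invention hours."""
    return RUN_HOURS + BUILD_HOURS + invention_share_hours

for X in (100, 0):  # brand-new technology vs. established technology
    total = labor_value(X)
    per_widget_minutes = total / WIDGETS * 60
    productivity_gain = WIDGETS / total  # vs. 1 widget per person-hour by hand
    print(f"X={X:3}: {total} person-hours total, "
          f"{per_widget_minutes:.1f} person-minutes per widget, "
          f"{productivity_gain:.0f}-fold productivity gain")
```

Running this reproduces the figures above: 120 person-hours (about 7 person-minutes per widget, an 8-fold gain) when the technology is new, collapsing to 20 person-hours (about 1 person-minute per widget) once it is established.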

This weird dynamic, where technology increases value temporarily, then brings it back down to exactly what it was before, is clearly not how technology actually works. The value added from new technologies—in terms of utility, what really matters—is permanent and increasing over time.

Yet upon re-reading Marx and reflecting some more on his labor theories of value, I think Marx believed that this is actually what happens.

I think that Marx’s whole account of why the rate of profit must fall (even though it absolutely hasn’t, empirically, and even Marxian economists today recognize there’s no particular theoretical reason it should) is based on this misconception.

I think because he believed that labor is the correct measure of value, the fact that human beings can only do so much labor (which hasn’t really changed much over the millennia) means that standard of living can never really increase, because higher productivity simply translates into stuff becoming more and more worthless.

And I think part of where the confusion comes from is that price does sort of behave this way, at least qualitatively; no doubt a world where widgets can be produced with only 1 minute of labor instead of 60 is one in which widgets are much cheaper to buy. But that doesn’t mean that their value has been correspondingly reduced; they are still just as useful (for whatever widgets do) as they were before, and any decline in marginal value merely comes from diminishing marginal utility as people get more and more of them.

Yet I think Marx didn’t want that result, because it seemed to imply that capitalism could actually make life better, even for workers. (As, empirically, it absolutely did.) He wanted to be able to prove that, despite all appearances, workers have gained absolutely nothing from capitalism and technology, and live just as poorly today as they did in the Middle Ages. And a labor theory of value was just the way to do that, for we only work slightly more hours today than most people did in the Middle Ages (and given the state of Medieval scholarship at the time, Marx may have even thought it was the same). Yet I for one am really a fan of vaccines and flush toilets; I don’t know about you.

He quickly realized many of the problems with this theory, and so he added more and more epicycles to try to correct them; but the result was a theory that wasn’t even coherent. Yet in part because of Marx’s incredibly dense and verbose writing style (note that Volume I of Das Kapital alone runs 547 pages, and there are three volumes), it remained plausible enough to non-experts to catch on, and its very complexity makes it genuinely hard for anyone to understand. So then we can have the argument I had, where even as I clearly demonstrated the deep flaws in the theory, my interlocutor could always insist I hadn’t really understood what Marx was saying, and it was all my failing, not anything wrong with the theory, which is of course inerrant and handed down from On High.

For some people (not all, but some), Marxism really does seem more like a religion than a scientific theory: “I don’t know exactly what it means, but dammit, I know it’s true and you’ll never convince me otherwise.”

Is there a way to make a labor theory of value work?

I’m pretty well convinced that Marx’s labor theory of value is either wrong, or so incoherent as to be not even wrong. (Adam Smith’s and David Ricardo’s theories were coherent, so they were definitely just wrong.)

But could there, somewhere buried in all those hundreds of pages of mind-numbingly dense and self-contradictory text, be a theory worth salvaging?

Can I steelman the labor theory of value?

I’m going to give it a try.

Okay, so clearly it’s not the actual amount of labor used, as that runs afoul of proposition 1 immediately:

  1. A project you spend 100 hours on which ultimately failed and produced nothing useful was extremely valuable.

That’s nonsense, so we’ll rule that theory out.

Okay, maybe we can patch it up by saying it’s the socially necessary amount of labor required; the amount of labor that the most-efficient worker would require. Clearly, if you are spending 100 hours on something useless, you’re not being the most-efficient worker.

This seems to be closer to Marx’s account, but it still runs afoul of propositions 2, 3, and 4:

  2. Everything in the Garden of Eden is worthless, because it doesn’t require labor to access.
  3. If you come up with a cure for cancer in a random stroke of insight, it’s worthless because you didn’t put any labor into it, even though both its utility (the lives it will save) and its price (the money you could make off of it) are surely astronomical.
  4. Increased productivity is worthless, because all it does is make our goods worthless as we get better at making them.

Marx actually seemed to like proposition 4, but we can see that it’s wrong. So this is a problem.

Also, while propositions 2 and 3 may seem like extreme thought experiments, consider the following:

First, “The Garden of Eden” is very much what a Star Trek-style fully automated luxury communism would feel like. Many leftists say that they really would like to see such a world, and I agree with them on this. But on this theory of value, it’s all worthless, because nobody has to work to get anything.

Second, a sudden insight into a miracle cure that ends up becoming cheap and plentiful is pretty much what happened with penicillin and vaccines. Yes, there was some labor involved in making them (and still is), but it was clearly far less than the utility gained from all the improvements in health and lifespan that we have received from these inventions. Valuing these technologies in terms of their labor cost seems to completely miss the point of why they were such miracles.

So is there some other way to make a labor theory of value work?

The best I can come up with is this:

The value of a product is the amount of labor it would take to make that product by hand with pre-historic technology.

This is my attempt at steelmanning the labor theory of value. It does solve propositions 2, 3, and 4:

For 2, the fact that everything is handed to you (perhaps by robots) doesn’t change the fact that making it yourself would be really, really hard.

For 3, it’s much harder to make penicillin by hand than in a factory (though it can be done!), so improved penicillin technology is a gain in value. And every new vial of penicillin is worth the many hours that would have gone into making it by hand.

And for 4, any improvement in labor productivity works exactly how you’d expect: A machine that can do the work of 100 people produces 100 times as much value in goods. (In some ways, this is even more intuitive to most people than the utility theory of value, which predicts an increase, but not a one-to-one increase.)

So, okay, this theory is not preposterous, unlike everything we’ve considered so far.

But it really can’t be Marx’s theory, because he contradicts it very heavily in multiple places, and this theory, unlike his, does not predict that the rate of profit must fall. (Which, again, is good, because it doesn’t.)

Yet even this theory is ultimately unsatisfying, for the following reasons:

  5. Some products literally cannot be made by hand using pre-historic technology. Consider a graphics card or a strong-force microscope. In order to make these things, we had to make tools to make better tools to make even better tools to make still better tools to make yet even better tools to make staggeringly near-flawless tools to make them. Even if you had the complete schematics for all the necessary tools and machines, all the raw materials you needed, and an unlimited supply of labor, I’m not sure you could build a graphics card from scratch within a single lifetime.
  6. While it can account for the value of increased efficiency in producing a given good, it doesn’t seem to be able to account for the value of inventing whole new classes of goods. (Yes, penicillin can be made by hand using pre-historic tools, but nobody did as far as we know, and the value of that invention was absolutely enormous in a way that even this labor theory of value cannot account for.)

These two problems are related: The new products you can make now that you couldn’t before are made possible by a mix of new ideas and an accumulation of better and better tools.

As for proposition 5, I think we might be able to shore up the theory by counting the value of capital accumulation in terms of the labor that would be needed at each level of technology: however many person-hours to make the optical microscope, and then however many person-hours to make lasers, and however many person-hours to make sulfuric acid, and so on and so forth, until you’ve finally added up all the labor that went into producing the things that produced the things that produced the things that produced the things that produced graphics cards.

But as for proposition 6? I think this is just fatal. I don’t think there’s any way for a labor theory of value to not systematically and catastrophically undervalue new discoveries and new inventions.

The whole point of new inventions is that they make new things possible or allow us to do things with far less effort or cost than before. The value they create is in the labor they save. But if they are things we theoretically could have done, just didn’t know how (like penicillin), then there is no value added by the discovery (though at least there can be a lot of value added by the actual production). And if they are things we couldn’t have done until we reached a certain level of technology and capital, the value added seems to all be captured by the production of each new tier of technology, with nothing left to go to the discovery itself.

Maybe there’s still a way to save this theory. But at some point, we have to stop and ask ourselves:

Why?

Why do we even want a labor theory of value, when we already have a utility theory of value?

Maybe it’s the fact that utility is hard to measure precisely, and so the idea of basing our value system on it is uncomfortable? Yet I think this is just a fact of life: The things that really matter are hard to measure precisely.

And it’s not as if we have absolutely no idea: We can tell the difference between happiness and suffering, and we can see how various products and technologies can contribute to happiness and alleviate suffering. (We can also see how some products and technologies can reduce happiness and contribute to suffering! Not all new technologies are good, and some products that are good for their users are bad for other people!)

Indeed, we even have a unit of measurement: The QALY. And for some particular technologies—such as penicillin and vaccines—we actually have a pretty good idea of the number of QALY they’ve added to the world, and it’s enormous.

I’m not even saying Marx was wrong about everything. He had some good ideas, actually. And Marxian economists today do sometimes come up with useful findings that can be integrated into a deeper understanding of political economy.

But he was wrong about some things, and the labor theory of value is one of them.

Bundling the stakes to recalibrate ourselves

Mar 31 JDN 2460402

In a previous post I reflected on how our minds evolved for an environment of immediate return: immediate consequences, a high chance of success, and life-or-death stakes. But the world we live in is one of delayed return: delayed consequences, a low chance of success, and minimal stakes.

We evolved for a world where you need to either jump that ravine right now or you’ll die; but we live in a world where you’ll submit a hundred job applications before finally getting a good offer.

Thus, our anxiety system is miscalibrated for our modern world, and this miscalibration causes us to have deep, chronic anxiety which is pathological, instead of brief, intense anxiety that would protect us from harm.

I had an idea for how we might try to jury-rig this system and recalibrate ourselves:

Bundle the stakes.

Consider job applications.

The obvious way to think about it is to consider each application, and decide whether it’s worth the effort.

Any particular job application in today’s market probably costs you 30 minutes, but you won’t hear back for 2 weeks, and you have maybe a 2% chance of success. But if you fail, all you lost was that 30 minutes. This is the exact opposite of what our brains evolved to handle.

So now suppose you instead think of it in terms of sending 100 job applications.

That will cost you 100 × 30 minutes = 3,000 minutes, or 50 hours. You still won’t hear back for weeks, but you’ve spent weeks, so that won’t feel as strange. And your chances of success after 100 applications are something like 1 − (0.98)^100 ≈ 87%.

Even losing 50 hours over a few weeks is not the disaster that falling down a ravine is. But it still feels a lot more reasonable to be anxious about that than to be anxious about losing 30 minutes.

More importantly, we have radically changed the chances of success.

Each individual application will almost certainly fail, but all 100 together will probably succeed.
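These numbers are easy to reproduce (a quick sketch; the 2% per-application success rate and 30-minute cost are the assumed figures from above):

```python
# Bundling job applications: per-application odds vs. bundled odds.
# The 2% success rate and 30 minutes per application are assumptions
# carried over from the discussion above.

P_SUCCESS = 0.02     # chance any single application leads to an offer
MINUTES_EACH = 30    # time cost per application

def bundle(n):
    """Chance of at least one success among n independent applications,
    and the total time cost in hours."""
    p_at_least_one = 1 - (1 - P_SUCCESS) ** n
    hours = n * MINUTES_EACH / 60
    return p_at_least_one, hours

for n in (1, 100, 150, 200):
    p, hours = bundle(n)
    print(f"{n:3} applications: {hours:6.1f} hours, {p:.0%} chance of an offer")
```

One application is a near-certain failure; 100 of them together are an 87% bet, and 150 or 200 push the odds into the mid-to-high 90s, just as the text says.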

If we were optimally rational, these two methods would lead to the same outcomes, by a rather deep mathematical law, the linearity of expectation:
E[nX] = n E[X]

Thus, the expected utility of doing something n times is precisely n times the expected utility of doing it once (all other things equal); and so, it doesn’t matter which way you look at it.

But of course we aren’t perfectly rational. We don’t actually respond to the expected utility. It’s still not entirely clear how we do assess probability in our minds (prospect theory seems to be onto something, but it’s computationally harder than rational probability, which means it makes absolutely no sense to evolve it).

If instead we are trying to match up our decisions with a much simpler heuristic that evolved for things like jumping over ravines, our representation of probability may be very simple indeed, something like “definitely”, “probably”, “maybe”, “probably not”, “definitely not”. (This is essentially my categorical prospect theory, which, like the stochastic overload model, is a half-baked theory that I haven’t published and at this point probably never will.)
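A minimal sketch of that coarse heuristic might look like this. To be clear, the cutoff probabilities below are entirely invented for illustration; nothing in the (unpublished) categorical prospect theory pins down these exact numbers:

```python
# A toy version of the coarse probability heuristic described above.
# The category thresholds are hypothetical, chosen only to illustrate
# how a five-bucket representation of probability could behave.

def coarse_category(p):
    """Map a probability to one of five everyday judgments."""
    if p >= 0.99:
        return "definitely"
    if p >= 0.60:
        return "probably"
    if p >= 0.40:
        return "maybe"
    if p >= 0.01:
        return "probably not"
    return "definitely not"

print(coarse_category(0.02))  # a single application -> "probably not"
print(coarse_category(0.87))  # a bundle of 100      -> "probably"
```

Whatever the true thresholds are, the point survives: bundling moves the outcome from one bucket to a much more motivating one.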

2% chance of success is solidly “probably not” (or maybe something even stronger, like “almost definitely not”). Then, outcomes that are in that category are presumably weighted pretty low, because they generally don’t happen. Unless they are really good or really bad, it’s probably safest to ignore them—and in this case, they are neither.

But 87% chance of success is a clear “probably”; and outcomes in that category deserve our attention, even if their stakes aren’t especially high. And in fact, by bundling them, we have even made the stakes a bit higher—likely making the outcome a bit more salient.

The goal is to change “this will never work” to “this is going to work”.

For an individual application, there’s really no way to do that (without self-delusion); maybe you can make the odds a little better than 2%, but you surely can’t make them so high they deserve to go all the way up to “probably”. (At best you might manage a “maybe”, if you’ve got the right contacts or something.)

But for the whole set of 100 applications, this is in fact the correct assessment. It will probably work. And if 100 doesn’t, 150 might; if 150 doesn’t, 200 might. At no point do you need to delude yourself into over-estimating the odds, because the actual odds are in your favor.

This isn’t perfect, though.

There’s a glaring problem with this technique that I still can’t resolve: It feels overwhelming.

Doing one job application is really not that big a deal. It accomplishes very little, but also costs very little.

Doing 100 job applications is an enormous undertaking that will take up most of your time for multiple weeks.

So if you are feeling demotivated, asking you to bundle the stakes is asking you to take on a huge, overwhelming task that surely feels utterly beyond you.

Also, when it comes to this particular example, I even managed to do 100 job applications and still get a pretty bad outcome: My only offer was Edinburgh, and I ended up being miserable there. I have reason to believe that these were exceptional circumstances (due to COVID), but it has still been hard to shake the feeling of helplessness I learned from that ordeal.

Maybe there’s some additional reframing that can help here. If so, I haven’t found it yet.

But maybe stakes bundling can help you, or someone out there, even if it can’t help me.

Against average utilitarianism

Jul 30 JDN 2460156

Content warning: Suicide and suicidal ideation

There are two broad strands of utilitarianism, known as average utilitarianism and total utilitarianism. As forms of utilitarianism, both concern themselves with maximizing happiness and minimizing suffering. And for many types of ethical question, they yield the same results.

Under average utilitarianism, the goal is to maximize the average level of happiness minus suffering: It doesn’t matter how many people there are in the world, only how happy they are.

Under total utilitarianism, the goal is to maximize the total level of happiness minus suffering: Adding another person is a good thing, as long as their life is worth living.

Mathematically, it’s the difference between taking the sum of net happiness (total utilitarianism), and taking that same sum and dividing it by the population (average utilitarianism).
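In code, the two objectives differ by a single division (a minimal sketch, with made-up happiness numbers):

```python
def total_utility(happiness):
    # Total utilitarianism: sum of everyone's net happiness.
    return sum(happiness)

def average_utility(happiness):
    # Average utilitarianism: the same sum, divided by population.
    return sum(happiness) / len(happiness)

# Add a person whose life is worth living, but below the current average:
before = [10, 10]
after = [10, 10, 4]
# Total rises (20 -> 24), so the total utilitarian approves;
# average falls (10 -> 8), so the average utilitarian objects.
```

That one division is where all the disagreements below come from.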

It would make for too long a post to discuss the validity of utilitarianism in general. Overall I will say briefly that I think utilitarianism is basically correct, but there are some particular issues with it that need to be resolved, and usually end up being resolved by heading slightly in the direction of a more deontological ethics—in short, rule utilitarianism.

But for today, I want to focus on the difference between average and total utilitarianism, because average utilitarianism is a very common ethical view despite having appalling, horrifying implications.

Above all: under average utilitarianism, if you are considering suicide, you should probably do it.

Why? Because anyone who is considering suicide is probably of below-average happiness. And average utilitarianism necessarily implies that anyone who expects to be of below-average happiness should be immediately killed as painlessly as possible.

Note that this does not require that your life be one of endless suffering, such that it isn’t even worth going on living. Even a total utilitarian would be willing to commit suicide, if their life is expected to be so full of suffering that it isn’t worth going on.

Indeed, I suspect that most actual suicidal ideation by depressed people takes this form: My life will always be endless suffering. I will never be happy again. My life is worthless.

The problem with such suicidal ideation is not the ethical logic, which is valid: If indeed your existence from this point forward would be nothing but endless suffering, suicide actually makes sense. (Imagine someone who is being held in a dungeon being continually mercilessly tortured with no hope of escape; it doesn’t seem unreasonable for them to take a cyanide pill.) The problem is the prediction, which says that your life from this point forward will be nothing but endless suffering. Most people with depression do, eventually, feel better. They may never be quite as happy overall as people who aren’t depressed, but they do, in fact, have happy times. And most people who considered suicide but didn’t go through with it end up glad that they went on living.

No, an average utilitarian says you should commit suicide as long as your happiness is below average.

We could be living in a glorious utopia, where almost everyone is happy almost all the time, and people are only occasionally annoyed by minor inconveniences—and average utilitarianism would say that if you expect to suffer a more than average rate of such inconveniences, the world would be better off if you ceased to exist.

Moreover, average utilitarianism says that you should commit suicide if your life is expected to get worse—even if it’s still going to be good, adding more years to your life will just bring your average happiness down. If you had a very happy childhood and adulthood is going just sort of okay, you may as well end it now.

Average utilitarianism also implies that we should bomb Third World countries into oblivion, because their people are less happy than ours and thus their deaths will raise the population average.

Are there ways an average utilitarian can respond to these problems? Perhaps. But every response I’ve seen is far too weak to resolve the real problem.

One approach would be to say that the killing itself is bad, or will cause sufficient grief as to offset the loss of the unhappy person. (An average utilitarian is inherently committed to the claim that losing an unhappy person is itself an inherent good. There is something to be offset.)

This might work for the utopia case: The grief from losing someone you love is much worse than even a very large number of minor inconveniences.

It may even work for the case of declining happiness over your lifespan: Presumably some other people would be sad to lose you, even if they agreed that your overall happiness is expected to gradually decline. Then again, if their happiness is also expected to decline… should they, too, shuffle off this mortal coil?

But does it work for the question of bombing? Would most Americans really be so aggrieved at the injustice of bombing Burundi or Somalia to oblivion? Most of them don’t seem particularly aggrieved at the actual bombings of literally dozens of countries—including, by the way, Somalia. Granted, these bombings were ostensibly justified by various humanitarian or geopolitical objectives, but some of those justifications (e.g. Kosovo) seem a lot stronger than others (e.g. Grenada). And quite frankly, I care more about this sort of thing than most people, and I still can’t muster anything like the same kind of grief for random strangers in a foreign country that I feel when a friend or relative dies. Indeed, I can’t muster the same grief for one million random strangers in a foreign country that I feel for one lost loved one. Human grief just doesn’t seem to work that way. Sometimes I wish it did—but then, I’m not quite sure what our lives would be like in such a radically different world.

Moreover, the whole point is that an average utilitarian should consider it an intrinsically good thing to eliminate the existence of unhappy people, as long as it can be done swiftly and painlessly. So why, then, should people be aggrieved at the deaths of millions of innocent strangers they know are mostly unhappy? Under average utilitarianism, the greatest harm of war is the survivors you leave, because they will feel grief—so your job is to make sure you annihilate them as thoroughly as possible, presumably with nuclear weapons. Killing a soldier is bad as long as his family is left alive to mourn him—but if you kill an entire country, that’s good, because their country was unhappy.

Enough about killing and dying. Let’s talk about something happier: Babies.

At least, total utilitarians are happy about babies. When a new person is brought into the world, a total utilitarian considers this a good thing, as long as the baby is expected to have a life worth living and their existence doesn’t harm the rest of the world too much.

I think that fits with most people’s notions of what is good. Generally the response when someone has a baby is “Congratulations!” rather than “I’m sorry”. We see adding another person to the world as generally a good thing.

But under average utilitarianism, babies must reach a much higher standard in order to be a good thing. Your baby only deserves to exist if they will be happier than average.

Granted, this is the average for the whole world, so perhaps First World people can justify the existence of their children by pointing out that unless things go very badly, they should end up happier than the world average. (Then again, if you have a family history of depression….)

But for Third World families, quite the opposite: The baby may well bring joy to all around them, but unless that joy is enough to bring someone above the global average, it would still be better if the baby did not exist. Adding one more person of moderately-low happiness will just bring the world average down.

So in fact, on a global scale, an average utilitarian should always expect that babies are nearly as likely to be bad as they are good, unless we have some reason to think that the next generation would be substantially happier than this one.

And while I’m not aware of anyone who sincerely believes that we should nuke Third World countries for their own good, I have heard people speak this way about population growth in Third World countries: such discussions of “overpopulation” are usually ostensibly about ecological sustainability, even though the ecological impact of First World countries is dramatically higher—and such talk often shades very quickly into eugenics.

Of course, we wouldn’t want to say that having babies is always good, lest we all be compelled to crank out as many babies as possible and genuinely overpopulate the world. But total utilitarianism can solve this problem: It’s worth adding more people to the world unless the harm of adding those additional people is sufficient to offset the benefit of adding another person whose life is worth living.

Moreover, total utilitarianism can say that it would be good to delay adding another person to the world, until the situation is better. Potentially this delay could be quite long: Perhaps it is best for us not to have too many children until we can colonize the stars. For now, let’s just keep our population sustainable while we develop the technology for interstellar travel. If having more children now would increase the risk that we won’t ever manage to colonize distant stars, total utilitarianism would absolutely say we shouldn’t do it.

There’s also a subtler problem here, which is that it may seem good for any particular individual to have more children, but the net result is that the higher total population is harmful. Then what I think is happening is that we are unaware of, or uncertain about, or simply inattentive to, the small harm to many other people caused by adding one new person to the world. Alternatively, we may not be entirely altruistic, and a benefit that accrues to our own family may be taken as greater than a harm that accrues to many other people far away. If we really knew the actual marginal costs and benefits, and we really agreed on that utility function, we would in fact make the right decision. It’s our ignorance or disagreement that makes us fail, not total utilitarianism in principle. In practice, this means coming up with general rules that seem to result in a fair and reasonable outcome, like “families who want to have kids should aim for two or three”—and again we’re at something like rule utilitarianism.

Another case where average utilitarianism seems tempting is in resolving the mere addition paradox.

Consider three possible worlds, A, B, and C:

In world A, there is a population of 1 billion, and everyone is living an utterly happy, utopian life.

In world B, there is a population of 1 billion living in a utopia, and a population of 2 billion living mediocre lives.

In world C, there is a population of 3 billion living good, but not utopian, lives.

The mere addition paradox is that, to many people, world B seems worse than world A, even though all we’ve done is add 2 billion people whose lives are worth living.

Moreover, many people seem to think that the ordering goes like this:

  1. World B is better than world A, because all we’ve done is add more people whose lives are worth living.

  2. World C is better than world B, because it’s fairer, and overall happiness is higher.

  3. World A is better than world C, because everyone is happier, and all we’ve done is reduce the population.

This is intransitive: We have A > C > B > A. Our preferences over worlds are incoherent.

Average utilitarianism resolves this by saying that A > C is true, and C > B is true—but it says that B > A is false. Since average happiness is higher in world A, A > B.

But of course this results in the conclusion that if we are faced with world B, we should do whatever we can to annihilate the 2 billion extra unhappy people, so that we can get to world A. And the whole point of this post is that this is an utterly appalling conclusion we should immediately reject.

What does total utilitarianism say? It says that indeed C > B and B > A, but it denies that A > C. Rather, since there are more people in world C, it’s okay that people aren’t quite as happy.
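With purely illustrative happiness numbers (say utopian = 100, good = 60, mediocre = 30, in arbitrary units), both verdicts can be checked directly:

```python
B = 10**9  # one billion

# Each world is a list of (population, happiness-per-person) groups.
worlds = {
    "A": [(1 * B, 100)],                   # 1 billion utopian
    "B": [(1 * B, 100), (2 * B, 30)],      # 1 billion utopian + 2 billion mediocre
    "C": [(3 * B, 60)],                    # 3 billion good-but-not-utopian
}

def total(groups):
    return sum(n * h for n, h in groups)

def average(groups):
    return total(groups) / sum(n for n, h in groups)

# Total utilitarianism ranks C > B > A, denying that A > C.
# Average utilitarianism ranks A > C > B, denying that B > A.
```

Either way, one leg of the intransitive cycle is rejected; the question is which leg you can live with rejecting.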

Derek Parfit argues that this leads to what he calls the “repugnant conclusion”: If we keep increasing the population by a large amount while decreasing happiness by a small amount, the best possible world ends up being one where population is utterly massive but our lives are only barely worth living.

I do believe that total utilitarianism results in this outcome. I can live with that.

Under average utilitarianism, the best possible world is precisely one person who is immortal and absolutely ecstatic 100% of the time. Adding even one person who is not quite that happy will make things worse.

Under total utilitarianism, adding more people who are still very happy would be good, even if it makes that one ecstatic person a bit less ecstatic. And adding more people would continue to be good, as long as it didn’t bring the average down too quickly.

If you find this conclusion repugnant, as Parfit does, I submit that it is because it is difficult to imagine just how large a population we are talking about. Maybe putting some numbers on it will help.

Let’s say the happiness level of an average person in the world today is 35 quality-adjusted life years—our life expectancy of 70, times an average happiness level of 0.5.

So right now we have a world of 8 billion people at 35 QALY, for a total of 280 billion QALY.

(Note: I’m not addressing inequality here. If you believe that a world where one person has 100 QALY and another has 50 QALY is worse than one where both have 75 QALY, you should adjust your scores accordingly—which mainly serves to make the current world look worse, due to our utterly staggering inequality. In fact, I think I do not believe this—in my view, the problem is not that happiness is unequal, but that staggering inequality of wealth creates much greater suffering among the poor in exchange for very little added happiness among the rich.)

Average utilitarianism says that we should eliminate the less happy people, so we can raise the average QALY higher, maybe to something like 60. I’ve already said why I find this appalling.

So now consider what total utilitarianism asks of us. If we could raise that figure above 280 billion QALY, we should. Say we could increase our population to 10 billion, at the cost of reducing average happiness to 30 QALY; should we? Yes, we should, because that’s 300 billion QALY.

But notice that in this scenario we’re still about 86% as happy as we were. That doesn’t sound so bad. Parfit is worried about a scenario where our lives are barely worth living. So let’s consider what that would require.

“Barely worth living” sounds like maybe 1 QALY. This wouldn’t mean we all live exactly one year; that’s not sustainable, because babies can’t have babies. So it would be more like a life expectancy of 33, with a happiness of 0.03—pretty bad, but still worth living.

In that case, we would need a population of over 280 billion (about 35 Earths’ worth) just to match our current total. Call it 800 billion to make it unambiguously better: we must colonize at least 100 other planets and fill them as full as we’ve filled Earth.
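These figures are worth checking explicitly (all numbers are the assumptions from above; at 1 QALY per person, the break-even population is 280 billion, roughly 35 Earths’ worth, so 800 billion clears the bar comfortably):

```python
current_total = 8e9 * 35    # 8 billion people at 35 QALY each: 280 billion QALY
denser_world = 10e9 * 30    # 10 billion at 30 QALY: 300 billion QALY, an improvement

# How many people at 1 QALY each ("barely worth living") match today's total?
breakeven_pop = current_total / 1.0
earths = breakeven_pop / 8e9   # how many Earth-sized populations that is
```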

In fact, I think this 1 QALY life was something like what human beings had at the dawn of agriculture (which by some estimates was actually worse than ancient hunter-gatherer life; we were sort of forced into early agriculture, rather than choosing it because it was better): Nasty, brutish, and short, but still, worth living.

So, Parfit’s repugnant conclusion is that filling 100 planets with people who live like the ancient Babylonians would be as good as life on Earth is now? I don’t really see how this is obviously horrible. Certainly not to the same degree that saying we should immediately nuke Somalia is obviously horrible.

Moreover, total utilitarianism absolutely still says that if we can make those 800 billion people happier, we should. A world of 800 billion people each getting 35 QALY is 100 times better than the way things are now—and doesn’t that seem right, at least?


Yet if you indeed believe that copying a good world 100 times gives you a 100 times better world, you are basically committed to total utilitarianism.

There are actually other views that would allow you to escape this conclusion without being an average utilitarian.

One way, naturally, is to not be a utilitarian. You could be a deontologist or something. I don’t have time to go into that in this post, so let’s save it for another time. For now, let me say that, historically, utilitarianism has led the charge in positive moral change, from feminism to gay rights, from labor unions to animal welfare. We tend to drag stodgy deontologists kicking and screaming toward a better world. (I vaguely recall an excellent tweet on this, though not who wrote it: “Yes, historically, almost every positive social change has been spearheaded by utilitarians. But sometimes utilitarianism seems to lead to weird conclusions in bizarre thought experiments, and surely that’s more important!”)

Another way, which has gotten surprisingly little attention, is to use an aggregating function that is neither a sum nor an average. For instance, you could add up all utility and divide by the square root of population, so that larger populations get penalized for being larger, but you aren’t simply trying to maximize average happiness. That does seem to still tell some people to die even though their lives were worth living, but at least it doesn’t require us to exterminate all who are below average. And it may also avoid the conclusion Parfit considers repugnant, by making our galactic civilization span 10,000 worlds. Of course, why square root? Why not a cube root, or a logarithm? Maybe the arbitrariness is why it hasn’t been seriously considered. But honestly, I think dividing by anything is suspicious; how can adding someone else who is happy ever make things worse?
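A toy comparison makes the trade-off concrete (the square-root penalty here is the hypothetical middle ground just described, not anyone’s established proposal, and the utility numbers are invented):

```python
import math

def aggregate(utils, penalty):
    # Generalized aggregator: total utility divided by some function of population.
    return sum(utils) / penalty(len(utils))

total = lambda utils: aggregate(utils, lambda n: 1)      # total utilitarianism
average = lambda utils: aggregate(utils, lambda n: n)    # average utilitarianism
rooted = lambda utils: aggregate(utils, math.sqrt)       # square-root penalty

before = [10, 10]
# Adding a moderately happy person (6): average objects, but the root rule approves.
# Adding a barely happy person (3): even the root rule objects,
# though that life is still worth living (only total approves).
```

So the root rule is genuinely intermediate: less murderous than the average, but it still sometimes counts a life worth living as a net negative.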

But if I must admit that a sufficiently large galactic civilization would be better than our current lives, even if everyone there is mostly pretty unhappy? That’s a bullet I’m prepared to bite. At least I’m not saying we should annihilate everyone who is unhappy.

When maximizing utility doesn’t

Jun 4 JDN 2460100

Expected utility theory behaves quite strangely when you consider questions involving mortality.

Nick Beckstead and Teruji Thomas recently published a paper on this: All well-defined utility functions are either reckless in that they make you take crazy risks, or timid in that they tell you not to take even very small risks. It’s starting to make me wonder if utility theory is even the right way to make decisions after all.

Consider a game of Russian roulette where the prize is $1 million. The revolver has 6 chambers, 3 with a bullet. So that’s a 1/2 chance of $1 million, and a 1/2 chance of dying. Should you play?

I think it’s probably a bad idea to play. But the prize does matter; if it were $100 million, or $1 billion, maybe you should play after all. And if it were $10,000, you clearly shouldn’t.

And lest you think there is no chance of dying, however small, that you should be willing to accept for any amount of money, consider this: Do you drive a car? Do you cross the street? Do you do anything that could ever have any risk of shortening your lifespan in exchange for some other gain? I don’t see how you could live a remotely normal life without doing so. It might be a very small risk, but it’s still there.

This raises the question: Suppose we have some utility function over wealth; ln(x) is a quite plausible one. What utility should we assign to dying?


The fact that the prize matters means that we can’t assign death a utility of negative infinity. It must be some finite value.

But suppose we choose some value, -V (so V is positive), for the utility of dying, and normalize your current utility to zero. Then we can find some amount of money that will make you willing to play: you’ll accept whenever ½ ln(x) − ½ V > 0, that is, whenever ln(x) > V, or x > e^(V).

Now, suppose that you have the chance to play this game over and over again. Your marginal utility of wealth will change each time you win, so we may need to increase the prize to keep you playing; but we could do that. The prizes could keep scaling up as needed to make you willing to play. So then, you will keep playing, over and over—and then, sooner or later, you’ll die. So, at each step you maximized utility—but at the end, you didn’t get any utility.
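A quick calculation shows the trap (V = 10 is an arbitrary assumed disutility of death, with current utility normalized to zero, and the prize is rescaled each round so that its utility stays just above the acceptance threshold):

```python
V = 10.0                  # assumed (finite) disutility of dying
prize_utility = 1.1 * V   # prize scaled each round so its utility stays at 1.1*V

# Each round: 1/2 chance of the prize, 1/2 chance of death.
eu_per_round = 0.5 * prize_utility + 0.5 * (-V)  # positive, so you keep playing

# Yet the chance of surviving n consecutive rounds collapses geometrically:
survival = {n: 0.5 ** n for n in (1, 10, 30)}
```

Every individual round is a good bet, and the overall policy is near-certain death.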

Well, at that point your heirs will be rich, right? So maybe you’re actually okay with that. Maybe there is some amount of money ($1 billion?) that you’d be willing to die in order to ensure your heirs have.

But what if you don’t have any heirs? Or, what if we consider making such a decision as a civilization? What if death means not only the destruction of you, but also the destruction of everything you care about?

As a civilization, are there choices before us that would result in some chance of a glorious, wonderful future, but also some chance of total annihilation? I think it’s pretty clear that there are. Nuclear technology, biotechnology, artificial intelligence. For about the last century, humanity has been at a unique epoch: We are being forced to make this kind of existential decision, to face this kind of existential risk.

It’s not that we were immune to being wiped out before; an asteroid could have taken us out at any time (as happened to the dinosaurs), and a volcanic eruption nearly did. But this is the first time in humanity’s existence that we have had the power to destroy ourselves. This is the first time we have a decision to make about it.

One possible answer would be to say we should never be willing to take any kind of existential risk. Unlike the case of an individual, when we are speaking about an entire civilization, it no longer seems obvious that we shouldn’t set the utility of death at negative infinity. But if we really did this, it would require shutting down whole industries—definitely halting all research in AI and biotechnology, probably disarming all nuclear weapons and destroying all their blueprints, and quite possibly even shutting down the coal and oil industries. It would be an utterly radical change, and it would require bearing great costs.

On the other hand, if we should decide that it is sometimes worth the risk, we will need to know when it is worth the risk. We currently don’t know that.

Even worse, we will need some mechanism for ensuring that we don’t take the risk when it isn’t worth it. And we have nothing like such a mechanism. In fact, most of our process of research in AI and biotechnology is widely dispersed, with no central governing authority and regulations that are inconsistent between countries. I think it’s quite apparent that right now, there are research projects going on somewhere in the world that aren’t worth the existential risk they pose for humanity—but the people doing them are convinced that they are worth it because they so greatly advance their national interest—or simply because they could be so very profitable.

In other words, humanity finally has the power to make a decision about our survival, and we’re not doing it. We aren’t making a decision at all. We’re letting that responsibility fall upon more or less randomly-chosen individuals in government and corporate labs around the world. We may be careening toward an abyss, and we don’t even know who has the steering wheel.

What behavioral economics needs

Apr 16 JDN 2460049

The transition from neoclassical to behavioral economics has been a vital step forward in science. But lately we seem to have reached a plateau, with no major advances in the paradigm in quite some time.

It could be that there is work already being done which will, in hindsight, turn out to be significant enough to make that next step forward. But my fear is that we are getting bogged down by our own methodological limitations.

Behavioral economics inherited from neoclassical economics an obsession with mathematical sophistication. To some extent this was inevitable; in order to impress neoclassical economists enough to convert some of them, we had to use fancy math. We had to show that we could do it their way in order to convince them why we shouldn’t—otherwise, they’d just have dismissed us the way they had dismissed psychologists for decades, as too “fuzzy-headed” to do the “hard work” of putting everything into equations.

But the truth is, putting everything into equations was never the right approach. Because human beings clearly don’t think in equations. Once we write down a utility function and get ready to take its derivative and set it equal to zero, we have already distanced ourselves from how human thought actually works.

When dealing with a simple physical system, like an atom, equations make sense. Nobody thinks that the electron knows the equation and is following it intentionally. That equation simply describes how the forces of the universe operate, and the electron is subject to those forces.

But human beings do actually know things and do things intentionally. And while an equation could be useful for analyzing human behavior in the aggregate—I’m certainly not objecting to statistical analysis—it really never made sense to say that people make their decisions by optimizing the value of some function. Most people barely even know what a function is, much less remember calculus well enough to optimize one.

Yet right now, behavioral economics is still all based in that utility-maximization paradigm. We don’t use the same simplistic utility functions as neoclassical economists; we make them more sophisticated and realistic. Yet in that very sophistication we make things more complicated, more difficult—and thus in at least that respect, even further removed from how actual human thought must operate.

The worst offender here is surely Prospect Theory. I recognize that Prospect Theory predicts human behavior better than conventional expected utility theory; nevertheless, it makes absolutely no sense to suppose that human beings actually do some kind of probability-weighting calculation in their heads when they make judgments. Most of my students—who are well-trained in mathematics and economics—can’t even do that probability-weighting calculation on paper, with a calculator, on an exam. (There’s also absolutely no reason to do it! All it does is make your decisions worse!) This is a totally unrealistic model of human thought.
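For reference, the standard Tversky–Kahneman (1992) weighting function is compact to write down but genuinely unpleasant to compute by hand, which is rather the point:

```python
def w(p, gamma=0.61):
    """Tversky-Kahneman (1992) probability weighting (gamma ~ 0.61 for gains)."""
    return p ** gamma / (p ** gamma + (1 - p) ** gamma) ** (1 / gamma)

# Small probabilities are overweighted, large ones underweighted:
# w(0.01) comes out around 0.055, w(0.99) around 0.91.
```

Nobody is evaluating that expression between hearing a gamble and choosing; at best it is a curve fitted after the fact to whatever the underlying heuristics actually produce.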

This is not to say that human beings are stupid. We are still smarter than any other entity in the known universe—computers are rapidly catching up, but they haven’t caught up yet. It is just that whatever makes us smart must not be easily expressible as an equation that maximizes a function. Our thoughts are bundles of heuristics, each of which may be individually quite simple, but all of which together make us capable of not only intelligence, but something computers still sorely, pathetically lack: wisdom. Computers optimize functions better than we ever will, but we still make better decisions than they do.

I think that what behavioral economics needs now is a new unifying theory of these heuristics, which accounts for not only how they work, but how we select which one to use in a given situation, and perhaps even where they come from in the first place. This new theory will of course be complex; there’s a lot of things to explain, and human behavior is a very complex phenomenon. But it shouldn’t be—mustn’t be—reliant on sophisticated advanced mathematics, because most people can’t do advanced mathematics (almost by construction—we would call it something different otherwise). If your model assumes that people are taking derivatives in their heads, your model is already broken. 90% of the world’s people can’t take a derivative.

I guess it could be that our cognitive processes in some sense operate as if they are optimizing some function. This is commonly posited for the human motor system, for instance; clearly baseball players aren’t actually solving differential equations when they throw and catch balls, but the trajectories that balls follow do in fact obey such equations, and the reliability with which baseball players can catch and throw suggests that they are in some sense acting as if they can solve them.

But I think that a careful analysis of even this classic example reveals some deeper insights that should call this whole notion into question. How do baseball players actually do what they do? They don’t seem to be calculating at all—in fact, if you asked them to try to calculate while they were playing, it would destroy their ability to play. They learn. They engage in practiced motions, acquire skills, and notice patterns. I don’t think there is anywhere in their brains that is actually doing anything like solving a differential equation. It’s all a process of throwing and catching, throwing and catching, over and over again, watching and remembering and subtly adjusting.

One thing that is particularly interesting to me about that process is that it is astonishingly flexible. It doesn’t really seem to matter what physical process you are interacting with; as long as it is sufficiently orderly, such a method will allow you to predict and ultimately control that process. You don’t need to know anything about differential equations in order to learn in this way—and, indeed, I really can’t emphasize this enough, baseball players typically don’t.

In fact, learning is so flexible that it can even perform better than calculation. The usual differential equations most people would think to use to predict the throw of a ball would assume ballistic motion in a vacuum, which is absolutely not what a curveball is. In order to throw a curveball, the ball must interact with the air, and it must be launched with spin; curving a baseball relies very heavily on the Magnus Effect. I think it’s probably possible to construct an equation that would fully predict the motion of a curveball, but it would be a tremendously complicated one, and might not even have an exact closed-form solution. In fact, I think it would require solving the Navier-Stokes equations, for which there is an outstanding Millennium Prize. Since the viscosity of air is very low, maybe you could get away with approximating using the Euler fluid equations.
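Even a crude toy model makes the point: with made-up coefficients, and gravity and drag omitted to isolate the spin term, a few lines of numerical integration show the lateral deflection that a vacuum-ballistics equation would entirely miss.

```python
def simulate(spin, steps=2000, dt=0.001):
    # Toy 2D flight: x is downrange, y is lateral deflection. Unit mass,
    # invented Magnus coefficient 0.2; the Magnus acceleration is
    # perpendicular to the velocity and proportional to spin.
    vx, vy = 40.0, 0.0   # launched at 40 m/s, dead straight
    x, y = 0.0, 0.0
    for _ in range(steps):
        ax = -0.2 * spin * vy
        ay = 0.2 * spin * vx
        vx += ax * dt
        vy += ay * dt
        x += vx * dt
        y += vy * dt
    return x, y

_, y_no_spin = simulate(spin=0.0)  # no spin: flies straight
_, y_spin = simulate(spin=1.0)     # spin: curves sideways by several meters
```

The pitcher, of course, runs nothing like this loop; the point is that practiced adjustment converges on the same behavior without ever writing down the forces.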

To be fair, a learning process that is adapting to a system that obeys an equation will yield results that become an ever-closer approximation of that equation. And it is in that sense that a baseball player can be said to be acting as if solving a differential equation. But this relies heavily on the system in question being one that obeys an equation—and when it comes to economic systems, is that even true?

What if the reason we can’t find a simple set of equations that accurately describe the economy (as opposed to equations of ever-escalating complexity that still utterly fail to describe the economy) is that there isn’t one? What if the reason we can’t find the utility function people are maximizing is that they aren’t maximizing anything?

What behavioral economics needs now is a new approach, something less constrained by the norms of neoclassical economics and more aligned with psychology and cognitive science. We should be modeling human beings based on how they actually think, not some weird mathematical construct that bears no resemblance to human reasoning but is designed to impress people who are obsessed with math.

I’m of course not the first person to have suggested this. I probably won’t be the last, or even the one who most gets listened to. But I hope that I might get at least a few more people to listen to it, because I have gone through the mathematical gauntlet and earned my bona fides. It is too easy to dismiss this kind of reasoning from people who don’t actually understand advanced mathematics. But I do understand differential equations—and I’m telling you, that’s not how people think.

Optimization is unstable. Maybe that’s why we satisfice.

Feb 26 JDN 2460002

Imagine you have become stranded on a deserted island. You need to find shelter, food, and water, and then perhaps you can start working on a way to get help or escape the island.

Suppose you are programmed to be an optimizer: to get the absolute best solution to any problem. At first this may seem to be a boon: You’ll build the best shelter, find the best food, get the best water, find the best way off the island.

But you’ll also expend an enormous amount of effort trying to make it the best. You could spend hours just trying to decide what the best possible shelter would be. You could pass up dozens of viable food sources because you aren’t sure that any of them are the best. And you’ll never get any rest because you’re constantly trying to improve everything.

In principle your optimization could include that: The cost of thinking too hard or searching too long could be one of the things you are optimizing over. But in practice, this sort of bounded optimization is often remarkably intractable.

And what if you forgot about something? You were so busy optimizing your shelter you forgot to treat your wounds. You were so busy seeking out the perfect food source that you didn’t realize you’d been bitten by a venomous snake.

This is not the way to survive. You don’t want to be an optimizer.

No, the person who survives is a satisficer: they make sure that what they have is good enough and then they move on to the next thing. Their shelter is lopsided and ugly. Their food is tasteless and bland. Their water is hard. But they have them.

Once they have shelter and food and water, they will have time and energy to do other things. They will notice the snakebite. They will treat the wound. Once all their needs are met, they will get enough rest.

Empirically, humans are satisficers. We seem to be happier because of it—in fact, the people who are the happiest satisfice the most. And really this shouldn’t be so surprising: our ancestral environment wasn’t so different from being stranded on a deserted island.

Good enough is perfect. Perfect is bad.

Let’s consider another example. Suppose that you have created a powerful artificial intelligence, an AGI with the capacity to surpass human reasoning. (It hasn’t happened yet—but it probably will someday, and maybe sooner than most people think.)

What do you want that AI’s goals to be?

Okay, ideally maybe they would be something like “Maximize goodness”, where we actually somehow include all the panoply of different factors that go into goodness, like beneficence, harm, fairness, justice, kindness, honesty, and autonomy. Do you have any idea how to do that? Do you even know what your own full moral framework looks like at that level of detail?

Far more likely, the goals you program into the AGI will be much simpler than that. You’ll have something you want it to accomplish, and you’ll tell it to do that well.

Let’s make this concrete and say that you own a paperclip company. You want to make more profits by selling paperclips.

First of all, let me note that this is not an unreasonable thing for you to want. It is not an inherently evil goal for one to have. The world needs paperclips, and it’s perfectly reasonable for you to want to make a profit selling them.

But it’s also not a true ultimate goal: There are a lot of other things that matter in life besides profits and paperclips. Anyone who isn’t a complete psychopath will realize that.

But the AI won’t. Not unless you tell it to. And so if we tell it to optimize, we would need to actually include in its optimization all of the things we genuinely care about—not missing a single one—or else whatever choices it makes are probably not going to be the ones we want. Oops, we forgot to say we need clean air, and now we’re all suffocating. Oops, we forgot to say that puppies don’t like to be melted down into plastic.

The simplest cases to consider are obviously horrific: Tell it to maximize the number of paperclips produced, and it starts tearing the world apart to convert everything to paperclips. (This is the original “paperclipper” concept from Less Wrong.) Tell it to maximize the amount of money you make, and it seizes control of all the world’s central banks and starts printing $9 quintillion for itself. (Why that amount? I’m assuming it uses 64-bit signed integers, and 2^63 is over 9 quintillion. If it uses long ints, we’re even more doomed.) No, inflation-adjusting won’t fix that; even hyperinflation typically still results in more real seigniorage for the central banks doing the printing (which is, you know, why they do it). The AI won’t ever be able to own more than all the world’s real GDP—but it will be able to own that if it prints enough and we can’t stop it.

But even if we try to come up with some more sophisticated optimization for it to perform (what I’m really talking about here is specifying its utility function), it becomes vital for us to include everything we genuinely care about: Anything we forget to include will be treated as a resource to be consumed in the service of maximizing everything else.

Consider instead what would happen if we programmed the AI to satisfice. The goal would be something like, “Produce at least 400,000 paperclips at a price of at most $0.002 per paperclip.”

Given such an instruction, in all likelihood, it would in fact produce exactly 400,000 paperclips at a price of exactly $0.002 per paperclip. And maybe that’s not strictly the best outcome for your company. But if it’s better than what you were previously doing, it will still increase your profits.

Moreover, such an instruction is far less likely to result in the end of the world.

If the AI has a particular target to meet for its production quota and price limit, the first thing it would probably try is to use your existing machinery. If that’s not good enough, it might start trying to modify the machinery, or acquire new machines, or develop its own techniques for making paperclips. But there are quite strict limits on how creative it is likely to be—because there are quite strict limits on how creative it needs to be. If you were previously producing 200,000 paperclips at $0.004 per paperclip, all it needs to do is double production and halve the cost. That’s a very standard sort of industrial innovation—in computing hardware (admittedly an extreme case), we do this sort of thing every couple of years.

It certainly won’t tear the world apart making paperclips—at most it’ll tear apart enough of the world to make 400,000 paperclips, which is a pretty small chunk of the world, because paperclips aren’t that big. A paperclip weighs about a gram, so you’ve only destroyed about 400 kilos of stuff. (You might even survive the lawsuits!)

Are you leaving money on the table relative to the optimization scenario? Eh, maybe. One, it’s a small price to pay for not ending the world. But two, if 400,000 at $0.002 was too easy, next time try 600,000 at $0.001. Over time, you can gently increase its quotas and tighten its price requirements until your company becomes more and more successful—all without risking the AI going completely rogue and doing something insane and destructive.
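To make the contrast concrete, here is a toy sketch in Python. Every plan and number below is invented purely for illustration: an optimizer ranks every possible production plan and takes the extreme one, no matter the collateral damage, while a satisficer stops at the first plan that meets the quota and price target.

```python
# Hypothetical production plans (all numbers invented for illustration).
plans = [
    {"clips": 200_000, "cost": 0.004, "disruption": 1},
    {"clips": 400_000, "cost": 0.002, "disruption": 2},
    {"clips": 900_000, "cost": 0.001, "disruption": 50},
    {"clips": 10**9,   "cost": 0.0005, "disruption": 10**6},
]

def optimize(plans):
    # An optimizer scans every plan and picks the extreme one,
    # regardless of how much of the world it tears apart.
    return max(plans, key=lambda p: p["clips"])

def satisfice(plans, min_clips, max_cost):
    # A satisficer stops at the first plan that is good enough.
    for plan in plans:
        if plan["clips"] >= min_clips and plan["cost"] <= max_cost:
            return plan
    return None

best = optimize(plans)                           # the billion-clip plan
good_enough = satisfice(plans, 400_000, 0.002)   # stops at 400,000 clips
```

Raising the quota over time, as suggested above, just means calling the satisficer again with stricter targets.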

Of course this is no guarantee of safety—and I absolutely want us to use every safeguard we possibly can when it comes to advanced AGI. But the simple change from optimizing to satisficing seems to solve the most severe problems immediately and reliably, at very little cost.

Good enough is perfect; perfect is bad.

I see broader implications here for behavioral economics. When all of our models are based on optimization, but human beings overwhelmingly seem to satisfice, maybe it’s time to stop assuming that the models are right and the humans are wrong.

Optimization is perfect if it works—and awful if it doesn’t. Satisficing is always pretty good. Optimization is unstable, while satisficing is robust.

In the real world, that probably means that satisficing is better.

Good enough is perfect; perfect is bad.

Inequality-adjusted GDP and median income

Dec 11 JDN 2459925

There are many problems with GDP as a measure of a nation’s prosperity. For one, GDP ignores natural resources and ecological degradation; so a tree is only counted in GDP once it is cut down. For another, it doesn’t value unpaid work, so caring for a child only increases GDP if you are a paid nanny rather than the child’s parents.

But one of the most obvious problems is the use of an average to evaluate overall prosperity, without considering the level of inequality.

Consider two countries. In Alphania, everyone has an income of about $50,000. In Betavia, 99% of people have an income of $1,000 and 1% have an income of $10 million. What is the per-capita GDP of each country? Alphania’s is $50,000 of course; but Betavia’s is $100,990. Does it really make sense to say that Betavia is a more prosperous country? Maybe it has more wealth overall, but its huge inequality means that it is really not at a high level of development. It honestly sounds like an awful place to live.

A much more sensible measure would be something like median income: How much does a typical person have? In Alphania this is still $50,000; but in Betavia it is only $1,000.

Yet even this leaves out most of the actual distribution; by definition, the median is determined only by the 50th percentile. We could vary all the other incomes a great deal without changing the median.

A better measure would be some sort of inequality-adjusted per-capita GDP, which rescales GDP based on the level of inequality in a country. But we would need a good way of making that adjustment.

I contend that the most sensible way would be to adopt some kind of model of marginal utility of income, and then figure out what income would correspond to the overall average level of utility.

In other words, average over the level of happiness that people in a country get from their income, and then figure out what level of income would correspond to that level of happiness. If we magically gave everyone the same amount of money, how much would they need to get in order for the average happiness in the country to remain the same?

This is clearly going to be less than the average level of income, because marginal utility of income is decreasing; a dollar is not worth as much in real terms to a rich person as it is to a poor person. So if we could somehow redistribute all income evenly while keeping the average the same, that would actually increase overall happiness (though, for many reasons, we can’t simply do that).

For example, suppose that utility of income is logarithmic: U = ln(I).

This means that the marginal utility of an additional dollar is inversely proportional to how many dollars you already have: U'(I) = 1/I.

It also means that a 1% gain or loss in your income feels about the same regardless of how much income you have: ln((1+r)I) = ln(I) + ln(1+r). This seems like a quite reasonable, maybe even a bit conservative, assumption; I suspect that losing 1% of your income actually hurts more when you are poor than when you are rich.

Then the inequality-adjusted GDP Y is a value such that ln(Y) is equal to the overall average level of utility: E[U] = ln(Y), so Y = exp(E[U]).

This sounds like a very difficult thing to calculate. But fortunately, the distribution of actual income seems to quite closely follow a log-normal distribution. This means that when we take the logarithm of income to get utility, we just get back a very nice, convenient normal distribution!

In fact, it turns out that for a log-normal distribution, the following holds: exp(E[ln(Y)]) = median(Y)

The income which corresponds to the average utility turns out to simply be the median income! We went looking for a better measure than median income, and ended up finding out that median income was the right measure all along.

This wouldn’t hold for most other distributions; and since real-world economies don’t perfectly follow a log-normal distribution, a more precise estimate would need to be adjusted accordingly. But the approximation is quite good for most countries we have good data on, so even for the ones we don’t, median income is likely a very good estimate.
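The identity is easy to verify numerically. Here is a quick simulation; the parameters mu and sigma are made up, chosen only to give income-like magnitudes, since only the log-normal shape matters:

```python
import math
import random
import statistics

random.seed(42)

# Simulated incomes drawn from a log-normal distribution
# (mu and sigma are invented, illustrative parameters).
mu, sigma = 10.5, 0.9
income = [random.lognormvariate(mu, sigma) for _ in range(200_000)]

per_capita = statistics.fmean(income)               # what per-capita GDP measures
median = statistics.median(income)                  # what the typical person has
adjusted = math.exp(statistics.fmean(map(math.log, income)))  # exp(E[ln Y])

# For a log-normal, exp(E[ln Y]) = e^mu = median(Y), while the mean is
# e^(mu + sigma^2 / 2), which is strictly larger whenever sigma > 0.
```

The simulated `adjusted` value lands within sampling error of the median, while the mean sits well above both, exactly as the argument predicts.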

The ranking of countries by median income isn’t radically different from the ranking by per-capita GDP; rich countries are still rich and poor countries are still poor. But it is different enough to matter.

Luxembourg is in 1st place on both lists. Scandinavian countries and the US are in the top 10 in both cases. So it’s fair to say that #ScandinaviaIsBetter for real, and the US really is so rich that our higher inequality doesn’t make our median income lower than the rest of the First World.

But some countries are quite different. Ireland looks quite good in per-capita GDP, but quite bad in median income. This is because a lot of the GDP in Ireland is actually profits by corporations that are only nominally headquartered in Ireland and don’t actually employ very many people there.

The comparison between the US, the UK, and Canada seems particularly instructive. If you look at per-capita GDP PPP, the US looks much richer at $75,000 compared to Canada’s $57,800 (a difference of 29% or 26 log points). But if you look at median personal income, they are nearly equal: $19,300 in the US and $18,600 in Canada (3.7% or 3.7 log points).

On the other hand, in per-capita GDP PPP, the UK looks close to Canada at $55,800 (3.6% or 3.6 lp); but in median income it is dramatically worse, at only $14,800 (26% or 23 lp). So Canada and the UK have similar overall levels of wealth, but life for a typical Canadian is much better than life for a typical Briton because of the higher inequality in Britain. And the US has more wealth than Canada, but it doesn’t meaningfully improve the lifestyle of a typical American relative to a typical Canadian.
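For reference, here is how the percentage and log-point figures above are computed, using the US and Canada numbers quoted in the text:

```python
import math

def percent_diff(a, b):
    # Conventional percentage difference of a relative to b.
    return (a / b - 1) * 100

def log_points(a, b):
    # Log points: 100 * ln(a/b). Symmetric and additive, unlike percentages.
    return 100 * math.log(a / b)

# Figures quoted above: per-capita GDP PPP and median personal income.
us_gdp, canada_gdp = 75_000, 57_800
us_median, canada_median = 19_300, 18_600

gdp_gap = log_points(us_gdp, canada_gdp)           # about 26 log points
median_gap = log_points(us_median, canada_median)  # about 3.7 log points
```

The gap in per-capita GDP is roughly seven times the gap in median income, which is the whole point of the comparison.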

Marriage and matching

Oct 10 JDN 2459498

When this post goes live, I will be married. We already had a long engagement, but it was made even longer by the pandemic: We originally planned to be married in October 2020, but then rescheduled for October 2021. Back then, we naively thought that the pandemic would be under control by now and we could have a wedding without COVID testing and masks. As it turns out, all we really accomplished was having a wedding where everyone is vaccinated—and the venue still required testing and masks. Still, it should at least be safer than it was last year.

Since marriage is on my mind, I thought I would at least say a few things about the behavioral economics of marriage.

Now when I say the “economics of marriage” you likely have in mind things like tax laws that advantage (or disadvantage) marriage at different incomes, or the efficiency gains from living together that allow you to save money relative to each having your own place. That isn’t what I’m interested in.

What I want to talk about today is something a bit less economic, but more directly about marriage: the matching process by which one finds a spouse.

Economists would refer to marriage as a matching market. Unlike a conventional market where you can buy and sell arbitrary quantities, marriage is (usually, polygamy notwithstanding) a one-to-one arrangement. And unlike even the job market (which is also a one-to-one matching market), marriage usually doesn’t involve direct monetary payments (though in cultures with dowries it arguably does).

The usual model of a matching market has two separate pools: Employers and employees, for example. Typical heteronormative analyses of marriage have done likewise, separating men and women into different pools. But it turns out that sometimes men marry men and women marry women.

So what happens to our matching theory if we allow the pools to overlap?

I think the most sensible way to do it, actually, is to have only one pool: people who want to get married. Then, the way we capture the fact that most—but not all—men only want to marry women, and most—but not all—women only want to marry men is through the utility function: Heterosexuals are simply those for whom a same-sex match would have very low utility. This would actually mean modeling marriage as a form of the stable roommates problem. (Oh my god, they were roommates!)

The stable roommates problem actually turns out to be harder than the conventional (heteronormative) stable marriage problem; in fact, while the hetero marriage problem (as I’ll henceforth call it) guarantees at least one stable matching for any preference ordering, the queer marriage problem can fail to have any stable solutions. While the hetero marriage problem ensures that everyone will eventually be matched to someone (if the number of men is equal to the number of women), sadly, the queer marriage problem can result in some people being forever rejected and forever alone. (There. Now you can blame the gays for ruining something: We ruined marriage matching.)

The queer marriage problem is actually more general than the hetero marriage problem: The hetero marriage problem is just the queer marriage problem with a particular utility function that assigns everyone strictly gendered preferences.

The best known algorithm for the queer marriage problem is an extension of the standard Gale-Shapley algorithm for the hetero marriage problem, with the same O(n^2) complexity in theory but a considerably more complicated implementation in practice. Honestly, while I can clearly grok the standard algorithm well enough to explain it to someone, I’m not sure I completely follow this one.
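For reference, the standard (hetero) Gale-Shapley algorithm is simple enough to sketch in a few lines of Python. This is the textbook proposer-optimal version, not the roommates extension discussed above:

```python
def gale_shapley(proposer_prefs, responder_prefs):
    # Standard Gale-Shapley: proposers propose in order of preference;
    # responders hold the best offer so far and trade up when possible.
    # Preference lists are most-preferred first. Returns proposer -> match.
    rank = {r: {p: i for i, p in enumerate(prefs)}
            for r, prefs in responder_prefs.items()}
    free = list(proposer_prefs)                # proposers without a partner
    next_choice = {p: 0 for p in proposer_prefs}
    engaged = {}                               # responder -> proposer

    while free:
        p = free.pop()
        r = proposer_prefs[p][next_choice[p]]  # p's best not-yet-tried responder
        next_choice[p] += 1
        if r not in engaged:
            engaged[r] = p
        elif rank[r][p] < rank[r][engaged[r]]:
            free.append(engaged[r])            # r trades up; old partner re-enters
            engaged[r] = p
        else:
            free.append(p)                     # r rejects p; p proposes elsewhere

    return {p: r for r, p in engaged.items()}

# Tiny example: here everyone happens to get their first choice.
match = gale_shapley(
    {"x": ["a", "b"], "y": ["b", "a"]},
    {"a": ["x", "y"], "b": ["y", "x"]},
)
```

The resulting matching is always stable, and it is the best achievable one for the proposing side—which is why it matters who proposes.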

Then again, maybe preference orderings aren’t such a great approach after all. There has been a movement in economics toward what is called ordinal utility, where we speak only of preference orderings: You can like A more than B, but there’s no way to say how much more. But I for one am much more inclined toward cardinal utility, where differences have magnitudes: I like Coke more than Pepsi, and I like getting massaged more than being stabbed—and the difference between Coke and Pepsi is a lot smaller than the difference between getting massaged and being stabbed. (Many economists make much of the notion that even cardinal utility is “equivalent up to an affine transformation”, but I’ve got some news for you: So are temperature and time. All you are really doing by making an “affine transformation” is assigning a starting point and a unit of measurement. Temperature has a sensible absolute zero to use as a starting point, you say? Well, so does utility—not existing.)

With cardinal utility, I can offer you a very simple naive algorithm for finding an optimal match: Just try out every possible set of matchings and pick the one that has the highest total utility.

There are up to n!/((n/2)! 2^(n/2)) possible matchings to check (for n = 4 people, that’s only 3), so this could take a long time—but it should work. I’m sure there’s a more efficient algorithm out there, but I don’t have the mental energy to figure it out at the moment. It might still be NP-hard, but I doubt it’s that hard.

Moreover, even once we find a utility-maximizing matching, that doesn’t guarantee a stable matching: Some people might still prefer to change even if it would end up reducing total utility.

Here’s a simple set of preferences for which that becomes an issue. In this table, the row is the person making the evaluation, and the columns are how much utility they assign to a match with each person. The total utility of a match is just the sum of utility from the two partners. The utility of “matching with yourself” is the utility of not being matched at all.


     A   B   C   D
A    0   3   2   1
B    2   0   3   1
C    3   2   0   1
D    3   2   1   0

Since everyone prefers every other person to not being matched at all (likely not true in real life!), the optimal matchings will always match everyone with someone. Thus, there are actually only 3 matchings to compare:

AB, CD: (3+2)+(1+1) = 7

AC, BD: (2+3)+(1+2) = 8

AD, BC: (1+3)+(3+2) = 9

The optimal matching, in utilitarian terms, is to match A with D and B with C. This yields total utility of 9.

But that’s not stable, because A prefers C over D, and C prefers A over B. So A and C would choose to pair up instead.

In fact, this set of preferences yields no stable matching at all. Whoever is partnered with D is someone else’s first choice, and prefers that admirer over D (because D is everyone’s last choice); so that pair will always defect and match with each other instead.

There is always a nonempty set of utility-maximizing matchings. (There must be at least one, and there could in principle be as many as there are possible matchings.) This just follows from finiteness: any finite nonempty set of real numbers has a maximum, and there are only finitely many possible matchings.

As this counterexample shows, there isn’t always a stable matching.
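The counterexample can be verified mechanically. Here is a minimal brute-force sketch in Python that enumerates all three matchings from the table above, finds the utility-maximizing one, and confirms that none of them is stable:

```python
# The utility table from the post: u[x][y] = how much x values a match with y
# (the diagonal is the utility of staying single).
u = {
    "A": {"A": 0, "B": 3, "C": 2, "D": 1},
    "B": {"A": 2, "B": 0, "C": 3, "D": 1},
    "C": {"A": 3, "B": 2, "C": 0, "D": 1},
    "D": {"A": 3, "B": 2, "C": 1, "D": 0},
}

def all_pairings(people):
    # Recursively enumerate every way to pair up an even-sized group.
    if not people:
        yield []
        return
    first, rest = people[0], people[1:]
    for i, partner in enumerate(rest):
        for sub in all_pairings(rest[:i] + rest[i + 1:]):
            yield [(first, partner)] + sub

def total_utility(matching):
    # Additive utility: each pair contributes both partners' valuations.
    return sum(u[a][b] + u[b][a] for a, b in matching)

def is_stable(matching):
    # Stable iff no two people both prefer each other to their current partners.
    partner = {}
    for a, b in matching:
        partner[a], partner[b] = b, a
    return not any(
        x != y and u[x][y] > u[x][partner[x]] and u[y][x] > u[y][partner[y]]
        for x in partner for y in partner
    )

matchings = list(all_pairings(["A", "B", "C", "D"]))
best = max(matchings, key=total_utility)         # AD, BC with total utility 9
stable = [m for m in matchings if is_stable(m)]  # empty: no stable matching
```

This brute force is exactly the naive algorithm described earlier; it confirms the totals of 7, 8, and 9, picks AD/BC as the utilitarian optimum, and finds a blocking pair in every single matching.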

So here are a couple of interesting theoretical questions that this gives rise to:
1. If there is a stable matching, must it be in the set of utility-maximizing matchings?

2. If there is a stable matching, must all utility-maximizing matchings be stable?

Question 1 asks whether being stable implies being utility-maximizing.
Question 2 asks whether being utility-maximizing implies being stable—conditional on there being at least one stable possibility.

So, what is the answer to these questions? I don’t know! I’m actually not sure anyone does! We may have stumbled onto cutting-edge research!

I found a paper showing that these properties do not hold when you are doing the hetero marriage problem and you use multiplicative utility for matchings, but this is the queer marriage problem, and moreover I think multiplicative utility is the wrong approach. It doesn’t make sense to me to say that a marriage where one person is extremely happy and the other is indifferent to leaving is equivalent to a marriage where both partners are indifferent to leaving, but that’s what you’d get if you multiply 1*0 = 0. And if you allow negative utility from matchings (i.e. some people would prefer to remain single than to be in a particular match—which seems sensible enough, right?), since -1*-1 = 1, multiplicative utility yields the incredibly perverse result that two people who despise each other constitute a great match. Additive utility solves both problems: 1+0 = 1 and -1+-1 = -2, so, as we would hope, like + indifferent = like, and hate + hate = even more hate.

There is something to be said for the idea that two people who kind of like each other is better than one person ecstatic and the other miserable, but (1) that’s actually debatable, isn’t it? And (2) I think that would be better captured by somehow penalizing inequality in matches, not by using multiplicative utility.

Of course, I haven’t done a really thorough literature search, so other papers may exist. Nor have I spent a lot of time just trying to puzzle through this problem myself. Perhaps I should; this is sort of my job, after all. But even if I had the spare energy to invest heavily in research at the moment (which I sadly do not), I’ve been warned many times that pure theory papers are hard to publish, and I have enough trouble getting published as it is… so perhaps not.

My intuition is telling me that 2 is probably true but 1 is probably false. That is, I would guess that the set of stable matchings, when it’s not empty, is actually larger than the set of utility-maximizing matchings.

I think where I’m getting that intuition is from the properties of Pareto-efficient allocations: Any utility-maximizing allocation is necessarily Pareto-efficient, but many Pareto-efficient allocations are not utility-maximizing. A stable matching is sort of a strengthening of the notion of a Pareto-efficient allocation (though the problem of finding a Pareto-efficient matching for the general queer marriage problem has been solved).

But it is interesting to note that while a Pareto-efficient allocation must exist (typically there are many, but there must be at least one, because it’s impossible to have a cycle of Pareto improvements as long as preferences are transitive), it’s entirely possible to have no stable matchings at all.

Valuing harm without devaluing the harmed

June 9 JDN 2458644

In last week’s post I talked about the matter of “putting a value on a human life”. I explained how we don’t actually need to make a transparently absurd statement like “a human life is worth $5 million” to do cost-benefit analysis; we simply need to ask ourselves what else we could do with any given amount of money. We don’t actually need to put a dollar value on human lives; we need only value them in terms of other lives.

But there is a deeper problem to face here, which is how we ought to value not simply life, but quality of life. The notion is built into the concept of quality-adjusted life-years (QALY), but how exactly do we make such a quality adjustment?

Indeed, much like cost-benefit analysis in general or the value of a statistical life, the very concept of QALY can be repugnant to many people. The problem seems to be that it violates our deeply-held belief that all lives are of equal value: If I say that saving one person adds 2.5 QALY and saving another adds 68 QALY, I seem to be saying that the second person is worth more than the first.

But this is not really true. QALY aren’t associated with a particular individual. They are associated with the duration and quality of life.

It should be fairly easy to convince yourself that duration matters: Saving a newborn baby who will go on to live to be 84 years old adds an awful lot more in terms of human happiness than extending the life of a dying person by a single hour. To call each of these things “saving a life” is actually very unequal: It’s implying that 1 hour for the second person is worth 84 years for the first.

Quality, on the other hand, poses much thornier problems. Presumably, we’d like to be able to say that being wheelchair-bound is a bad thing, and if we can make people able to walk we should want to do that. But this means that we need to assign some sort of QALY cost to being in a wheelchair, which then seems to imply that people in wheelchairs are worth less than people who can walk.

And the same goes for any disability or disorder: Assigning a QALY cost to depression, or migraine, or cystic fibrosis, or diabetes, or blindness, or pneumonia, always seems to imply that people with the condition are worth less than people without. This is a deeply unsettling result.

Yet I think the mistake is in how we are using the concept of “worth”. We are not saying that the happiness of someone with depression is less important than the happiness of someone without; we are saying that the person with depression experiences less happiness—which, in this case of depression especially, is basically true by construction.

Does this imply, however, that if we are given the choice between saving two people, one of whom has a disability, we should save the one without?

Well, here’s an extreme example: Suppose there is a plague which kills 50% of its victims within one year. There are two people in a burning building. One of them has the plague, the other does not. You only have time to save one: Which do you save? I think it’s quite obvious you save the person who doesn’t have the plague.

But that only relies upon duration, which wasn’t so difficult. All right, fine; say the plague doesn’t kill you. Instead, it renders you paralyzed and in constant pain for the rest of your life. Is it really that far-fetched to say that we should save the person who won’t have that experience?

We really shouldn’t think of it as valuing people; we should think of it as valuing actions. QALY are a way of deciding which actions we should take, not which people are more important or more worthy. “Is a person who can walk worth more than a person who needs a wheelchair?” is a fundamentally bizarre and ultimately useless question. ‘Worth more’ in what sense? “Should we spend $100 million developing this technology that will allow people who use wheelchairs to walk?” is the question we should be asking. The QALY cost we assign to a condition isn’t about how much people with that condition are worth; it’s about what resources we should be willing to commit in order to treat that condition. If you have a given condition, you should want us to assign a high QALY cost to it, to motivate us to find better treatments.

I think it’s also important to consider which individuals are having QALY added or subtracted. In last week’s post I talked about how some people read “the value of a statistical life is $5 million” to mean “it’s okay to kill someone as long as you profit at least $5 million”; but this doesn’t follow at all. We don’t say that it’s all right to steal $1,000 from someone just because they lose $1,000 and you gain $1,000. We wouldn’t say it was all right if you had a better investment strategy and would end up with $1,100 afterward. We probably wouldn’t even say it was all right if you were much poorer and desperate for the money (though then we might at least be tempted). If a billionaire kills people to make $10 million each (sadly I’m quite sure that oil executives have killed for far less), that’s still killing people. And in fact since he is a billionaire, his marginal utility of wealth is so low that his value of a statistical life isn’t $5 million; it’s got to be in the billions. So the net happiness of the world has not increased, in fact.

Above all, it’s vital to appreciate the benefits of doing good cost-benefit analysis. Cost-benefit analysis tells us to stop fighting wars. It tells us to focus our spending on medical research and foreign aid instead of yet more corporate subsidies or aircraft carriers. It tells us how to allocate our public health resources so as to save the most lives. It emphasizes how vital our environmental regulations are in making our lives better and longer.

Could we do all these things without QALY? Maybe—but I suspect we would not do them as well, and when millions of lives are on the line, “not as well” is thousands of innocent people dead. Sometimes we really are faced with two choices for a public health intervention, and we need to decide which one will help the most people. Sometimes we really do have to set a pollution target, and decide just what amount of risk is worth accepting for the economic benefits of industry. These are very difficult questions, and without good cost-benefit analysis we could get the answers dangerously wrong.
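To illustrate the kind of comparison involved, here is the arithmetic of ranking two interventions by QALY per dollar. Every number below is invented purely to show the method, not a real cost-effectiveness estimate:

```python
# Two hypothetical public-health interventions competing for the same budget.
# All figures are invented for illustration only.
interventions = {
    "new dialysis units":   {"cost": 10_000_000, "people": 200,    "qaly_each": 5.0},
    "vaccination campaign": {"cost": 10_000_000, "people": 50_000, "qaly_each": 0.1},
}

def qaly_per_dollar(i):
    # Cost-effectiveness: total quality-adjusted life-years per dollar spent.
    return i["people"] * i["qaly_each"] / i["cost"]

# Dialysis: 1,000 QALY for $10M; vaccination: 5,000 QALY for $10M.
ranked = sorted(interventions,
                key=lambda k: qaly_per_dollar(interventions[k]),
                reverse=True)
```

Note that the ranking says nothing about the worth of the people helped; it only says which use of the same $10 million adds more healthy years of life overall.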

Markets value rich people more

Feb 26, JDN 2457811

Competitive markets are optimal at maximizing utility, as long as you value rich people more.

That is literally a theorem in neoclassical economics. I had previously thought that this was something most economists didn’t realize; I had delusions of grandeur that maybe I could finally convince them that this is the case. But no, it turns out this is actually a well-known finding; it’s just that somehow nobody seems to care. Or if they do care, they never talk about it. For all the thousands of papers and articles about the distortions created by minimum wage and capital gains tax, you’d think someone could spare the time to talk about the vastly larger fundamental distortions created by the structure of the market itself.

It’s not as if this is something completely hopeless we could never deal with. A basic income would go a long way toward correcting this distortion, especially if coupled with highly progressive taxes. By creating a hard floor and a soft ceiling on income, you can reduce the inequality that makes these distortions so large.

The basics of the theorem are quite straightforward, so I think it’s worth explaining them here. It’s extremely general; it applies anywhere that goods are allocated by market prices and different individuals have wildly different amounts of wealth.

Suppose that each person has a certain amount of wealth W to spend. Person 1 has W1, person 2 has W2, and so on. They all have some amount of happiness, defined by a utility function, which I’ll assume is only dependent on wealth; this is a massive oversimplification of course, but it wouldn’t substantially change my conclusions to include other factors—it would just make everything more complicated. (In fact, including altruistic motives would make the whole argument stronger, not weaker.) Thus I can write each person’s utility as a function U(W). The rate of change of this utility as wealth increases, the marginal utility of wealth, is denoted U'(W).

By the law of diminishing marginal utility, the marginal utility of wealth U'(W) is decreasing. That is, the more wealth you have, the less each new dollar is worth to you.

Now suppose people are buying goods. Each good C provides some amount of marginal utility U'(C) to the person who buys it. This can vary across individuals; some people like Pepsi, others Coke. This marginal utility is also decreasing; a house is worth a lot more to you if you are living in the street than if you already have a mansion. Ideally we would want the goods to go to the people who want them the most—but as you’ll see in a moment, markets systematically fail to do this.

If people are making their purchases rationally, each person’s willingness-to-pay P for a given good C will be equal to their marginal utility of that good, divided by their marginal utility of wealth:

P = U'(C)/U'(W)
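
To make this concrete, here is a small Python sketch with my own illustrative numbers (assuming logarithmic utility, which I discuss below, so that U'(W) = 1/W): two buyers get exactly the same marginal utility from a good, yet their willingness-to-pay differs by exactly their wealth ratio.

```python
def willingness_to_pay(mu_good, W):
    # P = U'(C)/U'(W). Assuming logarithmic utility, U'(W) = 1/W,
    # so P = mu_good * W: willingness-to-pay scales with wealth.
    return mu_good * W

mu_good = 0.5  # both buyers get the same marginal utility from the good
P_poor = willingness_to_pay(mu_good, 10_000)      # wealth: $10,000
P_rich = willingness_to_pay(mu_good, 10_000_000)  # wealth: $10,000,000

print(P_poor)           # 5000.0
print(P_rich)           # 5000000.0
print(P_rich / P_poor)  # 1000.0, exactly the wealth ratio
```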

Now consider this from the perspective of society as a whole. If you wanted to maximize total utility, you’d equalize marginal utility across individuals (this is the equimarginal principle, which follows from the first-order conditions of the maximization). The idea is that if marginal utility is higher for one person, you should give that person more, because the benefit of what you give them will be larger that way; and if marginal utility is lower for another person, you should give that person less, because the benefit of what you give them will be smaller. When everyone’s marginal utility is equal, you are at the maximum.

But market prices don’t actually do this. Instead they equalize over willingness-to-pay. So if you’ve got two individuals 1 and 2, instead of having this:

U'(C1) = U'(C2)

you have this:

P1 = P2

which translates to:

U'(C1)/U'(W1) = U'(C2)/U'(W2)

If the marginal utilities of wealth were the same, U'(W1) = U'(W2), we’d be fine; these conditions would give the same results. But that would only happen if W1 = W2, that is, if the two individuals had the same amount of wealth.
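
You can run the price equality backwards to see how big the distortion is. A sketch with assumed numbers, using logarithmic utility so that U'(W) = 1/W: if both individuals face the same market price, the poorer buyer must value the good vastly more at the margin in order to buy it.

```python
P = 50.0            # common market price of the good
W1, W2 = 1e4, 1e7   # assumed wealths: $10,000 vs $10,000,000

# Rearranging P = U'(C)/U'(W) gives U'(C) = P * U'(W); with U'(W) = 1/W:
mu_good_1 = P / W1  # 0.005
mu_good_2 = P / W2  # 0.000005

# To be willing to pay the same price, the poorer person must get
# about 1000 times the marginal utility from the good.
print(mu_good_1 / mu_good_2)  # ~1000, the wealth ratio
```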

Now suppose we were instead maximizing weighted utility, where each person gets a weighting factor A based on how “important” they are or something. If your A is higher, your utility matters more. If we maximized this new weighted utility, we would end up like this:

A1*U'(C1) = A2*U'(C2)

Because person 1’s utility counts for more, their marginal utility also counts for more. This seems very strange; why are we valuing some people more than others? On what grounds?

Yet this is effectively what we’ve already done by using market prices. Just set:

A = 1/U'(W)

Since marginal utility of wealth is decreasing, 1/U'(W) is higher precisely when W is higher.

How much higher? Well, that depends on the utility function. The two utility functions I find most plausible are logarithmic and harmonic. (Actually I think both apply, one to other-directed spending and the other to self-directed spending.)

If utility is logarithmic:

U = ln(W)

Then marginal utility is inversely proportional:

U'(W) = 1/W

In that case, your value as a human being, as spoken by the One True Market, is precisely equal to your wealth:

A = 1/U'(W) = W

If utility is harmonic, matters are even more severe.

U(W) = 1-1/W

Marginal utility goes as the inverse square of wealth:

U'(W) = 1/W^2

And thus your value, according to the market, is equal to the square of your wealth:

A = 1/U'(W) = W^2

What are we really saying here? Hopefully no one actually believes that Bill Gates is really morally worth 400 trillion times as much as a starving child in Malawi, as the calculation from harmonic utility would imply. (Bill Gates himself certainly doesn’t!) Even the logarithmic utility estimate saying that he’s worth 20 million times as much is pretty hard to believe.
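
As a sanity check on those ratios, here is the arithmetic with my own rough figures (which I haven’t stated precisely above: Gates at about $80 billion, the child at about $4,000):

```python
W_gates = 80e9  # assumed: roughly $80 billion
W_child = 4e3   # assumed: roughly $4,000

# Logarithmic utility: A = 1/U'(W) = W, so the ratio is the wealth ratio.
log_ratio = W_gates / W_child              # 2e7: 20 million

# Harmonic utility: A = 1/U'(W) = W^2, so the ratio gets squared.
harmonic_ratio = (W_gates / W_child) ** 2  # 4e14: 400 trillion

print(f"{log_ratio:.0e}")       # 2e+07
print(f"{harmonic_ratio:.0e}")  # 4e+14
```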

But implicitly, the market “believes” that, because when it decides how to allocate resources, something that is worth 1 microQALY to Bill Gates (about the value of a nickel dropped on the floor to you or me) but worth 20 QALY (twenty years of life!) to the Malawian child will in either case be priced at $8,000, and since the child doesn’t have $8,000, it will probably go to Mr. Gates. Perhaps a middle-class American could purchase it, provided it was worth some 0.3 QALY to them.

Now consider that this is happening in every transaction, for every good, in every market. Goods are not being sold to the people who get the most value out of them; they are being sold to the people who have the most money.

And suddenly, the entire edifice of “market efficiency” comes crashing down like a house of cards. A global market that quite efficiently maximizes willingness-to-pay is so thoroughly out of whack when it comes to actually maximizing utility that massive redistribution of wealth could enormously increase human welfare, even if it turned out to cut our total output in half—if utility is harmonic, even if it cut our total output to one-tenth its current value.
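
Here’s a toy calculation of that claim under logarithmic utility, with my own assumed numbers: one billionaire and 999 people with $10,000 each. Even if equalizing wealth destroyed half of it outright, total utility would still rise.

```python
import math

wealth = [1e9] + [1e4] * 999  # one billionaire, 999 people with $10,000 each

def total_log_utility(ws):
    # Total utility under U(W) = ln(W)
    return sum(math.log(w) for w in ws)

before = total_log_utility(wealth)

# Redistribute equally, but suppose the process destroys half of total output.
equal_share = (sum(wealth) / 2) / len(wealth)
after = total_log_utility([equal_share] * len(wealth))

print(after > before)  # True: equal shares of half the pie beat the status quo
```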

The only way to escape this is to argue that marginal utility of wealth is not decreasing, or at least decreasing very, very slowly. Suppose for instance that utility goes as the 0.9 power of wealth:

U(W) = W^0.9

Then marginal utility goes as the -0.1 power of wealth:

U'(W) = 0.9 W^(-0.1)

On this scale, Bill Gates is only worth about 5 times as much as the Malawian child, which in his particular case might actually be too small—if a trolley is about to kill either Bill Gates or 5 Malawian children, I think I save Bill Gates, because he’ll go on to save many more than 5 Malawian children. (Of course, substitute Donald Trump or Charles Koch and I’d let the trolley run over him without a second thought if even a single child is at stake, so it’s not actually a function of wealth.) In any case, a 5 to 1 range across the whole range of human wealth is really not that big a deal. It would introduce some distortions, but not enough to justify any redistribution that would meaningfully reduce overall output.
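
Checking that 5-to-1 figure (again with my assumed wealth figures of roughly $80 billion versus $4,000):

```python
W_gates = 80e9  # assumed: roughly $80 billion
W_child = 4e3   # assumed: roughly $4,000

# With U(W) = W^0.9, U'(W) = 0.9 * W^(-0.1), so A = 1/U'(W) is
# proportional to W^0.1; the constant 0.9 cancels in the ratio.
ratio = (W_gates / W_child) ** 0.1
print(round(ratio, 1))  # about 5.4
```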

Of course, that commits you to saying that $1 to a Malawian child is only worth about $1.50 to you or me and $5 to Bill Gates. If you can truly believe this, then perhaps you can sleep at night accepting the outcomes of neoclassical economics. But can you, really, believe that? If you had the choice between an intervention that would give $100 to each of 10,000 children in Malawi, and another that would give $50,000 to each of 100 billionaires, would you really choose the billionaires? Do you really think that the world would be better off if you did?
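
The comparison between those two interventions is easy to compute under logarithmic utility, with assumed baseline wealths of $4,000 per Malawian child and $1 billion per billionaire (my own illustrative figures):

```python
import math

def utility_gain(n_people, baseline, transfer):
    # Total utility gained under U(W) = ln(W) from giving each of
    # n_people (all at the same baseline wealth) the same transfer.
    return n_people * (math.log(baseline + transfer) - math.log(baseline))

children = utility_gain(10_000, 4_000, 100)    # $100 each to 10,000 children
billionaires = utility_gain(100, 1e9, 50_000)  # $50,000 each to 100 billionaires

print(children > billionaires)  # True, by a factor in the tens of thousands
```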

We don’t have precise measurements of marginal utility of wealth, unfortunately. At the moment, I think logarithmic utility is the safest assumption; it’s about the slowest decrease that is consistent with the data we have and it is very intuitive and mathematically tractable. Perhaps I’m wrong and the decrease is even slower than that, say W^(-0.5) (then the market only values billionaires as worth thousands of times as much as starving children). But there’s no way you can go as far as it would take to justify our current distribution of wealth. W^(-0.1) is simply not a plausible value.

And this means that free markets, left to their own devices, will systematically fail to maximize human welfare. We need redistribution—a lot of redistribution. Don’t take my word for it; the math says so.