The Tragedy of the Commons

JDN 2457387

In a previous post I talked about one of the most fundamental problems in game theory (perhaps the most fundamental), the Prisoner’s Dilemma, and how neoclassical economic theory totally fails to explain actual human behavior when faced with this problem in both experiments and the real world.

As a brief review, the essence of the game is that both players can either cooperate or defect; if they both cooperate, the outcome is best overall; but it is always in each player’s interest to defect. So a neoclassically “rational” player would always defect—resulting in a bad outcome for everyone. But real human beings typically cooperate, and thus do better. The “paradox” of the Prisoner’s Dilemma is that being “rational” results in making less money at the end.

Obviously, this is not actually a good definition of rational behavior. Being short-sighted and ignoring the impact of your behavior on others doesn’t actually produce good outcomes for anybody, including yourself.

But the Prisoner’s Dilemma only has two players. If we expand to a larger number of players, the expanded game is called a Tragedy of the Commons.

When we do this, something quite surprising happens: As you add more people, their behavior starts converging toward the neoclassical solution, in which everyone defects and we get a bad outcome for everyone.

Indeed, people in general become less cooperative, less courageous, and more apathetic the more of them you put together. K was quite apt when he said, “A person is smart; people are dumb, panicky, dangerous animals and you know it.” There are ways to counteract this effect, as I’ll get to in a moment—but there is a strong effect that needs to be counteracted.

We see this most vividly in the bystander effect. If someone is walking down the street and sees someone fall and injure themselves, there is about a 70% chance that they will go try to help the person who fell—humans are altruistic. But if there are a dozen people walking down the street who all witness the same event, there is only a 40% chance that any of them will help—humans are irrational.

The primary reason appears to be diffusion of responsibility. When we are alone, we are the only one who could help, so we feel responsible for helping. But when there are others around, we assume that someone else could take care of it for us, so if it isn’t done that’s not our fault.

There also appears to be a conformity effect: We want to conform our behavior to social norms (as I said, to a first approximation, all human behavior is social norms). The mere fact that there are other people who could have helped but didn’t suggests the presence of an implicit social norm that we aren’t supposed to help this person for some reason. It never occurs to most people to ask why such a norm would exist or whether it’s a good one—it simply never occurs to most people to ask those questions about any social norms. In this case, by hesitating to act, people actually end up creating the very norm they think they are obeying.

This can lead to what’s called an Abilene Paradox, in which people simultaneously try to follow what they think everyone else wants and also try to second-guess what everyone else wants based on what they do, and therefore end up doing something that none of them actually wanted. I think a lot of the weird things humans do can actually be attributed to some form of the Abilene Paradox. (“Why are we sacrificing this goat?” “I don’t know, I thought you wanted to!”)

Autistic people are not as good at following social norms (though some psychologists believe this is simply because our social norms are optimized for the neurotypical population). My suspicion is that autistic people are therefore less likely to suffer from the bystander effect, and more likely to intervene to help someone even if they are surrounded by passive onlookers. (Unfortunately I wasn’t able to find any good empirical data on that—it appears no one has ever thought to check before.) I’m quite certain that autistic people are less likely to suffer from the Abilene Paradox—if they don’t want to do something, they’ll tell you so (which sometimes gets them in trouble).

Because of these psychological effects that blunt our rationality, in large groups human beings often do end up behaving in a way that appears selfish and short-sighted.

Nowhere is this more apparent than in ecology. Recycling, becoming vegetarian, driving less, buying more energy-efficient appliances, insulating buildings better, installing solar panels—none of these things are particularly difficult or expensive to do, especially when weighed against the tens of millions of people who will die if climate change continues unabated. Every recyclable can we throw in the trash is a silent vote for a global holocaust.

But as it no doubt immediately occurred to you to respond: No single one of us is responsible for all that. There’s no way I myself could possibly save enough carbon emissions to significantly reduce climate change—indeed, probably not even enough to save a single human life (though maybe). This is certainly true; the error lies in thinking that this somehow absolves us of the responsibility to do our share.

I think part of what makes the Tragedy of the Commons so different from the Prisoner’s Dilemma, at least psychologically, is that the latter has an identifiable victim: we know we are specifically hurting that person more than we are helping ourselves. We may even know their name (and if we don’t, we’re more likely to defect; simply being on the Internet makes people more aggressive because they don’t interact face-to-face). In the Tragedy of the Commons, it is often the case that we don’t know who any of our victims are; moreover, it’s quite likely that we harm each one less than we benefit ourselves, even though we harm everyone overall more.

Suppose that driving a gas-guzzling car gives me 1 milliQALY of happiness, but takes away an average of 1 nanoQALY from everyone else in the world. A nanoQALY is tiny! Negligible, even, right? One billionth of a year, a mere 30 milliseconds! Literally less than the blink of an eye. But take away 30 milliseconds from everyone on Earth and you have taken away 7 years of human life overall. Do that 10 times, and statistically one more person is dead because of you. And you have gained only 10 milliQALY, roughly the value of $300 to a typical American. Would you kill someone for $300?
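The arithmetic in that paragraph can be checked in a few lines. This is only a rough sketch: the world population of about 7.3 billion and the 70 QALY in a typical life are assumed round numbers that the paragraph does not state, and the $30 per milliQALY conversion is the one implied by "10 milliQALY, roughly the value of $300".

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600   # about 31.6 million seconds

WORLD_POPULATION = 7.3e9   # assumption: rough world population
LIFETIME_QALY = 70         # assumption: QALY in one typical human life
USD_PER_MILLIQALY = 30     # implied by "10 milliQALY is roughly $300"

# One nanoQALY expressed in milliseconds of life
nano_qaly_ms = 1e-9 * SECONDS_PER_YEAR * 1000

# Total harm of one act: 1 nanoQALY taken from each person on Earth
harm_per_act_qaly = 1e-9 * WORLD_POPULATION

# How many such acts add up to one statistical lifetime
acts_per_death = LIFETIME_QALY / harm_per_act_qaly

# What I gain from ten acts at 1 milliQALY each, in dollars
my_gain_usd = 10 * 1 * USD_PER_MILLIQALY

print(f"1 nanoQALY is about {nano_qaly_ms:.1f} ms")
print(f"Harm per act: about {harm_per_act_qaly:.1f} QALY in total")
print(f"Acts per statistical death: about {acts_per_death:.1f}")
print(f"My gain from ten acts: about ${my_gain_usd}")
```

Under these assumptions the round figures in the text (30 milliseconds, 7 years, 10 repetitions) hold up to within rounding.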

Peter Singer has argued that we should in fact think of it this way—when we cause a statistical death by our inaction, we should call it murder, just as if we had left a child to drown to keep our clothes from getting wet. I can’t agree with that. When you think seriously about the scale and uncertainty involved, it would be impossible to live at all if we were constantly trying to assess whether every action would lead to statistically more or less happiness to the aggregate of all human beings through all time. We would agonize over every cup of coffee, every new video game. In fact, the global economy would probably collapse because none of us would be able to work or willing to buy anything for fear of the consequences—and then whom would we be helping?

That uncertainty matters. Even the fact that there are other people who could do the job matters. If a child is drowning and there is a trained lifeguard right next to you, the lifeguard should go save the child, and if they don’t it’s their responsibility, not yours. Maybe if they don’t you should try; but really they should have been the one to do it.

But we must also not allow ourselves to simply fall into apathy, to do nothing simply because we cannot do everything. We cannot assess the consequences of every specific action into the indefinite future, but we can find general rules and patterns that govern the consequences of actions we might take. (This is the difference between act utilitarianism, which is unrealistic, and rule utilitarianism, which I believe is the proper foundation for moral understanding.)

Thus, I believe the solution to the Tragedy of the Commons is policy. It is to coordinate our actions together, and create enforcement mechanisms to ensure compliance with that coordinated effort. We don’t look at acts in isolation, but at policy systems holistically. The proper question is not “What should I do?” but “How should we live?”

In the short run, this can lead to results that seem deeply suboptimal—but in the long run, policy answers lead to sustainable solutions rather than quick-fixes.

People are starving! Why don’t we just steal money from the rich and use it to feed people? Well, think about what would happen if we said that the property system can simply be unilaterally undermined if someone believes they are achieving good by doing so. The property system would essentially collapse, along with the economy as we know it. A policy answer to that same question might involve progressive taxation enacted by a democratic legislature—we agree, as a society, that it is justified to redistribute wealth from those who have much more than they need to those who have much less.

Our government is corrupt! We should launch a revolution! Think about how many people die when you launch a revolution. Think about past revolutions. While some did succeed in bringing about more just governments (e.g. the French Revolution, the American Revolution), they did so only after a long period of strife; and other revolutions (e.g. the Russian Revolution, the Iranian Revolution) have made things even worse. Revolution is extremely costly and highly unpredictable; we must use it only as a last resort against truly intractable tyranny. The policy answer is of course democracy; we establish a system of government that elects leaders based on votes, and then if they become corrupt we vote to remove them. (Sadly, we don’t seem so good about that second part—the US Congress has a 14% approval rating but a 95% re-election rate.)

And in terms of ecology, this means that berating ourselves for our sinfulness in forgetting to recycle or not buying a hybrid car does not solve the problem. (Not that it’s bad to recycle, drive a hybrid car, and eat vegetarian—by all means, do these things. But it’s not enough.) We need a policy solution, something like a carbon tax or cap-and-trade that will enforce incentives against excessive carbon emissions.

In case you don’t think politics makes a difference, all of the Democratic candidates for President have proposed such plans: Bernie Sanders favors a carbon tax, Martin O’Malley supports an aggressive cap-and-trade plan, and Hillary Clinton favors heavily subsidizing wind and solar power. The Republican candidates on the other hand? Most of them don’t even believe in climate change. Chris Christie and Carly Fiorina at least accept the basic scientific facts, but (1) they are very unlikely to win at this point and (2) even they haven’t announced any specific policy proposals for dealing with it.

This is why voting is so important. We can’t do enough on our own; the coordination problem is too large. We need to elect politicians who will make policy. We need to use the systems of coordination enforcement that we have built over generations—and that is fundamentally what a government is, a system of coordination enforcement. Only then can we overcome the tendency among human beings to become apathetic and short-sighted when faced with a Tragedy of the Commons.

Tax incidence revisited, part 5: Who really pays the tax?

JDN 2457359

I think all the pieces are now in place to really talk about tax incidence.

In earlier posts I discussed how taxes have important downsides, then talked about how taxes can distort prices, then explained that taxes are actually what gives money its value. In the most recent post in the series, I used supply and demand curves to show precisely how taxes create deadweight loss.

Now at last I can get to the fundamental question: Who really pays the tax?

The common-sense answer would be that whoever writes the check to the government pays the tax, but this is almost completely wrong. It is right about one aspect, a sort of political economy notion, which is that if there is any trouble collecting the tax, it’s generally that person who is on the hook to pay it. But especially in First World countries, most taxes are collected successfully almost all the time. Tax avoidance—using loopholes to reduce your tax burden—is all over the place, but tax evasion—illegally refusing to pay the tax you owe—is quite rare. And for this political economy argument to hold, you really need significant amounts of tax evasion and enforcement against it.

The real economic answer is that the person who pays the tax is the person who bears the loss in surplus. In essence, the person who bears the tax is the person who is most unhappy about it.

In the previous post in this series, I explained what surplus is, but it bears a brief repetition. Surplus is the value you get from purchases you make, in excess of the price you paid to get them. It’s measured in dollars, because that way we can read it right off the supply and demand curve. We should actually be adjusting for marginal utility of wealth and measuring in QALY, but that’s a lot harder so it rarely gets done.

In the graphs I drew in part 4, I already talked about how the deadweight loss is much greater if supply and demand are elastic than if they are inelastic. But in those graphs I intentionally set it up so that the elasticities of supply and demand were about the same. What if they aren’t?

Consider what happens if supply is very inelastic, but demand is very elastic. In fact, to keep it simple, let’s suppose that supply is perfectly inelastic, but demand is perfectly elastic. This means that supply elasticity is 0, but demand elasticity is infinite.

The zero supply elasticity means that the worker would actually be willing to work up to their maximum hours for nothing, but is unwilling to go above that regardless of the wage. They have a specific amount of hours they want to work, regardless of what they are paid.

The infinite demand elasticity means that each hour of work is worth exactly the same amount to the employer, with no diminishing returns. They have a specific wage they are willing to pay, regardless of how many hours it buys.

Both of these are quite extreme; it’s unlikely that in real life we would ever have an elasticity that is literally zero or infinity. But we do actually see elasticities that get very low or very high, and qualitatively they act the same way.

So let’s suppose as before that the wage is $20 and the number of hours worked is 40. The supply and demand graph actually looks a little weird: There is no consumer surplus whatsoever.


Each hour is worth $20 to the employer, and that is what they shall pay. The whole graph is full of producer surplus; the worker would have been willing to work for free, but instead gets $20 per hour for 40 hours, so they gain a whopping $800 in surplus.


Now let’s implement a tax, say 50% to make it easy. (That’s actually a huge payroll tax, and if anybody ever suggested implementing that I’d be among the people pulling out a Laffer curve to show them why it’s a bad idea.)

Normally a tax would push the demand wage higher, but in this case $20 is exactly what they can afford, so they continue to pay exactly the same as if nothing had happened. This is the extreme example in which your “pre-tax” wage is actually your pre-tax wage, what you’d get if there hadn’t been a tax. This is the only such example—if demand elasticity is anything less than infinity, the wage you see listed as “pre-tax” will in fact be higher than what you’d have gotten in the absence of the tax.

The tax is therefore borne entirely by the worker; they used to take home $20 per hour, but now they only get $10. Their new surplus is only $400, precisely 50% lower. The extra $400 goes directly to the government, which makes this example unusual in another way: There is no deadweight loss. The employer is completely unaffected; their surplus goes from zero to zero. No surplus is destroyed, only moved. Surplus is simply redistributed from the worker to the government, so the worker bears the entirety of the tax. Note that this is true regardless of who actually writes the check; I didn’t even have to include that in the model. Once we know that there was a tax imposed on each hour of work, the market prices decided who would bear the burden of that tax.
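To make the bookkeeping in this extreme case concrete, here is a minimal sketch. The function name `split_surplus` is purely illustrative. It assumes the tax is a fraction of the $20 wage, that the worker's reservation wage is zero, and that each hour is worth exactly $20 to the employer, as described above.

```python
def split_surplus(hours: float, wage: float, tax_rate: float):
    """Return (worker_surplus, employer_surplus, tax_revenue).

    Assumes perfectly inelastic supply (hours never change, reservation
    wage of zero) and perfectly elastic demand (each hour is worth exactly
    `wage` to the employer, so the employer's surplus is always zero).
    """
    tax_revenue = hours * wage * tax_rate
    worker_surplus = hours * wage - tax_revenue
    employer_surplus = 0.0
    return worker_surplus, employer_surplus, tax_revenue

before = split_surplus(hours=40, wage=20.0, tax_rate=0.0)
after = split_surplus(hours=40, wage=20.0, tax_rate=0.5)

# Deadweight loss is whatever total surplus disappears once revenue is counted.
deadweight_loss = sum(before) - sum(after)

print("before tax:", before)
print("after tax: ", after)
print("deadweight loss:", deadweight_loss)
```

The worker's surplus goes from $800 to $400, the government collects $400, and nothing is lost in between.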

By Jove, we’ve actually found an example in which it’s fair to say “the government is taking my hard-earned money!” (I’m fairly certain if you replied to such people with “So you think your supply elasticity is zero but your employer’s demand elasticity is infinite?” you would be met with blank stares or worse.)

This is however quite an extreme case. Let’s try a more realistic example, where supply elasticity is very small, but not zero, and demand elasticity is very high, but not infinite. I’ve made the demand elasticity -10 and the supply elasticity 0.5 for this example.


Before the tax, the wage was $20 for 40 hours of work. The worker received a producer surplus of $700. The employer received a consumer surplus of only $80. The reason their demand is so elastic is that they are only barely getting more from each hour of work than they have to pay.

Total surplus is $780.


After the tax, the number of hours worked has dropped to 35. The “pre-tax” (demand) wage has only risen to $20.25. The after-tax (supply) wage the worker actually receives has dropped all the way to $10. The employer’s surplus has only fallen to $65.63, a decrease of $14.37 or 18%. Meanwhile the worker’s surplus has fallen all the way to $325, a decrease of $375 or 54%. The employer does feel the tax, but in both absolute and relative terms, the worker feels the tax much more than the employer does.

The tax revenue is $358.75, which means that the total surplus has been reduced to $749.38. There is now $30.62 of deadweight loss. Where both elasticities are finite and nonzero, deadweight loss is basically inevitable.
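The accounting behind those totals can be reproduced directly from the figures quoted above. This sketch takes the surplus numbers as given rather than deriving them from particular supply and demand curves; the variable names are just for illustration.

```python
# Figures as quoted in the text
hours_after = 35
demand_wage = 20.25      # what the employer pays per hour after the tax
supply_wage = 10.00      # what the worker keeps per hour after the tax

worker_before, employer_before = 700.00, 80.00
worker_after, employer_after = 325.00, 65.63

# Revenue is the tax wedge (demand wage minus supply wage) times hours worked
tax_revenue = hours_after * (demand_wage - supply_wage)

total_before = worker_before + employer_before
total_after = worker_after + employer_after + tax_revenue

# Deadweight loss is the surplus that went to nobody
deadweight_loss = total_before - total_after

print(f"tax revenue:     ${tax_revenue:.2f}")
print(f"total before:    ${total_before:.2f}")
print(f"total after:     ${total_after:.2f}")
print(f"deadweight loss: ${deadweight_loss:.2f}")
```

The key identity is that worker surplus, employer surplus, and tax revenue together fall short of the pre-tax total by exactly the deadweight loss.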

In this more realistic example, the burden was shared somewhat, but it still mostly fell on the worker, because the worker had a much lower elasticity. Let’s try turning the tables and making demand elasticity low while supply elasticity is high—in fact, once again let’s illustrate by using the extreme case of zero versus infinity.

In order to do this, I need to also set a maximum wage the employer is willing to pay. With nonzero elasticity, that maximum sort of came out automatically as the wage at which the demand curve hits zero hours; but when elasticity is zero, the demand line is vertical, parallel to the wage axis, so it never crosses it. Let’s say in this case that the maximum is $50 per hour.

(Think about why we didn’t need to set a minimum wage for the worker when supply was perfectly inelastic—there already was a minimum, zero.)


This graph looks deceptively similar to the previous; basically all that has happened is the supply and demand curves have switched places, but that makes all the difference. Now instead of the worker getting all the surplus, it’s the employer who gets all the surplus. At their maximum wage of $50, they are getting $1200 in surplus.

Now let’s impose that same 50% tax again.


The worker will not accept any wage less than $20, so the demand wage must rise all the way to $40. The government will then receive $800 in revenue, while the employer will only get $400 in surplus. Notice again that the deadweight loss is zero. The employer will now bear the entire burden of the tax.
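Here is the same kind of check for this reversed case, as a small sketch. It assumes the 50% tax is taken as a share of the demand ("pre-tax") wage, which is what makes a $40 demand wage leave the worker exactly $20; the variable names are illustrative.

```python
hours = 40          # perfectly inelastic demand: the employer always buys 40 hours
max_wage = 50.0     # the most the employer would pay per hour
min_wage = 20.0     # perfectly elastic supply: the worker accepts exactly $20
tax_rate = 0.5      # assumed to be a share of the demand wage

# The demand wage must rise until the worker's after-tax wage is still $20
demand_wage = min_wage / (1 - tax_rate)

employer_before = hours * (max_wage - min_wage)
employer_after = hours * (max_wage - demand_wage)
tax_revenue = hours * (demand_wage - min_wage)
worker_surplus = 0.0   # the worker is paid exactly their reservation wage either way

deadweight_loss = employer_before - (employer_after + tax_revenue)

print(f"demand wage:      ${demand_wage:.2f}")
print(f"employer surplus: ${employer_before:.0f} -> ${employer_after:.0f}")
print(f"tax revenue:      ${tax_revenue:.0f}")
print(f"deadweight loss:  ${deadweight_loss:.0f}")
```

Every dollar of revenue comes out of the employer's surplus, and again nothing is destroyed along the way.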

In this case the “pre-tax” wage is basically meaningless; regardless of the value of the tax the worker would receive the same amount, and the “pre-tax” wage is really just an accounting mechanism the government uses to say how large the tax is. They could just as well have said, “Hey employer, give us $800!” and the outcome would be the same. This is called a lump-sum tax, and they don’t work in the real world but are sometimes used for comparison. The thing about a lump-sum tax is that it doesn’t distort prices in any way, so in principle you could use it to redistribute wealth however you want. But in practice, there’s no way to implement a lump-sum tax that would be large enough to raise sufficient revenue but small enough to be affordable by the entire population. Also, a lump-sum tax is extremely regressive, hurting the poor tremendously while the rich feel nothing. (Actually the closest I can think of to a realistic lump-sum tax would be a basic income, which is essentially a negative lump-sum tax.)

I could keep going with more examples, but the basic argument is the same.

In general what you will find is that the person who bears a tax is the person who has the most to lose if less of that good is sold. This will mean their supply or demand is very inelastic and their surplus is very large.

Inversely, the person who doesn’t feel the tax is the person who has the least to lose if the good stops being sold. That will mean their supply or demand is very elastic and their surplus is very small.

Once again, it really does not matter how the tax is collected. It could be taken entirely from the employer, or entirely from the worker, or shared 50-50, or 60-40, or whatever. As long as it actually does get paid, the person who will actually feel the tax depends upon the structure of the market, not the method of tax collection. Raising “employer contributions” to payroll taxes won’t actually make workers take any more home; their “pre-tax” wages will simply be adjusted downward to compensate. Likewise, raising the “employee contribution” won’t actually put more money in the pockets of the corporation, it will just force them to raise wages to avoid losing employees. The actual amount that each party must contribute to the tax isn’t based on how the checks are written; it’s based on the elasticities of the supply and demand curves.
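One way to see how elasticities alone set the split is the standard textbook approximation for a small tax: the seller's share of the burden is |Ed| / (Es + |Ed|) and the buyer's share is Es / (Es + |Ed|). That formula is not derived anywhere in this series, and it only holds approximately for small taxes, so treat the sketch below as an illustration rather than a restatement of the examples above. The function name `incidence_shares` is made up for the purpose.

```python
def incidence_shares(supply_elasticity: float, demand_elasticity: float):
    """Approximate (seller_share, buyer_share) of the burden of a small tax.

    Uses the textbook first-order approximation; nothing about who writes
    the check enters the calculation.
    """
    es = abs(supply_elasticity)
    ed = abs(demand_elasticity)
    if es + ed == 0:
        raise ValueError("shares are undefined when both elasticities are zero")
    seller_share = ed / (es + ed)   # the less elastic the seller, the more they bear
    buyer_share = es / (es + ed)
    return seller_share, buyer_share

# Elasticities from the more realistic example: supply 0.5, demand -10.
# In a labor market the worker is the seller and the employer is the buyer.
worker_share, employer_share = incidence_shares(0.5, -10)
print(f"worker bears about {worker_share:.0%}, employer about {employer_share:.0%}")
```

The inputs are only the two elasticities, which is the point: the statutory split between "employer" and "employee" contributions never appears.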

And that’s why I actually can’t get that strongly behind corporate taxes; even though they are formally collected from the corporation, they could simply be hurting customers or employees. We don’t actually know; we really don’t understand the incidence of corporate taxes. I’d much rather use income taxes or even sales taxes, because we understand the incidence of those.


JDN 2457202 EDT 17:52.

The 1992 Bill Clinton campaign had a slogan, “It’s the economy, stupid.” A snowclone I’ve used on occasion is “It’s the externalities, stupid.” (Though I’m actually not all that fond of calling people ‘stupid’; though occasionally true, it is never polite and rarely useful.) Externalities are one of the most important concepts in economics, and yet one that even all too many economists frequently neglect.

Fortunately for this one, I really don’t need much math; the concept isn’t even that complicated, which makes it all the more mysterious how frequently it is ignored. An externality is simply an effect that an action has upon those who were not involved in choosing to perform that action.

All sorts of actions have externalities; indeed, much rarer are actions that don’t. An obvious example is that punching someone in the face has the externality of injuring that person. Pollution is an important externality of many forms of production, because the people harmed by pollution are typically not the same people who were responsible for creating it. Traffic jams are created because every car on the road causes a congestion externality on all the other cars.

All the aforementioned are negative externalities, but there are also positive externalities. When one individual becomes educated, they tend to improve the overall economic viability of the place in which they live. Building infrastructure benefits whole communities. New scientific discoveries enhance the well-being of all humanity.

Externalities are a fundamental problem for the functioning of markets. In the absence of externalities—if each person’s actions only affected that one person and nobody else—then rational self-interest would be optimal and anything else would make no sense. In arguing that rationality is equivalent to self-interest, generations of economists have been, tacitly or explicitly, assuming that there are no such things as externalities.

This is a necessary assumption to show that self-interest would lead to something I discussed in an earlier post: Pareto-efficiency, in which the only way to make one person better off is to make someone else worse off. As I already talked about in that other post, Pareto-efficiency is wildly overrated; a wide variety of Pareto-efficient systems would be intolerable to actually live in. But in the presence of externalities, markets can’t even guarantee Pareto-efficiency, because it’s possible to have everyone acting in their rational self-interest cause harm to everyone at once.

This is called a tragedy of the commons; the basic idea is really quite simple. Suppose that when I burn a gallon of gasoline, that makes me gain 5 milliQALY by driving my car, but then makes everyone lose 1 milliQALY in increased pollution. On net, I gain 4 milliQALY, so if I am rational and self-interested I would do that. But now suppose that there are 10 people all given the same choice. If we all make that same choice, each of us will gain 5 milliQALY, and then lose 10 milliQALY. We would all have been better off if none of us had done it, even though it made sense to each of us at the time. Burning a gallon of gasoline to drive my car is beneficial to me, more so than the release of carbon dioxide into the atmosphere is harmful to me; but as a result of millions of people burning gasoline, the carbon dioxide in the atmosphere is destabilizing our planet’s climate. We’d all be better off if we could find some way to burn less gasoline.
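The structure of that example is easy to lay out in code. This is a minimal sketch using the per-gallon figures above (5 milliQALY gained by the person who burns, 1 milliQALY lost by every person, the burner included); the function and constant names are just for illustration.

```python
GAIN_TO_BURNER = 5   # milliQALY gained by whoever burns a gallon
HARM_TO_EACH = 1     # milliQALY lost by every person per gallon burned
PEOPLE = 10

def payoff(i_burn: bool, others_burning: int) -> int:
    """Net milliQALY for one person, given how many of the others burn."""
    gallons = others_burning + (1 if i_burn else 0)
    gain = GAIN_TO_BURNER if i_burn else 0
    return gain - HARM_TO_EACH * gallons

# Whatever the other nine do, I come out 4 milliQALY ahead by burning.
advantage = [payoff(True, k) - payoff(False, k) for k in range(PEOPLE)]

alone = payoff(True, 0)                  # only I burn
everyone = payoff(True, PEOPLE - 1)      # all ten of us burn
nobody = payoff(False, 0)                # none of us burn

print("advantage of burning, by number of others burning:", advantage)
print("only I burn:", alone)
print("everyone burns:", everyone)
print("nobody burns:", nobody)
```

Burning is the better choice for each individual in every scenario, yet everyone burning leaves each person worse off than nobody burning, which is exactly what makes it a tragedy of the commons.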

In order for rational self-interest to be optimal, externalities have to somehow be removed from the system. Otherwise, there are actions we can take that benefit ourselves but harm other people—and thus, we would all be better off if we acted to some degree altruistically. (When I say things like this, most non-economists think I am saying something trivial and obvious, while most economists insist that I am making an assertion that is radical if not outright absurd.)

But of course a world without externalities is a world of complete isolation; it’s a world where everyone lives on their own deserted island and there is no way of communicating or interacting with any other human being in the world. The only reasonable question about this world is whether we would die first or go completely insane first; clearly those are the two things that would happen. Human beings are fundamentally social animals—I would argue that we are in fact more social even than eusocial animals like ants and bees. (Ants and bees are only altruistic toward their own kin; humans are altruistic to groups of millions of people we’ve never even met.) Humans without social interaction are like flowers without sunlight.

Indeed, externalities are so common that if markets only worked in their absence, markets would make no sense at all. Fortunately this isn’t true; there are some ways that markets can be adjusted to deal with at least some kinds of externalities.

One of the most well-known is the Coase theorem; this is odd because it is by far the worst solution. The Coase theorem basically says that if you can assign and enforce well-defined property rights and there is absolutely no cost in making any transaction, markets will automatically work out all externalities. The basic idea is that if someone is about to perform an action that would harm you, you can instead pay them not to do it. Then, the harm to you will be prevented and they will receive an additional benefit.

In the above example, we could all agree to pay $30 (which let’s say is worth 1 milliQALY) to each person who doesn’t burn a gallon of gasoline that would pollute our air. Then, if I were thinking about burning some gasoline, I wouldn’t want to do it, because I’d lose the $300 in payments, which costs me 10 milliQALY, while the benefits of burning the gasoline are only 5 milliQALY. We all reason the same way, and the result is that nobody burns gasoline and actually the money exchanged all balances out so we end up where we were before. The result is that we are all better off.

The first thought you probably have is: How do I pay everyone who doesn’t hurt me? How do I even find all those people? How do I ensure that they follow through and actually don’t hurt me? These are the problems of transaction costs and contract enforcement that are usually presented as the problem with the Coase theorem, and they certainly are very serious problems. You end up needing some sort of government simply to enforce all those contracts, and even then there’s the question of how we can possibly locate everyone who has ever polluted our air or our water.

But in fact there’s an even more fundamental problem: This is extortion. We are almost always in the condition of being able to harm other people, and a system in which the reason people don’t hurt each other is because they’re constantly paying each other not to is a system in which the most intimidating psychopath is the wealthiest person in the world. That system is in fact Pareto-efficient (the psychopath does quite well for himself indeed); but it’s exactly the sort of Pareto-efficient system that isn’t worth pursuing.

Another response to externalities is simply to accept them, which isn’t as awful as it sounds. There are many kinds of externalities that really aren’t that bad, and anything we might do to prevent them is likely to make the cure worse than the disease. Think about the externality of people standing in front of you in line, or the externality of people buying the last cereal box off the shelf before you can get there. The externality of taking the job you applied for may hurt at the time, but in the long run that’s how we maintain a thriving and competitive labor market. In fact, even the externality of ‘gentrifying’ your neighborhood so you can no longer afford it is not nearly as bad as most people seem to think; indeed, the much larger problem seems to be the poor neighborhoods that don’t have rising incomes, remaining poor for generations. (It also makes no sense to call this “gentrifying”; the only landed gentry we have in America is the landowners who claim a ludicrous proportion of our wealth, not the middle-class people who buy cheap homes and move in. If you really want to talk about a gentry, you should be thinking Waltons and Kochs, or Bushes and Clintons.) These sorts of minor externalities that are better left alone are sometimes characterized as pecuniary externalities because they usually are linked to prices, but I think that really misses the point; it’s quite possible for an externality to be entirely price-related and do enormous damage (read: the entire financial system) and to have little or nothing to do with prices and still be not that bad (like standing in line as I mentioned above).

But obviously we can’t leave all externalities alone in this way. We can’t just let people rob and murder one another arbitrarily, or ignore the destruction of the world’s climate that threatens hundreds of millions of lives. We can’t stand back and let forests burn and rivers run dry when we could easily have saved them.

The much more reasonable and realistic response to externalities is what we call government—there are rules you have to follow in society and punishments you face if you don’t. We can avoid most of the transaction problems involved in figuring out who polluted our water by simply making strict rules about polluting water in general. We can prevent people from stealing each other’s things or murdering each other by police who will investigate and punish such crimes.

This is why regulation—and a government strong enough to enforce that regulation—is necessary for the functioning of a society. This dichotomy we have been sold about “regulations versus the market” is totally nonsensical; the market depends upon regulations. This doesn’t justify any particular regulation—and indeed, an awful lot of regulations are astonishingly bad. But some sort of regulatory system is necessary for a market to function at all, and the question has never been whether we will have regulations but which regulations we will have. People who argue that all regulations must go and the market would somehow work on its own are either deeply ignorant of economics or operating from an ulterior motive; some truly horrendous policies have been made by arguing that “less government is always better” when the truth is nothing of the sort.

In fact, there is one real-world method I can think of that actually comes reasonably close to eliminating all externalities—and it is called social democracy. By involving everyone—democracy—in a system that regulates the economy—socialism—we can, in a sense, involve everyone in every transaction, and thus make it impossible to have externalities. In practice it’s never that simple, of course; but the basic concept of involving our whole society in making the rules that our society will follow is sound—and in fact I can think of no reasonable alternative.

We have to institute some sort of regulatory system, but then we need to decide what the regulations will be and who will control them. If we want to instead vest power in a technocratic elite, how do we decide whom to include in that elite? How do we ensure that the technocrats are actually better for the general population if there is no way for that general population to have a say in their election? By involving as many people as we can in the decision-making process, we make it much less likely that one person’s selfish action will harm many others. Indeed, this is probably why democracy prevents famine and genocide—which are, after all, rather extreme examples of negative externalities.

Monopoly and Oligopoly

JDN 2457180 EDT 08:49

Welcome to the second installment in my series, “Top 10 Things to Know About Economics.” The first was not all that well-received, because it turns out it was just too dense with equations (it didn’t help that the equation formatting was a pain). Fortunately I think I can explain monopoly and oligopoly with far fewer equations—which I will represent as PNG for your convenience.

You probably already know at least in basic terms how a monopoly works: When there is only one seller of a product, that seller can charge higher prices. But did you ever stop and think about why they can charge higher prices—or why they’d want to?

The latter question is not as trivial as it sounds; higher prices don’t necessarily mean higher profits. By the Law of Demand (which, like the Pirate Code, is really more like a guideline), raising the price of a product will result in fewer being sold. There are two countervailing effects: Raising the price raises the profits from selling each item, but reduces the number of items sold. The optimal price, therefore, is the one that balances these two effects, maximizing price times quantity, less the cost of production.

A monopoly can actually set this optimal price (provided that they can figure out what it is, of course; but let’s assume they can). They therefore solve the following maximization problem, where the price P(Q) is a function of the quantity sold Q, and the cost C(Q) is a function of the quantity produced (which at the optimum is equal to quantity sold; no sense making them if you won’t sell them!):
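Written out with those symbols, the problem would look like this (a reconstruction from the definitions in the text: revenue is price times quantity, and profit is revenue minus cost):

```latex
\max_{Q} \; P(Q)\,Q - C(Q)
```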


As you may remember if you’ve studied calculus, the maximum is achieved at the point where the derivative is zero. If you haven’t studied calculus, the basic intuition here is that you move along the curve seeing whether the profits go up or down with each small change, and when you reach the very top—the maximum—you’ll be at a point where you switch from going up to going down, and at that exact point a small change will move neither up nor down. The derivative is really just a fancy term for the slope of the curve at each point; at a maximum this slope changes from positive to negative, and at the exact point it is zero.
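Applied to the monopoly's profit, and using the product rule on the revenue term P(Q)Q, the zero-derivative condition would read as follows (again a reconstruction from the definitions in the text):

```latex
\frac{d}{dQ}\left[ P(Q)\,Q - C(Q) \right] = P'(Q)\,Q + P(Q) - C'(Q) = 0
```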



This is a general solution, but it’s easier to understand if we use something more specific. As usual, let’s make things simpler by assuming everything is linear; we’ll assume that demand starts at a maximum price of P0 and then decreases at a rate 1/e. This is the demand curve.


Then, we’ll assume that the marginal cost of production C'(Q) is also linear, increasing at a rate 1/n. This is the supply curve.
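In symbols, the two linear curves would look something like the following. The text fixes only the slope of marginal cost, so this sketch assumes it starts from zero at Q = 0; that intercept is an assumption, not something stated in the post.

```latex
P(Q) = P_0 - \frac{1}{e} Q \qquad \text{(demand)} \\
C'(Q) = \frac{1}{n} Q \qquad \text{(supply; zero intercept assumed)}
```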


Now we can graph the supply and demand curves from these equations. But the monopoly doesn’t simply set supply equal to demand; instead, they set supply equal to marginal revenue, which takes into account the fact that selling more items requires lowering the price on all of them. Marginal revenue is this term:
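Since revenue is P(Q)Q, differentiating with the product rule gives the first line below (a reconstruction from the definitions in the text). For a linear demand curve that starts at P_0 and falls at rate 1/e, it would reduce to the second line, which is why marginal revenue falls twice as fast as the price:

```latex
MR(Q) = \frac{d}{dQ}\left[ P(Q)\,Q \right] = P(Q) + P'(Q)\,Q \\
MR(Q) = P_0 - \frac{2}{e} Q \quad \text{when } P(Q) = P_0 - \frac{1}{e} Q
```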


This is strictly less than the actual price, because increasing the quantity sold requires decreasing the price—which means that P'(Q) < 0. They set the quantity by setting marginal revenue equal to marginal cost. Then they set the price by substituting that quantity back into the demand equation.

Thus, the monopoly should set this quantity:
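Assuming the linear demand curve P = P_0 - Q/e and a marginal cost of Q/n that starts from zero (the zero intercept is an assumption of this sketch; the text gives only the slope), setting marginal revenue equal to marginal cost would work out as:

```latex
P_0 - \frac{2}{e} Q = \frac{1}{n} Q
\quad \Longrightarrow \quad
Q_m = \frac{P_0}{\frac{1}{n} + \frac{2}{e}}
```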


They would then charge this price (substitute back into the demand equation):
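Under the same assumptions (linear demand P = P_0 - Q/e, marginal cost Q/n with an assumed zero intercept), and writing Q_m = P_0 / (1/n + 2/e) for the quantity where marginal revenue meets marginal cost, the price would come out as:

```latex
P_m = P_0 - \frac{1}{e} Q_m = P_0 \, \frac{\frac{1}{n} + \frac{1}{e}}{\frac{1}{n} + \frac{2}{e}}
```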


On a graph, there are the supply and demand curves, and then below the demand curve, the marginal revenue curve; it’s the intersection of that curve with the supply curve that the monopoly uses to set its quantity, and then it substitutes that quantity into the demand curve to get the price:


Now I’ll show that this is higher than the price in a perfectly competitive market. In a competitive market, competitive companies can’t do anything to change the price, so from their perspective P'(Q) = 0. They can only control the quantity they produce and sell; they keep producing more as long as they receive more money for each one than it cost to produce it. By the Law of Diminishing Returns (again more like a guideline) the cost will increase as they produce more, until finally the last one they sell cost just as much to make as they made from selling it. (Why bother selling that last one, you ask? You’re right; they’d actually sell one less than this, but if we assume that we’re talking about thousands of products sold, one shouldn’t make much difference.)

Price is simply equal to marginal cost:
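In the notation of the monopoly problem, with P'(Q) = 0 the marginal revenue collapses to the price itself, so the condition would simply be:

```latex
P(Q) = C'(Q)
```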


In our specific linear case that comes out to this quantity:
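Assuming linear demand P = P_0 - Q/e and a marginal cost of Q/n that starts from zero (the intercept is an assumption of this sketch), setting price equal to marginal cost would give:

```latex
P_0 - \frac{1}{e} Q = \frac{1}{n} Q
\quad \Longrightarrow \quad
Q_c = \frac{P_0}{\frac{1}{n} + \frac{1}{e}}
```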


Therefore, they charge this price (you can substitute into either the supply or demand equations, because in a competitive market supply equals demand):
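With the competitive quantity Q_c = P_0 / (1/n + 1/e) that follows from linear demand P = P_0 - Q/e and an assumed zero-intercept marginal cost Q/n, substituting into the supply equation would give:

```latex
P_c = \frac{1}{n} Q_c = \frac{P_0}{1 + \frac{n}{e}}
```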


Subtract the two, and you can see that monopoly price is higher than the competitive price by this amount:
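A reconstruction of that subtraction, assuming linear demand P = P_0 - Q/e and marginal cost Q/n with a zero intercept (an assumption of this sketch). Multiplying numerators and denominators through by ne, the monopoly price is P_0 (e+n)/(e+2n) and the competitive price is P_0 e/(e+n), so:

```latex
P_m - P_c = P_0 \left[ \frac{e+n}{e+2n} - \frac{e}{e+n} \right]
= P_0 \, \frac{(e+n)^2 - e(e+2n)}{(e+n)(e+2n)}
= P_0 \, \frac{n^2}{(e+n)(e+2n)}
```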


Notice that the monopoly price will always be larger than the competitive price, so long as e > 0 and n > 0, meaning that increasing the quantity sold requires decreasing the price, but increasing the cost of production. A monopoly has an incentive to raise the price higher than the competitive price, but not too much higher—they still want to make sure they sell enough products.

Monopolies introduce deadweight loss, because in order to hold the price up they don’t produce as many products as people actually want. More precisely, each new product produced would add overall value to the economy, but the monopoly stops producing them anyway because it wouldn’t add to their own profits.
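To make the comparison concrete, here is a minimal numerical sketch of the linear case. It assumes demand P = P0 - Q/e and marginal cost Q/n starting from zero (the zero intercept and the sample numbers are illustrative assumptions, not figures from the post), and measures the deadweight loss as the triangle between the demand and supply curves over the units the monopoly declines to produce.

```python
def monopoly(p0, e, n):
    """Quantity and price where marginal revenue (P0 - 2Q/e) equals marginal cost (Q/n)."""
    q = p0 / (1 / n + 2 / e)
    return q, p0 - q / e


def competitive(p0, e, n):
    """Quantity and price where the demand price (P0 - Q/e) equals marginal cost (Q/n)."""
    q = p0 / (1 / n + 1 / e)
    return q, p0 - q / e


def deadweight_loss(p0, e, n):
    """Area of the triangle between demand and supply from the monopoly to the competitive quantity."""
    q_m, p_m = monopoly(p0, e, n)
    q_c, _ = competitive(p0, e, n)
    marginal_cost_at_qm = q_m / n
    return 0.5 * (q_c - q_m) * (p_m - marginal_cost_at_qm)


# Made-up parameters, purely for illustration.
p0, e, n = 100.0, 2.0, 1.0
q_m, p_m = monopoly(p0, e, n)
q_c, p_c = competitive(p0, e, n)
dwl = deadweight_loss(p0, e, n)

print(f"monopoly:    Q = {q_m:.2f}, P = {p_m:.2f}")
print(f"competitive: Q = {q_c:.2f}, P = {p_c:.2f}")
print(f"price gap = {p_m - p_c:.2f}, deadweight loss = {dwl:.2f}")
```

With these numbers the monopoly sells fewer units at a higher price than the competitive market would, and the price gap matches P0 n^2 / ((e+n)(e+2n)).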

One “solution” to this problem is to let the monopoly actually take those profits; they can do this if they price-discriminate, charging a higher price for some customers than others. In the best-case scenario (for them), they charge each customer a price that they are just barely willing to pay, and thus produce until no customer is willing to pay more than the product costs to make. That final product sold also has price equal to marginal cost, so the total quantity sold is the same as under competition. It is, in that sense, “efficient”.

What many neoclassical economists seem to forget about price-discriminating monopolies is that they appropriate the entire surplus value of the product—the customers are only just barely willing to buy; they get no surplus value from doing so.

In reality, very few monopolies can price-discriminate that precisely; instead, they put customers into broad categories and then try to optimize the price for each of those categories. Credit ratings, student discounts, veteran discounts, even happy hours are all forms of this categorical price discrimination. If the company cares even a little bit about what sort of customer you are rather than how much money you’re paying, they are price-discriminating.

It’s so ubiquitous I’m actually having trouble finding a good example of a product that doesn’t have categorical price discrimination. I was thinking maybe computers? Nope, student discounts. Cars? No, employee discounts and credit ratings. Refrigerators, maybe? Well, unless there are coupons (coupons price discriminate against people who don’t want to bother clipping them). Certainly not cocktails (happy hour) or haircuts (discrimination by sex, the audacity!); and don’t even get me started on software.

I introduced price-discrimination in the context of monopoly, which is usually how it’s done; but one thing you’ll notice about all the markets I just indicated is that they aren’t monopolies, yet they still exhibit price discrimination. Cars, computers, refrigerators, and software are made under oligopoly, a system in which a handful of companies control the majority of the market. As you might imagine, an oligopoly tends to act somewhere in between a monopoly and a competitive market—but there are some very interesting wrinkles I’ll get to in a moment.

Cocktails and haircuts are sold in a different but still quite interesting system called monopolistic competition; indeed, I’m not convinced that there is any other form of competition in the real world. True perfectly-competitive markets just don’t seem to actually exist. Under monopolistic competition, there are many companies that don’t have much control over price in the overall market, but the products they sell aren’t quite the same—they’re close, but not equivalent. Some barbers are just better at cutting hair, and some bars are more fun than others. More importantly, they aren’t the same for everyone. They have different customer bases, which may overlap but still aren’t the same. You don’t just want a barber who is good, you want one who works close to where you live. You don’t just want a bar that’s fun; you want one that you can stop by after work. Even if you are quite discerning and sensitive to price, you’re not going to drive from Ann Arbor to Cleveland to get your hair cut—it would cost more for the gasoline than the difference. And someone in Cleveland isn’t going to drive all the way to Ann Arbor, either! Hence, barbers in Ann Arbor have something like a monopoly (or oligopoly) over Ann Arbor haircuts, and barbers in Cleveland have something like a monopoly over Cleveland haircuts. That’s monopolistic competition.

Supposedly monopolistic competition drives profits to zero in the long run, but I’ve yet to see this happen in any real market. Maybe the problem is that conceit “the long run”; as Keynes said, “in the long run we are all dead.” Sometimes the argument is made that it has driven real economic profits to zero, because you’ve got to take into account the cost of entry, the normal profit. But of course, that’s extremely difficult to measure, so how do we know whether profits have been driven to normal profit? Moreover, the cost of entry isn’t the same for everyone, so people with lower cost of entry are still going to make real economic profits. This means that the majority of companies are going to still make some real economic profit, and only the ones that had the hardest time entering will actually see their profits driven to zero.

Monopolistic competition is relatively simple. Oligopoly, on the other hand, is fiercely complicated. Why? Because under oligopoly, you actually have to treat human beings as human beings.

What I mean by that is that under perfect competition or even monopolistic competition, the economic incentives are so powerful that people basically have to behave according to the neoclassical rational agent model, or they’re going to go out of business. There is very little room for errors or even altruistic acts, because your profit margin is so tight. In perfect competition, there is literally zero room; in monopolistic competition, the only room for individual behavior is provided by the degree of monopoly, which in most industries is fairly small. One person’s actions are unable to shift the direction of the overall market, so the market as a system has ultimate power.

Under oligopoly, on the other hand, there are a handful of companies, and people know their names. You as a CEO have a reputation with customers—and perhaps more importantly, a reputation with other companies. Individual decision-makers matter, and one person’s decision depends on their prediction of other people’s decisions. That means we need game theory.

The simplest case is that of duopoly, where there are only two major companies. Not many industries are like this, but I can think of three: soft drinks (Coke and Pepsi), commercial airliners (Boeing and Airbus), and home-user operating systems (Microsoft and Apple). In all three cases, there is also some monopolistic element, because the products they sell are not exactly the same; but for now let’s ignore that and suppose they are close enough that nobody cares.

Imagine yourself in the position of, say, Boeing: How much should you charge for an airplane?

If Airbus didn’t exist, it’s simple; you’d charge the monopoly price. But since they do exist, the price you charge must depend not only on the conditions of the market, but also what you think Airbus is likely to do—and what they are likely to do depends in turn on what they think you are likely to do.

If you think Airbus is going to charge the monopoly price, what should you do? You could charge the monopoly price as well, which is called collusion. It’s illegal to actually sign a contract with Airbus to charge that price (though this doesn’t seem to stop cable companies or banks—probably has something to do with the fact that we never punish them for doing it), and let’s suppose you as the CEO of Boeing are an honest and law-abiding citizen (I know, it’s pretty fanciful; I’m having trouble keeping a straight face myself) and aren’t going to violate the antitrust laws. You can still engage in tacit collusion, in which you both charge the monopoly price and take your half of the very high monopoly profits.

There’s a temptation not to collude, however, which the airlines who buy your planes are very much hoping you’ll succumb to. Suppose Airbus is selling their A350-1000 for $341 million. You could sell the comparable 777-300ER for $330 million and basically collude, or you could cut the price and draw in more buyers. Say you cut it to $250 million; it probably only costs $150 million to make, so you’re still making a profit on each one; but where you sold say 150 planes a year and profited $180 million on each (a total profit of $27 billion), you could instead capture the whole market and sell 300 planes a year and profit $100 million on each (a total profit of $30 billion). That’s a 10% higher profit and $3 billion a year for your shareholders; why wouldn’t you do that?

Well, think about what will happen when Airbus releases next year’s price list. You cut the price to $250 million, so they retaliate by cutting their price to $200 million. Next thing you know, you’re cutting your own price to $150.1 million just to stay in the market, and they’re doing the same. When the dust settles, you still only control half the market, but now you profit a mere $100,000 per airplane, making your total profits a measly $15 million instead of $27 billion—that’s $27,000 million. (I looked it up, and as it turns out, Boeing’s actual gross profit is about $14 billion, so I underestimated the real cost of each airplane—but they’re clearly still colluding.) For a gain of 10% in one year you’ve paid a loss of 99.95% indefinitely. The airlines will be thrilled, and they’ll likely pass on much of those savings to their customers, who will fly more often, engage in more tourism, and improve the economy in tourism-dependent countries like France and Greece, so the world may well be better off. But you as CEO of Boeing don’t care about the world; you care about the shareholders of Boeing—and the shareholders of Boeing just got hosed. Don’t expect to keep your seat in the next election.
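The arithmetic of the three scenarios can be checked in a few lines. This is only a sketch using the guessed figures from the text (a $150 million cost per plane and a 300-plane market split evenly unless one side undercuts); the function name is made up for illustration, and the percentages quoted in the text are rounded versions of what it computes.

```python
COST_PER_PLANE = 150.0  # $ million, the guessed production cost from the text


def total_profit(price, planes_sold):
    """Total yearly profit in $ million at a given price and sales volume."""
    return (price - COST_PER_PLANE) * planes_sold


collusion = total_profit(330.0, 150)   # half the market at the high price
undercut = total_profit(250.0, 300)    # the whole market at the lower price
price_war = total_profit(150.1, 150)   # half the market, price barely above cost

gain_from_undercutting = undercut / collusion - 1   # one-year gain
loss_from_price_war = 1 - price_war / collusion     # ongoing loss

print(f"collusion: ${collusion:,.0f} million")
print(f"undercut:  ${undercut:,.0f} million ({gain_from_undercutting:.1%} more)")
print(f"price war: ${price_war:,.1f} million ({loss_from_price_war:.2%} less)")
```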

But now, suppose you think that Airbus is planning on setting a price of $250 million next year anyway. They should know you’ll retaliate, but maybe their current CEO is retiring next year and doesn’t care what happens to the company after that or something. Or maybe they’re just stupid or reckless. In any case, your sources (which, as an upstanding citizen, obviously wouldn’t include any industrial espionage!) tell you that Airbus is going to charge $250 million next year.

Well, in that case there’s no point in you charging $330 million; you’ll lose the market and look like a sucker. You could drop to $250 million and try to set up a new, lower collusive equilibrium; but really what you want to do is punish them severely for backstabbing you. (After all, human beings are particularly quick to anger when we perceive betrayal.) So maybe you’ll charge $200 million and beat them at their own conniving game.

The next year, Airbus has a choice. They could raise back to $341 million and give you another year of big profits to atone for their reckless actions, or they could cut down to $180 million and keep the price war going. You might think that they should continue the war, but that’s short-term thinking; in the long run their best strategy is to atone for their actions and work to restore the collusion. In response, Boeing’s best strategy is to punish them when they break the collusion, but not hold a grudge; if they go back to the high price, Boeing should as well. This very simple strategy is called tit-for-tat, and it has performed remarkably well in simulations of this situation, which is technically called an iterated prisoner’s dilemma; it famously won Robert Axelrod’s computer tournaments.
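Here is a minimal sketch of an iterated prisoner's dilemma with tit-for-tat, to show the mechanics. "C" stands for keeping the high price and "D" for undercutting; the payoff numbers are conventional illustrative values in arbitrary units, not estimates of airliner profits, and the strategy and function names are made up for this example.

```python
# Payoffs as (row player, column player); illustrative assumption only.
PAYOFF = {
    ("C", "C"): (3, 3),  # both keep the high price
    ("C", "D"): (0, 5),  # you keep it, they undercut
    ("D", "C"): (5, 0),  # you undercut, they keep it
    ("D", "D"): (1, 1),  # price war
}


def tit_for_tat(own_history, their_history):
    """Cooperate first, then copy whatever the other side did last round."""
    return their_history[-1] if their_history else "C"


def always_defect(own_history, their_history):
    """Undercut every round, regardless of history."""
    return "D"


def play(strategy_a, strategy_b, rounds=10):
    """Play repeated rounds and return the two cumulative scores."""
    history_a, history_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = strategy_a(history_a, history_b)
        move_b = strategy_b(history_b, history_a)
        payoff_a, payoff_b = PAYOFF[(move_a, move_b)]
        score_a += payoff_a
        score_b += payoff_b
        history_a.append(move_a)
        history_b.append(move_b)
    return score_a, score_b


print(play(tit_for_tat, tit_for_tat))      # (30, 30)
print(play(tit_for_tat, always_defect))    # (9, 14)
print(play(always_defect, always_defect))  # (10, 10)
```

Note that tit-for-tat never beats its opponent within a single match; it loses by a little to a constant undercutter. Its strength is that two tit-for-tat players sustain the high-payoff outcome, so it does well in total across many pairings.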

What if there are more than two companies involved? Then things get even more complicated, because now we’re dealing with things like A’s prediction of what B predicts that C will predict A will do. In general this is a situation we only barely understand, and I think it is a topic that needs considerably more research than it has received.

There is an interesting simple model that actually seems to capture a lot about how oligopolies work, but no one can quite figure out why it works. That model is called Cournot competition. It assumes that companies take prices as fixed and compete by selecting the quantity they produce at each cycle. That’s incredibly bizarre; it seems much more realistic to say that they compete by setting prices. But if you do that, you get Bertrand competition, which requires us to go through that whole game-theory analysis—but now with three, or four, or ten companies!

Under Cournot competition, you decide how much to produce Q1 by monopolizing what’s left over after the other companies have produced their quantities Q2, Q3, and so on. If there are k companies, you optimize under the constraint that (k-1)Q2 has already been produced.

Let’s use our linear models again. Here, the quantity that goes into figuring the price is the total quantity, which is Q1+(k-1)Q2; while the quantity you sell is just Q1. But then, another weird part is that for the marginal cost function we use the whole market—maybe you’re limited by some natural resource, like oil or lithium?

It’s not as important for you to follow along with the algebra, though here you go if you want:
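A reconstruction of that algebra, assuming the same linear curves as in the monopoly case: demand P = P_0 - Q/e on the total quantity, and marginal cost Q/n on the total quantity with a zero intercept (the intercept is an assumption of this sketch). Firm 1 chooses Q_1 while taking the other k-1 firms' output as given:

```latex
\text{Total quantity: } Q_T = Q_1 + (k-1)\,Q_2 \\
\text{Marginal revenue of firm 1: }
\frac{\partial}{\partial Q_1}\left[ \left( P_0 - \frac{1}{e} Q_T \right) Q_1 \right]
= P_0 - \frac{1}{e} Q_T - \frac{1}{e} Q_1 \\
\text{Set equal to marginal cost: }
P_0 - \frac{1}{e} Q_T - \frac{1}{e} Q_1 = \frac{1}{n} Q_T
```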


Then the key point is that the situation is symmetric, so Q1 = Q2 = Q3 = Q. Then the total quantity produced, which is what consumers care about, is kQ. That’s what sets the actual price as well.


The two equations to focus on are these ones:
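A reconstruction under the linear curves used for the monopoly case (demand P = P_0 - Q/e; marginal cost Q/n on the whole market's quantity, with a zero intercept assumed). Imposing symmetry on firm 1's condition, marginal revenue equals marginal cost becomes P_0 - kQ/e - Q/e = kQ/n, which would give this total quantity and price:

```latex
kQ = \frac{P_0}{\frac{1}{n} + \frac{1}{e}\left( 1 + \frac{1}{k} \right)} \\
P = P_0 - \frac{1}{e}\,kQ = P_0 \, \frac{\frac{1}{n} + \frac{1}{ke}}{\frac{1}{n} + \frac{1}{e}\left( 1 + \frac{1}{k} \right)}
```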


If you plug in k=1, you get a monopoly. If you take the limit as k approaches infinity, you get perfect competition. And in between, you actually get a fairly accurate representation of how the number of companies in an industry affects the price and quantity sold! From some really bizarre assumptions about how competition works! The best explanation I’ve seen of why this might happen is this 1983 paper showing that price competition can behave like Cournot competition if companies have to first commit to producing a certain quantity before naming their prices.
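The two limits are easy to check numerically. This sketch assumes linear demand P = P0 - Q/e and a market-wide marginal cost of Q/n starting from zero (the zero intercept and the sample parameters are illustrative assumptions); under those assumptions the symmetric Cournot total quantity is P0 / (1/n + (1 + 1/k)/e).

```python
def cournot(p0, e, n, k):
    """Total quantity and price with k symmetric Cournot competitors."""
    total_quantity = p0 / (1 / n + (1 + 1 / k) / e)
    price = p0 - total_quantity / e
    return total_quantity, price


# Made-up parameters, purely for illustration.
p0, e, n = 100.0, 2.0, 1.0
monopoly_quantity = p0 / (1 / n + 2 / e)
competitive_quantity = p0 / (1 / n + 1 / e)

for k in (1, 2, 5, 20, 1000):
    total_quantity, price = cournot(p0, e, n, k)
    print(f"k = {k:4d}: total Q = {total_quantity:6.2f}, P = {price:6.2f}")
```

As k grows the price falls from the monopoly level toward the competitive one, and the total quantity rises correspondingly.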

But of course, it doesn’t always give an accurate representation of oligopoly, and for that we’ll probably need a much more sophisticated multiplayer game theory analysis which has yet to be done.

And that, dear readers, is how monopoly and oligopoly raise prices.

What you need to know about tax incidence

JDN 2457152 EDT 14:54.

I said in my previous post that I consider tax incidence to be one of the top ten things you should know about economics. If I actually try to make a top ten list, I think it goes something like this:

  1. Supply and demand
  2. Monopoly and oligopoly
  3. Externalities
  4. Tax incidence
  5. Utility, especially marginal utility of wealth
  6. Pareto-efficiency
  7. Risk and loss aversion
  8. Biases and heuristics, including sunk-cost fallacy, scope neglect, herd behavior, anchoring, and the representativeness heuristic
  9. Asymmetric information
  10. Winner-takes-all effect

So really tax incidence is in my top five things you should know about economics, and yet I still haven’t talked about it very much. Well, today I will. The basic principles of supply and demand I’m basically assuming you know, but I really should spend some more time on monopoly and externalities at some point.

Why is tax incidence so important? Because of one central fact: The person who pays the tax is not the person who writes the check.

It doesn’t matter whether a tax is paid by the buyer or the seller; it matters what the buyer and seller can do to avoid the tax. If you can change your behavior in order to avoid paying the tax—buy less stuff, or buy somewhere else, or deduct something—you will not bear the tax as much as someone else who can’t do anything to avoid the tax, even if you are the one who writes the check. If you can avoid it and they can’t, other parties in the transaction will adjust their prices in order to eat the tax on your behalf.

Thus, if you have a good that you absolutely must buy no matter what—like, say, table salt—and then we make everyone who sells that good pay an extra $5 per kilogram, I can guarantee you that you will pay an extra $5 per kilogram, and the suppliers will make just as much money as they did before. (A salt tax would be an excellent way to redistribute wealth from ordinary people to corporations, if you’re into that sort of thing. Not that we have any trouble doing that in America.)

On the other hand, if you have a good that you’ll only buy at a very specific price—like, say, fast food—then we can make you write the check for a tax of an extra $5 per kilogram you use, and in real terms you’ll pay hardly any tax at all, because the sellers will either eat the cost themselves by lowering the prices or stop selling the product entirely. (A fast food tax might actually be a good idea as a public health measure, because it would reduce production and consumption of fast food—remember, heart disease is one of the leading causes of death in the United States, making cheeseburgers a good deal more dangerous than terrorists—but it’s a bad idea as a revenue measure, because rather than pay it, people are just going to buy and sell less.)

In the limit in which supply and demand are both completely fixed (perfectly inelastic), you can tax however you want and it’s just free redistribution of wealth however you like. In the limit in which supply and demand are both locked into a single price (perfectly elastic), you literally cannot tax that good—you’ll just eliminate production entirely. There aren’t a lot of perfectly elastic goods in the real world, but the closest I can think of is cash. If you instituted a 2% tax on all cash withdrawn, most people would stop using cash basically overnight. If you want a simple way to make all transactions digital, find a way to enforce a cash tax. When you have a perfect substitute available, taxation eliminates production entirely.

To really make sense out of tax incidence, I’m going to need a lot of neoclassical economists’ favorite thing: Supply and demand curves. These things pop up everywhere in economics; and they’re quite useful. I’m not so sure about their application to things like aggregate demand and the business cycle, for example, but today I’m going to use them for the sort of microeconomic small-market stuff that they were originally designed for; and what I say here is going to be basically completely orthodox, right out of what you’d find in an ECON 301 textbook.

Let’s assume that things are linear, just to make the math easier. You’d get basically the same answers with nonlinear demand and supply functions, but it would be a lot more work. Likewise, I’m going to assume a unit tax on goods—like $2890 per hectare—as opposed to a proportional tax on sales—like 6% property tax—again, for mathematical simplicity.

The next concept I’m going to have to talk about is elasticity, which is the proportional amount that quantity sold changes relative to price. If price increases 2% and you buy 4% less, you have a demand elasticity of -2. If price increases 2% and you buy 1% less, you have a demand elasticity of -1/2. If price increases 3% and you sell 6% more, you have a supply elasticity of 2. If price decreases 5% and you sell 1% less, you have a supply elasticity of 1/5.
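The four examples above all follow the same recipe: percent change in quantity divided by percent change in price. A tiny sketch (the function name is made up for illustration):

```python
def elasticity(pct_change_quantity, pct_change_price):
    """Proportional change in quantity per proportional change in price."""
    return pct_change_quantity / pct_change_price


demand_elastic = elasticity(-4, 2)    # price up 2%, buy 4% less
demand_inelastic = elasticity(-1, 2)  # price up 2%, buy 1% less
supply_elastic = elasticity(6, 3)     # price up 3%, sell 6% more
supply_inelastic = elasticity(-1, -5) # price down 5%, sell 1% less

print(demand_elastic, demand_inelastic, supply_elastic, supply_inelastic)
# -2.0 -0.5 2.0 0.2
```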

Elasticity doesn’t have any units of measurement, it’s just a number—which is part of why we like to use it. It also has some very nice mathematical properties involving logarithms, but we won’t be needing those today.

The price that renters are willing and able to pay, the demand price PD will start at their maximum price, the reserve price PR, and then it will decrease linearly according to the quantity of land rented Q, according to a linear function (simply because we assumed that) which will vary according to a parameter e that represents the elasticity of demand (it isn’t strictly equal to it, but it’s sort of a linearization).

We’re interested in what is called the consumer surplus; it is equal to the total amount of value that buyers get from their purchases, converted into dollars, minus the amount they had to pay for those purchases. This we add to the producer surplus, which is the amount paid for those purchases minus the cost of producing them, which is basically just the same thing as profit. Together the consumer surplus and producer surplus make the total economic surplus, which economists generally try to maximize. Because different people have different marginal utility of wealth, this is actually a really terrible idea for deep and fundamental reasons—taking a house from Mitt Romney and giving it to a homeless person would most definitely reduce economic surplus, even though it would obviously make the world a better place. Indeed, I think that many of the problems in the world, particularly those related to inequality, can be traced to the fact that markets maximize economic surplus rather than actual utility. But for now I’m going to ignore all that, and pretend that maximizing economic surplus is what we want to do.

You can read off the economic surplus straight from the supply and demand curves; it’s the area between the lines. (Mathematically, it’s an integral; but that’s equivalent to the area under a curve, and with straight lines they’re just triangles.) I’m going to call the consumer surplus just “surplus”, and producer surplus I’ll call “profit”.

Below the demand curve and above the price is the surplus, and below the price and above the supply curve is the profit:


I’m going to be bold here and actually use equations! Hopefully this won’t turn off too many readers. I will give each equation in both a simple text format and in proper LaTeX. Remember, you can render LaTeX here.

PD = PR – 1/e * Q

P_D = P_R - \frac{1}{e} Q \\

The marginal cost that landlords have to pay, the supply price PS, is a bit weirder, as I’ll talk about more in a moment. For now let’s say that it is a linear function, starting at zero cost for some quantity Q0 and then increasing linearly according to a parameter n that similarly represents the elasticity of supply.

PS = 1/n * (Q – Q0)

P_S = \frac{1}{n} \left( Q - Q_0 \right) \\

Now, if you introduce a tax, there will be a difference between the price that renters pay and the price that landlords receive—namely, the tax, which we’ll call T. I’m going to assume that, on paper, the landlord pays the whole tax. As I said above, this literally does not matter. I could assume that on paper the renter pays the whole tax, and the real effect on the distribution of wealth would be identical. All we’d have to do is set PD = P and PS = P – T; the consumer and producer surplus would end up exactly the same. Or we could do something in between, with P’D = P + rT and P’S = P – (1 – r) T.

Then, if the market is competitive, we just set the prices equal, taking the tax into account:

P = PD – T = PR – 1/e * Q – T = PS = 1/n * (Q – Q0)

P = P_D - T = P_R - \frac{1}{e} Q - T = P_S = \frac{1}{n} \left(Q - Q_0 \right) \\

PR – 1/e * Q – T = 1/n * (Q – Q0)

P_R - \frac{1}{e} Q - T = \frac{1}{n} \left(Q - Q_0 \right) \\

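Notice the equivalence here: suppose we instead set P’D = P + rT and P’S = P – (1 – r) T, so that on paper the consumer pays a fraction r of the tax and the landlord pays the rest. The demand price P’D still has to lie on the demand curve, and the supply price P’S still has to lie on the supply curve:

P = P’D – rT = PR – 1/e * Q – rT = P’S + (1 – r) T = 1/n * (Q – Q0) + (1 – r) T

P = P^\prime_D - r T = P_R - \frac{1}{e} Q - r T = P^\prime_S + (1 - r) T = \frac{1}{n} \left(Q - Q_0 \right) + (1 - r) T \\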

The result is exactly the same:

PR – 1/e * Q – T = 1/n * (Q – Q0)

P_R - \frac{1}{e} Q - T = \frac{1}{n} \left(Q - Q_0 \right) \\

I’ll spare you the algebra, but this comes out to:

Q = (PR – T)/(1/n + 1/e) + (Q0)/(1 + n/e)

Q = \frac{P_R - T}{\frac{1}{n} + \frac{1}{e}} + \frac{Q_0}{1 + \frac{n}{e}} \\

P = (PR – T)/(1 + n/e) – (Q0)/(e + n)

P = \frac{P_R - T}{1 + \frac{n}{e}} - \frac{Q_0}{e+n} \\

That’s if the market is competitive.
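As a sanity check on the claim that it doesn't matter who writes the check, here is a small sketch. It evaluates the closed-form competitive solution above, then re-solves the market with the tax split so that the buyer pays a fraction r on paper, and confirms the quantity, the price buyers effectively pay, and the price sellers effectively keep are the same for every r. The function names and the sample numbers are made up for illustration.

```python
def competitive_equilibrium(p_r, e, n, q0, tax):
    """Closed-form quantity and net-of-tax price from the equations above."""
    q = (p_r - tax) / (1 / n + 1 / e) + q0 / (1 + n / e)
    p = (p_r - tax) / (1 + n / e) - q0 / (e + n)
    return q, p


def split_tax_equilibrium(p_r, e, n, q0, tax, r):
    """Solve P_R - Q/e - r*T = (Q - Q0)/n + (1 - r)*T for Q, then read off prices."""
    q = (p_r - r * tax - (1 - r) * tax + q0 / n) / (1 / e + 1 / n)
    buyer_pays = p_r - q / e       # demand price, tax included
    seller_keeps = (q - q0) / n    # supply price, net of tax
    return q, buyer_pays, seller_keeps


# Made-up parameters, purely for illustration.
p_r, e, n, q0, tax = 100.0, 2.0, 1.0, 10.0, 6.0
q_star, p_star = competitive_equilibrium(p_r, e, n, q0, tax)
print(f"closed form: Q = {q_star:.4f}, net price = {p_star:.4f}")

for r in (0.0, 0.25, 0.5, 1.0):
    q, buyer_pays, seller_keeps = split_tax_equilibrium(p_r, e, n, q0, tax, r)
    print(f"r = {r:4.2f}: Q = {q:.4f}, buyer pays {buyer_pays:.4f}, seller keeps {seller_keeps:.4f}")
```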

If the market is a monopoly, instead of setting the prices equal, we set the supply price equal to the marginal revenue the landlord receives—which takes into account the fact that increasing the amount they sell forces them to reduce the price they charge everyone else. Thus, the marginal revenue drops faster than the price as the quantity sold increases.

After a bunch of algebra (and just a dash of calculus), that comes out to these very similar, but not quite identical, equations:

Q = (PR – T)/(1/n + 2/e) + (Q0)/(1 + 2n/e)

Q = \frac{P_R - T}{\frac{1}{n} + \frac{2}{e}} + \frac{Q_0}{1 + \frac{2n}{e}} \\

P = (PR – T) * (1/n + 1/e)/(1/n + 2/e) – (Q0)/(e + 2n)

P = \left( P_R - T\right)\frac{\frac{1}{n} + \frac{1}{e}}{\frac{1}{n} + \frac{2}{e}} - \frac{Q_0}{e+2n} \\

Yes, it changes some 1s into 2s. That by itself accounts for the full effect of monopoly. That’s why I think it’s worthwhile to use the equations; they are deeply elegant and express in a compact form all of the different cases. They look really intimidating right now, but for most of the cases we’ll consider these general equations simplify quite dramatically.
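One way to check the monopoly equations is to skip the calculus and search for the profit-maximizing quantity by brute force. The sketch below does that under the linear curves used here; the cost function is an assumption of the sketch (the integral of the marginal cost (Q – Q0)/n, taken to be zero at Q0), and the function names and sample numbers are made up for illustration.

```python
def monopoly_closed_form(p_r, e, n, q0, tax):
    """Quantity and net-of-tax price from the monopoly equations above."""
    q = (p_r - tax) / (1 / n + 2 / e) + q0 / (1 + 2 * n / e)
    p = (p_r - tax) * (1 / n + 1 / e) / (1 / n + 2 / e) - q0 / (e + 2 * n)
    return q, p


def landlord_profit(q, p_r, e, n, q0, tax):
    """Net-of-tax revenue minus cost; cost integrates marginal cost (Q - Q0)/n from Q0."""
    net_price = p_r - q / e - tax
    cost = (q - q0) ** 2 / (2 * n)
    return net_price * q - cost


def monopoly_by_search(p_r, e, n, q0, tax, q_max=100.0, steps=10_000):
    """Grid search for the quantity that maximizes the landlord's profit."""
    candidates = (q_max * i / steps for i in range(steps + 1))
    return max(candidates, key=lambda q: landlord_profit(q, p_r, e, n, q0, tax))


# Made-up parameters, purely for illustration.
p_r, e, n, q0, tax = 100.0, 2.0, 1.0, 10.0, 6.0
q_formula, p_formula = monopoly_closed_form(p_r, e, n, q0, tax)
q_search = monopoly_by_search(p_r, e, n, q0, tax)

print(f"closed form: Q = {q_formula:.2f}, net price = {p_formula:.2f}")
print(f"grid search: Q = {q_search:.2f}")
```

The grid search lands on the same quantity as the formula, up to the grid spacing.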

There are several cases to consider.

Land has an extremely high cost to create—for practical purposes, we can consider its supply fixed, that is, perfectly inelastic. If the market is competitive, so that landlords have no market power, then they will simply rent out all the land they have at whatever price the market will bear:


This is like setting n = 0 and T = 0 in the above equations, the competitive ones.

Q = Q0

Q = Q_0 \\

P = PR – Q0/e

P = P_R – \frac{Q_0}{e} \\

If we now introduce a tax, it will fall completely on the landlords, because they have little choice but to rent out all the land they have, and they can only rent it at a price—including tax—that the market will bear.


Now we still have n = 0 but not T = 0.

Q = Q0

Q = Q_0 \\

P = PR – T – Q0/e

P = P_R – T – \frac{Q_0}{e} \\

The consumer surplus will be:

½ (Q)(PR – P – T) = 1/(2e) * Q0^2

\frac{1}{2}Q(P_R – P – T) = \frac{1}{2e}Q_0^2 \\

Notice how T isn’t in the result. The consumer surplus is unaffected by the tax.

The producer surplus, on the other hand, will be reduced by the tax:

(Q)(P) = (PR – T – Q0/e) Q0 = PR Q0 – 1/e Q0^2 – TQ0

(Q)(P) = (P_R – T – \frac{Q_0}{e})Q_0 = P_R Q_0 – \frac{1}{e} Q_0^2 – T Q_0 \\

T appears linearly as TQ0, which is the same as the tax revenue. All the money goes directly from the landlord to the government, as we want if our goal is to redistribute wealth without raising rent.
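A small sketch of this accounting for the fixed-supply case, using the formulas above. The function name and the sample numbers are hypothetical, chosen only to show that the tax moves money from landlord to government and leaves the tenant untouched.

```python
def fixed_supply_surplus(p_r, e, q0, tax):
    """Split of surplus when all Q0 units are rented whatever the price (n = 0)."""
    price = p_r - tax - q0 / e                  # rent the landlord keeps
    consumer = 0.5 * q0 * (p_r - price - tax)   # equals Q0^2 / (2e); T cancels
    producer = q0 * price                       # falls by T * Q0
    revenue = tax * q0                          # what the government collects
    return consumer, producer, revenue


untaxed = fixed_supply_surplus(10.0, 2.0, 4.0, tax=0.0)
taxed = fixed_supply_surplus(10.0, 2.0, 4.0, tax=3.0)
```

In both calls the three numbers add up to the same total, which is another way of saying there is no deadweight loss in this case.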

But now suppose that the market is not competitive, and by tacit collusion or regulatory capture the landlords can exert some market power; this is quite likely the case in reality. Actually in reality we’re probably somewhere in between monopoly and competition, either oligopoly or monopolistic competition, which I will talk about a good deal more in a later post, I promise.

It could be that demand is still sufficiently high that even with their market power, landlords have an incentive to rent out all their available land, in which case the result will be the same as in the competitive market.


A tax will then fall completely on the landlords as before:


Indeed, in this case it doesn’t really matter that the market is monopolistic; everything is the same as it would be under a competitive market. Notice how if you set n = 0, the monopolistic equations and the competitive equations come out exactly the same. The good news is, this is quite likely our actual situation! So even in the presence of significant market power the land tax can redistribute wealth in just the way we want.

But there are a few other possibilities. One is that demand is not sufficiently high, so that the landlords’ market power causes them to actually hold back some land in order to raise the price:


This will create some of what we call deadweight loss, in which some economic value is wasted. By restricting the land they rent out, the landlords make more profit, but the harm they cause to tenants is greater than the profit they gain, so there is value wasted.

Now instead of setting n = 0, we actually set n = infinity. Why? Because the reason that the landlords restrict the land they sell is that their marginal revenue is actually negative beyond that point—they would actually get less money in total if they sold more land. Instead of being bounded by their cost of production (because they have none, the land is there whether they sell it or not), they are bounded by zero. (Once again we’ve hit upon a fundamental concept in economics, particularly macroeconomics, that I don’t have time to talk about today: the zero lower bound.) Thus, they can change quantity all they want (within a certain range) without changing the price, which is equivalent to a supply elasticity of infinity.

Introducing a tax will then exacerbate this deadweight loss (adding DWL2 to the original DWL1), because it provides even more incentive for the landlords to restrict the supply of land:


Q = e/2*(PR – T)

Q = \frac{e}{2} \left(P_R – T\right)\\

P = 1/2*(PR – T)

P = \frac{1}{2} \left(P_R – T\right) \\

The quantity Q0 completely drops out, because it doesn’t matter how much land is available (as long as it’s enough); it only matters how much land it is profitable to rent out.
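One way to check these two formulas without any calculus is a brute-force search. The sketch below assumes the landlord has no cost and more than enough land, and simply tries many quantities to see which one brings in the most rent. The function name, the grid size, and the sample numbers are hypothetical.

```python
def best_quantity(p_r, e, tax, steps=100_000):
    """Grid-search the quantity that maximizes landlord revenue at zero cost."""
    q_max = e * (p_r - tax)  # quantity at which the price received hits zero
    best_q, best_revenue = 0.0, 0.0
    for i in range(steps + 1):
        q = q_max * i / steps
        revenue = (p_r - tax - q / e) * q  # price received times quantity
        if revenue > best_revenue:
            best_q, best_revenue = q, revenue
    return best_q


q_star = best_quantity(p_r=10.0, e=2.0, tax=1.0)
p_star = 10.0 - 1.0 - q_star / 2.0
```

For these inputs the search lands on Q = 9 and P = 4.5, which are e/2 * (PR – T) and 1/2 * (PR – T).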

We can then find the consumer and producer surplus, and see that they are both reduced by the tax. The consumer surplus is as follows:

½ (Q)(PR – P – T) = ½ * e/2*(PR – T) * 1/2*(PR – T) = e/8*(PR – T)^2

\frac{1}{2}Q \left( P_R – P – T \right) = \frac{1}{2} \cdot \frac{e}{2} \left( P_R – T \right) \cdot \frac{1}{2} \left( P_R – T \right) = \frac{e}{8}\left( P_R – T \right)^2 \\

This time, the tax does have an effect on reducing the consumer surplus.

The producer surplus, on the other hand, will be:

(Q)(P) = 1/2*(PR – T)*e/2*(PR – T) = e/4*(PR – T)^2

(Q)(P) = \frac{1}{2}\left(P_R – T \right) \frac{e}{2} \left(P_R – T\right) = \frac{e}{4} \left(P_R – T\right)^2 \\

Notice how it is also reduced by the tax, and no longer in a simple linear way.

The tax revenue is now a function of the demand:

TQ = e/2*T(PR – T)

T Q = \frac{e}{2} T (P_R – T) \\

If you add all these up, you’ll find that the sum is this:

e/8 * (PR – T) * (3PR + T)

\frac{e}{8} \left(P_R – T \right) \left(3 P_R + T \right) \\

Without the tax the sum would be 3e/8 * PR^2, so the tax reduces it by an amount equal to e/8 * T * (2PR + T), which is the additional deadweight loss.

Finally there is an even worse scenario, in which the tax is so large that it actually creates an incentive to restrict land where none previously existed:


Notice, however, that because the supply of land is inelastic the deadweight loss is still relatively small compared to the huge amount of tax revenue.

But actually this isn’t the whole story, because a land tax provides an incentive to get rid of land that you’re not profiting from. If this incentive is strong enough, the monopolistic power of landlords will disappear, as the unused land gets sold to more landholders or to the government. This is a way of avoiding the tax, but it’s one that actually benefits society, so we don’t mind incentivizing it.

Now, let’s compare this to our current system of property taxes, which include the value of buildings. Buildings are expensive to create, but we build them all the time; the supply of buildings is strongly dependent upon the price at which those buildings will sell. This makes for a supply curve that is somewhat elastic.

If the market were competitive and we had no taxes, it would be optimally efficient:


Property taxes create an incentive to produce fewer buildings, and this creates deadweight loss. Notice that this happens even if the market is perfectly competitive:


Since both n and e are finite and nonzero, we’d need to use the whole equations. The algebra is such a mess that I don’t see any reason to subject you to it; but suffice it to say, the T does not drop out. Tenants do see their consumer surplus reduced, and the larger the tax the more this is so.

Now, suppose that the market for buildings is monopolistic, as it most likely is. This would create deadweight loss even in the absence of a tax:


But a tax will add even more deadweight loss:


Once again, we’d need the full equations, and once again it’s a mess; but the result is, as before, that the tax gets passed on to the tenants in the form of more restricted sales and therefore higher rents.

Because of the finite supply elasticity, there’s no way that the tax can avoid raising the rent. As long as landlords have to pay more taxes when they build more or better buildings, they are going to raise the rent in those buildings accordingly—whether the market is competitive or not.

If the market is indeed monopolistic, there may be ways to bring the rent down: suppose we know what the competitive market price of rent should be, and we can establish rent control to that effect. If we are truly correct about the price to set, this rent control can not only reduce rent, it can actually reduce the deadweight loss:


But if we set the rent control too low, or don’t properly account for the varying cost of different buildings, we can instead introduce a new kind of deadweight loss, by making it too expensive to make new buildings.


In fact, what actually seems to happen is more complicated than that. Because the number of buildings would otherwise obviously be far too small, rent control is usually set to affect some buildings and not others. So what seems to happen is that the rent market fragments into two markets: one, which is too small, but very good for those few who get the chance to use it; and the other, which is unaffected by the rent control but is more monopolistic and therefore raises prices even further. This is why almost all economists are opposed to rent control; it doesn’t solve the problem of high rent and simply causes a whole new set of problems.

A land tax with a basic income, on the other hand, would help poor people at least as much as rent control presently does—probably a good deal more—without discouraging the production and maintenance of new apartment buildings.

But now we come to a key point: The land tax must be uniform per hectare.

If it is instead based on the value of the land, then this acts like a finite elasticity of supply; it provides an incentive to reduce the value of your own land in order to avoid the tax. As I showed above, this is particularly pernicious if the market is monopolistic, but even if it is competitive the effect is still there.

One exception I can see is if there are different tiers based on broad classes of land that it’s difficult to switch between, such as “land in Manhattan” versus “land in Brooklyn” or “desert land” versus “forest land”. But even this policy would have to be done very carefully, because any opportunity to substitute can create an opportunity to pass on the tax to someone else—for instance if land taxes are lower in Brooklyn developers are going to move to Brooklyn. Maybe we want that, in which case that is a good policy; but we should be aware of these sorts of additional consequences. The simplest way to avoid all these problems is to simply make the land tax uniform. And given the quantities we’re talking about—less than $3000 per hectare per year—it should be affordable for anyone except the very large landholders we’re trying to distribute wealth from in the first place.

The good news is, most economists would probably be on board with this proposal. After all, the neoclassical models themselves say it would be more efficient than our current system of rent control and property taxes—and the idea is at least as old as Adam Smith. Perhaps we can finally change the fact that the rent is too damn high.

Prospect Theory: Why we buy insurance and lottery tickets

JDN 2457061 PST 14:18.

Today’s topic is called prospect theory. Prospect theory is basically what put cognitive economics on the map; it was the knock-down argument that Kahneman used to show that human beings are not completely rational in their economic decisions. It all goes back to a 1979 paper by Kahneman and Tversky that now has 34000 citations (yes, we’ve been having this argument for a rather long time now). In the 1990s it was refined into cumulative prospect theory, which is more mathematically precise but basically the same idea.

What was that argument? People buy both insurance and lottery tickets.

The “both” is very important. Buying insurance can definitely be rational—indeed, typically is. Buying lottery tickets could theoretically be rational, under very particular circumstances. But they cannot both be rational at the same time.

To see why, let’s talk some more about marginal utility of wealth. Recall that a dollar is not worth the same to everyone; to a billionaire a dollar is a rounding error, to most of us it is a bottle of Coke, but to a starving child in Ghana it could be life itself. We typically observe diminishing marginal utility of wealth—the more money you have, the less another dollar is worth to you.

If we sketch a graph of your utility versus wealth it would look something like this:


Notice how it increases as your wealth increases, but at a rapidly diminishing rate.

If you have diminishing marginal utility of wealth, you are what we call risk-averse. If you are risk-averse, you’ll (sometimes) want to buy insurance. Let’s suppose the units on that graph are tens of thousands of dollars. Suppose you currently have an income of $50,000. You are offered the chance to pay $10,000 a year to buy unemployment insurance, so that if you lose your job, instead of making $10,000 on welfare you’ll make $30,000 on unemployment. You think you have about a 20% chance of losing your job.

If you had constant marginal utility of wealth, this would not be a good deal for you. Your expected value of money would be reduced if you buy the insurance: Before you had an 80% chance of $50,000 and a 20% chance of $10,000 so your expected amount of money is $42,000. With the insurance you have an 80% chance of $40,000 and a 20% chance of $30,000 so your expected amount of money is $38,000. Why would you take such a deal? That’s like giving up $4,000 isn’t it?

Well, let’s look back at that utility graph. At $50,000 your utility is 1.80, uh… units, er… let’s say QALY. 1.80 QALY per year, meaning you live 80% better than the average human. Maybe, I guess? Doesn’t seem too far off. In any case, the units of measurement aren’t that important.


By buying insurance your effective income goes down to $40,000 per year, which lowers your utility to 1.70 QALY. That’s a fairly significant hit, but it’s not unbearable. If you lose your job (20% chance), you’ll fall down to $30,000 and have a utility of 1.55 QALY. Again, noticeable, but bearable. Your overall expected utility with insurance is therefore 1.67 QALY.

But what if you don’t buy insurance? Well then you have a 20% chance of taking a big hit and falling all the way down to $10,000 where your utility is only 1.00 QALY. Your expected utility is therefore only 1.64 QALY. You’re better off going with the insurance.
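The comparison is just two probability-weighted averages, which a few lines of code can lay out. This sketch uses only the incomes, the 20% chance, and the utility values quoted above; the variable and function names are hypothetical.

```python
# Utility (QALY per year) at each income, as read off the graph in the text.
UTILITY = {50_000: 1.80, 40_000: 1.70, 30_000: 1.55, 10_000: 1.00}
P_JOB_LOSS = 0.20


def expected(outcomes, value=lambda income: income):
    """Probability-weighted average of value(income) over (prob, income) pairs."""
    return sum(prob * value(income) for prob, income in outcomes)


uninsured = [(1 - P_JOB_LOSS, 50_000), (P_JOB_LOSS, 10_000)]
insured = [(1 - P_JOB_LOSS, 40_000), (P_JOB_LOSS, 30_000)]

for name, plan in (("uninsured", uninsured), ("insured", insured)):
    dollars = expected(plan)             # expected money
    qaly = expected(plan, UTILITY.get)   # expected utility
    print(f"{name}: expected income ${dollars:,.0f}, expected utility {qaly:.2f} QALY")
```

The insured plan comes out $4,000 behind in expected money but 0.03 QALY ahead in expected utility, which is the whole argument in two numbers.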

And this is how insurance companies make a profit (well; the legitimate way anyway; they also like to gouge people and deny cancer patients of course); on average, they make more from each customer than they pay out, but customers are still better off because they are protected against big losses. In this case, the insurance company profits $4,000 per customer per year, customers each get 30 milliQALY per year (about the same utility as an extra $2,000 more or less), everyone is happy.

But if this is your marginal utility of wealth—and it most likely is, approximately—then you would never want to buy a lottery ticket. Let’s suppose you actually have pretty good odds; it’s a 1 in 1 million chance of $1 million for a ticket that costs $2. This means that the state is going to take in about $2 million for every $1 million they pay out to a winner.

That’s about as good as your odds for a lottery are ever going to get; usually it’s more like a 1 in 400 million chance of $150 million for $1, which is an even bigger difference than it sounds, because $150 million is nowhere near 150 times as good as $1 million. It’s a bit better from the state’s perspective though, because they get to receive $400 million for every $150 million they pay out.

For your convenience I have zoomed out the graph so that you can see 100, which is an income of $1 million (which you’ll have this year if you win; to get it next year, you’ll have to play again). You’ll notice I did not have to zoom out the vertical axis, because 20 times as much money only ends up being about 2 times as much utility. I’ve marked with lines the utility of $50,000 (1.80, as we said before) versus $1 million (3.30).


What about the utility of $49,998 which is what you’ll have if you buy the ticket and lose? At this number of decimal places you can’t see the difference, so I’ll need to go out a few more. At $50,000 you have 1.80472 QALY. At $49,998 you have 1.80470 QALY. That $2 only costs you 0.00002 QALY, 20 microQALY. Not much, really; but of course not, it’s only $2.

How much does the 1 in 1 million chance of $1 million give you? Even less than that. Remember, the utility gain for going from $50,000 to $1 million is only 1.50 QALY. So you’re adding one one-millionth of that in expected utility, which is of course 1.5 microQALY, or 0.0000015 QALY.

That $2 may not seem like it’s worth much, but that 1 in 1 million chance of $1 million is worth less than one tenth as much. Again, I’ve tried to make these figures fairly realistic; they are by no means exact (I don’t actually think $49,998 corresponds to exactly 1.804699 QALY), but the order of magnitude difference is right. You gain about ten times as much utility from spending that $2 on something you want as you do from taking the chance at $1 million.
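The lottery arithmetic can be reproduced with a short sketch. The utility curve here is an assumption: a logarithmic curve, 1 + 0.5 * ln(income / $10,000), picked because it happens to match the utility figures quoted in this post (1.00 at $10,000, about 1.80 at $50,000, about 3.30 at $1 million). The post does not state the actual formula behind the graph, and the names below are hypothetical.

```python
import math


def utility(income):
    """Assumed curve: 1 + 0.5 * ln(income / $10,000), in QALY per year."""
    return 1.0 + 0.5 * math.log(income / 10_000)


# Certain loss from spending $2 on the ticket.
ticket_cost = utility(50_000) - utility(49_998)

# 1-in-a-million chance of moving from $50,000 to $1 million.
expected_gain = 1e-6 * (utility(1_000_000) - utility(50_000))

ratio = ticket_cost / expected_gain
```

Under this assumed curve the ticket costs roughly 20 microQALY and the chance of winning is worth roughly 1.5 microQALY, so the ratio comes out above ten.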

I said before that it is theoretically possible for you to have a utility function for which the lottery would be rational. For that you’d need to have increasing marginal utility of wealth, so that you could be what we call risk-seeking. Your utility function would have to look like this:


There’s no way marginal utility of wealth looks like that. This would be saying that it would hurt Bill Gates more to lose $1 than it would hurt a starving child in Ghana, which makes no sense at all. (It certainly would make you wonder why he’s so willing to give it to them.) So frankly even if we didn’t buy insurance the fact that we buy lottery tickets would already look pretty irrational.

But in order for it to be rational to buy both lottery tickets and insurance, our utility function would have to be totally nonsensical. Maybe it could look like this or something; marginal utility decreases normally for a while, and then suddenly starts going upward again for no apparent reason:


Clearly it does not actually look like that. Not only would this mean that Bill Gates is hurt more by losing $1 than the child in Ghana, we have this bizarre situation where the middle class are the people who have the lowest marginal utility of wealth in the world. Both the rich and the poor would need to have higher marginal utility of wealth than we do. This would mean that apparently yachts are just amazing and we have no idea. Riding a yacht is the pinnacle of human experience, a transcendence beyond our wildest imaginings; and riding a slightly bigger yacht is even more amazing and transcendent. Love and the joy of a life well-lived pale in comparison to the ecstasy of adding just one more layer of gold plate to your Ferrari collection.

Where increasing marginal utility is ridiculous, this is outright special pleading. You’re just making up bizarre utility functions that perfectly line up with whatever behavior people happen to have so that you can still call it rational. It’s like saying, “It could be perfectly rational! Maybe he enjoys banging his head against the wall!”

Kahneman and Tversky had a better idea. They realized that human beings aren’t so great at assessing probability, and furthermore tend not to think in terms of total amounts of wealth or annual income at all, but in terms of losses and gains. Through a series of clever experiments they showed that we are not so much risk-averse as we are loss-averse; we are actually willing to take more risk if it means that we will be able to avoid a loss.

In effect, we seem to be acting as if our utility function looks like this, where the zero no longer means “zero income”, it means “whatever we have right now”:


We tend to weight losses about twice as much as gains, and we tend to assume that losses also diminish in their marginal effect the same way that gains do. That is, we would only take a 50% chance to lose $1000 if it meant a 50% chance to gain $2000; but we’d take a 10% chance at losing $10,000 to save ourselves from a guaranteed loss of $1000.
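To make those two gambles concrete, here is a minimal sketch of a loss-averse value function. The loss weight of 2 follows the “about twice” in the text. The curvature exponent of 0.88 is an assumption chosen only to give diminishing sensitivity, and probabilities are taken at face value with no distortion. All names are hypothetical.

```python
LOSS_WEIGHT = 2.0   # losses count about twice as much as gains (from the text)
CURVATURE = 0.88    # assumed exponent below 1, giving diminishing sensitivity


def value(change):
    """Felt value of a gain or loss measured from the current position."""
    if change >= 0:
        return change ** CURVATURE
    return -LOSS_WEIGHT * (-change) ** CURVATURE


def prospect(outcomes):
    """Value of a gamble given (probability, change) pairs."""
    return sum(prob * value(change) for prob, change in outcomes)


# Gain that exactly offsets a 50% chance of losing $1000.
break_even_gain = 1000 * LOSS_WEIGHT ** (1 / CURVATURE)

sure_loss = prospect([(1.0, -1000)])
risky_loss = prospect([(0.1, -10_000), (0.9, 0)])
```

With these assumed parameters the break-even gain is a little over $2000 (it is exactly $2000 if the curvature is set to 1), and the 10% chance of losing $10,000 feels less bad than the certain loss of $1000.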

This can explain why we buy insurance, provided that you frame it correctly. One of the things about prospect theory—and about human behavior in general—is that it exhibits framing effects: The answer we give depends upon the way you ask the question. That’s so totally obviously irrational it’s honestly hard to believe that we do it; but we do, and sometimes in really important situations. Doctors—doctors—will decide a moral dilemma differently based on whether you describe it as “saving 400 out of 600 patients” or “letting 200 out of 600 patients die”.

In this case, you need to frame insurance as the default option, and not buying insurance as an extra risk you are taking. Then saving money by not buying insurance is a gain, and therefore less important, while a higher risk of a bad outcome is a loss, and therefore important.

If you frame it the other way, with not buying insurance as the default option, then buying insurance is taking a loss by making insurance payments, only to get a gain if the insurance pays out. Suddenly the exact same insurance policy looks less attractive. This is a big part of why Obamacare has been effective but unpopular. It was set up as a fine—a loss—if you don’t buy insurance, rather than as a bonus—a gain—if you do buy insurance. The latter would be more expensive, but we could just make it up by taxing something else; and it might have made Obamacare more popular, because people would see the government as giving them something instead of taking something away. But the fine does a better job of framing insurance as the default option, so it motivates more people to actually buy insurance.

But even that would still not be enough to explain how it is rational to buy lottery tickets (Have I mentioned how it’s really not a good idea to buy lottery tickets?), because buying a ticket is a loss and winning the lottery is a gain. You actually have to get people to somehow frame not winning the lottery as a loss, making winning the default option despite the fact that it is absurdly unlikely. But I have definitely heard people say things like this: “Well if my numbers come up and I didn’t play that week, how would I feel then?” Pretty bad, I’ll grant you. But how much you wanna bet that never happens? (They’ll bet… the price of the ticket, apparently.)

In order for that to work, people either need to dramatically overestimate the probability of winning, or else ignore it entirely. Both of those things totally happen.

First, we overestimate the probability of rare events and underestimate the probability of common events—this is actually the part that makes it cumulative prospect theory instead of just regular prospect theory. If you make a graph of perceived probability versus actual probability, it looks like this:


We don’t make much distinction between 40% and 60%, even though that’s actually pretty big; but we make a huge distinction between 0% and 0.00001% even though that’s actually really tiny. I think we basically have categories in our heads: “Never, almost never, rarely, sometimes, often, usually, almost always, always.” Moving from 0% to 0.00001% is going from “never” to “almost never”, but going from 40% to 60% is still in “often”. (And that for some reason reminded me of “Well, hardly ever!”)
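A curve of this shape can be sketched with the inverse-S weighting function commonly used in the cumulative prospect theory literature. The exponent 0.61 is an assumed illustrative value, not a figure from this post, and the function name is hypothetical.

```python
GAMMA = 0.61  # assumed curvature; smaller values bend the curve more


def perceived(p):
    """Inverse-S weighting: w(p) = p^g / (p^g + (1 - p)^g)^(1/g)."""
    numerator = p ** GAMMA
    return numerator / (numerator + (1 - p) ** GAMMA) ** (1 / GAMMA)


tiny = perceived(0.00001)                    # a 1-in-100,000 chance
middle_gap = perceived(0.6) - perceived(0.4)  # true gap is 0.2
```

Under this assumption the tiny probability is inflated many times over, while the 20-point gap between 40% and 60% shrinks to roughly half its true size.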

But that’s not even the worst of it. After all that work to explain how we can make sense of people’s behavior in terms of something like a utility function (albeit a distorted one), I think there’s often a simpler explanation still: Regret aversion under total neglect of probability.

Neglect of probability is self-explanatory: You totally ignore the probability. But what’s regret aversion, exactly? Unfortunately I’ve had trouble finding any good popular sources on the topic; it’s all scholarly stuff. (Maybe I’m more cutting-edge than I thought!)

The basic idea is that you minimize regret, where regret can be formalized as the difference in utility between the outcome you got and the best outcome you could have gotten. In effect, it doesn’t matter whether something is likely or unlikely; you only care how bad it is.

This explains insurance and lottery tickets in one fell swoop: With insurance, you have the choice of risking a big loss (big regret) which you can avoid by paying a small amount (small regret). You take the small regret, and buy insurance. With lottery tickets, you have the chance of getting a large gain (big regret if you don’t) which you gain by paying a small amount (small regret).
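Here is a minimal sketch of that rule, treating regret aversion under neglect of probability as picking the action whose worst-case regret is smallest. That formalization is an assumption on my part, and the function name is hypothetical. The utilities are the ones quoted in the insurance and lottery examples earlier in this post.

```python
def minimax_regret(utilities):
    """Pick the action whose worst-case regret is smallest.

    utilities maps each action to a list of utilities, one per possible state.
    Probabilities are deliberately ignored.
    """
    n_states = len(next(iter(utilities.values())))
    # Best achievable utility in each state, across all actions.
    best = [max(u[s] for u in utilities.values()) for s in range(n_states)]
    worst_regret = {
        action: max(best[s] - u[s] for s in range(n_states))
        for action, u in utilities.items()
    }
    return min(worst_regret, key=worst_regret.get), worst_regret


# States: (keep job, lose job)
insurance = {"buy insurance": [1.70, 1.55], "skip insurance": [1.80, 1.00]}
# States: (numbers come up, numbers don't)
lottery = {"buy ticket": [3.30, 1.80470], "skip ticket": [1.80472, 1.80472]}

insurance_choice, _ = minimax_regret(insurance)
lottery_choice, _ = minimax_regret(lottery)
```

The rule buys the insurance and buys the ticket, because in each case the alternative carries one very large regret and the odds never enter the calculation.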

This can also explain why a typical American’s fears go in the order terrorists > Ebola > sharks >> cars > cheeseburgers, while the actual risk of dying goes in almost the opposite order, cheeseburgers > cars >> terrorists > sharks > Ebola. (Terrorists are scarier than sharks and Ebola and actually do kill more Americans! Yay, we got something right! Other than that it is literally reversed.)

Dying from a terrorist attack would be horrible; in addition to your own death you have all the other likely deaths and injuries, and the sheer horror and evil of the terrorist attack itself. Dying from Ebola would be almost as bad, with gruesome and agonizing symptoms. Dying of a shark attack would be still pretty awful, as you get dismembered alive. But dying in a car accident isn’t so bad; it’s usually over pretty quick and the event seems tragic but ordinary. And dying of heart disease and diabetes from your cheeseburger overdose will happen slowly over many years, you’ll barely even notice it coming and probably die rapidly from a heart attack or comfortably in your sleep. (Wasn’t that a pleasant paragraph? But there’s really no other way to make the point.)

If we try to estimate the probability at all—and I don’t think most people even bother—it isn’t by rigorous scientific research; it’s usually by availability heuristic: How many examples can you think of in which that event happened? If you can think of a lot, you assume that it happens a lot.

And that might even be reasonable, if we still lived in hunter-gatherer tribes or small farming villages and the 150 or so people you knew were the only people you ever heard about. But now that we have live TV and the Internet, news can get to us from all around the world, and the news isn’t trying to give us an accurate assessment of risk, it’s trying to get our attention by talking about the biggest, scariest, most exciting things that are happening around the world. The amount of news attention an item receives is in fact in inverse proportion to the probability of its occurrence, because things are more exciting if they are rare and unusual. Which means that if we are estimating how likely something is based on how many times we heard about it on the news, our estimates are going to be almost exactly reversed from reality. Ironically it is the very fact that we have more information that makes our estimates less accurate, because of the way that information is presented.

It would be a pretty boring news channel that spent all day saying things like this: “82 people died in car accidents today, and 1657 people had fatal heart attacks, 11.8 million had migraines, and 127 million played the lottery and lost; in world news, 214 countries did not go to war, and 6,147 children starved to death in Africa…” This would, however, be vastly more informative.

In the meantime, here are a couple of counter-heuristics I recommend to you: Don’t think about losses and gains, think about where you are and where you might be. Don’t say, “I’ll gain $1,000”; say “I’ll raise my income this year to $41,000.” Definitely do not think in terms of the percentage price of things; think in terms of absolute amounts of money. Cheap expensive things, expensive cheap things is a motto of mine; go ahead and buy the $5 toothbrush instead of the $1, because that’s only $4. But be very hesitant to buy the $22,000 car instead of the $21,000, because that’s $1,000. If you need to estimate the probability of something, actually look it up; don’t try to guess based on what it feels like the probability should be. Make this unprecedented access to information work for you instead of against you. If you want to know how many people die in car accidents each year, you can literally ask Google and it will tell you that (I tried it—it’s 1.3 million worldwide). The fatality rate of a given disease versus the risk of its vaccine, the safety rating of a particular brand of car, the number of airplane crash deaths last month, the total number of terrorist attacks, the probability of becoming a university professor, the average functional lifespan of a new television—all these things and more await you at the click of a button. Even if you think you’re pretty sure, why not look it up anyway?

Perhaps then we can make prospect theory wrong by making ourselves more rational.

The winner-takes-all effect

JDN 2457054 PST 14:06.

As I write there is some sort of mariachi band playing on my front lawn. It is actually rather odd that I have a front lawn, since my apartment is set back from the road; yet there is the patch of grass, and there is the band playing upon it. This sort of thing is part of the excitement of living in a large city (and Long Beach would seem like a large city were it not right next to the sprawling immensity that is Los Angeles; there are more people in Long Beach than in Cleveland, but there are more people in greater Los Angeles than in Sweden); with a certain critical mass of human beings come unexpected pieces of culture.

The fact that people agglomerate in this way is actually relevant to today’s topic, which is what I will call the winner-takes-all effect. I actually just finished reading a book called The Winner-Take-All Society, which is particularly horrifying to read because it came out in 1996. That’s almost twenty years ago, and things were already bad; and since then everything it describes has only gotten worse.

What is the winner-takes-all effect? It is the simple fact that in competitive capitalist markets, a small difference in quality can yield an enormous difference in return. The third most popular soda drink company probably still makes drinks that are pretty good, but do you have any idea what it is? There’s Coke, there’s Pepsi, and then there’s… uh… Dr. Pepper, apparently! But I didn’t know that before today and I bet you didn’t either. Now think about what it must be like to be the 15th most popular soda drink company, or the 37th. That’s the winner-takes-all effect.

I don’t generally follow football, but since tomorrow is the Super Bowl I feel some obligation to use that example as well. The highest-paid quarterback is Russell Wilson of the Seattle Seahawks, who is signing onto a five-year contract worth $110 million ($22 million a year). In annual income that will make him pass Jay Cutler of the Chicago Bears who has a seven-year contract worth $127 million ($18.5 million a year). This shift may have something to do with the fact that the Seahawks are in the Super Bowl this year and the Bears are not (they haven’t since 2007). Now consider what life is like for most football players; the median income of football players is most likely zero (at least as far as football-related income), and the median income of NFL players—the cream of the crop already—is $770,000; that’s still very good money of course (more than Krugman makes, actually! But he could make more, if he were willing to sell out to Wall Street), but it’s barely 1/30 of what Wilson is going to be making. To make that million-dollar salary, you need to be the best, of the best, of the best (sir!). That’s the winner-takes-all effect.

To go back to the example of cities, it is for similar reasons that the largest cities (New York, Los Angeles, London, Tokyo, Shanghai, Hong Kong, Delhi) become packed with tens of millions of people while others (Long Beach, Ann Arbor, Cleveland) get hundreds of thousands and most (Petoskey, Ketchikan, Heber City, and hundreds of others you’ve never heard of) get only a few thousand. Beyond that there are thousands of tiny little hamlets that many don’t even consider cities. The median city probably has about 10,000 people in it, and that only because we’d stop calling it a city if it fell below 1,000. If we include every tiny little village, the median town size is probably about 20 people. Meanwhile the largest city in the world is Tokyo, with a greater metropolitan area that holds almost 38 million people—or to put it another way almost exactly as many people as California. Huh, LA doesn’t seem so big now does it? How big is a typical town? Well, that’s the thing about this sort of power-law distribution; the concept of “typical” or “average” doesn’t really apply anymore. Each little piece of the distribution has basically the same shape as the whole distribution, so there isn’t a “typical” size or scale. That’s the winner-takes-all effect.
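The claim that every piece of the distribution looks like the whole can be checked with a short sketch. It assumes city sizes follow a pure power-law (Pareto) tail with exponent 1 above a minimum of 1,000 people; both numbers are illustrative assumptions rather than estimates, and the function names are hypothetical.

```python
ALPHA = 1.0      # assumed tail exponent, chosen only for illustration
MINIMUM = 1_000  # assumed smallest size that still counts as a city


def share_above(size):
    """Fraction of cities larger than `size` under a Pareto tail."""
    return (MINIMUM / size) ** ALPHA


def share_at_least_double(size):
    """Among cities bigger than `size`, the fraction that are over twice as big."""
    return share_above(2 * size) / share_above(size)


small_towns = share_at_least_double(10_000)
megacities = share_at_least_double(10_000_000)
```

Both ratios come out the same, whether you start from towns of ten thousand or cities of ten million. That is what it means for the distribution to have no typical scale.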

As they freely admit in the book, it isn’t literally that a single winner takes everything. That is the theoretical maximum level of wealth inequality, and fortunately no society has ever quite reached it. The closest we get in today’s society is probably Saudi Arabia, which recently lost its king, and yes I do mean king in the fullest sense of the word, a man of virtually unlimited riches and near-absolute power. His net wealth was estimated at $18 billion, which frankly sounds low; still, even if that’s half the true amount, it’s oddly comforting to know that he is still not quite as rich as Bill Gates ($78 billion), who earned his wealth at least semi-legitimately in a basically free society. Say what you will about intellectual property rents and market manipulation (and you know I do), but they are worlds away from how Abdullah’s family got its wealth, which was literally and directly robbed from millions of people by the power of the sword. Mostly he just inherited all that, and he did implement some minor reforms, but make no mistake: He was ruthless and by no means willing to give up his absolute power; he beheaded dozens of political dissidents, for example. Saudi Arabia does spread its wealth around a little, such that basically no one is below the UN poverty lines of $1.25 and $2 per day, but about a fourth of the population is below the national poverty line. That is just about the same distribution of wealth as what we have in the US, which actually makes me wonder just how free and legitimate our markets really are.

The winner-takes-all effect would really be more accurately described as the “top small fraction takes the vast majority” effect, but that isn’t nearly as catchy, now is it?

There are several different causes that can all lead to this same result. In the book, Robert Frank and Philip Cook argue that we should not attribute the cause to market manipulation, but in fact to the natural functioning of competitive markets. There’s something to be said for this—I used to buy the whole idea that competitive markets are the best, but increasingly I’ve been seeing ways that less competitive markets can make better overall outcomes.

Where they lose me is in arguing that the skyrocketing compensation packages for CEOs are due to their superior performance, and corporations are just being rational in competing for the best CEOs. If that were true, we wouldn’t find that the rank correlation between the CEO’s pay and the company’s stock performance is statistically indistinguishable from zero. Actually even a small positive correlation wouldn’t prove that the CEOs are actually performing well; it could just be that companies that perform well are willing to pay their CEOs more—and stock option compensation will do this automatically. But in fact the correlation is so tiny as to be negligible; corporations would be better off hiring a random person off the street and paying them $50,000 for all the CEO does for their stock performance. If you adjust for the size of the company, you find that having a higher-paid CEO is positively related to performance for small startups, but negatively correlated for large well-established corporations. No, clearly there’s something going on here besides competitive pay for high performance—corruption comes to mind, which you’ll remember was the subject of my master’s thesis.

But in some cases there isn’t any apparent corruption, and yet we still see these enormously unequal distributions of income. Another good example of this is the publishing industry, in which J.K. Rowling can make over $1 billion (she donated enough to charity to officially lose her billionaire status) but most authors make little or nothing, particularly those who can’t get published in the first place. I have no reason to believe that J.K. Rowling acquired this massive wealth by corruption; she just sold an awful lot of books: over 100 million of the first Harry Potter book alone.

But why would she be able to sell 100 million while thousands of authors who write books that are probably just as good, or nearly so, make nothing? Am I just bitter and envious, as Mitt Romney would say? Is J.K. Rowling actually a million times as good an author as I am?

Obviously not, right? She may be better, but she’s not that much better. So how is it that she ends up making a million times as much as I do from writing? It feels like squaring the circle: How can markets be efficient and competitive, yet some people are paid millions of times as much as others despite being only slightly more productive?

The answer is simple but enormously powerful: positive feedback. Once you start doing well, it’s easier to do better. You have what economists call an economy of scale. The first 10,000 books sold is the hardest; then the next 10,000 is a little easier; the next 10,000 a little easier still. In fact I suspect that in many cases the first 10% growth is harder than the second 10% growth and so on, which is actually a much stronger claim. For my sales to grow 10% I’d need to add like 20 people. For J.K. Rowling’s sales to grow 10% she’d need to add 10 million. Yet it might actually be easier for J.K. Rowling to add 10 million than for me to add 20. If not, it isn’t much harder. Suppose we tried by just sending out enticing tweets. I have about 100 Twitter followers, so I’d need 0.2 sales per follower; she has about 4 million, so she’d need an average of 2.5 sales per follower. That’s an advantage for me, percentage-wise, but if we have the same uptake rate I sell 20 books and she sells 800,000.
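The Twitter arithmetic is easy to check. This little Python sketch uses only the figures from the paragraph above; the function name `sales_needed_per_follower` is just a label I made up for the division.

```python
def sales_needed_per_follower(extra_sales, followers):
    """Sales each follower must generate, on average, to hit the target."""
    return extra_sales / followers

# Figures from the text: followers, and the extra sales needed for 10% growth.
my_followers, my_target = 100, 20
jk_followers, jk_target = 4_000_000, 10_000_000

my_rate = sales_needed_per_follower(my_target, my_followers)
jk_rate = sales_needed_per_follower(jk_target, jk_followers)

# Now suppose both of us get the same uptake rate (mine) from our followers.
shared_rate = my_rate
my_sales = round(shared_rate * my_followers)
jk_sales = round(shared_rate * jk_followers)

print(f"needed per follower: me {my_rate}, Rowling {jk_rate}")
print(f"at the same uptake rate: me {my_sales:,} books, Rowling {jk_sales:,} books")
```

So the percentage target is harder for her, but the same effort per follower still yields 40,000 times as many sales in absolute terms.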

If you have only a handful of book sales like I do, those sales are static; but once you cross that line into millions of sales, it’s easy for that to spread into tens or even hundreds of millions. In the particular case of books, this is because it spreads by word-of-mouth; say each person who reads a book recommends it to 10 friends, and you only read a book if at least 2 of your friends recommended it. In a city of 100,000 people, if you start with 50 people reading it, odds are that most of those people don’t have friends that overlap and so you stop at 50. But if you start at 50,000, there is bound to be a great deal of overlap; so then that 50,000 recruits another 10,000, then another 10,000, and pretty soon the whole 100,000 have read it. In this case we have what are called network externalities: you’re more likely to read a book if your friends have read it, so the more people there are who have read it, the more people there are who want to read it. There’s a very similar effect at work in social networks; why does everyone still use Facebook, even though it’s actually pretty awful? Because everyone uses Facebook. Less important than the quality of the software platform (Google Plus is better, and there are some third-party networks that are likely better still) is the fact that all your friends and family are on it. We all use Facebook because we all use Facebook? We all read Harry Potter books because we all read Harry Potter books? The first rule of tautology club is…
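The word-of-mouth story can be turned into a toy simulation. This is only a sketch under simplifying assumptions of my own: each new reader recommends the book to 10 people chosen uniformly at random from the whole city (real friendships are clustered, and here a reader could even pick themselves), and anyone who has received at least 2 recommendations reads it. The function name and parameters are hypothetical.

```python
import random

def simulate_word_of_mouth(population, initial_readers,
                           recs_per_reader=10, threshold=2, seed=0):
    """Return how many people have read the book once the spread stops."""
    rng = random.Random(seed)
    has_read = [False] * population
    recs_received = [0] * population

    # People are interchangeable here, so the first few are the initial readers.
    new_readers = list(range(initial_readers))
    for person in new_readers:
        has_read[person] = True
    total_readers = initial_readers

    while new_readers:
        converted = []
        # Each new reader recommends once; who they are doesn't matter,
        # because friends are drawn at random from the whole city.
        for _ in new_readers:
            for friend in rng.sample(range(population), recs_per_reader):
                recs_received[friend] += 1
                if not has_read[friend] and recs_received[friend] >= threshold:
                    has_read[friend] = True
                    converted.append(friend)
        total_readers += len(converted)
        new_readers = converted  # only fresh readers recommend next round
    return total_readers

small_start = simulate_word_of_mouth(100_000, 50)
large_start = simulate_word_of_mouth(100_000, 50_000)
print(f"starting from 50 readers:     {small_start:,} end up reading it")
print(f"starting from 50,000 readers: {large_start:,} end up reading it")
```

With 50 initial readers, the 500 recommendations almost never land on the same person twice, so the book stalls near 50. With 50,000, the average person gets about 5 recommendations in the first round alone, and nearly the whole city follows.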

Languages are also like this, which is why I can write this post in English and yet people can still read it around the world. English is the winner of the language competition (we call it the lingua franca, as weird as that is; French is not the lingua franca anymore). The losers are those hundreds of New Guinean languages you’ve never heard of, many of which are dying. And their distribution obeys, once again, a power-law. (Individual words actually obey a power-law as well, which makes this whole fractal business delightfully ever more so.)

Network externalities are not the only way that the winner-takes-all effect can occur, though I think they are the most common. You can also have economies of scale from the supply side, particularly in the case of information: Recording a song is a lot of time and effort, but once you record a song, it’s trivial to make more copies of it. So that first recording costs a great deal, while every subsequent copy costs next to nothing. This is probably also at work in the case of J.K. Rowling and the NFL; the two phenomena are by no means mutually exclusive. But clearly the sizes of cities are due to network externalities: It’s quite expensive to live in a big city (no supply-side economy of scale there), but you want to live in a city where other people live because that’s where friends and family and opportunities are.

The most worrisome kind of winner-takes-all effect is what Frank and Cook call deep pockets: Once you have concentration of wealth in a few hands, those few individuals can now choose their own winners in a much more literal sense. The rich can commission works of art from their favorite artists, exacerbating the inequality among artists; worse yet, they can use their money to influence politicians (as the Kochs plan to do in 2016, spending $900 million, or about $3 for every person in America) and exacerbate the inequality in the whole system. That gives us even more positive feedback on top of all the other positive feedbacks.

Sure enough, if you run the standard neoclassical economic models of competition and just insert the assumption of economies of scale, the result is concentration of wealth—in fact, if nothing about the rules prevents it, the result is a complete monopoly. Nor is this result in any sense economically efficient; it’s just what naturally happens in the presence of economies of scale.
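Here is a toy illustration of that mechanism in Python. To be clear, this is not one of the standard neoclassical models; it is a simple increasing-returns simulation (a nonlinear Pólya urn) that I am using as a stand-in. Each new customer picks a firm with probability proportional to the firm's current size raised to a power, where the parameter `returns_exponent` is my own label: above 1 means bigger firms have an advantage that grows with size, below 1 means the advantage shrinks.

```python
import random

def top_share(returns_exponent, firms=10, customers=20_000, seed=1):
    """Market share of the largest firm after all customers have chosen."""
    rng = random.Random(seed)
    sizes = [1] * firms  # every firm starts with a single customer
    for _ in range(customers):
        # Attractiveness grows with size; the exponent sets how strongly.
        weights = [size ** returns_exponent for size in sizes]
        winner = rng.choices(range(firms), weights=weights)[0]
        sizes[winner] += 1
    return max(sizes) / sum(sizes)

increasing = top_share(2.0)   # economies of scale: size feeds on itself
decreasing = top_share(0.5)   # diminishing returns: size matters less and less
print(f"largest firm's share with increasing returns: {increasing:.1%}")
print(f"largest firm's share with diminishing returns: {decreasing:.1%}")
```

All ten firms start out identical, so whichever one ends up dominating under increasing returns got there by early luck, not by being better. That is the sense in which the outcome is natural but not efficient.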

Frank and Cook seem most concerned about the fact that these winner-take-all incomes will tend to push too many people to seek those careers, leaving millions of would-be artists, musicians and quarterbacks with dashed dreams when they might have been perfectly happy as electrical engineers or high school teachers. While this may be true—next week I’ll go into detail about prospect theory and why human beings are terrible at making judgments based on probability—it isn’t really what I’m most concerned about. For all the cost of frustrated ambition there is also a good deal of benefit; striving for greatness does not just make the world better if we succeed, it can make ourselves better even if we fail. I’d strongly encourage people to have backup plans; but I’m not going to tell people to stop painting, singing, writing, or playing football just because they’re unlikely to make a living at it. The one concern I do have is that the competition is so fierce that we are pressured to go all in, to not have backup plans, to use performance-enhancing drugs—they may carry awful risks, but they also work. And it’s probably true, actually, that you’re a bit more likely to make it all the way to the top if you don’t have a backup plan. You’re also vastly more likely to end up at the bottom. Is raising your probability of being a bestselling author from 0.00011% to 0.00012% worth giving up all other career options? Skipping chemistry class to practice football may improve your chances of being an NFL quarterback from 0.000013% to 0.000014%, but it will also drop your chances of being a chemical engineer from 95% (a degree in chemical engineering almost guarantees you a job eventually) to more like 5% (it’s hard to get a degree when you flunk all your classes).

Frank and Cook offer a solution that I think is basically right; they call it positional arms control agreements. By analogy with arms control agreements between nations—and what is war, if not the ultimate winner-takes-all contest?—they propose that we use taxation and regulation policy to provide incentives to make people compete less fiercely for the top positions. Some of these we already do: Performance-enhancing drugs are banned in professional sports, for instance. Even where there are no regulations, we can use social norms: That’s why it’s actually a good thing that your parents rarely support your decision to drop out of school and become a movie star.

That’s yet another reason why progressive taxation is a good idea, as if we needed another; by paring down those top incomes it makes the prospect of winning big less enticing. If NFL quarterbacks only made 10 times what chemical engineers make instead of 300 times, people would be a lot more hesitant to give up on chemical engineering to become a quarterback. If top Wall Street executives only made 50 times what normal people make instead of 5000, people with physics degrees might go back to actually being physicists instead of speculating on stock markets.

There is one case where we might not want fewer people to try, and that is entrepreneurship. Most startups fail, and only a handful go on to make mind-bogglingly huge amounts of money (often for no apparent reason, like the Snuggie and Flappy Bird), yet entrepreneurship is what drives the dynamism of a capitalist economy. We need people to start new businesses, and right now they do that mainly because of a tiny chance of a huge benefit. Yet we don’t want them to be too unrealistic in their expectations: Entrepreneurs are much more optimistic than the general population, but the most successful entrepreneurs are a bit less optimistic than other entrepreneurs. The most successful strategy is to be optimistic but realistic; this outperforms both unrealistic optimism and pessimism. That seems pretty intuitive; you have to be confident you’ll succeed, but you can’t be totally delusional. Yet it’s precisely the realistic optimists who are most likely to be disincentivized by a reduction in the top prizes.

Here’s my solution: Let’s change it from a tiny chance of a huge benefit to a large chance of a moderately large benefit. Let’s reward entrepreneurs for trying, with standards for what constitutes a really serious, good attempt rather than something frivolous that was guaranteed to fail. Use part of the funds from the progressive tax as a fund for angel grants, provided to a large number of the most promising entrepreneurs. It can’t be a million-dollar prize for the top 100. It needs to be more like a $50,000 prize for the top 100,000 (which would cost $5 billion a year, affordable for the US government). It should be paid at the proposal phase; the top 100,000 business plans receive the funding and are under no obligation to repay it. It has to be enough money that someone can rationally commit themselves to years of dedicated work without throwing themselves into poverty, and it has to be confirmed money so that they don’t have to worry about throwing themselves into debt. As for the upper limit, it only needs to be small enough that there is still an incentive for the business to succeed; but even with a 99% tax Mark Zuckerberg would still be a millionaire, so the rewards for success are high indeed.

The good news is that we actually have such a system to some extent. For research scientists rather than entrepreneurs, NSF grants are pretty close to what I have in mind, but at present they are a bit too competitive: 8,000 research grants with a median of $130,000 each and a 20% acceptance rate isn’t quite enough people—the acceptance rate should be higher, since most of these proposals are quite worthy. Still, it’s close, and definitely a much better incentive system than what we have for entrepreneurs; there are almost 12 million entrepreneurs in the United States, starting 6 million businesses a year, 75% of which fail before they can return their venture capital. Those that succeed have incomes higher than the general population, with a median income of around $70,000 per year, but most of this is accounted for by the fact that entrepreneurs are more educated and talented than the general population. Once you factor that in, successful entrepreneurs have about 50% more income on average, but their standard deviation of income is also 60% higher—so some are getting a lot and some are getting very little. Since 75% fail, we’re talking about a 25% chance of entering an income distribution that’s higher on average but much more variable, and a 75% chance of going through a period with little or no income at all—is it worth it? Maybe, maybe not. But if you could get a guaranteed $50,000 for having a good idea—and let me be clear, only serious proposals that have a good chance of success should qualify—that deal sounds an awful lot better.