The injustice of talent

Sep 4 JDN 2459827

Consider the following two principles of distributive justice.

A: People deserve to be rewarded in proportion to what they accomplish.

B: People deserve to be rewarded in proportion to the effort they put in.

Both principles sound pretty reasonable, don’t they? They both seem like sensible notions of fairness, and I think most people would broadly agree with both of them.

This is a problem, because they are mutually contradictory. We cannot possibly follow them both.

For, as much as our society would like to pretend otherwise—and I think this contradiction is precisely why our society would like to pretend otherwise—what you accomplish is not simply a function of the effort you put in.

Don’t get me wrong; it is partly a function of the effort you put in. Hard work does contribute to success. But it is neither sufficient, nor strictly necessary.

Rather, success is a function of three factors: Effort, Environment, and Talent.

Effort is the work you yourself put in, and basically everyone agrees you deserve to be rewarded for that.

Environment includes all the outside factors that affect you—including both natural and social environment. Inheritance, illness, and just plain luck are all in here, and there is general, if not universal, agreement that society should make at least some efforts to minimize inequality created by such causes.

And then, there is talent. Talent includes whatever capacities you innately have. It could be strictly genetic, or it could be acquired in childhood or even in the womb. But by the time you are an adult and responsible for your own life, these factors are largely fixed and immutable. This includes things like intelligence, disability, even height. The trillion-dollar question is: How much should we reward talent?

For talent clearly does matter. I will never swim like Michael Phelps, run like Usain Bolt, or shoot hoops like Steph Curry. It doesn’t matter how much effort I put in, how many hours I spend training—I will never reach their level of capability. Never. It’s impossible. I could certainly improve from my current condition; perhaps it would even be good for me to do so. But there are certain hard fundamental constraints imposed by biology that give them more potential in these skills than I will ever have.

Conversely, there are likely things I can do that they will never be able to do, though this is less obvious. Could Michael Phelps never be as good a programmer or as skilled a mathematician as I am? He certainly isn’t now. Maybe, with enough time, enough training, he could be; I honestly don’t know. But I can tell you this: I’m sure it would be harder for him than it was for me. He couldn’t breeze through college-level courses in differential equations and quantum mechanics the way I did. There is something I have that he doesn’t, and I’m pretty sure I was born with it. Call it spatial working memory, or mathematical intuition, or just plain IQ. Whatever it is, math comes easy to me in not so different a way from how swimming comes easy to Michael Phelps. I have talent for math; he has talent for swimming.

Moreover, these are not small differences. It’s not like we all come with basically the same capabilities with a little bit of variation that can be easily washed out by effort. We’d like to believe that—we have all sorts of cultural tropes that try to inculcate that belief in us—but it’s obviously not true. The vast majority of quantum physicists are people born with high IQ. The vast majority of pro athletes are people born with physical prowess. The vast majority of movie stars are people born with pretty faces. For many types of jobs, the determining factor seems to be talent.

This isn’t too surprising, actually—even if effort matters a lot, we would still expect talent to show up as the determining factor much of the time.

Let’s go back to that contest function model I used to analyze the job market awhile back (the one that suggests we spend way too much time and money in the hiring process). This time let’s focus on the perspective of the employees themselves.

Each employee has a level of talent, h. Employee X has talent hx and exerts effort x, producing output of a quality that is the product of these: hx x. Similarly, employee Z has talent hz and exerts effort z, producing output hz z.

Then, there’s a certain amount of luck that factors in. The most successful output isn’t necessarily the best, or maybe what should have been the best wasn’t because some random circumstance prevailed. But we’ll say that the probability an individual succeeds is proportional to the quality of their output.

So the probability that employee X succeeds is: hx x / ( hx x + hz z)

I’ll skip the algebra this time (if you’re interested you can look back at that previous post), but to make a long story short, in Nash equilibrium the two employees will exert exactly the same amount of effort.

Then, which one succeeds will be entirely determined by talent; because x = z, the probability that X succeeds is hx / ( hx + hz).

It’s not that effort doesn’t matter—it absolutely does matter, and in fact in this model, with zero effort you get zero output (which isn’t necessarily the case in real life). It’s that in equilibrium, everyone is exerting the same amount of effort; so what determines who wins is innate talent. And I gotta say, that sounds an awful lot like how professional sports works. It’s less clear whether it applies to quantum physicists.

But maybe we don’t really exert the same amount of effort! This is true. Indeed, it seems like actually effort is easier for people with higher talent—that the same hour spent running on a track is easier for Usain Bolt than for me, and the same hour studying calculus is easier for me than it would be for Usain Bolt. So in the end our equilibrium effort isn’t the same—but rather than compensating, this effect only serves to exaggerate the difference in innate talent between us.

It’s simple enough to generalize the model to allow for such a thing. For instance, I could say that the cost of producing a unit of effort is inversely proportional to your talent; then instead of hx / ( hx + hz ), in equilibrium the probability of X succeeding would become hx^2 / ( hx^2 + hz^2 ). The equilibrium effort would also be different, with x > z if hx > hz.
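As a rough numerical sketch of the two variants just described, the following computes X's equilibrium chance of success when effort costs the same for everyone and when effort is cheaper for the more talented. The function names and the sample talent values are illustrative assumptions, not part of the original model.

```python
def win_prob_equal_cost(hx: float, hz: float) -> float:
    """Equilibrium P[X succeeds] when a unit of effort costs everyone the same.

    Both players exert equal effort, so only talent separates them.
    """
    return hx / (hx + hz)


def win_prob_talent_scaled_cost(hx: float, hz: float) -> float:
    """Equilibrium P[X succeeds] when the unit cost of effort is 1/talent.

    Equilibrium efforts are then proportional to talent, so talent enters twice.
    """
    return hx**2 / (hx**2 + hz**2)


if __name__ == "__main__":
    hx, hz = 10.0, 8.0  # hypothetical talent levels, chosen only for illustration
    print(win_prob_equal_cost(hx, hz))
    print(win_prob_talent_scaled_cost(hx, hz))
```

With any hx > hz, the second probability is larger than the first, which is the sense in which cheaper effort exaggerates the talent gap rather than offsetting it.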

Once we acknowledge that talent is genuinely important, we face an ethical problem. Do we want to reward people for their accomplishment (A), or for their effort (B)? There are good cases to be made for each.

Rewarding for accomplishment, which we might call meritocracy, will tend to, well, maximize accomplishment. We’ll get the best basketball players playing basketball, the best surgeons doing surgery. Moreover, accomplishment is often quite easy to measure, even when effort isn’t.

Rewarding for effort, which we might call egalitarianism, will give people the most control over their lives, and might well feel the most fair. Those who succeed will be precisely those who work hard, even if they do things they are objectively bad at. Even people who are born with very little talent will still be able to make a living by working hard. And it will ensure that people do work hard, which meritocracy can actually fail at: If you are extremely talented, you don’t really need to work hard because you just automatically succeed.

Capitalism, as an economic system, is very good at rewarding accomplishment. I think part of what makes socialism appealing to so many people is that it tries to reward effort instead. (Is it very good at that? Not so clear.)

The more extreme differences are actually in terms of disability. There’s a certain baseline level of activities that most people are capable of, which we think of as “normal”: most people can talk; most people can run, if not necessarily very fast; most people can throw a ball, if not pitch a proper curveball. But some people can’t throw. Some people can’t run. Some people can’t even talk. It’s not that they are bad at it; it’s that they are literally not capable of it. No amount of effort could have made Stephen Hawking into a baseball player—not even a bad one.

It’s these cases when I think egalitarianism becomes most appealing: It just seems deeply unfair that people with severe disabilities should have to suffer in poverty. Even if they really can’t do much productive work on their own, it just seems wrong not to help them, at least enough that they can get by. But capitalism by itself absolutely would not do that—if you aren’t making a profit for the company, they’re not going to keep you employed. So we need some kind of social safety net to help such people. And it turns out that such people are quite numerous, and our current system is really not adequate to help them.

But meritocracy has its pull as well. Especially when the job is really important—like surgery, not so much basketball—we really want the highest quality work. It’s not so important whether the neurosurgeon who removes your tumor worked really hard at it or found it a breeze; what we care about is getting that tumor out.

Where does this leave us?

I think we have no choice but to compromise, on both principles. We will reward both effort and accomplishment, to greater or lesser degree—perhaps varying based on circumstances. We will never be able to entirely reward accomplishment or entirely reward effort.

This is more or less what we already do in practice, so why worry about it? Well, because we don’t like to admit that it’s what we do in practice, and a lot of problems seem to stem from that.

We have people acting like billionaires are such brilliant, hard-working people just because they’re rich—because our society rewards effort, right? So they couldn’t be so successful if they didn’t work so hard, right? Right?

Conversely, we have people who denigrate the poor as lazy and stupid just because they are poor. Because it couldn’t possibly be that their circumstances were worse than yours? Or hey, even if they are genuinely less talented than you—do less talented people deserve to be homeless and starving?

We tell kids from a young age, “You can be whatever you want to be”, and “Work hard and you’ll succeed”; and these things simply aren’t true. There are limitations on what you can achieve through effort—limitations imposed by your environment, and limitations imposed by your innate talents.

I’m not saying we should crush children’s dreams; I’m saying we should help them to build more realistic dreams, dreams that can actually be achieved in the real world. And then, when they grow up, they either will actually succeed, or when they don’t, at least they won’t hate themselves for failing to live up to what you told them they’d be able to do.

If you were wondering why Millennials are so depressed, that’s clearly a big part of it: We were told we could be and do whatever we wanted if we worked hard enough, and then that didn’t happen; and we had so internalized what we were told that we thought it had to be our fault that we failed. We didn’t try hard enough. We weren’t good enough. I have spent years feeling this way—on some level I do still feel this way—and it was not because adults tried to crush my dreams when I was a child, but on the contrary because they didn’t do anything to temper them. They never told me that life is hard, and people fail, and that I would probably fail at my most ambitious goals—and it wouldn’t be my fault, and it would still turn out okay.

That’s really it, I think: They never told me that it’s okay not to be wildly successful. They never told me that I’d still be good enough even if I never had any great world-class accomplishments. Instead, they kept feeding me the lie that I would have great world-class accomplishments; and then, when I didn’t, I felt like a failure and I hated myself. I think my own experience may be particularly extreme in this regard, but I know a lot of other people in my generation who had similar experiences, especially those who were also considered “gifted” as children. And we are all now suffering from depression, anxiety, and Impostor Syndrome.

All because nobody wanted to admit that talent, effort, and success are not the same thing.

Hyper-competition

Dec 13 JDN 2459197

This phenomenon has been particularly salient for me the last few months, but I think it’s a common experience for most people in my generation: Getting a job takes an awful lot of work.

Over the past six months, I’ve applied to over 70 different positions and so far gone through 4 interviews (2 by video, 2 by phone). I’ve done about 10 hours of test work. That so far has gotten me no offers, though I have yet to hear from 50 employers. Ahead of me I probably have about another 10 interviews, then perhaps 4 of what would have been flyouts and in-person presentations but instead will be “comprehensive interviews” and presentations conducted online, likely several more hours of test work, and then finally, maybe, if I’m lucky, I’ll get a good offer or two. If I’m unlucky, I won’t, and I’ll have to stick around for another year and do all this over again next year.

Aside from the limitations imposed by the pandemic, this is basically standard practice for PhD graduates. And this is only the most extreme end of a continuum of intensive job search efforts, in which even applying to be a cashier at Target requires a formal application, references, and a personality test.

This wasn’t how things used to be. Just a couple of generations ago, low-wage employers would more or less hire you on the spot, with perhaps a resume or a cursory interview. More prestigious employers would almost always require a CV with references and an interview, but it more or less stopped there. I discussed in an earlier post how much of the difference actually seems to come from our chronic labor surplus.

Is all of this extra effort worthwhile? Are we actually fitting people to better jobs this way? Even if the matches are better, are they enough better to justify all this effort?

It is a commonly-held notion among economists that competition in markets is good, that it increases efficiency and improves outcomes. I think that this is often, perhaps usually, the case. But the labor market has become so intensely competitive, particularly for high-paying positions, that the costs of this competitive effort likely outweigh the benefits.

How could this happen? Shouldn’t the free market correct for such an imbalance? Not necessarily. Here is a simple formal model of how this sort of intensive competition can result in significant waste.

Note that this post is about a formal mathematical model, so it’s going to use a lot of algebra. If you are uninterested in such things, you can read the next two paragraphs and then skip to the conclusions at the end.

The overall argument is straightforward: If candidates are similar in skill level, a complicated application process can make sense from a firm’s perspective, but be harmful from society’s perspective, due to the great cost to the applicants. This can happen because the difficult application process imposes an externality on the workers who don’t get the job.

All right, here is where the algebra begins.

I’ve included each equation as both formatted text and LaTeX.

Consider a competition between two applicants, X and Z.

They are each asked to complete a series of tasks in an application process. The amount of effort X puts into the application is x, and the amount of effort Z puts into the application is z. Let’s say each additional bit of effort has a fixed cost, normalized to 1.

Let’s say that their skills are similar, but not identical; this seems quite realistic. X has skill level hx, and Z has skill level hz.

Getting hired has a payoff for each worker of V. This includes all the expected benefits of the salary, benefits, and working conditions. I’ll assume that these are essentially the same for both workers, which also seems realistic.

The benefit to the employer is proportional to the worker’s skill, so letting h be the skill level of the actually hired worker, the benefit of hiring that worker is hY. The reason they are requiring this application process is precisely because they want to get the worker with the highest h. Let’s say that this application process has a cost to implement, c.

Who will get hired? Well, presumably whoever does better on the application. The skill level will amplify the quality of their output, let’s say proportionally to the effort they put in; so X’s expected quality will be hx x and Z’s expected quality will be hz z.

Let’s also say there’s a certain amount of error in the process; maybe the more-qualified candidate will sleep badly the day of the interview, or make a glaring and embarrassing typo on their CV. And quite likely the quality of application output isn’t perfectly correlated with the quality of actual output once hired. To capture all this, let’s say that having more skill and putting in more effort only increases your probability of getting the job, rather than actually guaranteeing it.

In particular, let’s say that the probability of X getting hired is P[X] = hx x/(hx x + hz z).

\[ P[X] = \frac{h_x x}{h_x x + h_z z} \]

This results in a contest function, a type of model that I’ve discussed in some earlier posts in a rather different context.


The expected payoff for worker X is:

E[Ux] = hx x/(hx x + hz z) V – x

\[ E[U_x] = \frac{h_x x}{h_x x + h_z z} V - x \]

Maximizing this with respect to the choice of effort x (which is all that X can control at this point) yields:

hx hz z V = (hx x + hz z)^2

\[ h_x h_z z V = (h_x x + h_z z)^2 \]

A similar maximization for worker Z yields:

hx hz x V = (hx x + hz z)^2

\[ h_x h_z x V = (h_x x + h_z z)^2 \]

It follows that x=z, i.e. X and Z will exert equal efforts in Nash equilibrium. Their probability of success will then be contingent entirely on their skill levels:

P[X] = hx/(hx + hz).

\[ P[X] = \frac{h_x}{h_x + h_z} \]

Substituting that back in, we can solve for the actual amount of effort:

hx hz x V = (hx + hz)^2 x^2

\[h_x h_z x V = (h_x + h_z)^2 x^2 \]

x = hx hz V/(hx + hz)^2

\[ x = \frac{h_x h_z}{(h_x + h_z)^2} V \]
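As a sanity check on this derivation, here is a minimal sketch that finds the equilibrium numerically by letting each applicant repeatedly best-respond to the other, then compares the result to the closed form. The function names, the starting point, and the sample parameters are assumptions of mine; alternating best responses is just one convenient way to locate the equilibrium, and it is not guaranteed to converge for every choice of parameters.

```python
import math


def best_response(h_own: float, h_other: float, e_other: float, V: float) -> float:
    """Effort satisfying the first-order condition, given the rival's effort.

    Solves h_own * h_other * e_other * V = (h_own * e_own + h_other * e_other)^2
    for e_own, clamped at zero because effort cannot be negative.
    """
    rival_output = h_other * e_other
    return max(0.0, (math.sqrt(h_own * rival_output * V) - rival_output) / h_own)


def solve_equilibrium(hx: float, hz: float, V: float, rounds: int = 200):
    """Alternate best responses from an arbitrary (assumed) starting point."""
    x, z = 1.0, 1.0
    for _ in range(rounds):
        x = best_response(hx, hz, z, V)
        z = best_response(hz, hx, x, V)
    return x, z


def closed_form_effort(hx: float, hz: float, V: float) -> float:
    """Equilibrium effort x = z = hx hz V / (hx + hz)^2 from the text."""
    return hx * hz * V / (hx + hz) ** 2


if __name__ == "__main__":
    hx, hz, V = 10.0, 8.0, 180.0  # illustrative values
    print(solve_equilibrium(hx, hz, V))
    print(closed_form_effort(hx, hz, V))
```

For these sample values the two applicants' efforts settle on the same number, matching the claim that x = z in Nash equilibrium even though their skill levels differ.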

Now let’s see what that gives for the expected payoffs of the firm and the workers. This is worker X’s expected payoff:

E[Ux] = hx/(hx + hz) V – hx hz V/(hx + hz)^2 = (hx/(hx + hz))^2 V

\[ E[U_x] = \frac{h_x}{h_x + h_z} V - \frac{h_x h_z}{(h_x + h_z)^2} V = \left( \frac{h_x}{h_x + h_z}\right)^2 V \]

Worker Z’s expected payoff is the same, with hx and hz exchanged:

E[Uz] = (hz/(hx + hz))^2 V

\[ E[U_z] = \left( \frac{h_z}{h_x + h_z}\right)^2 V \]

What about the firm? Their expected payoff is the probability of hiring X, times the value of hiring X, plus the probability of hiring Z, times the value of hiring Z, all minus the cost c:

E[Uf] = hx/(hx + hz) hx Y + hz/(hx + hz) hz Y – c = (hx^2 + hz^2)/(hx + hz) Y – c

\[ E[U_f] = \frac{h_x}{h_x + h_z} h_x Y + \frac{h_z}{h_x + h_z} h_z Y - c = \frac{h_x^2 + h_z^2}{h_x + h_z} Y - c \]

To see whether the application process was worthwhile, let’s compare against the alternative of simply flipping a coin and hiring X or Z at random. The probability of getting hired is then 1/2 for each candidate.

Expected payoffs for X and Z are now equal:

E[Ux] = E[Uz] = V/2

\[ E[U_x] = E[U_z] = \frac{V}{2} \]

The expected payoff for the firm can be computed the same as before, but now without the cost c:

E[Uf] = 1/2 hx Y + 1/2 hz Y = (hx + hz)/2 Y

\[ E[U_f] = \frac{1}{2} h_x Y + \frac{1}{2} h_z Y = \frac{h_x + h_z}{2} Y \]

This has a very simple interpretation: The expected value to the firm is just the average quality of the two workers, times the overall value of the job.

Which of these two outcomes is better? Well, that depends on the parameters, of course. But in particular, it depends on the difference between hx and hz.

Consider two extremes: In one case, the two workers are indistinguishable, and hx = hz = h. In that case, the payoffs for the hiring process reduce to the following:

E[Ux] = E[Uz] = V/4

\[ E[U_x] = E[U_z] = \frac{V}{4} \]

E[Uf] = h Y – c

\[ E[U_f] = h Y - c \]

Compare this against the payoffs for hiring randomly:

E[Ux] = E[Uz] = V/2

\[ E[U_x] = E[U_z] = \frac{V}{2} \]

E[Uf] = h Y

\[ E[U_f] = h Y \]

Both the workers and the firm are strictly better off if the firm just hires at random. This makes sense, since the workers have identical skill levels.

Now consider the other extreme, where one worker is far better than the other; in fact, one is nearly worthless, so hz ~ 0. (I can’t do exactly zero because I’d be dividing by zero, but let’s say one is 100 times better or something.)

In that case, the payoffs for the hiring process reduce to the following:

E[Ux] = V

E[Uz] = 0

\[ E[U_x] = V \]

\[ E[U_z] = 0 \]

X will definitely get the job, so X is much better off.

E[Uf] = hx Y – c

\[ E[U_f] = h_x Y - c \]

If the firm had hired randomly, this would have happened instead:

E[Ux] = E[Uz] = V/2

\[ E[U_x] = E[U_z] = \frac{V}{2} \]

E[Uf] = hx Y/2

\[ E[U_f] = \frac{h_x}{2} Y \]

As long as c < hx Y/2, both the firm and the higher-skill worker are better off in this scenario. (The lower-skill worker is worse off, but that’s not surprising.) The total expected benefit for everyone is also higher in this scenario.


Thus, the difference in skill level between the applicants is vital. If candidates are very different in skill level, in a way that the application process can accurately measure, then a long and costly application process can be beneficial, not only for the firm but also for society as a whole.

In these extreme examples, it was either not worth it for the firm, or worth it for everyone. But there is an intermediate case worth looking at, where the long and costly process can be worth it for the firm, but not for society as a whole. I will call this case hyper-competition—a system that is so competitive it makes society overall worse off.

This inefficient result occurs precisely when the firm’s gain in hiring quality exceeds the cost c, but falls short of c plus the expected payoff the workers lose relative to random hiring:

c < (hx^2 + hz^2)/(hx + hz) Y – (hx + hz)/2 Y < c + V – (hx/(hx + hz))^2 V – (hz/(hx + hz))^2 V

\[ c < \frac{h_x^2 + h_z^2}{h_x + h_z} Y - \frac{h_x + h_z}{2} Y < c + V - \left( \frac{h_x}{h_x + h_z}\right)^2 V - \left( \frac{h_z}{h_x + h_z}\right)^2 V \]

This simplifies to:

c < (hx – hz)^2/(2hx + 2hz) Y < c + 2 hx hz/(hx + hz)^2 V

\[ c < \frac{(h_x - h_z)^2}{2 (h_x + h_z)} Y < c + \frac{2 h_x h_z}{(h_x+h_z)^2} V \]

If c is small, then we are interested in the case where:

(hx – hz)^2 Y/2 < 2 hx hz/(hx + hz) V

\[ \frac{(h_x - h_z)^2}{2} Y < \frac{2 h_x h_z}{h_x + h_z} V \]

This is true precisely when the difference hx – hz is small compared to the overall size of hx or hz—that is, precisely when candidates are highly skilled but similar. This is pretty clearly the typical case in the real world. If the candidates were obviously different, you wouldn’t need a competitive process.

For instance, suppose that hx = 10 and hz = 8, while V = 180, Y = 20 and c = 1.

Then, if we hire randomly, these are the expected payoffs:

E[Uf] = (hx + hz)/2 Y = 180

E[Ux] = E[Uz] = V/2 = 90

If we use the complicated hiring process, these are the expected payoffs:

E[Ux] = (hx/(hx + hz))^2 V ≈ 55.6

E[Uz] = (hz/(hx + hz))^2 V ≈ 35.6

E[Uf] = (hx^2 + hz^2)/(hx + hz) Y – c ≈ 181.2

The firm gets a net benefit of about 1.2, quite small; while the workers face a far larger total expected loss of about 89. And these candidates aren’t that similar: One is 25% better than the other. Yet because the effort expended in applying was so large, even this improvement in quality wasn’t worth it from society’s perspective.
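The arithmetic in this example is easy to reproduce. Below is a small sketch that evaluates the payoff formulas from the derivation for both hiring schemes, so the figures can be checked to more decimal places or rerun with other parameters. The function names are my own; the formulas are the ones derived above.

```python
def process_payoffs(hx: float, hz: float, V: float, Y: float, c: float):
    """Expected payoffs (worker X, worker Z, firm) under the costly process."""
    total = hx + hz
    worker_x = (hx / total) ** 2 * V
    worker_z = (hz / total) ** 2 * V
    firm = (hx**2 + hz**2) / total * Y - c
    return worker_x, worker_z, firm


def random_payoffs(hx: float, hz: float, V: float, Y: float):
    """Expected payoffs (worker X, worker Z, firm) under a coin flip."""
    return V / 2, V / 2, (hx + hz) / 2 * Y


if __name__ == "__main__":
    hx, hz, V, Y, c = 10.0, 8.0, 180.0, 20.0, 1.0  # values from the example
    px, pz, pf = process_payoffs(hx, hz, V, Y, c)
    rx, rz, rf = random_payoffs(hx, hz, V, Y)
    print("firm gain from process:", pf - rf)
    print("workers' total loss:   ", (rx + rz) - (px + pz))
    print("change in total payoff:", (px + pz + pf) - (rx + rz + rf))
```

The last line is the one that matters for the hyper-competition argument: a negative change in total payoff alongside a positive firm gain is exactly the intermediate case described above.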

This concludes the algebra for today, if you’ve been skipping it.

In this model I’ve only considered the case of exactly two applicants, but this can be generalized to more applicants, and the effect only gets stronger: Seemingly-large differences in each worker’s skill level can be outweighed by the massive cost of making so many people work so hard to apply and get nothing to show for it.

Thus, hyper-competition can exist despite apparently large differences in skill. Indeed, it is precisely in the typical real-world scenario, with many applicants who are similar, that we expect to see the greatest inefficiencies. In the absence of intervention, we should expect markets to get this wrong.

Of course, we don’t actually want employers to hire randomly, right? We want people who are actually qualified for their jobs. Yes, of course; but you can probably assess that with nothing more than a resume and maybe a short interview. Most employers are not actually trying to find qualified candidates; they are trying to sift through a long list of qualified candidates to find the one that they think is best qualified. And my suspicion is that most of them honestly don’t have good methods of determining that.

This means that it could be an improvement for society to simply ban long hiring processes like these—indeed, perhaps ban job interviews altogether, as I can hardly think of a more efficient mechanism for allowing employers to discriminate based on race, gender, age, or disability than a job interview. Just collect a resume from each applicant, remove the ones that are unqualified, and then roll a die to decide which one you hire.

This would probably make the fit of workers to their jobs somewhat worse than the current system. But most jobs are learned primarily through experience anyway, so once someone has been in a job for a few years it may not matter much who was hired originally. And whatever cost we might pay in less efficient job matches could be made up several times over by the much faster, cheaper, easier, and less stressful process of applying for jobs.

Indeed, think for a moment of how much worse it feels being turned down for a job after a lengthy and costly application process that is designed to assess your merit (but may or may not actually do so particularly well), as opposed to simply finding out that you lost a high-stakes die roll. Employers could even send out letters saying one of two things: “You were rejected as unqualified for this position.” versus “You were qualified, but you did not have the highest die roll.” Applying for jobs already feels like a crapshoot; maybe it should literally be one.

People would still have to apply for a lot of jobs—actually, they’d probably end up applying for more, because the lower cost of applying would attract more applicants. But since the cost is so much lower, it would still almost certainly be easier to do a job search than it is in the current system. In fact, it could largely be automated: simply post your resume on a central server and the system matches you with employers’ requirements and then randomly generates offers. Employers and prospective employees could fill out a series of forms just once indicating what they were looking for, and then the system could do the rest.

What I find most interesting about this policy idea is that it is in an important sense anti-meritocratic. We are in fact reducing the rewards for high levels of skill—at least a little bit—in order to improve society overall and especially for those with less skill. This is exactly the kind of policy proposal that I had hoped to see from a book like The Meritocracy Trap, but never found there. Perhaps it’s too radical? But the book was all about how we need fundamental, radical change—and then its actual suggestions were simple, obvious, and almost uncontroversial.

Note that this simplified process would not eliminate the incentives to get major, verifiable qualifications like college degrees or years of work experience. In fact, it would focus the incentives so that only those things matter, instead of whatever idiosyncratic or even capricious preferences HR agents might have. There would be no more talk of “culture fit” or “feeling right for the job”, just: “What is their highest degree? How many years have they worked in this industry?” I suppose this is credentialism, but in a world of asymmetric information, I think credentialism may be our only viable alternative to nepotism.

Of course, it’s too late for me. But perhaps future generations may benefit from this wisdom.

What about a tax on political contributions?

Jan 7, JDN 2458126

In my previous post, I argued that an advertising tax could reduce advertising, raise revenue, and produce almost no real economic distortion. Now I’m going to generalize this idea to an even bolder proposal: What if we tax political contributions?

Donations to political campaigns are very similar to advertising. A contest function framework also makes a lot of sense: Increased spending improves your odds of winning, but it doesn’t actually produce any real goods.

Suppose there’s some benefit B that I get if a given politician wins an election. That benefit could include direct benefits to me, as well as altruistic benefits to other citizens I care about, or even my concern for the world as a whole. But presumably, I do benefit in some fashion from my favored politician winning—otherwise, why are they my favored politician?

In this very simple model, let’s assume that there are only two parties and two donors (obviously in the real world there are more parties and vastly more donors; but it doesn’t fundamentally change the argument). Say I will donate x and the other side will donate y.

Assuming that donations are all that matter, the probability my party will win the election is x/(x+y).

Fortunately that isn’t the case. A lot of things matter, some that should (policy platforms, experience, qualifications, character) and some that shouldn’t (race, gender, age, height; part of why Trump won may in fact be that he is tall, as he’s about 6’1”). So let’s put all the other factors that affect elections into a package and call that F.

The probability that my candidate wins is then x/(x+y) + F, where F can be positive or negative. If F is positive, it means that my candidate is more likely to win, while if it’s negative, it means my candidate is less likely to win. (If you want to be pedantic, the probability of winning has to be capped at 0 and 1, but this doesn’t fundamentally change the argument, and only matters for candidates that are obvious winners or obvious losers regardless of how much anyone donates.)

The donation costs me money, x. The cost in utility of that money depends on my utility function, so for now I’ll just call it a cost function C(x).

Then my net benefit is:

B*[x/(x+y)+F] – C(x)

I can maximize this by a first-order condition. Notice how the F just drops out. I like F to be large, but it doesn’t affect my choice of x.

B*y/(x+y)^2 = C'(x)

Turning that into an exact value requires knowing my cost function and my opponent’s cost function (which need not be the same, in general; unlike the advertising case, it’s not a matter of splitting fungible profits between us), but it’s actually possible to stop here. We can already tell that there is a well-defined solution: There’s a certain amount of donation x that maximizes my expected utility, given the amount y that the other side has donated. Moreover, with a little bit of calculus you can show that the optimal amount of x is increasing in y as long as I am giving more than they are (x > y), which makes intuitive sense: The more they give, the more you need to give in order to keep up. (Once they are outspending me, further giving on their side makes it less worthwhile for me to try to match them.) Since each of us has a well-defined best response to the other, there is a Nash equilibrium: At some amount x and y we each are giving the optimal amount from our perspective.

We can get a precise answer if we assume that the amount of the donations is small compared to my overall wealth, so I will be approximately risk-neutral; then we can just say C(x) = x, and C'(x) = 1:

B*y/(x+y)^2 = 1

Then we get essentially the same result we did for the advertising:

x = y = B/4
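To see where B/4 comes from without doing the algebra, here is a minimal sketch that lets two donors alternately best-respond to each other in the risk-neutral case. It assumes, as the symmetric result does, that both donors place the same value B on winning; the function names and the starting donation are illustrative choices of mine.

```python
import math


def best_donation(B: float, y: float) -> float:
    """Donation x solving B*y/(x+y)^2 = 1, floored at zero.

    Rearranging the first-order condition gives x = sqrt(B*y) - y.
    """
    return max(0.0, math.sqrt(B * y) - y)


def donation_equilibrium(B: float, rounds: int = 100):
    """Alternate best responses; both donors are assumed to value winning at B."""
    x = y = B / 10  # arbitrary positive starting point
    for _ in range(rounds):
        x = best_donation(B, y)
        y = best_donation(B, x)
    return x, y


if __name__ == "__main__":
    B = 100.0  # hypothetical benefit from my candidate winning
    print(donation_equilibrium(B))  # compare against B / 4
```

Note that F never appears in the code: because it enters the net benefit additively, it has no effect on either donor's best response.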

According to this, I should be willing to donate up to one-fourth of the benefit I’d get from my candidate winning. This actually sounds quite high; I think once you take into account the fact that lots of other people are donating and political contributions aren’t that effective at winning elections, the optimal donation is actually quite a bit smaller, though perhaps still larger than most people give.

If we impose a tax rate r on political contributions, nothing changes. The cost to me of donating is still the same, and as long as the tax is proportional, the ratio x/(x+y) and the probability x/(x+y) + F will remain exactly the same as before. Therefore, I will continue to donate the same amount, as will my opponent, and each candidate will have the same probability of winning as before. The only difference is that some of the money (r of the money, to be precise) will go to the government instead of the politicians.

The total amount of donations will not change. The probability of each candidate winning will not change. All that will happen is money will be transferred from politicians to the government. If this tax revenue is earmarked for some socially beneficial function, this will obviously be an improvement in welfare.

The revenue gained is not nearly as large an amount of money as is spent on advertising (which tells you something about American society), but it’s still quite a bit: Since we currently spend about $5 billion per year on federal elections, a tax rate of 50% could raise about $2.5 billion.

But in fact this seriously under-estimates the benefits of such a tax. This simple model assumes that political contributions only change which candidate wins; but that’s actually not the main concern. (If F is large enough, it can offset any possible donations.)
The real concern is how political contributions affect the choices politicians make once they get into office. While outright quid-pro-quo bribery is illegal, it’s well-known that many corporations and wealthy individuals will give campaign donations with the reasonable expectation of influencing what sort of policies will be made.

You don’t think Goldman Sachs gives millions of dollars each election out of the goodness of their hearts, do you? And they give to both major parties, which really only makes sense if their goal is not to make a particular candidate win, but to make sure that whoever wins feels indebted to Goldman Sachs. (I guess it could also be to prevent third parties from winning—but they hardly ever win anyway, so that wouldn’t be a smart investment from the bank’s perspective.)

Lynda Powell at the University of Rochester has documented the many subtle but significant ways that these donations have influenced policy. Campaign donations aren’t as important as party platforms, but a lot of subtle changes across a wide variety of policies add up to large differences in outcomes.

A political contribution tax would reduce these influences. If politicians’ sole goal were to win, the tax would have no effect. But it seems quite likely that politicians enjoy various personal benefits from lobbying and campaign contributions: Fine dinners, luxurious vacations, and so on. And insofar as that is influencing politicians’ behavior, it is both obviously corrupt and clearly reduced by a political contribution tax. How large an effect this would be is difficult to say; but the direction of the effect is clearly the one we want.

Taxing donations would also allow us to protect the right to give to campaigns (which does seem to be a limited kind of civil liberty, even though the precise interpretation “money is speech” is Orwellian), while reducing corruption and allowing us to keep close track of donations that are made. Taxing a money stream, even a small amount, is often one of the best ways to incentivize close monitoring of that money stream.

With a subtle change, the tax could even be made to bias in favor of populism: All you need to do is exempt small donations from the tax. If say the first $1000 per person per year is exempt from taxation, then the imposition of the tax will reduce the effectiveness of million-dollar contributions from Goldman Sachs and the Koch brothers without having any effect on $50 donations from people like you and me. That would technically be “distorting” elections—but it seems like it might be a distortion worth making.
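
To make the exemption concrete, here is a small Python sketch. It assumes the tax applies only to the part of a person's yearly donations above the exempt amount, and it uses a 50% rate purely as an example; the function names are hypothetical.

```python
def contribution_tax(total_donated: float, rate: float, exemption: float = 1000.0) -> float:
    """Tax owed on one person's donations for the year.

    Assumption: only the amount above the exemption is taxed.
    """
    taxable = max(0.0, total_donated - exemption)
    return rate * taxable

def money_reaching_campaign(total_donated: float, rate: float, exemption: float = 1000.0) -> float:
    """What is left for the campaign after the tax is taken out."""
    return total_donated - contribution_tax(total_donated, rate, exemption)

for donation in (50.0, 1_000_000.0):
    print(donation, money_reaching_campaign(donation, rate=0.5))
```

Under these assumptions the $50 donation arrives intact, while roughly half of the million-dollar donation goes to the government.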

Of course, this is probably even less likely to happen than the advertising tax.

The potential of an advertising tax

Jan 7, JDN 2458126

Advertising is everywhere in our society. You may see some on this very page (though if I hit my next Patreon target I’m going to pay to get rid of those). Ad-blockers can help when you’re on the Web, and premium channels like HBO will save you from ads when watching TV, but what are you supposed to do about ads on billboards as you drive down the highway, ads on buses as you walk down the street, ads on the walls of the subway train?

And Banksy isn’t entirely wrong; this stuff can be quite damaging. Based on decades of research, the American Psychological Association has issued official statements condemning the use of advertising to children for its harmful psychological effects. Medical research has shown that advertisements for food can cause overeating—and thus, the correlated rise of advertising and obesity may be no coincidence.

Worst of all, political advertising distorts our view of the world, though we may not be able to blame advertising per se for Trump; most of his publicity was gained for free through irresponsible media coverage.

And yet, advertising is almost pure rent-seeking. It costs resources, but it doesn’t produce anything. In most cases it doesn’t even raise awareness about something or find new customers. The primary goal of most advertising is to get you to choose that brand instead of a different brand. A secondary goal (especially for food ads) is to increase your overall consumption of that good, but since the means employed typically involve psychological manipulation, this increase in consumption is probably harmful to social welfare.

A general principle of economics that has almost universal consensus is the Pigou Principle: If you want less of something, you should put a tax on it. So, what would happen if we put a tax on advertising?

The amazing thing is that in this case, we would probably not actually reduce advertising spending, but we would reduce advertising, which is what we actually care about. Moreover, we would be able to raise an enormous amount of revenue with zero social cost. Like the other big Pigovian tax (the carbon tax), this is a rare example of a tax that will give you a huge amount of revenue while actually yielding a benefit to society.

This is far from obvious, so I think it is worth explaining where it comes from.

The key point is that advertising doesn’t typically increase the overall size of the market (though in some cases it does; I’ll get back to that in a moment). Rather than a conventional production function like we would have for most types of expenditure, advertising is better modeled by what is called a contest function (something that our own Stergios Skaperdas at UCI is actually a world-class expert in). In a production function, inputs increase the total amount of output. But in a contest function, inputs only redistribute output from one place to another. Contest functions thus provide a good model of rent-seeking, which is what most advertising is.

Suppose there’s a total market M for some good, where M is the total profits that can be gained from capturing that entire market.
Then, to keep it simple, let’s suppose there are only two major firms in the market, a duopoly like Coke and Pepsi or Boeing and Airbus.

Let’s say Coke decides to spend an amount x on advertising, and Pepsi decides to spend an amount y.

For now, let’s assume that total beverage consumption won’t change; so the total profits to be had from the market are always M.

What advertising does is it changes the share of that market which each firm will get. Specifically, let’s use the simplest model, where the share of the market is equal to the share of advertising spending.

Then the net profit for Coke is the following:

The share they get, x/(x+y), times the size of the whole market, M, minus the advertising spending x.

max M*x/(x+y) – x

We can maximize this with the usual first-order condition:

M*y/(x+y)^2 – 1 = 0

(x+y)^2 = My

Since the game is symmetric, in a Nash equilibrium, Pepsi will use the same reasoning:

(x+y)^2 = Mx

Thus we have:

x = y

(2x)^2 = Mx

x = M/4

In this very simple model, each firm will spend one-fourth of the market’s value, and the total advertising spending will be equal to half the size of the market. Then, each company’s net income will be equal to its advertising spending. This is a pretty good estimate for Coca-Cola in real life, which spends about $3.3 billion on advertising and receives about $2.8 billion in net income each year.
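
One quick way to sanity-check the M/4 result is brute force. This Python sketch assumes an arbitrary market size of M = 100, fixes Pepsi's spending at M/4, and tries every whole-dollar budget for Coke; the grid and the function name are my own illustrative choices.

```python
def coke_profit(x: float, y: float, M: float) -> float:
    """Coke's net profit: market share x/(x+y) times market size M, minus ad spending x."""
    return M * x / (x + y) - x

M = 100.0   # hypothetical total market profit
y = M / 4   # Pepsi spends the claimed equilibrium amount

# Try every whole-dollar ad budget from 1 to M and keep the most profitable one.
candidates = range(1, int(M) + 1)
best_x = max(candidates, key=lambda x: coke_profit(x, y, M))

print(best_x, coke_profit(best_x, y, M))
```

If the algebra is right, the most profitable budget is M/4 and the resulting net profit is also M/4.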

What would happen if we introduce a tax? Let’s say we introduce a proportional tax r on all advertising spending. That is, for every dollar you spend on advertising, you must pay the government $r in tax. The really remarkable thing is that companies who advertise shouldn’t care what we make the tax; the only ones who will care are the advertising companies themselves.

If Coke pays x, the actual amount of advertising they receive is x – r x = x(1-r).

Likewise, Pepsi’s actual advertising received is y(1-r).

But notice that the share of total advertising spending is completely unchanged!

(x(1-r))/(x(1-r) + y(1-r)) = x/(x+y)

Since the payoff for Coke only depends on how much Coke spends and what market share they get, it is also unchanged. Since the same is true for Pepsi, nothing will change in how the two companies behave. They will spend the same amount on advertising, and they will receive the same amount of net income when all is said and done.

The total quantity of advertising will be reduced, from x+y to (x+y)(1-r). That means fewer billboards, fewer posters in subway stations, fewer TV commercials. That will hurt advertising companies, but benefit everyone else.

How much revenue will we get for the government? r x + r y = r(x+y).
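
The neutrality argument is easy to check numerically. This is only a sketch of the pure contest model above, with hypothetical spending levels and a tax rate r below 100%; the function names are mine.

```python
def market_share(x: float, y: float, r: float = 0.0) -> float:
    """Coke's share of the advertising that actually gets bought after tax (requires r < 1)."""
    return x * (1 - r) / (x * (1 - r) + y * (1 - r))

def tax_outcomes(x: float, y: float, r: float) -> dict:
    """Advertising delivered and tax revenue for given spending levels and tax rate."""
    return {
        "advertising": (x + y) * (1 - r),  # what is left to buy actual ads
        "revenue": r * (x + y),            # what goes to the government
    }

x, y, r = 25.0, 25.0, 0.5  # hypothetical spending levels and tax rate
print(market_share(x, y), market_share(x, y, r))
print(tax_outcomes(x, y, r))
```

The two shares printed on the first line should match for any r below 1, which is why neither firm has a reason to change its spending.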

Since the goal is to substantially reduce advertising output, and it won’t distort other industries in any way, we should set this tax quite high. A reasonable value for r would be 50%. We might even want to consider something as high as 90%; but for now let’s look at what 50% would do.

Total advertising spending in the US is over $200 billion per year. Since an advertising tax would not change total advertising spending, we can expect that a tax rate of 50% would simply capture 50% of this spending as revenue, which is to say $100 billion per year. That would be enough to pay for the entire Federal education budget, or the foreign aid and environment budgets combined.

Another great thing about an advertising tax, and one way it is actually better than a carbon tax, is that countries will want to compete to have the highest advertising taxes. If, say, Canada imposes a carbon tax but the US doesn’t, industries will move production to the US where it is cheaper, which hurts Canada. Yet the total amount of pollution will remain about the same, and Canada will be just as affected by climate change as it would have been anyway. So we need to coordinate across countries so that the carbon taxes are all the same (or at least close), to prevent industries from moving around; and each country has an incentive to cheat by imposing a lower carbon tax.

But advertising taxes aren’t like that. If Canada imposes an advertising tax and the US doesn’t, companies won’t shift production to the US; they will shift advertising to the US. And having your country suddenly flooded with advertisements is bad. That provides a strong incentive for you to impose your own equal or even higher advertising tax to stem the tide. And pretty soon, everyone will have imposed an advertising tax at the same rate.

Of course, in all the above I’ve assumed a pure contest function, meaning that advertisements are completely unproductive. What if they are at least a little bit productive? Then we wouldn’t want to set the tax too high, but the basic conclusions would be unchanged.

Suppose, for instance, that the advertising spending adds half its value to the value of the market. This is a pretty high estimate of the benefits of advertising.

Under this assumption, in place of M we have M+(x+y)/2. Everything else is unchanged.

We can maximize as before:

max (M+(x+y)/2)*x/(x+y) – x

The math is a bit trickier, but we can still solve by a first-order condition, which simplifies to:

(x+y)^2 = 2My
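
For anyone who wants the intermediate steps, here is the derivation written out. The symbol π for Coke's net profit is my own shorthand; everything else follows the definitions above.

```latex
\begin{align*}
\pi(x) &= \left(M + \frac{x+y}{2}\right)\frac{x}{x+y} - x
        = \frac{Mx}{x+y} - \frac{x}{2} \\
\frac{\partial \pi}{\partial x} &= \frac{My}{(x+y)^2} - \frac{1}{2} = 0 \\
(x+y)^2 &= 2My
\end{align*}
```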

By the same symmetry reasoning as before:

(2x)^2 = 2Mx

x = M/2

Now, total advertising spending would equal the size of the market without advertising. The market itself grows to M + M/2 = 3M/2, so net income for each firm after advertising would be:

(3M/2)(1/2) – M/2 = M/4

That is, each firm would earn the same net income, M/4, as in the pure contest, but it would now spend twice as much on advertising to get it.

What if we imposed a tax? Now the algebra gets even nastier:

max (M+(x+y)(1-r)/2)*x/(x+y) – x

But the ultimate outcome is still quite similar:

(1+r)(x+y)^2 = 2My

(1+r)(2x)^2 = 2Mx

x = M/2*1/(1+r)
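
Spelled out, the steps behind these three lines look like this. As before, π is just my shorthand for Coke's net profit, and the last line uses the symmetric solution x = y.

```latex
\begin{align*}
\pi(x) &= \left(M + \frac{(x+y)(1-r)}{2}\right)\frac{x}{x+y} - x
        = \frac{Mx}{x+y} - \frac{(1+r)\,x}{2} \\
\frac{\partial \pi}{\partial x} &= \frac{My}{(x+y)^2} - \frac{1+r}{2} = 0 \\
(1+r)(x+y)^2 &= 2My \\
x = y \;\Rightarrow\; x &= \frac{M}{2(1+r)}
\end{align*}
```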

Advertising spending will fall to 1/(1+r) of its previous level. Even if r is 50%, that still means we’ll have 2/3 of the advertising spending we had before.

Total tax revenue will then be M*r/(1+r), which for r of 50% would be M/3.

Total advertising will be M(1-r)/(1+r), which would be M/3. So we managed to reduce advertising by 2/3, while reducing advertising spending by only 1/3. Then we would receive half of that spending as revenue. Thus, instead of getting $100 billion per year, we would get $67 billion, which is still just about enough to pay for food stamps.
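
These figures can be reproduced with a few lines of Python. The sketch assumes the symmetric solution x = M/(2(1+r)) and sets M to the $200 billion spending figure, since untaxed spending equals M in this version of the model; the function name is hypothetical.

```python
def taxed_outcomes(M: float, r: float) -> dict:
    """Symmetric equilibrium totals when ad spending adds half its after-tax value to the market."""
    x = M / (2 * (1 + r))   # each firm's spending, from the first-order condition
    spending = 2 * x        # total paid by both firms
    return {
        "spending": spending,
        "advertising": spending * (1 - r),  # what actually gets spent on ads
        "revenue": spending * r,            # what goes to the government
    }

baseline = 200.0  # $ billions; assumes untaxed spending x + y equals M
print(taxed_outcomes(baseline, 0.0))
print(taxed_outcomes(baseline, 0.5))
```

At a 50% rate this gives about $133 billion in spending, split evenly between actual advertising and tax revenue.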

What’s the downside of this tax? Unlike most taxes, there really isn’t one. Yes, it would hurt advertising companies, which I suppose counts as a downside. But that was mostly waste anyway; anyone employed in advertising would be better employed almost anywhere else. Millions of minds are being wasted coming up with better ways to sell Viagra instead of better treatments for cancer. Any unemployment introduced by an advertising tax would be temporary and easily rectified by monetary policy, and most of it would hit highly educated white-collar professionals who have high incomes to begin with and can more easily find jobs when displaced.

The real question is why we aren’t doing this already. And that, I suppose, has to come down to politics.