Hyper-competition

Dec 13 JDN 2459197

This phenomenon has been particularly salient for me the last few months, but I think it’s a common experience for most people in my generation: Getting a job takes an awful lot of work.

Over the past six months, I’ve applied to over 70 different positions and so far gone through 4 interviews (2 by video, 2 by phone). I’ve done about 10 hours of test work. That so far has gotten me no offers, though I have yet to hear from 50 employers. Ahead of me I probably have about another 10 interviews, then perhaps 4 of what would have been flyouts and in-person presentations but instead will be “comprehensive interviews” and presentations conducted online, likely several more hours of test work, and then finally, maybe, if I’m lucky, I’ll get a good offer or two. If I’m unlucky, I won’t, and I’ll have to stick around for another year and do all this over again next year.

Aside from the limitations imposed by the pandemic, this is basically standard practice for PhD graduates. And this is only the most extreme end of a continuum of intensive job search efforts, for which even applying to be a cashier at Target requires a formal application, references, and a personality test.

This wasn’t how things used to be. Just a couple of generations ago, low-wage employers would more or less hire you on the spot, with perhaps a resume or a cursory interview. More prestigious employers would almost always require a CV with references and an interview, but it more or less stopped there. I discussed in an earlier post how much of the difference actually seems to come from our chronic labor surplus.

Is all of this extra effort worthwhile? Are we actually fitting people to better jobs this way? Even if the matches are better, are they enough better to justify all this effort?

It is a commonly-held notion among economists that competition in markets is good, that it increases efficiency and improves outcomes. I think that this is often, perhaps usually, the case. But the labor market has become so intensely competitive, particularly for high-paying positions, that the costs of this competitive effort likely outweigh the benefits.

How could this happen? Shouldn’t the free market correct for such an imbalance? Not necessarily. Here is a simple formal model of how this sort of intensive competition can result in significant waste.

Note that this post is about a formal mathematical model, so it’s going to use a lot of algebra. If you are uninterested in such things, you can read the next two paragraphs and then skip to the conclusions at the end.

The overall argument is straightforward: If candidates are similar in skill level, a complicated application process can make sense from a firm’s perspective, but be harmful from society’s perspective, due to the great cost to the applicants. This can happen because the difficult application process imposes an externality on the workers who don’t get the job.

All right, here is where the algebra begins.

I’ve included each equation as both formatted text and LaTeX.

Consider a competition between two applicants, X and Z.

They are each asked to complete a series of tasks in an application process. The amount of effort X puts into the application is x, and the amount of effort Z puts into the application is z. Let’s say each additional bit of effort has a fixed cost, normalized to 1.

Let’s say that their skills are similar, but not identical; this seems quite realistic. X has skill level hx, and Z has skill level hz.

Getting hired has a payoff for each worker of V. This includes all the expected benefits of the salary, benefits, and working conditions. I’ll assume that these are essentially the same for both workers, which also seems realistic.

The benefit to the employer is proportional to the worker’s skill, so letting h be the skill level of the actually hired worker, the benefit of hiring that worker is hY. The reason they are requiring this application process is precisely because they want to get the worker with the highest h. Let’s say that this application process has a cost to implement, c.

Who will get hired? Well, presumably whoever does better on the application. The skill level will amplify the quality of their output, let’s say proportionally to the effort they put in; so X’s expected output quality will be hx x and Z’s will be hz z.

Let’s also say there’s a certain amount of error in the process; maybe the more-qualified candidate will sleep badly the day of the interview, or make a glaring and embarrassing typo on their CV. And quite likely the quality of application output isn’t perfectly correlated with the quality of actual output once hired. To capture all this, let’s say that having more skill and putting in more effort only increases your probability of getting the job, rather than actually guaranteeing it.

In particular, let’s say that the probability of X getting hired is P[X] = hx x/(hx x + hz z).

\[ P[X] = \frac{h_x x}{h_x x + h_z z} \]

This results in a contest function, a type of model that I’ve discussed in some earlier posts in a rather different context.


The expected payoff for worker X is:

E[Ux] = hx x/(hx x + hz z) V – x

\[ E[U_x] = \frac{h_x x}{h_x x + h_z z} V - x \]

Maximizing this with respect to the choice of effort x (which is all that X can control at this point) yields:

hx hz z V = (hx x + hz z)^2

\[ h_x h_z z V = (h_x x + h_z z)^2 \]

A similar maximization for worker Z yields:

hx hz x V = (hx x + hz z)^2

\[ h_x h_z x V = (h_x x + h_z z)^2 \]

It follows that x=z, i.e. X and Z will exert equal efforts in Nash equilibrium. Their probability of success will then be contingent entirely on their skill levels:

P[X] = hx/(hx + hz).

\[ P[X] = \frac{h_x}{h_x + h_z} \]

Substituting that back in, we can solve for the actual amount of effort:

hx hz x V = (hx + hz)^2 x^2

\[ h_x h_z x V = (h_x + h_z)^2 x^2 \]

x = hx hz V/(hx + hz)^2

\[ x = \frac{h_x h_z}{(h_x + h_z)^2} V \]

Now let’s see what that gives for the expected payoffs of the firm and the workers. This is worker X’s expected payoff:

E[Ux] = hx/(hx + hz) V – hx hz V/(hx + hz)^2 = (hx/(hx + hz))^2 V

\[ E[U_x] = \frac{h_x}{h_x + h_z} V - \frac{h_x h_z}{(h_x + h_z)^2} V = \left( \frac{h_x}{h_x + h_z}\right)^2 V \]

Worker Z’s expected payoff is the same, with hx and hz exchanged:

E[Uz] = (hz/(hx + hz))^2 V

\[ E[U_z] = \left( \frac{h_z}{h_x + h_z}\right)^2 V \]
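As a rough numerical check on the equilibrium above, the following Python sketch computes the closed-form effort and payoffs, and then confirms by brute-force search that neither applicant can do better by deviating. The function names, the search grid, and the sample parameter values are illustrative assumptions, not part of the original model.

```python
def contest_equilibrium(h_x, h_z, V):
    """Closed-form equilibrium of the two-applicant contest described above."""
    effort = h_x * h_z * V / (h_x + h_z) ** 2  # identical for X and Z
    p_x = h_x / (h_x + h_z)                    # probability X is hired
    u_x = p_x * V - effort                     # expected payoff of X
    u_z = (1 - p_x) * V - effort               # expected payoff of Z
    return effort, p_x, u_x, u_z


def best_response(h_own, h_other, effort_other, V, steps=20000):
    """Grid-search the effort that maximizes one applicant's expected payoff.

    Hypothetical helper: it scans efforts between 0 and V, since spending
    more than the prize V can never pay off.
    """
    best_effort, best_payoff = 0.0, 0.0  # zero effort yields a payoff of zero
    for i in range(1, steps + 1):
        effort = V * i / steps
        win_prob = h_own * effort / (h_own * effort + h_other * effort_other)
        payoff = win_prob * V - effort
        if payoff > best_payoff:
            best_effort, best_payoff = effort, payoff
    return best_effort


effort, p_x, u_x, u_z = contest_equilibrium(10, 8, 180)
print(round(effort, 2), round(u_x, 2), round(u_z, 2))
# Each applicant's best response to the equilibrium effort should be
# (up to grid resolution) that same effort.
print(round(best_response(10, 8, effort, 180), 2))
print(round(best_response(8, 10, effort, 180), 2))
```

If the searched best responses land on the same effort for both applicants, that is consistent with x = z being a Nash equilibrium for these parameter values; it is a spot check, not a proof.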

What about the firm? Their expected payoff is the probability of hiring X, times the value of hiring X, plus the probability of hiring Z, times the value of hiring Z, all minus the cost c:

E[Uf] = hx/(hx + hz) hx Y + hz/(hx + hz) hz Y – c = (hx^2 + hz^2)/(hx + hz) Y – c

\[ E[U_f] = \frac{h_x}{h_x + h_z} h_x Y + \frac{h_z}{h_x + h_z} h_z Y - c = \frac{h_x^2 + h_z^2}{h_x + h_z} Y - c \]

To see whether the application process was worthwhile, let’s compare against the alternative of simply flipping a coin and hiring X or Z at random. The probability of getting hired is then 1/2 for each candidate.

Expected payoffs for X and Z are now equal:

E[Ux] = E[Uz] = V/2

\[ E[U_x] = E[U_z] = \frac{V}{2} \]

The expected payoff for the firm can be computed the same as before, but now without the cost c:

E[Uf] = 1/2 hx Y + 1/2 hz Y = (hx + hz)/2 Y

\[ E[U_f] = \frac{1}{2} h_x Y + \frac{1}{2} h_z Y = \frac{h_x + h_z}{2} Y \]

This has a very simple interpretation: The expected value to the firm is just the average quality of the two workers, times the overall value of the job.

Which of these two outcomes is better? Well, that depends on the parameters, of course. But in particular, it depends on the difference between hx and hz.

Consider two extremes: In one case, the two workers are indistinguishable, and hx = hz = h. In that case, the payoffs for the hiring process reduce to the following:

E[Ux] = E[Uz] = V/4

\[ E[U_x] = E[U_z] = \frac{V}{4} \]

E[Uf] = h Y – c

\[ E[U_f] = h Y - c \]

Compare this against the payoffs for hiring randomly:

E[Ux] = E[Uz] = V/2

\[ E[U_x] = E[U_z] = \frac{V}{2} \]

E[Uf] = h Y

\[ E[U_f] = h Y \]

Both the workers and the firm are strictly better off if the firm just hires at random. This makes sense, since the workers have identical skill levels.

Now consider the other extreme, where one worker is far better than the other; in fact, one is nearly worthless, so hz ~ 0. (I can’t do exactly zero because I’d be dividing by zero, but let’s say one is 100 times better or something.)

In that case, the payoffs for the hiring process reduce to the following:

E[Ux] = V

E[Uz] = 0

\[ E[U_x] = V \]

\[ E[U_z] = 0 \]

X will definitely get the job, so X is much better off.

E[Uf] = hx Y – c

\[ E[U_f] = h_x Y - c \]

If the firm had hired randomly, this would have happened instead:

E[Ux] = E[Uz] = V/2

\[ E[U_x] = E[U_z] = \frac{V}{2} \]

E[Uf] = hx Y/2

\[ E[U_f] = \frac{h_x}{2} Y \]

As long as c < hx Y/2, both the firm and the higher-skill worker are better off in this scenario. (The lower-skill worker is worse off, but that’s not surprising.) The total expected benefit for everyone is also higher in this scenario.


Thus, the difference in skill level between the applicants is vital. If candidates are very different in skill level, in a way that the application process can accurately measure, then a long and costly application process can be beneficial, not only for the firm but also for society as a whole.

In these extreme examples, it was either not worth it for the firm, or worth it for everyone. But there is an intermediate case worth looking at, where the long and costly process can be worth it for the firm, but not for society as a whole. I will call this case hyper-competition—a system that is so competitive it makes society overall worse off.

This inefficient result occurs precisely when the firm’s gain from running the process exceeds its cost c, but falls short of c plus the workers’ combined loss relative to a coin flip. That loss is V minus the workers’ combined expected payoff under the process:

c < (hx^2 + hz^2)/(hx + hz) Y – (hx + hz)/2 Y < c + V – (hx/(hx + hz))^2 V – (hz/(hx + hz))^2 V

\[ c < \frac{h_x^2 + h_z^2}{h_x + h_z} Y - \frac{h_x + h_z}{2} Y < c + V - \left( \frac{h_x}{h_x + h_z}\right)^2 V - \left( \frac{h_z}{h_x + h_z}\right)^2 V \]

This simplifies to:

c < (hx – hz)^2/(2hx + 2hz) Y < c + 2 hx hz/(hx + hz)^2 V

\[ c < \frac{(h_x - h_z)^2}{2 (h_x + h_z)} Y < c + \frac{2 h_x h_z}{(h_x+h_z)^2} V \]

If c is small, then we are interested in the case where:

(hx – hz)^2 Y/2 < 2 hx hz/(hx + hz) V

\[ \frac{(h_x - h_z)^2}{2} Y < \frac{2 h_x h_z}{h_x + h_z} V \]

This is true precisely when the difference hx – hz is small compared to the overall size of hx or hz—that is, precisely when candidates are highly skilled but similar. This is pretty clearly the typical case in the real world. If the candidates were obviously different, you wouldn’t need a competitive process.

For instance, suppose that hx = 10 and hz = 8, while V = 180, Y = 20 and c = 1.

Then, if we hire randomly, these are the expected payoffs:

E[Uf] = (hx + hz)/2 Y = 180

E[Ux] = E[Uz] = V/2 = 90

If we use the complicated hiring process, these are the expected payoffs:

E[Ux] = (hx/(hx + hz))^2 V = 55.6

E[Uz] = (hz/(hx + hz))^2 V = 35.6

E[Uf] = (hx^2 + hz^2)/(hx + hz) Y – c = 181.2

The firm gets a net benefit of about 1.2, quite small; while the workers face a far larger total expected loss of about 89. And these candidates aren’t that similar: One is 25% better than the other. Yet because the effort expended in applying was so large, even this improvement in quality wasn’t worth it from society’s perspective.
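The arithmetic in this example can be reproduced with a short Python sketch. It evaluates the payoff formulas from the model for both hiring methods and then tests for hyper-competition by comparing total expected payoffs directly. The function names and dictionary keys are hypothetical, chosen only for this illustration.

```python
def payoffs_random(h_x, h_z, V, Y):
    """Expected payoffs when the firm flips a coin (no application cost)."""
    return {"firm": (h_x + h_z) / 2 * Y, "x": V / 2, "z": V / 2}


def payoffs_contest(h_x, h_z, V, Y, c):
    """Expected equilibrium payoffs under the costly application process."""
    total_skill = h_x + h_z
    return {
        "firm": (h_x**2 + h_z**2) / total_skill * Y - c,
        "x": (h_x / total_skill) ** 2 * V,
        "z": (h_z / total_skill) ** 2 * V,
    }


def is_hyper_competitive(h_x, h_z, V, Y, c):
    """True when the firm prefers the process but total payoffs are lower."""
    coin = payoffs_random(h_x, h_z, V, Y)
    contest = payoffs_contest(h_x, h_z, V, Y, c)
    firm_gains = contest["firm"] > coin["firm"]
    society_loses = sum(contest.values()) < sum(coin.values())
    return firm_gains and society_loses


example = dict(h_x=10, h_z=8, V=180, Y=20, c=1)
print(payoffs_random(10, 8, 180, 20))
print(payoffs_contest(**example))
print(is_hyper_competitive(**example))
```

With these parameter values the firm comes out slightly ahead under the contest while the two workers together lose far more, so the check reports hyper-competition. Making one candidate nearly worthless (say hz = 0.1) flips the result, in line with the extreme case discussed earlier.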

This concludes the algebra for today, if you’ve been skipping it.

In this model I’ve only considered the case of exactly two applicants, but this can be generalized to more applicants, and the effect only gets stronger: Seemingly-large differences in each worker’s skill level can be outweighed by the massive cost of making so many people work so hard to apply and get nothing to show for it.

Thus, hyper-competition can exist despite apparently large differences in skill. Indeed, it is precisely the typical real-world scenario with many applicants who are similar that we expect to see the greatest inefficiencies. In the absence of intervention, we should expect markets to get this wrong.

Of course, we don’t actually want employers to hire randomly, right? We want people who are actually qualified for their jobs. Yes, of course; but you can probably assess that with nothing more than a resume and maybe a short interview. Most employers are not actually trying to find qualified candidates; they are trying to sift through a long list of qualified candidates to find the one that they think is best qualified. And my suspicion is that most of them honestly don’t have good methods of determining that.

This means that it could be an improvement for society to simply ban long hiring processes like these—indeed, perhaps ban job interviews altogether, as I can hardly think of a more efficient mechanism for allowing employers to discriminate based on race, gender, age, or disability than a job interview. Just collect a resume from each applicant, remove the ones that are unqualified, and then roll a die to decide which one you hire.

This would probably make the fit of workers to their jobs somewhat worse than the current system. But most jobs are learned primarily through experience anyway, so once someone has been in a job for a few years it may not matter much who was hired originally. And whatever cost we might pay in less efficient job matches could be made up several times over by the much faster, cheaper, easier, and less stressful process of applying for jobs.

Indeed, think for a moment of how much worse it feels being turned down for a job after a lengthy and costly application process that is designed to assess your merit (but may or may not actually do so particularly well), as opposed to simply finding out that you lost a high-stakes die roll. Employers could even send out letters saying one of two things: “You were rejected as unqualified for this position.” versus “You were qualified, but you did not have the highest die roll.” Applying for jobs already feels like a crapshoot; maybe it should literally be one.

People would still have to apply for a lot of jobs—actually, they’d probably end up applying for more, because the lower cost of applying would attract more applicants. But since the cost is so much lower, it would still almost certainly be easier to do a job search than it is in the current system. In fact, it could largely be automated: simply post your resume on a central server and the system matches you with employers’ requirements and then randomly generates offers. Employers and prospective employees could fill out a series of forms just once indicating what they were looking for, and then the system could do the rest.

What I find most interesting about this policy idea is that it is in an important sense anti-meritocratic. We are in fact reducing the rewards for high levels of skill—at least a little bit—in order to improve society overall and especially for those with less skill. This is exactly the kind of policy proposal that I had hoped to see from a book like The Meritocracy Trap, but never found there. Perhaps it’s too radical? But the book was all about how we need fundamental, radical change—and then its actual suggestions were simple, obvious, and almost uncontroversial.

Note that this simplified process would not eliminate the incentives to get major, verifiable qualifications like college degrees or years of work experience. In fact, it would focus the incentives so that only those things matter, instead of whatever idiosyncratic or even capricious preferences HR agents might have. There would be no more talk of “culture fit” or “feeling right for the job”, just: “What is their highest degree? How many years have they worked in this industry?” I suppose this is credentialism, but in a world of asymmetric information, I think credentialism may be our only viable alternative to nepotism.

Of course, it’s too late for me. But perhaps future generations may benefit from this wisdom.

What about a tax on political contributions?

Jan 7, JDN 2458126

In my previous post, I argued that an advertising tax could reduce advertising, raise revenue, and produce almost no real economic distortion. Now I’m going to generalize this idea to an even bolder proposal: What if we tax political contributions?

Donations to political campaigns are very similar to advertising. A contest function framework also makes a lot of sense: Increased spending improves your odds of winning, but it doesn’t actually produce any real goods.

Suppose there’s some benefit B that I get if a given politician wins an election. That benefit could include direct benefits to me, as well as altruistic benefits to other citizens I care about, or even my concern for the world as a whole. But presumably, I do benefit in some fashion from my favored politician winning—otherwise, why are they my favored politician?

In this very simple model, let’s assume that there are only two parties and two donors (obviously in the real world there are more parties and vastly more donors; but it doesn’t fundamentally change the argument). Say I will donate x and the other side will donate y.

Assuming that donations are all that matter, the probability my party will win the election is x/(x+y).

Fortunately that isn’t the case. A lot of things matter, some that should (policy platforms, experience, qualifications, character) and some that shouldn’t (race, gender, age, height). Part of why Trump won may in fact be that he is tall; he’s about 6’1”. So let’s put all the other factors that affect elections into a package and call that F.

The probability that my candidate wins is then x/(x+y) + F, where F can be positive or negative. If F is positive, it means that my candidate is more likely to win, while if it’s negative, it means my candidate is less likely to win. (If you want to be pedantic, the probability of winning has to be capped at 0 and 1, but this doesn’t fundamentally change the argument, and only matters for candidates that are obvious winners or obvious losers regardless of how much anyone donates.)

The donation costs me money, x. The cost in utility of that money depends on my utility function, so for now I’ll just call it a cost function C(x).

Then my net benefit is:

B*[x/(x+y)+F] – C(x)

I can maximize this by a first-order condition. Notice how the F just drops out. I like F to be large, but it doesn’t affect my choice of x.

B*y/(x+y)^2 = C'(x)

Turning that into an exact value requires knowing my cost function and my opponent’s cost function (which need not be the same, in general; unlike the advertising case, it’s not a matter of splitting fungible profits between us), but it’s actually possible to stop here. We can already tell that there is a well-defined solution: There’s a certain amount of donation x that maximizes my expected utility, given the amount y that the other side has donated. Moreover, with a little bit of calculus you can show that the optimal amount of x is increasing in y as long as y is smaller than x, which makes intuitive sense: While I’m ahead, the more they give, the more I need to give in order to keep up. (Once they are giving more than I am, further increases in y actually make me give less, because catching up gets too expensive.) The same holds for the other side, and where the two best responses cross there is a Nash equilibrium: At some amount x and y we each are giving the optimal amount from our perspective.

We can get a precise answer if we assume that the amount of the donations is small compared to my overall wealth, so I will be approximately risk-neutral; then we can just say C(x) = x, and C'(x) = 1:

B*y/(x+y)^2 = 1

Then, assuming the other side stands to gain the same benefit B, we get essentially the same result we did for the advertising:

x = y = B/4
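One way to see this equilibrium emerge is to let each donor repeatedly best-respond to the other. The Python sketch below solves the risk-neutral condition B*y/(x+y)^2 = 1 for x, which gives x = sqrt(B*y) – y, and iterates it for both sides. It assumes both donors value winning at the same B; the starting donations and the value B = 100 are arbitrary illustrative choices, and the function names are hypothetical.

```python
import math


def best_response(y, B):
    """Donation x solving B*y/(x+y)^2 = 1, floored at zero."""
    return max(0.0, math.sqrt(B * y) - y)


def find_equilibrium(B, x0, y0, rounds=50):
    """Repeatedly let each donor best-respond to the other's last donation."""
    x, y = x0, y0
    for _ in range(rounds):
        x, y = best_response(y, B), best_response(x, B)
    return x, y


x, y = find_equilibrium(B=100.0, x0=1.0, y0=40.0)
print(round(x, 4), round(y, 4))  # both should settle near B/4
```

Starting from quite lopsided donations, both sides should converge on B/4 within a handful of rounds, which matches the algebra.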

According to this, I should be willing to donate up to one-fourth of the benefit I’d get from my candidate winning. This actually sounds quite high; I think once you take into account the fact that lots of other people are donating and political contributions aren’t that effective at winning elections, the optimal donation is actually quite a bit smaller—though perhaps still larger than most people give.

If we impose a tax rate r on political contributions, nothing changes. The cost to me of donating is still the same, and as long as the tax is proportional, the ratio x/(x+y) and the probability x/(x+y) + F will remain exactly the same as before. Therefore, I will continue to donate the same amount, as will my opponent, and each candidate will have the same probability of winning as before. The only difference is that some of the money (r of the money, to be precise) will go to the government instead of the politicians.

The total amount of donations will not change. The probability of each candidate winning will not change. All that will happen is money will be transferred from politicians to the government. If this tax revenue is earmarked for some socially beneficial function, this will obviously be an improvement in welfare.

The revenue gained is not nearly as large an amount of money as is spent on advertising (which tells you something about American society), but it’s still quite a bit: Since we currently spend about $5 billion per year on federal elections, a tax rate of 50% could raise about $2.5 billion.

But in fact this seriously under-estimates the benefits of such a tax. This simple model assumes that political contributions only change which candidate wins; but that’s actually not the main concern. (If F is large enough, it can offset any possible donations.)

The real concern is how political contributions affect the choices politicians make once they get into office. While outright quid-pro-quo bribery is illegal, it’s well-known that many corporations and wealthy individuals will give campaign donations with the reasonable expectation of influencing what sort of policies will be made.

You don’t think Goldman Sachs gives millions of dollars each election out of the goodness of their hearts, do you? And they give to both major parties, which really only makes sense if their goal is not to make a particular candidate win, but to make sure that whoever wins feels indebted to Goldman Sachs. (I guess it could also be to prevent third parties from winning—but they hardly ever win anyway, so that wouldn’t be a smart investment from the bank’s perspective.)

Lynda Powell at the University of Rochester has documented the many subtle but significant ways that these donations have influenced policy. Campaign donations aren’t as important as party platforms, but a lot of subtle changes across a wide variety of policies add up to large differences in outcomes.

A political contribution tax would reduce these influences. If politicians’ sole goal were to win, the tax would have no effect. But it seems quite likely that politicians enjoy various personal benefits from lobbying and campaign contributions: Fine dinners, luxurious vacations, and so on. And insofar as that is influencing politicians’ behavior, it is both obviously corrupt and clearly reduced by a political contribution tax. How large an effect this would be is difficult to say; but the direction of the effect is clearly the one we want.

Taxing donations would also allow us to protect the right to give to campaigns (which does seem to be a limited kind of civil liberty, even though the precise interpretation “money is speech” is Orwellian), while reducing corruption and allowing us to keep close track on donations that are made. Taxing a money stream, even a small amount, is often one of the best ways to incentivize close monitoring of that money stream.

With a subtle change, the tax could even be made to bias in favor of populism: All you need to do is exempt small donations from the tax. If say the first $1000 per person per year is exempt from taxation, then the imposition of the tax will reduce the effectiveness of million-dollar contributions from Goldman Sachs and the Koch brothers without having any effect on $50 donations from people like you and me. That would technically be “distorting” elections—but it seems like it might be a distortion worth making.
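To make the exemption concrete, here is a minimal Python sketch of how such a tax might be computed for one donor. The 50% rate and the $1000 exemption are the figures floated in this post; the function name and the treatment of the exemption as a simple per-person yearly threshold are assumptions for illustration.

```python
def contribution_tax(total_donated, rate=0.5, exempt=1000.0):
    """Tax owed on one donor's yearly contributions.

    Assumes the first `exempt` dollars per person per year are untaxed
    and everything above that is taxed at the flat rate `rate`.
    """
    taxable = max(0.0, total_donated - exempt)
    return rate * taxable


print(contribution_tax(50))         # small donor pays nothing
print(contribution_tax(1_000_000))  # large donor pays tax on almost all of it
```

Under these assumptions a $50 donor owes no tax at all, while a million-dollar donor is taxed on all but the first $1000.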

Of course, this is probably even less likely to happen than the advertising tax.

The potential of an advertising tax

Jan 7, JDN 2458126

Advertising is everywhere in our society. You may see some on this very page (though if I hit my next Patreon target I’m going to pay to get rid of those). Ad-blockers can help when you’re on the Web, and premium channels like HBO will save you from ads when watching TV, but what are you supposed to do about ads on billboards as you drive down the highway, ads on buses as you walk down the street, ads on the walls of the subway train?

And Banksy isn’t entirely wrong; this stuff can be quite damaging. Based on decades of research, the American Psychological Association has issued official statements condemning the use of advertising to children for its harmful psychological effects. Medical research has shown that advertisements for food can cause overeating—and thus, the correlated rise of advertising and obesity may be no coincidence.

Worst of all, political advertising distorts our view of the world, though we may not be able to blame advertising per se for Trump; most of his publicity was gained for free through irresponsible media coverage.

And yet, advertising is almost pure rent-seeking. It costs resources, but it doesn’t produce anything. In most cases it doesn’t even raise awareness about something or find new customers. The primary goal of most advertising is to get you to choose that brand instead of a different brand. A secondary goal (especially for food ads) is to increase your overall consumption of that good, but since the means employed typically involve psychological manipulation, this increase in consumption is probably harmful to social welfare.

A general principle of economics that has almost universal consensus is the Pigou Principle: If you want less of something, you should put a tax on it. So, what would happen if we put a tax on advertising?

The amazing thing is that in this case, we would probably not actually reduce advertising spending, but we would reduce advertising, which is what we actually care about. Moreover, we would be able to raise an enormous amount of revenue with zero social cost. Like the other big Pigovian tax (the carbon tax), this is a rare example of a tax that will give you a huge amount of revenue while actually yielding a benefit to society.

This is far from obvious, so I think it is worth explaining where it comes from.

The key point is that advertising doesn’t typically increase the overall size of the market (though in some cases it does; I’ll get back to that in a moment). Rather than a conventional production function like we would have for most types of expenditure, advertising is better modeled by what is called a contest function (something that our own Stergios Skaperdas at UCI is actually a world-class expert in). In a production function, inputs increase the total amount of output. But in a contest function, inputs only redistribute output from one place to another. Contest functions thus provide a good model of rent-seeking, which is what most advertising is.

Suppose there’s a total market M for some good, where M is the total profits that can be gained from capturing that entire market. Then, to keep it simple, let’s suppose there are only two major firms in the market, a duopoly like Coke and Pepsi or Boeing and Airbus.

Let’s say Coke decides to spend an amount x on advertising, and Pepsi decides to spend an amount y.

For now, let’s assume that total beverage consumption won’t change; so the total profits to be had from the market are always M.

What advertising does is it changes the share of that market which each firm will get. Specifically, let’s use the simplest model, where the share of the market is equal to the share of advertising spending.

Then the net profit for Coke is the following:

The share they get, x/(x+y), times the size of the whole market, M, minus the advertising spending x.

max M*x/(x+y) – x

We can maximize this with the usual first-order condition:

y/(x+y)^2 M – 1 = 0

(x+y)^2 = My

Since the game is symmetric, in a Nash equilibrium, Pepsi will use the same reasoning:

(x+y)^2 = Mx

Thus we have:

x = y

(2x)^2 = Mx

x = M/4

In this very simple model, each firm will spend one-fourth of the market’s value, and the total advertising spending will be equal to half the size of the market. Then, each company’s net income will be equal to its advertising spending. This is a pretty good estimate for Coca-Cola in real life, which spends about $3.3 billion on advertising and receives about $2.8 billion in net income each year.
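The M/4 result can be spot-checked numerically. The Python sketch below holds Pepsi’s spending at M/4 and searches a grid of possible spending levels for Coke to find the most profitable one. The market size M = 100, the grid step, and the function names are arbitrary assumptions made for the illustration.

```python
def net_profit(x, y, M):
    """One firm's profit: its market share times M, minus its own ad spending."""
    return M * x / (x + y) - x


def best_response_grid(y, M, step=0.01):
    """Search spending levels from `step` up to M for the most profitable one."""
    n = int(round(M / step))
    candidates = [i * step for i in range(1, n + 1)]
    return max(candidates, key=lambda x: net_profit(x, y, M))


M = 100.0
pepsi = M / 4
coke = best_response_grid(pepsi, M)
print(round(coke, 2), round(net_profit(coke, pepsi, M), 2))
```

If the search returns a spending level of about M/4 and a net profit of about M/4, that agrees with the claim that each firm’s net income equals its advertising spending in this model.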

What would happen if we introduce a tax? Let’s say we introduce a proportional tax r on all advertising spending. That is, for every dollar you spend on advertising, you must pay the government $r in tax. The really remarkable thing is that companies who advertise shouldn’t care what we make the tax; the only ones who will care are the advertising companies themselves.

If Coke pays x, the actual amount of advertising they receive is x – r x = x(1-r).

Likewise, Pepsi’s actual advertising received is y(1-r).

But notice that the share of total advertising spending is completely unchanged!

(x(1-r))/(x(1-r) + y(1-r)) = x/(x+y)

Since the payoff for Coke only depends on how much Coke spends and what market share they get, it is also unchanged. Since the same is true for Pepsi, nothing will change in how the two companies behave. They will spend the same amount on advertising, and they will receive the same amount of net income when all is said and done.

The total quantity of advertising will be reduced, from x+y to (x+y)(1-r). That means fewer billboards, fewer posters in subway stations, fewer TV commercials. That will hurt advertising companies, but benefit everyone else.

How much revenue will we get for the government? r x + r y = r(x+y).
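The invariance argument is easy to confirm with numbers. This Python sketch splits each firm’s gross spending into advertising actually delivered and tax paid, then reports the resulting market share, total advertising, and revenue. The function name, the dictionary keys, and the sample spending figures are illustrative assumptions.

```python
def with_ad_tax(x, y, r):
    """Split gross spending x and y into delivered advertising and tax revenue."""
    ads_x = x * (1 - r)  # advertising Coke actually receives
    ads_y = y * (1 - r)  # advertising Pepsi actually receives
    return {
        "share_x": ads_x / (ads_x + ads_y),  # Coke's share of delivered ads
        "advertising": ads_x + ads_y,        # total quantity of advertising
        "revenue": r * (x + y),              # collected by the government
    }


no_tax = with_ad_tax(30.0, 10.0, 0.0)
taxed = with_ad_tax(30.0, 10.0, 0.5)
print(no_tax)
print(taxed)
```

Comparing the two results, the share is the same with or without the tax, while the quantity of advertising falls by the factor (1 – r) and the difference shows up as revenue.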

Since the goal is to substantially reduce advertising output, and it won’t distort other industries in any way, we should set this tax quite high. A reasonable value for r would be 50%. We might even want to consider something as high as 90%; but for now let’s look at what 50% would do.

Total advertising spending in the US is over $200 billion per year. Since an advertising tax would not change total advertising spending, we can expect that a tax rate of 50% would simply capture 50% of this spending as revenue, which is to say $100 billion per year. That would be enough to pay for the entire Federal education budget, or the foreign aid and environment budgets combined.

Another way in which an advertising tax is actually better than a carbon tax is that countries will want to compete to have the highest advertising taxes. If say Canada imposes a carbon tax but the US doesn’t, industries will move production to the US where it is cheaper, which hurts Canada. Yet the total amount of pollution will remain about the same, and Canada will be just as affected by climate change as they would have been anyway. So we need to coordinate across countries so that the carbon taxes are all the same (or at least close), to prevent industries from moving around; and each country has an incentive to cheat by imposing a lower carbon tax.

But advertising taxes aren’t like that. If Canada imposes an advertising tax and the US doesn’t, companies won’t shift production to the US; they will shift advertising to the US. And having your country suddenly flooded with advertisements is bad. That provides a strong incentive for you to impose your own equal or even higher advertising tax to stem the tide. And pretty soon, everyone will have imposed an advertising tax at the same rate.

Of course, in all the above I’ve assumed a pure contest function, meaning that advertisements are completely unproductive. What if they are at least a little bit productive? Then we wouldn’t want to set the tax too high, but the basic conclusions would be unchanged.

Suppose, for instance, that the advertising spending adds half its value to the value of the market. This is a pretty high estimate of the benefits of advertising.

Under this assumption, in place of M we have M+(x+y)/2. Everything else is unchanged.

We can maximize as before:

max (M+(x+y)/2)*x/(x+y) – x

The math is a bit trickier, but we can still solve by a first-order condition, which simplifies to:

(x+y)^2 = 2My

By the same symmetry reasoning as before:

(2x)^2 = 2Mx

x = M/2

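Now, total advertising spending would equal the size of the market without advertising. The market itself grows to M + M/2 = 3M/2, so net income for each firm after advertising would be:

(3M/2)(1/2) – M/2 = M/4

That is, each firm’s net income would be M/4, exactly as before, even though each firm now spends twice as much on advertising. The extra value that advertising creates is entirely consumed by the extra advertising.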

What if we imposed a tax? Now the algebra gets even nastier:

max (M+(x+y)(1-r)/2)*x/(x+y) – x

But the ultimate outcome is still quite similar:

(1+r)(x+y)^2 = 2My

(1+r)(2x)^2 = 2Mx

x = M/2*1/(1+r)

Advertising spending will be reduced by a factor of 1/(1+r). Even if r is 50%, that still means we’ll have 2/3 of the advertising spending we had before.

Total tax revenue will then be M*r/(1+r), which for r of 50% would be M/3.

Total advertising will be M(1-r)/(1+r), which would be M/3. So we managed to reduce advertising by 2/3, while reducing advertising spending by only 1/3. Then we would receive half of that spending as revenue. Thus, instead of getting $100 billion per year, we would get $67 billion, which is still just about enough to pay for food stamps.
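The taxed, partly productive case can also be checked numerically. The Python sketch below encodes the profit function from the maximization above, uses the closed-form symmetric spending x = M/2*1/(1+r), and summarizes total spending, tax revenue, and advertising delivered. The function names and the choice M = 300 are assumptions made so the numbers come out round.

```python
def firm_profit(x, y, M, r):
    """Profit when delivered advertising (x+y)(1-r) adds half its value to the market."""
    market = M + (x + y) * (1 - r) / 2
    return market * x / (x + y) - x


def symmetric_spend(M, r):
    """Each firm's equilibrium spending from (1+r)(2x)^2 = 2Mx."""
    return M / (2 * (1 + r))


def summary(M, r):
    """Total spending, tax revenue, and advertising delivered in equilibrium."""
    spending = 2 * symmetric_spend(M, r)
    return {
        "spending": spending,
        "revenue": r * spending,
        "advertising": (1 - r) * spending,
    }


M, r = 300.0, 0.5
x = symmetric_spend(M, r)
print(summary(M, r))
# Deviating slightly in either direction should lower the firm's profit.
print(firm_profit(x, x, M, r) > firm_profit(x - 1, x, M, r))
print(firm_profit(x, x, M, r) > firm_profit(x + 1, x, M, r))
```

For r = 50% this gives revenue and delivered advertising each equal to M/3, and total spending equal to 2M/3, matching the figures in the text.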

What’s the downside of this tax? Unlike most taxes, there really isn’t one. Yes, it would hurt advertising companies, which I suppose counts as a downside. But that was mostly waste anyway; anyone employed in advertising would be better employed almost anywhere else. Millions of minds are being wasted coming up with better ways to sell Viagra instead of better treatments for cancer. Any unemployment introduced by an advertising tax would be temporary and easily rectified by monetary policy, and most of it would hit highly educated white-collar professionals who have high incomes to begin with and can more easily find jobs when displaced.

The real question is why we aren’t doing this already. And that, I suppose, has to come down to politics.