May 28, JDN 2457902
In most neoclassical models, workers are paid according to their marginal productivity—the additional (market) value of goods that a firm is able to produce by hiring that worker. This is often used as an excuse for inequality: If someone can produce more, why shouldn’t they be paid more?
The most extreme example of this is people like Maura Pennington writing for Forbes about how poor people just need to get off their butts and “do something”; but there is a whole literature in mainstream economics, particularly “optimal tax theory”, arguing based on marginal productivity that we should tax the very richest people the least and never tax capital income. The Chamley-Judd Theorem famously “shows” (by making heroic assumptions) that taxing capital just makes everyone worse off because it reduces everyone’s productivity.
The biggest reason this is wrong is that there are many, many reasons why someone would have a higher income without being any more productive. They could inherit wealth from their ancestors and get a return on that wealth; they could have a monopoly or some other form of market power; they could use bribery and corruption to tilt government policy in their favor. Indeed, most of the top 0.01% do literally all of these things.
But even if you assume that pay is related to productivity in competitive markets, the argument is not nearly as strong as it may at first appear. Here I have a simple little model to illustrate this.
Suppose there are 10 firms and 10 workers. Suppose that firm 1 has 1 unit of effective capital (capital adjusted for productivity), firm 2 has 2 units, and so on up to firm 10 which has 10 units. And suppose that worker 1 has 1 unit of so-called “human capital”, representing their overall level of skills and education, worker 2 has 2 units, and so on up to worker 10 with 10 units. Suppose each firm only needs one worker, so this is a matching problem.
Furthermore, suppose that productivity is equal to capital times human capital: That is, if firm 2 hired worker 7, they would make 2*7 = $14 of output.
What will happen in this market if it converges to equilibrium?
Well, first of all, the most productive firm is going to hire the most productive worker, so firm 10 will hire worker 10 and produce $100 of output. What wage will they pay? Well, they need a wage that is high enough to keep worker 10 from trying to go elsewhere. They should therefore pay a wage of $90: the capital of the next-best firm (firm 9, with 9 units) times worker 10’s human capital (10 units). That’s the highest wage any other firm could credibly offer, since firm 9 would make zero profit at that wage; so if firm 10 pays this wage, worker 10 will not have any reason to leave.
Now the problem has been reduced to matching 9 firms to 9 workers. Firm 9 will hire worker 9, making $81 of output, and paying $72 in wages.
And so on, until worker 1 at firm 1 produces $1 and receives… $0. Because there is no way for worker 1 to threaten to leave, in this model they actually get nothing. If I assume there’s some sort of social welfare system providing say $0.50, then at least worker 1 can get that $0.50 by threatening to leave and go on welfare. (This, by the way, is probably the real reason firms hate social welfare spending; it gives their workers more bargaining power and raises wages.) Or maybe they have to pay that $0.50 just to keep the worker from starving to death.
What does inequality look like in this society?
Well, the most-productive firm only has 10 times as much capital as the least-productive firm, and the most-educated worker only has 10 times as much skill as the least-educated worker, so we might think that incomes would vary only by a factor of 10.
But in fact they vary by a factor of over 100.
The richest worker makes $90, while the poorest worker makes $0.50. That’s a ratio of 180. (Still lower than the ratio of average CEO pay to average employee pay in the US, by the way.) Worker 10 is 10 times as productive as worker 1, but receives 180 times as much income.
The firm profits vary along a more reasonable scale in this case; firm 1 makes a profit of $0.50 while firm 10 makes a profit of $10. Indeed, except for firm 1, firm n always makes a profit of $n. So that’s very nearly a linear scaling in productivity.
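The model above is small enough to check by direct computation. Here is a minimal sketch in Python, assuming (as in the text) that firm n hires worker n, that output is capital times human capital, and that each wage equals the best offer the next firm down could make, with a $0.50 welfare floor. The function and variable names are illustrative, not part of the original model.

```python
# Sketch of the 10-firm, 10-worker matching model.
# Assumptions (from the text): firm n has n units of capital, worker n has
# n units of human capital, output = capital * human capital, and firm n
# hires worker n. The names below are illustrative.

WELFARE_FLOOR = 0.50  # the hypothetical welfare payment from the text


def equilibrium(n=10, floor=WELFARE_FLOOR):
    """Return one (firm, output, wage, profit) tuple per matched pair."""
    results = []
    for k in range(1, n + 1):
        output = k * k                    # firm k's capital times worker k's skill
        best_outside_offer = (k - 1) * k  # the most firm k-1 could pay worker k
        wage = max(best_outside_offer, floor)
        results.append((k, output, wage, output - wage))
    return results


results = equilibrium()
for firm, output, wage, profit in results:
    print(f"firm {firm:2d}: output ${output:6.2f}  wage ${wage:6.2f}  profit ${profit:5.2f}")

wages = [wage for _, _, wage, _ in results]
print("wage ratio, top to bottom:", max(wages) / min(wages))
```

Running this reproduces the $90 top wage, the $0.50 bottom wage, and the 180-to-1 ratio, with profits rising roughly linearly from firm to firm.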
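Where did this result come from? Why is it so different from the usual assumptions? All I did was change one thing: I allowed for increasing returns to scale. Because output is capital times human capital, doubling both inputs quadruples output rather than merely doubling it.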
If you make the usual assumption of constant returns to scale, this result can’t happen. Multiplying all the inputs by 10 should just multiply the output by 10, by assumption—since that is the definition of constant returns to scale.
But if you look at the structure of real-world incomes, it’s pretty obvious that we don’t have constant returns to scale.
If we had constant returns to scale, we would expect wages for the same person to vary only slightly if that person were to work in different places. In particular, to have a 2-fold increase in wage for the same worker you’d need more than a 2-fold increase in capital.
This is a bit counter-intuitive, so let me explain a bit further. If a 2-fold increase in capital results in a 2-fold increase in wage for a given worker, that’s increasing returns to scale—indeed, it’s precisely the production function I assumed above.
If you had constant returns to scale, a 2-fold increase in wage would require something like an 8-fold increase in capital. This is because you should get a 2-fold increase in total production by doubling everything—capital, labor, human capital, whatever else. So doubling capital by itself should produce a much weaker effect. For technical reasons I’d rather not get into at the moment, usually it’s assumed that production is approximately proportional to capital to the one-third power—so to double production you need to multiply capital by 2^3 = 8.
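To make the 8-fold figure concrete, here is a short sketch that assumes a Cobb-Douglas production function with a capital exponent of one-third. The text only says production is roughly proportional to capital to the one-third power; the specific functional form, the lumping of all other inputs into a single "labor" term, and the names used are assumptions for illustration.

```python
# Sketch assuming a Cobb-Douglas production function with constant returns
# to scale: output = capital**alpha * labor**(1 - alpha).
# alpha = 1/3 is the capital exponent mentioned in the text; the functional
# form and the names here are assumptions made for illustration.

ALPHA = 1 / 3


def production(capital, labor, alpha=ALPHA):
    return capital ** alpha * labor ** (1 - alpha)


def capital_multiple_for(output_multiple, alpha=ALPHA):
    """Factor by which capital alone must grow to scale output by output_multiple."""
    return output_multiple ** (1 / alpha)


base = production(1.0, 1.0)
print("double everything:", production(2.0, 2.0) / base)    # about 2
print("double capital only:", production(2.0, 1.0) / base)  # about 1.26
print("capital needed to double output:", capital_multiple_for(2))  # about 8
```

Under these assumptions, doubling capital alone raises output by only about a quarter, which is why doubling output through capital alone takes roughly eight times as much of it.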
I wasn’t able to quickly find really good data on wages for the same workers across different countries, but this should at least give a rough idea. In Mumbai, the minimum monthly wage for a full-time worker is about $80. In Shanghai, it is about $250. If you multiply out the US federal minimum wage of $7.25 per hour by 40 hours by 4 weeks, that comes to $1160 per month.
Of course, these are not the same workers. Even an “unskilled” worker in the US has a lot more education and training than a minimum-wage worker in India or China. But it’s not that much more. Maybe if we normalize India to 1, China is 3 and the US is 10.
Likewise, these are not the same jobs. Even a minimum wage job in the US is much more capital-intensive and uses much higher technology than most jobs in India or China. But it’s not that much more. Again let’s say India is 1, China is 3 and the US is 10.
If we had constant returns to scale, what should the wages be? Well, for India at productivity 1, the wage is $80. Under constant returns, scaling both skill and capital by 3 should scale the wage by 3, so for China the wage should be $240; it’s actually $250, close enough for this rough approximation. But the US wage should be $800, and it is in fact $1160, 45% larger than we would expect by constant returns to scale.
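The comparison above is simple arithmetic, and a few lines of Python make it easy to rerun with different guesses. This is only a sketch: the wage figures and the 1, 3, 10 productivity index are the rough numbers from the text, and the variable names are illustrative.

```python
# Rough check of the minimum-wage comparison in the text.
# The productivity index (India 1, China 3, US 10) is the text's own rough guess.

monthly_wage = {
    "India": 80.0,
    "China": 250.0,
    "US": 7.25 * 40 * 4,  # federal minimum wage * hours per week * weeks
}
productivity_index = {"India": 1, "China": 3, "US": 10}

# Wage per unit of productivity, using India as the baseline.
baseline = monthly_wage["India"] / productivity_index["India"]

# Wages predicted if pay scaled one-for-one with the productivity index.
predicted = {c: baseline * productivity_index[c] for c in monthly_wage}

for country, actual in monthly_wage.items():
    gap = actual / predicted[country] - 1
    print(f"{country}: predicted ${predicted[country]:.0f}, actual ${actual:.0f}, gap {gap:+.0%}")
```

Changing the index values shows how sensitive the conclusion is to the rough normalization: the US gap only closes if the US index is pushed well above 10.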
Let’s try comparing within a particular industry, where the differences in skill and technology should be far smaller. The median salary for a software engineer in India is about 430,000 INR, which comes to about $6,700. If that sounds rather low for a software engineer, you’re probably more accustomed to the figure for US software engineers, which is $74,000. That is a factor of 11 to 1. For the same job. Maybe US software engineers are better than Indian software engineers—but are they that much better? Yes, you can adjust for purchasing power and shrink the gap: Prices in the US are about 4 times as high as those in India, so the real gap might be 3 to 1. But these huge price differences themselves need to be explained somehow, and even 3 to 1 for the same job in the same industry is still probably too large to explain by differences in either capital or education, unless you allow for increasing returns to scale.
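The salary gap and its price adjustment can be checked the same way. This sketch uses only the figures quoted above; the price ratio of 4 is the rough number from the text rather than an official purchasing-power estimate, and the variable names are illustrative.

```python
# Software engineer salary gap, before and after a rough price adjustment.
india_salary_usd = 6_700   # about 430,000 INR, per the text
us_salary_usd = 74_000
price_ratio = 4            # US prices relative to India, per the text

nominal_gap = us_salary_usd / india_salary_usd
real_gap = nominal_gap / price_ratio

print(f"nominal gap: {nominal_gap:.1f} to 1")
print(f"price-adjusted gap: {real_gap:.1f} to 1")
```

The adjusted figure comes out a little under 3 to 1, which is the rounded number used in the text.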
In most industries, returns to scale probably don’t increase quite as strongly as I assumed in my simple model. Workers in the US don’t make 100 times as much as workers in India, despite plausibly having both 10 times as much physical capital and 10 times as much human capital.
But in some industries, this model might not even be enough! The most successful authors and filmmakers, for example, make literally thousands of times as much money as the average author or filmmaker in their own country. J.K. Rowling has almost $1 billion from writing the Harry Potter series; this is despite having literally the same amount of physical capital and probably not much more human capital than the average author in the UK who makes only about 11,000 GBP—which is about $14,000. Harry Potter and the Philosopher’s Stone is now almost exactly 20 years old, which means that Rowling made an average of $50 million per year, some 3500 times as much as the average British author. Is she better than the average British author? Sure. Is she three thousand times better? I don’t think so. And we can’t even make the argument that she has more capital and technology to work with, because she doesn’t! They’re typing on the same laptops and using the same printing presses. Either the return on human capital for British authors is astronomical, or something other than marginal productivity is at work here—and either way, we don’t have anything close to constant returns to scale.
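For anyone who wants to check the Rowling figures, the arithmetic is below. It is a sketch that rounds "almost $1 billion" up to exactly $1 billion and uses the approximate $14,000 average from the text; the variable names are illustrative.

```python
# Rowling's average annual earnings versus the average British author.
rowling_total_usd = 1_000_000_000  # "almost $1 billion", rounded up
years_since_first_book = 20
average_author_usd = 14_000        # about 11,000 GBP, per the text

rowling_per_year = rowling_total_usd / years_since_first_book
multiple = rowling_per_year / average_author_usd

print(f"Rowling per year: ${rowling_per_year:,.0f}")
print(f"multiple of average author: {multiple:,.0f}")
```

The multiple lands in the mid-3000s, consistent with the "some 3500 times" figure above.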
What can we take away from this? Well, if we don’t have constant returns to scale, then even if wage rates are proportional to marginal productivity, they aren’t proportional to the component of marginal productivity that you yourself bring. The same software developer makes more at Microsoft than at some Indian software company, the same doctor makes more at a US hospital than a hospital in China, the same college professor makes more at Harvard than at a community college, and J.K. Rowling makes three thousand times as much as the average British author—therefore we can’t speak of marginal productivity as inhering in you as an individual. It is an emergent property of a production process that includes you as a part. So even if you’re entirely being paid according to “your” productivity, it’s not really your productivity—it’s the productivity of the production process you’re involved in. A myriad of other factors had to snap into place to make your productivity what it is, most of which you had no control over. So in what sense, then, can we say you earned your higher pay?
Moreover, this problem becomes most acute precisely when incomes diverge the most. The differential in wages between two welders at the same auto plant may well be largely due to their relative skill at welding. But there’s absolutely no way that the top athletes, authors, filmmakers, CEOs, or hedge fund managers could possibly make the incomes they do by being individually that much more productive.