On labor theories of value

May 3 JDN 246164

I got into an argument a little while ago with an acquaintance of mine who is an avowed Marxist. He posted something that’s been going around Marxist social media about the “irony” that Marx’s labor theory of value is based on Smith and Ricardo’s labor theories of value (plural; they’re not the same), and thus when defenders of capitalism criticize the labor theory of value, they are in effect betraying their founding figures.

The first point I made in response to this was basically, “Yeah. So?” I think one thing that Marxists—at least this flavor of Marxist; I am prepared to exempt more serious Marxian economists—don’t really understand is that mainstream economists don’t have a founding figure that they worship and consider infallible. There is no inerrant text. I am fully prepared to acknowledge—and did, in fact, in that conversation, acknowledge—that Adam Smith made errors and his labor theory of value was one of them. And quite frankly, any defender of capitalism who worships Milton Friedman or Ayn Rand isn’t a mainstream economist, or is at best a very bad one.

My interlocutor then challenged me to describe these different labor theories of value, and I was foolish enough to take the bait, and then the whole conversation devolved into him playing this smug game of “That’s not what Marx really meant” and “clearly you haven’t read Das Kapital” (even though I have, but I admit it was several years ago; I did call up a PDF copy to refresh my memory during the conversation).

But it got me thinking about labor theories of value, and trying to understand why so many people find them seductive when it really doesn’t take much thought to show that they can’t possibly be right. (This post turned out to be a bit long, but I promise I won’t be as long-winded as Marx.)

So what’s wrong with labor theories of value?

If objects are valued based on the labor put into them, the following four propositions should hold:

  1. A project you spend 100 hours on which ultimately failed and produced nothing useful was extremely valuable.
  2. Everything in the Garden of Eden is worthless, because it doesn’t require labor to access.
  3. If you come up with a cure for cancer in a random stroke of insight, it’s worthless because you didn’t put any labor into it, even though both its utility (the lives it will save) and its price (the money you could make off of it) are surely astronomical.
  4. Increased productivity is worthless, because all it does is make our goods worthless as we get better at making them.

All four of these propositions are clearly preposterous, and yet they all seem to follow directly from the basic concept of valuing things by the labor that goes into them. Mainstream economists eventually realized this, and gave up on labor theories of value in favor of the now-consensus utility theory of value.

To be fair, Marx was no idiot, and he did try to address concerns like these in Das Kapital. (Well, the first three he does; I’ll talk about the fourth one in a moment.) But the way he does so is by continually re-defining his terms in contradictory ways, so that by the time you get through the book, you realize he doesn’t even have a labor theory of value. He has many labor theories of value, and he substitutes them ad hoc whenever they seem to yield the conclusions he’s looking for.

For example: Sometimes he says that it’s the actual labor that goes in which matters. Other times that it’s the “usual” or “socially necessary” amount of labor. Other times that it’s the average amount of labor that would be required for this production across the whole economy. These are not the same thing! They yield radically different results in many cases!

Marx tries to distinguish use-value (approximately utility) from exchange-value (approximately price), which is good; those two things are different. It’s very important to distinguish price from value.

But then he doesn’t even use these concepts consistently! At one point, he gives us this absolute howler:

The use-value of the money-commodity becomes two-fold. In addition to its special use-value as

a commodity (gold, for instance, serving to stop teeth, to form the raw material of articles of

luxury, &c.), it acquires a formal use-value, originating in its specific social function.

Das Kapital, Volume 1, Chapter 2, p. 63

No, dude. That is exchange-value. That is paradigmatic exchange-value. People mainly want gold because they can sell it at a high price to buy stuff that’s actually useful. If this is use-value, then the distinction between use-value and exchange-value collapses to, well, useless.

I think what Marx is doing here is that he wants use-value to always be higher than exchange-value, so that surplus-value can be the difference between them and always be positive. But gold is a very clear example of a good for which the price greatly exceeds the marginal utility, which I think you can convince yourself by imagining being stranded alone on a desert island with a crate full of gold. If that crate had contained non-perishable food, or water purification equipment, or tools and materials for building shelter, or best of all, a satellite phone and some solar panels, you’d be overjoyed to have it. Even a crate full of books, plushies, or underwear would have some use to you. (Plushies make better friends even than Wilson!) But gold? You have nothing to do but laugh—or cry—at the cruel irony. (And cash would be the same way, though maybe you could use the linen for something.)

But we actually do have a good explanation for how assets such as gold (and Bitcoin) can have prices far exceeding their marginal utility; expectations. If you expect that you’ll be able to sell an asset for more than you paid for it, you have reason to buy that asset, even if it’s useless to you. And for gold, that’s actually been a pretty smart gamble most of the time (Bitcoin, it very much depends on when you bought it). This could be a non-stationary equilibrium in rational expectations, or it could just be an ever-replenishing array of Greater Fools; but one way or another, the reason gold has a high price is that people expect it to have an even higher price in the future.

In fact, this seems like a deep flaw in capitalism! Marx could have spent a whole chapter on why gold is stupid and financial markets are basically a casino—he would have beaten out Keynes on that by decades. (If I were going to worship an economist, it would be Keynes. But again, I still don’t think his work is inerrant. Just very, very good.) But instead, Marx accepted that gold is priced the way it should be, and contorted his already-tortured theory of value into accommodating that.

I really don’t know why Marx was so insistent that all goods had to be valued based on labor. Marx actually had a lot of good insights about capitalism, and he wasn’t entirely wrong that capitalism as we know it breeds exploitation and ever-growing inequality. I believe that relatively simple reforms (like antitrust enforcement, co-ops, and progressive taxation) can solve, or at least mitigate, these problems, and allow us to enjoy the fruits of higher productivity that capitalism provides. But I recognize that I could be wrong about that; maybe some more radical change is genuinely needed. Yet this in no way vindicates Marx’s theory of value, which was simply wrongheaded from the start.

Indeed, why was he so insistent about it?

Why not simply give up on it, and adopt a new theory, or state it as an unsolved problem?

I have a hypothesis about that. Let me reprise proposition 4:

  1. Increased productivity is worthless, because all it does is make our goods worthless as we get better at making them.

This proposition is preposterous, as I’ve already said: A technology that allows you to make 100 cars with the same labor previously required to make 1 car does not make cars less useful. It simply makes them available to more people at lower prices, and this is generally a good thing.

But I think that Marx did not regard it as preposterous; in fact, I think he regarded it as true.

Consider this paragraph:

In proportion as capitalist production is developed in a country, in the same proportion do the

national intensity and productivity of labour there rise above the international level.2 The

different quantities of commodities of the same kind, produced in different countries in the same

working-time, have, therefore, unequal international values, which are expressed in different

prices, i.e., in sums of money varying according to international values. The relative value of

money will, therefore, be less in the nation with more developed capitalist mode of production

than in the nation with less developed. It follows, then, that the nominal wages, the equivalent of

labour-power expressed in money, will also be higher in the first nation than in the second; which

does not at all prove that this holds also for the real wages, i.e., for the means of subsistence

placed at the disposal of the labourer

– Das Kapital, Volume 1, chapter 22, p. 394

So he does get one qualitative fact right here: Nominal prices are higher in rich countries, for goods and services that are not traded across international borders. This is why we use purchasing power parity.

But he then goes on to say that real wages aren’t higher in rich countries. This… is just clearly false. By any reasonable measure, real wages are higher in the United States or France than they are in Congo or Haiti.

One can quibble with the particular measure used; I in fact happen to believe that we do overestimate real wages in the US by using the CPI instead of an index that better reflects the price of necessities. But there’s just no plausible way to say that a laborer in Malawi who makes $600 a year is at the same standard of living as a laborer in the US who makes $20,000. They might both be legitimately considered poor; but saying that real wages aren’t better here just isn’t plausible.

And Marx’s views on wages get weirder from there:

But hand-in-hand with the increasing productivity of labour, goes, as we have seen, the cheapening of the labourer, therefore a higher rate of surplus-value, even when the real wages are rising. The latter never rise proportionally to the productive power of labour. The same value in variable capital therefore sets in movement more labour-power, and, therefore, more labour.

Das Kapital, Volume 1, Chapter 24, p. 421

I’d in particular like to draw your attention to these two clauses: “the cheapening of the labourer, […] even when the real wages are rising.” What in the world does that mean? How can labor simultaneously get cheaper and more expensive? How can I be “cheapened” even as I am better off?

A bit later, he gets close to acknowledging that higher productivity increases value, but he characterizes it in a very strange way:

Labour transmits to its product the value of the means of production consumed by it. On the other

hand, the value and mass of the means of production set in motion by a given quantity of labour

increase as the labour becomes more productive. Though the same quantity of labour adds always

to its products only the same sum of new value, still the old capital value, transmitted by the

labour to the products, increases with the growing productivity of labour.

Das Kapital, Volume 1, Chapter 24, p. 422

So what he seems to be saying here is that the value added from capital is itself denominated in terms of the labor that was used to create that capital. Yet this is a very strange accounting indeed, as I think a simple model will help you see.

Consider a productivity-enhancing technology.

Suppose that, initially, one can make 1 widget per person-hour. So, Marx says, the value of 1 widget is precisely 1 person-hour.

And suppose there are enough laborers to do 20 person-hours of work. Then we make 20 widgets, and we get value equal to 20 person-hours. Okay, seems reasonable so far.

Then, an engineer comes along, spending 100 hours to invent a machine that costs 10 person-hours to build, and can produce 1000 widgets using 10 person-hours of labor.

So the value of that machine, according to Marx as I understand him, is 10+X person-hours, where X is some amortized fraction of the 100 person-hours involved in inventing it. It’s unclear how to do this amortization; what time frame should be using? Once invented, the machine can be built many times. But I guess we could maybe make sense of it as the patent duration—the price of the machine will surely be higher during the time the patent is still valid, and I guess we could say that is somehow reflected in its value. (Notice how this is already getting pretty weird.)

Now, let’s go ahead and make 1000 widgets with the machine.

We have spent 10 person-hours of labor running the machine, another 10 building it, and we’re supposed to count in X from inventing it in the first place. X ranges somewhere between 0 and 100.

So at the low end, when X=0, these 1000 widgets have only cost us 20 person-hours to make, increasing productivity 50-fold. This is sort of where we expect to end up after the machine goes out of patent and becomes commonplace.

But at the high end, when X=100, these 1000 widgets have cost us 120 person-hours to make, increasing productivity a lesser, but still substantial, 8-fold. This might be where we find ourselves when the very first machine comes online and it’s still an experimental prototype.

Under the utility theory of value (which, again, virtually all mainstream economists, including both neoclassical, behavioral, and even Marxian economists, accept), the value of widgets has increased from U(20) to U(1000); exactly what this value is depends on how many consumers there are and what their utility functions are, but two things we can say for sure:

  • This is definitely much higher than before. (Probably more than 10 but less than 50 times higher.)
  • The value is the same regardless of how we account for the person-hours that went into inventing the machine.
  • The cost gets lower over time, as the technology becomes established.
  • Thus the value added should increase over time. (Whether or not profit does depends upon additional factors we haven’t modeled.)

But as Marx seems to be saying here (again, he may say differently elsewhere, but that’s kind of my point; he doesn’t have a coherent theory), we are to value these 1000 widgets as follows:

When the technology is new, X=100, and so the value of the 1000 widgets is 120 person-hours, the labor that went into inventing, producing, and using the machine. So this productivity enhancement has increased value somewhat—a 6-fold increase—but not all that much. And the value of each widget has been radically reduced: It is now only 0.12 person-hours, or about 7 person-minutes.

Yet once the technology becomes established, X=0, and so the value of the 1000 widgets is 20 person-hours, the labor that went into producing and using the machine. So now this productivity enhancement has not increased value at all. The value of each widget has fallen even further: It is now a mere 0.02 person-hours, or just over 1 person-minute.

This weird dynamic, where technology increases value temporarily, then brings it back down to exactly what it was before, is clearly not how technology actually works. The value added from new technologies—in terms of utility, what really matters—is permanent and increasing over time.

Yet upon re-reading Marx and reflecting some more on his labor theories of value, I think Marx believed that this is actually what happens.

I think that Marx’s whole account of why the rate of profit must fall (even though it absolutely hasn’t, empirically, and even Marxian economists today recognize there’s no particular theoretical reason it should) is based on this misconception.

I think because he believed that labor is the correct measure of value, the fact that human beings can only do so much labor (which hasn’t really changed much over the millennia) means that standard of living can never really increase, because higher productivity simply translates into stuff becoming more and more worthless.

And I think part of where the confusion comes from is that price does sort of behave this way, at least qualitatively; no doubt in a world where widgets can be produced with only 1 minute of labor instead of 60 is one in which widgets are much cheaper to buy. But that doesn’t mean that their value has been correspondingly reduced; they are still just as useful (for whatever widgets do) as they were before, and any decline in marginal value merely comes from diminishing marginal utility as people get more and more of them.

Yet I think Marx didn’t want that result, because it seemed to imply that capitalism could actually make life better, even for workers. (As, empirically, it absolutely did.) He wanted to be able to prove that, despite all appearances, workers have gained absolutely nothing from capitalism and technology, and live just as poorly today as they did in the Middle Ages. And a labor theory of value was just the way to do that, for we only work slightly more hours today than most people did in the Middle Ages (and given the state of Medieval scholarship at the time, Marx may have even thought it was the same). Yet I for one am really a fan of vaccines and flush toilets; I don’t know about you.

He quickly realized many of the problems with this theory, and so he added more and more epicycles to try to correct these problems; but the result was a theory that wasn’t even coherent. Yet in part because of Marx’s incredibly dense and verbose writing style (please note; there are 547 pages in Volume I of Das Kapital, and it has three volumes.)it remained plausible enough to non-experts to catch on, and due to its very complexity, it becomes genuinely hard for anyone to understand. So then we can have the argument I had, where even as I clearly demonstrated the deep flaws in the theory, my interlocutor could always insist I hadn’t really understood what Marx was saying, and it was all my failing, not anything wrong with the theory, which is of course inerrant and handed down from On High.

For some people (not all, but some), Marxism really does seem more like a religion than a scientific theory: “I don’t know exactly what it means, but dammit, I know it’s true and you’ll never convince me otherwise.”

Is there a way to make a labor theory of value work?

I’m pretty well convinced that Marx’s labor theory of value is either wrong, or so incoherent as to be not even wrong. (Adam Smith’s and David Ricardo’s theories were coherent, so they were definitely just wrong.)

But could there, somewhere buried in all those hundreds of pages of mind-numbingly dense and self-contradictory text, be a theory worth salvaging?

Can I steelman the labor theory of value?

I’m going to give it a try.

Okay, so clearly it’s not the actual amount of labor used, as that runs afoul of proposition 1 immediately:

  1. A project you spend 100 hours on which ultimately failed and produced nothing useful was extremely valuable.

That’s nonsense, so we’ll rule that theory out.

Okay, maybe we can patch it up by saying it’s the socially necessary amount of labor required; the amount of labor that the most-efficient worker would require. Clearly, if you are spending 100 hours on something useless, you’re not being the most-efficient worker.

This seems to be closer to Marx’s account, but it still runs afoul of propositions 2, 3, and 4:

  1. Everything in the Garden of Eden is worthless, because it doesn’t require labor to access.
  2. If you come up with a cure for cancer in a random stroke of insight, it’s worthless because you didn’t put any labor into it, even though both its utility (the lives it will save) and its price (the money you could make off of it) are surely astronomical.
  3. Increased productivity is worthless, because all it does is make our goods worthless as we get better at making them.

Marx actually seemed to like proposition 4, but we can see that it’s wrong. So this is a problem.

Also, while propositions 2 and 3 may seem like extreme thought experiments, consider the following:

First, “The Garden of Eden” is very much what a Star Trek-style fully automated luxury communism would feel like. Many leftists say that they really would like to see such a world, and I agree with them on this. But on this theory of value, it’s all worthless, because nobody has to work to get anything.

Second, a sudden insight into a miracle cure that ends up becoming cheap and plentiful is pretty much what happened with penicillin and vaccines. Yes, there was some labor involved in making them (and still is), but it was clearly far less than the utility gained from all the improvements in health and lifespan that we have received from these inventions. Valuing these technologies in terms of their labor cost seems to completely miss the point of why they were such miracles.

So is there some other way to make a labor theory of value work?

The best I can come up with is this:

The value of a product is the amount of labor it would take to make that product by hand with pre-historic technology.

This is my attempt at steelmanning the labor theory of value. It does solve propositions 2, 3, and 4:

For 2, the fact that everything is handed to you (perhaps by robots) doesn’t change the fact that making it yourself would be really, really hard.

For 3, it’s much harder to make penicillin by hand than in a factory (though it can be done!), so improved penicillin technology is a gain in value. And every new vial of penicillin is worth the many hours that would have gone into making it by hand.

And for 4, any improvement in labor productivity works exactly how you’d expect: A machine that can do the work of 100 people produces 100 times as much value in goods. (In some ways, this is even more intuitive to most people than the utility theory of value, which predicts an increase, but not a one-to-one increase.)

So, okay, this theory is not preposterous, unlike everything we’ve considered so far.

But it really can’t be Marx’s theory, because he contradicts it very heavily in multiple places, and this theory, unlike his, does not predict that the rate of profit must fall. (Which, again, is good, because it doesn’t.)

Yet even this theory is ultimately unsatisfying, for the following reasons:

  1. Some products literally cannot be made by hand using pre-historic technology. Consider a graphics card or a strong-force microscope. In order to make these things, we had to make tools to make better tools to make even better tools to make still better tools to make yet even better tools to make staggeringly near-flawless tools to make them. Even if you had the complete schematics for all the necessary tools and machines, all the raw materials you needed, and an unlimited supply of labor, I’m not sure you could build a graphics card from scratch within a single lifetime.
  2. While it can account for the value of increased efficiency in producing a given good, it doesn’t seem to be able to account for the value of inventing whole new classes of goods. (Yes, penicillin can be made by hand using pre-historic tools, but nobody did as far as we know, and the value of that invention was absolutely enormous in a way that even this labor theory of value cannot account for.)

These two problems are related: The new products you can make now that you couldn’t before are made possible by a mix of new ideas and an accumulation of better and better tools.

As far as proposition 5, I think we might be able to shore up the theory by counting the value of capital accumulation in terms of the labor that would be needed at each level of technology: however many person-hours to make the optical microscope, and then however many person-hours to make lasers, and however many person-hours to make sulfuric acid, and so on and so forth, until you’ve finally added up all the labor that went into producing the things that produced the things that produced the things that produced the things that produced graphics cards.

But as for proposition 6? I think this is just fatal. I don’t think there’s any way for a labor theory of value to not systematically and catastrophically undervalue new discoveries and new inventions.

The whole point of new inventions is that they make new things possible or allow us to do things with far less effort or cost than before. The value they create is in the labor they save. But if they are things we theoretically could have done, just didn’t know how (like penicillin), then there is no value added by the discovery (though at least there can be a lot of value added by the actual production). And if they are things we couldn’t have done until we reached a certain level of technology and capital, the value added seems to all be captured by the production of each new tier of technology, with nothing left to go to the discovery itself.

Maybe there’s still a way to save this theory. But at some point, we have to stop and ask ourselves:

Why?

Why do we even want a labor theory of value, when we already have a utility theory of value?

Maybe it’s the fact that utility is hard to measure precisely, and so the idea of basing our value system on it is uncomfortable? Yet I think this is just a fact of life: The things that really matter are hard to measure precisely.

And it’s not as if we have absolutely no idea: We can tell the difference between happiness and suffering, and we can see how various products and technologies can contribute to happiness and alleviate suffering. (We can also see how some products and technologies can reduce happiness and contribute to suffering! Not all new technologies are good, and some products that are good for their users are bad for other people!)

Indeed, we even have a unit of measurement: The QALY. And for some particular technologies—such as penicillin and vaccines—we actually have a pretty good idea of the number of QALY they’ve added to the world, and it’s enormous.

I’m not even saying Marx was wrong about everything. He had some good ideas, actually. And Marxian economists today do sometimes come up with useful findings that can be integrated into a deeper understanding of political economy.

But he was wrong about some things, and the labor theory of value is one of them.

On foxes and hedgehogs, part I

Aug 3 JDN 2460891

Today I finally got around to reading Expert Political Judgment by Philip E. Tetlock, more or less in a single sitting because I’ve been sick the last week with some pretty tight limits on what activities I can do. (It’s mostly been reading, watching TV, or playing video games that don’t require intense focus.)

It’s really an excellent book, and I now both understand why it came so highly recommended to me, and now pass on that recommendation to you: Read it.

The central thesis of the book really boils down to three propositions:

  1. Human beings, even experts, are very bad at predicting political outcomes.
  2. Some people, who use an open-minded strategy (called “foxes”), perform substantially better than other people, who use a more dogmatic strategy (called “hedgehogs”).
  3. When rewarding predictors with money, power, fame, prestige, and status, human beings systematically favor (over)confident “hedgehogs” over (correctly) humble “foxes”.

I decided I didn’t want to make this post about current events, but I think you’ll probably agree with me when I say:

That explains a lot.

How did Tetlock determine this?

Well, he studies the issue several different ways, but the core experiment that drives his account is actually a rather simple one:

  1. He gathered a large group of subject-matter experts: Economists, political scientists, historians, and area-studies professors.
  2. He came up with a large set of questions about politics, economics, and similar topics, which could all be formulated as a set of probabilities: “How likely is this to get better/get worse/stay the same?” (For example, this was in the 1980s, so he asked about the fate of the Soviet Union: “By 1990, will they become democratic, remain as they are, or collapse and fragment?”)
  3. Each respondent answered a subset of the questions, some about their own particular field, some about another, more distant field; they assigned probabilities on an 11-point scale, from 0% to 100% in increments of 10%.
  4. A few years later, he compared the predictions to the actual results, scoring them using a Brier score, which penalizes you for assigning high probability to things that didn’t happen or low probability to things that did happen.
  5. He compared the resulting scores between people with different backgrounds, on different topics, with different thinking styles, and a variety of other variables. He also benchmarked them using some automated algorithms like “always say 33%” and “always give ‘stay the same’ 100%”.

I’ll show you the key results of that analysis momentarily, but to help it make more sense to you, let me elaborate a bit more on the “foxes” and “hedgehogs”. The notion is was first popularized by Isaiah Berlin in an essay called, simply, The Hedgehog and the Fox.

“The fox knows many things, but the hedgehog knows one very big thing.”

That is, someone who reasons as a “fox” combines ideas from many different sources and perspective, and tries to weigh them all together into some sort of synthesis that then yields a final answer. This process is messy and complicated, and rarely yields high confidence about anything.

Whereas, someone who reasons as a “hedgehog” has a comprehensive theory of the world, an ideology, that provides clear answers to almost any possible question, with the surely minor, insubstantial flaw that those answers are not particularly likely to be correct.

He also considered “hedge-foxes” (people who are mostly fox but also a little bit hedgehog) and “fox-hogs” (people who are mostly hedgehog but also a little bit fox).

Tetlock has decomposed the scores into two components: calibration and discrimination. (Both very overloaded words, but they are standard in the literature.)

Calibration is how well your stated probabilities matched up with the actual probabilities; that is, if you predicted 10% probability on 20 different events, you have very good calibration if precisely 2 of those events occurred, and very poor calibration if 18 of those events occurred.

Discrimination more or less describes how useful your predictions are, what information they contain above and beyond the simple base rate. If you just assign equal probability to all events, you probably will have reasonably good calibration, but you’ll have zero discrimination; whereas if you somehow managed to assign 100% to everything that happened and 0% to everything that didn’t, your discrimination would be perfect (and we would have to find out how you cheated, or else declare you clairvoyant).

For both measures, higher is better. The ideal for each is 100%, but it’s virtually impossible to get 100% discrimination and actually not that hard to get 100% calibration if you just use the base rates for everything.


There is a bit of a tradeoff between these two: It’s not too hard to get reasonably good calibration if you just never go out on a limb, but then your predictions aren’t as useful; we could have mostly just guessed them from the base rates.

On the graph, you’ll see downward-sloping lines that are meant to represent this tradeoff: Two prediction methods that would yield the same overall score but different levels of calibration and discrimination will be on the same line. In a sense, two points on the same line are equally good methods that prioritize usefulness over accuracy differently.

All right, let’s see the graph at last:

The pattern is quite clear: The more foxy you are, the better you do, and the more hedgehoggy you are, the worse you do.

I’d also like to point out the other two regions here: “Mindless competition” and “Formal models”.

The former includes really simple algorithms like “always return 33%” or “always give ‘stay the same’ 100%”. These perform shockingly well. The most sophisticated of these, “case-specific extrapolation” (35 and 36 on the graph, which basically assumes that each country will continue doing what it’s been doing) actually performs as well if not better than even the foxes.

And what’s that at the upper-right corner, absolutely dominating the graph? That’s “Formal models”. This describes basically taking all the variables you can find and shoving them into a gigantic logit model, and then outputting the result. It’s computationally intensive and requires a lot of data (hence why he didn’t feel like it deserved to be called “mindless”), but it’s really not very complicated, and it’s the best prediction method, in every way, by far.

This has made me feel quite vindicated about a weird nerd thing I do: When I have a big decision to make (especially a financial decision), I create a spreadsheet and assemble a linear utility model to determine which choice will maximize my utility, under different parameterizations based on my past experiences. Whichever result seems to win the most robustly, I choose. This is fundamentally similar to the “formal models” prediction method, where the thing I’m trying to predict is my own happiness. (It’s a bit less formal, actually, since I don’t have detailed happiness data to feed into the regression.) And it has worked for me, astonishingly well. It definitely beats going by my own gut. I highly recommend it.

What does this mean?

Well first of all, it means humans suck at predicting things. At least for this data set, even our experts don’t perform substantially better than mindless models like “always assume the base rate”.

Nor do experts perform much better in their own fields than in other fields; they do all perform better than undergrads or random people (who somehow perform worse than the “mindless” models)

But Tetlock also investigates further, trying to better understand this “fox/hedgehog” distinction and why it yields different performance. He really bends over backwards to try to redeem the hedgehogs, in the following ways:

  1. He allows them to make post-hoc corrections to their scores, based on “value adjustments” (assigning higher probability to events that would be really important) and “difficulty adjustments” (assigning higher scores to questions where the three outcomes were close to equally probable) and “fuzzy sets” (giving some leeway on things that almost happened or things that might still happen later).
  2. He demonstrates a different, related experiment, in which certain manipulations can cause foxes to perform a lot worse than they normally would, and even yield really crazy results like probabilities that add up to 200%.
  3. He has a whole chapter that is a Socratic dialogue (seriously!) between four voices: A “hardline neopositivist”, a “moderate neopositivist”, a “reasonable relativist”, and an “unrelenting relativist”; and all but the “hardline neopositivist” agree that there is some legitimate place for the sort of post hoc corrections that the hedgehogs make to keep themselves from looking so bad.

This post is already getting a bit long, so that will conclude part I. Stay tuned for part II, next week!

A new theoretical model of co-ops

Mar 30 JDN 2460765

A lot of economists seem puzzled by the fact that co-ops are just as efficient as corporate firms, since they have this idea that profit-sharing inevitably results in lower efficiency due to perverse incentives.

I think they’ve been modeling co-ops wrong. Here I present a new model, a very simple one, with linear supply and demand curves. Of course one could make a more sophisticated model, but this should be enough to make the point (and this is just a blog post, not a research paper, after all).

Demand curve is p = a – b q

Marginal cost is f q

There are n workers, who would hold equal shares of the co-op.

Competitive market

First, let’s start with the traditional corporate firm in a competitive market.

Since the market is competitive, price would equal marginal cost would equal wage:

a – b q = d q

q = a/(b+f)

w = d (a/(b+f)) = (a d)/(b+f)

Total profit will be

(p – w)q = 0.

Monopoly firm

In a monopoly, marginal revenue would equal marginal cost:
d[pq]/dq = a – 2 b q

If they are also a monopsonist in the labor market, this marginal cost would be marginal cost of labor, not wage:

d[d q2]/dq = 2 f q

a – 2 b q = 2 f q

q = a/(2b + 2f)

p = a – b q = a (1 – b/(2b + 2f)) = (a (b + 2f))/(2b + 2f)

w = d q = (a f)/(2b + 2f)

Total profit will be

(p – w) q = ((a (b + 2f))/(2b + 2f) – (a f)/(2b + 2f))a/(2b + 2f) = a2/(4b + 2f)

Now consider the co-op.

First, suppose that instead of working for a wage, I work for profit sharing.

If our product market is competitive, we’ll be price-takers, and we will produce until price equals marginal cost:

p = f q

a – b q = f q

q = a/(a+b)

But will we, really? I only get 1/n share of the profits. So let’s see here. My marginal cost of production is still f q, but the marginal benefit I get from more sales may only be p/n.

In that case I would work until:

p/n = f q

(a – b q)/n = fq

a – b q = n f q

q = (a/(b+nf))

Thus I would under-produce. This is the usual argument against co-ops and similar shared ownership.

Co-ops with wages

But that’s not actually how co-ops work. They pay wages. Why do they do that? Well, consider what happens if I am offered a wage as a worker-owner of the co-op.

Is there any reason for the co-op to vote on a wage that is less than the competitive market? No, because owners are workers, so any additional profit from a lower wage would simply be taken from their own wages.

If there any reason for the co-op to vote on a wage that is more than the competitive market? No, because workers are owners, and any surplus lost by paying higher wages would simply be taken from their own profits.

So if the product market is competitive, the co-op will produce the same amount and charge the same price as a firm in perfect competition, even if they have market power over their own wages.

Monopoly co-ops

The argument above doesn’t assume that the co-op has no market power in the labor market. Thus if they are a monopoly in the product market and a monopsony in the labor market, they still pay a competitive wage.

Thus they would set marginal revenue equal to marginal cost:

a – 2 b q = f q

q = a/(2b + f)

The co-op will produce more than the monopoly firm..

This is the new price:

p = a – b q = a(1 – b/(2b+f)) = a(b+f)/(2b + f)

It’s not obvious that this is lower than the price charged by the monopoly firm, but it is.

(a (b + 2f))/(2b + 2f) – a(b+f)/(2b + f) = (a (2b + f)(b + 2f) – 2 a(b+f)2)/(2(b+f)(2b+f))

This is proportional to:

(2b + f)(b + 2f) – 2(b+f)2

2b2 + 5bf + 2f2 – (2b2 + 4bf + 2f2) = bf

So it’s not a large difference, but it’s there. In the presence of market power in the labor market, the co-op is better for consumers, because they get more goods and pay a lower price.

Thus, there is actually no lost efficiency from being a co-op. There is simply much lower inequality, and potentially higher efficiency.

But that’s just in theory.

What do we see in practice?

Exactly that.

Co-ops have the same productivity and efficiency as corporate firms, but they pay higher wages, provide better benefits, and offer collateral benefits to their communities. In fact, they are sometimes more efficient than corporate firms.

Since they’re just as efficient—if not more so—and produce much lower inequality, switching more firms over to co-ops would clearly be a good thing.

Why, then, aren’t co-ops more common?

Because the people who have the money don’t like them.

The biggest barrier facing co-ops is their inability to get financing, because they don’t pay shareholders (so no IPOs) and banks don’t like to lend to them. They tend to make less profit than corporate firms, which offers investors a lower return—instead that money goes to the worker-owners. This lower return isn’t due to inefficiency; it’s just a different distribution of income, more to labor and less to capital.

We will need new financial institutions to support co-ops, such as the Cooperative Fund of New England. And general redistribution of wealth would also help, because if middle class people had more wealth they could afford to finance co-ops. (It would also be good for many other reasons, of course.)

How to detect discrimination, empirically

Aug 25 JDN 2460548

For concreteness, I’ll use men and women as my example, though the same principles would apply for race, sexual orientation, and so on. Suppose we find that there are more men than women in a given profession; does this mean that women are being discriminated against?

Not necessarily. Maybe women are less interested in that kind of work, or innately less qualified. Is there a way we can determine empirically that it really is discrimination?

It turns out that there is. All we need is a reliable measure of performance in that profession. Then, we compare performance between men and women, and that comparison can tell us whether discrimination is happening or not. The key insight is that workers in a job are not a random sample; they are a selected sample. The results of that selection can tell us whether discrimination is happening.

Here’s a simple model to show how this works.

Suppose there are five different skill levels in the job, from 1 to 5 where 5 is the most skilled. And suppose there are 5 women and 5 men in the population.

1. Baseline

The baseline case to consider is when innate talents are equal and there is no discrimination. In that case, we should expect men and women to be equally represented in the profession.

For the simplest case, let’s say that there is one person at each skill level:

MenWomen
11
22
33
44
55

Now suppose that everyone above a certain skill threshold gets hired. Since we’re assuming no discrimination, the threshold should be the same for men and women. Let’s say it’s 3; then these are the people who get hired:

Hired MenHired Women
33
44
55

The result is that not only are there the same number of men and women in the job, their skill levels are also the same. There are just as many highly-competent men as highly-competent women.

2. Innate Differences

Now, suppose there is some innate difference in talent between men and women for this job. For most jobs this seems suspicious, but consider pro sports: Men really are better at basketball, in general, than women, and this is pretty clearly genetic. So it’s not absurd to suppose that for at least some jobs, there might be some innate differences. What would that look like?


Again suppose a population of 5 men and 5 women, but now the women are a bit less qualified: There are two 1s and no 5s among the women.

MenWomen
11
21
32
43
54

Then, this is the group that will get hired:

Hired MenHired Women
33
44
5

The result will be fewer women who are on average less qualified. The most highly-qualified individuals at that job will be almost entirely men. (In this simple model, entirely men; but you can easily extend it so that there are a few top-qualified women.)

This is in fact what we see for a lot of pro sports; in a head-to-head match, even the best WNBA teams would generally lose against most NBA teams. That’s what it looks like when there are real innate differences.

But it’s hard to find clear examples outside of sports. The genuine, large differences in size and physical strength between the sexes just don’t seem to be associated with similar differences in mental capabilities or even personality. You can find some subtler effects, but nothing very large—and certainly nothing large enough to explain the huge gender gaps in various industries.

3. Discrimination

What does it look like when there is discrimination?

Now assume that men and women are equally qualified, but it’s harder for women to get hired, because of discrimination. The key insight here is that this amounts to women facing a higher threshold. Where men only need to have level 3 competence to get hired, women need level 4.

So if the population looks like this:

MenWomen
11
22
33
44
55

The hired employees will look like this:

Hired MenHired Women
3
44
55

Once again we’ll have fewer women in the profession, but they will be on average more qualified. The top-performing individuals will be as likely to be women as they are to be men, while the lowest-performing individuals will be almost entirely men.

This is the kind of pattern we observe when there is discrimination. Do we see it in real life?

Yes, we see it all the time.

Corporations with women CEOs are more profitable.

Women doctors have better patient outcomes.

Startups led by women are more likely to succeed.

This shows that there is some discrimination happening, somewhere in the process. Does it mean that individual firms are actively discriminating in their hiring process? No, it doesn’t. The discrimination could be happening somewhere else; maybe it happens during education, or once women get hired. Maybe it’s a product of sexism in society as a whole, that isn’t directly under the control of employers. But it must be in there somewhere. If women are both rarer and more competent, there must be some discrimination going on.

What if there is also innate difference? We can detect that too!

4. Both

Suppose now that men are on average more talented, but there is also discrimination against women. Then the population might look like this:

MenWomen
11
21
32
43
54

And the hired employees might look like this:

Hired MenHired Women
3
4
54

In such a scenario, you’ll see a large gender imbalance, but there may not be a clear difference in competence. The tiny fraction of women who get hired will perform about as well as the men, on average.

Of course, this assumes that the two effects are of equal strength. In reality, we might see a whole spectrum of possibilities, from very strong discrimination with no innate differences, all the way to very large innate differences with no discrimination. The outcomes will then be similarly along a spectrum: When discrimination is much larger than innate difference, women will be rare but more competent. When innate difference is much larger than discrimination, women will be rare and less competent. And when there is a mix of both, women will be rare but won’t show as much difference in competence.

Moreover, if you look closer at the distribution of performance, you can still detect the two effects independently. If the lowest-performing workers are almost all men, that’s evidence of discrimination against women; while if the highest-performing workers are almost all men, that’s evidence of innate difference. And if you look at the table above, that’s exactly what we see: Both the 3 and the 5 are men, indicating the presence of both effects.

What does affirmative action do?

Effectively, affirmative action lowers the threshold for hiring women (or minorities) in order to equalize representation in the workplace. In the presence of discrimination raising that threshold, this is exactly what we need! It can take us from case 3 (discrimination) to case 1 (equality), or from case 4 (both discrimination and innate difference) to case 2 (innate difference only).

Of course, it’s possible for us to overshoot, using more affirmative action than we should have. If we achieve better representation of women, but the lowest performers at the job are women, then we have overshot, effectively now discriminating against men. Fortunately, there is very little evidence of this in practice. In general, even with affirmative action programs in place, we tend to find that the lowest performers are still men—so there is still discrimination against women that we’ve failed to compensate for.

What if we can’t measure competence?

Of course, it’s possible that we don’t have good measures of competence in a given industry. (One must wonder how firms decide who to hire, but frankly I’m prepared to believe they’re just really bad at it.) Then we can’t observe discrimination statistically in this way. What do we do then?

Well, there is at least one avenue left for us to detect discrimination: We can do direct experiments comparing resumes with male names versus female names. These sorts of experiments typically don’t find very much, though—at least for women. For different races, they absolutely do find strong results. They also find evidence of discrimination against people with disabilities, older people, and people who are physically unattractive. There’s also evidence of intersectional effects, where women of particular ethnic groups get discriminated against even when women in general don’t.

But this will only pick up discrimination if it occurs during the hiring process. The advantage of having a competence measure is that it can detect discrimination that occurs anywhere—even outside employer control. Of course, if we don’t know where the discrimination is happening, that makes it very hard to fix; so the two approaches are complementary.

And there is room for new methods too; right now we don’t have a good way to detect discrimination in promotion decisions, for example. Many of us suspect that it occurs, but unless you have a good measure of competence, you can’t really distinguish promotion discrimination from innate differences in talent. We don’t have a good method for testing that in a direct experiment, either, because unlike hiring, we can’t just use fake resumes with masculine or feminine names on them.

What behavioral economics needs

Apr 16 JDN 2460049

The transition from neoclassical to behavioral economics has been a vital step forward in science. But lately we seem to have reached a plateau, with no major advances in the paradigm in quite some time.

It could be that there is work already being done which will, in hindsight, turn out to be significant enough to make that next step forward. But my fear is that we are getting bogged down by our own methodological limitations.

Neoclassical economics shared with us its obsession with mathematical sophistication. To some extent this was inevitable; in order to impress neoclassical economists enough to convert some of them, we had to use fancy math. We had to show that we could do it their way in order to convince them why we shouldn’t—otherwise, they’d just have dismissed us the way they had dismissed psychologists for decades, as too “fuzzy-headed” to do the “hard work” of putting everything into equations.

But the truth is, putting everything into equations was never the right approach. Because human beings clearly don’t think in equations. Once we write down a utility function and get ready to take its derivative and set it equal to zero, we have already distanced ourselves from how human thought actually works.

When dealing with a simple physical system, like an atom, equations make sense. Nobody thinks that the electron knows the equation and is following it intentionally. That equation simply describes how the forces of the universe operate, and the electron is subject to those forces.

But human beings do actually know things and do things intentionally. And while an equation could be useful for analyzing human behavior in the aggregate—I’m certainly not objecting to statistical analysis—it really never made sense to say that people make their decisions by optimizing the value of some function. Most people barely even know what a function is, much less remember calculus well enough to optimize one.

Yet right now, behavioral economics is still all based in that utility-maximization paradigm. We don’t use the same simplistic utility functions as neoclassical economists; we make them more sophisticated and realistic. Yet in that very sophistication we make things more complicated, more difficult—and thus in at least that respect, even further removed from how actual human thought must operate.

The worst offender here is surely Prospect Theory. I recognize that Prospect Theory predicts human behavior better than conventional expected utility theory; nevertheless, it makes absolutely no sense to suppose that human beings actually do some kind of probability-weighting calculation in their heads when they make judgments. Most of my students—who are well-trained in mathematics and economics—can’t even do that probability-weighting calculation on paper, with a calculator, on an exam. (There’s also absolutely no reason to do it! All it does it make your decisions worse!) This is a totally unrealistic model of human thought.

This is not to say that human beings are stupid. We are still smarter than any other entity in the known universe—computers are rapidly catching up, but they haven’t caught up yet. It is just that whatever makes us smart must not be easily expressible as an equation that maximizes a function. Our thoughts are bundles of heuristics, each of which may be individually quite simple, but all of which together make us capable of not only intelligence, but something computers still sorely, pathetically lack: wisdom. Computers optimize functions better than we ever will, but we still make better decisions than they do.

I think that what behavioral economics needs now is a new unifying theory of these heuristics, which accounts for not only how they work, but how we select which one to use in a given situation, and perhaps even where they come from in the first place. This new theory will of course be complex; there’s a lot of things to explain, and human behavior is a very complex phenomenon. But it shouldn’t be—mustn’t be—reliant on sophisticated advanced mathematics, because most people can’t do advanced mathematics (almost by construction—we would call it something different otherwise). If your model assumes that people are taking derivatives in their heads, your model is already broken. 90% of the world’s people can’t take a derivative.

I guess it could be that our cognitive processes in some sense operate as if they are optimizing some function. This is commonly posited for the human motor system, for instance; clearly baseball players aren’t actually solving differential equations when they throw and catch balls, but the trajectories that balls follow do in fact obey such equations, and the reliability with which baseball players can catch and throw suggests that they are in some sense acting as if they can solve them.

But I think that a careful analysis of even this classic example reveals some deeper insights that should call this whole notion into question. How do baseball players actually do what they do? They don’t seem to be calculating at all—in fact, if you asked them to try to calculate while they were playing, it would destroy their ability to play. They learn. They engage in practiced motions, acquire skills, and notice patterns. I don’t think there is anywhere in their brains that is actually doing anything like solving a differential equation. It’s all a process of throwing and catching, throwing and catching, over and over again, watching and remembering and subtly adjusting.

One thing that is particularly interesting to me about that process is that is astonishingly flexible. It doesn’t really seem to matter what physical process you are interacting with; as long as it is sufficiently orderly, such a method will allow you to predict and ultimately control that process. You don’t need to know anything about differential equations in order to learn in this way—and, indeed, I really can’t emphasize this enough, baseball players typically don’t.

In fact, learning is so flexible that it can even perform better than calculation. The usual differential equations most people would think to use to predict the throw of a ball would assume ballistic motion in a vacuum, which absolutely not what a curveball is. In order to throw a curveball, the ball must interact with the air, and it must be launched with spin; curving a baseball relies very heavily on the Magnus Effect. I think it’s probably possible to construct an equation that would fully predict the motion of a curveball, but it would be a tremendously complicated one, and might not even have an exact closed-form solution. In fact, I think it would require solving the Navier-Stokes equations, for which there is an outstanding Millennium Prize. Since the viscosity of air is very low, maybe you could get away with approximating using the Euler fluid equations.

To be fair, a learning process that is adapting to a system that obeys an equation will yield results that become an ever-closer approximation of that equation. And it is in that sense that a baseball player can be said to be acting as if solving a differential equation. But this relies heavily on the system in question being one that obeys an equation—and when it comes to economic systems, is that even true?

What if the reason we can’t find a simple set of equations that accurately describe the economy (as opposed to equations of ever-escalating complexity that still utterly fail to describe the economy) is that there isn’t one? What if the reason we can’t find the utility function people are maximizing is that they aren’t maximizing anything?

What behavioral economics needs now is a new approach, something less constrained by the norms of neoclassical economics and more aligned with psychology and cognitive science. We should be modeling human beings based on how they actually think, not some weird mathematical construct that bears no resemblance to human reasoning but is designed to impress people who are obsessed with math.

I’m of course not the first person to have suggested this. I probably won’t be the last, or even the one who most gets listened to. But I hope that I might get at least a few more people to listen to it, because I have gone through the mathematical gauntlet and earned my bona fides. It is too easy to dismiss this kind of reasoning from people who don’t actually understand advanced mathematics. But I do understand differential equations—and I’m telling you, that’s not how people think.

Implications of stochastic overload

Apr 2 JDN 2460037

A couple weeks ago I presented my stochastic overload model, which posits a neurological mechanism for the Yerkes-Dodson effect: Stress increases sympathetic activation, and this increases performance, up to the point where it starts to risk causing neural pathways to overload and shut down.

This week I thought I’d try to get into some of the implications of this model, how it might be applied to make predictions or guide policy.

One thing I often struggle with when it comes to applying theory is what actual benefits we get from a quantitative mathematical model as opposed to simply a basic qualitative idea. In many ways I think these benefits are overrated; people seem to think that putting something into an equation automatically makes it true and useful. I am sometimes tempted to try to take advantage of this, to put things into equations even though I know there is no good reason to put them into equations, simply because so many people seem to find equations so persuasive for some reason. (Studies have even shown that, particularly in disciplines that don’t use a lot of math, inserting a totally irrelevant equation into a paper makes it more likely to be accepted.)

The basic implications of the Yerkes-Dodson effect are already widely known, and utterly ignored in our society. We know that excessive stress is harmful to health and performance, and yet our entire economy seems to be based around maximizing the amount of stress that workers experience. I actually think neoclassical economics bears a lot of the blame for this, as neoclassical economists are constantly talking about “increasing work incentives”—which is to say, making work life more and more stressful. (And let me remind you that there has never been any shortage of people willing to work in my lifetime, except possibly briefly during the COVID pandemic. The shortage has always been employers willing to hire them.)

I don’t know if my model can do anything to change that. Maybe by putting it into an equation I can make people pay more attention to it, precisely because equations have this weird persuasive power over most people.

As far as scientific benefits, I think that the chief advantage of a mathematical model lies in its ability to make quantitative predictions. It’s one thing to say that performance increases with low levels of stress then decreases with high levels; but it would be a lot more useful if we could actually precisely quantify how much stress is optimal for a given person and how they are likely to perform at different levels of stress.

Unfortunately, the stochastic overload model can only make detailed predictions if you have fully specified the probability distribution of innate activation, which requires a lot of free parameters. This is especially problematic if you don’t even know what type of distribution to use, which we really don’t; I picked three classes of distribution because they were plausible and tractable, not because I had any particular evidence for them.

Also, we don’t even have standard units of measurement for stress; we have a vague notion of what more or less stressed looks like, but we don’t have the sort of quantitative measure that could be plugged into a mathematical model. Probably the best units to use would be something like blood cortisol levels, but then we’d need to go measure those all the time, which raises its own issues. And maybe people don’t even respond to cortisol in the same ways? But at least we could measure your baseline cortisol for awhile to get a prior distribution, and then see how different incentives increase your cortisol levels; and then the model should give relatively precise predictions about how this will affect your overall performance. (This is a very neuroeconomic approach.)

So, for now, I’m not really sure how useful the stochastic overload model is. This is honestly something I feel about a lot of the theoretical ideas I have come up with; they often seem too abstract to be usefully applicable to anything.

Maybe that’s how all theory begins, and applications only appear later? But that doesn’t seem to be how people expect me to talk about it whenever I have to present my work or submit it for publication. They seem to want to know what it’s good for, right now, and I never have a good answer to give them. Do other researchers have such answers? Do they simply pretend to?

Along similar lines, I recently had one of my students ask about a theory paper I wrote on international conflict for my dissertation, and after sending him a copy, I re-read the paper. There are so many pages of equations, and while I am confident that the mathematical logic is valid,I honestly don’t know if most of them are really useful for anything. (I don’t think I really believe that GDP is produced by a Cobb-Douglas production function, and we don’t even really know how to measure capital precisely enough to say.) The central insight of the paper, which I think is really important but other people don’t seem to care about, is a qualitative one: International treaties and norms provide an equilibrium selection mechanism in iterated games. The realists are right that this is cheap talk. The liberals are right that it works. Because when there are many equilibria, cheap talk works.

I know that in truth, science proceeds in tiny steps, building a wall brick by brick, never sure exactly how many bricks it will take to finish the edifice. It’s impossible to see whether your work will be an irrelevant footnote or the linchpin for a major discovery. But that isn’t how the institutions of science are set up. That isn’t how the incentives of academia work. You’re not supposed to say that this may or may not be correct and is probably some small incremental progress the ultimate impact of which no one can possibly foresee. You’re supposed to sell your work—justify how it’s definitely true and why it’s important and how it has impact. You’re supposed to convince other people why they should care about it and not all the dozens of other probably equally-valid projects being done by other researchers.

I don’t know how to do that, and it is agonizing to even try. It feels like lying. It feels like betraying my identity. Being good at selling isn’t just orthogonal to doing good science—I think it’s opposite. I think the better you are at selling your work, the worse you are at cultivating the intellectual humility necessary to do good science. If you think you know all the answers, you’re just bad at admitting when you don’t know things. It feels like in order to succeed in academia, I have to act like an unscientific charlatan.

Honestly, why do we even need to convince you that our work is more important than someone else’s? Are there only so many science points to go around? Maybe the whole problem is this scarcity mindset. Yes, grant funding is limited; but why does publishing my work prevent you from publishing someone else’s? Why do you have to reject 95% of the papers that get sent to you? Don’t tell me you’re limited by space; the journals are digital and searchable and nobody reads the whole thing anyway. Editorial time isn’t infinite, but most of the work has already been done by the time you get a paper back from peer review. Of course, I know the real reason: Excluding people is the main source of prestige.

The role of innate activation in stochastic overload

Mar 26 JDN 2460030

Two posts ago I introduced my stochastic overload model, which offers an explanation for the Yerkes-Dodson effect by positing that additional stress increases sympathetic activation, which is useful up until the point where it starts risking an overload that forces systems to shut down and rest.

The central equation of the model is actually quite simple, expressed either as an expectation or as an integral:

Y = E[x + s | x + s < 1] P[x + s < 1]

Y = \int_{0}^{1-s} (x+s) dF(x)

The amount of output produced is the expected value of innate activation plus stress activation, times the probability that there is no overload. Increased stress raises this expectation value (the incentive effect), but also increases the probability of overload (the overload effect).

The model relies upon assuming that the brain starts with some innate level of activation that is partially random. Exactly what sort of Yerkes-Dodson curve you get from this model depends very much on what distribution this innate activation takes.

I’ve so far solved it for three types of distribution.

The simplest is a uniform distribution, where within a certain range, any level of activation is equally probable. The probability density function looks like this:

Assume the distribution has support between a and b, where a < b.

When b+s < 1, then overload is impossible, and only the incentive effect occurs; productivity increases linearly with stress.

The expected output is simply the expected value of a uniform distribution from a+s to b+s, which is:

E[x + s] = (a+b)/2+s

Then, once b+s > 1, overload risk begins to increase.

In this range, the probability of avoiding overload is:

P[x + s < 1] = F(1-s) = (1-s-a)/(b-a)

(Note that at b+s=1, this is exactly 1.)

The expected value of x+s in this range is:

E[x + s | x + s < 1] = (1-s)(1+s)/(2(b-a))

Multiplying these two together:

Y = [(1-s)(1+s)(1-s-a)]/[2(b-a)^2]

Here is what that looks like for a=0, b=1/2:

It does have the right qualitative features: increasing, then decreasing. But its sure looks weird, doesn’t it? It has this strange kinked shape.

So let’s consider some other distributions.

The next one I was able to solve it for is an exponential distribution, where the most probable activation is zero, and then higher activation always has lower probability than lower activation in an exponential decay:

For this it was actually easiest to do the integral directly (I did it by integrating by parts, but I’m sure you don’t care about all the mathematical steps):

Y = \int_{0}^{1-s} (x+s) dF(x)

Y = (1/λ+s) – (1/ λ + 1)e^(-λ(1-s))

The parameter λdecides how steeply your activation probability decays. Someone with low λ is relatively highly activated all the time, while someone with high λ is usually not highly activated; this seems like it might be related to the personality trait neuroticism.

Here are graphs of what the resulting Yerkes-Dodson curve looks like for several different values of λ:

λ = 0.5:

λ = 1:

λ = 2:

λ = 4:

λ = 8:

The λ = 0.5 person has high activation a lot of the time. They are actually fairly productive even without stress, but stress quickly overwhelms them. The λ = 8 person has low activation most of the time. They are not very productive without stress, but can also bear relatively high amounts of stress without overloading.

(The low-λ people also have overall lower peak productivity in this model, but that might not be true in reality, if λ is inversely correlated with some other attributes that are related to productivity.)

Neither uniform nor exponential has the nice bell-curve shape for innate activation we might have hoped for. There is another class of distributions, beta distributions, which do have this shape, and they are sort of tractable—you need something called an incomplete beta function, which isn’t an elementary function but it’s useful enough that most statistical packages include it.

Beta distributions have two parameters, α and β. They look like this:

Beta distributions are quite useful in Bayesian statistics; if you’re trying to estimate the probability of a random event that either succeeds or fails with a fixed probability (a Bernoulli process), and so far you have observed a successes and b failures, your best guess of its probability at each trial is a beta distribution with α = a+1 and β = b+1.

For beta distributions with parameters α and β, the result comes out to (I is that incomplete beta function I mentioned earlier):

Y = I(1-s, α+1, β) + I(1-s, α, β)

For whole number values of α andβ, the incomplete beta function can be computed by hand (though it is more work the larger they are); here’s an example with α = β = 2.

The innate activation probability looks like this:

And the result comes out like this:

Y = 2(1-s)^3 – 3/2(1-s)^4 + 3s(1-s)^2 – 2s(1-s)^3

This person has pretty high innate activation most of the time, so stress very quickly overwhelms them. If I had chosen a much higher β, I could change that, making them less likely to be innately so activated.

These are the cases I’ve found to be relatively tractable so far. They all have the right qualitative pattern: Increasing stress increases productivity for awhile, then begins decreasing it once overload risk becomes too high. They also show a general pattern where people who are innately highly activated (neurotic?) are much more likely to overload and thus much more sensitive to stress.

The stochastic overload model

The stochastic overload model

Mar 12 JDN 2460016

The next few posts are going to be a bit different, a bit more advanced and technical than usual. This is because, for the first time in several months at least, I am actually working on what could be reasonably considered something like theoretical research.

I am writing it up in the form of blog posts, because actually writing a paper is still too stressful for me right now. This also forces me to articulate my ideas in a clearer and more readable way, rather than dive directly into a morass of equations. It also means that even if I do never actually get around to finishing a paper, the idea is out there, and maybe someone else could make use of it (and hopefully give me some of the credit).

I’ve written previously about the Yerkes-Dodson effect: On cognitively-demanding tasks, increased stress increases performance, but only to a point, after which it begins decreasing it again. The effect is well-documented, but the mechanism is poorly understood.

I am currently on the wrong side of the Yerkes-Dodson curve, which is why I’m too stressed to write this as a formal paper right now. But that also gave me some ideas about how it may work.

I have come up with a simple but powerful mathematical model that may provide a mechanism for the Yerkes-Dodson effect.

This model is clearly well within the realm of a behavioral economic model, but it is also closely tied to neuroscience and cognitive science.

I call it the stochastic overload model.

First, a metaphor: Consider an engine, which can run faster or slower. If you increase its RPMs, it will output more power, and provide more torque—but only up to a certain point. Eventually it hits a threshold where it will break down, or even break apart. In real engines, we often include safety systems that force the engine to shut down as it approaches such a threshold.

I believe that human brains function on a similar principle. Stress increases arousal, which activates a variety of processes via the sympathetic nervous system. This activation improves performance on both physical and cognitive tasks. But it has a downside; especially on cognitively demanding tasks which required sustained effort, I hypothesize that too much sympathetic activation can result in a kind of system overload, where your brain can no longer handle the stress and processes are forced to shut down.

This shutdown could be brief—a few seconds, or even a fraction of a second—or it could be prolonged—hours or days. That might depend on just how severe the stress is, or how much of your brain it requires, or how prolonged it is. For purposes of the model, this isn’t vital. It’s probably easiest to imagine it being a relatively brief, localized shutdown of a particular neural pathway. Then, your performance in a task is summed up over many such pathways over a longer period of time, and by the law of large numbers your overall performance is essentially the average performance of all your brain systems.

That’s the “overload” part of the model. Now for the “stochastic” part.

Let’s say that, in the absence of stress, your brain has a certain innate level of sympathetic activation, which varies over time in an essentially chaotic, unpredictable—stochastic—sort of way. It is never really completely deactivated, and may even have some chance of randomly overloading itself even without outside input. (Actually, a potential role in the model for the personality trait neuroticism is an innate tendency toward higher levels of sympathetic activation in the absence of outside stress.)

Let’s say that this innate activation is x, which follows some kind of known random distribution F(x).

For simplicity, let’s also say that added stress s adds linearly to your level of sympathetic activation, so your overall level of activation is x + s.

For simplicity, let’s say that activation ranges between 0 and 1, where 0 is no activation at all and 1 is the maximum possible activation and triggers overload.

I’m assuming that if a pathway shuts down from overload, it doesn’t contribute at all to performance on the task. (You can assume it’s only reduced performance, but this adds complexity without any qualitative change.)

Since sympathetic activation improves performance, but can result in overload, your overall expected performance in a given task can be computed as the product of two terms:

[expected value of x + s, provided overload does not occur] * [probability overload does not occur]

E[x + s | x + s < 1] P[x + s < 1]

The first term can be thought of as the incentive effect: Higher stress promotes more activation and thus better performance.

The second term can be thought of as the overload effect: Higher stress also increases the risk that activation will exceed the threshold and force shutdown.

This equation actually turns out to have a remarkably elegant form as an integral (and here’s where I get especially technical and mathematical):

\int_{0}^{1-s} (x+s) dF(x)

The integral subsumes both the incentive effect and the overload effect into one term; you can also think of the +s in the integrand as the incentive effect and the 1-s in the limit of integration as the overload effect.

For the uninitated, this is probably just Greek. So let me show you some pictures to help with your intuition. These are all freehand sketches, so let me apologize in advance for my limited drawing skills. Think of this as like Arthur Laffer’s famous cocktail napkin.

Suppose that, in the absence of outside stress, your innate activation follows a distribution like this (this could be a normal or logit PDF; as I’ll talk about next week, logit is far more tractable):

As I start adding stress, this shifts the distribution upward, toward increased activation:

Initially, this will improve average performance.

But at some point, increased stress actually becomes harmful, as it increases the probability of overload.

And eventually, the probability of overload becomes so high that performance becomes worse than it was with no stress at all:

The result is that overall performance, as a function of stress, looks like an inverted U-shaped curve—the Yerkes-Dodson curve:

The precise shape of this curve depends on the distribution that we use for the innate activation, which I will save for next week’s post.

The injustice of talent

Sep 4 JDN 2459827

Consider the following two principles of distributive justice.

A: People deserve to be rewarded in proportion to what they accomplish.

B: People deserve to be rewarded in proportion to the effort they put in.

Both principles sound pretty reasonable, don’t they? They both seem like sensible notions of fairness, and I think most people would broadly agree with both them.

This is a problem, because they are mutually contradictory. We cannot possibly follow them both.

For, as much as our society would like to pretend otherwise—and I think this contradiction is precisely why our society would like to pretend otherwise—what you accomplish is not simply a function of the effort you put in.

Don’t get me wrong; it is partly a function of the effort you put in. Hard work does contribute to success. But it is neither sufficient, nor strictly necessary.

Rather, success is a function of three factors: Effort, Environment, and Talent.

Effort is the work you yourself put in, and basically everyone agrees you deserve to be rewarded for that.

Environment includes all the outside factors that affect you—including both natural and social environment. Inheritance, illness, and just plain luck are all in here, and there is general, if not universal, agreement that society should make at least some efforts to minimize inequality created by such causes.

And then, there is talent. Talent includes whatever capacities you innately have. It could be strictly genetic, or it could be acquired in childhood or even in the womb. But by the time you are an adult and responsible for your own life, these factors are largely fixed and immutable. This includes things like intelligence, disability, even height. The trillion-dollar question is: How much should we reward talent?

For talent clearly does matter. I will never swim like Michael Phelps, run like Usain Bolt, or shoot hoops like Steph Curry. It doesn’t matter how much effort I put in, how many hours I spend training—I will never reach their level of capability. Never. It’s impossible. I could certainly improve from my current condition; perhaps it would even be good for me to do so. But there are certain hard fundamental constraints imposed by biology that give them more potential in these skills than I will ever have.

Conversely, there are likely things I can do that they will never be able to do, though this is less obvious. Could Michael Phelps never be as good a programmer or as skilled a mathematician as I am? He certainly isn’t now. Maybe, with enough time, enough training, he could be; I honestly don’t know. But I can tell you this: I’m sure it would be harder for him than it was for me. He couldn’t breeze through college-level courses in differential equations and quantum mechanics the way I did. There is something I have that he doesn’t, and I’m pretty sure I was born with it. Call it spatial working memory, or mathematical intuition, or just plain IQ. Whatever it is, math comes easy to me in not so different a way from how swimming comes easy to Michael Phelps. I have talent for math; he has talent for swimming.

Moreover, these are not small differences. It’s not like we all come with basically the same capabilities with a little bit of variation that can be easily washed out by effort. We’d like to believe that—we have all sorts of cultural tropes that try to inculcate that belief in us—but it’s obviously not true. The vast majority of quantum physicists are people born with high IQ. The vast majority of pro athletes are people born with physical prowess. The vast majority of movie stars are people born with pretty faces. For many types of jobs, the determining factor seems to be talent.

This isn’t too surprising, actually—even if effort matters a lot, we would still expect talent to show up as the determining factor much of the time.

Let’s go back to that contest function model I used to analyze the job market awhile back (the one that suggests we spend way too much time and money in the hiring process). This time let’s focus on the perspective of the employees themselves.

Each employee has a level of talent, h. Employee X has talent hx and exerts effort x, producing output of a quality that is the product of these: hx x. Similarly, employee Z has talent hz and exerts effort z, producing output hz z.

Then, there’s a certain amount of luck that factors in. The most successful output isn’t necessarily the best, or maybe what should have been the best wasn’t because some random circumstance prevailed. But we’ll say that the probability an individual succeeds is proportional to the quality of their output.

So the probability that employee X succeeds is: hx x / ( hx x + hz z)

I’ll skip the algebra this time (if you’re interested you can look back at that previous post), but to make a long story short, in Nash equilibrium the two employees will exert exactly the same amount of effort.

Then, which one succeeds will be entirely determined by talent; because x = z, the probability that X succeeds is hx / ( hx + hz).

It’s not that effort doesn’t matter—it absolutely does matter, and in fact in this model, with zero effort you get zero output (which isn’t necessarily the case in real life). It’s that in equilibrium, everyone is exerting the same amount of effort; so what determines who wins is innate talent. And I gotta say, that sounds an awful lot like how professional sports works. It’s less clear whether it applies to quantum physicists.

But maybe we don’t really exert the same amount of effort! This is true. Indeed, it seems like actually effort is easier for people with higher talent—that the same hour spent running on a track is easier for Usain Bolt than for me, and the same hour studying calculus is easier for me than it would be for Usain Bolt. So in the end our equilibrium effort isn’t the same—but rather than compensating, this effect only serves to exaggerate the difference in innate talent between us.

It’s simple enough to generalize the model to allow for such a thing. For instance, I could say that the cost of producing a unit of effort is inversely proportional to your talent; then instead of hx / ( hx + hz ), in equilibrium the probability of X succeeding would become hx2 / ( hx2 + hz2). The equilibrium effort would also be different, with x > z if hx > hz.

Once we acknowledge that talent is genuinely important, we face an ethical problem. Do we want to reward people for their accomplishment (A), or for their effort (B)? There are good cases to be made for each.

Rewarding for accomplishment, which we might call meritocracy,will tend to, well, maximize accomplishment. We’ll get the best basketball players playing basketball, the best surgeons doing surgery. Moreover, accomplishment is often quite easy to measure, even when effort isn’t.

Rewarding for effort, which we might call egalitarianism, will give people the most control over their lives, and might well feel the most fair. Those who succeed will be precisely those who work hard, even if they do things they are objectively bad at. Even people who are born with very little talent will still be able to make a living by working hard. And it will ensure that people do work hard, which meritocracy can actually fail at: If you are extremely talented, you don’t really need to work hard because you just automatically succeed.

Capitalism, as an economic system, is very good at rewarding accomplishment. I think part of what makes socialism appealing to so many people is that it tries to reward effort instead. (Is it very good at that? Not so clear.)

The more extreme differences are actually in terms of disability. There’s a certain baseline level of activities that most people are capable of, which we think of as “normal”: most people can talk; most people can run, if not necessarily very fast; most people can throw a ball, if not pitch a proper curveball. But some people can’t throw. Some people can’t run. Some people can’t even talk. It’s not that they are bad at it; it’s that they are literally not capable of it. No amount of effort could have made Stephen Hawking into a baseball player—not even a bad one.

It’s these cases when I think egalitarianism becomes most appealing: It just seems deeply unfair that people with severe disabilities should have to suffer in poverty. Even if they really can’t do much productive work on their own, it just seems wrong not to help them, at least enough that they can get by. But capitalism by itself absolutely would not do that—if you aren’t making a profit for the company, they’re not going to keep you employed. So we need some kind of social safety net to help such people. And it turns out that such people are quite numerous, and our current system is really not adequate to help them.

But meritocracy has its pull as well. Especially when the job is really important—like surgery, not so much basketball—we really want the highest quality work. It’s not so important whether the neurosurgeon who removes your tumor worked really hard at it or found it a breeze; what we care about is getting that tumor out.

Where does this leave us?

I think we have no choice but to compromise, on both principles. We will reward both effort and accomplishment, to greater or lesser degree—perhaps varying based on circumstances. We will never be able to entirely reward accomplishment or entirely reward effort.

This is more or less what we already do in practice, so why worry about it? Well, because we don’t like to admit that it’s what we do in practice, and a lot of problems seem to stem from that.

We have people acting like billionaires are such brilliant, hard-working people just because they’re rich—because our society rewards effort, right? So they couldn’t be so successful if they didn’t work so hard, right? Right?

Conversely, we have people who denigrate the poor as lazy and stupid just because they are poor. Because it couldn’t possibly be that their circumstances were worse than yours? Or hey, even if they are genuinely less talented than you—do less talented people deserve to be homeless and starving?

We tell kids from a young age, “You can be whatever you want to be”, and “Work hard and you’ll succeed”; and these things simply aren’t true. There are limitations on what you can achieve through effort—limitations imposed by your environment, and limitations imposed by your innate talents.

I’m not saying we should crush children’s dreams; I’m saying we should help them to build more realistic dreams, dreams that can actually be achieved in the real world. And then, when they grow up, they either will actually succeed, or when they don’t, at least they won’t hate themselves for failing to live up to what you told them they’d be able to do.

If you were wondering why Millennials are so depressed, that’s clearly a big part of it: We were told we could be and do whatever we wanted if we worked hard enough, and then that didn’t happen; and we had so internalized what we were told that we thought it had to be our fault that we failed. We didn’t try hard enough. We weren’t good enough. I have spent years feeling this way—on some level I do still feel this way—and it was not because adults tried to crush my dreams when I was a child, but on the contrary because they didn’t do anything to temper them. They never told me that life is hard, and people fail, and that I would probably fail at my most ambitious goals—and it wouldn’t be my fault, and it would still turn out okay.

That’s really it, I think: They never told me that it’s okay not to be wildly successful. They never told me that I’d still be good enough even if I never had any great world-class accomplishments. Instead, they kept feeding me the lie that I would have great world-class accomplishments; and then, when I didn’t, I felt like a failure and I hated myself. I think my own experience may be particularly extreme in this regard, but I know a lot of other people in my generation who had similar experiences, especially those who were also considered “gifted” as children. And we are all now suffering from depression, anxiety, and Impostor Syndrome.

All because nobody wanted to admit that talent, effort, and success are not the same thing.

Scalability and inequality

May 15 JDN 2459715

Why are some molecules (e.g. DNA) billions of times larger than others (e.g. H2O), but all atoms are within a much narrower range of sizes (only a few hundred)?

Why are some animals (e.g. elephants) millions of times as heavy as other (e.g. mice), but their cells are basically the same size?

Why does capital income vary so much more (factors of thousands or millions) than wages (factors of tens or hundreds)?

These three questions turn out to have much the same answer: Scalability.

Atoms are not very scalable: Adding another proton to a nucleus causes interactions with all the other protons, which makes the whole atom unstable after a hundred protons or so. But molecules, particularly organic polymers such as DNA, are tremendously scalable: You can add another piece to one end without affecting anything else in the molecule, and keep on doing that more or less forever.

Cells are not very scalable: Even with the aid of active transport mechanisms and complex cellular machinery, a cell’s functionality is still very much limited by its surface area. But animals are tremendously scalable: The same exponential growth that got you from a zygote to a mouse only needs to continue a couple years longer and it’ll get you all the way to an elephant. (A baby elephant, anyway; an adult will require a dozen or so years—remarkably comparable to humans, in fact.)

Labor income is not very scalable: There are only so many hours in a day, and the more hours you work the less productive you’ll be in each additional hour. But capital income is perfectly scalable: We can add another digit to that brokerage account with nothing more than a few milliseconds of electronic pulses, and keep doing that basically forever (due to the way integer storage works, above 2^63 it would require special coding, but it can be done; and seeing as that’s over 9 quintillion, it’s not likely to be a problem any time soon—though I am vaguely tempted to write a short story about an interplanetary corporation that gets thrown into turmoil by an integer overflow error).

This isn’t just an effect of our accounting either. Capital is scalable in a way that labor is not. When your contribution to production is owning a factory, there’s really nothing to stop you from owning another factory, and then another, and another. But when your contribution is working at a factory, you can only work so hard for so many hours.

When a phenomenon is highly scalable, it can take on a wide range of outcomes—as we see in molecules, animals, and capital income. When it’s not, it will only take on a narrow range of outcomes—as we see in atoms, cells, and labor income.

Exponential growth is also part of the story here: Animals certainly grow exponentially, and so can capital when invested; even some polymers function that way (e.g. under polymerase chain reaction). But I think the scalability is actually more important: Growing rapidly isn’t so useful if you’re going to immediately be blocked by a scalability constraint. (This actually relates to the difference between r- and K- evolutionary strategies, and offers further insight into the differences between mice and elephants.) Conversely, even if you grow slowly, given enough time, you’ll reach whatever constraint you’re up against.

Indeed, we can even say something about the probability distribution we are likely to get from random processes that are scalable or non-scalable.

A non-scalable random process will generally converge toward the familiar normal distribution, a “bell curve”:

[Image from Wikipedia: By Inductiveload – self-made, Mathematica, Inkscape, Public Domain, https://commons.wikimedia.org/w/index.php?curid=3817954]

The normal distribution has most of its weight near the middle; most of the population ends up near there. This is clearly the case for labor income: Most people are middle class, while some are poor and a few are rich.

But a scalable random process will typically converge toward quite a different distribution, a Pareto distribution:

[Image from Wikipedia: By Danvildanvil – Own work, CC BY-SA 3.0, https://commons.wikimedia.org/w/index.php?curid=31096324]

A Pareto distribution has most of its weight near zero, but covers an extremely wide range. Indeed it is what we call fat tailed, meaning that really extreme events occur often enough to have a meaningful effect on the average. A Pareto distribution has most of the people at the bottom, but the ones at the top are really on top.

And indeed, that’s exactly how capital income works: Most people have little or no capital income (indeed only about half of Americans and only a third(!) of Brits own any stocks at all), while a handful of hectobillionaires make utterly ludicrous amounts of money literally in their sleep.

Indeed, it turns out that income in general is pretty close to distributed normally (or maybe lognormally) for most of the income range, and then becomes very much Pareto at the top—where nearly all the income is capital income.

This fundamental difference in scalability between capital and labor underlies much of what makes income inequality so difficult to fight. Capital is scalable, and begets more capital. Labor is non-scalable, and we only have to much to give.

It would require a radically different system of capital ownership to really eliminate this gap—and, well, that’s been tried, and so far, it hasn’t worked out so well. Our best option is probably to let people continue to own whatever amounts of capital, and then tax the proceeds in order to redistribute the resulting income. That certainly has its own downsides, but they seem to be a lot more manageable than either unfettered anarcho-capitalism or totalitarian communism.