What behavioral economics needs

Apr 16 JDN 2460049

The transition from neoclassical to behavioral economics has been a vital step forward in science. But lately we seem to have reached a plateau, with no major advances in the paradigm in quite some time.

It could be that there is work already being done which will, in hindsight, turn out to be significant enough to make that next step forward. But my fear is that we are getting bogged down by our own methodological limitations.

Neoclassical economics passed its obsession with mathematical sophistication on to us. To some extent this was inevitable; in order to impress neoclassical economists enough to convert some of them, we had to use fancy math. We had to show that we could do it their way in order to convince them why we shouldn’t—otherwise, they’d just have dismissed us the way they had dismissed psychologists for decades, as too “fuzzy-headed” to do the “hard work” of putting everything into equations.

But the truth is, putting everything into equations was never the right approach. Because human beings clearly don’t think in equations. Once we write down a utility function and get ready to take its derivative and set it equal to zero, we have already distanced ourselves from how human thought actually works.

When dealing with a simple physical system, like an atom, equations make sense. Nobody thinks that the electron knows the equation and is following it intentionally. That equation simply describes how the forces of the universe operate, and the electron is subject to those forces.

But human beings do actually know things and do things intentionally. And while an equation could be useful for analyzing human behavior in the aggregate—I’m certainly not objecting to statistical analysis—it really never made sense to say that people make their decisions by optimizing the value of some function. Most people barely even know what a function is, much less remember calculus well enough to optimize one.

Yet right now, behavioral economics is still all based in that utility-maximization paradigm. We don’t use the same simplistic utility functions as neoclassical economists; we make them more sophisticated and realistic. Yet in that very sophistication we make things more complicated, more difficult—and thus in at least that respect, even further removed from how actual human thought must operate.

The worst offender here is surely Prospect Theory. I recognize that Prospect Theory predicts human behavior better than conventional expected utility theory; nevertheless, it makes absolutely no sense to suppose that human beings actually do some kind of probability-weighting calculation in their heads when they make judgments. Most of my students—who are well-trained in mathematics and economics—can’t even do that probability-weighting calculation on paper, with a calculator, on an exam. (There’s also absolutely no reason to do it! All it does is make your decisions worse!) This is a totally unrealistic model of human thought.
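To make concrete just how demanding that calculation is, here is a minimal sketch of what Prospect Theory asks a decision-maker to compute for even a one-outcome gamble. The functional forms and parameter values are the commonly cited Tversky-Kahneman estimates, and I’ve used the simple separable weighting of the original 1979 formulation; the cumulative version is more involved still.

```python
# A minimal sketch of the calculation Prospect Theory attributes to a decision-maker.
# Functional forms and parameters (alpha, lambda, gamma) are the commonly cited
# Tversky-Kahneman estimates; they are illustrative, not a claim about any real person.

def value(x, alpha=0.88, lam=2.25):
    """S-shaped value function: concave for gains, steeper and convex for losses."""
    return x ** alpha if x >= 0 else -lam * (-x) ** alpha

def weight(p, gamma=0.61):
    """Inverse-S probability weighting: overweights small probabilities, underweights large ones."""
    return p ** gamma / (p ** gamma + (1 - p) ** gamma) ** (1 / gamma)

def prospect_value(prospect):
    """Sum of weighted values over the (outcome, probability) pairs of a simple prospect."""
    return sum(weight(p) * value(x) for x, p in prospect)

# A 1% chance of $500 versus a sure $5 -- identical expected values, but the
# overweighted 1% makes the gamble come out far ahead:
print(prospect_value([(500, 0.01)]))  # roughly 13.1
print(prospect_value([(5, 1.0)]))     # roughly 4.1
```

Try doing that in your head at the checkout counter.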

This is not to say that human beings are stupid. We are still smarter than any other entity in the known universe—computers are rapidly catching up, but they haven’t caught up yet. It is just that whatever makes us smart must not be easily expressible as an equation that maximizes a function. Our thoughts are bundles of heuristics, each of which may be individually quite simple, but all of which together make us capable of not only intelligence, but something computers still sorely, pathetically lack: wisdom. Computers optimize functions better than we ever will, but we still make better decisions than they do.

I think that what behavioral economics needs now is a new unifying theory of these heuristics, which accounts for not only how they work, but how we select which one to use in a given situation, and perhaps even where they come from in the first place. This new theory will of course be complex; there are a lot of things to explain, and human behavior is a very complex phenomenon. But it shouldn’t be—mustn’t be—reliant on sophisticated advanced mathematics, because most people can’t do advanced mathematics (almost by construction—we would call it something different otherwise). If your model assumes that people are taking derivatives in their heads, your model is already broken. 90% of the world’s people can’t take a derivative.

I guess it could be that our cognitive processes in some sense operate as if they are optimizing some function. This is commonly posited for the human motor system, for instance; clearly baseball players aren’t actually solving differential equations when they throw and catch balls, but the trajectories that balls follow do in fact obey such equations, and the reliability with which baseball players can catch and throw suggests that they are in some sense acting as if they can solve them.

But I think that a careful analysis of even this classic example reveals some deeper insights that should call this whole notion into question. How do baseball players actually do what they do? They don’t seem to be calculating at all—in fact, if you asked them to try to calculate while they were playing, it would destroy their ability to play. They learn. They engage in practiced motions, acquire skills, and notice patterns. I don’t think there is anywhere in their brains that is actually doing anything like solving a differential equation. It’s all a process of throwing and catching, throwing and catching, over and over again, watching and remembering and subtly adjusting.

One thing that is particularly interesting to me about that process is that it is astonishingly flexible. It doesn’t really seem to matter what physical process you are interacting with; as long as it is sufficiently orderly, such a method will allow you to predict and ultimately control that process. You don’t need to know anything about differential equations in order to learn in this way—and, indeed, I really can’t emphasize this enough, baseball players typically don’t.

In fact, learning is so flexible that it can even perform better than calculation. The usual differential equations most people would think to use to predict the throw of a ball would assume ballistic motion in a vacuum, which is absolutely not what a curveball is. In order to throw a curveball, the ball must interact with the air, and it must be launched with spin; curving a baseball relies very heavily on the Magnus Effect. I think it’s probably possible to construct an equation that would fully predict the motion of a curveball, but it would be a tremendously complicated one, and might not even have an exact closed-form solution. In fact, I think it would require solving the Navier-Stokes equations, for which there is an outstanding Millennium Prize. Since the viscosity of air is very low, maybe you could get away with approximating using the Euler fluid equations.
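To give a sense of how quickly this blows up, here is a rough numerical sketch of the engineer’s shortcut: not the full fluid-dynamics treatment, just empirical drag and Magnus terms bolted onto Newton’s laws and integrated step by step. The drag and lift coefficients and the spin numbers are ballpark figures chosen purely for illustration.

```python
# A rough sketch of predicting a curveball numerically: gravity, quadratic drag,
# and a Magnus force integrated step by step. All coefficients are ballpark
# figures for a baseball, used purely for illustration.

import numpy as np

def simulate_pitch(v0, spin, dt=0.001, t_max=1.0):
    """Step-by-step integration of a pitch; x is toward the plate, z is up (meters)."""
    m, radius = 0.145, 0.0366             # kg, m (roughly a regulation baseball)
    rho, area = 1.2, np.pi * radius ** 2  # air density (kg/m^3), cross-section (m^2)
    C_d, C_l = 0.35, 0.20                 # rough drag and lift coefficients
    g = np.array([0.0, 0.0, -9.81])

    pos, vel = np.zeros(3), np.array(v0, float)
    omega = np.array(spin, float)         # spin axis times angular speed (rad/s)
    path = [pos.copy()]
    for _ in range(int(t_max / dt)):
        speed = np.linalg.norm(vel)
        drag = -0.5 * rho * C_d * area * speed * vel
        # Magnus force: magnitude ~ 0.5*rho*C_l*A*v^2, directed along omega x v
        direction = np.cross(omega, vel) / (np.linalg.norm(omega) * speed + 1e-12)
        magnus = 0.5 * rho * C_l * area * speed ** 2 * direction
        vel = vel + (g + (drag + magnus) / m) * dt
        pos = pos + vel * dt
        path.append(pos.copy())
        if pos[0] > 18.4:                 # roughly the distance to home plate
            break
    return np.array(path)

# An ~85 mph pitch with heavy topspin: the Magnus force pulls it sharply downward.
print(simulate_pitch(v0=[38.0, 0.0, 1.0], spin=[0.0, 200.0, 0.0])[-1])
```

And even this is the cheap approximation; no batter or pitcher is doing anything remotely like it.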

To be fair, a learning process that is adapting to a system that obeys an equation will yield results that become an ever-closer approximation of that equation. And it is in that sense that a baseball player can be said to be acting as if solving a differential equation. But this relies heavily on the system in question being one that obeys an equation—and when it comes to economic systems, is that even true?

What if the reason we can’t find a simple set of equations that accurately describe the economy (as opposed to equations of ever-escalating complexity that still utterly fail to describe the economy) is that there isn’t one? What if the reason we can’t find the utility function people are maximizing is that they aren’t maximizing anything?

What behavioral economics needs now is a new approach, something less constrained by the norms of neoclassical economics and more aligned with psychology and cognitive science. We should be modeling human beings based on how they actually think, not some weird mathematical construct that bears no resemblance to human reasoning but is designed to impress people who are obsessed with math.

I’m of course not the first person to have suggested this. I probably won’t be the last, or even the one people listen to most. But I hope that I might get at least a few more people to listen to it, because I have gone through the mathematical gauntlet and earned my bona fides. It is too easy to dismiss this kind of reasoning from people who don’t actually understand advanced mathematics. But I do understand differential equations—and I’m telling you, that’s not how people think.

The vector geometry of value change

Post 239: May 20 JDN 2458259

This post is one of those where I’m trying to sort out my own thoughts on an ongoing research project, so it’s going to be a bit more theoretical than most, but I’ll try to spare you the mathematical details.

People often change their minds about things; that should be obvious enough. (Maybe it’s not as obvious as it might be, as the brain tends to erase its prior beliefs as wastes of data storage space.)

Most of the ways we change our minds are fairly minor: We get corrected about Napoleon’s birthdate, or learn that George Washington never actually chopped down any cherry trees, or look up the actual weight of an average African elephant and are surprised.

Sometimes we change our minds in larger ways: We realize that global poverty and violence are actually declining, when we thought they were getting worse; or we learn that climate change is actually even more dangerous than we thought.

But occasionally, we change our minds in an even more fundamental way: We actually change what we care about. We convert to a new religion, or change political parties, or go to college, or just read some very compelling philosophy books, and come out of it with a whole new value system.

Often we don’t anticipate that our values are going to change. That is important and interesting in its own right, but I’m going to set it aside for now, and look at a different question: What about the cases where we know our values are going to change?
Can it ever be rational for someone to choose to adopt a new value system?

Yes, it can—and I can put quite tight constraints on precisely when.

Here’s the part where I hand-wave the math, but imagine for a moment there are only two goods in the world that anyone would care about. (This is obviously vastly oversimplified, but it’s easier to think in two dimensions to make the argument, and it generalizes to n dimensions easily from there.) Maybe you choose a job caring only about money and integrity, or design policy caring only about security and prosperity, or choose your diet caring only about health and deliciousness.

I can then represent your current state as a vector, a two-dimensional object with a length and a direction. The length describes how happy you are with your current arrangement. The direction describes your values—the direction of the vector characterizes the trade-off in your mind of how much you care about each of the two goods. If your vector is pointed almost entirely parallel with health, you don’t much care about deliciousness. If it’s pointed mostly at integrity, money isn’t that important to you.

This diagram shows your current state as a green vector.

[Figure: vector1]

Now suppose you have the option of taking some action that will change your value system. If that’s all it would do and you knew that, you wouldn’t accept it. You would be no better off, and your value system would be different, which is bad from your current perspective. So here, you would not choose to move to the red vector:

[Figure: vector2]

But suppose that the action would change your value system, and make you better off. Now the red vector is longer than the green vector. Should you choose the action?

[Figure: vector3]

It’s not obvious, right? From the perspective of your new self, you’ll definitely be better off, and that seems good. But your values will change, and maybe you’ll start caring about the wrong things.

I realized that the right question to ask is whether you’ll be better off from your current perspective. If you and your future self both agree that this is the best course of action, then you should take it.

The really cool part is that (hand-waving the math again) it’s possible to work this out as a projection of the new vector onto the old vector. A large change in values will be reflected as a large angle between the two vectors; to compensate for that you need a large change in length, reflecting a greater improvement in well-being.

If the projection of the new vector onto the old vector is longer than the old vector itself, you should accept the value change.

[Figure: vector4]

If the projection of the new vector onto the old vector is shorter than the old vector, you should not accept the value change.

[Figure: vector5]

This captures the trade-off between increased well-being and changing values in a single number. It fits the simple intuitions that being better off is good, and changing values more is bad—but more importantly, it gives us a way of directly comparing the two on the same scale.
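For those who want the math made explicit, here is a minimal sketch of the rule in code; the numbers for “money” and “integrity” are invented purely for illustration.

```python
# A minimal sketch of the projection rule: a state is a 2-D vector whose length
# is well-being and whose direction encodes values. Accept a change iff the
# projection of the proposed vector onto the current one exceeds the current length.

import numpy as np

def accept_value_change(current, proposed):
    """True iff (proposed . current) / |current| > |current|."""
    current, proposed = np.asarray(current, float), np.asarray(proposed, float)
    return np.dot(proposed, current) / np.linalg.norm(current) > np.linalg.norm(current)

current = [4.0, 1.0]                             # (money, integrity): mostly money-oriented
print(accept_value_change(current, [3.5, 3.5]))  # modest shift in values, big gain: True
print(accept_value_change(current, [1.0, 4.5]))  # large shift in values, modest gain: False
```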

This is a very simple model with some very profound implications. One is that certain value changes are impossible in a single step: If a value change would require you to take on values that are completely orthogonal or diametrically opposed to your own, no increase in well-being will be sufficient.

It doesn’t matter how long I make this red vector; the projection onto the green vector will always be zero. If all you care about is money, no amount of integrity will entice you to change.

[Figure: vector6]

But a value change that was impossible in a single step can be feasible, even easy, if conducted over a series of smaller steps. Here I’ve taken that same impossible transition, and broken it into five steps that now make it feasible. By offering a bit more money for more integrity, I’ve gradually weaned you into valuing integrity above all else:

[Figure: vector7]
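Here is the same idea sketched numerically, with invented step sizes: each small step (a modest rotation of values plus a 10% gain in well-being) passes the projection test at the time it is taken, yet after five steps the vector has rotated a full 90 degrees.

```python
# A sketch of the stepwise transition: rotate values a little and raise well-being
# a little at each step, and every step passes the projection test, even though
# the end state is orthogonal to the starting one. Numbers are illustrative.

import numpy as np

def accept(current, proposed):
    """The projection rule from the sketch above."""
    return np.dot(proposed, current) / np.linalg.norm(current) > np.linalg.norm(current)

def rotate_and_grow(v, angle, growth):
    """Rotate a 2-D vector by `angle` radians and scale its length by `growth`."""
    c, s = np.cos(angle), np.sin(angle)
    return growth * np.array([c * v[0] - s * v[1], s * v[0] + c * v[1]])

state = np.array([5.0, 0.0])        # caring only about money
for _ in range(5):                  # five small steps: 18 degrees and +10% each
    proposal = rotate_and_grow(state, np.pi / 10, growth=1.10)
    assert accept(state, proposal)  # each individual step is accepted
    state = proposal

print(state)                        # now pointed (almost) entirely at integrity, and longer
```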

This provides a formal justification for the intuitive sense many people have of a “moral slippery slope” (commonly regarded as a fallacy). If you make small concessions to an argument that end up changing your value system slightly, and continue to do so many times, you could end up with radically different beliefs at the end, even diametrically opposed to your original beliefs. Each step was rational at the time you took it, but because you changed yourself in the process, you ended up somewhere you would not have wanted to go.

This is not necessarily a bad thing, however. If the reason you made each of those changes was actually a good one—you were provided with compelling evidence and arguments to justify the new beliefs—then the whole transition does turn out to be a good thing, even though you wouldn’t have thought so at the time.

This also allows us to formalize the notion of “inferential distance”: the inferential distance is the number of steps of value change required to make someone understand your point of view. It’s a function of both the difference in values and the difference in well-being between their point of view and yours.

Another key insight is that if you want to persuade someone to change their mind, you need to do it slowly, with small changes repeated many times, and you need to benefit them at each step. You can only persuade someone to change their mind if they end up better off at each step.

Is this an endorsement of wishful thinking? Not if we define “well-being” in the proper way. It can make me better off in a deep sense to realize that my wishful thinking was incorrect, so that I realize what must be done to actually get the good things I thought I already had.  It’s not necessary to appeal to material benefits; it’s necessary to appeal to current values.

But it does support the notion that you can’t persuade someone by belittling them. You won’t convince people to join your side by telling them that they are defective and bad and should feel guilty for being who they are.

If that seems obvious, well, maybe you should talk to some of the people who are constantly pushing “White privilege”. If you focused on how reducing racism would make people—even White people—better off, you’d probably be more effective. In some cases there would be direct material benefits: Racism creates inefficiency in markets that reduces overall output. But in other cases, sure, maybe there’s no direct benefit for the person you’re talking to; but you can talk about other sorts of benefits, like what sort of world they want to live in, or how proud they would feel to be part of the fight for justice. You can say all you want that they shouldn’t need this kind of persuasion, they should already believe and do the right thing—and you might even be right about that, in some ultimate sense—but do you want to change their minds or not? If you actually want to change their minds, you need to meet them where they are, make small changes, and offer benefits at each step.

If you don’t, you’ll just keep on projecting a vector orthogonally, and you’ll keep ending up with zero.

What good are macroeconomic models? How could they be better?

Dec 11, JDN 2457734

One thing that I don’t think most people know, but which is immediately obvious to any student of economics at the college level or above, is that there is a veritable cornucopia of different macroeconomic models. There are growth models (the Solow model, the Harrod-Domar model, the Ramsey model), monetary policy models (IS-LM, aggregate demand-aggregate supply), trade models (the Mundell-Fleming model, the Heckscher-Ohlin model), large-scale computational models (dynamic stochastic general equilibrium, agent-based computational economics), and I could go on.

This immediately raises the question: What are all these models for? What good are they?

A cynical view might be that they aren’t useful at all, that this is all false mathematical precision which makes economics persuasive without making it accurate or useful. And with such a proliferation of models and contradictory conclusions, I can see why such a view would be tempting.

But many of these models are useful, at least in certain circumstances. They aren’t completely arbitrary. Indeed, one of the litmus tests of the last decade has been how well the models held up against the events of the Great Recession and following Second Depression. The Keynesian and cognitive/behavioral models did rather well, albeit with significant gaps and flaws. The Monetarist, Real Business Cycle, and most other neoclassical models failed miserably, as did Austrian and Marxist notions so fluid and ill-defined that I’m not sure they deserve to even be called “models”. So there is at least some empirical basis for deciding what assumptions we should be willing to use in our models. Yet even if we restrict ourselves to Keynesian and cognitive/behavioral models, there are still a great many to choose from, which often yield inconsistent results.

So let’s compare with a science that is uncontroversially successful: Physics. How do mathematical models in physics compare with mathematical models in economics?

Well, there are still a lot of models, first of all. There’s the Bohr model, the Schrodinger equation, the Dirac equation, Newtonian mechanics, Lagrangian mechanics, Bohmian mechanics, Maxwell’s equations, Faraday’s law, Coulomb’s law, the Einstein field equations, the Minkowski metric, the Schwarzschild metric, the Rindler metric, Feynman-Wheeler theory, the Navier-Stokes equations, and so on. So a cornucopia of models is not inherently a bad thing.

Yet, there is something about physics models that makes them more reliable than economics models.

Partly it is that the systems physicists study are literally two dozen orders of magnitude or more smaller and simpler than the systems economists study. Their task is inherently easier than ours.

But it’s not just that; their models aren’t just simpler—actually they often aren’t. The Navier-Stokes equations are a lot more complicated than the Solow model. They’re also clearly a lot more accurate.

The feature that models in physics seem to have that models in economics do not is something we might call nesting, or maybe consistency. Models in physics don’t come out of nowhere; you can’t just make up your own new model based on whatever assumptions you like and then start using it—which you very much can do in economics. Models in physics are required to fit consistently with one another, and usually inside one another, in the following sense:

The Dirac equation strictly generalizes the Schrodinger equation, which strictly generalizes the Bohr model. Bohmian mechanics is consistent with quantum mechanics, which strictly generalizes Lagrangian mechanics, which generalizes Newtonian mechanics. The Einstein field equations are consistent with Maxwell’s equations and strictly generalize the Minkowski, Schwarzschild, and Rindler metrics. Maxwell’s equations strictly generalize Faraday’s law and Coulomb’s law.

In other words, there are a small number of canonical models—the Dirac equation, Maxwell’s equations and the Einstein field equation, essentially—inside which all other models are nested. The simpler models like Coulomb’s law and Newtonian mechanics are not contradictory with these canonical models; they are contained within them, subject to certain constraints (such as macroscopic systems far below the speed of light).

This is something I wish more people understood (I blame Kuhn for confusing everyone about what paradigm shifts really entail): Einstein did not overturn Newton’s laws; he extended them to domains where they had previously failed to apply.

This is why it is sensible to say that certain theories in physics are true; they are the canonical models that underlie all known phenomena. Other models can be useful, but not because we are relativists about truth or anything like that; Newtonian physics is a very good approximation of the Einstein field equations at the scale of many phenomena we care about, and is also much more mathematically tractable. If we ever find ourselves in situations where Newton’s equations no longer apply—near a black hole, traveling near the speed of light—then we know we can fall back on the more complex canonical model; but when the simpler model works, there’s no reason not to use it.

There are still very serious gaps in the knowledge of physics; in particular, there is a fundamental gulf between quantum mechanics and the Einstein field equations that has been unresolved for decades. A solution to this “quantum gravity problem” would be essentially a guaranteed Nobel Prize. So even a canonical model can be flawed, and can be extended or improved upon; the result is then a new canonical model which we now regard as our best approximation to truth.

Yet the contrast with economics is still quite clear. We don’t have one or two or even ten canonical models to refer back to. We can’t say that the Solow model is an approximation of some greater canonical model that works for these purposes—because we don’t have that greater canonical model. We can’t say that agent-based computational economics is approximately right, because we have nothing to approximate it to.

I went into economics thinking that neoclassical economics needed a new paradigm. I have now realized something much more alarming: Neoclassical economics doesn’t really have a paradigm. Or if it does, it’s a very informal paradigm, one that is expressed by the arbitrary judgments of journal editors, not one that can be written down as a series of equations. We assume perfect rationality, except when we don’t. We assume constant returns to scale, except when that doesn’t work. We assume perfect competition, except when that doesn’t get the results we wanted. The agents in our models are infinite identical psychopaths, and they are exactly as rational as needed for the conclusion I want.

This is quite likely why there is so much disagreement within economics. When you can permute the parameters however you like with no regard to a canonical model, you can more or less draw whatever conclusion you want, especially if you aren’t tightly bound to empirical evidence. I know a great many economists who are sure that raising minimum wage results in large disemployment effects, because the models they believe in say that it must, even though the empirical evidence has been quite clear that these effects are small if they are present at all. If we had a canonical model of employment that we could calibrate to the empirical evidence, that couldn’t happen anymore; there would be a coefficient I could point to that would refute their argument. But when every new paper comes with a new model, there’s no way to do that; one set of assumptions is as good as another.

Indeed, as I mentioned in an earlier post, a remarkable number of economists seem to embrace this relativism. “There is no true model,” they say; “we do what is useful.” Recently I encountered a book by the eminent economist Deirdre McCloskey which, though I confess I haven’t read it in its entirety, appears to be trying to argue that economics is just a meaningless language game that doesn’t have or need to have any connection with actual reality. (If any of you have read it and think I’m misunderstanding it, please explain. As it is I haven’t bought it for a reason any economist should respect: I am disinclined to incentivize such writing.)

Creating such a canonical model would no doubt be extremely difficult. Indeed, it is a task that would require the combined efforts of hundreds of researchers and could take generations to achieve. The true equations that underlie the economy could be totally intractable even for our best computers. But quantum mechanics wasn’t built in a day, either. The key challenge here lies in convincing economists that this is something worth doing—that if we really want to be taken seriously as scientists we need to start acting like them. Scientists believe in truth, and they are trying to find it out. While not immune to tribalism or ideology or other human limitations, they resist them as fiercely as possible, always turning back to the evidence above all else. And in their combined strivings, they attempt to build a grand edifice, a universal theory to stand the test of time—a canonical model.