Social science is broken. Can we fix it?

May 16 JDN 2459349

Social science is broken. I am of course not the first to say so. The Atlantic recently published an article outlining the sorry state of scientific publishing, and several years ago Slate Star Codex published a lengthy post (with somewhat harsher language than I generally use on this blog) showing how parapsychology, despite being obviously false, can still meet the standards that most social science is expected to meet. I myself discussed the replication crisis in social science on this very blog a few years back.

I was pessimistic then about the incentives of scientific publishing being fixed any time soon, and I am even more pessimistic now.

Back then I noted that journals are often run by for-profit corporations that care more about getting attention than getting the facts right, university administrations are incompetent and top-heavy, and publish-or-perish creates cutthroat competition without providing incentives for genuinely rigorous research. But these are widely known facts, even if so few in the scientific community seem willing to face up to them.

Now I am increasingly concerned that the reason we aren’t fixing this system is that the people with the most power to fix it don’t want to. (Indeed, as I have learned more about political economy I have come to believe this more and more about all the broken institutions in the world. American democracy has its deep flaws because politicians like it that way. China’s government is corrupt because that corruption is profitable for many of China’s leaders. Et cetera.)

I know economics best, so that is where I will focus; but most of what I’m saying here would also apply to other social sciences such as sociology and psychology. (Indeed it was psychology that published Daryl Bem.)

Rogoff and Reinhart’s 2010 article “Growth in a Time of Debt”, which was a weak correlation-based argument to begin with, was later revealed (by an intrepid grad student! His name is Thomas Herndon.) to be based upon deep, fundamental errors. Yet the article remains published, without any notice of retraction or correction, in the American Economic Review, probably the most prestigious journal in economics (and undeniably in the vaunted “Top Five”). And the paper itself was widely used by governments around the world to justify massive austerity policies—which backfired with catastrophic consequences.

Why wouldn’t the AER remove the article from their website? Or issue a retraction? Or at least add a note on the page explaining the errors? If their primary concern were scientific truth, they would have done something like this. Their failure to do so is a silence that speaks volumes, a hound that didn’t bark in the night.

It’s rational, if incredibly selfish, for Rogoff and Reinhart themselves to not want a retraction. It was one of their most widely-cited papers. But why wouldn’t AER’s editors want to retract a paper that had been so embarrassingly debunked?

And so I came to realize: These are all people who have succeeded in the current system. Their work is valued, respected, and supported by the system of scientific publishing as it stands. If we were to radically change that system, as we would necessarily have to do in order to re-align incentives toward scientific truth, they would stand to lose, because they would suddenly be competing against other people who are not as good at satisfying the magical 0.05, but are in fact at least as good—perhaps even better—actual scientists than they are.

I know how they would respond to this criticism: I’m someone who hasn’t succeeded in the current system, so I’m biased against it. This is true, to some extent. Indeed, I take it quite seriously, because while tenured professors stand to lose prestige, they can’t really lose their jobs even if there is a sudden flood of far superior research. So in directly economic terms, we would expect the bias against the current system among grad students, adjuncts, and assistant professors to be larger than the bias in favor of the current system among tenured professors and prestigious researchers.

Yet there are other motives aside from money: Norms and social status are among the most powerful motivations human beings have, and these biases are far stronger in favor of the current system—even among grad students and junior faculty. Grad school is many things, some good, some bad; but one of them is a ritual gauntlet that indoctrinates you into the belief that working in academia is the One True Path, without which your life is a failure. If your claim is that grad students are upset at the current system because we overestimate our own qualifications and are feeling sour grapes, you need to explain our prevalence of Impostor Syndrome. By and large, grad students don’t overestimate our abilities—we underestimate them. If we think we’re as good at this as you are, that probably means we’re better. Indeed I have little doubt that Thomas Herndon is a better economist than Kenneth Rogoff will ever be.

I have additional evidence that insider bias is important here: When Paul Romer—Nobel laureate—left academia he published an utterly scathing criticism of the state of academic macroeconomics. That is, once he had escaped the incentives toward insider bias, he turned against the entire field.

Romer pulls absolutely no punches: He literally compares the standard methods of DSGE models to “phlogiston” and “gremlins”. And the paper is worth reading, because it’s obviously entirely correct: every single punch lands on target. It’s also a pretty fun read, at least if you have the background knowledge to appreciate the dry in-jokes. (Much like “Transgressing the Boundaries: Toward a Transformative Hermeneutics of Quantum Gravity.” I still laugh out loud every time I read the phrase “hegemonic Zermelo-Fraenkel axioms”, though I realize most people would be utterly nonplussed. For the uninitiated, these are the Zermelo-Fraenkel axioms. Can’t you just see the colonialist imperialism in sentences like “\forall x \forall y (\forall z, z \in x \iff z \in y) \implies x = y”?)

In other words, the Upton Sinclair Principle seems to be applying here: “It is difficult to get a man to understand something when his salary depends upon not understanding it.” The people with the most power to change the system of scientific publishing are journal editors and prestigious researchers, and they are the people for whom the current system is running quite swimmingly.

It’s not that good science can’t succeed in the current system—it often does. In fact, I’m willing to grant that it almost always does, eventually. When the evidence has mounted for long enough and the most adamant of the ancien regime finally retire or die, then, at last, the paradigm will shift. But this process takes literally decades longer than it should. In principle, a wrong theory can be invalidated by a single rigorous experiment. In practice, it generally takes about 30 years of experiments, most of which don’t get published, until the powers that be finally give in.

This delay has serious consequences. It means that many of the researchers working on the forefront of a new paradigm—precisely the people that the scientific community ought to be supporting most—will suffer from being unable to publish their work, get grant funding, or even get hired in the first place. It means that not only will good science take too long to win, but that much good science will never get done at all, because the people who wanted to do it couldn’t find the support they needed to do so. This means that the delay is in fact much longer than it appears: Because it took 30 years for one good idea to take hold, all the other good ideas that would have sprung from it in that time will be lost, at least until someone in the future comes up with them.

I don’t think I’ll ever forget it: At the AEA conference a few years back, I went to a luncheon celebrating Richard Thaler, one of the founders of behavioral economics, whom I regard as one of the top 5 greatest economists of the 20th century (I’m thinking something like, “Keynes > Nash > Thaler > Ramsey > Schelling”). Yes, now he is being rightfully recognized for his seminal work; he won a Nobel, and he has an endowed chair at Chicago, and he got an AEA luncheon in his honor among many other accolades. But it was not always so. Someone speaking at the luncheon offhandedly remarked something like, “Did we think Richard would win a Nobel? Honestly most of us weren’t sure he’d get tenure.” Most of the room laughed; I had to resist the urge to scream. If Richard Thaler wasn’t certain to get tenure, then the entire system is broken. This would be like finding out that Erwin Schrödinger or Niels Bohr wasn’t sure he would get tenure in physics.

A. Gary Shilling, a renowned Wall Street economist (read: One Who Has Turned to the Dark Side), once remarked (the quote is often falsely attributed to Keynes): “markets can remain irrational a lot longer than you and I can remain solvent.” In the same spirit, I would say this: the scientific community can remain wrong a lot longer than you and I can extend our graduate fellowships and tenure clocks.

Did the World Bank modify its ratings to manipulate the outcome of an election in Chile?

Jan 21 JDN 2458140

(By the way, my birthday is January 19. I can’t believe I’m turning 30.)

This is a fairly obscure news item, so you may have missed it. It should be bigger news than it is.
I can’t fault the New York Times for having its front page focus mainly on the false missile alert that was issued to some people in Hawaii; a false alarm of nuclear attack definitely is the most important thing that could be going on in the world, short of course of actual nuclear war.

CNN, on the other hand, is focused entirely on Trump. When I first wrote this post, they were also focused on Trump, mainly interested in asking whether Trump’s comments about “immigrants from shithole countries” were racist. My answer: Yes, but not because he said the countries were “shitholes”. That was crude, yes, but not altogether inaccurate. Countries like Syria, Afghanistan, and Sudan are, by any objective measure, terrible places. His comments were racist because they attributed that awfulness to the people leaving these countries. But in fact we have a word for immigrants who flee terrible places seeking help and shelter elsewhere: Refugees. We call those people refugees. There are over 10 million refugees in the world today, most of them from Syria.

So anyway, here’s the news item you should have heard about but probably didn’t: The Chief Economist of the World Bank (Paul Romer, whom, coincidentally, I mentioned in my post about DSGE models) has opened an investigation into the possibility that the World Bank’s ratings of economic freedom were intentionally manipulated in order to tilt a Presidential election in Chile.

The worst part is, it may have worked: Chile’s “Doing Business” rating consistently fell under President Michelle Bachelet and rose under President Sebastian Piñera, and Piñera won the most recent election. Was that the reason he won? Who knows? I’m still not entirely clear on how we ended up with President Trump. But it very likely contributed.

The World Bank is supposed to be an impartial institution representing the interests of global economic development. I’m not naive; I recognize that no human institution is perfect, and there will always be competing political and economic interests within any complex institution. Development economists are subject to cognitive biases just like anyone else. If this was the work of a handful of economic analysts (or if Romer turns out to be wrong and the changes in statistical methodology were totally reasonable), so be it; let’s make sure that the bias is corrected and the analysts involved are punished.

But I fear that the rot may run deeper than this. The World Bank is effectively a form of unelected international government. It has been accused of inherent pro-capitalist (or even racist) bias due to the fact that Western governments are overrepresented in its governance, but I actually consider that accusation unfair: There are very good reasons to make sure that your international institutions are managed by liberal democracies, and it turns out that most of the world’s liberal democracies are Western. The fact that the US, France, Germany, and the UK make most of the decisions is entirely sensible: Those are in fact the countries we should want making global decisions.

China is not underrepresented, because China is not a democracy and doesn’t deserve to be represented. They are already more represented in the World Bank than they should be, because representing the PRC is not actually representing the interests of the people of China. Russia and Saudi Arabia are undeniably overrepresented. India is underrepresented; they should be complaining. Some African democracies, such as Namibia and Botswana, would also have a legitimate claim to underrepresentation. But I don’t lose any sleep over the fact that Zimbabwe and Iran aren’t getting votes in the World Bank. If and when those countries actually start representing their people, then we can talk about giving them representation in world government. I don’t see how refusing to give international authority to dictators and theocrats constitutes racism or pro-capitalist bias.

That said, there are other reasons to think that the World Bank might actually have some sort of pro-capitalist bias. The World Bank was instrumental in forming the Washington Consensus, which opened free trade and increased economic growth worldwide, but also exposed many poor countries to risk from deregulated financial markets and undermined social safety nets through fiscal austerity programs. They weren’t wrong to want more free trade, and many of their reforms did make sense; but they were at best wildly overconfident in their policy prescriptions, and at worst willing to sacrifice people in poor countries at the altar of bank profits. World poverty has in fact fallen by about half since 1990, and the World Bank has a lot to do with that. But things may have gone faster and smoother if they hadn’t insisted on removing so many financial regulations so quickly without clear forecasts of what would happen. I don’t share Jason Hickel’s pessimistic view that the World Bank’s failures were intentional acts toward an ulterior agenda, but I can see how it begins to look that way when they keep failing the same ways over and over again. (I instead invoke Hanlon’s razor: “Never attribute to malice that which is adequately explained by stupidity.”)

There are also reports of people facing retaliation for criticizing World Bank projects, including those within the World Bank who raise ethical concerns. If this was politically-motivated data manipulation, there may have been people who saw it happening, but were afraid to say anything for fear of being fired or worse.

And Chile in particular has reason to be suspicious. The World Bank suddenly started giving loans to Chile when Augusto Pinochet took power (the CIA denies supporting the coup, by the way—though, given the source, I can understand why one would take that with a grain of salt), and did so under the explicit reasoning that an authoritarian capitalist regime was somehow “more trustworthy” than a democratic socialist regime. Even in the narrow sense of financial creditworthiness that seems difficult to defend; the World Bank knew almost nothing about what kind of government Pinochet was going to create, and in fact despite the so-called “Miracle of Chile”, rapid economic growth in Chile didn’t really happen until the 1990s, after Chile became a democratic capitalist regime.

What I’m really getting at here is that the World Bank has a lot to answer for. I am prepared to believe that most of these actions were honest mistakes or ideological blinders, rather than corruption or cruelty; but even so, when millions of lives are at stake, even honest mistakes aren’t so forgivable. They should be looking for ways to improve their internal governance to make sure that mistakes are caught and corrected quickly. They should be constantly vigilant for biases—either intentional or otherwise—that might seep into their research. Error should be met with immediate correction and public apology; malfeasance should be met with severe punishment.

Perhaps Romer’s investigation actually signals a shift toward such a policy. If so, this is a very good thing. If only we had done this, say, thirty years ago.

“DSGE or GTFO”: Macroeconomics took a wrong turn somewhere

Dec 31, JDN 2458119

“The state of macro is good,” wrote Olivier Blanchard, in August 2008. This is rather like the turkey who is so pleased with how the farmer has been feeding him lately, the day before Thanksgiving.

It’s not easy to say exactly where macroeconomics went wrong, but I think Paul Romer is right when he makes the analogy between DSGE (dynamic stochastic general equilibrium) models and string theory. They are mathematically complex and difficult to understand, and people can make their careers by being the only ones who grasp them; therefore they must be right! Never mind if they have no empirical support whatsoever.

To be fair, DSGE models are at least a little better than string theory; they can at least be fit to real-world data, which is better than string theory can say. But being fit to data and actually predicting data are fundamentally different things, and DSGE models typically forecast no better than far simpler models without their bold assumptions. You don’t need to assume all this stuff about a “representative agent” maximizing a well-defined utility function, or an Euler equation (that doesn’t even fit the data), or this ever-proliferating list of “random shocks” that end up taking up all the degrees of freedom your model was supposed to explain. Just regressing the variables on a few years of previous values of each other (a “vector autoregression” or VAR) generally gives you an equally good forecast. The fact that these models can be made to fit the data well if you add enough degrees of freedom doesn’t actually make them good models. As von Neumann warned us, with enough free parameters, you can fit an elephant.
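For anyone who has never seen one, here is roughly how little machinery a VAR needs. This is only a sketch under my own assumptions: the function names (fit_var, forecast_next, simulate_var1) are made up for illustration, the data are simulated rather than real macro series, and a serious application would use a proper econometrics package and think hard about lag selection and stationarity.

```python
import numpy as np


def fit_var(data, lags):
    """Fit a VAR(lags) by ordinary least squares.

    data has shape (T, k): one row per period, one column per variable.
    Returns (intercept, coefs), where coefs[j - 1] is the k-by-k matrix
    multiplying the observation j periods back.
    """
    T, k = data.shape
    # Regressors: a constant, then every variable at lag 1, lag 2, ...
    lagged = [data[lags - j:T - j] for j in range(1, lags + 1)]
    X = np.hstack([np.ones((T - lags, 1))] + lagged)
    Y = data[lags:]
    B, *_ = np.linalg.lstsq(X, Y, rcond=None)
    intercept = B[0]
    # Rows of B are regressors; transpose so that y_t = A_j @ y_{t-j}.
    coefs = B[1:].reshape(lags, k, k).transpose(0, 2, 1)
    return intercept, coefs


def forecast_next(data, intercept, coefs):
    """One-step-ahead forecast from the most recent rows of data."""
    pred = intercept.copy()
    for j, A in enumerate(coefs, start=1):
        pred = pred + A @ data[-j]
    return pred


def simulate_var1(A, T, seed=0):
    """Simulated data from y_t = A y_{t-1} + noise (illustration only)."""
    rng = np.random.default_rng(seed)
    y = np.zeros((T, A.shape[0]))
    for t in range(1, T):
        y[t] = A @ y[t - 1] + rng.normal(size=A.shape[0])
    return y


if __name__ == "__main__":
    A_true = np.array([[0.5, 0.1], [0.0, 0.3]])  # made-up coefficients
    series = simulate_var1(A_true, T=5000)
    intercept, coefs = fit_var(series, lags=1)
    print(np.round(coefs[0], 2))  # should land close to A_true
    print(forecast_next(series, intercept, coefs))
```

There is no representative agent and no Euler equation anywhere in it, just one least-squares regression. Note that each extra lag adds k-squared more coefficients, so even this simple model will happily fit an elephant if you let it.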

But really what bothers me is not the DSGE but the GTFO (“get the [expletive] out”); it’s not that DSGE models are used, but that it’s almost impossible to get published as a macroeconomic theorist using anything else. Defenders of DSGE typically don’t even argue anymore that it is good; they argue that there are no credible alternatives. They characterize their opponents as “dilettantes” who aren’t opposing DSGE because we disagree with it; no, it must be because we don’t understand it. (Also, regarding that post, I’d just like to note that I now officially satisfy the Athreya Axiom of Absolute Arrogance: I have passed my qualifying exams in a top-50 economics PhD program. Yet my enmity toward DSGE has, if anything, only intensified.)

Of course, that argument only makes sense if you haven’t been actively suppressing all attempts to formulate an alternative, which is precisely what DSGE macroeconomists have been doing for the last two or three decades. And yet despite this suppression, there are alternatives emerging, particularly from the empirical side. There are now empirical approaches to macroeconomics that don’t use DSGE models. Regression discontinuity methods and other “natural experiment” designs—not to mention actual experiments—are quickly rising in popularity as economists realize that these methods allow us to actually empirically test our models instead of just adding more and more mathematical complexity to them.

But there still seems to be a lingering attitude that there is no other way to do macro theory. This is very frustrating for me personally, because deep down I think what I would like to do as a career is macro theory: By temperament I have always viewed the world through a very abstract, theoretical lens, and the issues I care most about—particularly inequality, development, and unemployment—are all fundamentally “macro” issues. I left physics when I realized I would be expected to do string theory. I don’t want to leave economics now that I’m expected to do DSGE. But I also definitely don’t want to do DSGE.

Fortunately with economics I have a backup plan: I can always be an “applied microeconomist” (rather the opposite of a theoretical macroeconomist I suppose), directly attached to the data in the form of empirical analyses or even direct, randomized controlled experiments. And there certainly is plenty of work to be done along the lines of Akerlof and Roth and Shiller and Kahneman and Thaler in cognitive and behavioral economics, which is also generally considered applied micro. I was never going to be an experimental physicist, but I can be an experimental economist. And I do get to use at least some theory: In particular, there’s an awful lot of game theory in experimental economics these days. Some of the most exciting stuff is actually in showing how human beings don’t behave the way classical game theory predicts (particularly in the Ultimatum Game and the Prisoner’s Dilemma), and trying to extend game theory into something that would fit our actual behavior. Cognitive science suggests that the result is going to end up looking quite different from game theory as we know it, and with my cognitive science background I may be particularly well-positioned to lead that charge.

Still, I don’t think I’ll be entirely satisfied if I can’t somehow bring my career back around to macroeconomic issues, and particularly the great elephant in the room of all economics, which is inequality. Underlying everything from Marxism to Trumpism, from the surging rents in Silicon Valley to the crushing poverty of Burkina Faso, and even the Great Recession itself, is inequality. It is, in my view, the central question of economics: Who gets what, and why?

That is a fundamentally macro question, but you can’t even talk about that issue in DSGE as we know it; a “representative agent” inherently smooths over all inequality in the economy as though total GDP were all that mattered. A fundamentally new approach to macroeconomics is needed. Hopefully I can be part of that, but from my current position I don’t feel much empowered to fight this status quo. Maybe I need to spend at least a few more years doing something else, making a name for myself, and then I’ll be able to come back to this fight with a stronger position.

In the meantime, I guess there’s plenty of work to be done on cognitive biases and deviations from game theory.