Argumentum ab scientia is not argumentum baculo: The difference between authority and expertise

May 7, JDN 2457881

Americans are, on the whole, suspicious of authority. This is a very good thing; it shields us against authoritarianism. But it comes with a major downside, which is a tendency to forget the distinction between authority and expertise.

Argument from authority is an informal fallacy, argumentum baculo. The fact that something was said by the Pope, or the President, or the General Secretary of the UN, doesn’t make it true. (Aside: You’re probably more familiar with the phrase argumentum ad baculum, which is terrible Latin. That would mean “argument toward a stick”, when clearly the intended meaning was “argument by means of a stick”, which is argumentum baculo.)

But argument from expertise, argumentum ab scientia, is something quite different. The world is much too complicated for any one person to know everything about everything, so we have no choice but to specialize our knowledge, each of us becoming an expert in only a few things. So if you are not an expert in a subject, when someone who is an expert in that subject tells you something about that subject, you should probably believe them.

You should especially be prepared to believe them when the entire community of experts is in consensus or near-consensus on a topic. The scientific consensus on climate change is absolutely overwhelming. Is this a reason to believe in climate change? You’re damn right it is. Unless you have years of education and experience in understanding climate models and atmospheric data, you have no basis for challenging the expert consensus on this issue.

This confusion has created a deep current of anti-intellectualism in our culture, as Isaac Asimov famously recognized:

There is a cult of ignorance in the United States, and there always has been. The strain of anti-intellectualism has been a constant thread winding its way through our political and cultural life, nurtured by the false notion that democracy means that “my ignorance is just as good as your knowledge.”

This is also important to understand if you have heterodox views on any scientific topic. The fact that the whole field disagrees with you does not prove that you are wrong—but it does make it quite likely that you are wrong. Cranks often want to compare themselves to Galileo or Einstein, but here’s the thing: Galileo and Einstein didn’t act like cranks. They didn’t expect the scientific community to respect their ideas before they had gathered compelling evidence in their favor.

When behavioral economists found that neoclassical models of human behavior didn’t stand up to scrutiny, did they shout from the rooftops that economics is all a lie? No, they published their research in peer-reviewed journals, and talked with economists about the implications of their results. There may have been times when they felt ignored or disrespected by the mainstream, but they pressed on, because the data was on their side. And ultimately, the mainstream gave in: Daniel Kahneman won the Nobel Prize in Economics.

Experts are not always right, that is true. But they are usually right, and if you think they are wrong you’d better have a good reason to think so. The best reasons are the sort that come about when you yourself have spent the time and effort to become an expert, able to challenge the consensus on its own terms.

Admittedly, that is a very difficult thing to do—and more difficult than it should be. I have seen firsthand how difficult and painful the slow grind toward a PhD can be, and how many obstacles will get thrown in your way, ranging from nepotism and interdepartmental politics, to discrimination against women and minorities, to mismatches of interest between students and faculty, all the way to illness, mental health problems, and the slings and arrows of outrageous fortune in general. If you have particularly heterodox ideas, you may face particularly harsh barriers, and sometimes it behooves you to hold your tongue and toe the line awhile.

But this is no excuse not to gain expertise. Even if academia itself is not available to you, we live in an age of unprecedented availability of information—it’s not called the Information Age for nothing. A sufficiently talented and dedicated autodidact can challenge the mainstream, if their ideas are truly good enough. (Perhaps the best example of this is the mathematician savant Srinivasa Ramanujan. But he’s… something else. I think he is about as far from the average genius as the average genius is from the average person.) No, that won’t be easy either. But if you are really serious about advancing human understanding rather than just rooting for your political team (read: tribe), you should be prepared to either take up the academic route or attack it as an autodidact from the outside.

In fact, most scientific fields are quite good about admitting what they don’t know. A total consensus that turns out to be wrong is actually a very rare phenomenon; much more common is a clash of multiple competing paradigms, where one ultimately wins out or all of them are replaced by a totally new paradigm or some sort of synthesis. In almost all cases, the new paradigm wins not because it becomes fashionable or the ancien régime dies out (as Planck cynically claimed) but because overwhelming evidence is observed in its favor, often in the form of explaining some phenomenon that was previously impossible to understand. If your heterodox theory doesn’t do that, then it probably won’t win, because it doesn’t deserve to.

(Right now you might think of challenging me: Does my heterodox theory do that? Does the tribal paradigm explain things that either total selfishness or total altruism cannot? I think it’s pretty obvious that it does. I mean, you are familiar with a little thing called “racism”, aren’t you? There is no explanation for racism in neoclassical economics; to understand it at all you have to just impose it as an arbitrary term on the utility function. But at that point, why not throw in whatever you please? Maybe some people enjoy bashing their heads against walls, and other people take great pleasure in the taste of arsenic. Why would this particular self- (not to mention other-) destroying behavior be universal to all human societies?)

In practice, I think most people who challenge the mainstream consensus aren’t genuinely interested in finding out the truth—certainly not enough to actually go through the work of doing it. It’s a pattern you can see in a wide range of fringe views: Anti-vaxxers, 9/11 truthers, climate denialists, they all think the same way. The mainstream disagrees with my preconceived ideology, therefore the mainstream is some kind of global conspiracy to deceive us. The overwhelming evidence that vaccination is safe and (wildly) cost-effective, that 9/11 was indeed perpetrated by Al Qaeda and neither planned nor anticipated by anyone in the US government, and that the global climate is being changed by human greenhouse gas emissions—these things simply don’t matter to them, because it was never really about the truth. They knew the answer before they asked the question. Because their identity is wrapped up in that political ideology, they know it couldn’t possibly be otherwise, and no amount of evidence will change their mind.

How do we reach such people? That, I don’t know. I wish I did. But I can say this much: We can stop taking them seriously when they say that the overwhelming scientific consensus against them is just another “appeal to authority”. It’s not. It never was. It’s an argument from expertise—there are people who know this a lot better than you, and they think you’re wrong, so you’re probably wrong.

What good are macroeconomic models? How could they be better?

Dec 11, JDN 2457734

One thing that I don’t think most people know, but which is immediately obvious to any student of economics at the college level or above, is that there is a veritable cornucopia of different macroeconomic models. There are growth models (the Solow model, the Harrod-Domar model, the Ramsey model), monetary policy models (IS-LM, aggregate demand-aggregate supply), trade models (the Mundell-Fleming model, the Heckscher-Ohlin model), large-scale computational models (dynamic stochastic general equilibrium, agent-based computational economics), and I could go on.

This immediately raises the question: What are all these models for? What good are they?

A cynical view might be that they aren’t useful at all, that this is all false mathematical precision which makes economics persuasive without making it accurate or useful. And with such a proliferation of models and contradictory conclusions, I can see why such a view would be tempting.

But many of these models are useful, at least in certain circumstances. They aren’t completely arbitrary. Indeed, one of the litmus tests of the last decade has been how well the models held up against the events of the Great Recession and following Second Depression. The Keynesian and cognitive/behavioral models did rather well, albeit with significant gaps and flaws. The Monetarist, Real Business Cycle, and most other neoclassical models failed miserably, as did Austrian and Marxist notions so fluid and ill-defined that I’m not sure they deserve to even be called “models”. So there is at least some empirical basis for deciding what assumptions we should be willing to use in our models. Yet even if we restrict ourselves to Keynesian and cognitive/behavioral models, there are still a great many to choose from, which often yield inconsistent results.

So let’s compare with a science that is uncontroversially successful: Physics. How do mathematical models in physics compare with mathematical models in economics?

Well, there are still a lot of models, first of all. There’s the Bohr model, the Schrödinger equation, the Dirac equation, Newtonian mechanics, Lagrangian mechanics, Bohmian mechanics, Maxwell’s equations, Faraday’s law, Coulomb’s law, the Einstein field equations, the Minkowski metric, the Schwarzschild metric, the Rindler metric, Feynman-Wheeler theory, the Navier-Stokes equations, and so on. So a cornucopia of models is not inherently a bad thing.

Yet, there is something about physics models that makes them more reliable than economics models.

Partly it is that the systems physicists study are literally two dozen orders of magnitude or more smaller and simpler than the systems economists study. Their task is inherently easier than ours.

But it’s not just that; their models aren’t just simpler—actually they often aren’t. The Navier-Stokes equations are a lot more complicated than the Solow model. They’re also clearly a lot more accurate.

The feature that models in physics seem to have that models in economics do not is something we might call nesting, or maybe consistency. Models in physics don’t come out of nowhere; you can’t just make up your own new model based on whatever assumptions you like and then start using it—which you very much can do in economics. Models in physics are required to fit consistently with one another, and usually inside one another, in the following sense:

The Dirac equation strictly generalizes the Schrödinger equation, which strictly generalizes the Bohr model. Bohmian mechanics is consistent with quantum mechanics, which strictly generalizes Lagrangian mechanics, which generalizes Newtonian mechanics. The Einstein field equations are consistent with Maxwell’s equations and strictly generalize the Minkowski, Schwarzschild, and Rindler metrics. Maxwell’s equations strictly generalize Faraday’s law and Coulomb’s law.

In other words, there are a small number of canonical models—the Dirac equation, Maxwell’s equations, and the Einstein field equations, essentially—inside which all other models are nested. The simpler models like Coulomb’s law and Newtonian mechanics do not contradict these canonical models; they are contained within them, subject to certain constraints (such as macroscopic systems far below the speed of light).

This is something I wish more people understood (I blame Kuhn for confusing everyone about what paradigm shifts really entail): Einstein did not overturn Newton’s laws; he extended them to domains where they had previously failed to apply.

This is why it is sensible to say that certain theories in physics are true; they are the canonical models that underlie all known phenomena. Other models can be useful, but not because we are relativists about truth or anything like that; Newtonian physics is a very good approximation of the Einstein field equations at the scale of many phenomena we care about, and is also much more mathematically tractable. If we ever find ourselves in situations where Newton’s equations no longer apply—near a black hole, traveling near the speed of light—then we know we can fall back on the more complex canonical model; but when the simpler model works, there’s no reason not to use it.
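
To see what this nesting means concretely, consider the standard textbook expansion of relativistic kinetic energy for speeds far below c (a well-known result, shown here purely for illustration): Newton’s formula falls out as the leading term.

```latex
% Relativistic kinetic energy, expanded for v << c:
\[
  E_k = (\gamma - 1)mc^2, \qquad \gamma = \Bigl(1 - \tfrac{v^2}{c^2}\Bigr)^{-1/2}
\]
\[
  E_k = \tfrac{1}{2}mv^2 + \tfrac{3}{8}\,\frac{mv^4}{c^2} + \cdots
\]
% The leading term is Newton's kinetic energy; the corrections are
% negligible until v becomes a substantial fraction of c.
```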

There are still very serious gaps in the knowledge of physics; in particular, there is a fundamental gulf between quantum mechanics and the Einstein field equations that has been unresolved for decades. A solution to this “quantum gravity problem” would be essentially a guaranteed Nobel Prize. So even a canonical model can be flawed, and can be extended or improved upon; the result is then a new canonical model which we now regard as our best approximation to truth.

Yet the contrast with economics is still quite clear. We don’t have one or two or even ten canonical models to refer back to. We can’t say that the Solow model is an approximation of some greater canonical model that works for these purposes—because we don’t have that greater canonical model. We can’t say that agent-based computational economics is approximately right, because we have nothing to approximate it to.

I went into economics thinking that neoclassical economics needed a new paradigm. I have now realized something much more alarming: Neoclassical economics doesn’t really have a paradigm. Or if it does, it’s a very informal paradigm, one that is expressed by the arbitrary judgments of journal editors, not one that can be written down as a series of equations. We assume perfect rationality, except when we don’t. We assume constant returns to scale, except when that doesn’t work. We assume perfect competition, except when that doesn’t get the results we wanted. The agents in our models are infinite identical psychopaths, and they are exactly as rational as needed for the conclusion I want.

This is quite likely why there is so much disagreement within economics. When you can permute the parameters however you like with no regard to a canonical model, you can more or less draw whatever conclusion you want, especially if you aren’t tightly bound to empirical evidence. I know a great many economists who are sure that raising the minimum wage results in large disemployment effects, because the models they believe in say that it must, even though the empirical evidence has been quite clear that these effects are small if they are present at all. If we had a canonical model of employment that we could calibrate to the empirical evidence, that couldn’t happen anymore; there would be a coefficient I could point to that would refute their argument. But when every new paper comes with a new model, there’s no way to do that; one set of assumptions is as good as another.

Indeed, as I mentioned in an earlier post, a remarkable number of economists seem to embrace this relativism. “There is no true model,” they say; “we do what is useful.” Recently I encountered a book by the eminent economist Deirdre McCloskey which, though I confess I haven’t read it in its entirety, appears to be trying to argue that economics is just a meaningless language game that doesn’t have or need to have any connection with actual reality. (If any of you have read it and think I’m misunderstanding it, please explain. As it is, I haven’t bought it for a reason any economist should respect: I am disinclined to incentivize such writing.)

Creating such a canonical model would no doubt be extremely difficult. Indeed, it is a task that would require the combined efforts of hundreds of researchers and could take generations to achieve. The true equations that underlie the economy could be totally intractable even for our best computers. But quantum mechanics wasn’t built in a day, either. The key challenge here lies in convincing economists that this is something worth doing—that if we really want to be taken seriously as scientists we need to start acting like them. Scientists believe in truth, and they are trying to find it out. While not immune to tribalism or ideology or other human limitations, they resist them as fiercely as possible, always turning back to the evidence above all else. And in their combined strivings, they attempt to build a grand edifice, a universal theory to stand the test of time—a canonical model.

The replication crisis, and the future of science

Aug 27, JDN 2457628 [Sat]

After settling in a little bit in Irvine, I’m now ready to resume blogging, but for now it will be on a reduced schedule. I’ll release a new post every Saturday, at least for the time being.

Today’s post was chosen by Patreon vote, though only one person voted (this whole Patreon voting thing has not been as successful as I’d hoped). It’s about something we scientists really don’t like to talk about, but definitely need to: We are in the middle of a major crisis of scientific replication.

Whenever large studies are conducted attempting to replicate published scientific results, their ability to do so is almost always dismal.

Psychology is the one everyone likes to pick on, because its record is particularly bad. Only 39% of studies were fully replicated with the published effect size, though a further 36% were at least qualitatively but not quantitatively similar. Yet economics has its own replication problem, and even medical research is not immune to replication failure.

It’s important not to overstate the crisis; the majority of scientific studies do at least qualitatively replicate. We are doing better than flipping a coin, which is better than one can say of financial forecasters.

There are three kinds of replication, and only one of them should be expected to give near-100% results. That kind is reanalysis—when you take the same data and use the same methods, you absolutely should get the exact same results. I favor making reanalysis a routine requirement of publication; if we can’t get your results by applying your statistical methods to your data, then your paper needs revision before we can entrust it to publication. A number of papers have failed on reanalysis, which is absurd and embarrassing; the worst offender was probably Reinhart-Rogoff, which was used in public policy decisions around the world despite having spreadsheet errors.

The second kind is direct replication—when you do the exact same experiment again and see if you get the same result within error bounds. This kind of replication should work something like 90% of the time, but in fact works more like 60% of the time.

The third kind is conceptual replication—when you do a similar experiment designed to test the same phenomenon from a different perspective. This kind of replication should work something like 60% of the time, but actually only works about 20% of the time.

Economists are well equipped to understand and solve this crisis, because it’s not actually about science. It’s about incentives. I facepalm every time I see another article by an aggrieved statistician about the “misunderstanding” of p-values; no, scientists aren’t misunderstanding anything. They know damn well how p-values are supposed to work. So why do they keep using them wrong? Because their jobs depend on doing so.

The first key point to understand here is “publish or perish”; academics in an increasingly competitive system are required to publish their research in order to get tenure, and frequently required to get tenure in order to keep their jobs at all. (Or they could become adjuncts, who are paid one-fifth as much.)

The second is the fundamentally defective way our research journals are run (as I have discussed in a previous post). As private for-profit corporations whose primary interest is in raising more revenue, our research journals aren’t trying to publish what will genuinely advance scientific knowledge. They are trying to publish what will draw attention to themselves. It’s a similar flaw to what has arisen in our news media; they aren’t trying to convey the truth, they are trying to get ratings to draw advertisers. This is how you get hours of meaningless fluff about a missing airliner and then a single chyron scroll about a war in Congo or a flood in Indonesia. Research journals haven’t fallen quite so far because they have reputations to uphold in order to attract scientists to read them and publish in them; but still, their fundamental goal is and has always been to raise attention in order to raise revenue.

The best way to do that is to publish things that are interesting. But if a scientific finding is interesting, that means it is surprising. It has to be unexpected or unusual in some way. And above all, it has to be positive; you have to have actually found an effect. Except in very rare circumstances, the null result is never considered interesting. This adds up to making journals publish what is improbable.

In particular, it creates a perfect storm for the abuse of p-values. A p-value, roughly speaking, is the probability you would get the observed result if there were no effect at all—for instance, the probability that you’d observe this wage gap between men and women in your sample if in the real world men and women were paid the exact same wages. The standard heuristic is a p-value of 0.05; indeed, it has become so enshrined that it is almost an explicit condition of publication now. Your result must be less than 5% likely to happen if there is no real difference. But if you will only publish results that show a p-value of 0.05, then the papers that get published and read will only be the ones that found such p-values—which renders the p-values meaningless.

It was never particularly meaningful anyway; as we Bayesians have been trying to explain since time immemorial, it matters how likely your hypothesis was in the first place. For something like wage gaps where we’re reasonably sure, but maybe could be wrong, the p-value is not too unreasonable. But if the theory is almost certainly true (“does gravity fall off as the inverse square of distance?”), even a high p-value like 0.35 is still supportive, while if the theory is almost certainly false (“are human beings capable of precognition?”—actual study), even a tiny p-value like 0.001 is still basically irrelevant. We really should be using much more sophisticated inference techniques, but those are harder to do, and don’t provide the nice simple threshold of “Is it below 0.05?”
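
Here is a minimal sketch of the Bayesian point, with toy numbers of my own choosing rather than figures from any real study: the same evidence moves a plausible hypothesis and an extraordinary one to very different posteriors.

```python
# Toy Bayesian update: identical evidence, very different priors.
def posterior(prior, p_data_if_true, p_data_if_false):
    """Bayes' rule: P(H|D) = P(D|H)P(H) / [P(D|H)P(H) + P(D|~H)P(~H)]."""
    numerator = p_data_if_true * prior
    return numerator / (numerator + p_data_if_false * (1 - prior))

# A reasonably plausible hypothesis (a wage gap) vs. precognition,
# given identically "significant" data:
print(posterior(0.50, 0.8, 0.05))   # ~0.94: the evidence settles it
print(posterior(0.001, 0.8, 0.05))  # ~0.016: still almost certainly false
```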

But okay, p-values can be useful in many cases—if they are used correctly and you see all the results. If you have effect X with p-values 0.03, 0.07, 0.01, 0.06, and 0.09, effect X is probably a real thing. If you have effect Y with p-values 0.04, 0.02, 0.29, 0.35, and 0.74, effect Y is probably not a real thing. But I’ve just set it up so that these would be published exactly the same. They each have two published papers with “statistically significant” results. The other papers never get published and therefore never get seen, so we throw away vital information. This is called the file drawer problem.
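
The file drawer problem is easy to see in simulation. Here is a minimal sketch (my own illustration, not drawn from any particular study): a hundred labs test an effect that does not exist, and only the “significant” results get published.

```python
# 100 labs study a true null effect; journals publish only p < 0.05.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
published = []
for study in range(100):
    sample = rng.normal(loc=0.0, size=50)  # the true effect is exactly zero
    t, p = stats.ttest_1samp(sample, 0.0)
    if p < 0.05:                           # the journal's filter
        published.append(sample.mean())

print(len(published), "of 100 null studies published")
print("published 'effect sizes':", np.round(published, 2))
# Roughly 5 get through, every one reporting a spuriously large effect.
```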

Researchers often have a lot of flexibility in designing their experiments. If their only goal were to find truth, they would use this flexibility to test a variety of scenarios and publish all the results, so they can be compared holistically. But that isn’t their only goal; they also care about keeping their jobs so they can pay rent and feed their families. And under our current system, the only way to ensure that you can do that is by publishing things, which basically means only including the parts that showed up as statistically significant—otherwise, journals aren’t interested. And so we get huge numbers of papers published that tell us basically nothing, because we set up such strong incentives for researchers to give misleading results.

The saddest part is that this could be easily fixed.

First, reduce the incentives to publish by finding other ways to evaluate the skill of academics—like teaching for goodness’ sake. Working papers are another good approach. Journals already get far more submissions than they know what to do with, and most of these papers will never be read by more than a handful of people. We don’t need more published findings, we need better published findings—so stop incentivizing mere publication and start finding ways to incentivize research quality.

Second, eliminate private for-profit research journals. Science should be done by government agencies and nonprofits, not for-profit corporations. (And yes, I would apply this to pharmaceutical companies as well, which should really be pharmaceutical manufacturers who make cheap drugs based off of academic research and carry small profit margins.) Why? Again, it’s all about incentives. Corporations have no reason to want to find truth and every reason to want to tilt it in their favor.

Third, increase the number of tenured faculty positions. Instead of building so many new grand edifices to please your plutocratic donors, use your (skyrocketing) tuition money to hire more professors so that you can teach more students better. You can find even more funds if you cut the salaries of your administrators and football coaches. Come on, universities; you are the one industry in the world where labor demand and labor supply are the same people a few years later. You have no excuse for not having the smoothest market clearing in the world. You should never have gluts or shortages.

Fourth, require pre-registration of research studies (as some branches of medicine already do). If the study is sound, an optimal rational agent shouldn’t care in the slightest whether it had a positive or negative result, and if our ape brains won’t let us think that way, we need to establish institutions to force it to happen. Journals shouldn’t even see the effect size and p-value before they make the decision to publish; all they should care about is that the experiment makes sense and the proper procedure was conducted.

If we did all that, the replication crisis could be almost completely resolved, as the incentives would be realigned to more closely match the genuine search for truth.

Alas, I don’t see universities or governments or research journals having the political will to actually make such changes, which is very sad indeed.

The Expanse gets the science right—including the economics

JDN 2457502

Despite constantly working on half a dozen projects at once (literally—preparing to start my PhD, writing this blog, working at my day job, editing a novel, preparing to submit a nonfiction book, writing another nonfiction book with three of my friends as co-authors, and creating a card game—that’s seven actually), I do occasionally find time to do things for fun. One I’ve been doing lately is catching up on The Expanse on DVR (I’m about halfway through the first season so far).

If you’re not familiar with The Expanse, it has been fairly aptly described as Battlestar Galactica meets Game of Thrones, though I think that particular comparison misrepresents the tone and attitudes of the series, because both BG and GoT are so dark and cynical (“It’s a nice day… for a… red wedding!”). I think “Star Trek meets Game of Thrones” might be better actually—the extreme idealism of Star Trek would cancel out the extreme cynicism of Game of Thrones, with the result being a complex mix of idealism and cynicism that more accurately reflects the real world (a world where Mahatma Gandhi and Adolf Hitler lived at the same time). That complex, nuanced world (or should I say worlds?) is where The Expanse takes place. ST is also more geopolitical than BG, and The Expanse is nothing if not geopolitical.

But The Expanse is not just psychologically realistic—it is also scientifically and economically realistic. It may in fact be the hardest science fiction I have ever encountered, and is definitely the hardest science fiction I’ve seen in a television show. (There are a few books that might be slightly harder, as well as some movies based on them.)

The only major scientific inaccuracy I’ve been able to find so far is the use of sound effects in space, and actually even these can be interpreted as reflecting an omniscient narrator perspective that would hear any sounds that anyone would hear, regardless of what planet or ship they might be on. The sounds the audience hears all seem to be sounds that someone would hear—there’s simply no particular person who would hear all of them. When people are actually thrown into hard vacuum, we don’t hear them make any noise.

Like Firefly (and for once I think The Expanse might actually be good enough to deserve that comparison), there is no FTL, no aliens, no superhuman AI. Human beings are bound within our own solar system, and travel between planets takes weeks or months depending on your energy budget. They actually show holograms projecting the trajectory of various spacecraft and the trajectories actually make good sense in terms of orbital mechanics. Finally screenwriters had the courage to give us the terrifying suspense and inevitability of an incoming nuclear missile rounding a nearby asteroid and intercepting your trajectory, where you have minutes to think about it but not nearly enough delta-v to get out of its blast radius. That is what space combat will be like, if we ever have space combat (as awesome as it is to watch, I strongly hope that we will not ever actually do it). Unlike what Star Trek would have you believe, space is not a 19th century ocean.

They do have stealth in space—but it requires technology that even to them is highly advanced. Moreover it appears to only work for relatively short periods and seems most effective against civilian vessels that would likely lack state-of-the-art sensors, both of which make it a lot more plausible.

Computers are more advanced in the 2200s than they were in the 2000s, but not radically so, at most a million times faster, about what we gained since the 1980s. I’m guessing a smartphone in The Expanse runs at a few petaflops. Essentially they’re banking on Moore’s Law finally dying sometime in the mid 21st century, but then, so am I. Perhaps a bit harder to swallow is that no one has figured out good enough heuristics to match human cognition; but then, human cognition is very tightly optimized.

Spacecraft don’t have artificial gravity except for the thrust of their engines, and people float around as they should when ships are freefalling. They actually deal with the fact that Mars and Ceres have lower gravity than Earth, and the kinds of health problems that result from this. (One thing I do wish they’d done is had the Martian cruiser set a cruising acceleration of Mars-g—about 38% Earth-g—that would feel awkward and dizzying to their Earther captives. Instead they basically seem to assume that Martians still like to use Earth-g for space transit, but that does make some sense in terms of both human health and simply transit time.) It doesn’t seem like people move around quite awkwardly enough in the very low gravity of Ceres—which should be only about 3% Earth-g—but they do establish that electromagnetic boots are ubiquitous and that could account for most of this.

They fight primarily with nuclear missiles and kinetic weapons, and the damage done by nuclear missiles is appropriately reduced by the fact that vacuum doesn’t transmit shockwaves. (Nuclear missiles would still be quite damaging in space by releasing large amounts of wide-spectrum radiation; but they wouldn’t cause the total devastation they do within atmosphere.) Oddly they decided not to go with laser weapons as far as I can tell, which actually seems to me like they’ve underestimated advancement; laser weapons have a number of advantages that would be particularly useful in space, once we can actually make them affordable and reliable enough for widespread deployment. There could also be a three-tier system, where missiles are used at long range, railguns at medium range, and lasers at short range. (Yes, short range—the increased speed of lasers would be only slight compared to a good railgun, and would be more than offset by the effect of diffraction. At orbital distances, a laser is a shotgun.) Then again, it could well work out that railguns are just better—depending on how vessels are structured, puncturing their hulls with kinetic rounds could well be more useful than burning them up with infrared lasers.
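
The “shotgun” remark is just diffraction at work. With illustrative numbers of my own choosing (a 1-micron laser and a half-meter aperture, neither of which the show specifies):

```latex
% Diffraction-limited beam divergence:
\[
  \theta \approx 1.22\,\frac{\lambda}{D}
         = 1.22 \times \frac{1.06\times10^{-6}\,\mathrm{m}}{0.5\,\mathrm{m}}
         \approx 2.6\,\mu\mathrm{rad}
\]
% At a range of 1000 km the spot radius is already about 2.6 m, so the
% delivered intensity falls off roughly as the inverse square of range.
```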

But I think what really struck me about the realism of The Expanse is how it even makes the society realistic (in a way that, say, Firefly really doesn’t—we wanted a Western and we got a Western!).

The only major offworld colonies are Mars and Ceres, both of which seem to be fairly well-established, probably originally colonized as much as a century ago. Different societies have formed on each world; Earth has largely united under the United Nations (one of the lead characters is an undersecretary for the UN), but meanwhile Mars has split off into its own independent nation (“Martian” is now an ethnicity like “German” rather than meaning “extraterrestrial”), and the asteroid belt colonists, while formally still under Earth’s government, think of themselves as a different culture (“Belters”) and are seeking independence. There are some fairly obvious—but deftly managed rather than heavy-handed—parallels between the Belter independence movement and real-world independence movements, particularly Palestine (it’s hard not to think of the PLO when they talk about the OPA). Both Mars and the Belt have their own languages, while Earth’s languages have largely coalesced around English as the language of politics and commerce. (If the latter seems implausible, I remind you that the majority of the Internet and all international air traffic control are in English.) English is the world’s lingua franca (which is a really bizarre turn of phrase because it’s the Latin for French).

There is some of the conniving and murdering of Game of Thrones, but it is at a much more subdued level, and all of the major factions display both merits and flaws. There is no clear hero and no clear villain, just conflict and misunderstanding between a variety of human beings each with their own good and bad qualities. There does seem to be a sense that the most idealistic characters suffer for their idealism much as the Starks often do, but unlike the Starks they usually survive and learn from the experience. Indeed, some of the most cynical also seem to suffer for their cynicism—in the episode I just finished, the grizzled UN Colonel assumed the worst of his adversary and ended up branded “the butcher of Anderson Station”.

Cost of living on Ceres is extraordinarily high because of the limited living space (the apartments look a lot like the tiny studios of New York or San Francisco), and above all the need to constantly import air and water from Earth. A central plot point in the first episode is that a ship carrying comet ice—i.e., water—to Ceres is lost in a surprise attack by unknown adversaries with advanced technology, and the result is a deepening of an already dire water shortage, exacerbating the Belters’ craving for rebellion.

Air and water are recyclable, so it isn’t the case that literally every drink and every breath needs to be supplied from outside—indeed that would clearly be cost-prohibitive. But recycling is never perfect, and Ceres also appears to have a growing population, both of which would require a constant input of new resources to sustain. It makes perfect sense that the most powerful people on Ceres are billionaire tycoons who own water and air transport corporations.

The police on Ceres (of which another lead character is a detective) are well-intentioned but understaffed, underfunded and moderately corrupt, similar to what we seem to find in large inner-city police departments like the NYPD and LAPD. It felt completely right when they responded to an attempt to kill a police officer with absolutely overwhelming force and little regard for due process and procedure—for this is what real-world police departments almost always do.

But why colonize the asteroid belt at all? Mars is a whole planet; there is plenty there—and in The Expanse it is undergoing terraforming at a very plausible rate (there’s a moving scene where a Martian says to an Earther, “We’re trying to finish building our garden before you finish paving over yours.”). Mars has as much land as Earth, and it has water, abundant metals, and CO2 you could use to make air. Even just the frontier ambition could be enough to bring us to Mars.

But why go to Ceres? The explanation The Expanse offers is a very sensible one: Mining, particularly so-called “rare earth metals”. Gold and platinum might have been profitable to mine at first, but once they became plentiful the market would probably collapse, or at least drop off to a level where they aren’t particularly expensive or interesting—because they aren’t useful for very much. But neodymium, scandium, and promethium are all going to be in extremely high demand in a high-tech future based on nuclear-powered spacecraft, and given that we’re already running out of easily accessible deposits on Earth, by the 2200s there will probably be basically none left. The asteroid belt, however, will have plenty for centuries to come.

As a result Ceres is organized like a mining town, or perhaps an extractive petrostate (metallostate?); but due to lightspeed interplanetary communication—very important in the series—and some modicum of free speech it doesn’t appear to have attained more than a moderate level of corruption. This also seems realistic; the “end-of-history” thesis is often overstated, but the basic idea that some form of democracy and welfare-state capitalism is fast becoming the only viable model of governance does seem to be true, and that is almost certainly the model of governance we would export to other planets. In such a system corruption can only get so bad before it is shown on the mass media and people won’t take it anymore.

The show doesn’t deal much with absolute dollar (or whatever currency) numbers, which is probably wise; but nominal incomes on Ceres are likely extremely high even though the standard of living is quite poor, because the tiny living space and need to import air and water would make prices (literally?) astronomical. Most people on Ceres seem to have grown up there, but the initial attraction could have been something like the California Gold Rush, where rumors of spectacularly high incomes clashed with similarly spectacular expenses incurred upon arrival. “Become a millionaire!” “Oh, by the way, your utility bill this month is $112,000.”

Indeed, even the poor on Ceres don’t seem that poor, which is a very nice turn toward realism that a lot of other science fiction shows seem unprepared to make. In Firefly, the poor are poor—they can barely afford food and clothing, and have no modern conveniences whatsoever. (“Jaynestown”, perhaps my favorite episode, depicts this vividly.) But even the poor in the US today are rarely that poor; our minimalistic and half-hearted welfare state has a number of cracks one can fall through, but as long as you get the benefits you’re supposed to get you should be able to avoid starvation and homelessness. Similarly, I find it hard to believe that any society with high enough productivity to routinely build interplanetary spacecraft the way we build container ships would not have at least the kind of welfare state that provides for the most basic needs. Chronic dehydration is probably still a problem for Belters, because water would be too expensive to subsidize in this way; but they all seem to have fairly nice clothes, home appliances, and smartphones, and that seems right to me. At one point a character loses his arm, and the “cheap” solution is a cybernetic prosthetic—the “expensive” one would be to grow him a new arm. As today but perhaps even more so, poverty in The Expanse is really about inequality—the enormous power granted to those who have millions of times as much as others. (Another show that does this quite well, though it is considerably softer as far as the physics go, is Continuum. If I recall correctly, Alec Sadler in 2079 is literally a trillionaire.)

Mars also appears to be a democracy, and actually quite a thriving one. In many ways Mars appears to be surpassing Earth economically and technologically. This suggests that Mars was colonized with our best and brightest, but not necessarily; Australians have done quite well for themselves despite being founded as a penal colony. Mars colonization would also have a way of justifying their frontier idealism that no previous frontiers have granted: No indigenous people to displace, no local ecology to despoil, and no gifts from the surrounding environment. You really are working entirely out of your own hard work and know-how (and technology and funding from Earth, of course) to establish a truly new world on the open and unspoiled frontier. You’re not naive or a hypocrite; it’s the real truth. That kind of realistic idealism could make the Martian Dream a success in ways even the American Dream never quite was.

In all it is a very compelling series, and should appeal to people like me who crave geopolitical nuance in fiction. But it also has its moments of huge space battles with exploding star cruisers, so there’s that.

What does correlation have to do with causation?

JDN 2457345

I’ve been thinking of expanding the topics of this blog into some basic statistics and econometrics. It has been said that there are “Lies, damn lies, and statistics”; but in fact it’s almost the opposite—there are truths, whole truths, and statistics. Almost everything in the world that we know—not merely guess, or suppose, or intuit, or believe, but actually know, with a quantifiable level of certainty—is done by means of statistics. All sciences are based on them, from physics (when they say the Higgs discovery is a “5-sigma event”, that’s a statistic) to psychology, ecology to economics. Far from being something we cannot trust, they are in a sense the only thing we can trust.

The reason it sometimes feels like we cannot trust statistics is that most people do not understand statistics very well; this creates opportunities for both accidental confusion and willful distortion. My hope is therefore to provide you with some of the basic statistical knowledge you need to combat the worst distortions and correct the worst confusions.

I wasn’t quite sure where to start on this quest, but I suppose I have to start somewhere. I figured I may as well start with an adage about statistics that I hear commonly abused: “Correlation does not imply causation.”

Taken at its original meaning, this is definitely true. Unfortunately, it can be easily abused or misunderstood.

In its original meaning, the formal sense of the word “imply” is logical implication, and to “imply” something in this sense is an extremely strong statement. It means that you logically entail that result: if the antecedent is true, the consequent must be true, on pain of logical contradiction. Logical implication is for most practical purposes synonymous with mathematical proof. (Unfortunately, it’s not quite synonymous, because of things like Gödel’s incompleteness theorems and Löb’s theorem.)

And indeed, correlation does not logically entail causation; it’s quite possible to have correlations without any causal connection whatsoever, simply by chance. One of my former professors liked to brag that from 1990 to 2010 whether or not she ate breakfast had a statistically significant positive correlation with that day’s closing price for the Dow Jones Industrial Average.

How is this possible? Did my professor actually somehow influence the stock market by eating breakfast? Of course not; if she could do that, she’d be a billionaire by now. And obviously the Dow’s price at 17:00 couldn’t influence whether she ate breakfast at 09:00. Could there be some common cause driving both of them, like the weather? I guess it’s possible; maybe in good weather she gets up earlier and people are in better moods so they buy more stocks. But the most likely reason for this correlation is much simpler than that: She tried a whole bunch of different combinations until she found two things that correlated. At the usual significance level of 0.05, on average you need to try about 20 combinations of totally unrelated things before two of them will show up as correlated. (My guess is she used a number of different stock indexes and varied the starting and ending year. That’s a way to generate a surprisingly large number of degrees of freedom without it seeming like you’re doing anything particularly nefarious.)

But how do we know they aren’t actually causally related? Well, I suppose we don’t. Especially if the universe is ultimately deterministic and nonlocal (as I’ve become increasingly convinced by the results of recent quantum experiments), any two data sets could be causally related somehow. But the point is they don’t have to be; you can pick any randomly-generated datasets, pair them up in 20 different ways, and odds are, one of those ways will show a statistically significant correlation.
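
You can watch this happen in a few lines of code. Here is a rough sketch (mine, not my professor’s actual procedure): generate twenty mutually independent random series and check every pair for a “significant” correlation.

```python
# 20 independent series; count spurious p < 0.05 correlations among pairs.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
data = rng.normal(size=(20, 100))  # 20 unrelated series, 100 observations

false_hits, pairs = 0, 0
for i in range(20):
    for j in range(i + 1, 20):
        r, p = stats.pearsonr(data[i], data[j])
        pairs += 1
        false_hits += p < 0.05

print(f"{false_hits} of {pairs} independent pairs correlate at p < 0.05")
# Expect about 5% of the 190 pairs: roughly nine or ten "discoveries".
```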

All of that is true, and important to understand. Finding a correlation between eating grapefruit and getting breast cancer, or between liking bitter foods and being a psychopath, does not necessarily mean that there is any real causal link between the two. If we can replicate these results in a bunch of other studies, that would suggest that the link is real; but typically, such findings cannot be replicated. There is something deeply wrong with the way science journalists operate; they like to publish the new and exciting findings, which 9 times out of 10 turn out to be completely wrong. They never want to talk about the really important and fascinating things that we know are true because we’ve been confirming them over hundreds of different experiments, because that’s “old news”. The journalistic desire to be new and first fundamentally contradicts the scientific requirement of being replicated and confirmed.

So, yes, it’s quite possible to have a correlation that tells you absolutely nothing about causation.

But this is exceptional. In most cases, correlation actually tells you quite a bit about causation.

And this is why I don’t like the adage; “imply” has a very different meaning in common speech, meaning merely to suggest or evoke. Almost everything you say implies all sorts of things in this broader sense, some more strongly than others, even though it may logically entail none of them.

Correlation does in fact suggest causation. Like any suggestion, it can be overridden. If we know that 20 different combinations were tried until one finally yielded a correlation, we have reason to distrust that correlation. If we find a correlation between A and B but there is no logical way they can be connected, we infer that it is simply an odd coincidence.

But when we encounter any given correlation, there are three other scenarios which are far more likely than mere coincidence: A causes B, B causes A, or some other factor C causes A and B. These are also not mutually exclusive; they can all be true to some extent, and in many cases are.

A great deal of work in science, and particularly in economics, is based upon using correlation to infer causation, and has to be—because there is simply no alternative means of approaching the problem.

Yes, sometimes you can do randomized controlled experiments, and some really important new findings in behavioral economics and development economics have been made this way. Indeed, much of the work that I hope to do over the course of my career is based on randomized controlled experiments, because they truly are the foundation of scientific knowledge. But sometimes, that’s just not an option.

Let’s consider an example: In my master’s thesis I found a strong correlation between the level of corruption in a country (as estimated by the World Bank) and the proportion of that country’s income which goes to the top 0.01% of the population. Countries that have higher levels of corruption also tend to have a larger proportion of income that accrues to the top 0.01%. That correlation is a fact; it’s there. There’s no denying it. But where does it come from? That’s the real question.

Could it be pure coincidence? Well, maybe; but when it keeps showing up in several different models with different variables included, that becomes unlikely. A single p < 0.05 will happen about 1 in 20 times by chance; but five in a row should happen less than 1 in 1 million times (assuming they’re independent, which, to be fair, they usually aren’t).
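
The arithmetic behind that claim:

```latex
% Five independent results, each with a 1-in-20 chance of arising by luck:
\[
  0.05^5 = 3.125 \times 10^{-7} \approx \frac{1}{3{,}200{,}000}
\]
```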

Could it be some artifact of the measurement methods? It’s possible. In particular, I was concerned about the possibility of Halo Effect, in which people tend to assume that something which is better (or worse) in one way is automatically better (or worse) in other ways as well. People might think of their country as more corrupt simply because it has higher inequality, even if there is no real connection. But it would have taken a very large halo bias to explain this effect.

So, does corruption cause income inequality? It’s not hard to see how that might happen: More corrupt individuals could bribe leaders or exploit loopholes to make themselves extremely rich, and thereby increase inequality.

Does inequality cause corruption? This also makes some sense, since it’s a lot easier to bribe leaders and manipulate regulations when you have a lot of money to work with in the first place.

Does something else cause both corruption and inequality? Also quite plausible. Maybe some general cultural factors are involved, or certain economic policies lead to both corruption and inequality. I did try to control for such things, but I obviously couldn’t include all possible variables.

So, which way does the causation run? Unfortunately, I don’t know. I tried some clever statistical techniques to try to figure this out; in particular, I looked at which tends to come first—the corruption or the inequality—and whether they could be used to predict each other, a method called Granger causality. Those results were inconclusive, however. I could neither verify nor exclude a causal connection in either direction. But is there a causal connection? I think so. It’s too robust to just be coincidence. I simply don’t know whether A causes B, B causes A, or C causes A and B.
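
For the curious, here is roughly what such a test looks like in practice: a minimal sketch using statsmodels, run on synthetic placeholder series rather than my thesis’s actual data.

```python
# Granger causality sketch: does "corruption" help predict "inequality"?
import numpy as np
import pandas as pd
from statsmodels.tsa.stattools import grangercausalitytests

rng = np.random.default_rng(1)
T = 200
corruption = rng.normal(size=T)
noise = rng.normal(size=T)
inequality = np.empty(T)
inequality[0] = noise[0]
inequality[1:] = 0.5 * corruption[:-1] + noise[1:]  # corruption leads by one period

df = pd.DataFrame({"inequality": inequality, "corruption": corruption})
# Tests whether the second column helps predict the first, up to 2 lags:
grangercausalitytests(df[["inequality", "corruption"]], maxlag=2)
```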

Imagine trying to do this same study as a randomized controlled experiment. Are we supposed to create two societies and flip a coin to decide which one we make more corrupt? Or which one we give more income inequality? Perhaps you could do some sort of experiment with a proxy for corruption (cheating on a test or something like that), and then have unequal payoffs in the experiment—but that is very far removed from how corruption actually works in the real world, and worse, it’s prohibitively expensive to make really life-altering income inequality within an experimental context. Sure, we can give one participant $1 and the other $1,000; but we can’t give one participant $10,000 and the other $10 million, and it’s the latter that we’re really talking about when we deal with real-world income inequality. I’m not opposed to doing such an experiment, but it can only tell us so much. At some point you need to actually test the validity of your theory in the real world, and for that we need to use statistical correlations.

Or think about macroeconomics; how exactly are you supposed to test a theory of the business cycle experimentally? I guess theoretically you could subject an entire country to a new monetary policy selected at random, but the consequences of being put into the wrong experimental group would be disastrous. Moreover, nobody is going to accept a random monetary policy democratically, so you’d have to introduce it against the will of the population, by some sort of tyranny or at least technocracy. Even if this is theoretically possible, it’s mind-bogglingly unethical.

Now, you might be thinking: But we do change real-world policies, right? Couldn’t we use those changes as a sort of “experiment”? Yes, absolutely; that’s called a quasi-experiment or a natural experiment. They are tremendously useful. But since they are not truly randomized, they aren’t quite experiments. Ultimately, everything you get out of a quasi-experiment is based on statistical correlations.

Thus, abuse of the adage “Correlation does not imply causation” can lead to ignoring whole subfields of science, because there is no realistic way of running experiments in those subfields. Sometimes, statistics are all we have to work with.

This is why I like to say it a little differently:

Correlation does not prove causation. But correlation definitely can suggest causation.

Why being a scientist means confronting your own ignorance

I read an essay today arguing that scientists should be stupid. Or more precisely, ignorant. Or even more precisely, they should recognize their ignorance when all others ignore it and turn away.

What does it feel like to be wrong?

It doesn’t feel like anything. Most people are wrong most of the time without realizing it. (Explained brilliantly in this TED talk.)

What does it feel like to be proven wrong, to find out you were confused or ignorant?

It hurts, a great deal. And most people flinch away from this. They would rather continue being wrong than experience the feeling of being proven wrong.

But being proven wrong is the only way to become less wrong. Being proven ignorant is the only way to truly attain knowledge.

I once heard someone characterize the scientific temperament as “being comfortable not knowing”. No, no, no! Just the opposite, in fact. The unscientific temperament is being comfortable not knowing, being fine with your infinite ignorance as long as you can go about your day. The scientific temperament is being so deeply uncomfortable not knowing that it overrides the discomfort everyone feels when their beliefs are proven wrong. It is to have a drive to actually know—not to think you know, not to feel as if you know, not to assume you know and never think about it, but to actually know—that is so strong it pushes you through all the pain and doubt and confusion of actually trying to find out.

An analogy I like to use is The Armor of Truth. Suppose you were presented with a piece of armor, The Armor of Truth, which is claimed to be indestructible. You will have the chance to wear this armor into battle; if it is indeed indestructible, you will be invincible and will surely prevail. But what if it isn’t? What if it has some weakness you aren’t aware of? Then it could fail and you could die.

How would you go about determining whether The Armor of Truth is really what it is claimed to be? Would you test it with things you expect it to survive? Would you brush it with feathers, pour glasses of water on it, poke it with your finger? Would you seek to confirm your belief in its indestructibility? (As confirmation bias would have you do?) No, you would test it with things you expect to destroy it; you’d hit it with everything you have. You’d fire machine guns at it, drop bombs on it, pour acid on it, place it in a nuclear testing site. You’d do everything you possibly could to falsify your belief in the armor’s indestructibility. And only when you failed, only after you had tried everything you could think of to destroy the armor and it remained undented and unscratched, would you begin to believe that it is truly indestructible. (Popper was exaggerating when he said all science is based on falsification; but he was not exaggerating very much.)

Science is The Armor of Truth, and we wear it into battle—but now the analogy begins to break down, for our beliefs are within us, they are part of us. We’d like to be able to point the machine guns at armor far away from us, but instead it is as if we are forced to wear the armor as the guns are fired. When a break in the armor is found and a bullet passes through—a belief we dearly held is proven false—it hurts us, and we wish we could find another way to test it. But we can’t; and if we fail to test it now, it will only endanger us later—confront a false belief with reality enough and it will eventually fail. A scientist is someone who accepts this and wears the armor bravely as the test guns blaze.

Being a scientist means confronting your own ignorance: Not accepting it, but also not ignoring it; confronting it. Facing it down. Conquering it. Destroying it.

What’s wrong with academic publishing?

JDN 2457257 EDT 14:23.

I just finished expanding my master’s thesis into a research paper that is, I hope, suitable for publication in an economics journal. As part of this I’ve been looking into the process of submitting articles for publication in academic journals… and what I’ve found is disgusting and horrifying. It is astonishingly bad, and my biggest question is why researchers put up with it.

Thus, the subject of this post is what’s wrong with the system—and what we might do instead.

Before I get into it, let me say that I don’t actually disagree with “publish or perish” in principle—as SMBC points out, it’s a lot like “do your job or get fired”. Researchers should publish in peer-reviewed journals; that’s a big part of what doing research means. The problem is how most peer-reviewed journals are currently operated.

First of all, in case you didn’t know, most scientific journals are owned by for-profit corporations. The largest, Elsevier, owns The Lancet and all of ScienceDirect, and has a net income of over 1 billion euros a year. Then there are Springer and Wiley-Blackwell; between them, these three publishers account for over 40% of all scientific publications. These for-profit publishers retain the full copyright to most of the papers they publish, and tightly control access with paywalls; the cost to get through these paywalls is generally thousands of dollars a year for individuals and millions of dollars a year for universities. Their monopoly power is so great it “makes Rupert Murdoch look like a socialist.”

For-profit journals do often offer an “open-access” option in which you basically buy back your own copyright, but the price is high—the most common I’ve seen are $1800 or $3000 per paper—and very few researchers do this, for obvious financial reasons. In fact I think for a full-time tenured faculty researcher it’s probably worth it, given the alternatives. (Then again, full-time tenured faculty are becoming an endangered species lately; what might be worth it in the long run can still be very difficult for a cash-strapped adjunct to afford.) Open-access means people can actually read your paper and potentially cite your paper. Closed-access means it may languish in obscurity.

And of course it isn’t just about the benefits for the individual researcher. The scientific community as a whole depends upon the free flow of information; the reason we publish in the first place is that we want people to read papers, discuss them, replicate them, challenge them. Publication isn’t the finish line; it’s at best a checkpoint. Actually one thing that does seem to be wrong with “publish or perish” is that there is so much pressure for publication that we publish too many pointless papers and nobody has time to read the genuinely important ones.

These prices might be justifiable if the for-profit corporations actually did anything. But in fact they are basically just aggregators. They don’t do the peer-review, they farm it out to other academic researchers. They don’t even pay those other researchers; they just expect them to do it. (And they do! Like I said, why do they put up with this?) They don’t pay the authors who have their work published (on the contrary, they often charge submission fees—about $100 seems to be typical—simply to look at them). It’s been called “the world’s worst restaurant”, where you pay to get in, bring your own ingredients and recipes, cook your own food, serve other people’s food while they serve yours, and then have to pay again if you actually want to be allowed to eat.

To be fair, they do pay for a few things: the printing of paper copies of the journal, which basically no one reads, and the electronic servers that host the digital copies that everyone actually reads. They also provide some basic copyediting services (copyediting APA style is a job people advertise on Craigslist—so you can guess how much they must be paying).

And even supposing that they actually provided some valuable and expensive service, the fact would remain that we are making for-profit corporations the gatekeepers of the scientific community. Entities that exist only to make money for their owners are given direct control over the future of human knowledge. If you look at Cracked’s “reasons why we can’t trust science anymore”, all of them have to do with the for-profit publishing system. p-hacking might still happen in a better system, but publishers that really had the best interests of science in mind would be more motivated to fight it than publishers that are simply trying to raise revenue by getting people to buy access to their papers.

Then there’s the fact that most journals do not allow authors to submit to multiple journals at once, yet take 30 to 90 days to respond and only publish a fraction of what is submitted—it’s almost impossible to find good figures on acceptance rates (which is itself a major problem!), but the highest figures I’ve seen are 30% acceptance, a more typical figure seems to be 10%, and some top journals go as low as 3%. In the worst-case scenario you are locked into a journal for 90 days with only a 3% chance of it actually publishing your work. At that rate publishing an article could take years.
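To put a rough number on “could take years”, here is a back-of-the-envelope sketch. It uses the acceptance rates and response times quoted above; the assumption that each submission is an independent trial with fixed odds is mine, purely for illustration:

```python
# Expected time to acceptance if each submission is an independent trial.
# Acceptance rates and response windows are the illustrative figures from
# the text, not data from any particular journal.
for p_accept in (0.30, 0.10, 0.03):
    expected_tries = 1 / p_accept          # mean of a geometric distribution
    for wait_days in (30, 90):
        years = expected_tries * wait_days / 365
        print(f"p = {p_accept:.0%}, {wait_days}-day response: "
              f"~{expected_tries:.0f} submissions, ~{years:.1f} years on average")
```

At 3% acceptance and 90-day responses, that works out to roughly 33 submissions and over eight years on average. This is exactly why the ban on simultaneous submission stings: parallel submissions would divide that waiting time instead of stacking it.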

Is open-access the solution? Yes… well, part of it, anyway.

There are a large number of open-access journals, some of which do not charge submission fees, but very few of them are prestigious, and many are outright predatory. Predatory journals charge exorbitant fees, often after accepting papers for publication; many do little or no real peer review. There are almost seven hundred known predatory open-access journals; over one hundred have even been caught publishing hoax papers. These predatory journals are corrupting the process of science.

There are a few reputable open-access journals, such as BMC Biology and PLOS ONE. Though not actually a journal, arXiv serves a similar role. These will be part of the solution, most definitely. Yet even legitimate open-access journals often charge each author over $1000 to publish an article. There is a small but significant positive correlation between publication fees and journal impact factor.

We need to found more open-access journals which are funded by either governments or universities, so that neither author nor reader ever pays a cent. Science is a public good and should be funded as such. Even if copyright makes sense for other forms of content (I’m not so sure about that), it most certainly does not make sense for scientific knowledge, which by its very nature is only doing its job if it is shared with the world.

These journals should be specifically structured to be method-sensitive but results-blind. (It’s a very good thing that medical trials are usually registered before they are completed, so that publication is assured even if the results are negative—the same should be done with other sciences. Unfortunately, even in medicine there is significant publication bias.) If you could sum up the scientific method in one phrase, it might just be that: Method-sensitive but results-blind. If you think you know what you’re going to find beforehand, you may not be doing science. If you are certain what you’re going to find beforehand, you’re definitely not doing science.

The process should still be highly selective, but it should be possible—indeed, expected—to submit to multiple journals at once. If journals want to start paying their authors to entice them to publish in that journal rather than take another offer, that’s fine with me. Researchers are the ones who produce the content; if anyone is getting paid for it, it should be us.

This is not some wild and fanciful idea; it’s already the way that book publishing works. Very few literary agents or book publishers would ever have the audacity to say you can’t submit your work elsewhere; those that try are rapidly outcompeted as authors stop submitting to them. It’s fundamentally unreasonable to expect anyone to hang all their hopes on a particular buyer months in advance—and that is what you are, publishers, you are buyers. You are not sellers, you did not create this content.

But new journals face a fundamental problem: Good researchers will naturally want to publish in journals that are prestigious—that is, journals that are already prestigious. When all of the prestige is in journals that are closed-access and owned by for-profit companies, the best research goes there, and the prestige becomes self-reinforcing. Journals are prestigious because they are prestigious; welcome to tautology club.

Somehow we need to get good researchers to start boycotting for-profit journals and start investing in high-quality open-access journals. If Elsevier and Springer can’t get good researchers to submit to them, they’ll change their ways or wither and die. Research should be funded and published by governments and nonprofit institutions, not by for-profit corporations.

This may in fact highlight a much deeper problem in academia, the very concept of “prestige”. I have no doubt that Harvard is a good university, a better university than most; but is it actually the best, as most people seem to assume? Might Stanford or UC Berkeley be better, or University College London, or even the University of Michigan? How would we tell? Are the students better? Even if they are, might that just be because all the better students went to the schools that had better reputations? Controlling for student quality, attending a more prestigious university is almost uncorrelated with better outcomes: those who get accepted to Ivies but attend other schools do just as well in life as those who actually attend Ivies. (Good news for me, getting into Columbia but going to Michigan.) Yet once a university acquires such a high reputation, it can be very difficult for it to lose that reputation, and even more difficult for others to catch up.

Prestige is inherently zero-sum; for me to get more prestige you must lose some. For one university or research journal to rise in rankings, another must fall. Aside from simply feeding on other prestige, the prestige of a university is largely based upon the students it rejects—its “selectivity” score. What does it say about our society that we value educational institutions based upon the number of people they exclude?

Zero-sum ranking is always easier to do than nonzero-sum absolute scoring. That’s actually a mathematical theorem, and one of the few good arguments against range voting (still not nearly good enough, in my opinion): if you have a list of scores you can always turn them into ranks (potentially with ties), but from a list of ranks there is no way to turn them back into scores.
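Here is the asymmetry in miniature (a toy sketch; the function name is mine, and the language is chosen only for illustration):

```python
def ranks_from_scores(scores):
    """Convert scores to ranks (1 = best), with ties sharing a rank."""
    ordered = sorted(set(scores), reverse=True)
    rank_of = {s: i + 1 for i, s in enumerate(ordered)}
    return [rank_of[s] for s in scores]

# Two very different score lists collapse to the same ranking, so the
# original scores cannot be recovered from the ranks alone.
print(ranks_from_scores([99, 98, 10]))   # [1, 2, 3]
print(ranks_from_scores([99, 50, 49]))   # [1, 2, 3]
```

The mapping from scores to ranks throws away the gaps between the scores, and once those gaps are gone, no algorithm can get them back.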

Yet ultimately it is absolute scores that must drive humanity’s progress. If life were simply a matter of ranking, then progress would be by definition impossible. No matter what we do, there will always be top-ranked and bottom-ranked people.

There is simply no way mathematically for more than 1% of human beings to be in the top 1% of the income distribution. (If you’re curious where exactly that lies today, I highly recommend this interactive chart by the New York Times.) But we could raise the standard of living for the majority of people to a level that only the top 1% once had—and in fact, within the First World we have already done this. We could in fact raise the standard of living for everyone in the First World to a level that only the top 1%—or less—had as recently as the 16th century, by the simple change of implementing a basic income.

There is no way for more than 0.14% of people to have an IQ above 145, because IQ is defined to have a mean of 100 and a standard deviation of 15, regardless of how intelligent people are. People could get dramatically smarter over time—and in fact have—and yet it would still be the case that by definition, only 0.14% can be above 145.
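If you want to check that 0.14% figure, it is just the upper tail of a normal distribution three standard deviations above the mean, assuming (as the definition says) that IQ is distributed as Normal(100, 15):

```python
# Upper-tail probability of IQ > 145 under the definitional Normal(100, 15).
from math import erfc, sqrt

mean, sd, cutoff = 100, 15, 145
z = (cutoff - mean) / sd                    # 145 is exactly 3 standard deviations up
p_above = 0.5 * erfc(z / sqrt(2))           # P(Z > z) for a standard normal
print(f"P(IQ > {cutoff}) = {p_above:.4%}")  # ~0.1350%
```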

Similarly, there is no way for much more than 1% of people to go to the top 1% of colleges. There is no way for more than 1% of people to be in the highest 1% of their class. But we could increase the number of college degrees (which we have); we could dramatically increase literacy rates (which we have).

We need to find a way to think of science in the same way. I wouldn’t suggest simply using number of papers published or even number of drugs invented; both of those are skyrocketing, but I can’t say that most of the increase is actually meaningful. I don’t have a good idea of what an absolute scale for scientific quality would look like, even at an aggregate level; and it is likely to be much harder still to make one that applies on an individual level.

But I think that ultimately this is the only way, the only escape from the darkness of cutthroat competition. We must stop thinking in terms of zero-sum rankings and start thinking in terms of nonzero-sum absolute scales.