Reinhart-Rogoff

Feb 19, JDN 2457804

One of the more legitimate criticisms out there of us “urban elites” is our credentialism: our tendency to decide a person’s value as an employee or even as a human being based solely upon their formal credentials. Randall Collins, an American sociologist, wrote a book called The Credential Society arguing that much of the class stratification in the United States is traceable to this credentialism: upper-middle-class White Anglo-Saxon Protestants go to the good high schools to get into the good colleges to get the good careers, and all along the way maintain subtle but significant barriers to keep everyone else out.

A related concern is that of credential inflation, where more and more people get a given credential (such as a high school diploma or a college degree), and it begins to lose value as a signal of status. It is often noted that a bachelor’s degree today “gets” you the same jobs that a high school diploma did two generations ago, and two generations hence you may need a master’s or even a PhD.

I consider this concern wildly overblown, however. First of all, they’re not actually the same jobs at all. Even our “menial” jobs of today require skills that most people didn’t have two generations ago—not simply those involving electronics and computers, but even quite basic literacy and numeracy. Yes, you could be a banker in the 1920s with a high school diploma, but plenty of bankers in the 1920s didn’t know algebra. What, you think they were arbitraging derivatives based on the Black-Scholes model?

The primary purpose of education should be to actually improve students’ abilities, not to signal their superior status. More people getting educated is good, not bad. If we really do need signals, we can devise better ones than making people pay tens of thousands of dollars in tuition and spend years taking classes. An expenditure of that magnitude should be accomplishing something, not just signaling. (And given the overwhelming positive correlation between a country’s educational attainment and its economic development, clearly education is actually accomplishing something.) Our higher educational standards are directly tied to higher technology and higher productivity. If indeed you need a PhD to be a janitor in 2050, it will be because in 2050 a “janitor” is actually the expert artificial intelligence engineer who commands an army of cleaning robots, not because credentials have “inflated”. Thinking that credentials “inflate” requires thinking that business managers must be very stupid, that they would exclude whole swaths of qualified candidates that they could pay less to do the same work. Only a complete moron would require a PhD to hire you for wielding a mop.

No, what concerns me is an over-emphasis on prestigious credentials over genuine competence. This is definitely a real issue in our society: Almost every US President went to an Ivy League university, yet several of them (George W. Bush, anyone?) clearly would not actually have been selected by such a university if their families had not been wealthy and well-connected. (Harvard’s application literally contains a question asking whether you are a “lineal or collateral descendant” of one of a handful of super-wealthy families.) Papers that contain errors so basic that I would probably get a failing grade as a grad student for them become internationally influential because they were written by famous economists with fancy degrees.

Ironically, it may be precisely because elite universities try not to give grades or special honors that so many of their students try so desperately to latch onto any bits of social status they can get their hands on. In this blog post, a former Yale law student comments on how, without grades or cum laude to define themselves, Yale students became fiercely competitive in the pettiest ways imaginable. Or it might just be a selection effect; to get into Yale you’ve probably got to be pretty competitive, so even if they don’t give out grades once you get there, you can take the student out of the honors track, but you can’t take the honors track out of the student.

But perhaps the biggest problem with credentialism is… I don’t see any viable alternatives!

We have to decide who is going to be hired for technical and professional positions somehow. It almost certainly can’t be everyone. And the most sensible way to do it would be to have a process people go through to get trained and evaluated on their skills in that profession—that is, a credential.

What else would we do? We could decide randomly, I suppose; well, good luck with that. Or we could try to pick people who don’t have qualifications (“anti-credentialism” I suppose), which would be systematically wrong. Or individual employers could hire individuals they know and trust on a personal level, which doesn’t seem quite so ridiculous—but we have a name for that too, and it’s nepotism.

Even anti-credentialism does exist, bafflingly enough. Many people voted for George W. Bush because they said he was “the kind of guy you can have a beer with”. That wasn’t true, of course; he was the spoiled child of a billionaire, a man who had never really worked a day in his life. But even if it had been true, so what? How is that a qualification to be the leader of the free world? And how many people voted for Trump precisely because he had no experience in government? This made sense to them somehow. (And, shockingly, he has no idea what he’s doing. Actually what is shocking is that he admits that.)

Nepotism of course happens all the time. In fact, nepotism is probably the default state for humans. The continual re-emergence of hereditary monarchy and feudalism around the world suggests that this is some sort of attractor state for human societies, that in the absence of strong institutional pressures toward some other system this is what people will generally settle into. And feudalism is nothing if not nepotistic; your position in life is almost entirely determined by your father’s position, and his father’s before that.

Formal credentials can put a stop to that. Of course, your ability to obtain the credential often depends upon your income and social status. But if you can get past those barriers and actually get the credential, you now have a way of pushing past at least some of the competitors who would have otherwise been hired on their family connections alone. The rise in college enrollments—and women actually now exceeding men in college enrollment rates—is one of the biggest reasons why the gender pay gap is rapidly closing among young workers. Nepotism and sexism that would otherwise have hired unqualified men is now overtaken by the superior credentials of qualified women.

Credentialism does still seem suboptimal… but from where I’m sitting, it seems like a second-best solution. We can’t actually observe people’s competence and ability directly, so we need credentials to provide an approximate measurement. We can certainly work to improve credentials—and for example, I am fiercely opposed to multiple-choice testing because it produces such meaningless credentials—but ultimately I don’t see any alternative to credentials.

Aug 27, JDN 2457628 [Sat]

After settling in a little bit in Irvine, I’m now ready to resume blogging, but for now it will be on a reduced schedule. I’ll release a new post every Saturday, at least for the time being.

Today’s post was chosen by Patreon vote, though only one person voted (this whole Patreon voting thing has not been as successful as I’d hoped). It’s about something we scientists really don’t like to talk about, but definitely need to: We are in the middle of a major crisis of scientific replication.

Whenever large studies are conducted attempting to replicate published scientific results, their ability to do so is almost always dismal.

Psychology is the one everyone likes to pick on, because their record is particularly bad. Only 39% of studies were really replicated with the published effect size, though a further 36% were at least qualitatively but not quantitatively similar. Yet economics has its own replication problem, and even medical research is not immune to replication failure.

It’s important not to overstate the crisis; the majority of scientific studies do at least qualitatively replicate. We are doing better than flipping a coin, which is better than one can say of financial forecasters.

There are three kinds of replication, and only one of them should be expected to give near-100% results. That kind is reanalysis: when you take the same data and use the same methods, you absolutely should get the exact same results. I favor making reanalysis a routine requirement of publication; if we can’t get your results by applying your statistical methods to your data, then your paper needs revision before we can entrust it to publication. A number of papers have failed on reanalysis, which is absurd and embarrassing; the worst offender was probably Reinhart-Rogoff, which was used in public policy decisions around the world despite having spreadsheet errors.

The second kind is direct replication—when you do the exact same experiment again and see if you get the same result within error bounds. This kind of replication should work something like 90% of the time, but in fact works more like 60% of the time.

The third kind is conceptual replication—when you do a similar experiment designed to test the same phenomenon from a different perspective. This kind of replication should work something like 60% of the time, but actually only works about 20% of the time.

Economists are well equipped to understand and solve this crisis, because it’s not actually about science. It’s about incentives. I facepalm every time I see another article by an aggrieved statistician about the “misunderstanding” of p-values; no, scientists aren’t misunderstanding anything. They know damn well how p-values are supposed to work. So why do they keep using them wrong? Because their jobs depend on doing so.

The first key point to understand here is “publish or perish”; academics in an increasingly competitive system are required to publish their research in order to get tenure, and frequently required to get tenure in order to keep their jobs at all. (Or they could become adjuncts, who are paid one-fifth as much.)

The second is the fundamentally defective way our research journals are run (as I have discussed in a previous post). As private for-profit corporations whose primary interest is in raising more revenue, our research journals aren’t trying to publish what will genuinely advance scientific knowledge. They are trying to publish what will draw attention to themselves. It’s a similar flaw to what has arisen in our news media; they aren’t trying to convey the truth, they are trying to get ratings to draw advertisers. This is how you get hours of meaningless fluff about a missing airliner and then a single chyron scroll about a war in Congo or a flood in Indonesia. Research journals haven’t fallen quite so far because they have reputations to uphold in order to attract scientists to read them and publish in them; but still, their fundamental goal is and has always been to raise attention in order to raise revenue.

The best way to do that is to publish things that are interesting. But if a scientific finding is interesting, that means it is surprising. It has to be unexpected or unusual in some way. And above all, it has to be positive; you have to have actually found an effect. Except in very rare circumstances, the null result is never considered interesting. This adds up to making journals publish what is improbable.

In particular, it creates a perfect storm for the abuse of p-values. A p-value, roughly speaking, is the probability you would get a result at least as extreme as the one you observed if there were no effect at all: for instance, the probability that you’d observe a wage gap this large between men and women in your sample if in the real world men and women were paid the exact same wages. The standard threshold is a p-value of 0.05; indeed, it has become so enshrined that it is almost an explicit condition of publication now. Your result must be less than 5% likely to happen if there is no real difference. But if you will only publish results that show a p-value below 0.05, then the papers that get published and read will only be the ones that found such p-values, which renders the p-values meaningless.
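To make the filtering effect concrete, here is a minimal simulation sketch. The numbers in it are illustrative assumptions, not figures from this post: 10% of the hypotheses tested are assumed to be true, studies of true effects are assumed to reach significance half the time (50% power), and journals are assumed to accept only results below 0.05. The function name `simulate_publication_filter` is likewise made up for this example.

```python
import random


def simulate_publication_filter(n_studies=100_000, share_true=0.10,
                                power=0.50, alpha=0.05, seed=0):
    """Return (number published, share of published results that are false).

    All parameter values are illustrative assumptions.
    """
    rng = random.Random(seed)
    published_true = 0
    published_false = 0
    for _ in range(n_studies):
        effect_is_real = rng.random() < share_true
        if effect_is_real:
            # A real effect is detected with probability equal to the power.
            significant = rng.random() < power
        else:
            # Under the null hypothesis the p-value is uniform on [0, 1].
            p_value = rng.random()
            significant = p_value < alpha
        # The journal only ever sees (and prints) the significant results.
        if significant:
            if effect_is_real:
                published_true += 1
            else:
                published_false += 1
    n_published = published_true + published_false
    return n_published, published_false / n_published


n_published, false_share = simulate_publication_filter()
print(f"published: {n_published}")
print(f"share of published findings that are false: {false_share:.1%}")
```

Under these assumed inputs the expected share of false findings among published papers is 0.045 / 0.095, or about 47%, even though every single published paper reports p < 0.05. Changing `share_true` or `power` moves that figure a great deal, which is exactly why the threshold alone tells a reader so little.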

It was never particularly meaningful anyway; as we Bayesians have been trying to explain since time immemorial, it matters how likely your hypothesis was in the first place. For something like wage gaps where we’re reasonably sure, but maybe could be wrong, the p-value is not too unreasonable. But if the theory is almost certainly true (“does gravity fall off as the inverse square of distance?”), even a high p-value like 0.35 is still supportive, while if the theory is almost certainly false (“are human beings capable of precognition?”—actual study), even a tiny p-value like 0.001 is still basically irrelevant. We really should be using much more sophisticated inference techniques, but those are harder to do, and don’t provide the nice simple threshold of “Is it below 0.05?”
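A rough way to see how much the prior matters is to convert a p-value into the most favorable Bayes factor it could possibly justify and then apply Bayes' rule. The sketch below uses the Sellke-Bayarri-Berger calibration, which bounds the Bayes factor against the null by 1 / (-e p ln p) for p < 1/e. That calibration is a technique swapped in for illustration, not a method used in this post, and the prior probabilities (0.5 for a wage gap, one in a million for precognition) are assumptions chosen only to mirror the two examples.

```python
import math


def max_bayes_factor_against_null(p_value):
    """Upper bound on the evidence against the null that a p-value can carry.

    Uses the Sellke-Bayarri-Berger calibration, valid only for p < 1/e.
    """
    if not 0 < p_value < 1 / math.e:
        raise ValueError("the bound is only valid for 0 < p < 1/e")
    return 1.0 / (-math.e * p_value * math.log(p_value))


def max_posterior(prior, p_value):
    """Largest posterior probability of a real effect, given a prior."""
    prior_odds = prior / (1.0 - prior)
    posterior_odds = prior_odds * max_bayes_factor_against_null(p_value)
    return posterior_odds / (1.0 + posterior_odds)


# Assumed priors, for illustration only.
examples = [
    ("plausible hypothesis (prior 0.5), p = 0.05", 0.5, 0.05),
    ("wildly implausible hypothesis (prior 1e-6), p = 0.001", 1e-6, 0.001),
]
for label, prior, p_value in examples:
    print(f"{label}: posterior at most {max_posterior(prior, p_value):.6f}")
```

Because the bound is the best case for the alternative hypothesis, the real posterior can only be lower. Even so, the implausible hypothesis stays far below one chance in ten thousand despite its tiny p-value, while the plausible one ends up only moderately more likely than it started.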

But okay, p-values can be useful in many cases—if they are used correctly and you see all the results. If you have effect X with p-values 0.03, 0.07, 0.01, 0.06, and 0.09, effect X is probably a real thing. If you have effect Y with p-values 0.04, 0.02, 0.29, 0.35, and 0.74, effect Y is probably not a real thing. But I’ve just set it up so that these would be published exactly the same. They each have two published papers with “statistically significant” results. The other papers never get published and therefore never get seen, so we throw away vital information. This is called the file drawer problem.
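One way to check this intuition is to pool each set of p-values, once using everything and once using only what would get published. The sketch below uses Fisher's method for combining p-values, which assumes the five studies are independent; that method is an assumption brought in for illustration, and other pooling rules would give different numbers. The helper names `fisher_combined_p` and `published_only` are hypothetical.

```python
import math


def fisher_combined_p(p_values):
    """Combine independent p-values with Fisher's method.

    The statistic -2 * sum(ln p) is chi-square with 2k degrees of freedom
    under the null. For an even number of degrees of freedom the upper
    tail has a closed form, so no statistics library is needed.
    """
    k = len(p_values)
    half_statistic = -sum(math.log(p) for p in p_values)
    series = sum(half_statistic ** i / math.factorial(i) for i in range(k))
    return math.exp(-half_statistic) * series


def published_only(p_values, alpha=0.05):
    """Keep only the results a significance-filtering journal would print."""
    return [p for p in p_values if p < alpha]


effect_x = [0.03, 0.07, 0.01, 0.06, 0.09]
effect_y = [0.04, 0.02, 0.29, 0.35, 0.74]

for name, p_values in (("X", effect_x), ("Y", effect_y)):
    visible = published_only(p_values)
    print(f"effect {name}: {len(visible)} published papers, "
          f"pooled p (published only) = {fisher_combined_p(visible):.4f}, "
          f"pooled p (all five studies) = {fisher_combined_p(p_values):.4f}")
```

In this sketch the two effects look about equally convincing when only the published papers are pooled, but the full record separates them far more sharply. That difference is precisely the information the file drawer hides.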

Researchers often have a lot of flexibility in designing their experiments. If their only goal were to find truth, they would use this flexibility to test a variety of scenarios and publish all the results, so they can be compared holistically. But that isn’t their only goal; they also care about keeping their jobs so they can pay rent and feed their families. And under our current system, the only way to ensure that you can do that is by publishing things, which basically means only including the parts that showed up as statistically significant—otherwise, journals aren’t interested. And so we get huge numbers of papers published that tell us basically nothing, because we set up such strong incentives for researchers to give misleading results.
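The arithmetic behind this is simple enough to sketch. Assuming a researcher runs several independent tests on an effect that does not exist, each at the 0.05 threshold, the chance that at least one comes out "significant" grows quickly with the number of tests. Independence is an assumption here; real analysis choices are usually correlated, which changes the exact figures but not the direction.

```python
def chance_of_spurious_finding(n_tests, alpha=0.05):
    """Probability that at least one of n independent tests of a true
    null hypothesis falls below the significance threshold."""
    return 1.0 - (1.0 - alpha) ** n_tests


for n_tests in (1, 5, 10, 20):
    chance = chance_of_spurious_finding(n_tests)
    print(f"{n_tests:2d} tests: {chance:.1%} chance of a publishable false positive")
```

With twenty independent tries the chance is roughly 64%, so a researcher who reports only the specification that "worked" will usually have something to submit even when there is nothing to find.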

The saddest part is that this could be easily fixed.

First, reduce the incentives to publish by finding other ways to evaluate the skill of academics—like teaching for goodness’ sake. Working papers are another good approach. Journals already get far more submissions than they know what to do with, and most of these papers will never be read by more than a handful of people. We don’t need more published findings, we need better published findings—so stop incentivizing mere publication and start finding ways to incentivize research quality.

Second, eliminate private for-profit research journals. Science should be done by government agencies and nonprofits, not for-profit corporations. (And yes, I would apply this to pharmaceutical companies as well, which should really be pharmaceutical manufacturers who make cheap drugs based off of academic research and carry small profit margins.) Why? Again, it’s all about incentives. Corporations have no reason to want to find truth and every reason to want to tilt it in their favor.

Third, increase the number of tenured faculty positions. Instead of building so many new grand edifices to please your plutocratic donors, use your (skyrocketing) tuition money to hire more professors so that you can teach more students better. You can find even more funds if you cut the salaries of your administrators and football coaches. Come on, universities; you are the one industry in the world where labor demand and labor supply are the same people a few years later. You have no excuse for not having the smoothest market clearing in the world. You should never have gluts or shortages.

Fourth, require pre-registration of research studies (as some branches of medicine already do). If the study is sound, an optimal rational agent shouldn’t care in the slightest whether it had a positive or negative result, and if our ape brains won’t let us think that way, we need to establish institutions to force it to happen. Journal editors and reviewers shouldn’t even see the effect size and p-value before they make the decision to publish it; all they should care about is that the experiment makes sense and the proper procedure was conducted.

If we did all that, the replication crisis could be almost completely resolved, as the incentives would be realigned to more closely match the genuine search for truth.

Alas, I don’t see universities or governments or research journals having the political will to actually make such changes, which is very sad indeed.

