Darkest Before the Dawn: Bayesian Impostor Syndrome

Jan 12 JDN 2458860

At the time of writing, I have just returned from my second Allied Social Science Associations Annual Meeting, the AEA’s annual conference (or AEA and friends, I suppose, since several other, much smaller economics and finance associations are represented as well). This one was in San Diego, which made it considerably cheaper for me to attend than last year’s. Alas, next year’s conference will be in Chicago. At least flights to Chicago tend to be cheap because it’s a major hub.

My biggest accomplishment of the conference was getting some face-time and career advice from Colin Camerer, the Caltech economist who literally wrote the book on behavioral game theory. Otherwise I would call the conference successful, but not spectacular. Some of the talks were much better than others; I think I liked the one by Emmanuel Saez best, and I also really liked the one on procrastination by Matthew Gibson. I was mildly disappointed by Ben Bernanke’s keynote address; maybe I would have found it more compelling if I were more focused on macroeconomics.

But while sitting through one of the less-interesting seminars I had a clever little idea, which may help explain why Impostor Syndrome seems to occur so frequently even among highly competent, intelligent people. This post is going to be more technical than most, so be warned: Here There Be Bayes. If you fear yon algebra and wish to skip it, I have marked below a good place for you to jump back in.

Suppose there are two types of people, high talent H and low talent L. (In reality there is of course a wide range of talents, so I could assign a distribution over that range, but it would complicate the model without really changing the conclusions.) You don’t know which one you are; all you know is a prior probability h that you are high-talent. It doesn’t matter too much what h is, but for concreteness let’s say h = 0.50; you’ve got to be in the top 50% to be considered “high-talent”.

You are engaged in some sort of activity that comes with a high risk of failure. Many creative endeavors fit this pattern: Perhaps you are a musician looking for a producer, an actor looking for a gig, an author trying to secure an agent, or a scientist trying to publish in a journal. Or maybe you’re a high school student applying to college, or an unemployed worker submitting job applications.

If you are high-talent, you’re more likely to succeed—but still very likely to fail. And even low-talent people don’t always fail; sometimes you just get lucky. Let’s say the probability of success if you are high-talent is p, and if you are low-talent, the probability of success is q. The precise value depends on the domain; but perhaps p = 0.10 and q = 0.02.

Finally, let’s suppose you are highly rational, a good and proper Bayesian. You update all your probabilities based on your observations, precisely as you should.

How will you feel about your talent, after a series of failures?

More precisely, what posterior probability will you assign to being a high-talent individual, after a series of n+k attempts, of which k met with success and n met with failure?

Since failure is likely even if you are high-talent, you shouldn’t update your probability too much on a failure, but each failure should, in fact, lead to revising your probability downward.

Conversely, since success is rare, it should cause you to revise your probability upward—and, as will become important, your revisions upon success should be much larger than your revisions upon failure.

We begin as any good Bayesian does, with Bayes’ Law:

P[H|(~S)^n (S)^k] = P[(~S)^n (S)^k|H] P[H] / P[(~S)^n (S)^k]

In words, this reads: The posterior probability of being high-talent, given that you have observed k successes and n failures, is equal to the probability of observing such an outcome, given that you are high-talent, times the prior probability of being high-talent, divided by the prior probability of observing such an outcome.

We can compute the probabilities on the right-hand side using the binomial distribution:

P[H] = h

P[(~S)^n (S)^k|H] = (n+k C k) p^k (1-p)^n

P[(~S)^n (S)^k] = (n+k C k) p^k (1-p)^n h + (n+k C k) q^k (1-q)^n (1-h)

Plugging all this back in and canceling like terms yields:

P[H|(~S)^n (S)^k] = 1/(1 + [(1-h)/h] [q/p]^k [(1-q)/(1-p)]^n)
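For anyone who wants the canceling step spelled out, here is a sketch of the algebra in LaTeX notation, using the same symbols as above (with ~S written as \lnot S). The binomial coefficient appears in every term, so it drops out; dividing numerator and denominator by the numerator then gives the compact form.

```latex
\begin{aligned}
P[H \mid (\lnot S)^n S^k]
&= \frac{\binom{n+k}{k}\, p^k (1-p)^n\, h}
        {\binom{n+k}{k}\, p^k (1-p)^n\, h + \binom{n+k}{k}\, q^k (1-q)^n\, (1-h)} \\[4pt]
&= \frac{p^k (1-p)^n\, h}{p^k (1-p)^n\, h + q^k (1-q)^n\, (1-h)} \\[4pt]
&= \frac{1}{1 + \dfrac{1-h}{h} \left(\dfrac{q}{p}\right)^{k} \left(\dfrac{1-q}{1-p}\right)^{n}}
\end{aligned}
```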

This turns out to be particularly convenient in log-odds form:

L[X] = ln [ P(X)/P(~X) ]

L[H|(~S)^n (S)^k] = ln [h/(1-h)] + k ln [p/q] + n ln [(1-p)/(1-q)]

Since p > q, ln[p/q] is a positive number, while ln[(1-p)/(1-q)] is a negative number. This corresponds to the fact that you will increase your posterior when you observe a success (k increases by 1) and decrease your posterior when you observe a failure (n increases by 1).

But when p and q are small, it turns out that ln[p/q] is much larger in magnitude than ln[(1-p)/(1-q)]. For the numbers I gave above, p = 0.10 and q = 0.02, ln[p/q] = 1.609 while ln[(1-p)/(1-q)] = -0.085. You will therefore update substantially more upon a success than on a failure.
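The two log-odds increments are easy to check numerically. The short Python sketch below recomputes them for p = 0.10 and q = 0.02; the function name log_odds_updates is made up for this illustration. It also reports how many failures it takes to cancel out a single success, which is just the ratio of the two magnitudes.

```python
import math

def log_odds_updates(p, q):
    """Return (log-odds change per success, log-odds change per failure)."""
    per_success = math.log(p / q)              # positive when p > q
    per_failure = math.log((1 - p) / (1 - q))  # negative when p > q
    return per_success, per_failure

per_success, per_failure = log_odds_updates(p=0.10, q=0.02)
failures_per_success = per_success / -per_failure

print(f"per success: {per_success:+.3f}")   # +1.609
print(f"per failure: {per_failure:+.3f}")   # -0.085
print(f"failures needed to cancel one success: {failures_per_success:.1f}")
```

With these numbers, one success carries about as much weight as nineteen failures, which is why a single success can undo a long losing streak.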

Yet successes are rare! This means that any given success will most likely be first preceded by a sequence of failures. This results in what I will call the darkest-before-dawn effect: Your opinion of your own talent will tend to be at its very worst in the moments just preceding a major success.

I’ve graphed the results of a few simulations illustrating this: On the X-axis is the number of overall attempts made thus far, and on the Y-axis is the posterior probability of being high-talent. The simulated individual undergoes randomized successes and failures with the probabilities I chose above.

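[Figure: Bayesian_Impostor_full. Posterior probability of being high-talent (Y-axis) against number of attempts so far (X-axis), for 10 simulated runs.]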
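A simulation along these lines can be sketched in a few lines of Python. This is not the code behind the graph: the function name simulate_run, the random seed, and the choice of 30 attempts per run are all assumptions made for illustration. Each run is a high-talent individual, so successes arrive with probability p, and the posterior is tracked by adding the appropriate log-odds increment after every attempt.

```python
import math
import random

def simulate_run(attempts, h=0.50, p=0.10, q=0.02, rng=None):
    """Simulate one high-talent individual.

    Returns (outcomes, posteriors). outcomes has one bool per attempt;
    posteriors has attempts + 1 entries, starting with the prior h.
    """
    rng = rng or random.Random()
    log_odds = math.log(h / (1 - h))
    gain = math.log(p / q)               # log-odds change on a success
    loss = math.log((1 - p) / (1 - q))   # log-odds change on a failure (negative)
    outcomes = []
    posteriors = [h]
    for _ in range(attempts):
        success = rng.random() < p       # true type is high-talent, so use p
        log_odds += gain if success else loss
        outcomes.append(success)
        posteriors.append(1 / (1 + math.exp(-log_odds)))
    return outcomes, posteriors

rng = random.Random(0)  # arbitrary seed, only so the sketch is reproducible
for run in range(1, 11):
    outcomes, posteriors = simulate_run(30, rng=rng)
    print(f"run {run:2d}: successes={sum(outcomes)}, "
          f"lowest posterior={min(posteriors):.2f}, final={posteriors[-1]:.2f}")
```

Because every failure lowers the posterior and every success raises it, the lowest point in any losing streak is always the attempt just before the success that ends it.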

There are 10 simulations on that one graph, which may make it a bit confusing. So let’s focus in on two runs in particular, which turned out to be run 6 and run 10:

[If you skipped over the math, here’s a good place to come back. Welcome!]

[Figure: Bayesian_Impostor_focus. The same plot restricted to run 6 and run 10.]

Run 6 is a lucky little devil. They had an immediate success, followed by another success in their fourth attempt. As a result, they quickly update their posterior to conclude that they are almost certainly a high-talent individual, and even after a string of failures beyond that they never lose faith.

Run 10, on the other hand, probably has Impostor Syndrome. Failure after failure after failure slowly eroded their self-esteem, leading them to conclude that they are probably a low-talent individual. And then, suddenly, a miracle occurs: On their 20th attempt, at last they succeed, and their whole outlook changes; perhaps they are high-talent after all.

Note that all the simulations are of high-talent individuals. Run 6 and run 10 are equally competent. Ex ante, the probability of success for run 6 and run 10 was exactly the same. Moreover, both individuals are completely rational, in the sense that they are doing perfect Bayesian updating.

And yet, if you compare their self-evaluations after the 19th attempt, they could hardly look more different: Run 6 is 85% sure that they are high-talent, even though they’ve been in a slump for the last 13 attempts. Run 10, on the other hand, is 83% sure that they are low-talent, because they’ve never succeeded at all.

It is darkest just before the dawn: Run 10’s self-evaluation is at its very lowest right before they finally have a success, at which point their self-esteem surges upward, almost to baseline. With just one more success, their opinion of themselves would in fact converge to the same as Run 6’s.
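These figures can be reproduced directly from the closed-form posterior. The sketch below assumes the counts implied by the text: run 6 has 2 successes and 17 failures after 19 attempts, and run 10 has 19 failures followed by a success on attempt 20. The function name posterior_high_talent is hypothetical.

```python
def posterior_high_talent(failures, successes, h=0.50, p=0.10, q=0.02):
    """Posterior probability of being high-talent given a record of outcomes."""
    prior_odds_against = (1 - h) / h
    likelihood_ratio = (q / p) ** successes * ((1 - q) / (1 - p)) ** failures
    return 1 / (1 + prior_odds_against * likelihood_ratio)

# Run 6 after 19 attempts: 2 successes, 17 failures.
run6 = posterior_high_talent(failures=17, successes=2)
# Run 10 after 19 attempts: no successes yet.
run10_before = posterior_high_talent(failures=19, successes=0)
# Run 10 after the 20th attempt finally succeeds.
run10_after = posterior_high_talent(failures=19, successes=1)

print(f"run 6 after 19 attempts:  {run6:.3f}")
print(f"run 10 after 19 attempts: {run10_before:.3f}")
print(f"run 10 after 20 attempts: {run10_after:.3f}")
```

The results come out to roughly 0.85, 0.17, and 0.50: run 6 is about 85% sure of being high-talent, run 10 is about 83% sure of being low-talent, and a single success brings run 10 back almost exactly to the prior.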

This may explain, at least in part, why Impostor Syndrome is so common. When successes are few and far between—even for the very best and brightest—then a string of failures is the most likely outcome for almost everyone, and it can be difficult to tell whether you are so bright after all. Failure after failure will slowly erode your self-esteem (and should, in some sense; you’re being a good Bayesian!). You’ll observe a few lucky individuals who get their big break right away, and it will only reinforce your fear that you’re not cut out for this (whatever this is) after all.

Of course, this model is far too simple: People don’t just come in “talented” and “untalented” varieties, but have a wide range of skills that lie on a continuum. There are degrees of success and failure as well: You could get published in some obscure field journal hardly anybody reads, or in the top journal in your discipline. You could get into the University of Northwestern Ohio, or into Harvard. And people face different barriers to success that may have nothing to do with talent—perhaps why marginalized people such as women, racial minorities, LGBT people, and people with disabilities tend to have the highest rates of Impostor Syndrome. But I think the overall pattern is right: People feel like impostors when they’ve experienced a long string of failures, even when that is likely to occur for everyone.

What can be done with this information? Well, it leads me to three pieces of advice:

1. When success is rare, find other evidence. If truly “succeeding” (whatever that means in your case) is unlikely on any given attempt, don’t try to evaluate your own competence based on that extremely noisy signal. Instead, look for other sources of data: Do you seem to have the kinds of skills that people who succeed in your endeavors have, preferably based on the most objective measures you can find? Do others who know you or your work have a high opinion of your abilities and your potential? This, perhaps, is the greatest mistake we make when falling prey to Impostor Syndrome: We imagine that we have somehow “fooled” people into thinking we are competent, rather than realizing that other people’s opinions of us are actually evidence that we are in fact competent. Use this evidence. Update your posterior on that.

2. Don’t over-update your posterior on failures, and don’t under-update on successes. Very few living humans (if any) are true and proper Bayesians. We use a variety of heuristics when judging probability, most notably the representativeness and availability heuristics. These will cause you to over-respond to failures, because this string of failures makes you “look like” the kind of person who would continue to fail (representativeness), and you can’t conjure to mind any clear examples of success (availability). Keeping this in mind, your update upon experiencing failure should be small, probably as small as you can make it. Conversely, when you do actually succeed, even in a small way, don’t dismiss it. Don’t look for reasons why it was just luck; it’s always luck, at least in part, for everyone. Try to update your self-evaluation more when you succeed, precisely because success is rare for everyone.

3. Don’t lose hope. The next one really could be your big break. While astronomically baffling (no, it’s darkest at midnight, in between dusk and dawn!), “it is always darkest before the dawn” really does apply here. You are likely to feel the worst about yourself at the very point where you are about to finally succeed. The lowest self-esteem you ever feel will be just before you finally achieve a major success. Of course, you can’t know if the next one will be it—or if it will take five, or ten, or twenty more tries. And yes, each new failure will hurt a little bit more, make you doubt yourself a little bit more. But if you are properly grounded by what others think of your talents, you can stand firm, until that one glorious day comes and you finally make it.

Now, if I could only manage to take my own advice….

Sexism in the economics profession

Jan 20 JDN 2458504

I mentioned in my previous post that the economics profession is currently coming to a reckoning with its own sexist biases. Today I’d like to get back to that in more detail.

I think I should include some kind of trigger warning here, because some of this sexism is pretty extreme. In particular, there are going to be references to anal sex, which certainly isn’t something I was expecting to find. I won’t quote anything highly explicit—but I assure you, it exists.

There is reason to believe that these biases are not as bad as they once were. If you compare the cohorts of new economics PhDs to those of the past, or to the professors who have been tenured for many years, the pattern is quite clear: The further back you look, the fewer women (and racial minorities, and LGBT people) you see.

In part because of the #MeToo movement (which, I really would like to say, has done an excellent job of picking legitimate targets and not publicly shaming the wrong people, unlike almost every other attempt at public shaming via social media), the economics profession is also coming to terms with a related matter, which could be both cause and consequence of these gender disparities: Sexual harassment by economists of their students and junior faculty.

It wasn’t until last year that the AEA officially adopted a Code of Professional Conduct mandating equality of opportunity for women (and minorities, and LGBT people). Of course, sexual harassment has been illegal much longer than that—but it’s probably the most under-reported and under-prosecuted crime in existence. Last year’s AEA conference was the first to include panels specifically on gender and discrimination in economics, and this year’s conference had more.

Grad students have been a big part of this push; hundreds of econ grad students signed an open letter demanding that universities implement reporting and disciplinary systems to deal with sexual harassment in economics (one of the signatories is a friend of mine from UCI, though strangely I don’t remember hearing about it, or I would have signed it too).

One of the most prominent economists accused of repeated sexual harassment unfortunately happens to be the youngest Black person ever to get tenured at Harvard. This would seem to create some tension between gender equality and racial equality. But of course this tension is illusory: There are plenty of other brilliant Black economists they could have hired who aren’t serial sexual harassers.

It’s still dicey for grad students and junior faculty to talk about these things, because of the very real power that senior faculty have over us as committees for dissertations, hiring, and tenure. Some economists who wrote papers about sexism in the profession have chosen to remain anonymous for fear of retaliation.

Part of how this issue has finally gotten so much attention is by concerned economists actually showing it using the methods of social science. One of the most striking studies was a data analysis of the word usage on econjobrumors.com, a job discussion board for PhD grads and junior faculty in economics. (More detail on that study here and also here.)

I’ve bolded the terms that are sexual or suggest bias. I’ve italicized the terms that suggest something involving romantic or family relationships. I’ve underlined the terms actually relevant for economics.

These were the terms most commonly associated with women:

hotter, lesbian, bb, sexism, tits, anal, marrying, feminazi, slut, hot, vagina, boobs, pregnant, pregnancy, cute, marry, levy, gorgeous, horny, crush, beautiful, secretary, dump, shopping, date, nonprofit, intentions, sexy, dated and prostitute.

These were the terms most commonly associated with men:

juicy, keys, adviser, bully, prepare, fought, wharton, austrian, fieckers, homo, genes, e7ee, mathematician, advisor, burning, pricing, fully, band, kfc, nobel, cat, amusing, greatest, textbook, goals, irritate, roof, pointing, episode, and tries.

I imagine the two lists more or less speak for themselves. I’m particularly shocked by the high prevalence of the word “anal”—the sixth-most common word used in threads involving women.

Who goes to an economics job forum and starts talking about anal sex?

I actually did a search on “anal” to see what sort of things were being discussed: This thread is apparently someone trying to decide where he should work based on “Girls of which country are easiest to get?”, so basically sex tourism as job market planning. Here’s another asking (perhaps legitimately) about the appropriate social norm for splitting vacation costs with a girlfriend, and someone down the thread recommends that in exchange for paying, he should expect her to provide him with anal sex. This one starts with a man lamenting that his girlfriend dumped him on his birthday (that’s a dick move by the way), but somehow veers off into a discussion of whether anal sex is overrated. And this one is just off the bat about frequency of sexual encounters.

So yeah, I’m really not surprised that there aren’t a lot of women on these so-called “job discussion boards”.

The only bias-related word associated with men was “homo”—so it’s actually a homophobic bias, itself indicative of sexism and a profession dominated by cisgender straight White men. I’m not entirely sure that “juicy” was intended to be sexualized (one could also speak of “juicy ideas”), but I’ll assume it was just to conservatively estimate the gender disparity.

Also of special note are “fieckers” and “e7ee”, which refer to specific users, who, despite being presumably economists, caused a great deal of damage to the discussion boards. “fieckers” was an idiosyncratic word that one user used in a variety of sexist and homophobic troll posts, while “e7ee” is the hexadecimal code for one of the former moderators, who apparently unilaterally deleted and moved threads in order to tilt the entire discussion board toward right-wing laissez-faire economics.

Of course, that one discussion board isn’t representative of the entire profession. As anyone who has ever visited 4chan knows, discussion boards can be some of the darkest places on the Internet.

Clearer evidence of discrimination where it counts can be found in citation studies, which have found that papers published by women in top economics journals are more highly cited than papers published by men in the same journals.

What does that mean? Well, it’s the same reason that female stock brokers outperform male brokers and firms with more female executives are more profitable. Women are held to a higher standard than men, so in order to simply get in, women have to be more competent and produce higher-quality output.

Admittedly, citation count is far from a perfect measure of research quality (and for that matter profit is far from a perfect measure of a well-run corporation). But this is very clear evidence of actual discrimination. Not innate differences in preferences, not differences in talent, but actual discrimination. It’s less clear where and how the discrimination is happening. Are journals simply not accepting good papers if they see female authors? (This is possible, because most top journals in economics don’t use double-blind peer review anymore, for quite flimsy reasons in my opinion.) Are there not enough mentors for women in academia? Are women moving to more accepting fields before they even enter grad school? Are they being pushed out by harassment as grad students? Likely all of these are part of the story.

There’s reason to think that economic ideology has contributed to this problem. If you think of the world in neoclassical laissez-faire terms, where markets are perfect and always lead to the best outcome, then you are likely to be blind to bias and discrimination, because a perfect market would obviously eliminate such things. This is why the recognition of bias has largely come from empirical studies of labor markets, and to a lesser extent from experiments and more left-wing theorists. If you assert that markets are perfectly efficient, labor economists are likely to laugh in your face, while a surprising number of macro theorists will nod and ask you to continue.

Interestingly, recent field experiments on bias in hiring of new faculty did not find any bias against women in economics (and found biases toward women in several other fields). Of course, that doesn’t mean there never was such bias; but perhaps we’ve actually managed to remove it. So that’s one major avenue of discrimination we maybe finally have under control. Only several dozen left to go?