Everyone includes your mother and Los Angeles

Apr 28 JDN 2460430

What are the chances that artificial intelligence will destroy human civilization?

A bunch of experts were surveyed on that question and similar questions, and half of respondents gave a probability of 5% or more; some gave probabilities as high as 99%.

This is incredibly bizarre.

Most AI experts are people who work in AI. They are actively participating in developing this technology. And yet more than half of them think that the technology they are working on right now has a more than 5% chance of destroying human civilization!?

It feels to me like they honestly don’t understand what they’re saying. They can’t really grasp at an intuitive level just what a 5% or 10% chance of global annihilation means—let alone a 99% chance.

If something has a 5% chance of killing everyone, we should consider that at least as bad as something that is guaranteed to kill 5% of people.

Probably worse, in fact, because you can recover from losing 5% of the population (we have, several times throughout history). But you cannot recover from losing everyone. So really, it’s like losing 5% of all future people who will ever live—which could be a very large number indeed.

But let’s be a little conservative here, and just count people who already, currently exist, and use 5% of that number.

5% of 8 billion people is 400 million people.

So anyone who is working on AI and also says that AI has a 5% chance of causing human extinction is basically saying: “In expectation, I’m supporting 20 Holocausts.”

If you really think the odds are that high, why aren’t you demanding that any work on AI be tried as a crime against humanity? Why aren’t you out there throwing Molotov cocktails at data centers?

(To be fair, Eliezer Yudkowsky is actually calling for a global ban on AI that would be enforced by military action. That’s the kind of thing you should be doing if indeed you believe the odds are that high. But most AI doomsayers don’t call for such drastic measures, and many of them even continue working in AI as if nothing is wrong.)

I think this must be scope neglect or something even worse.

If you thought a drug had a 99% chance of killing your mother, you would never let her take the drug, and you would probably sue the company for making it.

If you thought a technology had a 99% chance of destroying Los Angeles, you would never even consider working on that technology, and you would want that technology immediately and permanently banned.

So I would like to remind anyone who says they believe the danger is this great and yet continues working in the industry:

Everyone includes your mother and Los Angeles.

If AI destroys human civilization, that means AI destroys Los Angeles. However shocked and horrified you would be if a nuclear weapon were detonated in the middle of Hollywood, you should be at least that shocked and horrified by anyone working on advancing AI, if indeed you truly believe that there is at least a 5% chance of AI destroying human civilization.

But people just don’t seem to think this way. Their minds seem to take on a totally different attitude toward “everyone” than they would take toward any particular person or even any particular city. The notion of total human annihilation is just so remote, so abstract, they can’t even be afraid of it the way they are afraid of losing their loved ones.

This despite the fact that everyone includes all your loved ones.

If a drug had a 5% chance of killing your mother, you might let her take it—but only if that drug was the best way to treat some very serious disease. Chemotherapy can be about that risky—but you don’t go on chemo unless you have cancer.

If a technology had a 5% chance of destroying Los Angeles, I’m honestly having trouble thinking of scenarios in which we would be willing to take that risk. But the closest I can come to it is the Manhattan Project. If you’re currently fighting a global war against fascist imperialists, and they are also working on making an atomic bomb, then being the first to make an atomic bomb may in fact be the best option, even if you know that it carries a serious risk of utter catastrophe.

In any case, I think one thing is clear: You don’t take that kind of serious risk unless there is some very large benefit. You don’t take chemotherapy on a whim. You don’t invent atomic bombs just out of curiosity.

Where’s the huge benefit of AI that would justify taking such a huge risk?

Some forms of automation are clearly beneficial, but so far AI per se seems to have largely made our society worse. ChatGPT lies to us. Robocalls inundate us. Deepfakes endanger journalism. What’s the upside here? It makes a ton of money for tech companies, I guess?

Now, fortunately, I think 5% is too high an estimate.

(Scientific American agrees.)

My own estimate is that, over the next two centuries, there is about a 1% chance that AI destroys human civilization, and only a 0.1% chance that it results in human extinction.

This is still really high.

People seem to have trouble with that too.

“Oh, there’s a 99.9% chance we won’t all die; everything is fine, then?” No. There are plenty of other scenarios that would also be very bad, and a total extinction scenario is so terrible that even a 0.1% chance is not something we can simply ignore.

0.1% of people is still 8 million people.
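
The arithmetic behind these figures is plain expected-value multiplication. A minimal sketch in Python, assuming the world population of 8 billion used above; the function name `expected_deaths` is only illustrative:

```python
WORLD_POPULATION = 8_000_000_000  # figure used in this post

def expected_deaths(p_everyone_dies: float, population: int = WORLD_POPULATION) -> float:
    """Expected number of deaths if everyone dies with probability p_everyone_dies."""
    return p_everyone_dies * population

# The two probabilities discussed above: 5% and 0.1%.
for p in (0.05, 0.001):
    print(f"{p:.1%} chance of killing everyone -> {expected_deaths(p):,.0f} expected deaths")
```

This counts only people alive today, which is the conservative choice made earlier in the post.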

I find myself in a very odd position: On the one hand, I think the probabilities that doomsayers are giving are far too high. On the other hand, I think the actions that are being taken—even by those same doomsayers—are far too small.

Most of them don’t seem to consider a 5% chance to be worthy of drastic action, while I consider a 0.1% chance to be well worthy of it. I would support a complete ban on all AI research immediately, just from that 0.1%.

The only research we should be doing that is in any way related to AI should involve how to make AI safer—absolutely no one should be trying to make it more powerful or apply it to make money. (Yet in reality, almost the opposite is the case.)

Because 8 million people is still a lot of people.

Is it fair to treat a 0.1% chance of killing everyone as equivalent to killing 0.1% of people?

Well, first of all, we have to consider the uncertainty. The difference between a 0.05% chance and a 0.15% chance is millions of people, but there’s probably no way we can actually measure it that precisely.

But it seems to me that something expected to kill between 4 million and 12 million people would still generally be considered very bad.

More importantly, there’s also a chance that AI will save people, or have similarly large benefits. We need to factor that in as well. Something that will kill 4-12 million people but also save 15-30 million people is probably still worth doing (but we should also be trying to find ways to minimize the harm and maximize the benefit).

The biggest problem is that we are deeply uncertain about both the upsides and the downsides. There are a vast number of possible outcomes from inventing AI. Many of those outcomes are relatively mundane; some are moderately good, others are moderately bad. But the moral question seems to be dominated by the big outcomes: With some small but non-negligible probability, AI could lead to either a utopian future or an utter disaster.

The way we are leaping directly into applying AI without even being anywhere close to understanding AI seems to me especially likely to lean toward disaster. No other technology has ever become so immediately widespread while also being so poorly understood.

So far, I’ve yet to see any convincing arguments that the benefits of AI are anywhere near large enough to justify this kind of existential risk. In the near term, AI really only promises economic disruption that will largely be harmful. Maybe one day AI could lead us into a glorious utopia of automated luxury communism, but we really have no way of knowing that will happen—and it seems pretty clear that Google is not going to do that.

Artificial intelligence technology is moving too fast. Even if it doesn’t become powerful enough to threaten our survival for another 50 years (which I suspect it won’t), if we continue on our current path of “make money now, ask questions never”, it’s still not clear that we would actually understand it well enough to protect ourselves by then—and in the meantime it is already causing us significant harm for little apparent benefit.

Why are we even doing this? Why does halting AI research feel like stopping a freight train?

I dare say it’s because we have handed over so much power to corporations.

The paperclippers are already here.

Bundling the stakes to recalibrate ourselves

Mar 31 JDN 2460402

In a previous post I reflected on how our minds evolved for an environment of immediate return: An immediate threat with high chance of success and life-or-death stakes. But the world we live in is one of delayed return: delayed consequences with low chance of success and minimal stakes.

We evolved for a world where you need to either jump that ravine right now or you’ll die; but we live in a world where you’ll submit a hundred job applications before finally getting a good offer.

Thus, our anxiety system is miscalibrated for our modern world, and this miscalibration causes us to have deep, chronic anxiety which is pathological, instead of brief, intense anxiety that would protect us from harm.

I had an idea for how we might try to jury-rig this system and recalibrate ourselves:

Bundle the stakes.

Consider job applications.

The obvious way to think about it is to consider each application, and decide whether it’s worth the effort.

Any particular job application in today’s market probably costs you 30 minutes, but you won’t hear back for 2 weeks, and you have maybe a 2% chance of success. But if you fail, all you lost was that 30 minutes. This is the exact opposite of what our brains evolved to handle.

So now suppose you think of it in terms of sending 100 job applications.

That will cost you 30 times 100 minutes = 50 hours. You still won’t hear back for weeks, but you’ve spent weeks, so that won’t feel as strange. And your chances of success after 100 applications are something like 1-(0.98)^100 = 87%.
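
These bundled figures can be checked in a few lines of Python. This is a minimal sketch, assuming every application succeeds independently with the same 2% probability and costs the same 30 minutes (real applications are surely correlated, so treat it as a rough guide); the function names are hypothetical.

```python
def bundled_success_probability(p_single: float, n: int) -> float:
    """Probability that at least one of n independent attempts succeeds."""
    return 1 - (1 - p_single) ** n

def bundled_cost_hours(minutes_each: float, n: int) -> float:
    """Total time cost, in hours, of n attempts."""
    return minutes_each * n / 60

p = bundled_success_probability(0.02, 100)
hours = bundled_cost_hours(30, 100)
print(f"{hours:.0f} hours, {p:.0%} chance of at least one offer")
# prints "50 hours, 87% chance of at least one offer"
```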

Even losing 50 hours over a few weeks is not the disaster that falling down a ravine is. But it still feels a lot more reasonable to be anxious about that than to be anxious about losing 30 minutes.

More importantly, we have radically changed the chances of success.

Each individual application will almost certainly fail, but all 100 together will probably succeed.

If we were optimally rational, these two methods would lead to the same outcomes, by a rather deep mathematical law, the linearity of expectation:
E[nX] = n E[X]

Thus, the expected utility of doing something n times is precisely n times the expected utility of doing it once (all other things equal); and so, it doesn’t matter which way you look at it.
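
Strictly speaking, a bundle of applications is a sum of n separate attempts rather than one attempt scaled by n, but linearity covers that case as well. A sketch, where X_i is notation introduced here for the payoff of the i-th application, and every application is assumed to have the same expected payoff E[X]:

```latex
\mathbb{E}\!\left[\sum_{i=1}^{n} X_i\right] = \sum_{i=1}^{n} \mathbb{E}[X_i] = n\,\mathbb{E}[X]
```

The first equality holds whether or not the applications are independent of one another.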

But of course we aren’t perfectly rational. We don’t actually respond to the expected utility. It’s still not entirely clear how we do assess probability in our minds (prospect theory seems to be onto something, but it’s computationally harder than rational probability, which means it makes absolutely no sense to evolve it).

If instead we are trying to match up our decisions with a much simpler heuristic that evolved for things like jumping over ravines, our representation of probability may be very simple indeed, something like “definitely”, “probably”, “maybe”, “probably not”, “definitely not”. (This is essentially my categorical prospect theory, which, like the stochastic overload model, is a half-baked theory that I haven’t published and at this point probably never will.)

2% chance of success is solidly “probably not” (or maybe something even stronger, like “almost definitely not”). Then, outcomes that are in that category are presumably weighted pretty low, because they generally don’t happen. Unless they are really good or really bad, it’s probably safest to ignore them—and in this case, they are neither.

But 87% chance of success is a clear “probably”; and outcomes in that category deserve our attention, even if their stakes aren’t especially high. And in fact, by bundling them, we have even made the stakes a bit higher—likely making the outcome a bit more salient.

The goal is to change “this will never work” to “this is going to work”.

For an individual application, there’s really no way to do that (without self-delusion); maybe you can make the odds a little better than 2%, but you surely can’t make them so high they deserve to go all the way up to “probably”. (At best you might manage a “maybe”, if you’ve got the right contacts or something.)

But for the whole set of 100 applications, this is in fact the correct assessment. It will probably work. And if 100 doesn’t, 150 might; if 150 doesn’t, 200 might. At no point do you need to delude yourself into over-estimating the odds, because the actual odds are in your favor.
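
The same reasoning can be run in reverse, to ask how many applications it takes to reach a given chance of at least one success. A minimal sketch, again assuming independent applications with identical odds; `applications_needed` is a hypothetical helper, not an established formula from this post.

```python
import math

def applications_needed(p_single: float, target: float) -> int:
    """Smallest n such that 1 - (1 - p_single)**n >= target, for independent attempts."""
    if not (0 < p_single < 1 and 0 < target < 1):
        raise ValueError("p_single and target must both be strictly between 0 and 1")
    # Solve (1 - p_single)**n <= 1 - target for n, then round up.
    return math.ceil(math.log(1 - target) / math.log(1 - p_single))

n = applications_needed(0.02, 0.95)
print(f"{n} applications for a 95% chance of at least one success")
```

At 2% odds per application, the number required grows only slowly as the target rises, which is why going from 100 to 150 or 200 applications keeps improving the picture.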

This isn’t perfect, though.

There’s a glaring problem with this technique that I still can’t resolve: It feels overwhelming.

Doing one job application is really not that big a deal. It accomplishes very little, but also costs very little.

Doing 100 job applications is an enormous undertaking that will take up most of your time for multiple weeks.

So if you are feeling demotivated, asking you to bundle the stakes is asking you to take on a huge, overwhelming task that surely feels utterly beyond you.

Also, when it comes to this particular example, I even managed to do 100 job applications and still get a pretty bad outcome: My only offer was Edinburgh, and I ended up being miserable there. I have reason to believe that these were exceptional circumstances (due to COVID), but it has still been hard to shake the feeling of helplessness I learned from that ordeal.

Maybe there’s some additional reframing that can help here. If so, I haven’t found it yet.

But maybe stakes bundling can help you, or someone out there, even if it can’t help me.

Why are political speeches so vacuous?

Aug 27 JDN 2460184

In last week’s post I talked about how posters for shows at the Fringe seem to be attention-grabbing but almost utterly devoid of useful information.

This brings to mind another sort of content that also fits that description: political speeches.

While there are some exceptions—including in fact some of the greatest political speeches ever made, such as Martin Luther King’s “I have a dream” or Dwight Eisenhower’s “Cross of Iron”—on the whole, most political speeches seem to be incredibly vacuous.

Each country probably has its own unique flavor of vacuousness, but in the US they talk about motherhood, and apple pie, and American exceptionalism. “I love my great country, we are an amazing country, I’m so proud to live here” is basically the extent of the information conveyed within what could well be a full hour-long oration.

This raises a question: Why? Why don’t political speeches typically contain useful information?

It’s not that there’s no useful information to be conveyed: There are all sorts of things that people would like to know about a political candidate, including how honest they are, how competent they are, and the whole range of policies they intend to support or oppose on a variety of issues.

But most of what you’d like to know about a candidate actually comes in one of two varieties: Cheap talk, or controversy.

Cheap talk is the part related to being honest and competent. Basically every voter wants candidates who are honest and competent, and we know all too well that not all candidates qualify. The problem is, how do they show that they are honest and competent? They could simply assert it, but that’s basically meaningless—anybody could assert it. In fact, Donald Trump is the candidate who leaps to mind as the most eager to frequently assert his own honesty and competence, and also the most successful candidate in at least my lifetime who seems to utterly and totally lack anything resembling these qualities.

So unless you are clever enough to find ways to demonstrate your honesty and competence, you’re really not accomplishing anything by asserting it. Most people simply won’t believe you, and they’re right not to. So it doesn’t make much sense to spend a lot of effort trying to make such assertions.

Alternatively, you could try to talk about policy, say what you would like to do regarding climate change, the budget, or the military, or the healthcare system, or any of dozens of other political questions. That would absolutely be useful information for voters, and it isn’t just cheap talk, because different candidates and voters do intend different things and voters would like to know which ones are which.

The problem, then, is that it’s controversial. Not everyone is going to agree with your particular take on any given political issue—even within your own party there is bound to be substantial disagreement.

If enough voters were sufficiently rational about this, and could coolly evaluate a candidate’s policies, accepting the pros and cons, then it would still make sense to deliver this information. I for one would rather vote for someone I know agrees with me 90% of the time than someone who won’t even tell me what they intend to do while in office.

But in fact most voters are not sufficiently rational about this. Voters react much more strongly to negative information than positive information: A candidate you agree with 9 times out of 10 can still make you utterly outraged by their stance on issue number 10. This is a specific form of the more general phenomenon of negativity bias: Psychologically, people just react a lot more strongly to bad things than to good things. Negativity bias has strong effects on how people vote, especially young people.

Rather than a cool-headed, rational assessment of pros and cons, most voters base their decision on deal-breakers: “I could never vote for a Republican” or “I could never vote for someone who wants to cut the military”. Only after they’ve excluded a large portion of candidates based on these heuristics do they even try to look closer at the detailed differences between candidates.

This means that, if you are a candidate, your best option is to avoid offering any deal-breakers. You want to say things that almost nobody will strongly disagree with—because any strong disagreement could be someone’s deal-breaker and thereby hurt your poll numbers.

And what’s the best way to not say anything that will offend or annoy anyone? Not say anything at all. Campaign managers basically need to Mirandize their candidates: You have the right to remain silent. Anything you say can and will be used against you in the court of public opinion.

But in fact you can’t literally remain silent—when running for office, you are expected to make a lot of speeches. So you do the next best thing: You say a lot of words, but convey very little meaning. You say things like “America is great” and “I love apple pie” and “Moms are heroes” that, while utterly vapid, are very unlikely to make anyone particularly angry at you or be any voter’s deal-breaker.

And then we get into a Nash equilibrium where everyone is talking like this, nobody is saying anything, and political speeches become entirely devoid of useful content.

What can we as voters do about this? Individually, perhaps nothing. Collectively, literally everything.

If we could somehow shift the equilibrium so that candidates who are brave enough to make substantive, controversial claims get rewarded for it—even when we don’t entirely agree with them—while those who continue to recite insipid nonsense are punished, then candidates will absolutely change how they speak.

But this would require a lot of people to change, more or less all at once. A sufficiently large critical mass of voters would need to be willing to support candidates specifically because they made detailed policy proposals, even if we didn’t particularly like those policy proposals.

Obviously, if their policy proposals were terrible, we’d have good reason to reject them; but for this to work, we need to be willing to support a lot of things that are just… kind of okay. Because it’s vanishingly unlikely that the first candidates who are brave enough to say what they intend will also be ones whose intentions we entirely agree with. We need to set some kind of threshold of minimum agreement, and reward anyone who exceeds it. We need to ask ourselves if our deal-breakers really need to be deal-breakers.

What behavioral economics needs

Apr 16 JDN 2460049

The transition from neoclassical to behavioral economics has been a vital step forward in science. But lately we seem to have reached a plateau, with no major advances in the paradigm in quite some time.

It could be that there is work already being done which will, in hindsight, turn out to be significant enough to make that next step forward. But my fear is that we are getting bogged down by our own methodological limitations.

Neoclassical economics shared with us its obsession with mathematical sophistication. To some extent this was inevitable; in order to impress neoclassical economists enough to convert some of them, we had to use fancy math. We had to show that we could do it their way in order to convince them why we shouldn’t—otherwise, they’d just have dismissed us the way they had dismissed psychologists for decades, as too “fuzzy-headed” to do the “hard work” of putting everything into equations.

But the truth is, putting everything into equations was never the right approach. Because human beings clearly don’t think in equations. Once we write down a utility function and get ready to take its derivative and set it equal to zero, we have already distanced ourselves from how human thought actually works.

When dealing with a simple physical system, like an atom, equations make sense. Nobody thinks that the electron knows the equation and is following it intentionally. That equation simply describes how the forces of the universe operate, and the electron is subject to those forces.

But human beings do actually know things and do things intentionally. And while an equation could be useful for analyzing human behavior in the aggregate—I’m certainly not objecting to statistical analysis—it really never made sense to say that people make their decisions by optimizing the value of some function. Most people barely even know what a function is, much less remember calculus well enough to optimize one.

Yet right now, behavioral economics is still all based in that utility-maximization paradigm. We don’t use the same simplistic utility functions as neoclassical economists; we make them more sophisticated and realistic. Yet in that very sophistication we make things more complicated, more difficult—and thus in at least that respect, even further removed from how actual human thought must operate.

The worst offender here is surely Prospect Theory. I recognize that Prospect Theory predicts human behavior better than conventional expected utility theory; nevertheless, it makes absolutely no sense to suppose that human beings actually do some kind of probability-weighting calculation in their heads when they make judgments. Most of my students, who are well-trained in mathematics and economics, can’t even do that probability-weighting calculation on paper, with a calculator, on an exam. (There’s also absolutely no reason to do it! All it does is make your decisions worse!) This is a totally unrealistic model of human thought.
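
For concreteness, here is roughly what that probability-weighting calculation involves. This sketch assumes the one-parameter weighting function from Tversky and Kahneman's 1992 cumulative prospect theory paper, with their estimate of γ = 0.61 for gains. Neither the functional form nor the parameter is specified in this post, and other forms exist in the literature.

```python
def tk_weight(p: float, gamma: float = 0.61) -> float:
    """Tversky-Kahneman (1992) weighting: p^g / (p^g + (1 - p)^g)^(1/g)."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    numerator = p ** gamma
    return numerator / (numerator + (1 - p) ** gamma) ** (1 / gamma)

# Compare stated probabilities with the decision weights the model assigns them.
for p in (0.02, 0.5, 0.87):
    print(f"stated probability {p:.2f} -> decision weight {tk_weight(p):.3f}")
```

Under these assumed parameters, small probabilities are overweighted and large ones underweighted. Doing this by hand means raising numbers to fractional powers, which is hardly the kind of thing anyone does in their head.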

This is not to say that human beings are stupid. We are still smarter than any other entity in the known universe—computers are rapidly catching up, but they haven’t caught up yet. It is just that whatever makes us smart must not be easily expressible as an equation that maximizes a function. Our thoughts are bundles of heuristics, each of which may be individually quite simple, but all of which together make us capable of not only intelligence, but something computers still sorely, pathetically lack: wisdom. Computers optimize functions better than we ever will, but we still make better decisions than they do.

I think that what behavioral economics needs now is a new unifying theory of these heuristics, which accounts for not only how they work, but how we select which one to use in a given situation, and perhaps even where they come from in the first place. This new theory will of course be complex; there’s a lot of things to explain, and human behavior is a very complex phenomenon. But it shouldn’t be—mustn’t be—reliant on sophisticated advanced mathematics, because most people can’t do advanced mathematics (almost by construction—we would call it something different otherwise). If your model assumes that people are taking derivatives in their heads, your model is already broken. 90% of the world’s people can’t take a derivative.

I guess it could be that our cognitive processes in some sense operate as if they are optimizing some function. This is commonly posited for the human motor system, for instance; clearly baseball players aren’t actually solving differential equations when they throw and catch balls, but the trajectories that balls follow do in fact obey such equations, and the reliability with which baseball players can catch and throw suggests that they are in some sense acting as if they can solve them.

But I think that a careful analysis of even this classic example reveals some deeper insights that should call this whole notion into question. How do baseball players actually do what they do? They don’t seem to be calculating at all—in fact, if you asked them to try to calculate while they were playing, it would destroy their ability to play. They learn. They engage in practiced motions, acquire skills, and notice patterns. I don’t think there is anywhere in their brains that is actually doing anything like solving a differential equation. It’s all a process of throwing and catching, throwing and catching, over and over again, watching and remembering and subtly adjusting.

One thing that is particularly interesting to me about that process is that it is astonishingly flexible. It doesn’t really seem to matter what physical process you are interacting with; as long as it is sufficiently orderly, such a method will allow you to predict and ultimately control that process. You don’t need to know anything about differential equations in order to learn in this way, and indeed (I really can’t emphasize this enough) baseball players typically don’t.

In fact, learning is so flexible that it can even perform better than calculation. The usual differential equations most people would think to use to predict the throw of a ball would assume ballistic motion in a vacuum, which is absolutely not what a curveball is. In order to throw a curveball, the ball must interact with the air, and it must be launched with spin; curving a baseball relies very heavily on the Magnus Effect. I think it’s probably possible to construct an equation that would fully predict the motion of a curveball, but it would be a tremendously complicated one, and might not even have an exact closed-form solution. In fact, I think it would require solving the Navier-Stokes equations, for which there is an outstanding Millennium Prize. Since the viscosity of air is very low, maybe you could get away with approximating using the Euler fluid equations.

To be fair, a learning process that is adapting to a system that obeys an equation will yield results that become an ever-closer approximation of that equation. And it is in that sense that a baseball player can be said to be acting as if solving a differential equation. But this relies heavily on the system in question being one that obeys an equation—and when it comes to economic systems, is that even true?

What if the reason we can’t find a simple set of equations that accurately describe the economy (as opposed to equations of ever-escalating complexity that still utterly fail to describe the economy) is that there isn’t one? What if the reason we can’t find the utility function people are maximizing is that they aren’t maximizing anything?

What behavioral economics needs now is a new approach, something less constrained by the norms of neoclassical economics and more aligned with psychology and cognitive science. We should be modeling human beings based on how they actually think, not some weird mathematical construct that bears no resemblance to human reasoning but is designed to impress people who are obsessed with math.

I’m of course not the first person to have suggested this. I probably won’t be the last, or even the one who most gets listened to. But I hope that I might get at least a few more people to listen to it, because I have gone through the mathematical gauntlet and earned my bona fides. It is too easy to dismiss this kind of reasoning from people who don’t actually understand advanced mathematics. But I do understand differential equations—and I’m telling you, that’s not how people think.

Implications of stochastic overload

Apr 2 JDN 2460037

A couple weeks ago I presented my stochastic overload model, which posits a neurological mechanism for the Yerkes-Dodson effect: Stress increases sympathetic activation, and this increases performance, up to the point where it starts to risk causing neural pathways to overload and shut down.

This week I thought I’d try to get into some of the implications of this model, how it might be applied to make predictions or guide policy.

One thing I often struggle with when it comes to applying theory is what actual benefits we get from a quantitative mathematical model as opposed to simply a basic qualitative idea. In many ways I think these benefits are overrated; people seem to think that putting something into an equation automatically makes it true and useful. I am sometimes tempted to try to take advantage of this, to put things into equations even though I know there is no good reason to put them into equations, simply because so many people seem to find equations so persuasive for some reason. (Studies have even shown that, particularly in disciplines that don’t use a lot of math, inserting a totally irrelevant equation into a paper makes it more likely to be accepted.)

The basic implications of the Yerkes-Dodson effect are already widely known, and utterly ignored in our society. We know that excessive stress is harmful to health and performance, and yet our entire economy seems to be based around maximizing the amount of stress that workers experience. I actually think neoclassical economics bears a lot of the blame for this, as neoclassical economists are constantly talking about “increasing work incentives”—which is to say, making work life more and more stressful. (And let me remind you that there has never been any shortage of people willing to work in my lifetime, except possibly briefly during the COVID pandemic. The shortage has always been employers willing to hire them.)

I don’t know if my model can do anything to change that. Maybe by putting it into an equation I can make people pay more attention to it, precisely because equations have this weird persuasive power over most people.

As far as scientific benefits, I think that the chief advantage of a mathematical model lies in its ability to make quantitative predictions. It’s one thing to say that performance increases with low levels of stress then decreases with high levels; but it would be a lot more useful if we could actually precisely quantify how much stress is optimal for a given person and how they are likely to perform at different levels of stress.

Unfortunately, the stochastic overload model can only make detailed predictions if you have fully specified the probability distribution of innate activation, which requires a lot of free parameters. This is especially problematic if you don’t even know what type of distribution to use, which we really don’t; I picked three classes of distribution because they were plausible and tractable, not because I had any particular evidence for them.

Also, we don’t even have standard units of measurement for stress; we have a vague notion of what more or less stressed looks like, but we don’t have the sort of quantitative measure that could be plugged into a mathematical model. Probably the best units to use would be something like blood cortisol levels, but then we’d need to go measure those all the time, which raises its own issues. And maybe people don’t even respond to cortisol in the same ways? But at least we could measure your baseline cortisol for a while to get a prior distribution, and then see how different incentives increase your cortisol levels; and then the model should give relatively precise predictions about how this will affect your overall performance. (This is a very neuroeconomic approach.)

So, for now, I’m not really sure how useful the stochastic overload model is. This is honestly something I feel about a lot of the theoretical ideas I have come up with; they often seem too abstract to be usefully applicable to anything.

Maybe that’s how all theory begins, and applications only appear later? But that doesn’t seem to be how people expect me to talk about it whenever I have to present my work or submit it for publication. They seem to want to know what it’s good for, right now, and I never have a good answer to give them. Do other researchers have such answers? Do they simply pretend to?

Along similar lines, I recently had one of my students ask about a theory paper I wrote on international conflict for my dissertation, and after sending him a copy, I re-read the paper. There are so many pages of equations, and while I am confident that the mathematical logic is valid, I honestly don’t know if most of them are really useful for anything. (I don’t think I really believe that GDP is produced by a Cobb-Douglas production function, and we don’t even really know how to measure capital precisely enough to say.) The central insight of the paper, which I think is really important but other people don’t seem to care about, is a qualitative one: International treaties and norms provide an equilibrium selection mechanism in iterated games. The realists are right that this is cheap talk. The liberals are right that it works. Because when there are many equilibria, cheap talk works.

I know that in truth, science proceeds in tiny steps, building a wall brick by brick, never sure exactly how many bricks it will take to finish the edifice. It’s impossible to see whether your work will be an irrelevant footnote or the linchpin for a major discovery. But that isn’t how the institutions of science are set up. That isn’t how the incentives of academia work. You’re not supposed to say that this may or may not be correct and is probably some small incremental progress the ultimate impact of which no one can possibly foresee. You’re supposed to sell your work—justify how it’s definitely true and why it’s important and how it has impact. You’re supposed to convince other people why they should care about it and not all the dozens of other probably equally-valid projects being done by other researchers.

I don’t know how to do that, and it is agonizing to even try. It feels like lying. It feels like betraying my identity. Being good at selling isn’t just orthogonal to doing good science—I think it’s opposite. I think the better you are at selling your work, the worse you are at cultivating the intellectual humility necessary to do good science. If you think you know all the answers, you’re just bad at admitting when you don’t know things. It feels like in order to succeed in academia, I have to act like an unscientific charlatan.

Honestly, why do we even need to convince you that our work is more important than someone else’s? Are there only so many science points to go around? Maybe the whole problem is this scarcity mindset. Yes, grant funding is limited; but why does publishing my work prevent you from publishing someone else’s? Why do you have to reject 95% of the papers that get sent to you? Don’t tell me you’re limited by space; the journals are digital and searchable and nobody reads the whole thing anyway. Editorial time isn’t infinite, but most of the work has already been done by the time you get a paper back from peer review. Of course, I know the real reason: Excluding people is the main source of prestige.

The role of innate activation in stochastic overload

Mar 26 JDN 2460030

Two posts ago I introduced my stochastic overload model, which offers an explanation for the Yerkes-Dodson effect by positing that additional stress increases sympathetic activation, which is useful up until the point where it starts risking an overload that forces systems to shut down and rest.

The central equation of the model is actually quite simple, expressed either as an expectation or as an integral:

Y = E[x + s | x + s < 1] P[x + s < 1]

Y = \int_{0}^{1-s} (x+s) dF(x)

The amount of output produced is the expected value of innate activation plus stress activation, times the probability that there is no overload. Increased stress raises this expectation value (the incentive effect), but also increases the probability of overload (the overload effect).

The model relies upon assuming that the brain starts with some innate level of activation that is partially random. Exactly what sort of Yerkes-Dodson curve you get from this model depends very much on what distribution this innate activation takes.

I’ve so far solved it for three types of distribution.

The simplest is a uniform distribution, where within a certain range, any level of activation is equally probable. The probability density function looks like this:

Assume the distribution has support between a and b, where a < b.

When b+s < 1, then overload is impossible, and only the incentive effect occurs; productivity increases linearly with stress.

The expected output is simply the expected value of a uniform distribution from a+s to b+s, which is:

E[x + s] = (a+b)/2+s

Then, once b+s > 1, overload risk begins to increase.

In this range, the probability of avoiding overload is:

P[x + s < 1] = F(1-s) = (1-s-a)/(b-a)

(Note that at b+s=1, this is exactly 1.)

The expected value of x+s in this range, where x is uniform between a and 1-s, is:

E[x + s | x + s < 1] = (1+s+a)/2

Multiplying these two together:

Y = [(1+s+a)(1-s-a)]/[2(b-a)]

Here is what that looks like for a=0, b=1/2:

It does have the right qualitative features: increasing, then decreasing. But it sure looks weird, doesn’t it? It has this strange kinked shape.
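Here is a minimal Python sketch of the uniform case, evaluating the central integral Y = \int (x+s) dF(x) directly rather than as a product of a conditional expectation and a probability. The function name `uniform_output` and the default values a=0, b=1/2 are choices made for this illustration; the only modeling assumption is the uniform density 1/(b-a) described above.

```python
def uniform_output(s, a=0.0, b=0.5):
    """Expected output Y for innate activation x ~ Uniform(a, b) under stress s.

    Evaluates the integral of (x + s) / (b - a) over x from a to min(b, 1 - s).
    """
    upper = min(b, 1.0 - s)  # activation above 1 - s would cause overload
    if upper <= a:           # overload is certain, so nothing is produced
        return 0.0
    # The antiderivative of (x + s) / (b - a) is (x + s)^2 / (2 (b - a)).
    return ((upper + s) ** 2 - (a + s) ** 2) / (2.0 * (b - a))


if __name__ == "__main__":
    for s in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(f"s = {s:.2f}  Y = {uniform_output(s):.4f}")
```

With these defaults, output rises linearly until s = 1/2 and then falls, which is where the kink comes from: that is the point at which overload first becomes possible.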

So let’s consider some other distributions.

The next one I was able to solve it for is an exponential distribution, where the most probable activation is zero, and then higher activation always has lower probability than lower activation in an exponential decay:

For this it was actually easiest to do the integral directly (I did it by integrating by parts, but I’m sure you don’t care about all the mathematical steps):

Y = \int_{0}^{1-s} (x+s) dF(x)

Y = (1/λ + s) - (1/λ + 1)e^(-λ(1-s))

The parameter λ decides how steeply your activation probability decays. Someone with low λ is relatively highly activated all the time, while someone with high λ is usually not highly activated; this seems like it might be related to the personality trait neuroticism.

Here are graphs of what the resulting Yerkes-Dodson curve looks like for several different values of λ:

[Figure panels: λ = 0.5, λ = 1, λ = 2, λ = 4, λ = 8]

The λ = 0.5 person has high activation a lot of the time. They are actually fairly productive even without stress, but stress quickly overwhelms them. The λ = 8 person has low activation most of the time. They are not very productive without stress, but can also bear relatively high amounts of stress without overloading.

(The low-λ people also have overall lower peak productivity in this model, but that might not be true in reality, if λ is inversely correlated with some other attributes that are related to productivity.)
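The comparison across values of λ can be reproduced with a short Python sketch of the closed-form result for the exponential case. The function names are hypothetical, and the formula for the optimal stress level is not from the derivation above: it is an extra step, obtained by setting dY/ds = 1 - (1+λ)e^(-λ(1-s)) to zero, which gives s* = 1 - ln(1+λ)/λ.

```python
import math


def exponential_output(s, lam):
    """Closed-form Y for innate activation x ~ Exponential(lam), stress s in [0, 1]."""
    return (1.0 / lam + s) - (1.0 / lam + 1.0) * math.exp(-lam * (1.0 - s))


def optimal_stress(lam):
    """Stress level that maximizes Y, from dY/ds = 0 (an added step, see lead-in)."""
    return 1.0 - math.log(1.0 + lam) / lam


if __name__ == "__main__":
    for lam in (0.5, 1, 2, 4, 8):
        s_star = optimal_stress(lam)
        print(
            f"lambda = {lam}: Y(0) = {exponential_output(0.0, lam):.3f}, "
            f"s* = {s_star:.3f}, peak Y = {exponential_output(s_star, lam):.3f}"
        )
```

Under these assumptions the peak output works out to equal s* itself, so both the optimal stress level and the peak productivity rise with λ, consistent with the pattern described for the low-λ and high-λ cases.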

Neither uniform nor exponential has the nice bell-curve shape for innate activation we might have hoped for. There is another class of distributions, beta distributions, which do have this shape, and they are sort of tractable—you need something called an incomplete beta function, which isn’t an elementary function but it’s useful enough that most statistical packages include it.

Beta distributions have two parameters, α and β. They look like this:

Beta distributions are quite useful in Bayesian statistics; if you’re trying to estimate the probability of a random event that either succeeds or fails with a fixed probability (a Bernoulli process), and so far you have observed a successes and b failures, your best guess of its probability at each trial is a beta distribution with α = a+1 and β = b+1.

For beta distributions with parameters α and β, the result comes out to (I is that incomplete beta function I mentioned earlier, in its regularized form):

Y = [α/(α+β)] I(1-s, α+1, β) + s I(1-s, α, β)

For whole number values of α and β, the incomplete beta function can be computed by hand (though it is more work the larger they are); here’s an example with α = β = 2.

The innate activation probability looks like this:

And the result comes out like this:

Y = 2(1-s)^3 - (3/2)(1-s)^4 + 3s(1-s)^2 - 2s(1-s)^3

This person has pretty high innate activation most of the time, so stress very quickly overwhelms them. If I had chosen a much higher β, I could change that, making them less likely to be innately so activated.
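As a sanity check on the α = β = 2 case, here is a small Python sketch that compares the polynomial above against a direct numerical integration of the central equation. It uses only the standard library; the function names and the midpoint rule are choices made for this illustration, and the density 6x(1-x) is the standard Beta(2, 2) density.

```python
def beta22_output(s):
    """Closed-form Y for alpha = beta = 2, as given in the text."""
    t = 1.0 - s
    return 2 * t**3 - 1.5 * t**4 + 3 * s * t**2 - 2 * s * t**3


def beta22_output_numeric(s, steps=20_000):
    """Midpoint-rule estimate of the integral of (x + s) * 6x(1 - x) from 0 to 1 - s."""
    upper = 1.0 - s
    h = upper / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h  # midpoint of the i-th subinterval
        total += (x + s) * 6.0 * x * (1.0 - x)
    return total * h


if __name__ == "__main__":
    for s in (0.0, 0.1, 0.2, 0.3, 0.5, 0.9):
        print(f"s = {s:.1f}  closed form = {beta22_output(s):.5f}  "
              f"numeric = {beta22_output_numeric(s):.5f}")
```

The two columns should agree to several decimal places, which is a quick way to catch algebra slips when working out other whole-number values of α and β by hand.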

These are the cases I’ve found to be relatively tractable so far. They all have the right qualitative pattern: Increasing stress increases productivity for a while, then begins decreasing it once overload risk becomes too high. They also show a general pattern where people who are innately highly activated (neurotic?) are much more likely to overload and thus much more sensitive to stress.

Mental accounting and “free shipping”

Mar 5 JDN 2460009

Suppose you are considering buying a small item, such as a hardcover book or a piece of cookware. If you buy it from one seller, the price is $50, but shipping costs $20; if you buy it from another, it costs $70 but you’ll get free shipping. Which one do you buy from?

If you are being rational, you won’t care in the slightest. But most people don’t seem to behave that way. The idea of paying $20 to ship a $50 item just feels wrong somehow, and so most people will tend to prefer the seller with free shipping—even though the total amount they spend is the same.

Sellers know this, and take advantage of it. Indeed, it is the only plausible reason they would ever offer free shipping in the first place.

Free shipping, after all, is not actually free. Someone still gets paid to perform that delivery. And while the seller is the one making the payment, they will no doubt raise the price they charge you as a customer in order to make up the difference—it would be very foolish of them not to. So ultimately, everything turns out the same as if you had paid for shipping.

But it still feels different, doesn’t it? This is because of a series of heuristics most people use for their financial decisions known as mental accounting.

There are a lot of different heuristics that go into mental accounting, but the one that is most relevant here is mental budgeting: We divide our spending into different budgetary categories, and try not to go over budget in any particular category.

While the item you’re buying may in fact be worth more than $70 to you, you probably didn’t budget in your mind $20 for shipping. So even if the total impact on your finances is the same, you register the higher shipping price as “over budget” in one of your mental categories. So it feels like you are spending more than if you had simply paid $70 for the item and gotten free shipping. Even though you are actually paying exactly the same amount.

Another reason this works so well may be that people don’t really have a clear idea what the price of items is at different sellers. So you see “$70, free shipping” and you assume that it previously had a price of $70 and they are generously offering you shipping for free.

But if you ever find yourself assuming that a corporation is being generous—you are making a cognitive error. Corporations are, by design, as selfish as possible. They are never generous. There is always something in it for them.

In the best-case scenario, what serves the company will also serve other people, as when they donate to good causes for tax deductions and better PR (or when they simply provide good products at low prices). But no corporation is going to intentionally sacrifice its own interests to benefit anyone else. They exist to maximize profits for their shareholders. That is what they do. That is what they always do. Keep that in mind, and you won’t be disappointed by them.

They might offer you a lower price, or other perks, in order to keep you as a customer; but they will do so very carefully, only enough to keep you from shopping elsewhere. And if they are able to come down on the price while still making a profit, that really just goes to show they had too much market power to begin with.

Free shipping, at least, is relatively harmless. It’s slightly manipulative, but a higher price plus free shipping really does ultimately amount to the same thing as a lower price plus paid shipping. The worst I can say about it is that it may cause people to buy things they otherwise wouldn’t have; but they must have still felt that the sticker price was worth it, so it can’t really be so bad.

Another, more sinister way that corporations use mental accounting to manipulate customers is through the use of credit cards.

It’s well-documented that people are willing to spend more on credit cards than they would be in cash. In most cases, this does not appear to be the result of people actually being constrained by their liquidity: even if people have the cash, they are more willing to buy the same item with a credit card.

This effect is called pain of paying. It hurts more, psychologically, to hand over a series of dollar bills than it does to swipe (or lately, just tap) a credit card. It’s not just about convenience; by making it less painful to pay, companies can pressure us to spend more.

And since credit cards add to an existing balance, there is what’s called transaction decoupling: The money we spent on any particular item gets mentally separated from the actual transaction in which we bought that item. We may not even remember how much we paid. We just see a credit card balance go up; and it may end up being quite a large balance, but any particular transaction usually won’t have raised it very much.

Human beings tend to perceive stimuli proportionally: We don’t really feel the effect of $5 per se, we feel the effect of a 25% change. So that $5 feels like a lot more when it’s coming out of a wallet that held $20 (a 25% drop) than it does when it’s adding to a $200 credit card balance (a 2.5% increase).

This is also why I say expensive cheap things, cheap expensive things; you should care more about the same proportional difference when it’s on a higher base price.

Optimization is unstable. Maybe that’s why we satisfice.

Feb 26 JDN 2460002

Imagine you have become stranded on a deserted island. You need to find shelter, food, and water, and then perhaps you can start working on a way to get help or escape the island.

Suppose you are programmed to be an optimizer: to get the absolute best solution to any problem. At first this may seem to be a boon: You’ll build the best shelter, find the best food, get the best water, find the best way off the island.

But you’ll also expend an enormous amount of effort trying to make it the best. You could spend hours just trying to decide what the best possible shelter would be. You could pass up dozens of viable food sources because you aren’t sure that any of them are the best. And you’ll never get any rest because you’re constantly trying to improve everything.

In principle your optimization could include that: The cost of thinking too hard or searching too long could be one of the things you are optimizing over. But in practice, this sort of bounded optimization is often remarkably intractable.

And what if you forgot about something? You were so busy optimizing your shelter you forgot to treat your wounds. You were so busy seeking out the perfect food source that you didn’t realize you’d been bitten by a venomous snake.

This is not the way to survive. You don’t want to be an optimizer.

No, the person who survives is a satisficer: they make sure that what they have is good enough and then they move on to the next thing. Their shelter is lopsided and ugly. Their food is tasteless and bland. Their water is hard. But they have them.

Once they have shelter and food and water, they will have time and energy to do other things. They will notice the snakebite. They will treat the wound. Once all their needs are met, they will get enough rest.

Empirically, humans are satisficers. We seem to be happier because of it—in fact, the people who are the happiest satisfice the most. And really this shouldn’t be so surprising: Because our ancestral environment wasn’t so different from being stranded on a desert island.

Good enough is perfect. Perfect is bad.

Let’s consider another example. Suppose that you have created a powerful artificial intelligence, an AGI with the capacity to surpass human reasoning. (It hasn’t happened yet—but it probably will someday, and maybe sooner than most people think.)

What do you want that AI’s goals to be?

Okay, ideally maybe they would be something like “Maximize goodness”, where we actually somehow include all the panoply of different factors that go into goodness, like beneficence, harm, fairness, justice, kindness, honesty, and autonomy. Do you have any idea how to do that? Do you even know what your own full moral framework looks like at that level of detail?

Far more likely, the goals you program into the AGI will be much simpler than that. You’ll have something you want it to accomplish, and you’ll tell it to do that well.

Let’s make this concrete and say that you own a paperclip company. You want to make more profits by selling paperclips.

First of all, let me note that this is not an unreasonable thing for you to want. It is not an inherently evil goal for one to have. The world needs paperclips, and it’s perfectly reasonable for you to want to make a profit selling them.

But it’s also not a true ultimate goal: There are a lot of other things that matter in life besides profits and paperclips. Anyone who isn’t a complete psychopath will realize that.

But the AI won’t. Not unless you tell it to. And so if we tell it to optimize, we would need to actually include in its optimization all of the things we genuinely care about—not missing a single one—or else whatever choices it makes are probably not going to be the ones we want. Oops, we forgot to say we need clean air, and now we’re all suffocating. Oops, we forgot to say that puppies don’t like to be melted down into plastic.

The simplest cases to consider are obviously horrific: Tell it to maximize the number of paperclips produced, and it starts tearing the world apart to convert everything to paperclips. (This is the original “paperclipper” concept from Less Wrong.) Tell it to maximize the amount of money you make, and it seizes control of all the world’s central banks and starts printing $9 quintillion for itself. (Why that amount? I’m assuming it uses 64-bit signed integers, and 2^63 is over 9 quintillion. If it uses long ints, we’re even more doomed.) No, inflation-adjusting won’t fix that; even hyperinflation typically still results in more real seigniorage for the central banks doing the printing (which is, you know, why they do it). The AI won’t ever be able to own more than all the world’s real GDP—but it will be able to own that if it prints enough and we can’t stop it.
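The 9 quintillion figure is easy to check. This is only an arithmetic check under the 64-bit signed integer assumption stated above; the name `INT64_MAX` is a label chosen for this sketch.

```python
# Largest value a 64-bit signed integer can hold: 2^63 - 1.
INT64_MAX = 2**63 - 1

print(f"{INT64_MAX:,}")  # prints 9,223,372,036,854,775,807
```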

But even if we try to come up with some more sophisticated optimization for it to perform (what I’m really talking about here is specifying its utility function), it becomes vital for us to include everything we genuinely care about: Anything we forget to include will be treated as a resource to be consumed in the service of maximizing everything else.

Consider instead what would happen if we programmed the AI to satisfice. The goal would be something like, “Produce at least 400,000 paperclips at a price of at most $0.002 per paperclip.”

Given such an instruction, in all likelihood, it would in fact produce exactly 400,000 paperclips at a price of exactly $0.002 per paperclip. And maybe that’s not strictly the best outcome for your company. But if it’s better than what you were previously doing, it will still increase your profits.

Moreover, such an instruction is far less likely to result in the end of the world.

If the AI has a particular target to meet for its production quota and price limit, the first thing it would probably try is to use your existing machinery. If that’s not good enough, it might start trying to modify the machinery, or acquire new machines, or develop its own techniques for making paperclips. But there are quite strict limits on how creative it is likely to be, because there are quite strict limits on how creative it needs to be. If you were previously producing 200,000 paperclips at $0.004 per paperclip, all it needs to do is double production and halve the cost. That’s a very standard sort of industrial innovation; in computing hardware (admittedly an extreme case), we do this sort of thing every couple of years.

It certainly won’t tear the world apart making paperclips—at most it’ll tear apart enough of the world to make 400,000 paperclips, which is a pretty small chunk of the world, because paperclips aren’t that big. A paperclip weighs about a gram, so you’ve only destroyed about 400 kilos of stuff. (You might even survive the lawsuits!)

Are you leaving money on the table relative to the optimization scenario? Eh, maybe. One, it’s a small price to pay for not ending the world. But two, if 400,000 at $0.002 was too easy, next time try 600,000 at $0.001. Over time, you can gently increase its quotas and tighten its price requirements until your company becomes more and more successful—all without risking the AI going completely rogue and doing something insane and destructive.
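The contrast between the two kinds of instruction can be sketched as a toy Python example. Everything here is hypothetical and only for illustration: the `Plan` class, the three candidate plans, and their numbers (apart from the quotas and prices quoted above) are invented, and a real AGI would obviously not be choosing from a three-item list.

```python
from dataclasses import dataclass


@dataclass
class Plan:
    """A hypothetical production plan the AI could choose."""
    name: str
    paperclips: int
    price_per_clip: float
    resources_consumed_kg: float


def optimize(plans):
    """Optimizer: pick the plan with the most paperclips, whatever it consumes."""
    return max(plans, key=lambda p: p.paperclips)


def satisfice(plans, quota=400_000, max_price=0.002):
    """Satisficer: take the first plan that meets quota and price cap, then stop."""
    for plan in plans:
        if plan.paperclips >= quota and plan.price_per_clip <= max_price:
            return plan
    return None  # no plan is good enough


# Candidate plans, listed from least to most drastic (invented numbers).
PLANS = [
    Plan("existing machinery", 200_000, 0.004, 200.0),
    Plan("retooled machinery", 400_000, 0.002, 400.0),
    Plan("convert everything to paperclips", 10**15, 0.0, 1e12),
]

if __name__ == "__main__":
    print("optimizer picks: ", optimize(PLANS).name)
    print("satisficer picks:", satisfice(PLANS).name)
```

In this toy setup the optimizer goes straight to the most destructive plan because nothing in its objective counts the resources consumed, while the satisficer stops at the retooled machinery. The sketch also assumes the satisficer considers mild options first; that ordering is doing real work, which is one reason this is no guarantee of safety.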

Of course this is no guarantee of safety—and I absolutely want us to use every safeguard we possibly can when it comes to advanced AGI. But the simple change from optimizing to satisficing seems to solve the most severe problems immediately and reliably, at very little cost.

Good enough is perfect; perfect is bad.

I see broader implications here for behavioral economics. When all of our models are based on optimization, but human beings overwhelmingly seem to satisfice, maybe it’s time to stop assuming that the models are right and the humans are wrong.

Optimization is perfect if it works—and awful if it doesn’t. Satisficing is always pretty good. Optimization is unstable, while satisficing is robust.

In the real world, that probably means that satisficing is better.

Good enough is perfect; perfect is bad.

Mind reading is not optional

Nov 20 JDN 2459904

I have great respect for cognitive-behavioral therapy (CBT), and it has done a lot of good for me. (It is also astonishingly cost-effective; its QALY per dollar rate compares favorably to almost any other First World treatment, and loses only to treating high-impact Third World diseases like malaria and schistosomiasis.)

But there are certain aspects of it that have always been frustrating to me. Standard CBT techniques often present as ‘cognitive distortions’ what are in fact clearly necessary heuristics without which it would be impossible to function.

Perhaps the worst of these is so-called ‘mind reading’. The very phrasing of it makes it sound ridiculous: Are you suggesting that you have some kind of extrasensory perception? Are you claiming to be a telepath?

But in fact ‘mind reading’ is simply the use of internal cognitive models to forecast the thoughts, behaviors, and expectations of other human beings. And without it, it would be completely impossible to function in human society.

For instance, I have had therapists tell me that it is ‘mind reading’ for me to anticipate that people will have tacit expectations for my behavior that they will judge me for failing to meet, and I should simply wait for people to express their expectations rather than assuming them. I admit, life would be much easier if I could do that. But I know for a fact that I can’t. Indeed, I used to do that, as a child, and it got me in trouble all the time. People were continually upset at me for not doing things they had expected me to do but never bothered to actually mention. They thought these expectations were “obvious”; they were not, at least not to me.

It was often little things, and in hindsight some of these things seem silly: I didn’t know what a ‘made bed’ was supposed to look like, so I put it in a state that was functional for me, but that was not considered ‘making the bed’. (I have since learned that my way was actually better: It’s good to let sheets air out before re-using them.) I was asked to ‘clear the sink’, so I moved the dishes out of the sink and left them on the counter, not realizing that the implicit command was for me to wash those dishes, dry them, and put them away. I was asked to ‘bring the dinner plates to the table’, so I did that, and left them in a stack there, not realizing that I should be setting them out in front of each person’s chair and also bringing flatware. Of course I know better now. But how was I supposed to know then? It seems like I was expected to, though.

Most people just really don’t seem to realize how many subtle, tacit expectations are baked into every single task. I think neurodivergence is quite relevant here; I have a mild autism spectrum disorder, and so I think rather differently than most people. If you are neurotypical, then you probably can forecast other people’s expectations fairly well automatically, and so they may seem obvious to you. In fact, they may seem so obvious that you don’t even realize you’re doing it. Then when someone like me comes along and is consciously, actively trying to forecast other people’s expectations, and sometimes doing it poorly, you go and tell them to stop trying to forecast. But if they were to do that, they’d end up even worse off than they are. What you really need to be telling them is how to forecast better—but that would require insight into your own forecasting methods which you aren’t even consciously aware of.

Seriously, stop and think for a moment about all of the things other people expect you to do every day that are rarely if ever explicitly stated. How you are supposed to dress, how you are supposed to speak, how close you are supposed to stand to other people, how long you are supposed to hold eye contact: all of these are standards you will be expected to meet, whether or not any of them have ever been explicitly explained to you. You may do this automatically; or you may learn to do it consciously after being criticized for failing to do it. But one way or another, you must forecast what other people will expect you to do.

To my knowledge, no one has ever explicitly told me not to wear a Starfleet uniform to work. I am not aware of any part of the university dress code that explicitly forbids such attire. But I’m fairly sure it would not be a good idea. To my knowledge, no one has ever explicitly told me not to burst out into song in the middle of a meeting. But I’m still pretty sure I shouldn’t do that. To my knowledge, no one has ever explicitly told me what the ‘right of way’ rules are for walking down a crowded sidewalk, who should be expected to move out of the way of whom. But people still get mad if you mess up and bump into them.

Even when norms are stated explicitly, it is often as a kind of last resort, and the mere fact that you needed to have a norm stated is often taken as a mark against your character. I have been explicitly told in various contexts not to talk to myself or engage in stimming leg movements; but the way I was told has generally suggested that I would have been judged better if I hadn’t had to be told, if I had simply known the way that other people seem to know. (Or is it that they never felt any particular desire to stim?)

In fact, I think a major part of developing social skills and becoming more functional, to the point where a lot of people actually now seem a bit surprised to learn I have an autism spectrum disorder, has been improving my ability to forecast other people’s expectations for my behavior. There are dozens if not hundreds of norms that people expect you to follow at any given moment; most people seem to intuit them so easily that they don’t even realize they are there. But they are there all the same, and this is painfully evident to those of us who aren’t always able to immediately intuit them all.

Now, the fact remains that my current mental models are surely imperfect. I am often wrong about what other people expect of me. I’m even prepared to believe that some of my anxiety comes from believing that people have expectations more demanding than what they actually have. But I can’t simply abandon the idea of forecasting other people’s expectations. Don’t tell me to stop doing it; tell me how to do it better.

Moreover, there is a clear asymmetry here: If you think people want more from you than they actually do, you’ll be anxious, but people will like you and be impressed by you. If you think people want less from you than they actually do, people will be upset at you and look down on you. So, in the presence of uncertainty, there’s a lot of pressure to assume that the expectations are high. It would be best to get it right, of course; but when you aren’t sure you can get it right, you’re often better off erring on the side of caution—which is to say, the side of anxiety.

In short, mind reading isn’t optional. If you think it is, that’s only because you do it automatically.

On the Overton Window

Jul 24 JDN 2459786

As you are no doubt aware, a lot of people on the Internet like to loudly proclaim support for really crazy, extreme ideas. Some of these people actually believe in those ideas, and if you challenge them, will do their best to defend them. Those people are wrong at the level of substantive policy, but there’s nothing wrong with their general approach: If you really think that anarchism or communism is a good thing, it only makes sense that you’d try to convince other people. You might have a hard time of it (in part because you are clearly wrong), but it makes sense that you’d try.

But there is another class of people who argue for crazy, extreme ideas. When pressed, they will admit they don’t really believe in abolishing the police or collectivizing all wealth, but they believe in something else that’s sort of vaguely in that direction, and they think that advocating for the extreme idea will make people more likely to accept what they actually want.

They often refer to this as “shifting the Overton Window”. As Matt Yglesias explained quite well a year ago, this is not actually what Overton was talking about.

But, in principle, it could still be a thing that works. There is a cognitive bias known as anchoring which is often used in marketing: If I only offered a $5 bottle of wine and a $20 bottle of wine, you might think the $20 bottle is too expensive. But if I also include a $50 bottle, that makes you adjust your perceptions of what constitutes a “reasonable” price for wine, and may make you more likely to buy the $20 bottle after all.

It could be, therefore, that an extreme policy demand makes people more willing to accept moderate views, as a sort of compromise. Maybe demanding the abolition of police is a way of making other kinds of police reform seem more reasonable. Maybe showing pictures of Marx and chanting “eat the rich” could make people more willing to accept higher capital gains taxes. Maybe declaring that we are on the verge of apocalyptic climate disaster will make people more willing to accept tighter regulations on carbon emissions and subsidies for solar energy.

Then again, does it actually seem to do that? I see very little evidence that it does. All those demands for police abolition haven’t changed the fact that defunding the police is unpopular. Raising taxes on the rich is popular, but it has been for a while now (and never was with, well, the rich). And decades of constantly shouting about imminent climate catastrophe are really starting to look like crying wolf.

To see why this strategy seems to be failing, I think it’s helpful to consider how it feels from the other side. Take a look at some issues where someone else is trying to get you to accept a particular view, and consider whether someone advocating a more extreme view would make you more likely to compromise.

Your particular opinions may vary, but here are some examples that would apply to me, and, I suspect, many of you.

If someone says they want tighter border security, I’m skeptical—it’s pretty tight already. But in and of itself, this would not be such a crazy idea. Certainly I agree that it is possible to have too little border security, and so maybe that turns out to be the state we’re in.

But then, suppose that same person, or someone closely allied to them, starts demanding the immediate deportation of everyone who was not born in the United States, even those who immigrated legally and are naturalized or here on green cards. This is a crazy, extreme idea that’s further in the same direction, so on this anchoring theory, it should make me more willing to accept the idea of tighter border security. And yet, I can say with some confidence that it has no such effect.

Indeed, if anything I think it would make me less likely to accept tighter border security, in proportion to how closely aligned those two arguments are. If they are coming from the same person, or the same political party, it would cause me to suspect that the crazy, extreme policy is the true objective, and the milder, compromise policy is just a means toward that end. It also suggests certain beliefs and attitudes about immigration in general—xenophobia, racism, ultranationalism—that I oppose even more strongly. If you’re talking about deporting all immigrants, you make me suspect that your reasons for wanting tighter border security are not good ones.

Let’s try another example. Suppose someone wants to cut taxes on upper income brackets. In our current state, I think that would be a bad idea. But there was a time not so long ago when I would have agreed with it: Even I have to admit that a top marginal rate of 94% (as we had in 1944) sounds a little ridiculous, and is surely on the wrong side of the Laffer curve. So the basic idea of cutting top tax rates is not inherently crazy or ridiculous.

Now, suppose that same idea came from the same person, or the same party, or the same political movement, as one that was arguing for the total abolition of all taxation. This is a crazy, extreme idea; it would amount to either total anarcho-capitalism with no government at all, or some sort of bizarre system where the government is funded entirely through voluntary contributions. I think it’s pretty obvious that such a system would be terrible, if not outright impossible; and anyone whose understanding of political economy is sufficiently poor that they would fail to see this is someone whose overall judgment on questions of policy I must consider dubious. Once again, the presence of the extreme view does nothing to make me want to consider the moderate view, and may even make me less willing to do so.

Perhaps I am an unusually rational person, not so greatly affected by anchoring biases? Perhaps. But whereas the presence of the $50 wine bottle does briefly tempt me to buy the $20 one, and I must correct myself with what I know about anchoring bias, the presentation of an extreme political view never even makes me feel any temptation to accept some kind of compromise with it. Learning that someone supports something crazy or ridiculous (or is willing to say they do, even if deep down they don’t) makes me automatically lower my assessment of their overall credibility. If anything, I think I am tempted to overreact in that direction, and have to remind myself of the Stopped Clock Principle: reversed stupidity is not intelligence, and someone can have both bad ideas and good ones.

Moreover, the empirical data, while sketchy, doesn’t seem to support this either; where the Overton Window (in the originally intended sense) has shifted, as on LGBT rights, it was because people convincingly argued that the “extreme” position was in fact an entirely reasonable and correct view. There was a time not so long ago that same-sex marriage was deemed unthinkable, and the “moderate” view was merely decriminalizing sodomy; but we demanded, and got, same-sex marriage, not as a strategy to compromise on decriminalizing sodomy, but because we actually wanted same-sex marriage and had good arguments for it. I highly doubt we would have been any more successful if we had demanded something ridiculous and extreme, like banning opposite-sex marriage.

The resulting conclusion seems obvious and banal: Only argue for things you actually believe in.

Yet, somehow, that seems to be a controversial view these days.