Good enough is perfect, perfect is bad

Jan 8 JDN 2459953

Not too long ago, I read the book How to Keep House While Drowning by KC Davis, which I highly recommend. It offers a great deal of useful and practical advice, especially for someone neurodivergent and depressed living through an interminable pandemic (which I am, but honestly, odds are, you may be too). And to say it is a quick and easy read is actually an unfair understatement; it is explicitly designed to be readable in short bursts by people with ADHD, and it has a level of accessibility that most other books don’t even aspire to and I honestly hadn’t realized was possible. (The extreme contrast between this and academic papers is particularly apparent to me.)

One piece of advice that really stuck with me was this: Good enough is perfect.

At first, it sounded like nonsense; no, perfect is perfect, good enough is just good enough. But in fact there is a deep sense in which it is absolutely true.

Indeed, let me make it a bit stronger: Good enough is perfect; perfect is bad.

I doubt Davis thought of it in these terms, but this is a concise, elegant statement of the principles of bounded rationality. Sometimes it can be optimal not to optimize.

Suppose that you are trying to optimize something, but you have limited computational resources with which to do so. This is actually not much of a supposition; it's literally true of basically everyone, basically every moment of every day.

But let's make it a bit more concrete, and say that you need to find the solution to the following math problem: "What is the product of 2419 and 1137?" (Pretend you don't have a calculator, as it would trivialize the exercise. I thought about using a problem you couldn't do with a standard calculator, but I realized that would also make it much weirder and more obscure for my readers.)

Now, suppose that there are some quick, simple ways to get reasonably close to the correct answer, and some slow, difficult ways to actually get the answer precisely.

In this particular problem, the former is to approximate: What’s 2500 times 1000? 2,500,000. So it’s probably about 2,500,000.

Or we could approximate a bit more closely: Say 2400 times 1100. That's 24 times 11, times 10,000; and 24 times 11 is 2 times 12 times 11, which is 2 times (110 plus 22), which is 2 times 132, or 264. So the answer is roughly 2,640,000.

Or, we could actually go through all the steps of the full multiplication by hand (remember, I'm assuming you have no calculator): multiply digit by digit, carry the 1s, add up the four partial products, re-check everything and probably fix it because you messed up somewhere; and then eventually you will get: 2,750,403.

So, our really fast method was only off by about 10%. Our moderately-fast method was only off by 4%. And both of them were a lot faster than getting the exact answer by hand.
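
If you want to check those error figures, here is the comparison in a few lines of Python.

```python
# Compare the two quick approximations against the exact product.
exact = 2419 * 1137          # 2,750,403

estimates = {
    "rough (2500 x 1000)": 2500 * 1000,   # 2,500,000
    "closer (2400 x 1100)": 2400 * 1100,  # 2,640,000
}

for name, estimate in estimates.items():
    error = (exact - estimate) / exact
    print(f"{name}: {estimate:,} (off by {error:.1%})")

# rough (2500 x 1000): 2,500,000 (off by 9.1%)
# closer (2400 x 1100): 2,640,000 (off by 4.0%)
```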

Which of these methods you’d actually want to use depends on the context and the tools at hand. If you had a calculator, sure, get the exact answer. Even if you didn’t, but you were balancing the budget for a corporation, I’m pretty sure they’d care about that extra $110,403. (Then again, they might not care about the $403 or at least the $3.) But just as an intellectual exercise, you really didn’t need to do anything; the optimal choice may have been to take my word for it. Or, if you were at all curious, you might be better off choosing the quick approximation rather than the precise answer. Since nothing of any real significance hinged on getting that answer, it may be simply a waste of your time to bother finding it.

This is of course a contrived example. But it’s not so far from many choices we make in real life.

Yes, if you are making a big choice—which job to take, what city to move to, whether to get married, which car or house to buy—you should get a precise answer. In fact, I make spreadsheets with formal utility calculations whenever I make a big choice, and I haven’t regretted it yet. (Did I really make a spreadsheet for getting married? You’re damn right I did; there were a lot of big financial decisions to make there—taxes, insurance, the wedding itself! I didn’t decide whom to marry that way, of course; but we always had the option of staying unmarried.)

But most of the choices we make from day to day are small choices: What should I have for lunch today? Should I vacuum the carpet now? What time should I go to bed? In the aggregate they may all add up to important things—but each one of them really won’t matter that much. If you were to construct a formal model to optimize your decision of everything to do each day, you’d spend your whole day doing nothing but constructing formal models. Perfect is bad.

In fact, even for big decisions, you can’t really get a perfect answer. There are just too many unknowns. Sometimes you can spend more effort gathering additional information—but that’s costly too, and sometimes the information you would most want simply isn’t available. (You can look up the weather in a city, visit it, ask people about it—but you can’t really know what it’s like to live there until you do.) Even those spreadsheet models I use to make big decisions contain error bars and robustness checks, and if, even after investing a lot of effort trying to get precise results, I still find two or more choices just can’t be clearly distinguished to within a good margin of error, I go with my gut. And that seems to have been the best choice for me to make. Good enough is perfect.

I think that being gifted as a child trained me to be dangerously perfectionist as an adult. (Many of you may find this familiar.) When it came to solving math problems, or answering quizzes, perfection really was an attainable goal a lot of the time.

As I got older and progressed further in my education, maybe getting every answer right was no longer feasible; but I still could get the best possible grade, and did, in most of my undergraduate classes and all of my graduate classes. To be clear, I’m not trying to brag here; if anything, I’m a little embarrassed. What it mainly shows is that I had learned the wrong priorities. In fact, one of the main reasons why I didn’t get a 4.0 average in undergrad is that I spent a lot more time back then writing novels and nonfiction books, which to this day I still consider my most important accomplishments and grieve that I’ve not (yet?) been able to get them commercially published. I did my best work when I wasn’t trying to be perfect. Good enough is perfect; perfect is bad.

Now here I am on the other side of the academic system, trying to carve out a career, and suddenly, there is no perfection. When my exam is being graded by someone else, there is a way to get the most points. When I’m the one grading the exams, there is no “correct answer” anymore. There is no one scoring me to see if I did the grading the “right way”—and so, no way to be sure I did it right.

Actually, here at Edinburgh, there are other instructors who moderate grades and often require me to revise them, which feels a bit like “getting it wrong”; but it’s really more like we had different ideas of what the grade curve should look like (not to mention US versus UK grading norms). There is no longer an objectively correct answer the way there is for, say, the derivative of x^3, the capital of France, or the definition of comparative advantage. (Or, one question I got wrong on an undergrad exam because I had zoned out of that lecture to write a book on my laptop: Whether cocaine is a dopamine reuptake inhibitor. It is. And the fact that I still remember that because I got it wrong over a decade ago tells you a lot about me.)

And then when it comes to research, it’s even worse: What even constitutes “good” research, let alone “perfect” research? What would be most scientifically rigorous isn’t what journals would be most likely to publish—and without much bigger grants, I can afford neither. I find myself longing for the research paper that will be so spectacular that top journals have to publish it, removing all risk of rejection and failure—in other words, perfect.

Yet such a paper plainly does not exist. Even if I were to do something that would win me a Nobel or a Fields Medal (this is, shall we say, unlikely), it probably wouldn’t be recognized as such immediately—a typical Nobel isn’t awarded until 20 or 30 years after the work that spawned it, and while Fields Medals are faster, they’re by no means instant or guaranteed. In fact, a lot of ground-breaking, paradigm-shifting research was originally relegated to minor journals because the top journals considered it too radical to publish.

Or I could try to do something trendy—feed into DSGE or GTFO—and try to get published that way. But I know my heart wouldn’t be in it, and so I’d be miserable the whole time. In fact, because it is neither my passion nor my expertise, I probably wouldn’t even do as good a job as someone who really buys into the core assumptions. I already have trouble speaking frequentist sometimes: Are we allowed to say “almost significant” for p = 0.06? Maximizing the likelihood is still kosher, right? Just so long as I don’t impose a prior? But speaking DSGE fluently and sincerely? I’d have an easier time speaking in Latin.

What I know—on some level at least—I ought to be doing is finding the research that I think is most worthwhile, given the resources I have available, and then getting it published wherever I can. Or, in fact, I should probably constrain a little by what I know about journals: I should do the most worthwhile research that is feasible for me and has a serious chance of getting published in a peer-reviewed journal. It’s sad that those two things aren’t the same, but they clearly aren’t. This constraint binds, and its Lagrange multiplier is measured in humanity’s future.

But one thing is very clear: By trying to find the perfect paper, I have floundered and, for the last year and a half, not written any papers at all. The right choice would surely have been to write something.

Because good enough is perfect, and perfect is bad.

Experimentally testing categorical prospect theory

Dec 4, JDN 2457727

In last week’s post I presented a new theory of probability judgments, which doesn’t rely upon people performing complicated math even subconsciously. Instead, I hypothesize that people try to assign categories to their subjective probabilities, and throw away all the information that wasn’t used to assign that category.

The way to most clearly distinguish this from cumulative prospect theory is to show discontinuity. Kahneman’s smooth, continuous function places fairly strong bounds on just how much a shift from 0% to 0.000001% can really affect your behavior. In particular, if you want to explain the fact that people do seem to behave differently around 10% compared to 1% probabilities, you can’t allow the slope of the smooth function to get much higher than 10 at any point, even near 0 and 1. (It does depend on the precise form of the function, but the more complicated you make it, the more free parameters you add to the model. In the most parsimonious form, which is a cubic polynomial, the maximum slope is actually much smaller than this—only 2.)

If that's the case, then switching from 0% to 0.0001% should have no more effect in reality than a switch from 0% to 0.001% would have for a rational expected utility optimizer. But in fact I think I can set up scenarios where it would have a larger effect than a switch from 0.001% to 0.01%.

Indeed, these games are already quite profitable for the majority of US states, and they are called lotteries.

Rationally, it should make very little difference to you whether your odds of winning the Powerball are 0 (you bought no ticket) or about 10^-9 (you bought a ticket), even when the prize is $100 million. This is because your utility of $100 million is nowhere near 100 million times as large as your marginal utility of $1. A good guess would be that your lifetime income is about $2 million, your utility is logarithmic in income, the units of utility are hectoQALY (hQALY), and the baseline income is about $100,000, so that your utility is ln(income/$100,000).

I apologize for the extremely large number of decimals, but I had to include that many in order to show any difference at all; the two figures first deviate at the ninth decimal place.

Your utility if you don’t have a ticket is ln(20) = 2.9957322736 hQALY.

Your utility if you have a ticket is (1-10^-9) ln(20) + 10^-9 ln(1020) = 2.9957322775 hQALY.

You gain a whopping 0.4 microQALY, about 12 seconds of healthy life, over your whole lifetime. I highly doubt you could even perceive such a difference.

And yet, people are willing to pay nontrivial sums for the chance to play such lotteries. Powerball tickets sell for about $2 each, and some people buy tickets every week. If you do that and live to be 80, you will spend some $8,000 on lottery tickets during your lifetime (roughly 4,000 tickets, and thus about a 4*10^-6 chance of ever winning), which results in this expected utility: (1-4*10^-6) ln(20-0.08) + 4*10^-6 ln(1020) = 2.9917399955 hQALY.
You have now sacrificed 0.004 hectoQALY, which is to say 0.4 QALY—that’s months of happiness you’ve given up to play this stupid pointless game.

Which shouldn’t be surprising, as (with 99.9996% probability) you have given up four months of your lifetime income with nothing to show for it. Lifetime income of $2 million / lifespan of 80 years = $25,000 per year; $8,000 / $25,000 = 0.32. You’ve actually sacrificed slightly more than this, which comes from your risk aversion.
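
For anyone who wants to check the arithmetic, here is a minimal Python sketch of the same calculation, under the assumptions above: log utility measured in hectoQALY against a $100,000 baseline, $2 million lifetime income, and 10^-9 odds per ticket.

```python
import math

BASELINE = 100_000      # baseline income in dollars (assumed above)
INCOME = 2_000_000      # assumed lifetime income in dollars
PRIZE = 100_000_000     # the jackpot in dollars

def utility(dollars):
    """Log utility measured in hectoQALY, relative to the $100,000 baseline."""
    return math.log(dollars / BASELINE)

p_win = 1e-9  # odds of winning with a single ticket

u_no_ticket = utility(INCOME)
u_one_ticket = (1 - p_win) * utility(INCOME) + p_win * utility(INCOME + PRIZE)
print(u_no_ticket)   # about 2.9957322736 hQALY
print(u_one_ticket)  # about 2.9957322775 hQALY: a gain of ~4e-9 hQALY, i.e. ~0.4 microQALY

# A $2 ticket every week until age 80 comes to roughly $8,000, or about 4,000 draws,
# so the chance of ever winning is roughly 4,000 * 1e-9 = 4e-6 (ignoring overlap).
spend = 8_000
p_ever_win = 4_000 * p_win
u_lifetime_player = ((1 - p_ever_win) * utility(INCOME - spend)
                     + p_ever_win * utility(INCOME + PRIZE))
print(u_lifetime_player)  # about 2.9917399955 hQALY: roughly 0.004 hQALY (0.4 QALY) lower
```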

Why would anyone do such a thing? Because while the difference between 0 and 10^-9 may be trivial, the difference between “impossible” and “almost impossible” feels enormous. “You can’t win if you don’t play!” they say, but they might as well say “You can’t win if you do play either.” Indeed, the probability of winning without playing isn’t zero; you could find a winning ticket lying on the ground, or win due to an error that is then upheld in court, or be given the winnings bequeathed by a dying family member or gifted by an anonymous donor. These are of course vanishingly unlikely—but so was winning in the first place. You’re talking about the difference between 10^-9 and 10^-12, which in proportional terms sounds like a lot—but in absolute terms is nothing. If you drive to a drug store every week to buy a ticket, you are more likely to die in a car accident on the way to the drug store than you are to win the lottery.

Of course, these are not experimental conditions. So I need to devise a similar game, with smaller stakes but still large enough for people's brains to care about the "almost impossible" category; maybe a few thousand dollars? It's not uncommon for an economics experiment to cost thousands of dollars; it's just usually paid out to many people instead of randomly to one person or nobody. Conducting the experiment in an underdeveloped country like India would also effectively amplify the amounts paid, but at the fixed cost of transporting the research team to India.

But I think in general terms the experiment could look something like this. You are given $20 for participating in the experiment (we treat it as already given to you, to maximize your loss aversion and endowment effect and thereby give us more bang for our buck). You then have a chance to play a game, where you pay $X to get a probability P of winning $Y*X, and we vary these numbers.

The actual participants wouldn’t see the variables, just the numbers and possibly the rules: “You can pay $2 for a 1% chance of winning $200. You can also play multiple times if you wish.” “You can pay $10 for a 5% chance of winning $250. You can only play once or not at all.”
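
As a rough sketch of what varying those numbers might look like, here is a hypothetical parameter grid in Python; the specific values are placeholders, not an actual design.

```python
# Hypothetical conditions for the game: pay $X for a probability P of winning $Y*X.
# These particular (X, P, Y) values are placeholders chosen only for illustration.
conditions = [
    (2, 0.01, 100),    # "$2 for a 1% chance of winning $200"
    (10, 0.05, 25),    # "$10 for a 5% chance of winning $250"
    (5, 0.001, 1000),  # an "almost impossible" big win
]

for cost, p, multiplier in conditions:
    prize = multiplier * cost
    expected_value = p * prize - cost
    print(f"Pay ${cost} for a {p:.1%} chance of ${prize}: "
          f"expected value {expected_value:+.2f}")
```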

So I think the first step is to find some dilemmas: cases where people feel ambivalent, and where different people differ in their choices. That's a good role for a pilot study.

Then we take these dilemmas and start varying their probabilities slightly.

In particular, we try to vary them at the edge of where people have mental categories. If subjective probability is continuous, a slight change in actual probability should never result in a large change in behavior, and furthermore the effect of a change shouldn’t vary too much depending on where the change starts.

But if subjective probability is categorical, these categories should have edges. Then, when I present you with two dilemmas that are on opposite sides of one of those edges, your behavior should radically shift; while if I vary the probability in a different way, I can make a much larger change without changing your behavior at all.

Based solely on my own intuition, I guessed that the categories roughly follow this pattern:

Impossible: 0%

Almost impossible: 0.1%

Very unlikely: 1%

Unlikely: 10%

Fairly unlikely: 20%

Roughly even odds: 50%

Fairly likely: 80%

Likely: 90%

Very likely: 99%

Almost certain: 99.9%

Certain: 100%

So for example, if I switch from 0% to 0.01%, it should have a very large effect, because I've moved you out of your "impossible" category (indeed, I think the "impossible" category is almost completely sharp; literally anything above zero seems to be enough for most people, even 10^-9 or 10^-10). But if I move from 1% to 2%, it should have a small effect, because I'm still well within the "very unlikely" category. Yet the latter change is literally one hundred times larger than the former. It is possible to define continuous functions that would behave this way to an arbitrary level of approximation—but they get a lot less parsimonious very fast.
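
To make the discontinuity concrete, here is a minimal Python sketch of the idea. The prototype probabilities are the ones listed above; the edges between categories are my own placeholder guesses, since the table only specifies prototypes, not where the edges fall.

```python
def category(p):
    """Map a probability (0 to 1) to its hypothesized mental category.
    The edge values are placeholder assumptions, placed between the
    prototype probabilities listed above."""
    if p <= 0.0:
        return "impossible"     # sharp edge: exactly zero
    if p >= 1.0:
        return "certain"        # sharp edge: exactly one
    edges = [(0.005, "almost impossible"),
             (0.05,  "very unlikely"),
             (0.15,  "unlikely"),
             (0.35,  "fairly unlikely"),
             (0.65,  "roughly even odds"),
             (0.85,  "fairly likely"),
             (0.95,  "likely"),
             (0.995, "very likely")]
    for upper, name in edges:
        if p <= upper:
            return name
    return "almost certain"

# The discontinuity claim: a change of 0.0001 that crosses an edge matters enormously,
# while a change of 0.01 that stays inside one category barely matters at all.
print(category(0.0), "->", category(0.0001))   # impossible -> almost impossible
print(category(0.01), "->", category(0.02))    # very unlikely -> very unlikely
```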

Now, immediately I run into a problem, because I'm not even sure those are my categories, much less that they are everyone else's. If I knew precisely which categories to look for, I could tell whether or not I had found them. But the process of both finding the categories and determining whether their edges are truly sharp is much more complicated, and requires a lot more statistical power to get beyond the noise.

One thing I’m considering is assigning these values as a prior, and then conducting a series of experiments which would adjust that prior. In effect I would be using optimal Bayesian probability reasoning to show that human beings do not use optimal Bayesian probability reasoning. Still, I think that actually pinning down the categories would require a large number of participants or a long series of experiments (in frequentist statistics this distinction is vital; in Bayesian statistics it is basically irrelevant—one of the simplest reasons to be Bayesian is that it no longer bothers you whether someone did 2 experiments of 100 people or 1 experiment of 200 people, provided they were the same experiment of course). And of course there’s always the possibility that my theory is totally off-base, and I find nothing; a dissertation replicating cumulative prospect theory is a lot less exciting (and, sadly, less publishable) than one refuting it.
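
Here is a toy Python sketch of that idea: put a discrete prior over where one category edge sits, then update it from (offered probability, accepted?) choice data. The candidate edges, the 10% noise rate, and the data are all invented purely to illustrate the updating step.

```python
# Discrete prior over the location of one category edge, updated by Bayes' rule
# from simulated binary choices. All numbers here are made up for illustration.
candidate_edges = [0.02, 0.04, 0.06, 0.08, 0.10]
prior = {e: 1 / len(candidate_edges) for e in candidate_edges}

NOISE = 0.1  # chance a participant chooses "against" their own category

def likelihood(accepted, p, edge):
    """P(choice | edge): accept when p clears the edge, up to noise."""
    p_accept = 1 - NOISE if p > edge else NOISE
    return p_accept if accepted else 1 - p_accept

# Fake pilot data: (offered probability, accepted?)
data = [(0.03, False), (0.05, True), (0.07, True), (0.05, True), (0.03, False)]

posterior = dict(prior)
for p, accepted in data:
    posterior = {e: posterior[e] * likelihood(accepted, p, e) for e in posterior}
    total = sum(posterior.values())
    posterior = {e: w / total for e, w in posterior.items()}

for e, w in posterior.items():
    print(f"edge at {e:.0%}: posterior {w:.2f}")
# Mass piles up on the 4% candidate, the only edge consistent with all five choices.
```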

Still, I think something like this is worth exploring. I highly doubt that people are doing very much math when they make most probabilistic judgments, and using categories would provide a very good way for people to make judgments usefully with no math at all.