Pinker Propositions

May 19, JDN 2458623

What do the following statements have in common?

1. “Capitalist countries have less poverty than Communist countries.”

2. “Black men in the US commit homicide at a higher rate than White men.”

3. “On average, in the US, Asian people score highest on IQ tests, White and Hispanic people score near the middle, and Black people score the lowest.”

4. “Men on average perform better at visual tasks, and women on average perform better on verbal tasks.”

5. “In the United States, White men are no more likely to be mass shooters than other men.”

6. “The genetic heritability of intelligence is about 60%.”

7. “The plurality of recent terrorist attacks in the US have been committed by Muslims.”

8. “The period of US military hegemony since 1945 has been the most peaceful period in human history.”

These statements have two things in common:

1. All of these statements are objectively true facts that can be verified by rich and reliable empirical data which is publicly available and uncontroversially accepted by social scientists.

2. If spoken publicly among left-wing social justice activists, all of these statements will draw resistance, defensiveness, and often outright hostility. Anyone making these statements is likely to be accused of racism, sexism, imperialism, and so on.

I call such propositions Pinker Propositions, after an excellent talk by Steven Pinker illustrating several of the above statements (which was then taken wildly out of context by social justice activists on social media).

The usual reaction to these statements suggests that people think they imply harmful far-right policy conclusions. This inference is utterly wrong: A nuanced understanding of each of these propositions does not in any way lead to far-right policy conclusions—in fact, some rather strongly support left-wing policy conclusions.

1. Capitalist countries have less poverty than Communist countries, because Communist countries are nearly always corrupt and authoritarian. Social democratic countries have the lowest poverty and the highest overall happiness (#ScandinaviaIsBetter).

2. Black men commit more homicide than White men because of poverty, discrimination, mass incarceration, and gang violence. Black men are also greatly overrepresented among victims of homicide, as most homicide is intra-racial. Homicide rates often vary across ethnic and socioeconomic groups, and these rates vary over time as a result of cultural and political changes.

3. IQ tests are a highly imperfect measure of intelligence, and the genetics of intelligence cut across our socially-constructed concept of race. There is far more within-group variation in IQ than between-group variation. Intelligence is not fixed at birth but is affected by nutrition, upbringing, exposure to toxins, and education—all of which statistically put Black people at a disadvantage. Nor does intelligence remain constant within populations: The Flynn Effect is the well-documented increase in intelligence which has occurred in almost every country over the past century. Far from justifying discrimination, these provide very strong reasons to improve opportunities for Black children. The lead and mercury in Flint’s water suppressed the brain development of thousands of Black children—that’s going to lower average IQ scores. But that says nothing about supposed “inherent racial differences” and everything about the catastrophic damage of environmental racism.

4. To be quite honest, I never even understood why this one shocks (or even surprises) people. It’s not even saying that men are “smarter” than women; overall IQ is almost identical. It’s just saying that men are more visual and women are more verbal. And this, I think, is actually quite obvious. I think the clearest evidence of this, the “interocular trauma” that will convince you the effect is real and worth talking about, is pornography. Visual porn is overwhelmingly consumed by men, even when it was designed for women (e.g. Playgirl: a majority of its readers are gay men, even though there are ten times as many straight women in the world as there are gay men). Conversely, erotic novels are overwhelmingly consumed by women. I think a lot of anti-porn feminism can actually be explained by this effect: Feminists (who are usually women, for obvious reasons) can say they are against “porn” when what they are really against is visual porn, because visual porn is consumed by men; then the kind of porn that they like (erotic literature) doesn’t count as “real porn”. And honestly they’re mostly against the current structure of the live-action visual porn industry, which is totally reasonable, but it’s a far cry from being against porn in general. I have some serious issues with how our farming system is currently set up, but I’m not against farming.

5. This one is interesting, because it’s a lack of a race difference, which normally is what the left wing always wants to hear. The difference of course is that this alleged difference would make White men look bad, and that’s apparently seen as a desirable goal for social justice. But the data just doesn’t bear it out: While indeed most mass shooters are White men, that’s because most Americans are White, which is a totally uninteresting reason. There’s no clear evidence of any racial disparity in mass shootings—though the gender disparity is absolutely overwhelming: It’s almost always men.

6. Heritability is a subtle concept; it doesn’t mean what most people seem to think it means. It doesn’t mean that 60% of your intelligence is due to your genes. Indeed, I’m not even sure what that sentence would actually mean; it’s like saying that 60% of the flavor of a cake is due to the eggs. What this heritability figure actually means is that when you compare across individuals in a population, and carefully control for environmental influences, you find that about 60% of the variance in IQ scores is explained by genetic factors. But this is within a particular population (here, US adults) and is absolutely dependent on all sorts of other variables. The more flexible one’s environment becomes, the more people self-select into their preferred environment, and the more heritable traits become. As a result, IQ actually becomes more heritable as children become adults, a pattern called the Wilson Effect.

7. This one might actually have some contradiction with left-wing policy. The disproportionate participation of Muslims in terrorism (controlling for just about anything you like: income, education, age, etc.) really does suggest that, at least at this point in history, there is some real ideological link between Islam and terrorism. But the fact remains that the vast majority of Muslims are not terrorists and do not support terrorism, and antagonizing all the people of an entire religion is fundamentally unjust as well as likely to backfire in various ways. We should instead be trying to encourage the spread of more tolerant forms of Islam, and maintaining the strict boundaries of secularism to prevent the encroachment of any religion on our system of government.

8. The fact that US military hegemony does seem to be a cause of global peace doesn’t imply that every single military intervention by the US is justified. In fact, it doesn’t even necessarily imply that any such interventions are justified—though I think one would be hard-pressed to say that the NATO intervention in the Kosovo War or the defense of Kuwait in the Gulf War was unjustified. It merely points out that having a hegemon is clearly preferable to having a multipolar world where many countries jockey for military supremacy. The Pax Romana was a time of peace but also authoritarianism; the Pax Americana is better, but that doesn’t prevent us from criticizing the real harms—including major war crimes—committed by the United States.

So it is entirely possible to know and understand these facts without adopting far-right political views.

Yet Pinker’s point—and mine—is that by suppressing these true facts, by responding with hostility or even ostracism to anyone who states them, we are actually adding fuel to the far-right fire. Instead of presenting the nuanced truth and explaining why it doesn’t imply such radical policies, we attack the messenger; and this leads people to conclude three things:

1. The left wing is willing to lie and suppress the truth in order to achieve political goals (they’re doing it right now).

2. These statements actually do imply right-wing conclusions (else why suppress them?).

3. Since these statements are true, that must mean the right-wing conclusions are actually correct.

Now (especially if you are someone who identifies unironically as “woke”), you might be thinking something like this: “Anyone who can be turned away from social justice so easily was never a real ally in the first place!”

This is a fundamentally and dangerously wrongheaded view. No one (not me, not you, not anyone) was born believing in social justice. You did not emerge from your mother’s womb ranting against colonialist imperialism. You had to learn what you now know. You came to believe what you now believe, after once believing something else that you now think is wrong. This is true of absolutely everyone everywhere. Indeed, the better you are, the more true it is; good people learn from their mistakes and grow in their knowledge.

This means that anyone who is now an ally of social justice once was not. And that, in turn, suggests that many people who are currently not allies could become so, under the right circumstances. They would probably not shift all at once—as I didn’t, and I doubt you did either—but if we are welcoming and open and honest with them, we can gradually tilt them toward greater and greater levels of support.

But if we reject them immediately for being impure, they never get the chance to learn, and we never get the chance to sway them. People who are currently uncertain of their political beliefs will become our enemies because we made them our enemies. We declared that if they would not immediately commit to everything we believe, then they may as well oppose us. They, quite reasonably unwilling to commit to a detailed political agenda they didn’t understand, decided that it would be easiest to simply oppose us.

And we don’t have to win over every person on every single issue. We merely need to win over a large enough critical mass on each issue to shift policies and cultural norms. Building a wider tent is not compromising on your principles; on the contrary, it’s how you actually win and make those principles a reality.

There will always be those we cannot convince, of course. And I admit, there is something deeply irrational about going from “those leftists attacked Charles Murray” to “I think I’ll start waving a swastika”. But humans aren’t always rational; we know this. You can lament this, complain about it, yell at people for being so irrational all you like—it won’t actually make people any more rational. Humans are tribal; we think in terms of teams. We need to make our team as large and welcoming as possible, and suppressing Pinker Propositions is not the way to do that.

The sausage of statistics being made

Nov 11, JDN 2458434

“Laws, like sausages, cease to inspire respect in proportion as we know how they are made.”

~ John Godfrey Saxe, not Otto von Bismarck

Statistics are a bit like laws and sausages. There are a lot of things in statistical practice that don’t align with statistical theory. The most obvious example is the fact that many results in statistics are asymptotic: they only strictly apply for infinitely large samples, and in any finite sample they will be some sort of approximation (we often don’t even know how good an approximation).

But the problem runs deeper than this: The whole idea of a p-value was originally supposed to be used to assess one single hypothesis that is the only one you test in your entire study.

That’s frankly a ludicrous expectation: Why would you write a whole paper just to test one parameter?

This is why I don’t actually think this so-called multiple comparisons problem is a problem with researchers doing too many hypothesis tests; I think it’s a problem with statisticians being fundamentally unreasonable about what statistics is useful for. We have to do multiple comparisons, so you should be telling us how to do it correctly.
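
To see how quickly the problem bites, here is a minimal sketch of the arithmetic. It assumes the tests are independent and that every null hypothesis is true, which real studies rarely satisfy. The function names are my own, and the Bonferroni adjustment is just one standard, conservative correction, not a full answer to the complaint above.

```python
def familywise_error_rate(alpha, num_tests):
    """Chance of at least one false positive across independent tests of true nulls."""
    return 1 - (1 - alpha) ** num_tests


def bonferroni_threshold(alpha, num_tests):
    """Per-test threshold that keeps the familywise error rate at or below alpha."""
    return alpha / num_tests


# Compare the uncorrected error rate with the corrected per-test threshold.
for num_tests in (1, 5, 20, 100):
    rate = familywise_error_rate(0.05, num_tests)
    threshold = bonferroni_threshold(0.05, num_tests)
    print(f"{num_tests:>3} tests: P(any false positive) = {rate:.3f}, "
          f"Bonferroni threshold = {threshold:.5f}")
```

With 20 tests at the usual 0.05 threshold, the chance of at least one spurious "significant" result is already about 64% under these assumptions.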

Statisticians have this beautiful pure mathematics that generates all these lovely asymptotic results… and then they stop, as if they were done. But we aren’t dealing with infinite or even “sufficiently large” samples; we need to know what happens when your sample is 100, not when your sample is 10^29. We can’t assume that our variables are independent and identically distributed; we don’t know their distribution, and we’re pretty sure they’re going to be somewhat dependent.

Even in an experimental context where we can randomly and independently assign some treatments, we can’t do that with lots of variables that are likely to matter, like age, gender, nationality, or field of study. And applied econometricians are in an even tighter bind; they often can’t randomize anything. They have to rely upon “instrumental variables” that they hope are “close enough to randomized” relative to whatever they want to study.

In practice what we tend to do is… fudge it. We use the formal statistical methods, and then we step back and apply a series of informal norms to see if the result actually makes sense to us. This is why almost no psychologists were actually convinced by Daryl Bem’s precognition experiments, despite his standard experimental methodology and perfect p < 0.05 results; he couldn’t pass any of the informal tests, particularly the most basic one of not violating any known fundamental laws of physics. We knew he had somehow cherry-picked the data, even before looking at it; nothing else was possible.

This is actually part of where the “hierarchy of sciences” notion is useful: One of the norms is that you’re not allowed to break the rules of the sciences above you, but you can break the rules of the sciences below you. So psychology has to obey physics, but physics doesn’t have to obey psychology. I think this is also part of why there’s so much enmity between economists and anthropologists; really we should be on the same level, cognizant of each other’s rules, but economists want to be above anthropologists so we can ignore culture, and anthropologists want to be above economists so they can ignore incentives.

Another informal norm is the “robustness check”, in which the researcher runs a dozen different regressions approaching the same basic question from different angles. “What if we control for this? What if we interact those two variables? What if we use a different instrument?” In terms of statistical theory, this doesn’t actually make a lot of sense; the probability distributions f(y|x) of y conditional on x and f(y|x, z) of y conditional on x and z are not the same thing, and wouldn’t in general be closely tied, depending on the distribution f(x|z) of x conditional on z. But in practice, most real-world phenomena are going to continue to show up even as you run a bunch of different regressions, and so we can be more confident that something is a real phenomenon insofar as that happens. If an effect drops out when you switch out a couple of control variables, it may have been a statistical artifact. But if it keeps appearing no matter what you do to try to make it go away, then it’s probably a real thing.

Because of the powerful career incentives toward publication and the strange obsession among journals with a p-value less than 0.05, another norm has emerged: Don’t actually trust p-values that are close to 0.05. The vast majority of the time, a p-value of 0.047 was the result of publication bias. Now if you see a p-value of 0.001, maybe then you can trust it—but you’re still relying on a lot of assumptions even then. I’ve seen some researchers argue that because of this, we should tighten our standards for publication to something like p < 0.01, but that’s missing the point; what we need to do is stop publishing based on p-values. If you tighten the threshold, you’re just going to get more rejected papers and then the few papers that do get published will now have even smaller p-values that are still utterly meaningless.

These informal norms protect us from the worst outcomes of bad research. But they are almost certainly not optimal. It’s all very vague and informal, and different researchers will often disagree vehemently over whether a given interpretation is valid. What we need are formal methods for solving these problems, so that we can have the objectivity and replicability that formal methods provide. Right now, our existing formal tools simply are not up to that task.

There are some things we may never be able to formalize: If we had a formal algorithm for coming up with good ideas, the AIs would already rule the world, and this would be either Terminator or The Culture depending on whether we designed the AIs correctly. But I think we should at least be able to formalize the basic question of “Is this statement likely to be true?” that is the fundamental motivation behind statistical hypothesis testing.

I think the answer is likely to be in a broad sense Bayesian, but Bayesians still have a lot of work left to do in order to give us really flexible, reliable statistical methods we can actually apply to the messy world of real data. In particular, tell us how to choose priors please! Prior selection is a fundamental make-or-break problem in Bayesian inference that has nonetheless been greatly neglected by most Bayesian statisticians. So, what do we do? We fall back on informal norms: Try maximum likelihood, which is like using a very flat prior. Try a normally-distributed prior. See if you can construct a prior from past data. If all those give the same thing, that’s a “robustness check” (see previous informal norm).
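
The informal prior check described above can be made concrete in a toy case where everything has a closed form. This is only a sketch: it assumes a binomial outcome with Beta priors (a conjugate pair chosen for convenience, in place of the normal prior mentioned above), and both the data and the priors are made up for illustration.

```python
def maximum_likelihood(successes, trials):
    """Maximum likelihood estimate of a binomial proportion."""
    return successes / trials


def posterior_mean(prior_a, prior_b, successes, trials):
    """Posterior mean of a proportion under a Beta(prior_a, prior_b) prior."""
    return (prior_a + successes) / (prior_a + prior_b + trials)


successes, trials = 30, 100  # hypothetical data

estimates = {
    "maximum likelihood": maximum_likelihood(successes, trials),
    "flat prior, Beta(1, 1)": posterior_mean(1, 1, successes, trials),
    "prior centered on 0.5, Beta(5, 5)": posterior_mean(5, 5, successes, trials),
    # As if 10 earlier observations had contained 3 successes.
    "prior from past data, Beta(3, 7)": posterior_mean(3, 7, successes, trials),
}

for name, value in estimates.items():
    print(f"{name}: {value:.3f}")

# A small spread suggests the conclusion is not being driven by the prior.
spread = max(estimates.values()) - min(estimates.values())
print(f"spread across estimates: {spread:.3f}")
```

Here the estimates all land within about two percentage points of each other, which is the kind of agreement the robustness check looks for. With a much smaller sample the priors would pull the estimates further apart.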

Informal norms are also inherently harder to teach and learn. I’ve seen a lot of other grad students flail wildly at statistics, not because they don’t know what a p-value means (though maybe that’s also sometimes true), but because they don’t really quite grok the informal underpinnings of good statistical inference. This can be very hard to explain to someone: They feel like they followed all the rules correctly, but you are saying their results are wrong, and now you can’t explain why.

In fact, some of the informal norms that are in wide use are clearly detrimental. In economics, norms have emerged that certain types of models are better simply because they are “more standard”, such as the dynamic stochastic general equilibrium models that can basically be fit to everything and have never actually usefully predicted anything. In fact, the best ones just predict what we already knew from Keynesian models. But without a formal norm for testing the validity of models, it’s been “DSGE or GTFO”. At present, it is considered “nonstandard” (read: “bad”) not to assume that your agents are either a single unitary “representative agent” or a continuum of infinitely-many agents; modeling the actual fact of finitely-many agents is just not done. Yet it’s hard for me to imagine any formal criterion that wouldn’t at least give you some points for correctly including the fact that there are more than one but fewer than infinitely many people in the world (obviously your model could still be bad in other ways).

I don’t know what these new statistical methods would look like. Maybe it’s as simple as formally justifying some of the norms we already use; maybe it’s as complicated as taking a fundamentally new approach to statistical inference. But we have to start somewhere.

Demystifying dummy variables

Nov 5, JDN 2458062

Continuing my series of blog posts on basic statistical concepts, today I’m going to talk about dummy variables. Dummy variables are quite simple, but for some reason a lot of people—even people with extensive statistical training—often have trouble understanding them. Perhaps people are simply overthinking matters, or making subtle errors that end up having large consequences.

A dummy variable (more formally a binary variable) is a variable that has only two states: “No”, usually represented 0, and “Yes”, usually represented 1. A dummy variable answers a single “Yes or no” question. They are most commonly used for categorical variables, answering questions like “Is the person’s race White?” and “Is the state California?”; but in fact almost any kind of data can be represented this way: We could represent income using a series of dummy variables like “Is your income greater than $50,000?” “Is your income greater than $51,000?” and so on. As long as the number of possible outcomes is finite (which, in practice, it always is), the data can be represented by some (possibly large) set of dummy variables. In fact, if your data set is large enough, representing numerical data with dummy variables can be a very good thing to do, as it allows you to account for nonlinear effects without assuming some specific functional form.

Most of the misunderstanding regarding dummy variables involves applying them in regressions and interpreting the results.

Probably the most common confusion is about what dummy variables to include. When you have a set of categories represented in your data (e.g. one for each US state), you want to include dummy variables for all but one of them. The most common mistake here is to try to include all of them, and end up with a regression that doesn’t make sense, or if you have a catchall category like “Other” (e.g. race is coded as “White/Black/Other”), leaving out that one and getting results with a nonsensical baseline.
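
As a minimal sketch of both encodings described so far, the two helper functions below (the names and the specific thresholds are my own, purely illustrative) turn an income into threshold dummies and a category into dummies for all but one chosen baseline.

```python
def threshold_dummies(value, thresholds):
    """One dummy per threshold: 1 if the value exceeds it, else 0."""
    return {f"over_{t}": int(value > t) for t in thresholds}


def category_dummies(value, categories, baseline):
    """One dummy per category except the baseline, which is left out."""
    if baseline not in categories:
        raise ValueError("baseline must be one of the categories")
    if value not in categories:
        raise ValueError(f"unknown category: {value!r}")
    return {f"is_{c}": int(value == c) for c in categories if c != baseline}


races = ["White", "Black", "Other"]

print(threshold_dummies(50_500, [50_000, 51_000, 52_000]))
print(category_dummies("Black", races, baseline="White"))
# The baseline category is the one where every dummy is 0.
print(category_dummies("White", races, baseline="White"))
```

Choosing "White" rather than "Other" as the omitted category is what keeps the baseline interpretable: every coefficient is then a comparison against a well-defined group.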

You don’t have to leave one out if you only have one set of categories and you don’t include a constant in your regression; then the baseline will emerge automatically from the regression. But this is dangerous, as the interpretation of the coefficients is no longer quite so simple.

The thing to keep in mind is that a coefficient on a dummy variable is an effect of a change, so the coefficient on “White” is the effect of being White. In order to be an effect of a change, that change must be measured against some baseline. The dummy variable you exclude from the regression is the baseline, because the effect of changing to the baseline from the baseline is by definition zero.

Here’s a very simple example where all the regressions can be done by hand. Suppose you have a household with 1 human and 1 cat, and you want to know the effect of species on number of legs. (I mean, hopefully this is something you already know; but that makes it a good illustration.) In what follows, you can safely skip the matrix algebra; but I included it for any readers who want to see how these concepts play out mechanically in the math.

Your outcome variable Y is legs: The human has 2 and the cat has 4. We can write this as a matrix:

\[ Y = \begin{bmatrix} 2 \\ 4 \end{bmatrix} \]

What dummy variables should we choose? There are actually several options.

The simplest option is to include both a human variable and a cat variable, and no constant. Let’s put the human variable first. Then our human subject has a value of X1 = [1 0] (“Yes” to human and “No” to cat) and our cat subject has a value of X2 = [0 1].

This is very nice in this case, as it makes our matrix of independent variables simply an identity matrix:

\[ X = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \]

This makes the calculations extremely nice, because transposing, multiplying, and inverting an identity matrix all just give us back an identity matrix. The standard OLS regression coefficient is B = (X’X)^{-1} X’Y, which in this case just becomes Y itself.

\[ B = (X'X)^{-1} X'Y = Y = \begin{bmatrix} 2 \\ 4 \end{bmatrix} \]

Our coefficients are 2 and 4. How would we interpret this? Pretty much what you’d think: The effect of being human is having 2 legs, while the effect of being a cat is having 4 legs. This amounts to choosing a baseline of nothing: the effect is compared to a hypothetical entity with no legs at all. And indeed this is what will happen more generally if you do a regression with a dummy for each category and no constant: The baseline will be a hypothetical entity with an outcome of zero on whatever your outcome variable is.

So far, so good.

But what if we had additional variables to include? Say we have both cats and humans with black hair and brown hair (and no other colors). If we now include the variables human, cat, black hair, brown hair, we won’t get the results we expect—in fact, we’ll get no result at all. The regression is mathematically impossible, regardless of how large a sample we have.
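
The reason is that every individual is exactly one species and has exactly one hair color, so human + cat and black + brown both equal 1 in every row. The sketch below checks this on a small made-up sample (the sample and the helper names are hypothetical). It finds a nonzero vector v with Xv = 0, which means X’Xv = 0 as well, so X’X cannot be inverted.

```python
# Columns: human, cat, black hair, brown hair (no constant).
# The sample is hypothetical; any sample runs into the same problem.
sample = [
    ("human", "black"),
    ("human", "brown"),
    ("cat", "black"),
    ("cat", "brown"),
    ("cat", "brown"),
]


def design_row(species, hair):
    """Dummy-variable row for one individual."""
    return [
        int(species == "human"),
        int(species == "cat"),
        int(hair == "black"),
        int(hair == "brown"),
    ]


def predict(row, coefficients):
    """Predicted outcome for one row of the design matrix."""
    return sum(x * b for x, b in zip(row, coefficients))


X = [design_row(species, hair) for species, hair in sample]

# human + cat - black - brown is zero for every individual.
v = [1, 1, -1, -1]
Xv = [predict(row, v) for row in X]
print(Xv)

# Shifting any coefficient vector along v leaves every prediction unchanged,
# so no single best-fitting set of coefficients exists.
b1 = [2, 4, 0, 0]
b2 = [b + 10 * w for b, w in zip(b1, v)]
same_predictions = [predict(r, b1) for r in X] == [predict(r, b2) for r in X]
print(b2, same_predictions)
```

Adding more observations does not help, because the dependence between the columns holds row by row.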

This is why it’s much safer to choose one of the categories as a baseline, and include that as a constant. We could pick either one; we just need to be clear about which one we chose.

Say we take human as the baseline. Then our variables are constant and cat. The variable constant is just 1 for every single individual. The variable cat is 0 for humans and 1 for cats.

Now our independent variable matrix looks like this:

\[ X = \begin{bmatrix} 1 & 0 \\ 1 & 1 \end{bmatrix} \]

The matrix algebra isn’t quite so nice this time:

\[ X'X = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 1 & 1 \end{bmatrix} = \begin{bmatrix} 2 & 1 \\ 1 & 1 \end{bmatrix} \]

\[ (X'X)^{-1} = \begin{bmatrix} 1 & -1 \\ -1 & 2 \end{bmatrix} \]

\[ X'Y = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 2 \\ 4 \end{bmatrix} = \begin{bmatrix} 6 \\ 4 \end{bmatrix} \]

\[ B = (X'X)^{-1} X'Y = \begin{bmatrix} 1 & -1 \\ -1 & 2 \end{bmatrix} \begin{bmatrix} 6 \\ 4 \end{bmatrix} = \begin{bmatrix} 2 \\ 2 \end{bmatrix} \]

Our coefficients are now 2 and 2. Now, how do we interpret that result? We took human as the baseline, so what we are saying here is that the default is to have 2 legs, and then the effect of being a cat is to get 2 extra legs.

That sounds a bit anthropocentric (most animals are quadrupeds, after all), so let’s try taking cat as the baseline instead. Now our variables are constant and human, and our independent variable matrix looks like this:

\[ X = \begin{bmatrix} 1 & 1 \\ 1 & 0 \end{bmatrix} \]

\[ X'X = \begin{bmatrix} 1 & 1 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} 1 & 1 \\ 1 & 0 \end{bmatrix} = \begin{bmatrix} 2 & 1 \\ 1 & 1 \end{bmatrix} \]

\[ (X'X)^{-1} = \begin{bmatrix} 1 & -1 \\ -1 & 2 \end{bmatrix} \]

\[ X'Y = \begin{bmatrix} 1 & 1 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} 2 \\ 4 \end{bmatrix} = \begin{bmatrix} 6 \\ 2 \end{bmatrix} \]

\[ B = \begin{bmatrix} 1 & -1 \\ -1 & 2 \end{bmatrix} \begin{bmatrix} 6 \\ 2 \end{bmatrix} = \begin{bmatrix} 4 \\ -2 \end{bmatrix} \]

Our coefficients are 4 and -2. This seems much more phylogenetically correct: The default number of legs is 4, and the effect of being human is to lose 2 legs.

All these regressions are really saying the same thing: Humans have 2 legs, cats have 4. And in this particular case, it’s simple and obvious. But once things start getting more complicated, people tend to make mistakes even on these very simple questions.
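
For readers who would rather check the hand calculations than trust them, here is a small sketch that reruns the three regressions above with exact fractions. The function name is my own, and it handles only the two-column case used in this example; it is not a general-purpose regression routine.

```python
from fractions import Fraction


def ols_two_regressors(X, Y):
    """Solve B = (X'X)^{-1} X'Y exactly for a design matrix with two columns."""
    # Entries of the symmetric 2x2 matrix X'X.
    a = sum(Fraction(r[0]) * r[0] for r in X)
    b = sum(Fraction(r[0]) * r[1] for r in X)
    d = sum(Fraction(r[1]) * r[1] for r in X)
    # Entries of the vector X'Y.
    p = sum(Fraction(r[0]) * y for r, y in zip(X, Y))
    q = sum(Fraction(r[1]) * y for r, y in zip(X, Y))
    det = a * d - b * b
    if det == 0:
        raise ValueError("X'X is singular")
    # Closed-form inverse of a 2x2 matrix, applied to X'Y.
    return [(d * p - b * q) / det, (a * q - b * p) / det]


legs = [2, 4]  # human first, then cat

specifications = {
    "human and cat, no constant": [[1, 0], [0, 1]],
    "constant and cat (human baseline)": [[1, 0], [1, 1]],
    "constant and human (cat baseline)": [[1, 1], [1, 0]],
}

for name, X in specifications.items():
    B = ols_two_regressors(X, legs)
    fitted = [row[0] * B[0] + row[1] * B[1] for row in X]
    print(f"{name}: coefficients {[int(b) for b in B]}, "
          f"fitted legs {[int(f) for f in fitted]}")
```

The coefficients differ across the three specifications, but the fitted values are 2 legs for the human and 4 for the cat every time; only the baseline has changed.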

A common mistake would be to try to include a constant and both dummy variables: constant human cat. What happens if we try that? The matrix algebra gets particularly nasty, first of all:

\[ X = \begin{bmatrix} 1 & 1 & 0 \\ 1 & 0 & 1 \end{bmatrix} \]

\[ X’X = \begin{bmatrix} 1 & 1 \\ 1 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 1 & 0 \\ 1 & 0 & 1 \end{bmatrix} = \begin{bmatrix} 2 & 1 & 1 \\ 1 & 1 & 0 \\ 1 & 0 & 1 \end{bmatrix} \]

Our matrix X’X (loosely, the covariance matrix of the regressors) is now 3×3. That means we have more coefficients than we have data points. But we could throw in another human and another cat to fix that problem.

More importantly, the covariance matrix is not invertible. Rows 2 and 3 add up together to equal row 1, so we have a singular matrix.

If you tried to run this regression, you’d get an error message about “perfect multicollinearity”. What this really means is you haven’t chosen a valid baseline. Your baseline isn’t human and it isn’t cat; and since you included a constant, it isn’t a baseline of nothing either. It’s… unspecified.

You actually can choose whatever baseline you want for this regression, by setting the constant term to whatever number you want. Set a constant of 0 and your baseline is nothing: you’ll get back the coefficients 0, 2 and 4. Set a constant of 2 and your baseline is human: you’ll get 2, 0 and 2. Set a constant of 4 and your baseline is cat: you’ll get 4, -2, 0. You can even choose something weird like 3 (you’ll get 3, -1, 1) or 7 (you’ll get 7, -5, -3) or -4 (you’ll get -4, 6, 8). You don’t even have to choose integers; you could pick -0.9 or 3.14159. As long as the constant plus the coefficient on human add to 2 and the constant plus the coefficient on cat add to 4, you’ll get a valid regression.

Again, this example seems pretty simple. But it’s an easy trap to fall into if you don’t think carefully about what variables you are including. If you are looking at effects on income and you have dummy variables on race, gender, schooling (e.g. no high school, high school diploma, some college, Bachelor’s, master’s, PhD), and what state a person lives in, it would be very tempting to just throw all those variables into a regression and see what comes out. But nothing is going to come out, because you haven’t specified a baseline. Your baseline isn’t even some hypothetical person with $0 income (which already doesn’t sound like a great choice); it’s just not a coherent baseline at all.
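
Going back to the cat-and-human regression with a constant and both dummies, the free choice of constant can be checked directly. This is only an illustrative sketch with function names of my own: it shows that every choice of constant yields coefficients that predict the data perfectly, which is exactly why software cannot pick one for you.

```python
def coefficients_for_constant(constant):
    """Coefficients (constant, human, cat) that fit 2 legs for humans, 4 for cats."""
    return (constant, 2 - constant, 4 - constant)


def predicted_legs(coefficients, is_human, is_cat):
    """Prediction from a regression with a constant and both dummies."""
    constant, human_effect, cat_effect = coefficients
    return constant + human_effect * is_human + cat_effect * is_cat


for constant in (0, 2, 4, 3, 7, -4):
    b = coefficients_for_constant(constant)
    human_legs = predicted_legs(b, is_human=1, is_cat=0)
    cat_legs = predicted_legs(b, is_human=0, is_cat=1)
    print(f"coefficients {b}: human has {human_legs} legs, cat has {cat_legs} legs")
```

Every row prints 2 legs for the human and 4 for the cat. The data cannot distinguish between these coefficient sets, so the choice has to come from you, by dropping one dummy or the constant.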

Generally the best thing to do (for the most precise estimates) is to choose the most common category in each set as the baseline. So for the US a good choice would be to set the baseline as White, female, high school diploma, California. Another common strategy when looking at discrimination specifically is to make the most privileged category the baseline, so we’d instead have White, male, PhD, and… Maryland, it turns out. Then we expect all our coefficients to be negative: Your income is generally lower if you are not White, not male, have less than a PhD, or live outside Maryland.

This is also important if you are interested in interactions: For example, the effect on your income of being Black in California is probably not the same as the effect of being Black in Mississippi. Then you’ll want to include terms like Black and Mississippi, which for dummy variables is the same thing as taking the Black variable and multiplying by the Mississippi variable.

But now you need to be especially clear about what your baseline is: If being White in California is your baseline, then the coefficient on Black is the effect of being Black in California, while the coefficient on Mississippi is the effect of being in Mississippi if you are White. The coefficient on Black and Mississippi is the effect of being Black in Mississippi, over and above the sum of the effects of being Black and the effect of being in Mississippi. If we saw a positive coefficient there, it wouldn’t mean that it’s good to be Black in Mississippi; it would simply mean that it’s not as bad as we might expect if we just summed the downsides of being Black with the downsides of being in Mississippi. And if we saw a negative coefficient there, it would mean that being Black in Mississippi is even worse than you would expect just from summing up the effects of being Black with the effects of being in Mississippi.
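
A small numerical sketch may help here. The incomes below are entirely made up for illustration. With a constant, a Black dummy, a Mississippi dummy, and their product, there is one coefficient per group, so the coefficients can be read straight off the four group means.

```python
# Hypothetical mean incomes, in thousands of dollars (not real data).
mean_income = {
    ("White", "California"): 50,
    ("Black", "California"): 40,
    ("White", "Mississippi"): 35,
    ("Black", "Mississippi"): 30,
}

# Baseline: White in California.
constant = mean_income[("White", "California")]
# Effect of being Black in California.
black = mean_income[("Black", "California")] - constant
# Effect of being in Mississippi if you are White.
mississippi = mean_income[("White", "Mississippi")] - constant
# What we would expect if the two effects simply added up.
additive_guess = constant + black + mississippi
# Interaction: the actual mean minus the additive guess.
black_x_mississippi = mean_income[("Black", "Mississippi")] - additive_guess

print("constant:", constant)
print("Black:", black)
print("Mississippi:", mississippi)
print("Black x Mississippi:", black_x_mississippi)
```

In this invented example the interaction coefficient is positive, yet Black Mississippians still have the lowest mean income of the four groups. The positive sign only says that the combined penalty is smaller than the sum of the two separate penalties.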

As long as you choose your baseline carefully and stick to it, interpreting regressions with dummy variables isn’t very hard. But so many people forget this step that they get very confused by the end, looking at a term like Black female Mississippi and seeing a positive coefficient, and thinking that must mean that life is good for Black women in Mississippi, when really all it means is the small mercy that being a Black woman in Mississippi isn’t quite as bad as you might think if you just added up the effect of being Black, plus the effect of being a woman, plus the effect of being Black and a woman, plus the effect of living in Mississippi, plus the effect of being Black in Mississippi, plus the effect of being a woman in Mississippi.

A tale of two storms

Sep 24, JDN 2458021

There were two severe storm events this past week; one you probably heard a great deal about, the other, probably not. The first was Hurricane Irma, which hit the United States and did most of its damage in Florida; the second was Typhoon Doksuri, which hit Southeast Asia and did most of its damage in Vietnam.

You might expect that this post is going to give you more bad news. Well, I have a surprise for you: The news is actually mostly good.

The death tolls from both storms were astonishingly small. The hurricane is estimated to have killed at least 84 people, while the typhoon has killed at least 26. This result is nothing less than heroism. The valiant efforts of thousands of meteorologists and emergency responders around the world have saved thousands of lives, and did so both in the wealthy United States and in impoverished Vietnam.

When I started this post, I had expected to see that the emergency response in Vietnam would be much worse, and fatalities would be far higher; I am delighted to report that nothing of the sort was the case, and Vietnam, despite their per-capita GDP PPP of under $6,000, has made emergency response a sufficiently high priority that they saved their people just about as well as Florida did.

To get a sense of what might have happened without them, consider that 1.5 million homes in Florida were leveled by the hurricane, and over 100,000 homes were damaged by the typhoon. Vietnam is a country of 94 million people. Florida has a population of 20 million. (The reason Florida determines so many elections is that it is by far the most populous swing state.) Without weather forecasting and emergency response, these death figures would have been in the tens of thousands, not the dozens.

Indeed, if you know statistics and demographics well, these figures become even more astonishing: These death rates were almost indistinguishable from statistical noise.

Vietnam’s baseline death rate is about 5.9 per 1,000, meaning that they experience about 560,000 deaths in any given year. This means that over 1500 people die in Vietnam on a normal day.

Florida’s baseline death rate is about 6.6 per 1,000, actually a bit higher than Vietnam’s, because Florida’s population skews so much toward the elderly. Therefore Florida experiences about 130,000 deaths per year, or 360 deaths on a normal day.

In both Vietnam and Florida, this makes the daily death probability for any given person about 0.0017%. A random process with a fixed probability of p = 0.0017% = 0.000017 over a population of n people will result in an average of 0.000017n events, but with some variation around that number. The standard deviation is sqrt(p(1-p)n) = 0.004 sqrt(n). When n = 20,000,000 (Florida), this results in a standard deviation of 18. When n = 94,000,000 (Vietnam), this results in a standard deviation of 40.

This means that the 26 additional deaths in Vietnam were within one standard deviation of average! They basically are indistinguishable from statistical noise. There have been over a hundred days in Vietnam where an extra 26 people happened to die, just in the past year. Weather forecasting took what could have been a historic disaster and turned it into just another bad day.

The 84 additional deaths in Florida are over four standard deviations away from average, so they are definitely distinguishable from statistical noise—but this still means that Florida’s total death rate for the year will only tick up by about 0.06% (84 out of roughly 130,000).
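A quick way to check this arithmetic is a few lines of Python. This is only a sketch: it assumes deaths are independent events with the same rounded daily probability of 0.0017% used above, and the names `P_DAILY` and `binomial_sd` are my own.

```python
import math

P_DAILY = 0.0017 / 100   # 0.0017% daily death probability, as a fraction

def binomial_sd(n: int, p: float = P_DAILY) -> float:
    """Standard deviation of the daily death count in a population of n."""
    return math.sqrt(p * (1 - p) * n)

for name, n, storm_deaths in [
    ("Florida", 20_000_000, 84),
    ("Vietnam", 94_000_000, 26),
]:
    sd = binomial_sd(n)
    print(f"{name}: mean {P_DAILY * n:.0f} per day, sd {sd:.0f}, "
          f"storm deaths = {storm_deaths / sd:.1f} sd")
```

The last number on each line is how many standard deviations the storm's death toll represents relative to an ordinary day.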

It is common in such tragedies to point out in grave tones that “one death is too many”, but I maintain that this is not actually moral wisdom but empty platitude. No conceivable policy is ever going to reduce death rates to zero, and the people who died of heart attacks or brain aneurysms are every bit as dead as the people who died from hurricanes or terrorist attacks. Instead of focusing on the handful of people who died because they didn’t heed warnings or simply got extraordinarily unlucky, I think we should be focusing on the thousands of people who survived because our weather forecasters and emergency responders did their jobs so exceptionally well. Of course if we can reduce the numbers even further, we should; but from where I’m sitting, our emergency response system has a lot to be proud of.

Of course, the economic damage of the storms was substantially greater. The losses in destroyed housing and infrastructure in Florida are projected at over $80 billion. Vietnam is much poorer, so there simply isn’t as much infrastructure to destroy; total damage is unlikely to exceed $10 billion. Florida’s GDP is $926 billion, so they are losing 8.6%; while Vietnam’s GDP is $220 billion, so they are still losing 4.5%. And of course the damage isn’t evenly spread across everyone; those hardest hit will lose perhaps their entire net wealth, while others will feel absolutely nothing.

But economic damage is fleeting. Indeed, if we spend the government money we should, and take the opportunity to rebuild this infrastructure better than it was before, the long-run economic impact could be positive. Even 8.6% of GDP is less than five years of normal economic growth—and there were years in the 1950s where we did it in a single year. The 4.5% that Vietnam lost, they should make back within a year of their current economic growth.

Thank goodness.

Why do so many Americans think that crime is increasing?

Jan 29, JDN 2457783

Since the 1990s, crime in the United States has been decreasing, and yet in every poll since then most Americans report that they believe that crime is increasing.

It’s not a small decrease either. The US murder rate is down to the lowest it has been in a century. There are now a smaller absolute number (by 34 log points) of violent crimes per year in the US than there were 20 years ago, despite a significant increase in total population (19 log points—and the magic of log points is that, yes, the rate has decreased by precisely 53 log points).
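For anyone who wants to check the log-point arithmetic, here is a small Python sketch. It assumes the usual definition of a change in log points as 100 times the natural logarithm of the ratio of new to old; the helper name `to_ratio` is hypothetical.

```python
import math

def to_ratio(log_points: float) -> float:
    """Convert a change in log points back into a new/old ratio."""
    return math.exp(log_points / 100)

crimes = -34       # change in the absolute number of violent crimes
population = 19    # change in total population

# The crime rate is crimes divided by population, so the log changes subtract.
rate = crimes - population
print(rate)   # -53

# The same change expressed as an ordinary percentage.
print(f"{to_ratio(rate) - 1:.1%}")
```

The subtraction works exactly, with no approximation, because the logarithm of a quotient is the difference of the logarithms.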

It isn’t geographically uniform, of course; some states have improved much more than others, and a few states (such as New Mexico) have actually gotten worse.

The 1990s were a peak of violent crime, so one might say that we are just regressing to the mean. (Even that would be enough to make it baffling that people think crime is increasing.) But in fact overall crime in the US is now the lowest it has been since the 1970s, and still decreasing.

Indeed, this decrease has been underestimated, because we are now much better about reporting and investigating crimes than we used to be (which may also be part of why they are decreasing, come to think of it). If you compare against surveys of people who say they have been personally victimized, we’re looking at a decline in violent crime rates of two thirds—109 log points.

Just since 2008 violent crime has decreased by 26% (30 log points)—but of course we all know that Obama is “soft on crime” because he thinks cops shouldn’t be allowed to just shoot Black kids for no reason.

And yet, over 60% of Americans believe that overall crime in the US has increased in the last 10 years (though only 38% think it has increased in their own community!). These figures are actually down from 2010, when 66% thought crime was increasing nationally and 49% thought it was increasing in their local area.

The proportion of people who think crime is increasing does seem to decrease as crime rates decrease—but it still remains alarmingly high. If people were half as rational as most economists seem to believe, the proportion of people who think crime is increasing should drop to basically zero whenever crime rates decrease, since that’s a really basic fact about the world that you can just go look up on the Web in a couple of minutes. There’s no deep ambiguity, not even much “rational ignorance” given the low cost of getting correct answers. People just don’t bother to check, or don’t feel they need to.

What’s going on? How can crime fall to half what it was 20 years ago and yet almost two-thirds of people think it’s actually increasing?

Well, one hint is that news coverage of crime doesn’t follow the same pattern as actual crime.

News coverage in general is a terrible source of information, not simply because news organizations can be biased, make glaring mistakes, and sometimes outright lie—but actually for a much more fundamental reason: Even a perfect news channel, qua news channel, would report what is surprising—and what is surprising is, by definition, improbable. (Indeed, there is a formal mathematical concept in probability theory called surprisal that is simply the logarithm of 1 over the probability.) Even assuming that news coverage reports only the truth, the probability of seeing something on the news isn’t proportional to the probability of the event occurring—it’s more likely proportional to the entropy, which is probability times surprisal.
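To see what this implies, here is a minimal Python sketch. The probabilities are made up for illustration, the function names are hypothetical, and I am assuming logarithms in base 2 (so surprisal is measured in bits); the choice of base only rescales the numbers.

```python
import math

def surprisal(p: float) -> float:
    """Surprisal in bits: the logarithm of 1 over the probability."""
    return math.log2(1 / p)

def entropy_term(p: float) -> float:
    """An event's contribution to entropy: probability times surprisal."""
    return p * surprisal(p)

# Columns: probability, surprisal, entropy term.
for p in (0.5, 0.01, 0.0001):
    print(p, round(surprisal(p), 2), round(entropy_term(p), 4))
```

The ratio of the entropy term to the probability is just the surprisal, which grows as the event gets rarer. So coverage proportional to entropy gives rare events more airtime, relative to how often they actually happen, than common ones.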

Now, if humans were optimal information processing engines, that would be just fine, actually; reporting events proportional to their entropy is actually a very efficient mechanism for delivering information (optimal, under certain types of constraints), provided that you can then process the information back into probabilities afterward.

But of course, humans aren’t optimal information processing engines. We don’t recompute the probabilities from the given entropy; instead we use the availability heuristic, by which we simply use the number of times we can think of something happening as our estimate of the probability of that event occurring. If you see more murders on TV news than you used to, you assume that murders must be more common than they used to be. (And when I put it like that, it really doesn’t sound so unreasonable, does it? Intuitively the availability heuristic seems to make sense—which is part of why it’s so insidious.)

Another likely reason for the discrepancy between perception and reality is nostalgia. People almost always have a more positive view of the past than it deserves, particularly when referring to their own childhoods. Indeed, I’m quite certain that a major reason why people think the world was much better when they were kids was that their parents didn’t tell them what was going on. And of course I’m fine with that; you don’t need to burden 4-year-olds with stories of war and poverty and terrorism. I just wish people would realize that they were being protected from the harsh reality of the world, instead of thinking that their little bubble of childhood innocence was a genuinely much safer world than the one we live in today.

Then take that nostalgia and combine it with the availability heuristic and the wall-to-wall TV news coverage of anything bad that happens—and almost nothing good that happens, certainly not if it’s actually important. I’ve seen bizarre fluff pieces about puppies, but never anything about how world hunger is plummeting or air quality is dramatically improved or cars are much safer. That’s the one thing I will say about financial news; at least they report it when unemployment is down and the stock market is up. (Though most Americans, especially most Republicans, still seem really confused on those points as well….) They will attribute it to anything from sunspots to the will of Neptune, but at least they do report good news when it happens. It’s no wonder that people are always convinced that the world is getting more dangerous even as it gets safer and safer.

The real question is what we do about it—how do we get people to understand even these basic facts about the world? I still believe in democracy, but when I see just how painfully ignorant so many people are of such basic facts, I understand why some people don’t. The point of democracy is to represent everyone’s interests—but we also end up representing everyone’s beliefs, and sometimes people’s beliefs just don’t line up with reality. The only way forward I can see is to find a way to make people’s beliefs better align with reality… but even that isn’t so much a strategy as an objective. What do I say to someone who thinks that crime is increasing, beyond showing them the FBI data that clearly indicates otherwise? When someone is willing to override all evidence with what they feel in their heart to be true, what are the rest of us supposed to do?

Experimentally testing categorical prospect theory

Dec 4, JDN 2457727

In last week’s post I presented a new theory of probability judgments, which doesn’t rely upon people performing complicated math even subconsciously. Instead, I hypothesize that people try to assign categories to their subjective probabilities, and throw away all the information that wasn’t used to assign that category.

The way to most clearly distinguish this from cumulative prospect theory is to show discontinuity. Kahneman’s smooth, continuous function places fairly strong bounds on just how much a shift from 0% to 0.000001% can really affect your behavior. In particular, if you want to explain the fact that people do seem to behave differently around 10% compared to 1% probabilities, you can’t allow the slope of the smooth function to get much higher than 10 at any point, even near 0 and 1. (It does depend on the precise form of the function, but the more complicated you make it, the more free parameters you add to the model. In the most parsimonious form, which is a cubic polynomial, the maximum slope is actually much smaller than this—only 2.)

If that’s the case, then switching from 0% to 0.0001% should have no more effect in reality than a switch from 0% to 0.00001% would to a rational expected utility optimizer. But in fact I think I can set up scenarios where it would have a larger effect than a switch from 0.001% to 0.01%.

Indeed, these games are already quite profitable for the majority of US states, and they are called lotteries.

Rationally, it should make very little difference to you whether your odds of winning the Powerball are 0 (you bought no ticket) or 0.0000001%, that is, 10^-9 (you bought a ticket), even when the prize is $100 million. This is because your utility of $100 million is nowhere near 100 million times as large as your marginal utility of $1. A good guess would be that your lifetime income is about $2 million, your utility is logarithmic, the units of utility are hectoQALY, and the baseline level is about 100,000.

I apologize for the extremely large number of decimals, but I had to do that in order to show any difference at all. Watch for where the decimals first deviate from the baseline.

Your utility if you don’t have a ticket is ln(20) = 2.9957322736 hQALY.

Your utility if you have a ticket is (1-10^-9) ln(20) + 10^-9 ln(1020) = 2.9957322775 hQALY.

You gain a whopping 0.4 microQALY (about 4 × 10^-9 hectoQALY) over your whole lifetime. I highly doubt you could even perceive such a difference.

And yet, people are willing to pay nontrivial sums for the chance to play such lotteries. Powerball tickets sell for about $2 each, and some people buy tickets every week. If you do that and live to be 80, you will spend some $8,000 on lottery tickets during your lifetime, which results in this expected utility: (1-4*10^-6) ln(20-0.08) + 4*10^-6 ln(1020) = 2.9917399955 hQALY.

You have now sacrificed 0.004 hectoQALY, which is to say 0.4 QALY—that’s months of happiness you’ve given up to play this stupid pointless game.

Which shouldn’t be surprising, as (with 99.9996% probability) you have given up four months of your lifetime income with nothing to show for it. Lifetime income of $2 million / lifespan of 80 years = $25,000 per year; $8,000 / $25,000 = 0.32. You’ve actually sacrificed slightly more than this, which comes from your risk aversion.
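The utility figures above can be reproduced with a short Python sketch. It mirrors the formulas exactly as written, including their simplifications: the prize is added to lifetime income, ticket spending is subtracted only in the losing branch, and a lifetime of weekly play is rounded to 4,000 tickets costing $8,000. All constant and function names are my own.

```python
import math

BASELINE = 100_000            # dollars per unit inside the logarithm
LIFETIME_INCOME = 2_000_000   # dollars
PRIZE = 100_000_000           # dollars
P_WIN = 1e-9                  # per-ticket win probability used in the formulas

def utility(wealth: float) -> float:
    """Logarithmic utility, in hectoQALY."""
    return math.log(wealth / BASELINE)

def expected_utility(p_win: float, spent: float) -> float:
    """Expected utility given a total win probability and total spending."""
    lose = utility(LIFETIME_INCOME - spent)
    win = utility(LIFETIME_INCOME + PRIZE)
    return (1 - p_win) * lose + p_win * win

no_ticket = expected_utility(0.0, 0.0)
one_ticket = expected_utility(P_WIN, 0.0)
weekly = expected_utility(4000 * P_WIN, 8000.0)

for label, u in [("no ticket", no_ticket),
                 ("one ticket", one_ticket),
                 ("weekly for life", weekly)]:
    print(f"{label}: {u:.10f} hQALY")

# Differences converted from hectoQALY to QALY (1 hQALY = 100 QALY).
print("gain from one ticket, in QALY:", (one_ticket - no_ticket) * 100)
print("loss from weekly play, in QALY:", (no_ticket - weekly) * 100)
```

Changing `BASELINE` or `LIFETIME_INCOME` shifts the absolute numbers, but the gap between the tiny gain from one ticket and the much larger loss from habitual play remains.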

Why would anyone do such a thing? Because while the difference between 0 and 10^-9 may be trivial, the difference between “impossible” and “almost impossible” feels enormous. “You can’t win if you don’t play!” they say, but they might as well say “You can’t win if you do play either.” Indeed, the probability of winning without playing isn’t zero; you could find a winning ticket lying on the ground, or win due to an error that is then upheld in court, or be given the winnings bequeathed by a dying family member or gifted by an anonymous donor. These are of course vanishingly unlikely—but so was winning in the first place. You’re talking about the difference between 10^-9 and 10^-12, which in proportional terms sounds like a lot—but in absolute terms is nothing. If you drive to a drug store every week to buy a ticket, you are more likely to die in a car accident on the way to the drug store than you are to win the lottery.

Of course, these are not experimental conditions. So I need to devise a similar game, with smaller stakes but still large enough for people’s brains to care about the “almost impossible” category; maybe thousands? It’s not uncommon for an economics experiment to cost thousands, it’s just usually paid out to many people instead of randomly to one person or nobody. Conducting the experiment in an underdeveloped country like India would also effectively amplify the amounts paid, but at the fixed cost of transporting the research team to India.

But I think in general terms the experiment could look something like this. You are given $20 for participating in the experiment (we treat it as already given to you, to maximize your loss aversion and endowment effect and thereby give us more bang for our buck). You then have a chance to play a game, where you pay $X to get a P probability of $Y*X, and we vary these numbers.

The actual participants wouldn’t see the variables, just the numbers and possibly the rules: “You can pay $2 for a 1% chance of winning $200. You can also play multiple times if you wish.” “You can pay $10 for a 5% chance of winning $250. You can only play once or not at all.”

So I think the first step is to find some dilemmas, cases where people feel ambivalent, and different people differ in their choices. That’s a good role for a pilot study.

Then we take these dilemmas and start varying their probabilities slightly.

In particular, we try to vary them at the edge of where people have mental categories. If subjective probability is continuous, a slight change in actual probability should never result in a large change in behavior, and furthermore the effect of a change shouldn’t vary too much depending on where the change starts.

But if subjective probability is categorical, these categories should have edges. Then, when I present you with two dilemmas that are on opposite sides of one of the edges, your behavior should radically shift; while if I change it in a different way, I can make a large change without changing the result.

Based solely on my own intuition, I guessed that the categories roughly follow this pattern:

Impossible: 0%

Almost impossible: 0.1%

Very unlikely: 1%

Unlikely: 10%

Fairly unlikely: 20%

Roughly even odds: 50%

Fairly likely: 80%

Likely: 90%

Very likely: 99%

Almost certain: 99.9%

Certain: 100%

So for example, if I switch from 0% to 0.01%, it should have a very large effect, because I’ve moved you out of your “impossible” category (indeed, I think the “impossible” category is almost completely sharp; literally anything above zero seems to be enough for most people, even 10^-9 or 10^-10). But if I move from 1% to 2%, it should have a small effect, because I’m still well within the “very unlikely” category. Yet the latter change is literally one hundred times larger than the former. It is possible to define continuous functions that would behave this way to an arbitrary level of approximation—but they get a lot less parsimonious very fast.
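One way to make the hypothesis concrete is a toy classifier in Python. The category labels and anchor values are the guesses listed above; the assignment rule is purely my own assumption for illustration (exactly 0 and exactly 1 are sharp, and everything in between snaps to the nearest anchor), and the name `categorize` is hypothetical. Where the edges between categories really lie is precisely what the experiment would have to find out.

```python
# Guessed categories from the list above, as (anchor probability, label).
CATEGORIES = [
    (0.0, "Impossible"),
    (0.001, "Almost impossible"),
    (0.01, "Very unlikely"),
    (0.10, "Unlikely"),
    (0.20, "Fairly unlikely"),
    (0.50, "Roughly even odds"),
    (0.80, "Fairly likely"),
    (0.90, "Likely"),
    (0.99, "Very likely"),
    (0.999, "Almost certain"),
    (1.0, "Certain"),
]

def categorize(p: float) -> str:
    """Assign a probability to a category (assumed nearest-anchor rule)."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    if p == 0.0:
        return "Impossible"   # sharp edge: only exactly zero counts
    if p == 1.0:
        return "Certain"      # sharp edge: only exactly one counts
    interior = CATEGORIES[1:-1]
    return min(interior, key=lambda c: abs(c[0] - p))[1]

for p in (0.0, 0.0001, 0.01, 0.02):
    print(p, categorize(p))
```

Under this rule, moving from 0% to 0.01% changes the category, while moving from 1% to 2% does not, even though the second change is a hundred times larger.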

Now, immediately I run into a problem, because I’m not even sure those are my categories, much less that they are everyone else’s. If I knew precisely which categories to look for, I could tell whether or not I had found it. But the process of both finding the categories and determining if their edges are truly sharp is much more complicated, and requires a lot more statistical degrees of freedom to get beyond the noise.

One thing I’m considering is assigning these values as a prior, and then conducting a series of experiments which would adjust that prior. In effect I would be using optimal Bayesian probability reasoning to show that human beings do not use optimal Bayesian probability reasoning. Still, I think that actually pinning down the categories would require a large number of participants or a long series of experiments (in frequentist statistics this distinction is vital; in Bayesian statistics it is basically irrelevant—one of the simplest reasons to be Bayesian is that it no longer bothers you whether someone did 2 experiments of 100 people or 1 experiment of 200 people, provided they were the same experiment of course). And of course there’s always the possibility that my theory is totally off-base, and I find nothing; a dissertation replicating cumulative prospect theory is a lot less exciting (and, sadly, less publishable) than one refuting it.

Still, I think something like this is worth exploring. I highly doubt that people are doing very much math when they make most probabilistic judgments, and using categories would provide a very good way for people to make judgments usefully with no math at all.

How I wish we measured percentage change

JDN 2457415

For today’s post I’m taking a break from issues of global policy to discuss a bit of a mathematical pet peeve. It is an opinion I share with many economists—for instance Miles Kimball has a very nice post about it, complete with some clever analogies to music.

I hate when we talk about percentages in asymmetric terms.

What do I mean by this? Well, here are a few examples.

If my stock portfolio loses 10% one year and then gains 11% the following year, have I gained or lost money? I’ve lost money. Only a little bit—I’m down 0.1%—but still, a loss.

In 2003, Venezuela suffered a depression of -26.7% growth one year, and then an economic boom of 36.1% growth the following year. What was their new GDP, relative to what it was before the depression? Very slightly less than before. (99.8% of its pre-recession value, to be precise.) You would think that falling 27% and rising 36% would leave you about 9% ahead; in fact it leaves you behind.

Would you rather live in a country with 11% inflation and have constant nominal pay, or live in a country with no inflation and take a 10% pay cut? You should prefer the inflation; in that case your real income only falls by 9.9%, instead of 10%.

We often say that the real interest rate is simply the nominal interest rate minus the rate of inflation, but that’s actually only an approximation. If you have 7% inflation and a nominal interest rate of 11%, your real interest rate is not actually 4%; it is 3.74%. If you have 2% inflation and a nominal interest rate of 0%, your real interest rate is not actually -2%; it is -1.96%.
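These examples are easy to verify in Python. The sketch below is just the standard compounding arithmetic; the function names `compound` and `real_rate` are my own.

```python
def compound(*changes: float) -> float:
    """Net fractional change after applying each fractional change in turn."""
    total = 1.0
    for change in changes:
        total *= 1 + change
    return total - 1

def real_rate(nominal: float, inflation: float) -> float:
    """Exact real rate: (1 + nominal) / (1 + inflation) - 1."""
    return (1 + nominal) / (1 + inflation) - 1

print(f"{compound(-0.10, 0.11):.2%}")      # portfolio: lose 10%, then gain 11%
print(f"{compound(-0.267, 0.361):.2%}")    # Venezuela: -26.7%, then +36.1%
print(f"{real_rate(0.00, 0.11):.2%}")      # constant nominal pay, 11% inflation
print(f"{real_rate(0.11, 0.07):.2%}")      # 11% nominal interest, 7% inflation
print(f"{real_rate(0.00, 0.02):.2%}")      # 0% nominal interest, 2% inflation
```

Subtracting inflation from the nominal rate is the first-order approximation of `real_rate`; the two agree closely only when both rates are small.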

This is what I mean by asymmetric:

Rising 10% and falling 10% do not cancel each other out. To cancel out a fall of 10%, you must actually rise 11.1%.

Gaining 20% and losing 20% do not cancel each other out. To cancel out a loss of 20%, you need a gain of 25%.

Is it starting to bother you yet? It sure bothers me.

Worst of all is the fact that the way we usually measure percentages, losses are bounded at 100% while gains are unbounded. To cancel a loss of 100%, you’d need a gain of infinity.

There are two basic ways of solving this problem: The simple way, and the good way.

The simple way is to just start measuring percentages symmetrically, by including both the starting and ending values in the calculation and averaging them.

That is, instead of using this formula:

% change = 100% * (new – old)/(old)

You use this one:

% change = 100% * (new – old)/((new + old)/2)

In this new system, percentage changes are symmetric.

Suppose a country’s GDP rises from $5 trillion to $6 trillion.

In the old system we’d say it has risen 20%:

100% * ($6 T – $5 T)/($5 T) = 20%

In the symmetric system, we’d say it has risen 18.2%:

100% * ($6 T – $5 T)/($5.5 T) = 18.2%

Suppose it falls back to $5 trillion the next year.

In the old system we’d say it has only fallen 16.7%:

100% * ($5 T – $6 T)/($6 T) = -16.7%

But in the symmetric system, we’d say it has fallen 18.2%.

100% * ($5 T – $6 T)/($5.5 T) = -18.2%

In the old system, the gain of 20% was somehow canceled by a loss of 16.7%. In the symmetric system, the gain of 18.2% was canceled by a loss of 18.2%, just as you’d expect.
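Both formulas are one-liners in Python. This is a minimal sketch; the function names are hypothetical, and both functions assume positive values.

```python
def usual_change(old: float, new: float) -> float:
    """The usual percentage change, relative to the old value."""
    return 100 * (new - old) / old

def symmetric_change(old: float, new: float) -> float:
    """Symmetric percentage change, relative to the average of old and new."""
    return 100 * (new - old) / ((new + old) / 2)

# GDP in trillions of dollars: up from 5 to 6, then back down to 5.
print(round(usual_change(5, 6), 1), round(usual_change(6, 5), 1))
print(round(symmetric_change(5, 6), 1), round(symmetric_change(6, 5), 1))
```

Swapping the arguments of `symmetric_change` flips only the sign, because the denominator is the same in both directions.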

This also removes the problem of losses being bounded but gains being unbounded. Now both losses and gains are bounded, at the rather surprising value of 200%.

Formally, that’s because of these limits:

$$\lim_{x \to \infty} \frac{x - 1}{(x + 1)/2} = 2$$

$$\lim_{x \to \infty} \frac{0 - x}{(x + 0)/2} = -2$$

It might be easier to intuit these limits with an example. Suppose something explodes from a value of 1 to a value of 10,000,000. In the old system, this means it rose 1,000,000,000%. In the symmetric system, it rose 199.9999%. Like the speed of light, you can approach 200%, but never quite get there.

100% * (10^7 – 1)/(5*10^6 + 0.5) = 199.9999%

Gaining 200% in the symmetric system is gaining an infinite amount. That’s… weird, to say the least. Also, losing everything is now losing… 200%?

This is simple to explain and compute, but it’s ultimately not the best way.

The best way is to use logarithms.

As you may vaguely recall from math classes past, logarithms are the inverse of exponents.

Since 2^4 = 16, log_2 (16) = 4.

The natural logarithm ln() is the most fundamental for deep mathematical reasons I don’t have room to explain right now. It uses the base e, a transcendental number that starts 2.718281828459045…

To the uninitiated, this probably seems like an odd choice—no rational number has a natural logarithm that is itself a rational number (well, other than 1, since ln(1) = 0).

But perhaps it will seem a bit more comfortable once I show you that natural logarithms are remarkably close to percentages, particularly for the small changes in which percentages make sense.

We define something called log points such that the change in log points is 100 times the natural logarithm of the ratio of the two:

log points = 100 * ln(new / old)

This is symmetric because of the following property of logarithms:

ln(a/b) = – ln(b/a)

Let’s return to the country that saw its GDP rise from $5 trillion to $6 trillion.

The logarithmic change is 18.2 log points:

100 * ln($6 T / $5 T) = 100 * ln(1.2) = 18.2

If it falls back to $5 T, the change is -18.2 log points:

100 * ln($5 T / $6 T) = 100 * ln(0.833) = -18.2

Notice how in the symmetric percentage system, it rose and fell 18.2%; and in the logarithmic system, it rose and fell 18.2 log points. They are almost interchangeable, for small percentages.
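In Python, the log-point calculation is a single call to the natural logarithm. A minimal sketch, with a function name of my own choosing:

```python
import math

def log_points(old: float, new: float) -> float:
    """Change from old to new in log points: 100 * ln(new / old)."""
    return 100 * math.log(new / old)

# GDP in trillions of dollars: up from 5 to 6, then back down to 5.
up = log_points(5, 6)
down = log_points(6, 5)
print(round(up, 1), round(down, 1))
```

The rise and the fall have the same size and opposite signs, so they cancel when added.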

In this graph, the old value is assumed to be 1. The horizontal axis is the new value, and the vertical axis is the percentage change we would report by each method.

[Figure: percentage_change_small]

The green line is the usual way we measure percentages.

The red curve is the symmetric percentage method.

The blue curve is the logarithmic method.

For percentages within +/- 10%, all three methods are about the same. Then both new methods give about the same answer all the way up to changes of +/- 40%. Since most real changes in economics are within that range, the symmetric method and the logarithmic method are basically interchangeable.

However, for very large changes, even these two methods diverge, and in my opinion the logarithm is to be preferred.

[Figure: percentage_change_large]

The symmetric percentage never gets above 200% or below -200%, while the logarithm is unbounded in both directions.

If you lose everything, the old system would say you have lost 100%. The symmetric system would say you have lost 200%. The logarithmic system would say you have lost infinity log points. If infinity seems a bit too extreme, think of it this way: You have in fact lost everything. No finite proportional gain can ever bring it back. A loss that requires a gain of infinity percent seems like it should be called a loss of infinity percent, doesn’t it? Under the logarithmic system it is.

If you gain an infinite amount, the old system would say you have gained infinity percent. The logarithmic system would also say that you have gained infinity log points. But the symmetric percentage system would say that you have gained 200%. 200%? Counter-intuitive, to say the least.

Log points also have another very nice property that neither the usual system nor the symmetric percentage system has: You can add them.

If you gain 25 log points, lose 15 log points, then gain 10 log points, you have gained 20 log points.

25 – 15 + 10 = 20

Just as you’d expect!

But if you gain 25%, then lose 15%, and then gain 10%, you have gained… 16.9%.

(1 + 0.25)*(1 – 0.15)*(1 + 0.10) = 1.169

If you gain 25% symmetric, lose 15% symmetric, then gain 10% symmetric, that calculation is really a pain. To find the value y that is p symmetric percentage points from the starting value x, you end up needing to solve this equation:

p = 100 * (y – x)/((x+y)/2)

This can be done; it comes out like this:

y = (200 + p)/(200 – p) * x

(This also gives a bit of insight into why it is that the bounds are +/- 200%.)

So by chaining those, we can in fact find out what happens after gaining 25%, losing 15%, then gaining 10% in the symmetric system:

(200 + 25)/(200 – 25)*(200 – 15)/(200 + 15)*(200 + 10)/(200 – 10) = 1.223

Then we can put that back into the symmetric system:

100% * (1.223 – 1)/((1+1.223)/2) = 20.1%

So after all that work, we find out that you have gained 20.1% symmetric. We could almost just add them—because they are so similar to log points—but we can’t quite.
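The chaining can be checked with a short Python sketch using the two formulas above; the function names are hypothetical. One caveat: the code carries the unrounded intermediate value rather than 1.223, so its final figure comes out a few hundredths of a point lower than the hand calculation. Either way, it is close to 20 without being exactly 20.

```python
def apply_symmetric(x: float, p: float) -> float:
    """Value that is p symmetric percentage points away from x."""
    return (200 + p) / (200 - p) * x

def symmetric_change(old: float, new: float) -> float:
    """Symmetric percentage change from old to new."""
    return 100 * (new - old) / ((new + old) / 2)

value = 1.0
for p in (25, -15, 10):   # gain 25%, lose 15%, gain 10% (all symmetric)
    value = apply_symmetric(value, p)

net = symmetric_change(1.0, value)
print(round(value, 3), round(net, 2))
```

Note that `apply_symmetric` divides by zero at p = 200, which is another way of seeing why 200% is the bound.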

Log points actually turn out to be really convenient, once you get the hang of them. The problem is that there’s a conceptual leap for most people to grasp what a logarithm is in the first place.

In particular, the hardest part to grasp is probably that a doubling is not 100 log points.

It is in fact 69 log points, because ln(2) = 0.69.

(Doubling in the symmetric percentage system is gaining 67%—much closer to the log points than to the usual percentage system.)

Calculation of the new value is a bit more difficult than in the usual system, but not as difficult as in the symmetric percentage system.

If you have a change of p log points from a starting point of x, the ending point y is:

y = e^{p/100} * x

The fact that you can add log points ultimately comes from the way exponents add:

e^{p1/100} * e^{p2/100} = e^{(p1+p2)/100}

Suppose US GDP grew 2% in 2007, then 0% in 2008, then fell 8% in 2009 and rose 4% in 2010 (this is approximately true). Where was it in 2010 relative to 2006? Who knows, right? It turns out to be a net loss of 2.4%; so if it was $15 T before it’s now $14.64 T. If you had just added, you’d think it was only down 2%; you’d have underestimated the loss by about $60 billion.

But if it had grown 2 log points, then 0 log points, then fell 8 log points, then rose 4 log points, the answer is easy: It’s down 2 log points. If it was $15 T before, it’s now $14.70 T. Adding gives the correct answer this time.
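Here is the same comparison as a Python sketch. The starting GDP of $15 trillion and the yearly changes are the ones from the example; the variable names are my own.

```python
import math

START_GDP = 15.0            # trillions of dollars
changes = [2, 0, -8, 4]     # the four yearly changes

# Read as ordinary percentages: the changes must be multiplied together.
ratio = 1.0
for pct in changes:
    ratio *= 1 + pct / 100
gdp_percent = START_GDP * ratio

# Read as log points: the changes can simply be added.
net_log_points = sum(changes)
gdp_log = START_GDP * math.exp(net_log_points / 100)

print(f"percentages: net {ratio - 1:.2%}, GDP {gdp_percent:.3f} T")
print(f"log points:  net {net_log_points}, GDP {gdp_log:.3f} T")
```

The only step that needs an exponential is the final conversion back to dollars; everything before it is plain addition.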

Thus, instead of saying that the stock market fell 4.3%, we should say it fell 4.4 log points. Instead of saying that GDP is up 1.9%, we should say it is up 1.8 log points. For small changes it won’t even matter; if inflation is 1.4%, it is in fact also 1.4 log points. Log points are a bit harder to conceptualize; but they are symmetric and additive, which other methods are not.

Is this a matter of life and death on a global scale? No.

But I can’t write about those every day, now can I?