What would a new macroeconomics look like?

Dec 9 JDN 2458462

In previous posts I have extensively criticized the current paradigm of macroeconomics. But it’s always easier to tear the old edifice down than to build a better one in its place. So in this post I thought I’d try to be more constructive: What sort of new directions could macroeconomics take?

The most important change we need to make is to abandon the assumption of dynamic optimization. This will be a very hard sell, as most macroeconomists have become convinced that the Lucas Critique means we need to always base everything on the dynamic optimization of a single representative agent. I don’t think this was actually what Lucas meant (though maybe we should ask him; he’s still at Chicago), and I certainly don’t think it is what he should have meant. He had a legitimate point about the way macroeconomics was operating at that time: It was ignoring the feedback loops that occur when we start trying to change policies.

Goodhart’s Law is probably a better formulation: Once you make an indicator into a target, you make it less effective as an indicator. So while inflation does seem to be negatively correlated with unemployment, that doesn’t mean we should try to increase inflation to extreme levels in order to get rid of unemployment; sooner or later the economy is going to adapt and we’ll just have both inflation and unemployment at the same time. (Campbell’s Law provides a specific example that I wish more people in the US understood: Test scores would be a good measure of education if we didn’t use them to target educational resources.)

The reason we must get rid of dynamic optimization is quite simple: No one behaves that way.

It’s often computationally intractable even in our wildly oversimplified models that experts spend years working on. Now you’re imagining that everyone does this constantly?

The most fundamental part of almost every DSGE model is the Euler equation; this equation comes directly from the dynamic optimization. It’s supposed to predict how people will choose to spend and save based upon their plans for an infinite sequence of future income and spending—and if this sounds utterly impossible, that’s because it is. Euler equations don’t fit the data at all, and even extreme attempts to save them by adding a proliferation of additional terms have failed. (It reminds me very much of the epicycles that astronomers used to add to the geocentric model of the universe to try to squeeze in weird results like Mars, before they had the heliocentric model.)

We should instead start over: How do people actually choose their spending? Well, first of all, it’s not completely rational. But it’s also not totally random. People spend on necessities before luxuries; they try to live within their means; they shop for bargains. There is a great deal of data from behavioral economics that could be brought to bear on understanding the actual heuristics people use in deciding how to spend and save. There have already been successful policy interventions using this knowledge, like Save More Tomorrow.

The best thing about this is that it should make our models simpler. We’re no longer asking each agent in the model to solve an impossible problem. However people actually make these decisions, we know it can be done, because it is being done. Most people don’t really think that hard, even when they probably should; so the heuristics really can’t be that complicated. My guess is that you can get a good fit—certainly better than an Euler equation—just by assuming that people set a target for how much they’re going to save (which is also probably pretty small for most people), and then spend the rest.
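To make the proposed rule of thumb concrete, here is a minimal sketch of a "save a small target, spend the rest" agent. The function name, the 5% default target, and the optional `necessities` floor are all hypothetical choices for illustration, not estimates from any data.

```python
def spend_and_save(income: float,
                   saving_target_rate: float = 0.05,
                   necessities: float = 0.0) -> dict:
    """Heuristic household: set a saving target, then spend the rest.

    All parameters are illustrative assumptions:
    - saving_target_rate: share of income the household aims to save
    - necessities: spending that takes priority over saving
    """
    target_saving = saving_target_rate * income
    # Necessities come first, so saving is squeezed when income is tight.
    saving = max(0.0, min(target_saving, income - necessities))
    spending = income - saving
    return {"saving": saving, "spending": spending}


comfortable = spend_and_save(3000.0, 0.05, necessities=2000.0)
squeezed = spend_and_save(2050.0, 0.05, necessities=2000.0)
print(comfortable)
print(squeezed)
```

Note that nothing here requires the agent to forecast an infinite stream of future income; the decision uses only this period's income and a fixed target.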

The second most important thing we need to add is inequality. Some people are much richer than others; this is a very important fact about economics that we need to understand. Yet it has taken the economics profession decades to figure this out, and even now I’m only aware of one class of macroeconomic models that seriously involves inequality, the Heterogeneous Agent New Keynesian (HANK) models which didn’t emerge until the last few years (the earliest publication I can find is 2016!). And these models are monsters; they are almost always computationally intractable and have a huge number of parameters to estimate.

Understanding inequality will require more parameters, that much is true. But if we abandon dynamic optimization, we won’t need as many as the HANK models have, and most of the new parameters are actually things we can observe, like the distribution of wages and years of schooling.

Observability of parameters is a big deal. Another problem with the way the Lucas Critique has been used is that we’ve been told we need to be using “deep structural parameters” like the intertemporal elasticity of substitution and the coefficient of relative risk aversion—but we have no idea what those actually are. We can’t observe them, and all of our attempts to measure them indirectly have yielded inconclusive or even inconsistent results. This is probably because these parameters are based on assumptions about human rationality that are simply not realistic. Most people probably don’t have a well-defined intertemporal elasticity of substitution, because their day-to-day decisions simply aren’t consistent enough over time for that to make sense. Sometimes they eat salad and exercise; sometimes they loaf on the couch and drink milkshakes. Likewise with risk aversion: many moons ago I wrote about how people will buy both insurance and lottery tickets, which no one with a consistent coefficient of relative risk aversion would ever do.

So if we are interested in deep structural parameters, we need to base those parameters on behavioral experiments so that we can understand actual human behavior. And frankly I don’t think we need deep structural parameters; I think this is a form of greedy reductionism, where we assume that the way to understand something is always to look at smaller pieces. Sometimes the whole is more than the sum of its parts. Economists obviously feel a lot of envy for physics; but they don’t seem to understand that aerodynamics would never have (ahem) gotten off the ground if we had first waited for an exact quantum mechanical solution of the oxygen atom (which we still don’t have, by the way). Macroeconomics may not actually need “microfoundations” in the strong sense that most economists intend; it needs to be consistent with small-scale behavior, but it doesn’t need to be derived from small-scale behavior.

This means that the new paradigm in macroeconomics does not need to be computationally intractable. Using heuristics instead of dynamic optimization and worrying less about microfoundations will make the models simpler; adding inequality need not make them so much more complicated.

Fighting the zero-sum paradigm

Dec 2 JDN 2458455

It should be obvious at this point that there are deep, perhaps even fundamental, divides between the attitudes and beliefs of different political factions. It can be very difficult to even understand, much less sympathize with, the concerns of people who are racist, misogynistic, homophobic, xenophobic, and authoritarian.
But at the end of the day we still have to live in the same country as these people, so we’d better try to understand how they think. And maybe, just maybe, that understanding will help us to change them.

There is one fundamental belief system that I believe underlies almost all forms of extremism. Right now right-wing extremism is the major threat to global democracy, but left-wing extremism subscribes to the same core paradigm (consistent with Horseshoe Theory).

I think the best term for this is the zero-sum paradigm. The idea is quite simple: There is a certain amount of valuable “stuff” (money, goods, land, status, happiness) in the world, and the only political question is who gets how much.

Thus, any improvement in anyone’s life must, necessarily, come at someone else’s expense. If I become richer, you become poorer. If I become stronger, you become weaker. Any improvement in my standard of living is a threat to your status.

If this belief were true, it would justify, or at least rationalize, all sorts of destructive behavior: Any harm I can inflict upon someone else will yield a benefit for me, by some fundamental conservation law of the universe.

Viewed in this light, beliefs like patriarchy and White supremacy suddenly become much more comprehensible: Why would you want to spend so much effort hurting women and Black people? Because, by the fundamental law of zero-sum, any harm to women is a benefit to men, and any harm to Black people is a benefit to White people. The world is made of “teams”, and you are fighting for your own against all the others.

And I can even see why such an attitude is seductive: It’s simple and easy to understand. And there are many circumstances where it can be approximately true.

When you are bargaining with your boss over a wage, one dollar more for you is one dollar less for your boss.

When your factory outsources production to China, one more job for China is one less job for you.

When we vote for President, one more vote for the Democrats is one less vote for the Republicans.

But of course the world is not actually zero-sum. Both you and your boss would be worse off if your job were to disappear; they need your work and you need their money. For every job that is outsourced to China, another job is created in the United States. And democracy itself is such a profound public good that it basically overwhelms all others.

In fact, it is precisely when a system is running well that the zero-sum paradigm becomes closest to true. In the space of all possible allocations, it is the efficient ones that behave in something like a zero-sum way, because when the system is efficient, we are already producing as much as we can.

This may be part of why populist extremism always seems to assert itself during periods of global prosperity, as in the 1920s and today: It is precisely when the world is running at its full capacity that it feels most like someone else’s gain must come at your loss.

Yet if we live according to the zero-sum paradigm, we will rapidly destroy the prosperity that made that paradigm seem plausible. A trade war between the US and China would put millions out of work in both countries. A real war with conventional weapons would kill millions. A nuclear war would kill billions.

This is what we must convey: We must show people just how good things are right now.

This is not an easy task; when people want to believe the world is falling apart, they can very easily find excuses to do so. You can point to the statistics showing a global decline in homicide, but one dramatic shooting on the TV news will wipe that all away. You can show the worldwide rise in real incomes across the board, but that won’t console someone who just lost their job and blames outsourcing or immigrants.

Indeed, many people will be offended by the attempt—the mere suggestion that the world is actually in very good shape and overall getting better will be perceived as an attempt to deny or dismiss the problems and injustices that still exist.

I encounter this especially from the left: Simply pointing out the objective fact that the wealth gap between White and Black households is slowly closing is often taken as a claim that racism no longer exists or doesn’t matter. Congratulating the meteoric rise in women’s empowerment around the world is often paradoxically viewed as dismissing feminism instead of lauding it.

I think the best case against progress can be made with regard to global climate change: Carbon emissions are not falling nearly fast enough, and the world is getting closer to the brink of truly catastrophic ecological damage. Yet even here the zero-sum paradigm is clearly holding us back; workers in fossil-fuel industries think that the only way to reduce carbon emissions is to make their families suffer, but that’s simply not true. We can make them better off too.

Talking about injustice feels righteous. Talking about progress doesn’t. Yet I think what the world needs most right now—the one thing that might actually pull us back from the brink of fascism or even war—is people talking about progress.

If people think that the world is full of failure and suffering and injustice, they will want to tear down the whole system and start over with something else. In a world that is largely democratic, that very likely means switching to authoritarianism. If people think that this is as bad as it gets, they will be willing to accept or even instigate violence in order to change to almost anything else.

But if people realize that in fact the world is full of success and prosperity and progress, that things are right now quite literally better in almost every way for almost every person in almost every country than they were a hundred—or even fifty—years ago, they will not be so eager to tear the system down and start anew. Centrism is often mocked (partly because it is confused with false equivalence), but in a world where life is improving this quickly for this many people, “stay the course” sounds awfully attractive to me.

That doesn’t mean we should ignore the real problems and injustices that still exist, of course. There is still a great deal of progress left to be made. But I believe we are more likely to make progress if we acknowledge and seek to continue the progress we have already made, than if we allow ourselves to fall into despair as if that progress did not exist.

What do we mean by “obesity”?

Nov 25 JDN 2458448

I thought this topic would be particularly appropriate for the week of Thanksgiving, since as a matter of public ritual, this time every year, we eat too much and don’t get enough exercise.

No doubt you have heard the term “obesity epidemic”: It’s not just used by WebMD or mainstream news; it’s also used by the American Heart Association, the Centers for Disease Control, the World Health Organization, and sometimes even published in peer-reviewed journal articles.

This is kind of weird, because the formal meaning of the term “epidemic” clearly does not apply here. I feel uncomfortable going against public health officials in what is clearly their area of expertise rather than my own, but everything I’ve ever read about the official definition of the word “epidemic” requires it to be an infectious disease. You can’t “catch” obesity. Hanging out with people who are obese may slightly raise your risk of obesity, but not in the way that hanging out with people with influenza gives you influenza. It’s not caused by bacteria or viruses. Eating food touched by a fat person won’t cause you to catch the fat. Therefore, whatever else it is, this is not an epidemic. (I guess sometimes we use the term more metaphorically, “an epidemic of bankruptcies” or an “epidemic of video game consumption”; but I feel like the WHO and CDC of all people should be more careful.)

Indeed, before we decide what exactly this is, I think we should first ask ourselves a deeper question: What do we mean by “obesity”?

The standard definition of “obesity” relies upon the body mass index (BMI), a very crude measure that simply takes your body mass and divides by the square of your height. It’s easy to measure, but that’s basically its only redeeming quality.

Anyone who has studied dimensional analysis should immediately see a problem here: That isn’t a unit of density. It’s a unit of… density-length? If you take the exact same individual and scale them up by 10%, their BMI will increase by 10%. Do we really intend to say that simply being larger makes you obese, for the exact same ratios of muscle, fat, and bone?

Because of this, the taller you are, the more likely your BMI is going to register as “obese”, holding constant your actual level of health and fitness. And worldwide, average height has been increasing. This isn’t enough to account for the entire trend in rising BMI, but it reduces it substantially; average height has increased by about 10% since the 1950s, which is enough to raise our average BMI by about 2 points of the 5-point observed increase.
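The scaling argument can be checked numerically. The sketch below assumes isometric scaling (every length grows by the same factor, so mass grows with the cube of that factor); the 70 kg, 1.75 m individual and the baseline BMI of 22 are hypothetical values chosen only for illustration.

```python
def bmi(mass_kg: float, height_m: float) -> float:
    """Body mass index: mass divided by height squared (kg/m^2)."""
    return mass_kg / height_m ** 2


def scaled_bmi(mass_kg: float, height_m: float, scale: float) -> float:
    """BMI of the same body scaled uniformly by `scale`.

    Assumes isometric scaling: height grows by `scale`,
    mass (proportional to volume) grows by `scale` cubed.
    """
    return bmi(mass_kg * scale ** 3, height_m * scale)


base = bmi(70.0, 1.75)                  # hypothetical individual
bigger = scaled_bmi(70.0, 1.75, 1.10)   # same proportions, 10% larger
ratio = bigger / base                   # cube over square leaves one factor of scale

# Under the same assumption, a 10% rise in average height applied to a
# hypothetical baseline BMI of 22 adds about 2.2 points.
height_effect_points = 22.0 * (1.10 - 1.0)

print(f"BMI ratio after scaling: {ratio:.3f}")
print(f"Points attributable to height: {height_effect_points:.1f}")
```

The cube in the numerator and the square in the denominator leave one factor of the scale, which is why BMI rises in proportion to height for an otherwise identical body.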

And of course BMI doesn’t say anything about your actual ratios of fat and muscle; all it says is how many total kilograms are in your body. As a result, there is a systematic bias against athletes in the calculation of BMI—and any health measure that is biased against athletes is clearly doing something wrong. All those doctors telling us to exercise more may not realize it, but if we actually took their advice, our BMIs would very likely get higher, not lower—especially for men, especially for strength-building exercise.

It’s also quite clear that our standards for “healthy weight” are distorted by social norms. Feminists have been talking about this for years; most women will never look like supermodels no matter how much weight they lose—and eating disorders are much more dangerous than being even 50 pounds overweight. We’re starting to figure out that similar principles hold for men: A six-pack of abs doesn’t actually mean you’re healthy; it means you are dangerously depleted of fatty acids.

To compensate for this, it seems like the most sensible methodology would be to figure out empirically what sort of weight is most strongly correlated with good health and long lifespan—what BMI maximizes your expected QALY.

You might think that this is what public health officials did when defining what is currently categorized as “normal weight”—but you would be wrong. They used social norms and general intuition, and as a result, our standards for “normal weight” are systematically miscalibrated.

In fact, the empirical evidence is quite clear: The people with the highest expected QALY are those who are classified as “overweight”, with BMI between 25 and 30. Those of “normal weight” (20 to 25) fare slightly worse, followed by those classified as “obese class I” (30 to 35)—but we don’t actually see large effects until either “underweight” (18.5-20) or “obese class II” (35 to 40). And the really severe drops in life and health expectancy don’t happen until “obese class III” (>40); and we see the same severe drops at “very underweight” (<18.5).

With that in mind, consider that the global average BMI increased from 21.7 in men and 21.4 in women in 1975 to 24.2 in men and 24.4 in women in 2014. That is, the world average increased from the low end of “normal weight”, which is actually too light, to the high end of “normal weight”, which is probably optimal. The global prevalence of “morbid obesity”, the kind that actually has severely detrimental effects on health, is only 0.64% in men and 1.6% in women. Even including “severe obesity”, the kind that has a noticeable but not dramatic effect on health, the prevalence is only 2.3% in men and 5.0% in women. That’s your epidemic? Reporting often says things like “2/3 of American adults are overweight or obese”; but all that “overweight” proportion should be utterly disregarded, since it is beneficial to health. The actual prevalence of obesity in the US—even including class I obesity which is not very harmful—is less than 40%.
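As a rough illustration, the category bins quoted above can be written as a small classifier and applied to the global averages. How the exact boundary values (20, 25, 30, and so on) are assigned is my own assumption; the text gives only the ranges.

```python
def bmi_category(bmi: float) -> str:
    """Map a BMI value to the category labels used in this post.

    Boundary handling (which bin an exact 25.0 falls in, etc.) is an
    assumption made for this sketch.
    """
    if bmi < 18.5:
        return "very underweight"
    if bmi < 20:
        return "underweight"
    if bmi < 25:
        return "normal weight"
    if bmi < 30:
        return "overweight"
    if bmi < 35:
        return "obese class I"
    if bmi <= 40:
        return "obese class II"
    return "obese class III"


# Global average BMI figures quoted in the text.
global_averages = {
    ("men", 1975): 21.7,
    ("women", 1975): 21.4,
    ("men", 2014): 24.2,
    ("women", 2014): 24.4,
}
labels = {key: bmi_category(value) for key, value in global_averages.items()}
for key, label in labels.items():
    print(key, label)
```

All four averages land in the same bin, which is the point: the world average moved within “normal weight”, not out of it.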

If obesity were the health crisis it is made out to be, we should expect that global life expectancy is decreasing, or at the very least not increasing. On the contrary, it is rapidly increasing: In 1955, global life expectancy was only 55 years, while it is now over 70.

Worldwide, the countries with the highest obesity rates are those with the longest life expectancy, because both of these things are strongly correlated with high levels of economic development. But it may not just be that: Smoking reduces obesity while also reducing lifespan, and a lot of those countries with very high obesity (including the US) have very low rates of smoking.

There’s some evidence that within the set of rich, highly-developed countries, obesity rates are negatively correlated with life expectancy, but these effects are much smaller than the effects of high development itself. Going from the highest obesity in the world (the US, of course) to the lowest among all highly-developed countries (Japan) requires reducing the obesity rate by 34 percentage points but only increases life expectancy by about 5 years. You’d get the same increase by raising overall economic development from the level of Turkey to the level of Greece, about 10 points on the 100-point HDI scale.

Now, am I saying that we should all be 400 pounds? No, there does come a point where excess weight is clearly detrimental to health. But this threshold is considerably higher than you have probably been led to believe. If you are 15 or 20 pounds “overweight” by what our society (or even your doctor!) tells you, you are probably actually at the optimal weight for your body type. If you are 30 or 40 pounds “overweight”, you may want to try to lose some weight, but don’t make yourself suffer to achieve it. Only if you are 50 pounds or more “overweight” should you really be considering drastic action. If you do try to lose weight, be realistic about your goal: Losing 5% to 10% of your initial weight is a roaring success.

There are also reasons to be particularly concerned about obesity and lack of exercise in children, which is why Michelle Obama’s “Let’s Move!” campaign was a good thing.

And yes, exercise more! Don’t do it to try to lose weight (exercise does not actually cause much weight loss). Just do it. Exercise has so many health benefits it’s honestly kind of ridiculous.

But why am I complaining about this, anyway? Even if we cause some people to worry more about eating less than is strictly necessary, what’s the harm in that? At least we’re getting people to exercise, and Thanksgiving was already ruined by politics anyway.

Well, here’s the thing: I don’t think this obesity panic is actually making us any less obese.

The United States is the most obese country in the world—and you can’t so much as call up Facebook or step into a subway car in the US without someone telling you that you’re too fat and you need to lose weight. The people who really are obese and may need medical help losing weight are the ones most likely to be publicly shamed and harassed for their weight—and there’s no evidence that this actually does anything to reduce their weight. People who experience shaming and harassment for their weight are actually less likely to achieve sustained weight loss.

Teenagers—both boys and girls—who are perceived to be “overweight” are at substantially elevated risk of depression and suicide. People who more fully internalize feelings of shame about their weight have higher blood pressure and higher triglycerides, though once you control for other factors the effect is not huge. There’s even evidence that fat shaming by medical professionals leads to worse treatment outcomes among obese patients.

If we want to actually reduce obesity—and this makes sense, at least for the upper-tail obesity of BMI above 35—then we should be looking at what sort of interventions are actually effective at doing that. Medicine has an important role to play of course, but I actually think economics might be stronger here (though I suppose I would, wouldn’t I?).

Number 1: Stop subsidizing meat and feed grains. There is now quite clear evidence that direct and indirect government subsidies for meat production are a contributing factor in our high fat consumption and thus high obesity rate, though obviously other factors matter too. If you’re worried about farmers, subsidize vegetables instead, or pay for active labor market programs that will train those farmers to work in new industries. This thing we do where we try to save the job instead of the worker is fundamentally idiotic and destructive. Jobs are supposed to be destroyed; that’s what technological improvement is. If you stop destroying jobs, you will stop economic growth.

Number 2: Restrict advertising of high-sugar, high-fat foods, especially to children. Food advertising is particularly effective, because it draws on such primal impulses, and children are particularly vulnerable (as the APA has publicly reported on, including specifically for food advertising). Corporations like McDonald’s and Kellogg’s know quite well what they’re doing when they advertise high-fat, high-sugar foods to kids and get them into the habit of eating them early.

Number 3: Find policies to promote exercise. Despite its small effects on weight loss, exercise has enormous effects on health. Indeed, the fact that people who successfully lose weight show long-term benefits even if they put the weight back on suggests to me that really what they gained was a habit of exercise. We need to find ways to integrate exercise into our daily lives more. The one big thing that our ancestors did do better than we do is constantly exercise—be it hunting, gathering, or farming. Standing desks and treadmill desks may seem weird, but there is evidence that they actually improve health. Right now they are quite expensive, so most people don’t buy them. If we subsidized them, they would be cheaper; if they were cheaper, more people would buy them; if more people bought them, they would seem less weird. Eventually, it could become normative to walk on a treadmill while you work and sitting might seem weird. Even a quite large subsidy could be worthwhile: say we had to spend $500 per person per year to buy every single adult a treadmill desk each year. That comes to about $80 billion per year, which is less than one fourth what we’re currently spending on diabetes or heart disease, so we’d break even if we simply managed to reduce those two conditions by 13%. Add in all the other benefits for depression, chronic pain, sleep, sexual function, and so on, and the quality of life improvement could be quite substantial.
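The subsidy arithmetic above can be unpacked with a few lines of code. The sketch uses only the three figures quoted in the paragraph ($500 per person, $80 billion per year, a 13% break-even reduction) and backs out what they jointly imply. The derived quantities are implications of those figures, not independent data.

```python
SUBSIDY_PER_PERSON = 500.0        # dollars per person per year (from the text)
TOTAL_COST = 80e9                 # dollars per year (from the text)
BREAK_EVEN_REDUCTION = 0.13       # share of disease spending avoided (from the text)

# Number of recipients implied by the per-person and total figures.
implied_recipients = TOTAL_COST / SUBSIDY_PER_PERSON

# Combined diabetes plus heart disease spending implied by the break-even share.
implied_combined_spending = TOTAL_COST / BREAK_EVEN_REDUCTION

print(f"Implied recipients: {implied_recipients / 1e6:.0f} million")
print(f"Implied combined spending: ${implied_combined_spending / 1e9:.0f} billion")
```

Changing any one of the three inputs shows how sensitive the break-even point is; for example, a larger recipient pool at the same per-person cost raises the reduction needed to break even proportionally.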

How do we get rid of gerrymandering?

Nov 18 JDN 2458441

I don’t mean in a technical sense; there is a large literature in political science on better voting mechanisms, and this is basically a solved problem. Proportional representation, algorithmic redistricting, or (my personal favorite) reweighted range voting would eradicate gerrymandering forever.

No, I mean strategically and politically—how do we actually make this happen?

Let’s set aside the Senate. (No, really. Set it aside. Get rid of it. “Take my wife… please.”) The Senate should not exist. It is fundamentally anathema to the most basic principle of democracy, “one person, one vote”; and even its most ardent supporters at the time admitted it had absolutely no principled justification for existing. Smaller states are wildly overrepresented (Wyoming, 580,000 people, gets the same number of Senators as California, 39 million), and non-states are not represented (DC has more people than Wyoming, and Puerto Rico has more people than Iowa). The “Senate popular vote” thus doesn’t really make sense as a concept. But this is not “gerrymandering”, as there is no redistricting process that can be used strategically to tilt voting results in favor of one party or another.

It is in the House of Representatives that gerrymandering is a problem.

North Carolina is a particularly extreme example. Republicans won 50.3% of the popular vote in this year’s House election; North Carolina has 13 seats; so, any reasonable person would think that the Republicans should get 7 of the 13 seats. Under algorithmic redistricting, they would have received 8 of 13 seats. Under proportional representation, they would have received, you guessed it, exactly 7. And under reweighted range voting? Well, that depends on how much people like each party. Assuming that Democrats and Republicans are about equally strong in their preferences, we would also expect the Republicans to win about 7. They in fact received 10 of 13 seats.
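For the proportional figure, here is a minimal sketch using the largest-remainder method, which is one common way to turn vote shares into whole seats (the post does not say which apportionment rule it has in mind). The 49.7% Democratic share is simply the residual of the quoted 50.3% and ignores third parties, which is an assumption.

```python
from math import floor


def largest_remainder(vote_shares: dict[str, float], seats: int) -> dict[str, int]:
    """Allocate seats in proportion to vote shares (largest-remainder method)."""
    quotas = {party: share * seats for party, share in vote_shares.items()}
    # Each party first receives the whole-number part of its quota.
    allocation = {party: floor(quota) for party, quota in quotas.items()}
    leftover = seats - sum(allocation.values())
    # Remaining seats go to the parties with the largest fractional parts.
    by_remainder = sorted(
        quotas, key=lambda party: quotas[party] - allocation[party], reverse=True
    )
    for party in by_remainder[:leftover]:
        allocation[party] += 1
    return allocation


# Republican share is from the text; Democratic share is the assumed residual.
nc = largest_remainder({"Republican": 0.503, "Democratic": 0.497}, seats=13)
print(nc)
```

With these inputs the Republican quota is about 6.54 seats and the Democratic quota about 6.46, so the single leftover seat goes to the Republicans.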

Indeed, as FiveThirtyEight found, this is almost the best the Republicans could possibly have done, if they had applied the optimal gerrymandering configuration. There are a couple of districts on the real map that occasionally swing which wouldn’t under the truly optimal gerrymandering; but none of these would flip Democrat more than 20% of the time.

Most states are not as gerrymandered as North Carolina. But there is a pattern you’ll notice among the highly-gerrymandered states.

Alabama is close to optimally gerrymandered for Republicans.

Arkansas is close to optimally gerrymandered for Republicans.

Idaho is close to optimally gerrymandered for Republicans.

Mississippi is close to optimally gerrymandered for Republicans.

As discussed, North Carolina is close to optimally gerrymandered for Republicans.

South Carolina is close to optimally gerrymandered for Republicans.

Texas is close to optimally gerrymandered for Republicans.

Wisconsin is close to optimally gerrymandered for Republicans.

Tennessee is close to optimally gerrymandered for Democrats.

Arizona is close to algorithmic redistricting.

California is close to algorithmic redistricting.

Connecticut is close to algorithmic redistricting.

Michigan is close to algorithmic redistricting.

Missouri is close to algorithmic redistricting.

Ohio is close to algorithmic redistricting.

Oregon is close to algorithmic redistricting.

Illinois is close to algorithmic redistricting, with some bias toward Democrats.

Kentucky is close to algorithmic redistricting, with some bias toward Democrats.

Louisiana is close to algorithmic redistricting, with some bias toward Democrats.

Maryland is close to algorithmic redistricting, with some bias toward Democrats.

Minnesota is close to algorithmic redistricting, with some bias toward Republicans.

New Jersey is close to algorithmic redistricting, with some bias toward Republicans.

Pennsylvania is close to algorithmic redistricting, with some bias toward Republicans.

Colorado is close to proportional representation.

Florida is close to proportional representation.

Iowa is close to proportional representation.

Maine is close to proportional representation.

Nebraska is close to proportional representation.

Nevada is close to proportional representation.

New Hampshire is close to proportional representation.

New Mexico is close to proportional representation.

Washington is close to proportional representation.

Georgia is somewhere between proportional representation and algorithmic redistricting.

Indiana is somewhere between proportional representation and algorithmic redistricting.

New York is somewhere between proportional representation and algorithmic redistricting.

Virginia is somewhere between proportional representation and algorithmic redistricting.

Hawaii is so overwhelmingly Democrat it’s impossible to gerrymander.

Rhode Island is so overwhelmingly Democrat it’s impossible to gerrymander.

Kansas is so overwhelmingly Republican it’s impossible to gerrymander.

Oklahoma is so overwhelmingly Republican it’s impossible to gerrymander.

Utah is so overwhelmingly Republican it’s impossible to gerrymander.

West Virginia is so overwhelmingly Republican it’s impossible to gerrymander.

You may have noticed the pattern. Most states are either close to algorithmic redistricting (14), close to proportional representation (9), or somewhere in between those (4). Of these, 4 are slightly biased toward Democrats and 3 are slightly biased toward Republicans.

6 states are so partisan that gerrymandering isn’t really possible there.

6 states are missing from the FiveThirtyEight analysis; I think they couldn’t get good data on them.

Of the remaining 9 states, 1 is strongly gerrymandered toward Democrats (gaining a whopping 1 seat, by the way), and 8 are strongly gerrymandered toward Republicans.

If we look at the nation as a whole, switching from the current system to proportional representation would increase the number of Democrat seats from 168 to 174 (+6), decrease the number of Republican seats from 195 to 179 (-16), and increase the number of competitive seats from 72 to 82 (+10).

Going to algorithmic redistricting instead would reduce the number of Democrat seats from 168 to 151 (-17), decrease the number of Republican seats from 195 to 180 (-15), and increase the number of competitive seats from 72 to a whopping 104 (+32).
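
To make the bookkeeping easy to check, here is a minimal Python sketch that tabulates the seat counts quoted above and computes each change relative to the current map. The dictionary layout and the function name are my own illustration, not anything from the FiveThirtyEight analysis.

```python
# Seat counts as quoted in the text: (Democratic, Republican, competitive).
HOUSE_SIZE = 435

seats = {
    "current": (168, 195, 72),
    "proportional": (174, 179, 82),
    "algorithmic": (151, 180, 104),
}

def change_from_current(system):
    """Return the (Dem, Rep, competitive) seat changes relative to the current map."""
    base = seats["current"]
    return tuple(new - old for new, old in zip(seats[system], base))

for name, counts in seats.items():
    # Every scenario should allocate exactly the full House.
    assert sum(counts) == HOUSE_SIZE
    print(name, counts, change_from_current(name))
```

All three scenarios add up to the full 435 seats, so the differences are just seats moving between the three categories.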

Proportional representation minimizes wasted votes and best represents public opinion (with the possible exception of reweighted range voting, which we can’t really forecast because it uses more expressive information than what polls currently provide). It is thus to be preferred. Relative to the current system, proportional representation would decrease the representation of Republicans relative to Democrats by 22 seats (6 gained by Democrats plus 16 lost by Republicans), over 5% of the entire House.

Thus, let us not speak of gerrymandering as a “both sides” sort of problem. There is a very clear pattern here: Gerrymandering systematically favors Republicans.

Yet this does not answer the question I posed: How do we actually fix this?

The answer is going to sound a bit paradoxical: We must motivate voters to vote more so that voters will be better represented.

I have an acquaintance who has complained about this apparently paradoxical assertion: How can we vote to make our votes matter? (He advocates using violence instead.)

But the key thing to understand here is that it isn’t that our votes don’t matter at all—it is merely that they don’t matter enough.

If we were living in an authoritarian regime with sham elections (as some far-left people I’ve spoken to actually seem to believe), then indeed voting would be pointless. You couldn’t vote out Saddam Hussein or Benito Mussolini, even though they both did hold “elections” to make you think you had some voice. At that point, yes, obviously the only remaining choices are revolution or foreign invasion. (It does seem worth noting that both regimes fell by the latter, not the former.)

The US has not fallen that far just yet.

Votes in the US do not count evenly—but they do still count.

We have to work harder than our opponents for the same level of success, but we can still succeed.

Our legs may be shackled to weights, but they are not yet chained to posts in the ground.

Indeed, several states in this very election passed referenda to create independent redistricting commissions, and Democrats have gained at least 32 seats in the House—“at least” because some states are still counting mail-in ballots or undergoing recounts.

The one that has me on the edge of my seat is right here in Orange County, where several outlets (including the New York Times) have made preliminary projections in favor of Mimi Walters (R) but Nate Silver is forecasting a higher probability for Katie Porter (D). It says “100% of precincts reporting”, but there are still as many ballots uncounted as there are counted, because California now has almost twice as many voters who vote by mail as in person.

Unfortunately, some of the states that are most highly gerrymandered don’t allow citizen-sponsored ballot initiatives (North Carolina, for instance). This is likely no coincidence. But this still doesn’t make us powerless. If your state is highly gerrymandered, make noise about it. Join or even organize protests. Write letters to legislators. Post on social media. Create memes.

Even most Republican voters don’t believe in gerrymandering. They want to win fair and square. Even if you can’t get them to vote for the candidates you want, reach out to them to get them to complain to their legislators about the injustice of the gerrymandering itself. Appeal to their patriotic values; election manipulation is clearly not what America stands for.

If your state is not highly gerrymandered, think bigger. We should be pushing for a Constitutional amendment implementing either proportional representation or algorithmic redistricting. The majority of states already have reasonably fair districts; if we can get 2/3 of the House and 2/3 of the Senate to agree on such an amendment, and three-quarters of the states to ratify it, we don’t need to win North Carolina or Mississippi.

The sausage of statistics being made

 

Nov 11 JDN 2458434

“Laws, like sausages, cease to inspire respect in proportion as we know how they are made.”

~ John Godfrey Saxe, not Otto von Bismarck

Statistics are a bit like laws and sausages. There are a lot of things in statistical practice that don’t align with statistical theory. The most obvious example is the fact that many results in statistics are asymptotic: they only strictly apply for infinitely large samples, and in any finite sample they will be some sort of approximation (we often don’t even know how good an approximation).

But the problem runs deeper than this: The whole idea of a p-value was originally supposed to be used to assess one single hypothesis that is the only one you test in your entire study.

That’s frankly a ludicrous expectation: Why would you write a whole paper just to test one parameter?

This is why I don’t actually think this so-called multiple comparisons problem is a problem with researchers doing too many hypothesis tests; I think it’s a problem with statisticians being fundamentally unreasonable about what statistics is useful for. We have to do multiple comparisons, so you should be telling us how to do it correctly.
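
To put a number on why multiple comparisons can’t just be ignored, here is a small Python sketch. It assumes every null hypothesis is true and that the tests are independent, which real studies rarely satisfy, and it uses the Bonferroni correction only as one textbook fix, not as the answer I’m asking statisticians for. The function names are my own.

```python
def familywise_error_rate(num_tests, alpha=0.05):
    """Chance of at least one false positive among independent tests of true nulls."""
    return 1 - (1 - alpha) ** num_tests

def bonferroni_threshold(num_tests, alpha=0.05):
    """Per-test threshold that keeps the family-wise rate at or below alpha."""
    return alpha / num_tests

for m in (1, 5, 20, 100):
    # Columns: number of tests, uncorrected family-wise rate, Bonferroni cutoff.
    print(m, round(familywise_error_rate(m), 3), bonferroni_threshold(m))
```

The uncorrected rate climbs quickly with the number of tests, while the Bonferroni cutoff shrinks just as quickly, which is part of why it is often criticized as too conservative for papers with many comparisons.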

Statisticians have this beautiful pure mathematics that generates all these lovely asymptotic results… and then they stop, as if they were done. But we aren’t dealing with infinite or even “sufficiently large” samples; we need to know what happens when your sample is 100, not when your sample is 10^29. We can’t assume that our variables are independently and identically distributed; we don’t know their distribution, and we’re pretty sure they’re going to be somewhat dependent.
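
One way to see what “finite sample” means in practice is to simulate it. The sketch below is an illustration under assumptions I chose myself (exponentially distributed data, a textbook 1.96-standard-error interval, a fixed random seed); it estimates how often a nominal 95% interval actually contains the true mean at two sample sizes.

```python
import random
import statistics

def coverage(n, trials=2000, seed=0):
    """Fraction of nominal 95% normal-approximation intervals containing the true mean."""
    rng = random.Random(seed)
    true_mean = 1.0  # mean of an exponential distribution with rate 1
    hits = 0
    for _ in range(trials):
        sample = [rng.expovariate(1.0) for _ in range(n)]
        mean = statistics.fmean(sample)
        std_err = statistics.stdev(sample) / n ** 0.5
        # The interval is mean +/- 1.96 standard errors.
        if abs(mean - true_mean) <= 1.96 * std_err:
            hits += 1
    return hits / trials

for n in (10, 100):
    print(n, coverage(n))
```

How far the simulated coverage sits from 95% at each sample size is exactly the kind of information the asymptotic result alone does not give you.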

Even in an experimental context where we can randomly and independently assign some treatments, we can’t do that with lots of variables that are likely to matter, like age, gender, nationality, or field of study. And applied econometricians are in an even tighter bind; they often can’t randomize anything. They have to rely upon “instrumental variables” that they hope are “close enough to randomized” relative to whatever they want to study.

In practice what we tend to do is… fudge it. We use the formal statistical methods, and then we step back and apply a series of informal norms to see if the result actually makes sense to us. This is why almost no psychologists were actually convinced by Daryl Bem’s precognition experiments, despite his standard experimental methodology and perfect p < 0.05 results; he couldn’t pass any of the informal tests, particularly the most basic one of not violating any known fundamental laws of physics. We knew he had somehow cherry-picked the data, even before looking at it; nothing else was possible.

This is actually part of where the “hierarchy of sciences” notion is useful: One of the norms is that you’re not allowed to break the rules of the sciences above you, but you can break the rules of the sciences below you. So psychology has to obey physics, but physics doesn’t have to obey psychology. I think this is also part of why there’s so much enmity between economists and anthropologists; really we should be on the same level, cognizant of each other’s rules, but economists want to be above anthropologists so we can ignore culture, and anthropologists want to be above economists so they can ignore incentives.

Another informal norm is the “robustness check”, in which the researcher runs a dozen different regressions approaching the same basic question from different angles. “What if we control for this? What if we interact those two variables? What if we use a different instrument?” In terms of statistical theory, this doesn’t actually make a lot of sense; the probability distributions f(y|x) of y conditional on x and f(y|x, z) of y conditional on x and z are not the same thing, and wouldn’t in general be closely tied, depending on the distribution f(x|z) of x conditional on z. But in practice, most real-world phenomena are going to continue to show up even as you run a bunch of different regressions, and so we can be more confident that something is a real phenomenon insofar as that happens. If an effect drops out when you switch out a couple of control variables, it may have been a statistical artifact. But if it keeps appearing no matter what you do to try to make it go away, then it’s probably a real thing.

Because of the powerful career incentives toward publication and the strange obsession among journals with a p-value less than 0.05, another norm has emerged: Don’t actually trust p-values that are close to 0.05. The vast majority of the time, a p-value of 0.047 was the result of publication bias. Now if you see a p-value of 0.001, maybe then you can trust it—but you’re still relying on a lot of assumptions even then. I’ve seen some researchers argue that because of this, we should tighten our standards for publication to something like p < 0.01, but that’s missing the point; what we need to do is stop publishing based on p-values. If you tighten the threshold, you’re just going to get more rejected papers and then the few papers that do get published will now have even smaller p-values that are still utterly meaningless.

These informal norms protect us from the worst outcomes of bad research. But they are almost certainly not optimal. It’s all very vague and informal, and different researchers will often disagree vehemently over whether a given interpretation is valid. What we need are formal methods for solving these problems, so that we can have the objectivity and replicability that formal methods provide. Right now, our existing formal tools simply are not up to that task.

There are some things we may never be able to formalize: If we had a formal algorithm for coming up with good ideas, the AIs would already rule the world, and this would be either Terminator or The Culture depending on whether we designed the AIs correctly. But I think we should at least be able to formalize the basic question of “Is this statement likely to be true?” that is the fundamental motivation behind statistical hypothesis testing.

I think the answer is likely to be in a broad sense Bayesian, but Bayesians still have a lot of work left to do in order to give us really flexible, reliable statistical methods we can actually apply to the messy world of real data. In particular, tell us how to choose priors please! Prior selection is a fundamental make-or-break problem in Bayesian inference that has nonetheless been greatly neglected by most Bayesian statisticians. So, what do we do? We fall back on informal norms: Try maximum likelihood, which is like using a very flat prior. Try a normally-distributed prior. See if you can construct a prior from past data. If all those give the same thing, that’s a “robustness check” (see previous informal norm).
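
Here is a minimal sketch of that informal prior “robustness check” for the simplest case I can think of, estimating a success rate from binomial data with a Beta prior. The data (7 successes in 10 trials) and both priors are hypothetical, and the function name is mine.

```python
def map_estimate(successes, trials, prior_a, prior_b):
    """Posterior mode of a success rate under a Beta(prior_a, prior_b) prior.

    Valid when both posterior shape parameters exceed 1.
    """
    post_a = prior_a + successes
    post_b = prior_b + trials - successes
    return (post_a - 1) / (post_a + post_b - 2)

# Hypothetical data, chosen only for illustration.
successes, trials = 7, 10

estimates = {
    "maximum likelihood": successes / trials,
    "flat Beta(1, 1) prior": map_estimate(successes, trials, 1, 1),
    "skeptical Beta(10, 10) prior": map_estimate(successes, trials, 10, 10),
}
for label, value in estimates.items():
    print(f"{label}: {value:.3f}")

# A large spread means the conclusion depends heavily on the prior.
spread = max(estimates.values()) - min(estimates.values())
print(f"spread across priors: {spread:.3f}")
```

With a flat prior the posterior mode coincides with the maximum likelihood estimate, which is the sense in which maximum likelihood is “like using a very flat prior”. How big a spread counts as “the same thing” is still an informal judgment, which is rather the point.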

Informal norms are also inherently harder to teach and learn. I’ve seen a lot of other grad students flail wildly at statistics, not because they don’t know what a p-value means (though maybe that’s also sometimes true), but because they don’t really quite grok the informal underpinnings of good statistical inference. This can be very hard to explain to someone: They feel like they followed all the rules correctly, but you are saying their results are wrong, and now you can’t explain why.

In fact, some of the informal norms that are in wide use are clearly detrimental. In economics, norms have emerged that certain types of models are better simply because they are “more standard”, such as the dynamic stochastic general equilibrium models that can basically be fit to everything and have never actually usefully predicted anything. In fact, the best ones just predict what we already knew from Keynesian models. But without a formal norm for testing the validity of models, it’s been “DSGE or GTFO”. At present, it is considered “nonstandard” (read: “bad”) not to assume that your agents are either a single unitary “representative agent” or a continuum of infinitely-many agents—modeling the actual fact of finitely-many agents is just not done. Yet it’s hard for me to imagine any formal criterion that wouldn’t at least give you some points for correctly including the fact that there is more than one but less than infinity people in the world (obviously your model could still be bad in other ways).

I don’t know what these new statistical methods would look like. Maybe it’s as simple as formally justifying some of the norms we already use; maybe it’s as complicated as taking a fundamentally new approach to statistical inference. But we have to start somewhere.

How much should we give?

Nov 4 JDN 2458427

How much should we give of ourselves to others?

I’ve previously struggled with this basic question when it comes to donating money; I have written multiple posts on it now, some philosophical, some empirical, and some purely mathematical.

But the question is broader than this: We don’t simply give money. We also give effort. We also give emotion. Above all, we also give time. How much should we be volunteering? How many protest marches should we join? How many Senators should we call?

It’s easy to convince yourself that you aren’t doing enough. You can always point to some hour when you weren’t doing anything particularly important, and think about all the millions of lives that hang in the balance on issues like poverty and climate change, and then feel a wave of guilt for spending that hour watching Netflix or playing video games instead of doing one more march. This, however, is clearly unhealthy: You won’t actually make yourself into a more effective activist, you’ll just destroy yourself psychologically and become no use to anybody.

I previously argued for a sort of Kantian notion that we should commit to giving our fair share, defined as the amount we would have to give if everyone gave that amount. This is quite appealing, and if I can indeed get anyone to donate 1% of their income as a result, I will be quite glad. (If I can get 100 people to do so, that’s better than I could ever have done myself—a good example of highly cost-effective slacktivism.)

Lately I have come to believe that this is probably inadequate. We know that not everyone will take this advice, which means that by construction it won’t be good enough to actually solve global problems.

This means I must make a slightly greater demand: Define your fair share as the amount you would have to give if everyone who is likely to give gave that amount.

Unfortunately, this question is considerably harder. It may not even have a unique answer. The number n of people willing to give an amount x is obviously dependent upon the amount x itself, and we are nowhere close to knowing what that function n(x) looks like.

So let me instead put some mathematical constraints on it, by choosing an elasticity. Instead of an elasticity of demand or elasticity of supply, we could call this an elasticity of contribution.

Presumably the elasticity is negative: The more you ask of people, the fewer people you’ll get to contribute.

Suppose that the elasticity is something like -0.5, where contribution is relatively inelastic. This means that if you increase the amount you ask for by 2%, you’ll only decrease the number of contributors by 1%. In that case, you should be like Peter Singer and ask for everything. At that point, you’re basically counting on Bill Gates to save us, because nobody else is giving anything. The total amount contributed n(x) * x is increasing in x.

On the other hand, suppose that elasticity is something like -2, where contribution is relatively elastic. This means that if you increase the amount you ask for by 2%, you will decrease the number of contributors by 4%. In that case, you should ask for very little. You’re asking everyone in the world to give 1% of their income, as I did earlier. The total amount contributed n(x) * x is now decreasing in x.

But there is also a third option: What if the elasticity is exactly -1, unit elastic? Then if you increase the amount you ask for by 2%, you’ll decrease the number of contributors by 2%. Then it doesn’t matter how much you ask for: The total amount contributed n(x) * x is constant.
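
The three cases can be checked numerically. The sketch below assumes a constant-elasticity form n(x) = scale * x^elasticity, which is my own choice for illustration; the scale and the amounts asked are arbitrary. It uses elasticities of -0.5, -1, and -2, keeping the sign negative throughout.

```python
def total_contributed(x, elasticity, scale=1000.0):
    """Total raised when each donor gives x and n(x) = scale * x ** elasticity."""
    donors = scale * x ** elasticity
    return donors * x

asks = [1.0, 2.0, 4.0, 8.0]
for elasticity in (-0.5, -1.0, -2.0):
    # One row per elasticity: totals raised at each amount asked.
    totals = [round(total_contributed(x, elasticity), 1) for x in asks]
    print(elasticity, totals)
```

Under this functional form the total is scale * x^(1 + elasticity), so it rises with x when the elasticity is above -1, falls when it is below -1, and stays flat at exactly -1.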

Of course, there’s no guarantee that the elasticity is constant over all possible choices of x—indeed, it would be quite surprising if it were. A quite likely scenario is that contribution is inelastic for small amounts, then passes through a regime where it is nearly unit elastic, and finally it becomes elastic as you start asking for really large amounts of money.

The simplest way to model that is to just assume that n(x) is linear in x, something like n = N – k x.

There is a parameter N that sets the maximum number of people who will ever donate, and a parameter k that sets how rapidly the number of contributors drops off as the amount asked for increases.

The first-order condition for maximizing n(x) * x is then quite simple: x = N/(2k)

This actually turns out to be precisely the point at which the elasticity of contribution is -1.

The total amount you can get under that condition is N^2/(4k)
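
As a sanity check on the linear model, here is a short Python sketch with made-up values of N and k (they are purely hypothetical). It confirms that the total at x = N/(2k) equals N^2/(4k), that nearby values of x raise less, and that the point elasticity (dn/dx) * (x/n) is -1 there.

```python
def donors(x, N, k):
    """Number of contributors under the linear model n = N - k x."""
    return N - k * x

def total(x, N, k):
    """Total amount contributed, n(x) * x."""
    return donors(x, N, k) * x

def elasticity(x, N, k):
    """Point elasticity (dn/dx) * (x / n); dn/dx is -k for the linear model."""
    return -k * x / donors(x, N, k)

N, k = 1000.0, 2.0  # hypothetical parameter values
x_star = N / (2 * k)

print("optimal ask:", x_star)
print("total at optimum:", total(x_star, N, k), "vs N^2/(4k):", N ** 2 / (4 * k))
print("elasticity at optimum:", elasticity(x_star, N, k))
```

The same functions also show the regime change described earlier: for x below N/(2k) the elasticity is between 0 and -1 (inelastic), and for x above it the elasticity is below -1 (elastic).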

Of course, I have no idea what N and k are in real life, so this isn’t terribly helpful. But what I really want to know is whether we should be asking for more money from each person, or asking for less money and trying to get more people on board.

In real life we can sometimes do both: Ask each person to give more than they are presently giving, whatever they are presently giving. (Just be sure to run your slogans by a diverse committee, so you don’t end up with “I’ve upped my standards. Now, up yours!”) But since we’re trying to find a benchmark level to demand of ourselves, let’s ignore that for now.

About 25% of American adults volunteer some of their time, averaging 140 hours of volunteer work per year. This is about 1.6% of all the hours in a year, or 2.4% of all waking hours. Total monetary contributions in the US reached $400 billion for the first time this year; this is about 2.0% of GDP. So the balance between volunteer hours and donations is actually pretty even. It would probably be better to tilt it a bit more toward donations, but it’s really not bad. About 60% of US households made some sort of charitable contribution, though only half of these received the charitable tax deduction.
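
The percentages above follow from simple arithmetic, sketched here in Python. The 16 waking hours per day is my assumption (8 hours of sleep), and the GDP figure is only what the $400 billion and 2.0% numbers imply, not an official statistic.

```python
HOURS_PER_YEAR = 365 * 24
WAKING_HOURS_PER_DAY = 16  # assumption: 8 hours of sleep per night

volunteer_hours = 140  # average annual hours among those who volunteer
share_of_all_hours = volunteer_hours / HOURS_PER_YEAR
share_of_waking_hours = volunteer_hours / (365 * WAKING_HOURS_PER_DAY)

donations = 400e9             # total US monetary contributions, in dollars
donation_share_of_gdp = 0.02  # 2.0% of GDP
implied_gdp = donations / donation_share_of_gdp

print(f"share of all hours: {share_of_all_hours:.1%}")
print(f"share of waking hours: {share_of_waking_hours:.1%}")
print(f"implied GDP: ${implied_gdp / 1e12:.0f} trillion")
```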

This suggests to me that the quantity of people who give is probably about as high as it’s going to get—and therefore we need to start talking more about the amount of money. We may be in the inelastic regime, where the way to increase total contributions is to demand more from each individual.

Our goal is to increase the total contribution to poverty eradication by about 1% of GDP in both the US and Europe. So if 60% of people give, and currently total contributions are about 2.0% of GDP, this means that the average contribution is about 3.3% of the contributor’s gross income. Therefore I should tell them to donate 4.3%, right? Not quite; some of them might drop out entirely, and the rest will have to give more to compensate.

Without knowing the exact form of the function n(x), I can’t say precisely what the optimal value is. But it is most likely somewhat larger than 4.3%; 5% would be a nice round number in the right general range. This would raise contributions in the US to 2.6% of GDP, or about $500 billion. That’s an increase of 25 to 30% over the current level, which is large, but feasible.
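
For anyone who wants to follow the arithmetic, here is a rough Python sketch. It treats contributors as having average income, so that a share of contributors’ income translates directly into a share of GDP; that is a simplifying assumption. The implied participation rate it prints is my own inference from the figures above, not a number from any data source.

```python
current_total_share = 0.020  # donations as a share of GDP
giving_rate = 0.60           # share of households that give

# Average gift among those who give, as a share of their income.
average_gift = current_total_share / giving_rate

def total_share(gift, rate):
    """Total donations as a share of GDP if a fraction `rate` gives `gift`."""
    return gift * rate

target_gift = 0.05
target_total_share = 0.026

# Participation rate that would make a 5% gift yield 2.6% of GDP.
implied_giving_rate = target_total_share / target_gift

print(f"average gift today: {average_gift:.1%}")
print(f"total if nobody drops out at 5%: {total_share(target_gift, giving_rate):.1%}")
print(f"participation implied by the 2.6% figure: {implied_giving_rate:.0%}")
```

In other words, the 2.6% figure leaves room for some current givers to drop out when asked for 5%; if none did, the total would be higher.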

Accomplishing a similar increase in Europe would then give us a total of $200 billion per year in additional funds to fight global poverty; this might not quite be enough to end world hunger (depending on which estimate you use), but it would definitely have a large impact.

I asked you before to give 1%. I am afraid I must now ask for more. Set a target of 5%. You don’t have to reach it this year; you can gradually increase your donations each year for several years (I call this “Save More Lives Tomorrow”, after Thaler’s highly successful program “Save More Tomorrow”). This is in some sense more than your fair share; I’m relying on the assumption that half the population won’t actually give anything. But ultimately this isn’t about what’s fair to us. It’s about solving global problems.

Halloween is kind of a weird holiday.

Oct 28 JDN 2458420

I suppose most holidays are weird if you look at them from an outside perspective; but I think Halloween especially so, because we don’t even seem to be clear about what we’re celebrating at this point.

Christmas is ostensibly about the anniversary of the birth of Jesus; New Year’s is about the completion of the year; Thanksgiving is about the founding of the United States and being thankful for what we have; Independence Day is about declaring independence from Great Britain.

But what’s Halloween about, again? Why do we have our children dress up in costumes and go beg candy from our neighbors?

The name comes originally from “All Hallows’ Eve”, the beginning of the three-day Christian holiday Allhallowtide of remembrance for the dead, which has merged in most Latin American countries with the traditional holiday Día de los Muertos. But most Americans don’t actually celebrate the rest of Allhallowtide; we just do the candy and costume thing on Halloween.

The parts involving costumes and pumpkins actually seem to be drawn from Celtic folk traditions celebrating the ending of harvest season and the coming of the winter months. It’s celebrated so early because, well, in Ireland and Scotland it gets dark and cold pretty early in the year.

One tradition I sort of wish we’d kept from the Celtic festival is that of pouring molten lead into water to watch it rapidly solidify. Those guys really knew how to have a good time. It may have originated as a form of molybdomancy, which I officially declare the word of the day. Fortunately by the power of YouTube, we too can enjoy the excitement of molten lead without the usual fear of third-degree burns. The only divination ritual that we kept as a Halloween activity is the far tamer apple-bobbing.

The trick-or-treating part and especially the costume part originated in the Medieval performance art of mumming, which is also related to the modern concept of mime. Basically, these were traveling performance troupes who went around dressed up as mythological figures, did battle silently, and then bowed and passed their hats around for money. It’s like busking, basically.

The costumes were originally religious or mythological figures, then became supernatural creatures more generally, and nowadays the most popular costumes tend to be superheroes. And since apparently we didn’t want people giving out money to our children, we went for candy instead. Yet I’m sure you could write a really convincing economics paper about why candy is way less efficient, making the parents giving, the child receiving, and the parents of the child receiving all less happy than the same amount of money would (and unlike the similar argument against Christmas presents, I’m actually sort of inclined to agree; it’s not a personal gesture, and what in the world do you need with all that candy?).

So apparently we’re celebrating the end of the harvest, and also mourning the dead, and also being mimes, and also emulating pagan divination rituals, but mainly we’re dressed up like superheroes and begging for candy? Like I said, it’s kind of a weird holiday.

But maybe none of that ultimately matters. The joy of holidays isn’t really in following some ancient ritual whose religious significance is now lost on us; it’s in the togetherness we feel when we manage to all coordinate our activities and do something joyful and out of the ordinary that we don’t have to do by ourselves. I think deep down we all sort of wish we could dress up as superheroes more of the time, but society frowns upon that sort of behavior most of the year; this is our one chance to do it, so we’ll take the chance when we get it.