Tax plan possibilities

Mar 26, JDN 2457839

Recently President Trump (that phrase may never quite feel right) began presenting his new tax plan. To be honest, it’s not as ridiculous as I had imagined it might be. I mean, it’s still not very good, but it’s probably better than Reagan’s tax plan his last year in office, and it’s not nearly as absurd as the half-baked plan Trump originally proposed during the campaign.

But it got me thinking about the incredible untapped potential of our tax system—the things we could achieve as a nation, if we were willing to really commit to them and raise taxes accordingly.

A few years back I proposed a progressive tax system based upon logarithmic utility. I now have a catchy name for that tax proposal; I call it the logtax. It depends on two parameters—a poverty level, at which the tax rate goes to zero; and what I like to call a metarate—the fundamental rate that sets all the actual tax rates by the formula.

For the poverty level, I suggest we use the highest two-person household poverty level set by the Department of Health and Human Services. Because of Alaska’s high prices, that’s the Alaska poverty level, and the resulting figure is $20,290, which I’ll round to $20,000.

I would actually prefer to calculate taxes on an individual basis—I see no reason to incentivize particular household arrangements—but as current taxes are calculated on a household basis, I’m going to use that for now.

The metarate can be varied, and in the plans below I will compare different options for the metarate.

I will compare seven different tax plans:

  1. Our existing tax plan, set under the Obama administration
  2. Trump’s proposed tax plan
  3. A flat rate of 30% with a basic income of $12,000, replacing welfare programs and Medicaid
  4. A flat rate of 40% with a basic income of $15,000, replacing welfare programs and Medicaid
  5. A logtax with a metarate of 20%, all spending intact
  6. A logtax with a metarate of 25% and a basic income of $12,000, replacing welfare programs and Medicaid
  7. A logtax with a metarate of 35% and a basic income of $15,000, cutting military spending by 50% and expanding Medicare to the entire population while eliminating Medicare payroll taxes

To do a proper comparison, I need estimates of the income distribution in the United States, in order to properly estimate the revenue from each type of tax. For that I used US Census data for most of the income data, supplementing with the World Top Incomes database for the very highest income brackets. The household data is broken up into brackets of $5,000 and only goes up to $250,000, so it’s a rough approximation to use the average household income for each bracket, but it’s all I’ve got.

The current brackets are 10%, 15%, 25%, 28%, 33%, 35%, and 39.6%. These are actually marginal rates, not average rates, which makes the calculation a lot more complicated. I did it properly though; for example, when you start paying the marginal rate of 28%, your average rate is really only 20.4%.
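To make the marginal-versus-average distinction concrete, here is a minimal sketch of the bracket calculation. The rates are the ones listed above; the dollar thresholds are my assumption (the 2016 single-filer brackets, with no deductions or exemptions), since only the rates are given here. Under that assumption the average rate at the point where the 28% bracket begins comes out to about 20.4%.

```python
# Marginal rates are from the post. The thresholds are an ASSUMPTION
# (2016 single-filer brackets); the post itself lists only the rates.
BRACKETS = [
    (0, 0.10),
    (9_275, 0.15),
    (37_650, 0.25),
    (91_150, 0.28),
    (190_150, 0.33),
    (413_350, 0.35),
    (415_050, 0.396),
]


def income_tax(income):
    """Tax owed when each slice of income is taxed at its own marginal rate."""
    tax = 0.0
    for i, (lower, rate) in enumerate(BRACKETS):
        # The top bracket has no upper limit.
        upper = BRACKETS[i + 1][0] if i + 1 < len(BRACKETS) else float("inf")
        if income > lower:
            tax += (min(income, upper) - lower) * rate
    return tax


def average_rate(income):
    """Total tax divided by total income."""
    return income_tax(income) / income


# Average rate at the income where the 28% marginal rate starts.
print(round(average_rate(91_150), 3))
```

The function names `income_tax` and `average_rate` are purely illustrative.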

Worst of all, I used static scoring—that is, I ignored the Laffer Effect by which increasing taxes changes incentives and can change pre-tax incomes. To really do this analysis properly, one should use dynamic scoring, taking these effects into account—but proper dynamic scoring is an enormous undertaking, and this is a blog post, not my dissertation.

Still, I was able to get pretty close to the true figures. The actual federal budget shows total revenue net of payroll taxes to be $2.397 trillion, whereas I estimated $2.326 trillion; the true deficit is $608 billion and I estimated $682 billion.

Under Trump’s tax plan, almost all rates are cut. He also plans to remove some deductions, but all reports I could find on the plan were vague as to which ones, and with data this coarse it’s very hard to get any good figures on deduction amounts anyway. I also want to give him credit where it’s due: It was a lot easier to calculate the tax rates under Trump’s plan (but still harder than under mine…). But in general what I found was the following:

Almost everyone pays less income tax under Trump’s plan, generally by about 4-5% of their income. The poor benefit less or are slightly harmed; the rich benefit a bit more.

For example, a household in poverty making $12,300 would pay $1,384 currently, but $1,478 under Trump’s plan, losing $94 or 0.8% of their income. An average household making $52,000 would pay $8,768 currently but only $6,238 under Trump’s plan, saving $2,530 or about 4.8% of their income. A household making $152,000 would pay $35,580 currently but only $28,235 under Trump’s plan, saving $7,345 or again about 4.8%. A top 1% household making $781,000 would pay $265,625 currently, but only $230,158 under Trump’s plan, saving $35,467 or about 4.5%. A top 0.1% household making $2,037,000 would pay $762,656 currently, but only $644,350 under Trump’s plan, saving $118,306 or 5.8% of their income. A top 0.01% household making $9,936,000 would pay $3,890,736 currently, but only $3,251,083 under Trump’s plan, saving $639,653 or 6.4% of their income.

Because taxes are cut across the board, Trump’s plan would raise less revenue. My static scoring will exaggerate this effect, but only moderately; my estimate says we would lose over $470 billion in annual revenue, while the true figure might be $300 billion. In any case, Trump will definitely increase the deficit substantially unless he finds a way to cut an awful lot of spending elsewhere—and his pet $54 billion increase to the military isn’t helping in that regard. My estimate of the new deficit under Trump’s plan is $1.155 trillion—definitely not the sort of deficit you should be running during a peacetime economic expansion.

Let’s see what we might have done instead.

If we value simplicity and ease of calculation, it’s hard to beat a flat tax plus basic income. With a flat tax of 30% and a basic income of $12,000 per household, the poor are much better off because of the basic income, while the rich do a little better because of the flat tax, and the middle class feels about the same because the two effects largely cancel. Calculating your tax liability now couldn’t be easier; multiply your income by 3, remove a zero, and that’s what you owe in taxes. And how much do you get in basic income? The same as everyone else, $12,000.
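As a rough sketch of that calculation, the net amount a household pays is the flat tax minus the basic income, and a negative number means the household receives money on net. The function name and the unrounded incomes below are illustrative choices of mine.

```python
def flat_tax_net(income, rate=0.30, basic_income=12_000):
    """Net payment to the government under a flat tax with a basic income.

    A negative result means the household receives money on net.
    """
    return rate * income - basic_income


# Three representative households (poor, middle-class, upper-middle-class).
for income in (12_317, 51_986, 151_940):
    print(income, round(flat_tax_net(income)))
```

Changing `rate` and `basic_income` gives the other flat-tax plan in the list.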

Using the same comparison households: The poor household making $12,300 would now receive $8,305, increasing their income by $9,689 or 78.8% relative to the current system. The middle-class household making $52,000 would pay $3,596, saving $5,172 or 10% of their income. The upper-middle-class household making $152,000 would now pay $33,582, saving only $1,998 or 1.3% of their income. The top 1% household making $781,000 would pay $222,461 net of the basic income, saving $43,164 or 5.5%. The top 0.1% household making $2,037,000 would pay $599,000, saving $163,656 or 8.0%. Finally, the top 0.01% household making $9,936,000 would pay $2,968,757, saving $921,979 or 9.3%.

Thus, like Trump’s plan, the tax cut almost across the board results in less revenue. However, because of the basic income, we can now justify cutting a lot of spending on social welfare programs. I estimated we could reasonably save about $630 billion by cutting Medicaid and other social welfare programs, while still not making poor people worse off because of the basic income. The resulting estimated deficit comes in at $1.085 trillion, which is still too large—but less than what Trump is proposing.

If I raise the flat rate to 40% (just as easy to calculate), I can bring that deficit down, even if I raise the basic income to $15,000 to compensate. The poverty household now receives $10,073, and the other representative households pay $5,794; $45,776; $297,615; $799,666; and $3,959,343 respectively. This means that the poor are again much better off, the middle class are about the same, and the rich are now substantially worse off. But what’s our deficit now? $180 billion, which is about 1% of GDP, the sort of thing you can maintain indefinitely with a strong currency.

Can we do better than this? I think we can, with my logtax.

I confess that the logtax is not quite as easy to calculate as the flat tax. It does require taking exponents, and you can’t do it in your head. But it’s actually still easier than the current system, because there are no brackets to keep track of, no discontinuous shifts in the marginal rate. It is continuously progressive for all incomes, and the same formula can be used for all incomes from zero to infinity.

The simplest plan just replaces the income tax with a logtax of 20%. The poor household now receives $1,254, just from the automatic calculation of the tax; no basic income was added. The middle-class household pays $9,041, slightly more than what they are currently paying. Above that, people start paying more for sure: $50,655; $406,076; $1,228,795; and $7,065,274 respectively.
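The formula itself isn’t written out in this post, so the following is only a sketch under an assumption: that the average tax rate is 1 - (income / poverty level)^(-metarate). With a $20,000 poverty level and a 20% metarate, that form reproduces the household figures just quoted to within a dollar or so, but I have only checked it against this plain 20% plan, not the basic-income variants.

```python
POVERTY_LEVEL = 20_000  # household poverty level used in this post


def logtax(income, metarate=0.20, poverty_level=POVERTY_LEVEL):
    """Tax owed under an ASSUMED logtax form.

    Assumption: average rate = 1 - (income / poverty_level) ** (-metarate).
    The rate is zero at the poverty level, negative below it (the household
    receives money), and rises continuously above it.
    """
    if income <= 0:
        return 0.0
    rate = 1.0 - (income / poverty_level) ** (-metarate)
    return income * rate


for income in (12_317, 51_986, 151_940):
    print(income, round(logtax(income)))
```

There are no brackets anywhere in the function: one expression covers every income, which is the sense in which the logtax is simpler than the current system.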

This system is obviously more progressive, but does it raise sufficient revenue? Why, as a matter of fact it removes the deficit entirely. The model estimates that the budget would now be at a surplus of $110 billion. This is probably too optimistic; under dynamic scoring the distortions are probably going to cut the revenue a little. But it would almost certainly reduce the deficit, and very likely eliminate it altogether, without any changes in spending.

The next logtax plan adds a basic income of $12,000. To cover this, I raised the metarate to 25%. Now the poor household is receiving $11,413, the middle-class household is paying a mere $1,115, and the other households are paying $50,144; $458,140; $1,384,475; and $7,819,932 respectively. That top 0.01% household isn’t going to be happy, as they are now paying 78% of their income where in our current system they would pay only 39%. But their after-tax income is still over $2 million.

How does the budget look now? As with the flat tax plan, we can save about $630 billion by cutting redundant social welfare programs. So we are once again looking at a surplus, this time of about $63 billion. Again, the dynamic scoring might show some deficit, but definitely not a large one.

Finally, what if I raise the basic income to $15,000 and raise the metarate to 35%? The poor household now receives $14,186, while the median household pays $2,383. The richer households of course foot the bill, paying $64,180; $551,031; $1,618,703; and $8,790,124 respectively. Oh no, the top 0.01% household will have to make do with only about $1.1 million; how will they survive!?

This raises enough revenue that it allows me to do some even more exciting things. With a $15,000 basic income, I can eliminate social welfare programs for sure. But then I can also cut military spending, say in half—still leaving us the largest military in the world. I can move funds around to give Medicare to every single American, an additional cost of about twice what we currently pay for Medicare. Then Medicaid doesn’t just get cut; it can be eliminated entirely, folded into Medicare. Assuming that the net effect on total spending is zero, the resulting deficit is estimated at only $168 billion, well within the range of what can be sustained indefinitely.

And really, that’s only the start, once you consider all the savings on healthcare spending: an average of $4,000 per person per year, if switching to single-payer brings us down to the average of other highly-developed countries. This is more than what the majority of the population would be paying in taxes under this plan, meaning that once you include the healthcare benefits, the majority of Americans would on net receive money from the government. Compared to our current system, everyone making under about $80,000 would be better off. That is what we could be doing right now: free healthcare for everyone, a balanced budget (or close enough), and the majority of Americans receiving more from the government than they pay in taxes.

These results are summarized in the table below. (I also added several more rows of representative households, though still not all the brackets I used!) I’ve color-coded who would be paying less in tax in green and who would be paying more in tax in red under each plan, compared to our current system. This color-coding is overly generous to Trump’s plan and the 30% flat tax plan, because it doesn’t account for the increased government deficit (though I did color-code those as well, again relative to the current system). And yet, over 50% of households make less than $51,986, putting the poorest half of Americans in the green zone for every plan except Trump’s. For the last plan, I also color-coded those between $52,000 and $82,000 who would pay additional taxes, but less than they save on healthcare, thus net saving money in blue. Including those folks, we’re benefiting over 69% of Americans.


| Pre-tax income | Current tax system | Trump’s tax plan | Flat 30% tax with $12k basic income | Flat 40% tax with $15k basic income | Logtax 20% | Logtax 25% with $12k basic income | Logtax 35% with $15k basic income, single-payer healthcare |
|---|---|---|---|---|---|---|---|
| $1,080 | $108 | $130 | -$11,676 | -$14,568 | -$856 | -$12,121 | -$15,173 |
| $12,317 | $1,384 | $1,478 | -$8,305 | -$10,073 | -$1,254 | -$11,413 | -$14,186 |
| $22,162 | $2,861 | $2,659 | -$5,351 | -$6,135 | $450 | -$9,224 | -$11,213 |
| $32,058 | $4,345 | $3,847 | -$2,383 | -$2,177 | $2,887 | -$6,256 | -$7,258 |
| $51,986 | $8,768 | $6,238 | $3,596 | $5,794 | $9,041 | $1,115 | $2,383 |
| $77,023 | $15,027 | $9,506 | $11,107 | $15,809 | $18,206 | $11,995 | $16,350 |
| $81,966 | $16,263 | $10,742 | $12,590 | $17,786 | $20,148 | $14,292 | $17,786 |
| $97,161 | $20,242 | $14,540 | $17,148 | $23,864 | $26,334 | $21,594 | $28,516 |
| $101,921 | $21,575 | $15,730 | $18,576 | $25,768 | $30,571 | $23,947 | $31,482 |
| $151,940 | $35,580 | $28,235 | $33,582 | $45,776 | $50,655 | $50,144 | $64,180 |
| $781,538 | $265,625 | $230,158 | $222,461 | $297,615 | $406,076 | $458,140 | $551,031 |
| $2,036,666 | $762,656 | $644,350 | $599,000 | $799,666 | $1,228,795 | $1,384,475 | $1,618,703 |
| $9,935,858 | $3,890,736 | $3,251,083 | $2,968,757 | $3,959,343 | $7,065,274 | $7,819,932 | $8,790,124 |
| Change in federal spending | $0 | $0 | -$630 billion | -$630 billion | $0 | -$630 billion | $0 |
| Estimated federal surplus | -$682 billion | -$1,155 billion | -$822 billion | -$180 billion | $110 billion | $63 billion | -$168 billion |

Information theory proves that multiple-choice is stupid

Mar 19, JDN 2457832

This post is a bit of a departure from my usual topics, but it’s something that has bothered me for a long time, and I think it fits broadly into the scope of uniting economics with the broader realm of human knowledge.

Multiple-choice questions are inherently and objectively poor methods of assessing learning.

Consider the following question, which is adapted from actual tests I have been required to administer and grade as a teaching assistant (that is, the style of question is the same; I’ve changed the details so that it wouldn’t be possible to just memorize the response—though in a moment I’ll get to why all this paranoia about students seeing test questions beforehand would also be defused if we stopped using multiple-choice):

The demand for apples follows the equation Q = 100 – 5 P.
The supply of apples follows the equation Q = 10 P.
If a tax of $2 per apple is imposed, what is the equilibrium price, quantity, tax revenue, consumer surplus, and producer surplus?

A. Price = $5, Quantity = 10, Tax revenue = $50, Consumer Surplus = $360, Producer Surplus = $100

B. Price = $6, Quantity = 20, Tax revenue = $40, Consumer Surplus = $200, Producer Surplus = $300

C. Price = $6, Quantity = 60, Tax revenue = $120, Consumer Surplus = $360, Producer Surplus = $300

D. Price = $5, Quantity = 60, Tax revenue = $120, Consumer Surplus = $280, Producer Surplus = $500

You could try solving this properly, setting supply equal to demand, adjusting for the tax, finding the equilibrium, and calculating the surplus, but don’t bother. If I were tutoring a student in preparing for this test, I’d tell them not to bother. You can get the right answer in only two steps, because of the multiple-choice format.

Step 1: Does tax revenue equal $2 times quantity? We said the tax was $2 per apple.
That rules out A, leaving B, C, and D.

Step 2: Is quantity 10 times price as the supply curve says? For C it is; for B and D it isn’t; guess it must be C then.
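The two checks above amount to a filter over the answer choices. Here is a small sketch of that filter; the dictionary layout and names are just my own illustration of the four choices listed in the question.

```python
# The four answer choices from the question above.
CHOICES = {
    "A": {"price": 5, "quantity": 10, "tax_revenue": 50, "cs": 360, "ps": 100},
    "B": {"price": 6, "quantity": 20, "tax_revenue": 40, "cs": 200, "ps": 300},
    "C": {"price": 6, "quantity": 60, "tax_revenue": 120, "cs": 360, "ps": 300},
    "D": {"price": 5, "quantity": 60, "tax_revenue": 120, "cs": 280, "ps": 500},
}
TAX_PER_APPLE = 2


def passes_both_checks(choice):
    """Step 1: revenue = tax * quantity. Step 2: quantity = 10 * price (supply)."""
    revenue_ok = choice["tax_revenue"] == TAX_PER_APPLE * choice["quantity"]
    supply_ok = choice["quantity"] == 10 * choice["price"]
    return revenue_ok and supply_ok


remaining = [label for label, choice in CHOICES.items() if passes_both_checks(choice)]
print(remaining)  # ['C']
```

Notice that nothing here solves for the equilibrium; it only tests the offered answers for internal consistency.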

Now, to do that, you need to have at least a basic understanding of the economics underlying the question (How is tax revenue calculated? What does the supply curve equation mean?). But there’s an even easier technique you can use that doesn’t even require that; it’s called Answer Splicing.

Here’s how it works: You look for repeated values in the answer choices, and you choose the one that has the most repeated values. Prices $5 and $6 are repeated equally, so that’s not helpful (maybe the test designer planned at least that far). Quantity 60 is repeated, other quantities aren’t, so it’s probably that. Likewise with tax revenue $120. Consumer surplus $360 and Producer Surplus $300 are both repeated, so those are probably it. Oh, look, we’ve selected a unique answer choice C, the correct answer!
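Answer splicing is mechanical enough to write as a few lines of code. This is a sketch of one way to score it: count how often each value appears in each position across the choices, then pick the choice with the most repeated values. The data layout and function name are my own.

```python
from collections import Counter

# Each choice as (price, quantity, tax revenue, consumer surplus, producer surplus).
CHOICES = {
    "A": (5, 10, 50, 360, 100),
    "B": (6, 20, 40, 200, 300),
    "C": (6, 60, 120, 360, 300),
    "D": (5, 60, 120, 280, 500),
}


def splice(choices):
    """Score each choice by how many of its values also appear in another choice."""
    n_fields = len(next(iter(choices.values())))
    # One counter per position: how many choices share each value there.
    counts = [Counter(values[i] for values in choices.values()) for i in range(n_fields)]
    scores = {
        label: sum(counts[i][value] > 1 for i, value in enumerate(values))
        for label, values in choices.items()
    }
    best = max(scores, key=scores.get)
    return best, scores


best, scores = splice(CHOICES)
print(best, scores)  # C {'A': 2, 'B': 2, 'C': 5, 'D': 3}
```

The function never looks at what the numbers mean, which is exactly the problem.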

You could have done answer splicing even if the question were about 18th century German philosophy, or even if the question were written in Arabic or Japanese. In fact you could even do it if it were written in a cipher, as long as the cipher was a consistent substitution cipher.

Could the question have been designed to better avoid answer splicing? Probably. But this is actually quite difficult to do, because there is a fundamental tradeoff between two types of “distractors” (as they are known in the test design industry). You want the answer choices to contain correct pieces and resemble the true answer, so that students who basically understand the question but make a mistake in the process still get it wrong. But you also want the answer choices to be distinct enough in a random enough pattern that answer splicing is unreliable. These two goals are inherently contradictory, and the result will always be a compromise between them. Professional test-designers usually lean pretty heavily against answer-splicing, which I think is probably optimal so far as it goes; but I’ve seen many a professor err too far on the side of similar choices and end up making answer splicing quite effective.

But of course, all of this could be completely avoided if I had just presented the question as an open-ended free-response. Then you’d actually have to write down the equations, show me some algebra solving them, and then interpret your results in a coherent way to answer the question I asked. What’s more, if you made a minor mistake somewhere (carried a minus sign over wrong, forgot to divide by 2 when calculating the area of the consumer surplus triangle), I can take off a few points for that error, rather than all the points just because you didn’t get the right answer. At the other extreme, if you just randomly guess, your odds of getting the right answer are minuscule, but even if you did (or copied from someone else), if you don’t show me the algebra you won’t get credit.

So the free-response question is telling me a lot more about what the student actually knows, in a much more reliable way, that is much harder to cheat or strategize against.

Moreover, this isn’t a matter of opinion. This is a theorem of information theory.

The information that is carried over a message channel can be quantitatively measured as its Shannon entropy. It is usually measured in bits, which you may already be familiar with as a unit of data storage and transmission rate in computers—and yes, those are all fundamentally the same thing. A proper formal treatment of information theory would be way too complicated for this blog, but the basic concepts are fairly straightforward: think in terms of how long a sequence of 1s and 0s it would take to convey the message. That is, roughly speaking, the Shannon entropy of that message.

How many bits are conveyed by a multiple-choice response with four choices? 2. Always. At maximum. No exceptions. It is fundamentally, provably, mathematically impossible to convey more than 2 bits of information via a channel that only has 4 possible states. Any multiple-choice response—any multiple-choice response—of four choices can be reduced to the sequence 00, 01, 10, 11.

True-false questions are a bit worse—literally, they convey 1 bit instead of 2. It’s possible to fully encode the entire response to a true-false question as simply 0 or 1.

For comparison, how many bits can I get from the free-response question? Well, in principle the answer to any mathematical question has the cardinality of the real numbers, which is infinite (in some sense beyond infinite, in fact—more infinite than mere “ordinary” infinity); but in reality you can only write down a small number of possible symbols on a page. I can’t actually write down the infinite diversity of numbers between 3.14159 and the true value of pi; in 10 digits or less, I can only (“only”) write down a few billion of them. So let’s suppose that handwritten text has about the same information density as typing, which in ASCII or Unicode has 8 bits—one byte—per character. If the response to this free-response question is 300 characters (note that this paragraph itself is over 800 characters), then the total number of bits conveyed is about 2400.
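The arithmetic behind these bit counts is short enough to spell out. This is a sketch of the upper bounds only: a response with n equally available options carries at most log2(n) bits, and typed text at 8 bits per character carries at most 8 bits times its length. The function names are illustrative.

```python
import math


def max_bits_from_choices(n_choices):
    """Upper bound on information in one response with n possible states."""
    return math.log2(n_choices)


def max_bits_from_text(n_chars, bits_per_char=8):
    """Upper bound on information in typed text at a fixed bits-per-character."""
    return n_chars * bits_per_char


print(max_bits_from_choices(4))   # 2.0  (four-option multiple choice)
print(max_bits_from_choices(2))   # 1.0  (true-false)
print(max_bits_from_text(300))    # 2400 (300-character free response)
```

These are ceilings, not measurements; real responses are redundant and carry less.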

That is to say, one free-response question conveys twelve hundred times as much information as a multiple-choice question (2400 bits against 2). Of course, a lot of that information is redundant; there are many possible correct ways to write the answer to a problem (if the answer is 1.5 you could say 3/2 or 6/4 or 1.500, etc.), and many problems have multiple valid approaches to them, and it’s often safe to skip certain steps of algebra when they are very basic, and so on. But it’s really not at all unrealistic to say that I am getting between 10 and 100 times as much useful information about a student from reading one free response than I would from one multiple-choice question.

Indeed, it’s actually a bigger difference than it appears, because when evaluating a student’s performance I’m not actually interested in the information density of the message itself; I’m interested in the product of that information density and its correlation with the true latent variable I’m trying to measure, namely the student’s actual understanding of the content. (A sequence of 500 random symbols would have a very high information density, but would be quite useless in evaluating a student!) Free-response questions aren’t just more information, they are also better information, because they are closer to the real-world problems we are training for, harder to cheat, harder to strategize, nearly impossible to guess, and provide detailed feedback about exactly what the student is struggling with (for instance, maybe they could solve the equilibrium just fine, but got hung up on calculating the consumer surplus).

As I alluded to earlier, free-response questions would also remove most of the danger of students seeing your tests beforehand. If they saw it beforehand, learned how to solve it, memorized the steps, and then were able to carry them out on the test… well, that’s actually pretty close to what you were trying to teach them. It would be better for them to learn a whole class of related problems and then be able to solve any problem from that broader class—but the first step in learning to solve a whole class of problems is in fact learning to solve one problem from that class. Just change a few details each year so that the questions aren’t identical, and you will find that any student who tried to “cheat” by seeing last year’s exam would inadvertently be studying properly for this year’s exam. And then perhaps we could stop making students literally sign nondisclosure agreements when they take college entrance exams. Listen to this Orwellian line from the SAT nondisclosure agreement:

Misconduct includes, but is not limited to:

Taking any test questions or essay topics from the testing room, including through memorization, giving them to anyone else, or discussing them with anyone else through any means, including, but not limited to, email, text messages or the Internet

Including through memorization. You are not allowed to memorize SAT questions, because God forbid you actually learn something when we are here to make money off evaluating you.

Multiple-choice tests fail in another way as well; by definition they cannot possibly test generation or recall of knowledge, they can only test recognition. You don’t need to come up with an answer; you know for a fact that the correct answer must be in front of you, and all you need to do is recognize it. Recall and recognition are fundamentally different memory processes, and recall is both more difficult and more important.

Indeed, the real mystery here is why we use multiple-choice exams at all.

There are a few types of very basic questions where multiple-choice is forgivable, because there just aren’t that many possible valid answers. If I ask whether demand for apples has increased, you can pretty much say “it increased”, “it decreased”, “it stayed the same”, or “it’s impossible to determine”. So a multiple-choice format isn’t losing too much in such a case. But most really interesting and meaningful questions aren’t going to work in this format.

I don’t think it’s even particularly controversial among educators that multiple-choice questions are awful. (Though I do recall an “educational training” seminar a few weeks back that was basically an apologia for multiple choice, claiming that it is totally possible to test “higher-order cognitive skills” using multiple-choice, for reals, believe me.) So why do we still keep using them?

Well, the obvious reason is grading time. The one thing multiple-choice does have over a true free response is that it can be graded efficiently and reliably by machines, which really does make a big difference when you have 300 students in a class. But there are a couple reasons why even this isn’t a sufficient argument.

First of all, why do we have classes that big? It’s absurd. At that point you should just email the students video lectures. You’ve already foreclosed any possibility of genuine student-teacher interaction, so why are you bothering with having an actual teacher? It seems to me that universities have tried to work out what is the absolute maximum rent they can extract by structuring a class so that it is just good enough that students won’t revolt against the tuition, but they can still spend as little as possible by hiring only one adjunct or lecturer when they should have been paying 10 professors.

And don’t tell me they can’t afford to spend more on faculty—first of all, supporting faculty is why you exist. If you can’t afford to spend enough providing the primary service that you exist as an institution to provide, then you don’t deserve to exist as an institution. Moreover, they clearly can afford it—they simply prefer to spend on hiring more and more administrators and raising the pay of athletic coaches. PhD comics visualized it quite well; the average pay for administrators is three times that of even tenured faculty, and athletic coaches make ten times as much as faculty. (And here I think the mean is the relevant figure, as the mean income is what can be redistributed. Firing one administrator making $300,000 does actually free up enough to hire three faculty making $100,000 or ten grad students making $30,000.)

But even supposing that the institutional incentives here are just too strong, and we will continue to have ludicrously-huge lecture classes into the foreseeable future, there are still alternatives to multiple-choice testing.

Ironically, the College Board appears to have stumbled upon one themselves! About half the SAT math exam is organized into a format where instead of bubbling in one circle to give your 2 bits of answer, you bubble in numbers and symbols corresponding to a more complicated mathematical answer, such as entering “3/4” as “0”, “3”, “/”, “4” or “1.28” as “1”, “.”, “2”, “8”. This could easily be generalized to things like “e^2” as “e”, “^”, “2” and “sin(3pi/2)” as “sin”, “3”, “pi”, “/”, “2”. There are 12 possible symbols currently allowed by the SAT, and each response is up to 4 characters, so we have already increased our possible responses from 4 to over 20,000, which is to say from 2 bits to 14. If we generalize it to include symbols like “pi” and “e” and “sin”, and allow a few more characters per response, we could easily get it over 20 bits, or 10 times as much information as a multiple-choice question.
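Here is the same counting argument as a quick sketch: a response built from a fixed number of positions, each filled with one of a fixed set of symbols. The 12-symbol, 4-position case is the one described above; the 16-symbol, 6-position case is a hypothetical extension I made up to show how little it takes to pass 20 bits.

```python
import math


def grid_response_bits(n_symbols, n_positions):
    """Count the possible filled-in responses and the bits needed to index them."""
    n_responses = n_symbols ** n_positions
    return n_responses, math.log2(n_responses)


# Current grid-in format as described above: 12 symbols, 4 positions.
print(grid_response_bits(12, 4))   # 20736 responses, a bit over 14 bits

# Hypothetical extended format (an assumption): 16 symbols, 6 positions.
print(grid_response_bits(16, 6))   # 16777216 responses, 24 bits
```

As with multiple choice, this is a ceiling on what one response can convey, not a claim about how much of it is useful.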

But we can do better still! Even if we insist upon automation, high-end text-recognition software (of the sort any university could surely afford) is now getting to the point where it could realistically recognize a properly-formatted algebraic formula, so you’d at least know if the student remembered the formula correctly. Sentences could be transcribed into typed text, checked for grammar, and sorted for keywords—which is not nearly as good as a proper reading by an expert professor, but is still orders of magnitude better than filling circle “C”. Eventually AI will make even more detailed grading possible, though at that point we may have AIs just taking over the whole process of teaching. (Leaving professors entirely for research, presumably. Not sure if this would be good or bad.)

Automation isn’t the only answer either. You could hire more graders and teaching assistants—say one for every 30 or 40 students instead of one for every 100 students. (And then the TAs might actually be able to get to know their students! What a concept!) You could give fewer tests, or shorter ones—because a small, reliable sample is actually better than a large, unreliable one. A bonus there would be reducing students’ feelings of test anxiety. You could give project-based assignments, which would still take a long time to grade, but would also be a lot more interesting and fulfilling for both the students and the graders.

Or, and perhaps this is the most radical answer of all: You could stop worrying so much about evaluating student performance.

I get it, you want to know whether students are doing well, both so that you can improve your teaching and so that you can rank the students and decide who deserves various awards and merits. But do you really need to be constantly evaluating everything that students do? Did it ever occur to you that perhaps that is why so many students suffer from anxiety—because they are literally being formally evaluated with long-term consequences every single day they go to school?

If we eased up on all this evaluation, I think the fear is that students would just detach entirely; all teachers know students who only seem to show up in class because they’re being graded on attendance. But there are a couple of reasons to think that maybe this fear isn’t so well-founded after all.

If you give up on constant evaluation, you can open up opportunities to make your classes a lot more creative and interesting—and even fun. You can make students want to come to class, because they get to engage in creative exploration and collaboration instead of memorizing what you drone on at them for hours on end. Most of the reason we don’t do creative, exploratory activities is simply that we don’t know how to evaluate them reliably—so what if we just stopped worrying about that?

Moreover, are those students who only show up for the grade really getting anything out of it anyway? Maybe it would be better if they didn’t show up; indeed, if they just dropped out of college entirely and did something else with their lives until they get their heads on straight. Maybe all this effort that we are currently expending trying to force students to learn who clearly don’t appreciate the value of learning could instead be spent enriching the students who do appreciate learning and came here to do as much of it as possible. Because, ultimately, you can lead a student to algebra, but you can’t make them think. (Let me be clear, I do not mean students with less innate ability or prior preparation; I mean students who aren’t interested in learning and are only showing up because they feel compelled to. I admire students with less innate ability who nonetheless succeed because they work their butts off, and wish I were quite so motivated myself.)

There’s a downside to that, of course. Compulsory education does actually seem to have significant benefits in making people into better citizens. Maybe if we let those students just leave college, they’d never come back, and they would squander their potential. Maybe we need to force them to show up until something clicks in their brains and they finally realize why we’re doing it. In fact, we’re really not forcing them; they could drop out in most cases and simply don’t, probably because their parents are forcing them. Maybe the signaling problem is too fundamental, and the only way we can get unmotivated students to accept not getting prestigious degrees is by going through this whole process of forcing them to show up for years and evaluating everything they do until we can formally justify ultimately failing them. (Of course, almost by construction, a student who does the absolute bare minimum to pass will pass.)

But college admission is competitive, and I can’t shake this feeling there are thousands of students out there who got rejected from the school they most wanted to go to, the school they were really passionate about and willing to commit their lives to, because some other student got in ahead of them, and that other student is now sitting in the back of the room playing with an iPhone, grumbling about having to show up for class every day. What about that squandered potential? Perhaps competitive admission and compulsory attendance just don’t mix, and we should stop compelling students once they get their high school diploma.

Intellectual Property, revisited

Mar 12, JDN 2457825

A few weeks ago I wrote a post laying out the burden of proof for intellectual property, but didn’t have time to get into the empirical question of whether our existing intellectual property system can meet this burden of proof.

First of all, I want to make a very sharp distinction between three types of regulations that are all called “intellectual property”.

First there are trademarks, which I have absolutely no quarrel with. Avoiding fraud and ensuring transparency are fundamental functions without which markets would unravel, and without trademarks these things would be much harder to accomplish. Trademarks allow a company to establish a brand identity that others cannot usurp; they ensure that when you buy Coca-Cola (R) it is really in fact the beverage you expect and not some counterfeit knockoff. (And if counterfeit Coke sounds silly, note that counterfeit honey and maple syrup are actually a major problem.) Yes, there should be limits on how much you can trademark—no one wants to live in a world where you feel Love ™ and open Screen Doors ™—but in fact our courts are already fairly good about only allowing corporations to trademark newly-coined words and proper names for their products.

Next there are copyrights, which I believe are currently too strong and often abused, but I do think should exist in some form (or perhaps copylefts instead). Authors should have at least certain basic rights over how their work can be used and published. If nothing else, proper attribution should always be required, as without that plagiarism becomes intolerably easy. And steps should be taken to ensure that if any people profit from its sale, the author is among them. I publish this blog under a by-sa copyleft, which essentially means that you can share it with whomever you like and even adapt its content into your own work, so long as you properly attribute it to me and you do not attempt to claim ownership over it. For scientific content, I think only a copyleft of this sort makes sense—the era of for-profit journals with paywalls must end, as it is holding back our civilization. But for artistic content (and I mean art in the broadest sense, including books, music, movies, plays, and video games), stronger regulations might well make sense. The question is whether our current system is actually too strong, or is protecting the wrong people—often it seems to protect the corporations that sell the content rather than the artists who created it.

Finally there are patents. Unlike copyright which applies to a specific work of art, patent is meant to apply to the underlying concept of a technology. Copyright (or rather the by-sa copyleft) protects the text of this article; you can’t post it on your own blog and claim you wrote it. But if I were to patent it somehow (generally, verbal arguments cannot be patented, fortunately), you wouldn’t even be able to paraphrase it. The trademark on a Samsung ™ TV just means that if I make a TV I can’t say I am Samsung, because I’m not. You wouldn’t copyright a TV, but the analogous process would be if I were to copy every single detail of the television and try to sell that precise duplicate. But the patents on that TV mean that if I take it apart, study each component, find a way to build them all from my own raw materials, even make them better, and build a new TV out of them that looks different and performs better—I would still be infringing on intellectual property. Patents grant an extremely strong notion of property rights, one which actually undermines a lot of other, more basic concepts of property. It’s my TV, why can’t I take it apart and copy the components? Well, as long as the patent holds, it’s not entirely my TV. Property rights this strong—that allow a corporation to have its cake of selling the TV but eat it too by owning the rights to all its components—require a much stronger justification.

Trademark protects a name, which is unproblematic. Copyright protects a work, which carries risks but is still probably necessary in many cases. But patent protects an idea—and we should ask ourselves whether that is really something it makes sense to do.

In previous posts I’ve laid out some of the basic philosophical arguments for why patents do not seem to support innovation and may actually undermine it. But in this post I want to do something more direct and quantitative: Empirically, what is the actual effect of copyrights and patents on innovation? Can we find a way to quantify the costs and benefits to our society of different modes of intellectual property?

Economists quantify things all the time, so I briefly combed the literature to see what sort of empirical studies had been done on the economic impact of copyrights and patents.

Patents definitely create barriers to scientific collaboration: Scientific articles with ideas that don’t get patented are about 10-20% more likely to be cited than scientific articles with ideas that are patented. (I would have expected a larger effect, but that’s still not trivial.)

A 1995 study found that increased patent protections do seem to be positively associated with more trade.

A 2009 study of Great Britain published in AER found it “puzzling” that stronger patents actually seem to reduce the rate of innovation domestically, while having no effect on foreign innovation—yet this is exactly what I would have predicted. Foreign innovations should be largely unaffected by UK patents, but stricter patent laws in the UK make it harder for most actual innovators, only benefiting a handful of corporations that aren’t even particularly innovative.

This 1996 study did find a positive effect of stronger patent laws on economic growth, but it was quite small and only statistically significant when using instrumental variables that they couldn’t be bothered to define except in an appendix. When your result hinges on the use of instrumental variables that you haven’t even clearly defined in the paper, something is very fishy. My guess is that they p-hacked the instruments until they got the result they wanted.

This other 1996 study is a great example of why economists need to listen to psychologists. It found a negative correlation between foreign direct investment and (wait for it) the number of companies that answered “yes” to a survey question, “Does country X have intellectual property protection too weak to allow you to transfer your newest or most effective technology to a wholly-owned subsidiary there?” Oh, wow, you found a correlation between foreign direct investment and a question directly asking about foreign direct investment.

This 2004 study found a nonlinear relationship whereby increased economic development affects intellectual property rights, rather than the other way around. But I find their theoretical model quite odd, and the scatter plot that lies at the core of their empirical argument reminds me of Rexthor, the Dog-Bearer. “This relationship appears to be non-linear,” they say when pointing at a scatter plot that looks mostly like nothing and maybe like a monotonic increase.

This 1997 study found a positive correlation between intellectual property strength, R&D spending, and economic growth. The effect is weak, but the study looks basically sound. (Though I must say I’d never heard anyone use the words “significant at the 24% level” before. Normally one would say “nonsignificant” for that variable methinks. It’s okay for it not to be significant in some of your regressions, you know.)

This 1992 paper found that intellectual property harms poor countries and may or may not benefit rich countries, but it uses a really weird idiosyncratic theoretical model to get there. Frankly if I see the word “theorem” anywhere in your empirical paper, I get suspicious. No, it is not a theorem that “For economies in steady state the South loses from tighter intellectual property rights.” It may be true, but it does not follow from the fundamental axioms of mathematics.

This law paper is excellent; it focuses on the fact that intellectual property is a unique arrangement and a significant deviation from conventional property rights. It tracks the rise of legal arguments that erroneously equate intellectual property with real property, and makes the vital point that fully internalizing the positive externalities of technology was never the goal, and would in fact be horrible should it come to pass. We would all have to pay most of our income in royalties to the Newton and Faraday estates. So, I highly recommend reading it. But it doesn’t contain any empirical results on the economic effects of intellectual property.

This is the best paper I was able to find showing empirical effects of different intellectual property regimes; I really have no complaints about its econometrics. But it was limited to post-Soviet economies shortly after the fall of the USSR, which were rather unique circumstances. (Indeed, by studying only those countries, you’d probably conclude that free markets are harmful, because the shock of transition was so great.)

This 1999 paper is also quite good; using a natural experiment from a sudden shift in Japanese patent policy, they found almost no difference in actual R&D. The natural experiment design makes this particularly credible, but it’s difficult to generalize since it only covered Japan specifically.

This study focused in particular on copyrights and the film industry, and found a nonlinear effect: While having no copyright protection at all was harmful to the film industry, making the copyright protections too strong had a strangling effect on new filmmakers entering the industry. This would suggest that the optimal amount of copyright is moderate, which sounds reasonable to me.

This 2009 study did a much more detailed comparison of different copyright regimes, and was unable to find a meaningful pattern amidst the noise. Indeed, they found that the only variable that consistently predicted the number of new works of art was population—more people means more art, and nothing else seemed to matter. If this is correct, it’s quite damning to copyright; it would suggest that people make art for reasons fundamentally orthogonal to copyright, and copyright does almost nothing useful. (And I must say, if you talk to most artists, that tends to be their opinion on the matter!)

This 1996 paper found that stronger patents had no benefits for poor countries, but benefited rich countries quite a large amount: Increased patent protection was estimated to add as much as 0.7% annual GDP growth over the whole period. That’s a lot; if this is really true, stronger patents are almost certainly worth it. But then it becomes difficult to explain why more precise studies haven’t found effects anywhere near that large.

This paper was pretty interesting; they found a fat-tailed distribution of patents, where most firms have none, many have one or a few, and a handful of firms have a huge number of patents. This is also consistent with the distribution of firm revenue and profit, and I’d be surprised if I didn’t find a strong correlation between all three. But this really doesn’t tell us whether patents are contributing to innovation.

This paper found that the harmonization of global patents in the Uruguay Round did lead to gains from trade for most countries, but also transferred about $4.5 billion to the US from the rest of the world. Of course, that’s really not that large an amount when we’re talking about global policy over several years.

What does all that mean? I don’t know. It’s a mess. There just don’t seem to be any really compelling empirical studies on the economic impact of copyrights and patents. The preponderance of the evidence, such as it is, would seem to suggest that copyrights provide a benefit as long as they aren’t too strong, while patents provide a benefit but it is quite small and likely offset by the rent-seeking of the corporations that own them. The few studies that found really large effects (like 0.7% annual GDP growth) don’t seem very credible to me; if the effect were really that large, it shouldn’t be so ambiguous. 0.7% per year over 25 years is a GDP 20% larger. Over 50 years, GDP would be 42% larger. We would be able to see that.
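The compounding arithmetic here is easy to check. Below is a minimal sketch, assuming the 0.7% is simply an extra growth rate compounded once per year; the function name `cumulative_gain` is my own and not from any of the studies.

```python
def cumulative_gain(annual_boost: float, years: int) -> float:
    """Fractional increase in the GDP level after `years` of compounding
    an extra `annual_boost` of growth (assumed to compound annually)."""
    return (1.0 + annual_boost) ** years - 1.0


# Extra 0.7% per year, over the two horizons discussed in the text.
for years in (25, 50):
    gain = cumulative_gain(0.007, years)
    print(f"{years} years: GDP about {gain:.0%} larger")
```

This gives roughly 19% after 25 years (which rounds to the 20% quoted) and about 42% after 50 years.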

Does this ambiguity mean we should do nothing, and wait until the data is better? I don’t think so. Remember, the burden of proof for intellectual property should be high. It’s a fundamentally bizarre notion of property, one which runs against most of our standard concepts of real property; it restricts our rights in very basic ways, making literally the majority of our population into criminals. Such a draconian policy requires a very strong justification, but such a justification does not appear to be forthcoming. If it could be supported, that 0.7% GDP growth might be enough; but it doesn’t seem to be replicable. A free society does not criminalize activities just in case it might be beneficial to do so—it only criminalizes activities that have demonstrable harm. And the harm of copyright and patent infringement simply isn’t demonstrable enough to justify its criminalization.

We don’t have to remove them outright, but we should substantially weaken copyright and patent laws. They should be short-term, they should provide very basic protection, and they should never be owned by corporations, always by individuals (corporations should be able to license them—but not own them). If we then observe a substantial reduction in innovation and economic output, then we can put them back. But I think that what defenders of intellectual property fear most is that if we tried this, it wouldn’t be so bad—and then the “doom and gloom” justification they’ve been relying on all this time would fall apart.

Games as economic simulations—and education tools

Mar 5, JDN 2457818 [Sun]

Moore’s Law is a truly astonishing phenomenon. Now as we are well into the 21st century (I’ve lived more of my life in the 21st century than the 20th now!) it may finally be slowing down a little bit, but it has had quite a run, and even this could be a temporary slowdown due to economic conditions or the lull before a new paradigm (quantum computing?) matures. Since at least 1975, the computing power of an individual processor has doubled approximately every year and a half; that means it has doubled over 25 times, or in other words that it has increased by a factor of over 30 million. I now have in my pocket a smartphone with several thousand times the processing speed of the Apollo guidance computer that landed astronauts on the Moon.
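For anyone who wants to check the doubling arithmetic, here is a minimal sketch. It assumes one doubling every 1.5 years and takes 2017 (the year of this post) as the end point; both function names are hypothetical.

```python
def doublings(start_year: int, end_year: int, period_years: float = 1.5) -> float:
    """Number of doublings between two years at one doubling per `period_years`."""
    return (end_year - start_year) / period_years


def growth_factor(n_doublings: float) -> float:
    """Total multiplicative increase after `n_doublings` doublings."""
    return 2.0 ** n_doublings


# Assumption: 2017 is used as the end year, matching the date of this post.
n = doublings(1975, 2017)
print(f"doublings since 1975: {n:.0f}")
print(f"factor after 25 doublings: {growth_factor(25):,.0f}")
print(f"factor after {n:.0f} doublings: {growth_factor(n):,.0f}")
```

Twenty-five doublings is already a factor of about 33.5 million, so "over 30 million" is, if anything, conservative.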

This meteoric increase in computing power has had an enormous impact on the way science is done, including economics. Simple theoretical models that could be solved by hand are now being replaced by enormous simulation models that have to be processed by computers. It is now commonplace to devise models with systems of dozens of nonlinear equations that are literally impossible to solve analytically, and just solve them iteratively with computer software.
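To give a flavor of what "solving iteratively" means, here is a minimal sketch using fixed-point iteration on a two-equation toy market. The model is entirely hypothetical (the functional forms and constants are made up for illustration and have no closed-form solution), and real simulation models typically use more robust solvers such as Newton's method.

```python
import math


def solve_toy_market(p0: float = 1.0, tol: float = 1e-10, max_iter: int = 1000):
    """Find a price p and quantity q satisfying the hypothetical system
        q = 2 * ln(1 + p)        (supply)
        p = 10 / sqrt(1 + q)     (inverse demand)
    by repeatedly substituting one equation into the other."""
    p = p0
    for _ in range(max_iter):
        q = 2.0 * math.log(1.0 + p)          # quantity supplied at price p
        p_new = 10.0 / math.sqrt(1.0 + q)    # price buyers pay for that quantity
        if abs(p_new - p) < tol:
            # Converged: return a consistent (price, quantity) pair.
            return p_new, 2.0 * math.log(1.0 + p_new)
        p = p_new
    raise RuntimeError("fixed-point iteration did not converge")


price, quantity = solve_toy_market()
print(f"equilibrium price ~ {price:.4f}, quantity ~ {quantity:.4f}")
```

The loop converges here because each pass shrinks the error; for systems where that does not hold, plain substitution can diverge, which is one reason practical software uses other methods.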

But one application of this technology that I believe is currently underutilized is video games.

As a culture, we still have the impression that video games are for children; even games like Dragon Age and Grand Theft Auto that are explicitly for adults (and really quite inappropriate for children!) are viewed as in some sense “childish”—that no serious adult would be involved with such frivolities. The same cultural critics who treat Shakespeare’s vagina jokes as the highest form of art are liable to dismiss the poignant critique of war in Call of Duty: Black Ops or the reflections on cultural diversity in Skyrim as mere puerility.

But video games are an art form with a fundamentally greater potential than any other. Now that graphics are almost photorealistic, there is really nothing you can do in a play or a film that you can’t do in a video game, and there is so, so much more that you can only do in a game.

In what other medium can we witness the spontaneous emergence and costly aftermath of a war? Yet EVE Online has this sort of event every year or so. Just today there was a surprise attack involving hundreds of players that destroyed thousands of hours’ (and dollars’) worth of starships, something that has more or less become an annual tradition. A few years ago there was a massive three-faction war that destroyed over $300,000 in ships and has now been commemorated as “the Bloodbath of B-R5RB”.

Indeed, the immersion and interactivity of games present an opportunity to do nothing less than experimental macroeconomics. For generations it has been impossible, or at least absurdly unethical, to ever experimentally manipulate an entire macroeconomy. But in a video game like EVE Online or Second Life, we can now do so easily, cheaply, and with little or no long-term harm to the participants, and we can literally control everything in the experiment. Forget the natural resource constraints and currency exchange rates; we can change the laws of physics if we want. (Indeed, EVE’s whole trade network is built around FTL jump points, and in Second Life it’s a basic part of the interface that everyone can fly like Superman.)

This provides untold potential for economic research. With sufficient funding, we could build a game that would allow us to directly test hypotheses about the most fundamental questions of economics: How do governments emerge and maintain security? How is the rule of law sustained, and when can it be broken? What controls the value of money and the rate of inflation? What is the fundamental cause of unemployment, and how can it be corrected? What influences the rate of technological development? How can we maximize the rate of economic growth? What effect does redistribution of wealth have on employment and output? I envision a future where we can directly simulate these questions with thousands of eager participants, varying the subtlest of parameters and carrying out events over any timescale we like from seconds to centuries.

Nor is the potential of games in economics limited to research; it also has enormous untapped potential in education. I’ve already seen in my classes how tabletop-style games with poker chips can teach a concept better in a few minutes than hours of writing algebra derivations on the board; but custom-built video games could be made that would teach economics far better still, and to a much wider audience. In a well-designed game, people could really feel the effects of free trade or protectionism, not just on themselves as individuals but on entire nations that they control—watch their GDP numbers go down as they scramble to produce in autarky what they could have bought for half the price if not for the tariffs. They could see, in real time, how in the absence of environmental regulations and Pigovian taxes the actions of millions of individuals could despoil our planet for everyone.

Of course, games are fundamentally works of fiction, subject to the Fictional Evidence Fallacy and only as reliable as their authors make them. But so it is with all forms of art. I have no illusions about the fact that we will never get the majority of the population to regularly read peer-reviewed empirical papers. But perhaps if we are clever enough in the games we offer them to play, we can still convey some of the knowledge that those papers contain. We could also update and expand the games as new information comes in. Instead of complaining that our students are spending time playing games on their phones and tablets, we could actually make education into games that are as interesting and entertaining as the ones they would have been playing. We could work with the technology instead of against it. And in a world where more people have access to a smartphone than to a toilet, we could finally bring high-quality education to the underdeveloped world quickly and cheaply.

Rapid growth in computing power has given us a gift of great potential. But soon our capacity will widen even further. Even if Moore’s Law slows down, computing power will continue to increase for a while yet. Soon enough, virtual reality will finally take off and we’ll have even greater depth of immersion available. The future is bright, if we can avoid this corporatist cyberpunk dystopia we seem to be hurtling toward, of course.