The Benevolent Ruler’s Handbook (Part 1): The Policy Problem

FCCC

(Note that while I’ll frame everything in terms of government policy, all of this theory really applies to any form of decision-making, including that of an artificial intelligence. So if government is not your jam, please don’t get turned off by all the policy talk. The theory I’ve laid out has been oddly helpful to a lot of my life, and I hope it provides you with similar benefits.)

All hail the emperor

Let’s suppose you’re the benevolent emperor of the world. All those who opposed you have died (under mysterious circumstances) and all who remain will follow your policies without question.

Palpatine shooting lightning out of his fingers
Very mysterious circumstances…

How do you decide between two different policies in, let’s say, transport? Well, since you’re benevolent, it follows that the policies must be measured against some kind of moral criteria…

Bentham called. He’s bringing over the utili-tea.

Maybe we select the transport policy that results in the “better world”. That makes sense, right?

Wait, what if the two transport policies are identical, except that one version extends one train with an extra railcar, within which cancer researchers produce a cure? On the one hand, the cancer-research version produces a better outcome than the plain-train version. But on the other hand, it is clearly stepping outside its purview as a transport system.

So choosing a policy solely because it results in the “better” outcome categorically does not make sense: Each policy must have a bounded domain.

Just use science, you fools.

What about if we approach the policy-choice problem scientifically?

Scientism
New studies suggest that I’m right, and you’re wrong, so shut up

Aha! Let’s run a randomized experiment: we’ll test one transport system in half the world’s countries and another transport system in the other half, and see which countries do better against some metric! Again, we run into the “bounded domains” problem, but if our metric is itself bounded to the domain of interest, we might avoid it.

The main issue here is cost. Even if the experiments had no financial cost, we would still bear the opportunity cost of running the wrong policy. Because of this, we must figure out how to choose which policies to test in the first place, and that cannot be solved through empiricism: It’s obviously nonsensical to think, “Before I perform any experiment, I should perform an experiment”.

I’m not saying that empiricism has no place in public policy. It does. You need science to estimate parameters and to test assumptions. But there is no way around it: We need a theory first so that we know which parameters to estimate, and which assumptions to test. Even empiricism is based on a solid theoretical foundation. (In the case of deterministic models with infallible experiments, falsification only works because of modus tollens. In the more general case, its foundation is Bayesian.) Empiricism could not be relied upon without theory.
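For readers who want the two foundations spelled out, here is a rough sketch in standard notation. The symbols are my own labels rather than anything defined in this post: T is a theory, O is an observation the theory predicts, and E is the evidence an experiment produces.

```latex
% Deterministic case: falsification as modus tollens.
% If theory T implies observation O, and O is not observed, then T is false.
\[
  (T \rightarrow O),\ \neg O \ \vdash\ \neg T
\]

% General case: Bayesian updating on evidence E.
% The posterior credence in T depends on how well T predicted E.
\[
  P(T \mid E) = \frac{P(E \mid T)\, P(T)}{P(E)}
\]
```

In both cases the experiment only tells you something because a piece of theory (a rule of inference or a rule of probability) says how to get from the result to a conclusion.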

I’ve encountered many who claim that “theory cannot count as evidence”. That statement is effectively a claim that either logic does not work or that there is little value in confining our uncertainty to a small set of testable (or otherwise reasonable) assumptions. Both claims are, I hope, obviously ridiculous to you, dear reader. (See: all of mathematics and much of physics.)

Policymakers might know some things…

Maybe we can gain some wisdom by looking at how policy is currently done.

Dumpster fire
Pictured: How policy is currently done

In my experience, the approach of most policymakers is to see a problem produced under the current system, and then attempt some alteration to fix that problem.

For example, when I worked in the Farm Household Allowance team, the allowance had a limited number of days that recipients were allowed to be paid, as well as a “gradient payment” (if you earn a little more private income, you get paid a little less welfare).

This meant that farmers who were eligible for, say, half the maximum rate of pay faced a dilemma: Get paid half the max rate now, or avoid claiming an allowance that you are eligible for, because you might be under even more financial distress in the future (and thus receive a greater payment at that time). Burdening those people with that kind of choice is a big problem. So, we changed to an “all-or-nothing” payment: You’re either at the max rate or you get nothing. Problem solved!

And this illustrates the failure of focusing on “the current problem”—you create new ones. In our case, if someone is $1 below the income threshold, they get the maximum payment, which is several hundred dollars; if they earn $2 more, they lose all of it. This is a strong disincentive to earn more money when you’re right below the threshold. That’s bad. Customers should never be worse off from earning more private income.
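To make the cliff concrete, here is a minimal sketch of an all-or-nothing payment. The dollar figures and function names are hypothetical (this post does not give the real rates or thresholds); only the shape of the rule comes from the description above.

```python
# All numbers below are made up for illustration; they are not the real
# Farm Household Allowance rates or thresholds.
MAX_PAYMENT = 500.0        # assumed maximum payment, in dollars
INCOME_THRESHOLD = 1000.0  # assumed private-income cutoff, in dollars


def all_or_nothing_payment(private_income: float) -> float:
    """Pay the maximum rate below the threshold and nothing at or above it."""
    return MAX_PAYMENT if private_income < INCOME_THRESHOLD else 0.0


def total_income(private_income: float) -> float:
    """Private income plus whatever allowance that income attracts."""
    return private_income + all_or_nothing_payment(private_income)


just_below = total_income(INCOME_THRESHOLD - 1)  # $1 under the threshold
just_above = total_income(INCOME_THRESHOLD + 1)  # after earning $2 more

print(just_below, just_above)
```

Under these assumed numbers, the person who earns $2 more ends up with less money overall, which is exactly the disincentive described above.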

Fortunately, we can solve both issues: I proposed that we keep the gradient payment but deduct entitlement in proportion to the rate paid. If someone is paid at, say, 17 percent of the max rate, they lose 0.17 days of entitlement (rather than a full day). This arguably met the legislative requirements (the legislators neglected to define the word “day”, the fools). So I encoded the desired properties mathematically and showed, under a reasonable set of assumptions, that my proposal satisfied both of the desired criteria. We did not proceed with my idea (see above image).
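Here is a rough sketch of how the two criteria could be checked numerically. It is not the proof I wrote at the time. The linear taper, the dollar amounts, and every name in it are assumptions made for illustration; the only parts taken from the proposal are that payment falls gradually as private income rises and that entitlement is used up in proportion to the rate paid.

```python
import math

# Hypothetical parameters, chosen only so the arithmetic is easy to follow.
MAX_DAILY_PAYMENT = 500.0  # assumed maximum rate, in dollars per day
TAPER_RATE = 0.5           # assumed: each extra $1 earned cuts payment by $0.50


def gradient_payment(private_income: float) -> float:
    """Payment shrinks smoothly as private income grows, never below zero."""
    return max(0.0, MAX_DAILY_PAYMENT - TAPER_RATE * private_income)


def entitlement_days_used(private_income: float) -> float:
    """Entitlement consumed for one day of claiming, scaled by the rate paid."""
    return gradient_payment(private_income) / MAX_DAILY_PAYMENT


def total_income(private_income: float) -> float:
    return private_income + gradient_payment(private_income)


# Criterion 1: earning more private income never leaves you worse off.
incomes = range(0, 1201)
totals = [total_income(float(x)) for x in incomes]
never_worse_off = all(later > earlier for earlier, later in zip(totals, totals[1:]))

# Criterion 2: claiming at a partial rate only uses up a partial day.
# With these numbers, $830 of private income gives 17 percent of the max rate.
days_used = entitlement_days_used(830.0)

print(never_worse_off, days_used)
```

The first check holds here because the assumed taper rate is below 1, so every extra dollar earned raises total income. The second shows why the dilemma goes away: someone paid at 17 percent of the max rate gives up only 0.17 days of entitlement, so claiming now does not burn a full day they might want later.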

It ain’t much, but it’s honest work
My team after coming up with one idea and calling it a day

Okay, fine. This problem is harder than I thought. What do we do?

So, you can’t use simple utility maximization… And you can’t really do experimental policy design… And you can’t focus on “fixing the current problems”.

Then what can you do?

I suspect that mechanism designers have it right: define an ideal set of goals (don’t worry, we’ll define exactly what that means later), and then prove, under a reasonable set of assumptions, that the proposed policy satisfies those goals.

To prove they’re right, what we need is a theory that shows mechanism design works according to some reasonable (and, ideally, unobjectionable) justification, i.e. in the same way that modus tollens is a reasonable justification for experimental falsification. And that’s exactly what I aim to do here. So if that interests you, please keep on reading. And if it doesn’t—but you make government policy—read it anyway.

What I hope you’ll come away with is a set of non-obvious and necessary constraints for thinking about making choices (policy or otherwise), just as modus tollens constrains you from believing in a theory that has been falsified. I warn you that this is a lot more complicated than the justification behind falsification.

This is Part 1 of a series. Here's Part 2.

I'll try to write a new post each week. If you want to get notified of each new post, you can subscribe to my account.

Effective Altruism Forum