Should you still use the ITN framework? [Red Teaming Contest]

frib

TL;DR: Fermi estimates are often a better tool than the importance-tractability-neglectedness framework to estimate the priority of cause areas. Fermi estimates can be done either using the breakdown of the ITN framework, or using a different breakdown entirely. I believe that using more Fermi estimates increases the quality of the arguments around cause area prioritization, while making it easier for outsiders to understand the core claims of effective altruism, easier to criticize the factual basis of the arguments, and easier to speak about potential new cause areas.

The state of the ITN framework

Most presentations of effective altruism argue that to compare the priority of cause areas, one should compare their importance, neglectedness and tractability. Even in in-depth analyses, like those of 80,000 Hours, the discussion often revolves around evaluating a given cause area along these three metrics. The ITN framework frames so much of the discussion in EA that it matters whether this tool is overused.

However, this tool has often been criticized for 3 main reasons:

The criteria are fuzzy, which makes it hard to use properly. See The Important/Neglected/Tractable framework needs to be applied with care for more details.
The tool is easy to misuse by applying it directly to interventions instead of cause areas. This is problematic because when comparing interventions, the ITN framework might lead you to underestimate the importance of other relevant criteria. See Evaluation Frameworks for more details.
It leads you to make assumptions that are often wrong (such as diminishing marginal returns, irrationality of other actors, …). See Summary Review of ITN Critiques for more details.

The main reason why the ITN framework is still used seems to be that there is no good alternative to it when it comes to comparing cause areas (for example, this page on 80,000 Hours only presents precise cost-effectiveness analysis as a credible alternative).

I disagree: I believe that there is at least one better way of comparing cause areas, namely Fermi estimates.

An alternative: Fermi estimates

What Fermi estimates are

Doing a Fermi estimate of a quantity means estimating it quantitatively by breaking it down into other quantities that are easier to find or guess. Fermi estimates can be used to quickly estimate empirical quantities that could be found with more research (such as “How many fatalities from passenger-jet crashes have there been in the past 20 years?”), but also quantities much closer to what you want to estimate when you do cause area prioritization (such as “What’s the chance a smart London resident dies of a Russian nuke?”).

Analyses estimating the $\frac{\text{good done}}{\text{additional resources}}$ for specific interventions also use some kind of Fermi estimates extensively. For example, GiveWell does deep empirical investigations, but still relies on guesses to estimate how much to trust different pieces of research, or to estimate the tradeoffs between different kinds of good.

In the rest of the piece, when I speak about Fermi estimates, I mean not only the kind of Fermi estimates you can do in ten minutes on the back of an envelope, but also the kind of Fermi estimates GiveWell is doing: a lot of careful empirical investigations mixed with some unavoidable wild guesses.

How Fermi estimates are relevant to cause area prioritization

The reason why people still use the ITN framework is mainly that they don’t feel like they can do a precise cost-effectiveness analysis of the $\frac{\text{good done}}{\text{additional resources}}$ for an entire cause area, which is true. But Fermi estimates are not about being precise: they are about having some estimate of the target quantity. What property of a cause area should be estimated? One possibility is to choose a unit of “good done” (for example, lives saved), a unit of “resources” (for example, dollars), and then estimate what the effect would be of giving 1 unit of additional resources to an effective but unknown intervention that helps solve the problem.

In fact, the ITN framework is an instance of such a Fermi estimate! The ITN estimates $Q = \frac{\text{good done}}{\text{additional resources}}$ by breaking it down the following way: $Q = \frac{\text{good done}}{\%\text{ of the problem solved}} \times \frac{\%\text{ of the problem solved}}{\%\text{ increase in resources}} \times \frac{\%\text{ increase in resources}}{\text{additional resources}}$.
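The breakdown can be written out with each factor labeled, to make the cancellation of the intermediate quantities explicit. The labels follow the usual reading of the framework (importance, tractability, neglectedness, in that order); the display layout itself is only an illustrative sketch.

```latex
\[
Q
= \underbrace{\frac{\text{good done}}{\%\text{ of the problem solved}}}_{\text{importance}}
\times
\underbrace{\frac{\%\text{ of the problem solved}}{\%\text{ increase in resources}}}_{\text{tractability}}
\times
\underbrace{\frac{\%\text{ increase in resources}}{\text{additional resources}}}_{\text{neglectedness}}
= \frac{\text{good done}}{\text{additional resources}}
\]
```

The denominator of each factor is the numerator of the next one, so only the outer numerator and the outer denominator survive the multiplication.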

How to do a Fermi estimate of the priority of a cause area?

There are two ways you can do a Fermi estimate of $\frac{\text{good done}}{\text{additional resources}}$:

Use the breakdown of the ITN framework, and actually multiply the importance, the neglectedness and the tractability. I would still call that a Fermi estimate, since most ITN analyses don’t do this extra step.
Use a different breakdown that uses knowledge and guesses specific to the cause area.
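As a rough illustration of the first strategy, here is a minimal Python sketch that multiplies the three ITN factors into a single number. The function name `itn_estimate` and every input value are hypothetical placeholders chosen for the example; they are not estimates for any real cause area.

```python
def itn_estimate(importance, tractability, neglectedness):
    """Multiply the three ITN factors into good done per additional resource.

    importance:    good done per % of the problem solved
    tractability:  % of the problem solved per % increase in resources
    neglectedness: % increase in resources per additional unit of resources
    """
    return importance * tractability * neglectedness


# Hypothetical placeholder inputs (assumptions, not real estimates):
# solving 1% of the problem saves 1,000 lives,
importance = 1_000.0
# a 1% increase in resources solves 0.5% of the problem,
tractability = 0.5
# and the field receives 10 million dollars, so 1 extra dollar is a 1e-5 % increase.
neglectedness = 100.0 / 10_000_000

q = itn_estimate(importance, tractability, neglectedness)
print(f"{q:.4f} lives saved per additional dollar")
```

With these made-up inputs the product is about 0.005 lives per dollar, that is, roughly 200 dollars per life. The point of the extra multiplication step is that it yields a number that can be compared directly with an estimate obtained from a different breakdown.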

In any case, don’t hesitate to make wild guesses if you feel stuck. It’s not a precise cost-effectiveness analysis! I think that making a quantitative guess of the tractability of a cause area is difficult. But guessing how likely it is that Elon Musk will be the first human to step on Mars also is! And EAs usually don’t shy away from making probabilistic estimates: it’s not intrinsically impossible, you will need to make some wild guesses along the way, and it will make what you say clearer if you use numbers instead of words.

I don’t know of any good example of strategy 1 (using the breakdown of the ITN framework), which is surprising given that this is supposed to be the underpinning of cause area prioritization!

A good example of strategy 2 (using a different breakdown) would be this estimation of the cost-effectiveness of climate change interventions.

Why the ITN framework is worse than Fermi estimates at cause area prioritization

I believe that framing cause area prioritization using the ITN, rather than directly using Fermi estimates to guess what $\frac{\text{good done}}{\text{additional resources}}$ is, makes the resulting argument worse, while making it much harder to criticize. First, Fermi estimates don’t display the 3 main weaknesses of the ITN framework:

There is no fuzziness in the criteria of Fermi estimates, because there are no criteria! The criteria you come up with when doing a Fermi estimate are often tailored to your use case, and you won’t be lost if one of the pieces of your estimate doesn’t fall neatly in one box.
Fermi estimates can’t be misused outside of their original domain, because they are always a very good tool when there is something to estimate.
Fermi estimates don’t have simplifying assumptions baked in. You can make simplifying assumptions. But they will be clearly visible.

But Fermi estimates also have strong additional benefits.

ITN analyses often use incoherent units

Because the ITN’s justification is that multiplying the importance, the neglectedness and the tractability gives you $\frac{\text{good done}}{\text{additional resources}}$, whenever someone gives you a quantitative estimate of importance, neglectedness and tractability, you should be able to multiply them together to get the quantity you truly care about. Therefore, any good ITN analysis should express the three quantities in “compatible units”. But then, why did 80,000 Hours, in its problem profile on AI risk, use “effort” in the unit of tractability (which encompasses both money and skilled people), and dollars in the unit of neglectedness? I would bet that 80,000 Hours didn’t use dollars in the unit of tractability because AI risk isn’t very money-constrained (see here for instance). Therefore it doesn’t matter how much money was already poured into AI safety if you want to know how much good (measured by “reducing the probability of humanity’s extinction” for instance) your “effort” (measured by time invested in AI safety for instance) might do. This error would have been avoided if 80,000 Hours consistently put “our best guess at $\frac{\text{good done}}{\text{additional resources}}$” (using a Fermi estimate) next to the usual importance, neglectedness and tractability.
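The “compatible units” requirement can be checked mechanically. The following is a minimal Python sketch, not a description of how 80,000 Hours works: the helper `multiply_units` and the unit labels are assumptions made up for the illustration. It multiplies the units of the three factors and reports what fails to cancel.

```python
from collections import Counter


def multiply_units(*factors):
    """Combine (numerator, denominator) unit pairs and cancel shared units."""
    exponents = Counter()
    for numerator, denominator in factors:
        exponents[numerator] += 1
        exponents[denominator] -= 1
    # Keep only the units that did not cancel out.
    return {unit: power for unit, power in exponents.items() if power != 0}


# Units written as (numerator, denominator); the labels are illustrative.
importance = ("good done", "% solved")
tractability_effort = ("% solved", "% increase in effort")
tractability_dollars = ("% solved", "% increase in dollars")
neglectedness_dollars = ("% increase in dollars", "additional dollars")

coherent = multiply_units(importance, tractability_dollars, neglectedness_dollars)
incoherent = multiply_units(importance, tractability_effort, neglectedness_dollars)

print(coherent)    # only "good done" per "additional dollars" remains
print(incoherent)  # the effort and dollar percentages are left over
```

When tractability is expressed per “effort” and neglectedness per dollar, two leftover units remain in the product, which is a mechanical way of seeing that the three numbers cannot be multiplied into a single good-per-resource figure.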

And even when you can multiply the three quantities together, I feel like speaking in terms of importance, neglectedness and tractability might make you feel that there is no total ordering of interventions (“some have higher importance, some have higher tractability, whether you prefer one or the other is a matter of personal taste”), which is completely wrong if you care about maximizing the $\frac{\text{good done}}{\text{additional resources}}$, which is at the core of effective altruism. Again, if the emphasis was put on the result of the Fermi estimate rather than its pieces, nobody would make this mistake.

ITN analyses hide differences in moral values and available resources

The real reason why different rational actors should put their resources in different cause areas is not their taste for importance vs neglectedness vs tractability. It’s that they have different resources and different moral values.

This can easily get washed out in summaries of cause area prioritization: for example “AI risk has high importance, medium neglectedness, medium tractability, and wild animal welfare has medium importance, high neglectedness, medium tractability” doesn’t express these differences, whereas “working your whole life in AI safety might reduce the probability of humanity going extinct by ~ $10^{-6}$ , and by donating 1000$ to effective wild animal welfare interventions you might prevent ~ $10^{5}$ years of animal suffering” does. The numbers given here are not well-grounded estimates. It’s a shame that after spending hours listening to wild animal welfare advocates and tens of hours reading EA posts about AI safety, I still don’t know the rough order of magnitude of expected impact per relevant unit of effort (even if I roughly know what the relevant importance, neglectedness and tractability are).

Expressing your conclusions as the result of a Fermi estimate rather than using the ITN makes it clear:

What you count as “good”
What resources are relevant for solving the problem

Of course, a single cause area might be able to put different resources to good use, and might have multiple “good” effects. But that’s a feature of Fermi estimates for cause prioritization, not a bug. It doesn’t make sense to evaluate the priority of a cause area without specifying what counts as good, and what resources are needed.

ITN analyses are hard to criticize

Fermi estimates are weird. When you read one of them, you realize that the assumptions it makes are often pretty bad, and if you’re knowledgeable or just curious, you might find ways to improve parts of the estimate, by refining some aspect of it or adding some factors that the original author forgot to take into account. Reading an estimate of importance, neglectedness and tractability feels much less like that. You might be curious about the specifics of the number of funders, or the value of the potential future, but it isn’t the most natural thing to investigate other relevant criteria, or the links between importance, neglectedness and tractability.

I don’t think that’s a problem for people who already have tens of hours of experience with EA: they often don’t use the ITN framework at all when it comes to debating the cost-effectiveness of one cause area. For instance, you know that GiveWell never computes the tractability of a given intervention, and therefore, if you want to know how much good you can do by donating to global health, you might not bother thinking about tractability. The criticisms of global health I found the most insightful (for example, critics speaking of the impact of global health donations on the long-term future) didn’t speak from within the framework, but rather argued almost from first principles.

However, that is often not the case for new EAs. When I joined EA, it didn’t feel natural to debate about a Fermi estimate I had done before joining the movement (about the fight against aging), because it didn’t fall neatly into the ITN framework. I would love it if in a few years, one would be encouraged to come up with Fermi estimates for completely new cause areas.

In general, I feel like one shouldn’t need to justify that a cause area is important, neglected and tractable to make it worthy of attention. For example, the violence against women post has a good point if judged based on the following rough Fermi estimate: multiple studies have found ~100$/DALY saved; even adjusting down by a factor of 10 for the fact that the studies might be flawed, the usual effective intervention might save around 1 DALY for every ~1000$ donated, which is about as good as GiveWell’s top recommendations. The article does use the relevant elements, but it spends a lot of time arguing that the problem is important (which seems not to be the strongest point of the piece: 68 500 deaths/year is ten times fewer than deaths due to malaria), neglected, and somewhat tractable. I think the community would be in better shape looking at the details of the Fermi estimate (How good are the studies? What could be the magnitude of positive/negative side effects of the interventions? …) rather than focusing on general heuristics which don’t have a lot of value when more evidence is available.
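The rough Fermi estimate in the paragraph above can be written out in a few lines of Python. This is only a sketch: the two figures (~100$/DALY and the factor-of-10 adjustment) come from the text, while the function name `cost_per_daly` and the alternative adjustment factors in the loop are assumptions added to show how the conclusion moves with the one guess that matters most.

```python
def cost_per_daly(reported_cost, skepticism_factor):
    """Dollars per DALY after adjusting the reported figure for possible flaws."""
    return reported_cost * skepticism_factor


reported_cost = 100.0      # ~100$/DALY found by the studies (from the text)
skepticism_factor = 10.0   # adjustment for possibly flawed studies (from the text)

adjusted = cost_per_daly(reported_cost, skepticism_factor)
dalys_per_1000_dollars = 1000.0 / adjusted
print(f"~{adjusted:.0f}$ per DALY, i.e. ~{dalys_per_1000_dollars:.0f} DALY per 1000$")

# Hypothetical alternative adjustment factors, to see how much the guess matters.
for factor in (1, 10, 100):
    print(f"factor {factor:>3}: ~{cost_per_daly(reported_cost, factor):.0f}$ per DALY")
```

Writing the estimate this way makes it obvious where a critic should push: the quality of the studies determines the adjustment factor, and the adjustment factor moves the result linearly.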

When using the ITN framework is still the right call

Even if I believe the ITN framework shouldn’t be used as extensively as it is today when it comes to in-depth research posted on 80,000 Hours or on the EA forum, I think it is a tool that has some good use cases:

Speaking to people afraid of numbers: Fermi estimates might not be suitable for people who can’t handle very large or very small numbers. It might be better to debate about importance, neglectedness and tractability to avoid losing precious minds which might not (yet) be fluent in this kind of math.
Quick judgment calls: if you hear in the news about some new intervention, the ITN is a tool which enables you to make a one minute judgment call about whether it is worth investigating further or not. Fermi estimates often take more than one minute, even for the most basic ones.
Summarizing Fermi estimates: if your Fermi estimate for a cause area uses the same breakdown as the ITN, saying explicitly what your importance, tractability and neglectedness are makes it much more obvious to the reader why the cause area has high priority (or low priority).
Investigating a completely new cause area: when you have no kind of intervention in mind, the error bars of the Fermi estimate might be way too large. Notice that in this case, the same argument also applies to the ITN framework (because it is a kind of Fermi estimate). But by dropping the T of the ITN, there is still something useful to say. You might say “cause area X has very high importance and neglectedness, so it might be worth trying really hard to find interventions” or “cause area Y has high neglectedness but very low importance, therefore I don’t plan to investigate further”. In this kind of situation, speaking of cost-effectiveness (or tractability) is confusing. Focusing on estimating the magnitude of the good to be done can be better.

Final words

There is one bad reason to use the ITN without doing the full Fermi estimate: the ITN is more socially acceptable. I believe that when the person you’re speaking to has the high-school level math required to understand Fermi estimates well, you should not shy away from them. I agree it feels weirder to speak about “animal lives saved per dollar” or “decrease in humanity’s extinction probability per work week” than about importance, neglectedness and tractability. But not explicitly saying that you are trying to guess $\frac{\text{good done}}{\text{additional resources}}$ just sweeps the natural conclusion of your argument under the carpet, while making it harder to understand. This might help recruit more people, but might drive away those with good epistemics (as some other recruitment tactics might).

Please, next time you’re writing a piece about a particular cause area, or explaining EA ideas to someone with a STEM education, use Fermi estimates instead of the ITN framework.

finm, Sep 10 2022

Thanks for writing this! What I took from it (with some of my own thoughts added):

The ITN framework is a way of breaking down $\frac{\text{good done}}{\text{additional resources}}$ into three components: $\frac{\text{good done}}{\%\text{ of the problem solved}} \times \frac{\%\text{ of the problem solved}}{\%\text{ increase in resources}} \times \frac{\%\text{ increase in resources}}{\text{additional resources}}$

As such, ITN is one way of estimating $\frac{\text{good done}}{\text{additional resources}}$. But you might sometimes prefer other ways to break it down, because:

Sometimes the units for I, T, or N are ambiguous, and that can lead to unit inconsistencies in the same argument, e.g. by equivocating between "effort" and "money". These inconsistencies can mislead.
The neat factorisation might blind us to the fact that the meaning of 'good done' is underspecified, so it could lead us into thinking it is easier or more straightforward than it actually is to compare across disparate causes. Having more specific $X$s for $\frac{X}{\text{additional resources}}$ can make it clearer when you are comparing apples and oranges.
ITN invites marginal thinking (you're being asked to estimate derivatives), but sometimes marginal thinking can mislead, when 'good done' is concave with resources.
Maybe most important of all: sometimes there are just much clearer/neater ways to factor the problem, which better carve it at its joints. Let's not constrain ourselves to one factorisation at the cost of more natural ones!

I should add that I find the "Fermi estimates vs ITN" framing potentially misleading. Maybe "ITN isn't the only way to do Fermi estimates of impact" is a clearer framing?

Anyway, curious if this all lines up with what you had in mind.

frib, Sep 23 2022

This roughly lines up with what I had in mind!

In a world in which people used the ITN as a way to do Fermi estimates of impact, I would have written "ITN isn't the only way to do Fermi estimates of impact", but my experience is that people don't use it this way. I have almost never seen an ITN analysis with a conclusion which looks like "therefore, $\frac{\text{good done}}{\text{additional resources}}$ is roughly X lives per dollar" (which is what I care about). But I agree that "Fermi estimates vs ITN" isn't a good title either: what I argue for is closer to "Fermi estimates (including ITN_as_a_way_to_Fermi_estimate, which sometimes is pretty useful) vs ITN_people_do_in_practice".

finm, Sep 24 2022

Ok, got it. I'm curious: how do you see people using ITN in practice? (If not for making and comparing estimates of $\frac{\text{good done}}{\text{additional resources}}$?)

Also this post may be relevant!

Michael_Wiebe, Jul 14 2022

It sounds like you're arguing that we should estimate 'good done/additional resources' directly (via Fermi estimates), instead of indirectly using the ITN framework. But shouldn't these give the same answer?

Karthik Tadepalli, Jul 14 2022

Use the breakdown of the ITN framework, and actually multiply the importance, the neglectedness and the tractability. I would still call that a Fermi estimate, since most ITN analyses don’t do this extra step.

I don't think OP is opposed to multiplying them together.

Michael_Wiebe, Jul 14 2022

I didn't suggest otherwise.

And even when you can multiply the three quantities together, I feel like speaking in terms of importance, neglectedness and tractability might make you feel that there is no total ordering of interventions (“some have higher importance, some have higher tractability, whether you prefer one or the other is a matter of personal taste”)

I don't follow this. If you multiply I*T*N and get 'good done/additional resources', how is that not an ordering?

frib, Jul 16 2022

That's an ordering!

It's mostly analyses like those of 80k Hours, which do not multiply the three together, that might lead you to think there is no ordering.

Is there a way I can make that more precise?

Karthik Tadepalli, Jul 14 2022

This seems to boil down to cluster thinking vs sequence thinking. Fermi estimates are sequence thinking because they involve a sequence of assumptions and output a number. ITN is cluster thinking because it takes a few different angles to look at the problem and estimate how appealing it is from each of them. The strength of sequence thinking is in its transparency and ability to generate numbers comparable across domains. The strength of cluster thinking is in its robustness and not letting any one factor dominate the analysis. I do not think there's a dominance relation between them.

frib, Jul 14 2022

I don't understand how the robustness argument works; I couldn't steelman it.

If you want to assess the priority of an intervention by breaking down its priority Q into I, T & N:

if you multiply them together, you didn't make your estimation more robust than using any other breakdown.
if you don't, then you can't say anything about the overall priority of the intervention.

What's your strategy to have high robustness estimation of numerical quantities? How do you ground it? (And how is it that it works only when using the ITN breakdown of Q, and not any other breakdown?)

Multiplying them together would be the same, it's true. I was talking about keeping it disaggregated. In this view, rather than a single priority Q, we can have an "importance Q", "tractability Q", "neglectedness Q", and we compare interventions that way.

The desire to have a total ordering over interventions is understandable but I don't know if it's always good when changing one subjective probability estimate from 10^-5 to 10^-6 can jump your intervention from "fantastic deal" to "garbage deal". By limiting the effect of any one criterion, the ITN framework is more stable to changing subjective estimates. Holden's cluster thinking vs sequence thinking essay goes into that in more detail.

Other breakdowns would be fine as well.
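The sensitivity described in this comment can be sketched in a few lines of Python. Only the two probabilities (10^-5 and 10^-6) come from the comment; the function name `expected_value`, the payoff, the cost and the funding bar are hypothetical numbers picked so that the flip is visible.

```python
def expected_value(probability_of_success, value_if_success, cost):
    """Expected good done per unit of resources for a long-shot intervention."""
    return probability_of_success * value_if_success / cost


# Hypothetical placeholders (assumptions, not real estimates).
value_if_success = 1e9   # good done if the intervention works
cost = 1e3               # resources spent
funding_bar = 5.0        # minimum good per unit of resources to count as a good deal

optimistic = expected_value(1e-5, value_if_success, cost)
pessimistic = expected_value(1e-6, value_if_success, cost)

print(f"p = 1e-5: {optimistic:.1f} per unit, above the bar: {optimistic > funding_bar}")
print(f"p = 1e-6: {pessimistic:.1f} per unit, above the bar: {pessimistic > funding_bar}")
```

Because the estimate is a product, a tenfold change in one subjective probability changes the result tenfold, which is enough to move an intervention from one side of a fixed bar to the other.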

frib, Jul 16 2022

How would you compare these two interventions:

1: I=10, T=1, N=1

2: I=1, T=2, N=2

I feel like the best way to do that is to multiply things together.

And if you have error bars around I, T & N, then you can probably do something more precise, but still close in spirit to "multiply the three things together".
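One way to make this comparison concrete is sketched below in Python. The point estimates are the ones given in the comment; the error bars are an assumption made for the illustration (independent lognormal noise with a spread of 0.5 in log space on every factor), and the helper name `sample_product` is hypothetical.

```python
import math
import random


def sample_product(point_estimates, log_sigma, rng):
    """Draw one sample of I*T*N with lognormal noise around each point estimate."""
    product = 1.0
    for value in point_estimates:
        product *= rng.lognormvariate(math.log(value), log_sigma)
    return product


intervention_1 = (10.0, 1.0, 1.0)  # I, T, N from the comment above
intervention_2 = (1.0, 2.0, 2.0)

# Point estimates: just multiply the three factors.
point_1 = math.prod(intervention_1)
point_2 = math.prod(intervention_2)
print(point_1, point_2)

# Error bars: an assumed spread of 0.5 in log space on every factor.
rng = random.Random(0)  # fixed seed so the run is reproducible
n_samples = 10_000
wins = sum(
    sample_product(intervention_1, 0.5, rng) > sample_product(intervention_2, 0.5, rng)
    for _ in range(n_samples)
)
print(f"Intervention 1 beats intervention 2 in {wins / n_samples:.0%} of samples")
```

The point estimates give 10 against 4, so intervention 1 comes out ahead. With the assumed error bars, the comparison becomes a probability that one product exceeds the other rather than a strict ranking, which is still close in spirit to multiplying the three things together.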

Effective Altruism Forum