One of the main criteria used for cause area selection is the scale, or importance, of the issue. It has been used by 80,000 Hours and OpenPhil, among others, and is often defined as the size and intensity of the problem: an issue that deeply affects 100,000 people would be considered higher scale than one that mildly affects 1,000 people. Although this version of scale is common in EA, I think it has some major problems, the biggest of which is bottlenecking.

Measuring scale this way bakes in the assumption that the total scale of the problem is the factor that matters most. For almost all large problems, however, this seems very unlikely to be true: efforts to address them will almost always be capped or bottlenecked by something else long before they are capped by the total size of the problem. Take bednets as an example. If AMF only gets enough funding to give out 10 million bednets a year, it doesn't really matter whether the total malaria burden would require 20 million or 500 million; AMF is effectively capped by money before it hits other scaling considerations. If you were a billionaire, perhaps you could give enough that money was no longer the binding constraint, but even then another factor would likely cap progress before every person in need of a net was reached. In AMF's case it would probably be the number of partners that can effectively be worked with, or the political stability of the remaining countries.
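
To make the bottleneck point concrete, here is a minimal sketch with invented numbers (they are not AMF's actual figures): once funding binds, the total size of the malaria burden drops out of the calculation entirely.

```python
# A minimal sketch with made-up numbers (not AMF's actual figures): the nets that
# actually reach people are capped by whichever constraint binds first, so raising
# the "total need" figure changes nothing once funding is the bottleneck.

def nets_delivered(funding_usd, cost_per_net, partner_capacity, total_need):
    """Nets delivered is the minimum across every binding constraint."""
    nets_funded = funding_usd / cost_per_net
    return min(nets_funded, partner_capacity, total_need)

smaller_burden = nets_delivered(funding_usd=50_000_000, cost_per_net=5,
                                partner_capacity=15_000_000, total_need=20_000_000)
larger_burden = nets_delivered(funding_usd=50_000_000, cost_per_net=5,
                               partner_capacity=15_000_000, total_need=500_000_000)
print(smaller_burden, larger_burden)  # 10000000.0 10000000.0 -- total need never enters
```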

This bottlenecking concern is even more dramatic in fields with tighter caps or very large scale. To take an example from animal rights: if your organization can only raise $100,000, it doesn't really matter how big the affected population is, as long as it is much larger than the number you could effectively help with $100,000. When comparing a cause like animal rights with bednets, the animal rights problem clearly affects far more individuals, but in many cases its "true scale" will be more strictly capped than that of a more popular, better funded, and better understood poverty intervention that affects fewer individuals.

Money is one of the most common features that caps scale, but it's not the only one. Sometimes the cap is logistical, such as the number of partners or the total market production of a certain good. Sometimes it is people (a charity focused on surgeries would likely run into a shortage of skilled surgeons long before it ran out of people who need surgery). A capping feature could also be tied to the crowdedness of a space, or to our understanding of the problem: wild animals may suffer enormously, but we do not yet know how to help most of them. In general, when looking at a cause area or charity, one should consider which factor is most likely to cap scale first, rather than just looking at the total size of the problem and assuming no other capping features apply.
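
As a toy illustration of "ask which factor caps scale first", here is a sketch for a hypothetical surgery charity with entirely invented numbers:

```python
# Toy sketch: list rough caps for a hypothetical surgery charity (all numbers invented)
# and report which constraint limits scale first.

def binding_constraint(caps):
    """Return the (name, value) pair of the constraint that caps scale first."""
    return min(caps.items(), key=lambda item: item[1])

caps = {
    "patients needing surgery per year": 2_000_000,        # the "total scale" of the problem
    "surgeries fundable per year": 40_000,                 # money cap
    "surgeries performable by trained surgeons": 15_000,   # skilled-people cap
}
print(binding_constraint(caps))  # ('surgeries performable by trained surgeons', 15000)
```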

One counterargument is that scale is just used as a proxy to narrow down cause selection. I have far fewer concerns with that use of scale, but many people, including major organizations, explicitly use scale in the way I described to make final calls about which causes to support.

Another counterargument applies if you think your intervention has a small chance of helping the entire affected population. For example, if you think your action produces a 0.000001% increase in the chance of ending all factory farming, then the more standard understanding of scale makes sense. However, given the huge scale of most problems EAs work on, few of our solutions are aimed at solving the whole problem (e.g. we cannot even fill AMF's room for funding, and AMF is only one of many charities working on malaria). We should be careful not to be wildly overconfident about our ability to effect change, and not to let that overconfidence drive our cause selection.
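
For completeness, here is the arithmetic behind that counterargument, using the 0.000001% figure above and a purely hypothetical total for the size of factory farming: under this framing, expected impact really does grow with total scale, which is exactly why it tempts people toward the standard definition.

```python
# Illustrative only: under the "small chance of solving the whole problem" framing,
# expected impact is probability-of-success times total scale, so total scale matters.
prob_increase = 0.000001 / 100      # the 0.000001% figure from the example above
animals_in_system = 10**10          # hypothetical total scale, not a real estimate
expected_animals_spared = prob_increase * animals_in_system
print(expected_animals_spared)      # 100.0 animals spared in expectation
```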

Comments



"Scaling Studies" are a thing, now part of "Implementation Science". 

The focus tends to be on what makes pilot projects scalable and what interferes with scaling. Politicians and funders get (understandably) irritated when pilots keep failing to scale - this was happening a lot in the 1990s, which gave rise to the first studies on scaling.

Implementation science looks more generally at what really works in the field - a lot is going on in this area in Chicago and in Global Health.

[Writing personally] This post seems to argue (at least implicitly) that scale is bad if it is the only metric that is used to assess a cause area or intervention. I think this is clearly correct.

But I don't think anyone has used scale as the only metric: 80,000 Hours very explicitly modify it with other factors (neglectedness, solvability), to account for things like the "bottlenecking" you discuss.

There's a separate thing you could mean, which is "Scale, neglectedness, solvability is only one model for prioritisation. It's useful to have multiple different models for prioritising. One alternate model is to assess what the biggest bottleneck is for solving a problem." (Note that this does not really support the claim that scale is misused: it's just that other lenses are also useful.)

I respect the inclination to use multiple models, and I think that thinking in terms of bottlenecks is useful for e.g. organizational prioritization. I think it's harder to apply to cause prioritization because we face so many problems and potential solutions that it's hard to see what the bottlenecks are. It may be useful for prioritizing how to use resources to pursue an intervention, which seems to be how you are mostly using it in this case.

Overall, I worry that your title doesn't really reflect what you show in the text.

I didn't read the post as meaning either "scale is bad if it is the only metric that is used" _or_ "Scale, neglectedness, solvability is only one model for prioritisation. It's useful to have multiple different models...."

When looking at scale within a scale, neglectedness, tractability framework, it's true that the other factors can offset the influence of scale: if something is large in scale but intractable, the intractability counts against the cause and at least somewhat offsets the consideration that it is large in scale. But this doesn't touch on the point this post makes, which is that when looking at scale itself as a consideration, the 'total scale' may be of little or no relevance to the evaluation of the cause; rather, 'scale' only matters up to a given bottleneck and is of no value beyond that. I almost never see people talk about scale in this way in the context of a scale, neglectedness, tractability framework - dividing the total scale into tractable bits, less tractable bits, and totally intractable bits. Rather, I more typically see people assigning some points for scale, evaluating tractability independently and assigning some points for that, and evaluating neglectedness independently and assigning some points for that.

Thanks, David. Your interpretation is indeed what I was trying to get across.

I read this the same way as Max. The issue of the cost to solve (e.g.) all cases of malaria is really tractability, not scale. Scale is how many people would be helped (and to what degree) by doing so. Divide the latter by the former and you have a sensible-looking cost-benefit analysis, one that is sensitive to the 'size and intensity of the problem' (i.e. the latter).

I do think there are scale-related issues with drawing lines between 'problems', though - if a marginal contribution to malaria nets now achieves twice as much good as the same marginal contribution would in 5 years, are combatting malaria now and combatting malaria in five years 'different problems', or do you just try to average out the cost-benefit ratio between somewhat arbitrary points (e.g. now and the point when the last case of malaria is prevented or cured)? But I also think the models Max and Owen have written about on the CEA blog do a decent job of dealing with this kind of question.

[anonymous]

Your argument does not suggest that there is a problem with the commonly used conception of scale, but rather with how it is combined with tractability and neglectedness. Thus, it does not support the claims made in the main piece.

I disagree on both counts. I think my comment is recapitulating the core claims of the main piece (and am pretty confident the author would agree).

In my comment I mention the full S/T/N framework only because MaxDalton suggested that, when properly viewed within that framework, the concerns with 'scale' Joey raised don't apply. I argued that Joey's concerns apply even if you are applying the full S/T/N framework, but I don't think they apply only if you are applying the full framework.

[anonymous]

OK, but then the issue is problem individuation, not the conception of scale used.

Agree. We might ask: why do we care about scale in the first place? Presumably because, in many cases, it means our efforts can help more. But in cases where larger scale does not mean that our efforts can help more (because we cannot help beyond a certain scale), we should not care about the larger scale of the problem.

Do people really think of scale as a bottleneck? I take this article to mean "maybe scale isn't really important to think about if you're unlikely to ever reach that scale".

Perhaps scale could be thought of as the inverse of the diminishing returns rate (e.g., more scale = less diminishing returns = more ability to take funding). This seems useful to think about to me.

Maybe the argument should be that when thinking about scale, neglectedness, and tractability, we should put more emphasis on tractability and also think about the tractability of attracting funding / resources needed to meet the scale?
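
One way to picture the "scale as the inverse of the diminishing returns rate" idea above is a sketch like the following, where the logarithmic functional form is purely illustrative:

```python
# Illustrative sketch: model total good as scale * log(1 + spend / scale), so a larger
# "scale" parameter stretches the curve and marginal returns diminish more slowly.

def marginal_good(spend, scale):
    # derivative of scale * log(1 + spend / scale) with respect to spend
    return 1.0 / (1.0 + spend / scale)

for scale in (10**6, 10**8):
    # find the spending level at which marginal returns have halved
    room = next(x for x in range(0, 10**9, 10**6) if marginal_good(x, scale) <= 0.5)
    print(f"scale={scale:.0e}: marginal returns halve after ~${room:,} of funding")
```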

> Perhaps scale could be thought of as the inverse of the diminishing returns rate (e.g., more scale = less diminishing returns = more ability to take funding). This seems useful to think about to me.

Yes, this is why you need to consider the ratio of scale and neglectedness (for a fixed definition of the problem).

Quick comment: note that you can apply INT to any fraction of the problem (1% / 10% / 100%). The key is just that you use the same fraction for N and T as well. That's why we define the framework using "% of problem solved" rather than "solve the whole problem". https://80000hours.org/articles/problem-framework/

If you run into heavily diminishing returns at the 10% mark, then applying INT to 10% of the problem should yield better results.
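
A toy numerical check of that point (all numbers invented): if the first 10% of a problem is cheap and the rest is expensive, scoring the framework on the whole problem averages over spending you would never do, while scoring it on the first 10% reflects the margin you would actually fund.

```python
# Toy check: cost-effectiveness scored over the whole problem vs over the tractable 10%.
total_good = 1_000_000             # good from solving 100% of the problem
cost_first_10pct = 1_000_000       # the cheap, tractable slice
cost_remaining_90pct = 99_000_000  # heavily diminishing returns beyond the 10% mark

whole_problem = total_good / (cost_first_10pct + cost_remaining_90pct)
first_slice = (0.10 * total_good) / cost_first_10pct
print(whole_problem, first_slice)  # 0.01 vs 0.1 units of good per dollar
```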

This can mean that very narrowly defined problems will often be more effective than broad ones, so it's important to compare problems of roughly the same scale. Also note that narrowly defined problem areas are less useful - the whole point of having relatively broad areas is to build career capital that's relevant to more than just one project.

Finally, our overall process is (i) problems (ii) methods (iii) personal fit. Within methods you should think about the key bottlenecks within the problem area, so it partly gets captured there. Expected impact is roughly the product of the three. So I agree people shouldn't use problem selection as an absolute filter, since it could be better to work on a medium-ranked problem with a great method and personal fit.

You've scooped me! I've got a post on the SNT framework in the works. On the scale bit:

The relevant consideration here seems to be systemic vs atomic changes. The former affects all of the cause, or has a chance of doing so; the latter affects just a small part of it with no further impacts, hence 'atomic'. An example of the former would be curing cancer; an example of the latter would be treating one case of it.

Assessing the total scale of a cause is only relevant if you're calculating the expected value of systemic interventions. I generally agree it's a mistake to force people to size up the entire cause - as 80k do, for instance - because it's not necessary if you're just looking at atomic interventions.

> I generally agree it's a mistake to force people to size up the entire cause - as 80k do

We don't - see my comment above.

For an atomic intervention, the relevant scale is the amount of good that can be done by a given amount of money, the relevant tractability is whether there is good evidence that the intervention works, and the relevant neglectedness is room for more funding. (This is the GiveWell framework.)

For a systemic intervention, the relevant scale is the amount of good that can be done by solving the problem, the relevant tractability is how much of the problem would be solved (in expectation) by increasing the resources going towards it by X%, and the relevant neglectedness is the amount of resources it would take to increase by X% the resources devoted to the problem. (If there are increasing or diminishing marginal returns, then X should be chosen based on the amount by which the prospective donor or prospective employee would actually increase resources. If there are constant marginal returns, X can be set at whatever makes it easiest to predict how much of the problem would be solved - e.g. choosing a 50% increase even though your donation would only amount to a 0.1% increase, because it's easier to get a sense of how much of the problem would be solved by a 50% increase.) (This is the 80,000 Hours framework.)
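
As a back-of-the-envelope illustration of how the three systemic-framework terms described above combine (every number here is invented purely to show the units cancelling, with neglectedness expressed as its reciprocal so the terms multiply):

```python
# Sketch of the systemic (80,000 Hours-style) product: scale * tractability * (1/neglectedness)
good_if_solved = 1_000_000             # scale: units of good from solving the whole problem
fraction_solved_per_pct_more = 0.001   # tractability: share of problem solved per +1% resources
pct_more_per_dollar = 1 / 50_000       # reciprocal of neglectedness: $50k buys a +1% resource increase
good_per_dollar = good_if_solved * fraction_solved_per_pct_more * pct_more_per_dollar
print(good_per_dollar)                 # 0.02 units of good per marginal dollar
```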

When EAs discuss scale, they generally mean scale in the sense that the term is used for systemic interventions (i.e. the scale of the problem, not the scale of the good the intervention would do). When EAs discuss tractability, they generally mean tractability in the sense that the term is used for atomic interventions (i.e. whether the intervention would be successful, not how much of the problem it would solve in expectation). EAs should avoid mixing and matching scale and tractability in this way.

See my previous comment here for a lengthier discussion of this issue.
