[Nothing here is original, I’ve just combined some standard EA arguments all in one place]
I’m confused about why EAs who place non-negligible value on future people justify the effectiveness of interventions by the direct effects of those interventions. By direct effects I mean the kinds of effects that are investigated by GiveWell, Animal Charity Evaluators, and Charity Entrepreneurship. I mean this in contrast to focusing on the effects of an intervention on the long-term future as investigated by places like Open Phil, the Global Priorities Institute, and the Future of Humanity Institute.
This post lays out my current understanding of the problem so that I can find out the bits I’m missing or not understanding properly. I think I’m probably wrong about something because plenty of smart, considerate people disagree with me. Also, to clarify, there are people I admire who choose to work on or donate to near-term causes.
Section one states the problem of cluelessness (for a richer treatment read this: Cluelessness, Hilary Greaves) and explains why we can’t ignore the long-term effects of interventions.
Section two points at some implications of this for people focussed on traditionally near-term causes like mental health, animal welfare, and global poverty. I think these causes all seem pressing. I think that they are long-term problems (ie. poverty or factory farms now are just as bad as poverty or factory farms in 1000 years) and that it makes sense to prioritise the interventions that have the best long-term effects on these causes.
Section three tries to come up with objections to my view, and respond to them.
1. Cluelessness and Long-term Effects
All actions we take have huge effects on the future. One way of seeing this is by considering identity-altering actions. Imagine that I pass my friend on the street and I stop to chat. She and I will now be on a different trajectory than we would have been otherwise. We will interact with different people, at a different time, in a different place, or in a different way than if we hadn’t paused. This will eventually change the circumstances of a conception event such that a different person will now be born because we paused to speak on the street. Now, when the person who is conceived takes actions, I will be causally responsible for those actions and their effects. I am also causally responsible for all the effects flowing from those effects.
This is an example of simple cluelessness, which I don’t think is problematic. In the above example, I have no reason to believe that the many consequences that would follow from pausing would be better than the many consequences that follow from not pausing. I have evidential symmetry between the two following claims:
- Pausing to chat would have catastrophic effects for humanity
- Not pausing to chat would have catastrophic effects for humanity
And similarly, I have evidential symmetry between the two following claims:
- Pausing to chat would have miraculous effects for humanity
- Not pausing to chat would have miraculous effects for humanity
(I’m assuming there’s nothing particularly special about this chat - eg. we’re not chatting about starting a nuclear war or influencing AI policy.)
The same holds for all resulting states of the world between catastrophe and miracle: I have evidential symmetry between act-consequence pairs. By evidential symmetry between two actions, I mean that, though massive value or disvalue could come from a given action, these effects could equally easily, and in precisely analogous ways, result from the relevant alternative actions. In the previous scenario, I assume that each of the possible people who might be born is as likely as any other to be the next Norman Borlaug, and each is as likely as any other to be the next Joseph Stalin.
So this situation isn’t problematic; the possible effects, though they are huge, cancel out in my expected value estimate.
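The cancellation can be made concrete with a toy calculation. This is only a sketch, not anything taken from the cluelessness literature: the outcome values, the credences, and the small direct benefit of the chat are all made-up numbers chosen for illustration.

```python
from fractions import Fraction

# Hypothetical values (arbitrary units) for long-term outcomes.
OUTCOME_VALUES = {"catastrophe": -1_000_000, "nothing special": 0, "miracle": 1_000_000}

# Evidential symmetry: my credences over long-term outcomes are the
# same whichever act I choose. These probabilities are assumptions.
SHARED_CREDENCES = {
    "catastrophe": Fraction(1, 1000),
    "nothing special": Fraction(997, 1000),
    "miracle": Fraction(2, 1000),
}

def long_term_ev(credences):
    """Expected long-term value: sum of value * probability over outcomes."""
    return sum(OUTCOME_VALUES[o] * p for o, p in credences.items())

# Hypothetical direct (near-term) value of each act.
DIRECT_VALUE = {"pause": 1, "don't pause": 0}

def total_ev(act):
    """Direct value plus the (shared) expected long-term value."""
    return DIRECT_VALUE[act] + long_term_ev(SHARED_CREDENCES)

# The long-term term is large but identical for both acts, so it drops
# out of the comparison and only the direct difference remains.
difference = total_ev("pause") - total_ev("don't pause")
print(difference)
```

The long-term term here is much bigger than the direct one, but because it is the same for both acts it plays no role in choosing between them. That is all simple cluelessness needs.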
Cluelessness is problematic in situations where we do not have evidential symmetry. For a pair of actions (act one and act two), we have complex cluelessness when:
- We have some reasons to think that the effects of act one would systematically tend to be substantially better than those of act two;
- We have some reasons to think that the effects of act two would systematically tend to be substantially better than those of act one;
- It is unclear how to weigh up these reasons against one another. (Here there is no evidential symmetry between act-consequence pairs. You have no EV estimate for taking one of the actions over another.)
(An explanation of what is meant by ‘systematically’ can be found in section 5 of Cluelessness, Hilary Greaves)
For example, we have some reasons to think that the long-term effects of a marginally higher economic growth rate would be good, for instance via driving more patient and pro-social attitudes. This would mean that taking action to increase economic growth could have much better effects than not taking the action. We also have some reasons to think that the long-term effects of a marginally higher economic growth rate would be bad, for instance via increased carbon emissions leading to climate change. This would mean that not taking the action that increases economic growth could be a much better idea. It’s not immediately obvious that one of these is better than the other, but we also can’t say they have equal expected value. That would need either evidential symmetry, or a very detailed EV estimate. (Evidential symmetry here would be something like: every way a higher growth rate would be good is also an equally plausible way it would be bad, eg. increased emissions are equally likely to be good as they are to be bad.)
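To see why complex cluelessness blocks an expected value estimate, here is a toy version of the growth example. It is only a sketch under invented assumptions: the two effect sizes and the range of weights are made up, and the single `weight_on_benefit` parameter is a hypothetical stand-in for the unclear weighing of reasons.

```python
from fractions import Fraction

# Hypothetical magnitudes (arbitrary units) for the two long-term channels.
BENEFIT_VIA_ATTITUDES = 100  # more patient, pro-social attitudes
HARM_VIA_EMISSIONS = 100     # climate damage from extra emissions

def net_long_term_value(weight_on_benefit):
    """Net value of faster growth, given how much weight the benefit
    channel gets relative to the harm channel (between 0 and 1)."""
    w = Fraction(weight_on_benefit)
    return w * BENEFIT_VIA_ATTITUDES - (1 - w) * HARM_VIA_EMISSIONS

# Assumed range of weights that all seem defensible; nothing picks one.
defensible_weights = [Fraction(3, 10), Fraction(1, 2), Fraction(7, 10)]

for w in defensible_weights:
    value = net_long_term_value(w)
    if value > 0:
        verdict = "boost growth"
    elif value < 0:
        verdict = "don't boost growth"
    else:
        verdict = "indifferent"
    print(f"weight {w}: net value {value} -> {verdict}")
```

The recommendation changes sign across weights that all look defensible. Unlike the street-chat case, nothing cancels, and there is no principled way to settle on one number.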
I think that complex cluelessness implies we should be very skeptical of interventions whose claim to cost-effectiveness is through their direct, proximate effects. As has been well argued elsewhere, the long-term effects of these actions probably dominate. But we don’t know what the long-term effects of many interventions are or just how good or bad they will be.
Actions we take today have indirect long-term effects, and they seem to dominate over the direct near-term effects. Unless we have evidential symmetry we cannot ignore these long-term effects. So it seems to me that, if we care about future people, we’ll have to justify our interventions via their long-term effects, not their proximate ones.
2. Direct Effects
What position are we in?
- We think our actions now have these huge effects on the future
- These effects seem morally relevant (again, assuming you value the future)
- These long-term effects dominate the proximate ones
- We’re trying to find the actions that we have good reason to believe are the most cost-effective at improving the world (because we’re trying to improve the world as much as we can, and we have limited resources)
The direct approach (eg. looking at QALYs or deaths averted) doesn’t look at all the effects of our actions. In particular, the biggest effects (the long-term ones) are ignored. I think this means we shouldn’t use this approach to determine which interventions are most cost-effective. To me, it makes more sense, even if you’re focused on traditionally near-termist causes like mental health, animal welfare, and global poverty, to evaluate interventions based on their long-term effects.
(Don't worry - I'm not going to start proving things by analogy! This is just an intuition pump and I'm aware that it breaks down.)
Imagine a hotel with 1,000 single-occupant rooms. You are in the control room of the hotel and you can push different buttons that will do different things to the hotel occupants. Every button does something to every person, but you don’t know exactly what. You think some buttons cause bliss, or torture, or death for people in particular rooms. For most rooms, it’s very hard (but somewhat tractable) to get data on how the inhabitants feel about you pushing particular buttons. Fortunately, for room #327, it’s much easier to find out how pushing different buttons affects the occupant. If you care about every occupant, should you:
- Get a bunch of data on how particular buttons affect room #327 and then press the buttons that you think are best for that one person
- Put your resources into estimating how different buttons affect all the rooms?
The direct approach is analogous to getting a bunch of data on how particular buttons affect room #327 and then pressing the buttons that you think are best for that one person.
This seems weird to me if you know that the buttons affect all 1,000 rooms. You might know that a button has good effects for room #327, but it could be torture for everyone in all the other rooms. Or there might be a button that doesn’t affect room #327 much but produces waves of meaningful bliss for everyone else.
My intuition here is that putting a lot of effort into finding out how different buttons affect all the rooms makes more sense. Then you can push the button that’s your best guess at being best for all 1,000 people in aggregate. Sure, it’s really hard to get data on how everyone is affected but that doesn’t mean we can just ignore it - it’s the most important consideration for which button to press.
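The gap between the two strategies can be shown with a tiny worked version of the hotel. This is just an illustration of the intuition pump, with two made-up buttons and invented welfare numbers; it assumes we already know every room's response, which is of course the hard part in reality.

```python
ROOMS = 1000
EASY_ROOM = 327  # the one room where data is cheap to collect

def effects(value_for_easy_room, value_for_other_rooms):
    """Per-room welfare change for one button (hypothetical numbers)."""
    return [
        value_for_easy_room if room == EASY_ROOM else value_for_other_rooms
        for room in range(1, ROOMS + 1)
    ]

# Two made-up buttons: one great for room 327 but mildly bad elsewhere,
# one neutral for room 327 but mildly good elsewhere.
BUTTONS = {
    "A": effects(value_for_easy_room=10, value_for_other_rooms=-1),
    "B": effects(value_for_easy_room=0, value_for_other_rooms=1),
}

def best_for_easy_room():
    """Pick the button with the best measured effect on room 327 only."""
    return max(BUTTONS, key=lambda b: BUTTONS[b][EASY_ROOM - 1])

def best_in_aggregate():
    """Pick the button with the best total effect across all rooms."""
    return max(BUTTONS, key=lambda b: sum(BUTTONS[b]))

for chooser in (best_for_easy_room, best_in_aggregate):
    button = chooser()
    print(chooser.__name__, "->", button, "total welfare:", sum(BUTTONS[button]))
```

With these numbers, optimising for room #327 picks the button that is worst overall. Nothing forces real interventions to look like this, but measuring one room well cannot rule it out.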
(Relevant post: Growth and the case against randomista development)
Under a long-termist framework, it's possible we could weigh the effects of work on different causes and decide that global poverty was the best thing to be working on. It could further be the case that current GiveWell recommended charities are the best way to go. But that whole analysis would have to be justified by the effects on future people via flow-through effects rather than effects on something like present-day QALYs.
For example, we might decide that marginally increasing economic growth isn’t too dangerous after all (e.g. because the negative effects of the poor meat eater problem, increased emissions, or increased anthropogenic existential risk are outweighed by the benefits). We might then take cost-effective actions to accelerate growth, perhaps focusing on poor countries. These might be things like charter cities or macroeconomic stabilisation, or something else we haven’t considered.
I’m confused about why some EAs who value the future and are interested in global poverty seem to prefer AMF, SCI, or GiveDirectly over these things (side note: even if you prioritise these, it’s really worth considering investing now so you can give more later). The way the EA community got to care about AMF was by analysis of a small subset of AMF’s effects. AMF has far more effects than those that are measured so, under this longtermist framework, we don’t have any evidence of the cost-effectiveness of AMF’s actions.
I think there might be good reasons to think that present-day QALYs or deaths averted are good correlates of total (long-term) value, perhaps because of flow-through effects. But I don’t think this is obvious at all, and I think the burden of proof is on those claiming the correlation between near-term QALYs and long-term value is strong. I don’t regularly see people justifying global poverty interventions based on their flow-through effects, and I’d love to see more of this (though, of course, it’s very difficult).
An interesting point here is that, if it were true that the most effective global poverty interventions turned out to be broad growth-boosting interventions, the EA position would come a little closer to the mainstream development economics view - which I think is reassuring.
(Relevant post: Should Longtermists Mostly Think About Animals?)
(I don't know much about animal welfare interventions at all, so expect I'm missing something here.)
People who value future nonhuman animals might achieve their goals better if they asked more questions like:
- ‘How can we increase the probability of factory farming ending in the next 100 years?’
- ‘How can we reduce the probability that factory farming continues for thousands of years?’
- ‘How can we reduce the probability of humanity spreading wild animal suffering across the cosmos?’
I think questions like the following seem valuable only insofar as they contribute to the first kind of question:
- ‘How can we avert the most present-day suffering for a given amount of money?’
- ‘How can we make present-day factory farmed animals suffer less?’
Again, it could be that ACE-recommended charities are the best place to donate and that current strategies (like corporate campaigns or working on clean meat) are the best kinds of direct work available. But the most effective interventions are the ones that are most effective across all time, not just the next few years or decades. Why? Because the long-term effects of animal welfare interventions will vastly dominate the near-term effects of those interventions.
Similarly for mental health, I’d argue that we don’t want to focus on buying QALYs now - we want to do long-lasting things like answering foundational questions, building an effectiveness-minded mental health field, and setting up institutions that will improve long-term mental health. For example, I’m excited about the research that HLI and QRI are doing. Of course, we need to roll out proposed interventions once they come around. We’ll need to test them and this will involve measurement of direct effects. But the primary value of this exploration is in the information value, and the field-building effects, not the direct welfare benefits.
Comparison to X-risk reduction
This focus on long-term field-building and trajectory change is different to biorisk, or short-timeline AI safety. For these two causes, there is risk of lock-in of some very bad state (extinction, or worse) sometime soon. This means it’s more urgent to do direct work right now to avoid the lock-in.
You could push back on this distinction by saying that there is risk of astronomical poverty lock-in or animal suffering lock-in in the next 200 years. Perhaps we will start space colonisation in that time and then fall into some weird Malthusian-style situation later on (see This is the Dream Time, and Potatonium (though the situation described here might be a good one)). Or perhaps we’ll expand to other planets and bring wild animals or factory farms with us. These things are concerning but they don’t seem to obviously point to donating to ACE or GiveWell charities as the solution.
3. Objections
[The point of this post is that I don't adequately understand the best arguments against my view. So my understanding of the objections to my view is obviously limited]
Near-term work is more certain
Objection: The route to value of some types of long-term work is highly uncertain, with very small probabilities of very large payoffs. If I want to be sure that I do at least some good, maybe I should prioritise more certain near-term work.
Response: If we care about all the effects of our actions, it’s not clear that near-term interventions are any less speculative than long-term interventions. This is because of the dominating but uncertain long-term effects of near-term interventions.
Near-term work is more evidence-based
Objection: For any action, it’s usually much harder to get evidence about its long-term effects than its near-term effects. So, given that we are using evidence to improve the world, maybe we should focus on the effects we can measure. It could be much easier to make a dent in near-term problems because we have much more evidence about them.
Response: It’s true that we don’t have much evidence about the long-term effects of our actions. But if we think those effects are morally relevant, we cannot ignore them (this is complex cluelessness, not simple). Rather, we should invest resources in getting more evidence about those effects. Unfortunately, this evidence isn’t going to be through randomised controlled trials (RCTs) or anything as rigorous as that. I agree that longtermism presents a huge epistemic challenge and, if we want to help people as much as possible, we have to deeply understand the past, and build excellent models of the future. We’ll need to get much better at rationality, forecasting, and generally understanding the world to do this.
[This is related, particularly the introduction: Reality is often underpowered]
Long-term work is subject to bias
Objection: Because the evidence we have about long-term effects is weak, there is much more weight placed on subjective judgements and expert opinion rather than RCTs or other data. In these situations, we might expect our cause prioritisation to be tracking the wrong thing - like the biases, interests, or preferences of people in the community. For example, maybe part of the reason the EA community values MIRI is because of Eliezer’s idiosyncrasies. In contrast, the EA community might value AMF because of impartial, dispassionate analysis.
Response: I think this is a good point, and something to be aware of. To me, it seems to point to doing better analysis of long-term effects, rather than to ignoring long-term effects. I'm not sure anyone uses this objection, but I'd be interested to hear what such people think about the effect size of bias compared to the effect size of working on long-term causes.
If we have any effectiveness estimates at all, they are for near-term work
Objection: If we can’t get effectiveness estimates of something as measurable as AMF, how could we ever get estimates of intangible long-term effects or speculative interventions?
Response: It’s true we don’t have robust cost-effectiveness estimates for long-term interventions in the same way that we have robust cost-effectiveness estimates for the near-term effects of some things. However, there has been lots of work done prioritising between long-term causes and we do have some best guesses about the most effective things to work on.
We have a better idea of OpenAI's long-term effects than AMF's, just because we've thought more about the long-term effects of OpenAI, and it's targeting a long-term problem.
We're uncertain in our estimate of OpenAI's effectiveness. This uncertainty is unfortunate but that doesn’t mean we can ignore the future people that OpenAI is trying to help. If we’re trying to help others as much as possible, we’re going to have to deal with lots of difficulties and lots of uncertainties.
Long-term effects don’t persist
Objection: What makes me think that long-term effects tend to persist in the future, rather than slowly fading out? If I drop a stone into a pond, it has a large local effect. But then the ripples spread out and eventually it’s like I never dropped the stone at all. Maybe near-term interventions are like this. This is different to saying the long-term effects ‘cancel out’ in expectation - maybe they just disappear. If that’s true, then the biggest effects of an intervention are the near-term effects.
Response: One way we can see that long-term effects seem to persist is through identity-altering actions, as described in the ‘simple cluelessness’ section above. Once my decisions affect a conception event, I am causally responsible for everything that the conceived person counterfactually does. I am causally responsible for the effects of those things and for the effects of those effects and so on. As time goes on, I will be causally responsible for more and more effects, not fewer and fewer.
(Maybe there are domains in which effects are likely to wash out rather than persist. I haven’t read anything about this though.)
What’s good in the near-term is good in the long-term
Objection: If we improve the world today, that’s likely to lead to a better world tomorrow, if the ways in which it’s better are sustainable or likely to compound. For example, if I help the poorest people now, that will put the world in a better state in 100 years’ time.
Response: This is basically saying that the flow-through effects of near-term interventions tend to be good. As discussed earlier, I think it’s possible that they are (though this is a hard and non-obvious question). But this doesn’t mean that we should justify interventions based on their near-term effects and look for whichever interventions have the best near-term effects. To me, it implies we should look for things with the best flow-through effects and justify interventions by those effects. Otherwise, we might just succumb to Goodhart's Law.
Considering long-term effects leads to inconsistency
Objection: In my daily life, I don’t consider the long-term effects of my actions. If I delay someone on the street, I’m not worried about causing the next Stalin to be conceived. If I did do that, I’d never be able to do anything. It’s consistent to have a decision procedure that applies both to daily life and to improving the world.
Response: In daily life, we often have simple cluelessness because we have evidential symmetry, as described above. We have no more reason to believe that the effects of delaying someone will be good than we have to believe that they will be bad. Every way that affecting a conception event could be good is also a way that it could be bad. However, the ways that the long-term effects of a near-term intervention could be good are not the exact same ways that they could be bad. So we don’t have evidential symmetry and it’s consistent to behave differently in this different case.
Also, in daily life, we have goals that are not maximally, impartially welfarist so it makes sense to act differently.
Considering long-term effects leads to analysis paralysis
Objection: We are in triage every second of every day. Every day that we wait for better understanding of long-term effects is time that we are not helping people right now.
Response: Yes, we are in triage. We want to end factory farming, human diseases, and wild animal suffering that's happening. We want to make sure humanity is safe from asteroids, nuclear war, and misaligned AI so that we can go on to treat all beings fairly and fill the universe with meaningful joy. We can’t do all of these things right now so we’ve decided to pick the problems where we think we can make the biggest difference. But just as triage doesn’t mean that we should necessarily prioritise the first person we see on the street, it doesn’t mean that we should necessarily prioritise beings alive right now. Triage means finding the very best opportunities for doing good and then taking them. It might be that, if we want to do the most good, we have to spend a bit more time on finding opportunities than taking them right now.
Near-term work is more aligned with elite common sense
Objection: We should have elite common sense as a prior. Long-term interventions tend to be weird, wacky, and unconventional so we should be pretty sceptical of them for outside-view reasons.
Response: The recommendation from the linked post is to believe what you think a broad coalition of trustworthy people would believe if they were trying to have accurate views and they had access to your evidence. I think there’s a way this could point to focusing on near-term effects but I can’t see what it is. My perspective is that the EA community is a broad coalition of trustworthy people who have access to my evidence and are trying to have accurate views. It seems like, as people spend more time in EA, they become more longtermist. So this idea seems to point to longtermism. In general, it doesn’t seem that unconventional to value the future; the unconventional bit is acting on those values. This is where EA diverges from common sense, but it does so just as much for near-term interventions as for long-term interventions (from my perspective). That is, FHI is unconventional, but so is GiveWell.
It seems to me that:
- Our actions have dominating long-term effects that we cannot ignore
- If you care about future people, it's best to pick your interventions based on (your best guess at) those dominating long-term effects
So, what am I missing? If you do value future people and you look to the direct effects of interventions, why is this?