Sarah Weiler

Research fellow (AI Governance) @ Global Policy Research Group
460 karma · Joined Oct 2020 · Working (0-5 years) · Innsbruck, Austria
www.globalprg.org/aigovernanceprogram

Participation: 4

  • Completed the Introductory EA Virtual Program
  • Completed the In-Depth EA Virtual Program
  • Attended an EA Global conference
  • Attended more than three meetings with a local EA group

Sequences: 1

Wrapping my head around the nuclear risks cause area

Comments: 36

Thanks for writing this up, Oscar! I largely disagree with the (admittedly tentative) conclusions, and am not sure how apt I find the NIMBY analogy. But even so, I found the ideas in the post helpfully thought-provoking, especially given that I would probably fall into the cosmic NIMBY category as you describe it. 

First, on the implications you list: I would be quite concerned if some of them were adopted by many longtermists (who would otherwise try to do good differently):

Support pro-expansion space exploration policies and laws

Even accepting the moral case for cosmic YIMBYism (that aiming for a large future is morally warranted), it seems far from clear to me that pro-expansion space exploration policies would actually improve expected wellbeing for the current and future world. Such policies and laws could share many of the downsides that colonialism and expansionism have had in the past:

  • Exploitation of humans & the environment for the sake of funding and otherwise enabling these explorations; 
  • Planning problems: Colonial-esque megaprojects like massive space exploration likely constitute a bigger task than human planners can reasonably take on, leading to large chances of catastrophic errors in planning & execution (as evidenced by past experiences with colonialism and similarly grand but elite-driven endeavours)
  • Power dynamics: Colonial-esque megaprojects like massive space exploration seem prone to reinforcing the prestige, status, and power of those capable of and willing to support such grand endeavours - people who, judging by historical colonial-esque megaprojects, do not have a strong track record of being well-suited to moral leadership and welfare-enhancing action (you do acknowledge this when you talk about ruthless expansionists and Molochian futures, but I think it warrants more concern and worry than you grant);
  • (Exploitation of alien species - if there happened to be any, which may be unlikely? I have zero knowledge of the debates on this.)

This could mean that it is more neglected and hence especially valuable for longtermists to focus on making the future large conditional on there being no existential catastrophe, compared to focusing on reducing the chance of an existential catastrophe.

It seems misguided and, to me, dangerous to go from "extinction risk is not the most neglected thing" to "we can assume there will be no extinction and should take actions conditional on humans not going extinct". My views here depend to some extent on empirical beliefs which you might disagree with (curious to hear your response!): I think humanity's chances of averting global catastrophe in the next few decades are far from comfortably high, and while the path from global catastrophe to existential peril is largely unpredictable, it does not seem inconceivable that such a path will be taken. I think there are far too few earnest, well-considered, and persistent efforts to reduce global catastrophic risks at present. Given all that, I'd be quite distraught to see a substantial fraction (or even a few members) of those concerned about the future switch from reducing x-risk (or global catastrophic risk) to speculatively working on "increasing the size of the possible future", on the assumption that there will be no extinction-level event to preempt that future in the first place.

--- 

On the analogy itself: it doesn't resonate super strongly with me (though it does resonate a bit), because my definition of, and frustration with, local NIMBYism differs from what you describe in the post.

In my reading, NIMBYism is objectionable primarily because it is a short-sighted and unconstructive attitude that obstructs efforts to combat problems affecting all of us; what bugs me most about NIMBYs is not their lack of selflessness but their failure to understand that everyone, including themselves, would benefit from the actions they are trying to block. For example, NIMBYs objecting to high-rise apartment buildings seem to me mistaken in their belief that such buildings would decrease their welfare: the lack of these apartment buildings makes it harder for many people to find housing, which exacerbates homelessness and local poverty, which in turn decreases living standards for almost everyone in the area (including those who have the comfort of a spacious family house, unless they are among the minority who enjoy or don't mind living amidst preventable poverty and, possibly, heightened crime). It is a stubborn blindness to arguments of that kind, and an unwillingness to weigh common, longer-term needs against short-term, narrowly construed self-interest, that form the core characteristic of local NIMBYs in my mind.

The situation seems different for the cosmic NIMBYs you describe. I might well be working with an unrepresentative sample, but most of the people I know or have read who consciously reject cosmic YIMBYism do so not primarily on grounds of narrow self-interest but for moral reasons (population ethics, non-consequentialist ethics, etc.) or empirical reasons (the incredibly low tractability of today's efforts to influence the specifics of far-future worlds; fixing present and near-future problems as the best means to increase wellbeing overall, including in the far future). I would be surprised if local NIMBYs were motivated by similar concerns, and I might actually shift my assessment of local NIMBYism if it turned out that they are.

Update (2024-03-27): This comment, with its very clear example getting to the bottom of our disagreement, has been extremely helpful in pushing me to reconsider some of the claims I make in the post. I have somewhat updated my views over the last few days (see the section on "the empirical problem" in the Appendix I added today), and this comment was influential in helping me do that. Gave it a Delta for that reason; thanks Jeff!

While I now more explicitly acknowledge and agree that, when measured in terms of counterfactual impact, some actions can have hundreds of times more impact than others, I retain a sense of unease when adopting this framing:

When evaluating impact differently (e.g. through Shapley-value-like attribution of "shares of impact", or through a collective rationality mindset - see comments here and here for what I mean by that), it seems less clear that the larger donor is 100x more impactful than the smaller donor. One way of reasoning about this: the person donating $100,000 probably (necessarily?) needed more preceding actions to reach the point where she is able and willing to donate that much, and there will probably (necessarily?) be more subsequent actions needed to make the money count, i.e. to ensure it has positive consequences. There are then many more actors and actions among which the impact of the $100,000 donation has to be apportioned, and it is not clear that the larger donor will appear vastly more impactful from this different perspective/measurement strategy...
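To make the "apportioning" intuition concrete, here is a minimal sketch in Python - entirely my own toy construction, with made-up numbers and a single hypothetical "implementer" standing in for all the preceding and subsequent actions that make a donation count:

```python
from itertools import permutations

# Toy cooperative game (my construction, not from the post): two donors
# plus one hypothetical "implementer" who stands in for all the enabling
# actions. Donations only turn into impact if the implementer participates.
DONATIONS = {"big_donor": 100_000, "small_donor": 1_000}
PLAYERS = list(DONATIONS) + ["implementer"]

def value(coalition):
    """Impact of a coalition: donations only count if someone implements."""
    if "implementer" not in coalition:
        return 0
    return sum(DONATIONS[p] for p in coalition if p in DONATIONS)

def shapley_values(players, v):
    """Average each player's marginal contribution over all join-orders."""
    totals = dict.fromkeys(players, 0.0)
    orderings = list(permutations(players))
    for order in orderings:
        coalition = set()
        for player in order:
            before = v(coalition)
            coalition.add(player)
            totals[player] += v(coalition) - before
    return {p: total / len(orderings) for p, total in totals.items()}

print(shapley_values(PLAYERS, value))
# -> {'big_donor': 50000.0, 'small_donor': 500.0, 'implementer': 50500.0}
```

In this minimal version the ratio between the two donors happens to survive (50,000 vs 500), but half of the total impact is now attributed to the enabling actor, and every additional necessary enabler would shrink the donors' shares further - which is the direction my unease points in.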

You can shake your head and claim - rightly, I believe - that this is irrelevant for deciding whether donating $100,000 or donating $1,000 is better. Yes, for my decision as an individual, calculating the possible impact of my actions by assessing the likely counterfactual consequences resulting directly from the action will sometimes be the most sensible thing to do, and I’m glad I’ve come to realise that explicitly in response to your comment.

But I believe that recognising and taking seriously this fact - that, viewed through a different lens, my donating $100,000 does not make me individually responsible for 100x more impact than the donor of $1,000 - can be relevant for decisions in two ways:

  1. It prevents me from discounting and devaluing all the other actors that contribute vital inputs (even if they are "easily replaceable" as individuals)
  2. It encourages me to take actions that may facilitate, enable, or support large counterfactual impact by other people. This perspective also encourages me to consider actions that may have a large counterfactual impact themselves, but in more indirect and harder-to-observe ways (even if I appear easily replaceable in theory, it's unclear whether I will be replaced in practice, so the counterfactual impact seems extremely hard to determine; what is very clear is that by performing a relevant supportive action, I will be contributing something vital to the eventual impact).

If you find the time to come back to this so many days after the initial post, I'd be curious to hear what you think about these (still somewhat confused?) considerations :)

Thanks a lot for that comment, Dennis. You might not believe it (judging by your comment towards the end), but I did read the full thing and am glad you wrote it all up!

I come away with the following conclusions:

  1. It is true that we often credit individuals with impacts that were in fact the results of contributions from many people, often over long times. 
  2. However, there are still cases where individuals can have outsize impact compared to the counterfactual case where they do not exist. 
  3. It is not easy to say in advance which choices or which individuals will have these outsize influences …
  4. … but there are some choices which seem to greatly increase the chance of being impactful. 

Put this way, I have very little to object to. Thanks for providing that summary of your takeaways - I think it will be quite helpful as I continue to puzzle out my updated beliefs in response to all the comments the essay has received so far (see statements of confusion here and here).

For example, anyone who thinks that being a great teacher cannot be a super-impactful role is just wrong. But if you do a very simplistic analysis, you could conclude that. It’s only when you follow through all the complex chain of influences that the teacher has on the pupils, and that the pupils have on others, and so on, that you see the potential impact.

That's interesting. I think I hadn't really considered the possibility of putting really good teachers (and similar people-serving professions) into the super-high-impact category, and then my reaction was something like "If obviously essential and super important roles like teachers and nurses are not amongst the roles a given theory considers relevant and worth pursuing, then that's suspicious and gives me reason to doubt the theory." I now think that maybe I was premature in assuming that these roles would necessarily lie outside the super-high-impact category?

The real question, even if not always posed very precisely, is: for individuals who, for whatever reason, find themselves in a particular situation, are there choices or actions that might make them 100x more impactful? [...] And yet, it feels like there are choices we make which can greatly increase or decrease the odds that we can make a positive and even an outsize contribution. And I’m not convinced by (what I understand to be) your position that just doing good without thinking too much about potential impact is the best strategy.

I think the sentiment behind those words is one that I wrongly neglected in my post. For practical purposes, I agree that it can be useful and warranted to take seriously the possibility that some actions will have much higher counterfactual impact than others. I continue to believe that there are downsides and perils to the counterfactual perspective, and that it misses some relevant features of the world; but I can now also see more clearly that it has significant upsides and can often be a powerful tool for making the world better (if used in a nuanced way). Again, I haven't settled on a neat stance that brings my competing thoughts together, but I feel like some of your comments above will get me closer to that goal of conceptual clarification - thanks for that!

I feel like there are some models of how markets work that quite successfully predict macro behaviour of systems without knowing all the local individual factors?

You're right that you're more optimistic than me on this one. I don't think we have good models of that kind in economics (or at least I haven't come across any; I have looked a little, but am far from knowing every modeling attempt ever made, so I may have missed the good, empirically reliable ones).

I do agree that "we can make, in some cases, simple models that accurately capture some important features of the world" - but my sense is that in the social sciences (or wherever the object of interest is societal or human), the features we are able to capture accurately are only a (small) selection of the ones relevant for reasonably assessing something like "my expected impact from taking action X". And my sense is also that many (certainly not all!) people who like to use models to improve their thinking about the world over-rely on the information they gain from the model and forget that these other, model-external features also exist and matter for real-life decision-making.

[The thoughts expressed below are tentative and reveal lingering confusion in my own brain. I hope they are somewhat insightful anyways.]

This seems on-point and super sensible as a rough heuristic (not a strict proof) when looking at impact through a counterfactual analysis that focuses mostly on direct effects. But I don't know if and how it translates to other perspectives on assessing impact. If there never were high-impact opportunities in the first place - because impact is dispersed across the many actions needed to bring about desired consequences - then it doesn't matter whether many or only a few people try to grab these opportunities from the table, because there would be nothing to grab in the first place.

Maybe the example helps to explain my thinking here (?): If we believe that shrimp/insect welfare can be improved significantly by targeted interventions that a small set of people push for and implement, then your case for it being a high-impact opportunity is much more reasonable than if we believe that actual improvements in this area will require a large-scale effort by millions of people (researchers, advocates, implementers, etc.). I think most desirable change in the world is closer to the latter category.*

*Kind of undermining myself: I do recognise that this depends on what we "take for granted", and I tentatively accept that in many concrete decision situations it makes sense to take more for granted than I am inclined to (the infrastructure we use for basically everything, many of the implementing and supporting actions needed for an intervention to actually have positive effects, etc.), in which case more possible positive changes in the world might fall closer to the former category (i.e. changes that can be brought about by a small group of individuals).

So I agree that there is a danger of thinking too much of oneself as some sort of ubermensch do-gooder, but the question of to what extent impact varies by person or action is separate.

I think that makes sense and is definitely a take that I feel respect (and gratitude/hope) for.

I think it is lamentable but probably true that some people's lives will have far greater instrumental effects on the world than others.

Even after a week of reflecting on the empirical question (do some people have magnitudes higher impact than others?) and the conceptual question (which impact evaluation framework - counterfactual, Shapley value attribution, something else entirely - should we use to assess levels of impact?), I remain uncertain and confused about my own beliefs here (see more in my comment on the polio vaccine example above). So I'm not sure what my current response to your claim - "[it's] probably true that some people's lives will have far greater instrumental effects on the world than others" - is or should be.

[The thoughts expressed below are tentative and reveal lingering confusion in my own brain. I hope they are somewhat insightful anyways.]

but I think the counterfactual is illustrative

Completely agree! The concept of counterfactual analysis seems super relevant to explaining how and why some of my takes in the original post differ from "the mainstream EA narrative on impact". I'm still trying to puzzle out exactly how my claims in "The empirical problem" link to the counterfactual analysis point: Do I think my claims are irrelevant to a counterfactual impact analysis? Do I, in other words, accept and agree that impact between actions/people differs by several magnitudes when calculated via counterfactual analysis methods? How can I best name, describe, illustrate, and maybe defend the alternative perspective on impact evaluations that seems to inform my thinking in the essay and in general? And what role does and should counterfactual analysis play in my thinking alongside that alternative perspective?

To discuss with regard to the polio example: I see the rationale for claiming that the vaccine inventors are somehow more pivotal because they are less easily replaceable than all the people performing supportive and enabling actions. But just because an action is replaceable doesn't mean it's unimportant. It is a fact that the vaccine discovery could not have happened, and would not have had any positive consequences, if the supporting and enabling actions had not been performed by somebody. I can't help feeling that this is relevant and important when I think about the impact I as an individual can have. On some level, it seems true to say that as an individual, living in a world where everything is embedded in society, I cannot have any meaningful impact on my own; all effects I can bring about will be brought about by myself and many other people, and if only I acted, no meaningful effects could occur. Should all of this really just be ignored when thinking about impact evaluations and my personal decisions (as seems to happen in counterfactual analyses)? I don't know.

I think it is uncontroversial that at least on the negative side of the scale some actions are vastly worse than others, e.g. a mass murder or a military coup of a democratic leader, compared to more 'everyday' bads like being a grumpy boss.

Agreed! I share the belief that there are huge differences in how bad an action can be, and that there's some value in distinguishing between very bad and only slightly bad ones. I didn't think this was important to mention in my post, but if it came across as suggesting that we should basically only think in terms of three buckets, I communicated poorly - I agree that would be too crude.

It feels pretty hard to know which actions are neutral, for many of the reasons you say that the world is complex and there are lots of flow-through effects and interactions.

Strongly agreed! I share the worry that identifying neutral actions would be extremely hard in practice - it took me a while to settle on "bullshit jobs" as a representative example in the original post, and I'm still unsure whether it's a solid case of a "neutral action". But for me, this uncertainty reinforces the case for more research and thinking to identify actions with significantly positive outcomes vs actions that are basically neutral. I find myself believing that dividing actions into "significantly positive" vs "everything else" is epistemologically more tractable than dividing them into "the very best" vs "everything else". (I'd agree there is a complementary quest - identifying very bad actions and roughly scoring how bad they would be - that is worth pursuing alongside either of the two options mentioned in the last sentence; maybe I should have mentioned this in the post?)

Identifying which positive actions are significantly so versus insignificantly so feels like it just loses a lot of information compared to a finer-grained scale.

I think I disagree, mostly for epistemological reasons: I don't think we have much access to that information at a finer-grained scale, and given that, giving up on finding such information wouldn't be a great loss - there isn't much to lose in the first place.

I think I might also disagree from a conceptual or strategic standpoint: my thinking on this - especially when it comes to catastrophic risks, maybe a bit less for global health & development / poverty - tends to be more about "what bundle of actions and organisations and people do we need for the world to improve towards a state that is more sustainable and exhibits higher wellbeing (/less suffering)?" For that question, knowing and contributing to significantly good actions seems to be of primary importance, since I believe that we'll need many of these good actions - not just the very best ones - for eventual success anyways.

Since publishing this essay and receiving a few comments defending (or taking for granted) the counterfactual perspective on impact analysis, I've come to reconsider whether I should base my thinking on that perspective more often than I currently do. I remain uncertain and undecided on that point for now, but feel relatively confident that I won't end up concluding that I should pivot to only or primarily using the counterfactual perspective (vs. the "collective rationality / how do I contribute to success at all" perspective)... Curious to hear if all that makes some sense to you (though you might continue to disagree)?

Shapley values are a great tool for divvying up attribution in a way that feels intuitively just, but I think for prioritization they are usually an unnecessary complication. In most cases you can only guess what they might be because you can't mentally simulate the counterfactual worlds reliably, and your set of collaborators contains billions of potentially relevant actors. [emphasis added]

From what I've learned about Shapley values so far, this seems to mirror my takeaway. I'm still giving myself another 2-3 days until I write up a more fleshed-out response to the commenters who recommended looking into Shapley values, but I might well end up just copying some version of the above; so thanks for formulating and putting it here already!

(I think if EAs were more individualist, “the core” from cooperative game theory would be more popular than the Shapley value.)

I do not understand this point but would like to (since the stance I developed in the original post went more in the direction of "EAs are too individualist"). If you find the time, could you explain, or point me to resources on, what you mean by "the core from cooperative game theory", and how that links to (non-)individualist perspectives and to impact modeling?
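(My tentative reading after a quick search - which may well be wrong, and is exactly the kind of thing I'd like confirmed: for a cooperative game with player set $N$ and value function $v$, the core seems to be the set of payoff allocations that no coalition could improve on by splitting off on its own,

$$\mathrm{Core}(N,v) = \Big\{ x \in \mathbb{R}^N \;\Big|\; \sum_{i \in N} x_i = v(N) \text{ and } \sum_{i \in S} x_i \ge v(S) \text{ for all } S \subseteq N \Big\}.$$

If that is right, I can vaguely see why one might call it the more "individualist" solution concept - every sub-group is guaranteed at least its stand-alone value - but I'd still appreciate a pointer.)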

Oh, and we get so caught up in the object-level here that we tend to fail to give praise for great posts: Great work writing this up! When I saw it, it reminded me of Brian Tomasik's important article on the same topic, and sure enough, you linked it right before the intro! I'm always delighted when someone does their research so well that whatever random spontaneous associations I (as a random reader) have are already cited in the article!

Very glad to read that, thank you for deciding to add that piece to your comment :)!

Thanks for the comment! Just to make sure I understand correctly: the tails would partially cancel out in expected impact estimates because many actions with potentially high positive impact could also have potentially high negative impact if any of our assumptions are wrong? Or were you gesturing at something else? (Please feel free to simply point me to the post you shared if the answer is contained therein; I haven't had the chance to read it carefully yet.)
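(To make my reading concrete with entirely made-up numbers: if an action has a 10% chance of producing +1,000 units of impact but also an 8% chance of producing -1,000 units should some key assumption fail, its expected impact is 0.10 × 1,000 - 0.08 × 1,000 = 20 - far smaller than the optimistic tail alone would suggest. Is that the kind of cancelling-out you had in mind?)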
