I have previously encountered EAs who hold beliefs about EA communication that seem jaded to me. These beliefs are, roughly, either “Trying to make EA seem less weird is an unimportant distraction, and we shouldn’t concern ourselves with it” or “Sounding weird is an inherent property of EA/EA cause areas, and making it seem less weird is not tractable, or at least not without compromising important aspects of the movement.” I would like to challenge both of these views.

“Trying to make EA seem less weird is unimportant”

As Peter Wildeford explains in this LessWrong post:

People take weird opinions less seriously. The absurdity heuristic is a real bias that people -- even you -- have. If an idea sounds weird to you, you're less likely to try and believe it, even if there's overwhelming evidence.

Being taken seriously is key to many EA objectives, such as growing the movement, getting mainstream researchers to care about the EA risks of their work, and having policymakers give weight to EA considerations. Sci-fi-sounding ideas make it harder to achieve these, while making it easier for critics to mischaracterize the movement and (probably) contributing to the perception that EA is cult-like.

On a more personal note, and perhaps more relevant to some of the examples I am going to mention, it is also nice for friends and family to be able to understand why we do what we do, which I don’t think is a trivial desire to have.

All things considered, I think it would be better if EA ideas did not sound weird to outsiders, and instead sounded intuitive and immediately persuasive. 

“EA is inherently weird, and making our ideas seem less weird is not tractable”

I don’t think this is true, and I think this view generally comes from people who haven’t spent much time trying to think creatively about this. Some ways of framing things are more compelling than others, and this is an area where we can iterate, innovate and improve. Here are a few examples of alternative ways we could talk about weird EA ideas:

AI risk

Talking about bad but less severe outcomes of AI misalignment besides paperclip maximizers, and only then saying “and it could even get as bad as paperclip maximizers,” requires less of a leap in imagination than opening with paperclip maximizers. It may be the case that we don’t even need to make general audiences consider paperclip maximizers at all, since the mechanisms needed to prevent them are the same as those needed to prevent the less severe and more plausible-sounding scenarios of the form "you ask an AI to do X, and the AI accomplishes X by doing Y, but Y is bad and not what you intended".

Longtermism

Due to scope insensitivity, referencing visuals that show just how much larger the future can be than the present is particularly emotionally powerful here, and makes the whole idea of working to improve the far future feel far less abstract. My favorite longtermist visualization is this one by Our World in Data, which I have saved on my phone so that I can reference it in conversations. (I think visualizations also work well to combat scope insensitivity for wild animal welfare and farmed animal welfare.)

Non-human Welfare

If it is the first time someone is contemplating the idea that insects or wild animals deserve moral consideration, it makes sense to want to give them the spiel with the least probability of being mocked and dismissed. If you start to explain it by saying we should spend money to enhance the lives of insects in the wild, the idea will probably get laughed out of the room. 

I think for insect welfare, the most palatable approach would be talking about inhumane pesticides and other ways humans actively harm insects, making insect welfare more comparable to farmed animal welfare than to wild animal welfare. Similarly, talking about helping wild animals during pandemics, famines, wildfires, etc. (problems humans also have) probably incites more compassion than talking about protecting them from being chased by lions. How sensible something sounds to a layperson seems correlated with how tractable it is, so tractability can be used as a proxy for how likely an idea is to be dismissed by the person you are talking to.

The point is not to commit the motte-and-bailey fallacy (and one must be careful not to do this), but that people will be more open to contemplating your idea if you go in motte first instead of bailey first. 

Other existential risks

I think the point about paperclip maximizers generalizes—it is sometimes not necessary to frame existential risks as existential risks. Work on most of them still has unusually high expected value even if the scenarios in question fall short of extinction, and this can be a preferable framing in some cases. Extinction-level events can be difficult to imagine and emotionally process, leading to overwhelm and inaction (see climate paralysis). We can say that serious pandemics are one of the “highest priority risks” for the international community due to their potential to kill hundreds of millions of people, and in many cases this would resonate more than the harder-to-conceive “existential risk” that could “lead to the extinction of humanity”. (Whether a problem poses extinction risk is, of course, still a relevant factor for cause prioritization.)

Also, as has been discussed many times already, longtermism is not a necessary prerequisite for caring about existential risk. The expected value in the short term is enough to make people care about it, so trying to pitch existential risk through longtermism requires convincing people of an additional, weird step and unnecessarily makes it less compelling to most.

Conclusion

My point is not necessarily that we should implement these specific examples, but that there are ways we can make our ideas more palatable to people. Also, there is the obvious caveat that the best way to talk about a topic will depend on your audience, but that doesn’t mean there aren’t some ways of communicating that work better most of the time, or work better for the general public.

Comments

I agree with the main point that we could sound less weird if we wanted to, but it seems unlikely to me that we want that.

since the mechanisms needed to prevent them are the same as those needed to prevent the less severe and more plausible-sounding scenarios of the form "you ask an AI to do X, and the AI accomplishes X by doing Y,  but Y is bad and not what you intended". 

This is just not true.

If you convince someone of a different, non-weird version of AI risk, that does not then mean that they should take the actions that we take. There are lots of other things you can do to mitigate the less severe versions of AI risk:

  1. You could create better "off-switch" policies, where you get tech companies to have less-useful but safe baseline systems that they can quickly switch to if one of their AI systems starts to behave badly (e.g. switching out a recommender system for a system that provides content chronologically).
  2. You could campaign to have tech companies not use the kinds of AI systems subject to these risks (e.g. by getting them to ban lethal autonomous weapons).
  3. You could switch to simpler "list of rules" based AI systems, where you can check that the algorithm the AI is using in fact seems good to you (e.g. Figure 3 here).

Most of these things are slightly helpful but overall don't have much effect on the versions of AI risk that lead to extinction.

(I expect this to generalize beyond AI risk as well and this dynamic is my main reason for continuing to give the weird version of EA ideas.)

This does seem to be an important dynamic.

Here are a few reasons this might be wrong (both sound vaguely plausible to me):

  1. If someone being convinced of a different non-weird version of an argument makes it easier to convince them of the actual argument, you end up with more people working on the important stuff overall.
  2. If you can make things sound less weird without actually changing the content of what you're saying, you don't get this downside (This might be pretty hard to do though.)

(1) is particularly important if you think this "non-weird to weird" approach will appeal to a set of people who wouldn't otherwise end up agreeing with your arguments. That would mean it has a high counterfactual impact - even if some of those people end up doing work that, whilst still good, is ultimately far less relevant to x-risk reduction. This is even more true if you think there's a low rate of people who would have just listened to your weirder-sounding arguments in the first place getting "stuck" at the non-weird stuff and as a result never doing useful things.

I agree with both of those reasons in the abstract, and I definitely do (2) myself. I'd guess there are around 50 people total in the world who could do (2) in a way where I'd look at it and say that they succeeded (for AI risk in particular), of which I could name maybe 20 in advance. I would certainly not be telling a random EA to make our arguments sound less weird.

I'd be happy about the version of (1) where the non-weird version was just an argument that people talked about, without any particular connection to EA / AI x-risk. I would not say "make EA sound less weird", I'd say "one instrumental strategy for EA is to talk about this other related stuff".

I think there are trade-offs here. Ideally, I think we want community builders to prioritise high-fidelity, nuanced and precise communication over appealing to more people, but we also want community builders to prioritise broader appeal over signalling that they are in the in-group or that they are intelligent. (We're all human, and I think it takes conscious effort to push back against the instincts to 1) make it clear you belong in the tribe by signalling in-groupness and 2) show that you possess traits that are valued by the tribe, like intelligence.)


I have a slightly negative reaction to this kind of thinking.

At the limit, there is a trade-off between reporting my beliefs without having bias in the sampling (i.e. lies by omission) and trying to convince people. If I mainly talk about how recommender systems are having bad effects on the discourse landscape because they are aligned, I am filtering evidence (and therefore imposing very high epistemic costs on my discussion partner in the process!)

In doing so, I would not only potentially be making the outside epistemic environment worse, but might also be damaging my own epistemics (or those of the EA community) via Elephant-in-the-brain-like dynamics, or by the conjecture that if you say something long enough, you become more likely to believe it yourself.

A good idea that came out of the discussion (point 3, "Bayesian Honesty") around Meta-Honesty was the heuristic that, when talking to another person, one shouldn't give information that would, in expectation, cause the other person to update in the wrong direction. I think the above proposals would sometimes skirt this line (and cross it when considering beliefs about the EA community, such as "EA mainly worries about recommender systems increasing political polarization").

Perhaps this is just a good reason for me not to be a spokesperson about AI risk (I am probably inappropriately married to the idea that truth is to be valued above everything else), but I wish that people would be very thoughtful about reporting misleading reasons why large parts of the EA community are extremely freaked out about AI (and not, as the examples would suggest, just a bit worried).

This is a good point, and I thought about it when writing the post—trying to be persuasive does carry the risk of mischaracterizing things in a flattering way or worsening epistemics, and we must be careful not to do this. But I don't think this is doomed to happen with any attempt at being persuasive, such that we shouldn't even try! I'm sure someone smarter than me could come up with better examples than the ones I presented. (For instance, the example about using visualizations seems pretty harmless—maybe attempts to be persuasive should look more like that than like the rest of the examples?)

Maybe we don't just want to optimize the messaging, but the messengers: Having charismatic & likeable people talk about this stuff might be good (to what extent is this already happening? Are MacAskill & Ord as good spokespeople as they are researchers?).

Furthermore, I agree that taking the WaitButWhy approach, with easily understandable visualizations, sounds like a good idea.

Oh, I like this idea! And love WaitButWhy.

I agree strongly with the central thesis of this post and the suggestions are both helpful and practical.  The following excerpt resonated especially strongly with me: 

"Being taken seriously is key to many EA objectives, such as growing the movement, getting mainstream researchers to care about the EA risks of their work, and having policymakers give weight to EA considerations."

Indeed, EA doesn't need to become less weird in order to further its objectives, but it should be possible to develop more layperson-friendly framings to that end.

It's kind of funny for me to hear about people arguing that weirdness is a necessary part of EA. To me, EA concepts are so blindingly straightforward ("we should try to do as much good with donations as possible", "long-term impacts are more important than short-term impacts", "even things that have a small probability of happening are worth tackling if they are impactful enough") that you have to actively modify your rhetoric to make them seem weird. 

Strongly agree with all of the points you brought up - especially on AI Safety. I was quite skeptical for a while until someone gave me an example of AI risk that didn't sound like it was exaggerated for effect, to which my immediate reaction was "Yeah, that seems... really scarily plausible".

It seems like there are certain principles that have a 'soft' and a 'hard' version - you list a few here. The soft ones are slightly fuzzy concepts that aren't objectionable, and the hard ones are some of the tricky outcomes you come to if you push them. Taking a couple of your examples:

Soft: We should try to do as much good with donations as possible

Hard: We will sometimes guide time and money away from things that are really quite important, because they're not the most important

 

Soft:  Long-term impacts are more important than short-term impacts

Hard: We may pass up interventions with known and highly visible short-term benefits in favour of those with long-term impacts that may not be immediately obvious

 

This may seem obvious, but with people who aren't familiar, leading with the soft versions - on the basis that the hard versions will come up soon enough if someone is interested or does their research - will be more effective in giving a positive impression than jumping straight to the hard stuff. But I see a lot more jumping than would be justified. I can see why, but if you were trying to persuade someone to join or have a good opinion of your political party, would you lead with 'we should invest in public services' or 'you should pay more taxes'?

Strong agree.

I've seen some discourse on Twitter along the lines of "EA's critics never seem to actually understand what we actually believe!" In some ways, this is better than critics understanding EA well and strongly opposing the movement anyway! But it does suggest to me that EA has a problem with messaging, and part of this might be that some EAs are more concerned with making technically-defensible and reasonable statements - which, to be clear, is important! - than with meeting non-EAs (or not-yet-EAs) where they're at and empathizing with how weird some EA ideas seem at first glance. 

I feel like a lot of the ideas aren't really perceived as that weird when I've discussed EA in intellectual circles unfamiliar with the concept. "Charity should first go to the most needy" is something most people espouse, even if they don't actually put it into action. A lot of my liberal friends are vegetarian or vegan for one reason or another and have strong opinions on animal abuse. The single most common complaint about politics is that it focuses too much on short-term incentives instead of long-term issues. That covers the top three; AI takeover? The only socially weird thing is how seriously EAs take it, but everyone has an idea of what AI takeover might look like. Many people disagree with EAs, but not more than people disagree with, say, climate change activists.

I don't think EA should be weird. All we're doing is filling the gaps to make sure everyone is taken care of. And of course we do it cost-effectively! Most people I talk to find that reasonable.

EA's top cause areas are neglected by others. By definition, they are unpopular and unusual. When discovering a new opportunity, we should therefore expect it to be weird. However, as more and more people interact with the new area, it becomes less and less weird to them. Ideally, it ends up no longer neglected, totally mainstream, and EA can incubate the next weird thing.

Weirdness is not an inherent property of the thing, it's a property of the relationship between the thing and its observer.

It may be the case that we don’t even need to make general audiences consider paperclip maximizers at all, since the mechanisms needed to prevent them are the same as those needed to prevent the less severe and more plausible-sounding scenarios

I’m somewhat unsure what exactly you meant by this, but if your point is “solutions to near-term AI concerns like bias and unexpected failures will also provide solutions to long-term concerns about AI alignment,” that viewpoint is commonly disputed by AI safety experts.

No, that's not what I mean. I mean we should use other examples of the form "you ask an AI to do X, and the AI accomplishes X by doing Y,  but Y is bad and not what you intended" where Y is not as bad as an extinction event.

I understand—and agree with—the overall point being made about “don’t just talk about the extreme things like paperclip maximizers”, but I’m still thrown off by the statement that “the mechanisms needed to prevent [paperclip maximizers] are the same as those needed to prevent the less severe and more plausible-sounding scenarios”

Hm,  yeah, I see where you're coming from. Changed the phrasing.

I appreciate you thinking through how to present these ideas to new people! I've also spent some time thinking about how to make things not seem as weird, and when that's useful.

One thought I had is that, while it's true that pandemics are really bad and don't need to be described as an existential risk for that to be the case, it feels like this relies strongly on other people thinking "what's an actual existential risk?" and then back-generating reasons why those things are also bad, separate from that. I think there are costs and benefits to that dual-step process, but one cost is that we lose focus on the actual discerning principle that generates good ideas, which seems more important to me than communicating those good ideas well (though I'd be really sad if we failed at getting a bunch of CS students thinking about AI Safety only because of framings).

Yes, this is true and very important. We should by no means lose sight of existential risks as a discerning principle! I think the best framing to use will vary a lot case-by-case, and often the one you outline will be the better option. Thanks for the feedback!