Thanks, Geoffrey, great points.
I agree that people should adopt advocacy styles that fit them and that the best tactics depend on the situation. What (arguably) matters most is making good arguments and raising the epistemic quality of (online) discourse. This requires participation, and if people want/need to use disagreeable rhetoric in order to do that, I don’t want to stop them!
Admittedly, it's hypocritical of me to champion kindness while staying on the sidelines and not participating in, say, Twitter discussions. (I appreciate your engagement there!) Reading and responding to countless poor and obnoxious arguments is already challenging enough, even without the additional constraint of always having to be nice and considerate.
Your point about the evolutionary advantages of different personality traits is interesting. However, (you obviously know this already) just because some trait or behavior used to increase inclusive fitness in the EEA doesn’t mean it increases global welfare today. One particularly relevant example may be the dark tetrad traits, which actually correlate negatively with Agreeableness (apologies for injecting my hobbyhorse into this discussion :) ).
Generally, it may be important to unpack different notions of being “disagreeable”. For example, this could mean, say, straw-manning or being (passive-)aggressive. These behaviors are often infuriating and detrimental to epistemics, so I (usually) don’t like this type of disagreeableness. On the other hand, you could also characterize, say, Stefan Schubert as being “disagreeable”. Well, I’m a big fan of this type of “disagreeableness”! :)
Thanks for writing this post!
You write:
While this is somewhat compelling, this may not be enough to warrant such a restriction of our search area. Many of the actors we should be concerned about, for our work here, might have very low levels of such traits. And features such as spite and unforgivingness might also deserve attention (see Clifton et al. 2022).
I wanted to note that the term 'malevolence' wasn't meant to exclude traits such as spite or unforgivingness. See for example the introduction which explicitly mentions spite (emphasis mine):
This suggests the existence of a general factor of human malevolence[2]: the Dark Factor of Personality (Moshagen et al., 2018)—[...] characterized by egoism, lack of empathy[3] and guilt, Machiavellianism, moral disengagement, narcissism, psychopathy, sadism, and spitefulness.
So to be clear, I encourage others to explore other traits!
Though I'd keep in mind that there exist moderate to large correlations between most of these "bad" traits, such that for most new traits we can come up with, there will exist substantial positive correlations with other dark traits we already considered. (In general, I found it helpful to view the various "bad" traits not as completely separate, orthogonal traits that have nothing to do with each other but as “[...] specific manifestations of a general, basic dispositional behavioral tendency [...] to maximize one’s individual utility— disregarding, accepting, or malevolently provoking disutility for others—, accompanied by beliefs that serve as justifications" (Moshagen et al., 2018).)
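As a toy illustration of this general-factor picture (a minimal sketch with entirely made-up loadings and sample sizes, not Moshagen et al.'s actual data or method): if several trait scores are driven by one shared latent disposition, a single component captures most of their common variance.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000

# Hypothetical latent "D factor" score per person.
d = rng.normal(size=n)

# Made-up loadings for four dark traits (e.g., spite, sadism, ...).
loadings = np.array([0.7, 0.6, 0.8, 0.5])
traits = d[:, None] * loadings + rng.normal(size=(n, 4)) * 0.6

# Share of variance captured by the first principal component of the
# correlation matrix; a large share suggests one general factor.
eigvals = np.linalg.eigvalsh(np.corrcoef(traits, rowvar=False))[::-1]
print("variance explained by first component:", eigvals[0] / eigvals.sum())
```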
Given this, I'm probably more skeptical that there exist many actors who are, say, very spiteful but exhibit no other "dark" traits—but there are probably some!
That being said, I'm also wary of going too far in the direction of "whatever bad trait, it's all the same, who cares" and losing conceptual clarity and rigor. :)
Thanks, makes sense!
I agree that confrontational/hostile tactics have their place and can be effective (under certain circumstances they are even necessary). I also agree that there are several plausible positive radical flank effects. Overall, I'd still guess that, say, PETA's efforts are net negative—though it's definitely not clear to me and I'm by no means an expert here. It would be great to have more research on this topic.[1]
I also think we should reconceptualize what the AI companies are doing as hostile, aggressive, and reckless. EA is too much in a frame where the AI companies are just doing their legitimate jobs, and we are the ones that want this onerous favor of making sure their work doesn’t kill everyone on earth.
Yeah, I'm sympathetic to such concerns. I sometimes worry about being biased against the more "dirty and tedious" work of trying to slow down AI or public AI safety advocacy. For example, the fact that it took us more than ten years to seriously consider the option of "slowing down AI" seems perhaps a bit puzzling. One possible explanation is that some of us have had a bias towards doing intellectually interesting AI alignment research rather than low-status, boring work on regulation and advocacy. To be clear, there were of course also many good reasons to not consider such options earlier (such as a complete lack of public support). (Also, AI alignment research (generally speaking) is great, of course!)
It still seems possible to me that one can convey strong messages like "(some) AI companies are doing something reckless and unreasonable" while being nice and considerate, similarly to how Martin Luther King very clearly condemned racism without being (overly) hostile.
Again, though, one amazing thing about not having explored outside game much in AI Safety is that we have the luxury of pushing the Overton window with even the most bland advocacy.
Agreed. :)
For example, present participants with (hypothetical) i) confrontational and ii) considerate AI pause protest scenarios/messages and measure resulting changes in beliefs and attitudes. I think Rethink Priorities has already done some work in this vein.
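To make that concrete, here is a minimal sketch (my own hypothetical illustration, not an existing Rethink Priorities design) of how one might analyze such a two-condition experiment; all numbers are simulated placeholders.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Simulated attitude shifts (post minus pre, e.g. on a 7-point support
# scale) for participants randomly assigned to one message framing each.
confrontational = rng.normal(loc=0.1, scale=1.0, size=200)  # condition i
considerate = rng.normal(loc=0.4, scale=1.0, size=200)      # condition ii

# Welch's t-test: does mean attitude change differ between framings?
t, p = stats.ttest_ind(considerate, confrontational, equal_var=False)
diff = considerate.mean() - confrontational.mean()
print(f"mean difference: {diff:.2f}, t = {t:.2f}, p = {p:.4f}")
```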
Great post, I agree with most of it!
Overall, I'm in favor of more (well-executed) public advocacy à la AI Pause (though I do worry a lot about various backfire risks; I also wonder whether a message like "AI slow" may be better), and I commend you for taking the initiative despite it (I imagine) being kinda uncomfortable or even scary at times!
(ETA: I've become even more uncertain about all of this. I might still be slightly in favor of (well-executed) AI Pause public advocacy but would probably prefer emphasizing messages like conditional AI Pause or AI Slow, and yeah, it all really depends greatly on the execution.)
The inside-outside game spectrum seems very useful. We might want to keep in mind another (admittedly obvious) spectrum, ranging from hostile/confrontational to nice/considerate/cooperative.
Two points in your post made me wonder whether you view the outside-game as necessarily being more on the hostile/confrontational end of the spectrum:
1) As an example of the outside game, you list “moralistic, confrontational advocacy” (emphasis mine).
2) You also write (emphasis mine):
Funnily enough, even though animal advocates do radical stunts, you do not hear this fear expressed much in animal advocacy. If anything, in my experience, the existence of radical vegans can make it easier for “the reasonable ones” to gain access to institutions.
This implicitly associates the outside game with radical stunts and with radical, “unreasonable” people.
However, my sense is that outside-game interventions (hereafter: activism or public advocacy) can differ enormously on the hostility vs. considerateness dimension, even while holding other effects (such as efficacy) constant.
The obvious example is Martin Luther King’s activism, perhaps most succinctly characterized by his famous “I Have a Dream” speech, which was non-confrontational and emphasized themes of cooperation, respect, and even camaraderie.[1] (In fact, King was criticized by others for being too compromising.[2]) On the hostile/confrontational side of the spectrum you had people like Malcolm X, or the Black Panther Party.[3] In the field of animal advocacy, you have organizations like PETA on the confrontational end of the spectrum and, say, Mercy for Animals on the more considerate side.
As you probably have guessed, I prefer considerate activism over more confrontational activism. For example, my guess is that King and Mercy for Animals have done much more good for African Americans and animals, respectively, than Malcolm X and PETA.
(As an aside and to be super clear, I didn’t want to suggest that you or AI Pause is or will be disrespectful/hostile and, say, throw paper clips at Meta employees! :P )
A couple of weak arguments in favor of considerate/cooperative public advocacy over confrontational/hostile advocacy:
Taking a more confrontational tone makes everyone more emotional and tense, which probably decreases truth-seeking, scout-mindset, and the general epistemic quality of discourse. It also makes people more aggressive, might escalate conflict, and can encourage dangerous emotional and behavioral patterns such as spite, retaliation, or even (threats of) violence. It may also help to bring about a climate where the most outrage-inducing message spreads the fastest. Lastly, since this is EA, here’s the obligatory option value argument: It seems easier to go from a more considerate to a more confrontational stance than vice versa.
As an aside (and contrary to what you write in the above quote), I have often heard the fear expressed that the actions of radical vegans will backfire. I’ve certainly witnessed that people were much less receptive to my animal welfare arguments because they’d had bad experiences with “unreasonable” vegans who e.g. yelled expletives at them.[4] I think you can also see this reflected in the general public, where vegans don’t have a great reputation, partly based on the aggressive actions of a few confrontational and hostile vegans or vegan organizations like PETA.
Political science research (e.g., Simpson et al., 2018) also seems to suggest that nonviolent protests are better than violent protests. (Of course, I’m not trying to imply that you were arguing for violent protests; in fact, you repeatedly say (in other places) that you’re organizing a nonviolent protest!) Importantly, the Simpson et al. paper suggests that violent protests make the protester side appear unreasonable and that this is the mechanism that causes the public to support this side less. It seems plausible to me that more confrontational and hostile public activism, even if it’s nonviolent, is more likely to appear unreasonable (especially for movements that might seem a bit fringe and don’t yet have a long history of broad public support).
In general, I worry that increasing hostility/conflict, in particular in the field of AI, may be a risk factor for x-risk and especially s-risks. Of course, many others have written about the value of compromise/being nice and the dangers of unnecessary hostility, e.g., Schubert & Cotton-Barratt (2017), Tomasik (many examples, most relevant 2015), and Baumann (here and here).
Needless to say, there are risks to being too nice/considerate, but I think they are outweighed by the benefits, though it obviously depends on the specifics. (As you imply in your post, it’s probably also true that all public protests, by their very nature, are more confrontational than silently working out compromises behind closed doors. Still, my guess is that certain forms of public advocacy can score fairly high on the considerateness dimension while still being effective.)
To summarize, it may be valuable to emphasize considerateness (alongside other desiderata such as good epistemics) as a core part of the AI Pause movement's memetic fabric, to minimize the probability that it will become more hostile in the future, since we will probably have only limited memetic control over the movement once it gets big. This may also amount to pulling the rope sideways, in the sense that public advocacy against AI risk may be somewhat overdetermined (?), but we are perhaps at an inflection point where we can shape its overall tone / stance on the confrontational vs. considerate spectrum.
Examples: “former slaves and the sons of former slave owners will be able to sit down together at the table of brotherhood” and “little black boys and black girls will be able to join hands with little white boys and white girls as sisters and brothers”.
From Wikipedia: “Some Black leaders later criticized the speech (along with the rest of the march) as too compromising. Malcolm X later wrote in his autobiography: "Who ever heard of angry revolutionaries swinging their bare feet together with their oppressor in lily pad pools, with gospels and guitars and 'I have a dream' speeches?"”
To be fair, their different tactics were probably also the result of more extreme religious and political beliefs.
I should note that I probably have much less experience with animal advocacy than you.
Sorry, yeah, I didn't make my reasoning fully transparent.
One worry is that most private investigations won't create common knowledge and won't be shared widely enough to ensure that the targets of these investigations are sufficiently prevented from participating in the community, even when this is appropriate. It's just difficult, and has many drawbacks, to share a private investigation with every possible EA organization, EAGx organizer, podcast host, community builder, etc.
My understanding is that this has actually happened to some extent in the case of NonLinear and in other somewhat similar cases (though I may be wrong!).
But you're right, if private investigations are sufficiently compelling and sufficiently widely shared, they will have almost the same effects. Though at some point, you may also wonder how different very widely shared private investigations are from public investigations. In some sense, the latter may be more fair because the person can read the accusations and defend themselves. (Also, frequent widely shared private investigations might contribute even more to a climate of fear, paranoia, and witch hunts than public investigations.)
ETA: Just to be clear, I also agree that public investigations should be more of a "last resort" measure and not be taken lightly. I guess we disagree about where to draw this line.
- More negative press for EA (which I haven't seen yet)
- Reducing morale of EA people in general, causing lower productivity or even people leaving the movement.
My sense is that these two can easily go the other way.
If you try to keep all your worries about bad actors a secret, you basically count on their bad actions never becoming public. But if they do become public at a later date (which seems fairly likely because bad actors usually don't become wiser and saner with age, and, if they aren't opposed, they get more resources and thus more opportunities to create harm and scandals), then the resulting PR fallout is even bigger. I mean, in the case of SBF, it would have been good for the EA brand if there had been more public complaints about SBF early on; then EAs could refer to them and say "see, we didn't fully trust him, we weren't blindly promoting him".
Keeping silent about bad actors can easily decrease morale because many people who interacted with bad actors will have become distrustful of them and worry about the average character/integrity of EAs. Then they see these bad actors giving talks at EAGs, going on podcast interviews, and so on. That can easily give rise to thoughts/emotions like "man, EA is just not my tribe anymore, they just give a podium to whoever is somewhat productive, doesn't matter if they're good people or not."
Thanks for this post! I've been thinking about similar issues.
One thing that may be worth emphasizing is that there are large and systematic interindividual differences in idea attractiveness—different people or groups probably find different ideas attractive.
For example, for non-EA altruists, the idea of "work in a soup kitchen" is probably much more attractive than for EAs because it gives you warm fuzzies (due to direct personal contact with the people you are helping), is not that effortful, and so on. Sure, it's not very cost-effective, but this is not something that non-EAs take much into account. In contrast, the idea of "earning-to-give" may be extremely unattractive to non-EAs because it might involve working at a job that you don't feel passionate about, you might be disliked by all your left-leaning friends, and so on. For EAs, the reverse is true (though earning-to-give may still be somewhat unattractive, just not that unattractive).
In fact, in an important sense, one primary reason for starting the EA movement was the realization of schlep blindness in the world at large—certain ideas (earning to give, donating to the global poor or to animal charities) were unattractive / uninspiring / weird but seemed to do (much) more good than the attractive ideas (helping locally, becoming a doctor, volunteering at a dog shelter, etc.).
Of course, it's wise to ask ourselves whether we EAs share certain characteristics that would lead us to find a certain type of idea more attractive than others. As you write in a comment, it's fair to say that most of us are quite nerdy (interested in science, math, philosophy, intellectual activities) and we might thus be overly inclined to pursue careers that primarily involve such work (e.g., quantitative research, broadly speaking). On the other hand, most people don't like quantitative research, so you could also argue that quantitative research is neglected! (And that certainly seems to be true sometimes; e.g., GiveWell does great work relating to global poverty.)
I see where you're coming from but I also think it would be unwise to ignore your passions and personal fit. If you, say, really love math and are good at it, it's plausible that you should try to use your talents somehow! And if you're really bad at something and find it incredibly boring and repulsive, that's a pretty strong reason not to work on it yourself. Of course, we need to be careful not to make general judgments based on these personal considerations (and this plausibly happens sometimes subconsciously), we need to be wary of simply imitating the behavior of high-status folks of our tribe, etc.
We could zoom out even more and ask ourselves whether we as EAs find certain worldviews/philosophies more attractive than others.
What types of worldviews might EAs be attracted to? I can't speak for others, but personally, I think that I've been too attracted to worldviews according to which I can have (way) more impact than the average do-gooder. This is probably because I derive much of my meaning and self-worth in life from how much good I believe I do. If I can change the world only a tiny amount even if I try really, really hard and am willing to bite all the bullets, that makes me feel pretty powerless, insignificant, and depressed—there is so much suffering in the world and I can do almost nothing in the big scheme of things? Very sad.
In contrast, "silver-bullet worldviews" according to which I can have super large amounts of impact because I galaxy-brained my way into finding very potent, clever, neglected levers that will change the world-trajectory—that feels pretty good. It makes me feel like I'm doing something useful, like my life has meaning. More cynically, you could say it's all just vanity and makes me feel special and important. "I'm not like all those others schmucks who just feel content with voting every four years and donating every now and then. Those sheeple. I'm helping way more. But no big deal, of course! I'm just that altruistic."
To be clear, I think probably something in the middle is true. Most likely, you can have more (expected) impact than the average do-gooder if you really try and reflect hard and really optimize for this. But in the (distant) past, following the reasoning of people like Anna Salamon (2010) [to be clear: I respect and like Anna a lot], I thought this might buy you a factor of a million, whereas now I think it might only buy you a factor of 100 or something. As usual, Brian argued for this a long time ago. However, a factor of 100 is still a lot, and most importantly, the absolute good you can do is what ultimately matters, not the relative amount of good, and even if you only save one life in your whole life, that really, really matters.
Also, to be clear, I do believe that many interventions in most longtermist causes like AI alignment plausibly do (a great deal) more good than most "standard" approaches to doing good. I just think that the difference is considerably smaller than I previously believed, mostly for reasons related to cluelessness.
For me personally, the main takeaway is something like this: Because of my desperate desire to have ever-more impact, I've stayed on the train to crazy town for too long and was too hesitant to walk a few stops back. The stop I'm at now is still pretty far away from where I started (and many non-EAs would think it uncomfortably far away from normal town), but my best guess is that it's a good place to be.
(Lastly, there are also biases against "silver-bullet worldviews". I've been thinking about writing a whole post about this topic at some point.)