Two Strange Things About AI Safety Policy

by Jay_Shooster, 28th Sep 2016, 28 comments


AI governance

Views expressed here are solely my own, and not of my employer (I’ve always wanted to say that)

Not to brag (but probably to brag a little bit) I work at a pretty awesome place for those that want to be involved in national security policy. Just Security (JS), where I work as an editor, is one of the leading publications on national security law, policy, and strategy. Our readership and editorial staff consist of the top lawyers, journalists, law professors, and government officials working in this space.


Naturally, as a self-described “hardcore EA,” my first thought when I arrived at JS was “how can I weave AI Safety into our programming?” I quickly got permission from my bosses to start exploring some kind of extensive project on AI-safety: a high profile panel event, an in-depth blog post, or a special feature for our website (a video, podcast, or data visualization).


Surely, the AI safety-focused EAs were going to be thrilled when they heard that I could leverage JS’s platform/connections to promote the best thinking in AI-safety, right? Wrong. Not even close.


When I reached out to two leading AI safety-focused EAs about this opportunity, I was sorely disappointed by their lack of enthusiasm. To be clear, they were very kind, but they didn’t think there was much I could really do to help. Both AI experts discouraged me from doing an event on anything that touches on the catastrophic risks posed by AI and recommended I try something on biosecurity or autonomous weapons instead, in spite of the fact that they both agreed that the catastrophic risk from AI is a much more important issue in expectation. That’s strange.

I have little doubt that if I reached out to two random poverty or animal-focused EAs with the pitch “I can get a bunch of respected journalists, academics, and policymakers to hear the exact perspective you want me to share with them on our trusted/prestigious platform,” they would be pretty psyched about that (as I think they should be). So what’s so different about AI safety?


A few things, perhaps. One thing that came up in both my conversations was the sentiment that the existential risks from AI are too far off and thus too nebulous to say anything concrete and interesting about them. A related concern was that basically anyone with sound views on this subject (except for Stuart Russell) is bound to be perceived to be a crazy person for talking about it when the tech is still so far off; and by raising the concern too early, we risk branding ourselves as fearmongers and making it harder to be taken seriously down the line when the threat is more clearly materializing (or at least once the field of safety research has become more institutionalized and prestigious).


Another thing they suggested is that, in many ways, the field is already quite crowded. They offered the idea of doing an event on autonomous weapons or surveillance, as a way of building credentials/capital in this general area. But both noted that this is already a very sexy field and that fancy/smart safety-oriented folks are already thinking about this. In the few short weeks working at JS, I’m already seeing more and more discussion of autonomous weapons systems popping up.


They mentioned that there are already influential people in the US government that are already completely aware of the issues around AI safety. I think the suggestion there was that there is little I could do to advance the policy conversation in a strategically sound way that they wouldn’t already be on top of.


All of this is to say: maybe AI safety is not as neglected (or tractable) as I thought. I’m interested in hearing people’s thoughts about this. Perhaps it’s just that AI research is basically the only way to be helpful in this space right now, and maybe there’s an extreme lack of talent in this space that we need to develop. And maybe, the best way to sustainably grow the field is by doing it kind of under the radar. I can accept that. But it’s certainly...different.

Another related strange thing is that nobody is trying to slow down the development of AGI even though many EAs have decided that an unfriendly/misaligned AGI is basically the worst thing in the world by many orders of magnitude.


I spoke to one EA who made an argument against slowing down AGI development that I think is basically indefensible: that doing so would slow the development of machine learning-based technology that is likely to lead to massive benefits in the short/medium term. But by the AI-focused EAs’ own arguments, the far future effects of AGI dominate all other considerations by orders of magnitude. If that’s the case, then getting it right should be the absolute top priority, and virtually everyone agrees (I think) that the sooner AGI is developed, the higher the likelihood that we will be ill-prepared and that something will go horribly wrong. So, it seems clear that if we can take steps to effectively slow down AGI development we should.


Of course, EAs have made good arguments for not trying to slow down AI progress. The big ones that came up with the experts I consulted were (1) that it’s intractable because the forces of industry are stacked against us, and (2) that amplifying fear of AI might exacerbate the arms race (a common concern that applies to all technological developments).


The first point I think is reasonable to a degree. If (as is likely the case, for now) marginal resources are better spent on safety research than slowing down development, then we should avoid it. But, it seems likely that we might get to a point where the research field becomes so saturated that this changes. And, even more importantly, there are already lots of people like myself, that have advocacy skills that would be very applicable to changing institutional policies on AI, but lack technical skills useful for AI research. Even if we think that slowing down development would be very intractable, the scale of the problem is so great that it seems like a plausible contender for the most impactful thing lots of folks could work on (especially for those that believe anything not AI-related is just a rounding error in the utilitarian calculus).


The second point (that the very act of trying to slow down development could exacerbate the problem) is one that I take seriously. But, it’s certainly far from obvious that any attempts to slow down development will have negative consequences in expectation. And I’d definitely like to see more discussion of this crucially important question.





I’m nowhere near an expert in this area, and I’m doing my best to reflect the opinions of the experts I consulted, but I’m surely going to misrepresent their views at least slightly. Sorry!



Honest question: can we invest in making more Stuart Russells? (e.g. Safety-oriented authority figures in AI). Can we use our connections in academia to give promising EAs big prestige-building opportunities (conference invites, publication opportunities, scholarships, research and teaching positions, co-authorships) in academia etc.? (Also can we do this more in general?)



The idea of running an event in particular seems misguided. Conventions come after conversations. Real progress toward understanding, or conveying understanding, does not happen through speakers going On Stage at big events. If speakers On Stage ever say anything sensible, it's because an edifice of knowledge was built in the background out of people having real, engaged, and constructive arguments with each other, in private where constructive conversations can actually happen, and the speaker On Stage is quoting from that edifice.

(This is also true of journal publications about anything strategic-ish - most journal publications about AI alignment come from the void and are shouting into the void, neither aware of past work nor feeling obliged to engage with any criticism. Lesser (or greater) versions of this phenomenon occur in many fields; part of where the great replication crisis comes from is that people can go on citing refuted studies and nothing embarrassing happens to them, because god forbid there be a real comments section or an email reply that goes out to the whole mailing list.)

If there's something to be gained from having national-security higher-ups understanding the AGI alignment strategic landscape, or from having alignment people understand the national security landscape, then put Nate Soares in a room with somebody in national security who has a computer science background, and let them have a real conversation. Until that real progress has already been made in in-person conversations happening in the background where people are actually trying to say sensible things and justify their reasoning to one another, having a Big Event with people On Stage is just a giant opportunity for a bunch of people new to the problem to spout out whatever errors they thought up in the first five seconds of thinking, neither aware of past work nor expecting to engage with detailed criticism, words coming from the void and falling into the void. This seems net counterproductive.

These seem like reasonable points in isolation, but I'm not sure they answer the first question as actually posed. In particular:

  1. Why would it necessarily be 'a bunch of people new to the problem [spouting] whatever errors they've thought up in the first five seconds of thinking'? Jay's spectrum of suggestions was wide and included a video or podcast. With that kind of thing there would appear to be ample scope to either have someone experienced with the problem doing the presenting or it could be reviewed by the people with relevant expertise before being released. A Big Event On Stage wasn't the only thing on offer.

  2. The actual question in the post was "I have little doubt that if I reached out to two random poverty or animal-focused EAs with the pitch “I can get a bunch of respected journalists, academics, and policymakers to hear the exact perspective you want me to share with them on our trusted/prestigious platform,” they would be pretty psyched about that (as I think they should be). So what’s so different about AI safety?" I don't really know what your answer to this is; is AI particularly vulnerable to the downsides you described (Why?). Or are the other areas of EA making a mistake?

  3. "If there's something to be gained from having national-security higher-ups understanding the AGI alignment strategic landscape, or from having alignment people understand the national security landscape,..." I'm pretty surprised that the start of this sentence is phrased as 'if there is' rather than a 'while there is certainly', so I want to check: is that deliberate; i.e. are you actually sceptical about whether there's anything that national security higher-ups have to offer?

If you actually don't think there's anything to be gained from cooperation between AGI alignment people and national security people, the weakness of your other objections makes more sense, because they aren't really your true rejection; your true rejection is that there's no upside and some potential downsides.

So do you think other kinds of non-event programming could be useful? Like an in depth blog post, or a podcast episode?

I have heard about retreats and closed conferences/workshops to get people together, I would imagine something like that would be better from the point of view that Eliezer is coming from.

In order for people to have useful conversations where genuine reasoning and thinking is done, they have to actually meet each other.

Are these recommendations based on sound empirical data (e.g. a survey of AI researchers who've come to realize AI risk is a thing, asking them what they were exposed to and what they found persuasive), or just guessing/personal observation?

If persuasive speaking is an ineffective way of spreading concern for AI risk, then we live in one of two worlds.

In the first world, the one you seem to imply we live in, persuasive speaking is ineffective for most things, and in particular it's ineffective for AI risk. In this world, I'd expect training in persuasive speaking (whether at a 21st century law school or an academy in Ancient Greece) to be largely a waste of time. I would be surprised if this is true. The only data I could find offhand related to the question is from Robin Hanson: "The initially disfavored side [in a debate] almost always gains a lot... my guess is that hearing half of a long hi-profile argument time devoted to something makes it seem more equally plausible."

In the second world, public speaking is effective persuasion in at least some cases, but there's something about this particular case that makes public speaking a bad fit. This seems more plausible, but it could also be a case of ineffective speakers or an ineffective presentation. It's also important to have good measurement methods: for example, if most post-presentation questions offer various objections, it's still possible that your presentation was persuasive to the majority of the audience.

I'm not saying all this because I think events are a particularly promising way to persuade people here. Rather, I think this issue is important enough that our actions should be determined by data whenever it's possible. (Might be worthwhile to do that survey if it hasn't been done already.)

I also think the burden of proof for a strategy focused primarily on personal conversations should be really high. Personal conversations are about the least scalable method of persuasion. Satvik Beri recommends that businesses do sales first to figure out how to overcome common objections, then use a sales pitch that's known to be effective as marketing copy. A similar strategy could work here: take notes on common objections & the best ways to refute them after personal conversations, then use that knowledge to inform the creation of scalable persuasive content like books/talks/blog posts.

|...having a Big Event with people On Stage is just a giant opportunity for a bunch of people new to the problem to spout out whatever errors they thought up in the first five seconds of thinking, neither aware of past work nor expecting to engage with detailed criticism...

I had to go back and double-check that this comment was written before Asilomar 2017. It describes some of the talks very well.

One way to have interesting conversations is to have them over dinner between the public talks at a conference. The most interesting part of a conference is the informal contact between people during breaks and in the evenings. A conference is just an excuse to bring the right people together and frame the topic. So such a conference may help to connect national security people and AI safety people.

But my feeling from previous conversations is that the current wisdom among AI people is that government people are unable to understand their complex problems and are also not players in the game of AI creation. Only hackers and corporations are. I don't think that is the right approach.

These are both good points worth addressing! My understanding on (2) is that any proposed method of slowing down AGI research would likely antagonize the majority of AI researchers with relatively little actual slowdown. It seems more valuable to build alliances with current AI researchers, and get them to care about safety, in order to increase the amount of safety-concerned research done vs. safety-agnostic research.

Exactly. If someone were trying to slow down AI research, they definitely wouldn't want to make it publicly known that they were doing so, and they wouldn't write articles on a public forum about how they believe we should try to slow down AI research.

I recognize that I'm a novice on these issues, and I'm open to being persuaded about this, but that position just seems incredibly counterintuitive to me. Of course, that doesn't mean it's not true, and if I had to guess right now, I would say it's right (based on the weight of opinions of those who have been more involved in AI than me). But I'd really like to see more discussion about the strategy here.

AI researchers don't like it when you try to slow down AI research. AI researchers are a lot more powerful than AI safety supporters. Right now AI researchers' opinions on AI safety range from "this is stupid but I don't care" to "this is really important, let's keep doing AI research though." If it becomes widely known that you're trying to slow down AI research in the name of AI safety, AI researchers' opinions will shift to "this is stupid and I care a lot because these stupid idiots are trying to stop me from doing research" and "I used to think this was important but clearly these people are out to get me so I'm not going to support them anymore."

Maybe not a great analogy, but suppose you're living under an oppressive totalitarian regime. You think it would be super effective to topple the regime. So you go around telling people, "Hey guys I think we should try to topple this regime, I think things would be a lot better. It's weird that I don't see people going around talking about toppling this regime, people should talk about it more." Then they arrest you and throw you into a gulag. Now you know why people don't go around talking about it.

Maybe not a great analogy


1) It's not obvious how the speed of AI affects global risk and sustainability. E.g. getting to powerful AI faster through more AI research would reduce the time spent exposed to various state risks. It would also reduce the amount of computing hardware at the time, which could make for a less disruptive transition. If you think the odds are 60:40 that one direction is better than the other (with equal magnitudes), then you get a fifth of the impact.

2) AI research overall is huge relative to work focused particularly on AI safety, by orders of magnitude. So the marginal impact of a change in research effort is much greater for the latter. Combined with the first point it looks at least hundreds of times more effective to address safety rather than to speed up or slow down software progress with given resources, and not at all worthwhile to risk the former for the latter.

3) AI researchers aren't some kind of ogres or tyrants: they are smart scientists and engineers with a lot of awareness of uninformed and destructive technophobia (consider GMO crops, opposition to using gene drives to wipe out malaria, opposition to the industrial revolution, panics about books/film/cars/video games, anti-vaccine movements, anti-nuclear). And they are very aware of the large benefits their work could produce. There actually is a very foolish technophobic response to AI that doesn't care about the immense benefits, one that it is important not to be confused with (and that it is understandable that people might confuse with someone like Bostrom who has written a lot about the great benefits of AI and that its expected value is good).

4) If you're that worried about the dangers of offending people (some of whose families may have fled the Soviet Union, and other places with gulags), don't make needlessly offensive analogies about them. It is AI researchers who will solve the problems of AI safety.
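
To make the arithmetic in points (1) and (2) above concrete, here is a minimal sketch. It assumes, as point (1) does, that the good and bad outcomes have equal magnitude. The function names and the 100x marginal ratio are hypothetical, chosen only to stand in for "orders of magnitude"; they are not figures from the thread.

```python
def net_expected_fraction(p_better: float) -> float:
    """Expected impact of pushing AI speed in one direction, as a fraction of
    the impact you would have if you were certain about the sign.

    Assumes equal magnitudes: the payoff is +1 with probability p_better
    and -1 otherwise.
    """
    return p_better * 1.0 + (1.0 - p_better) * -1.0


def safety_advantage(marginal_ratio: float, p_better: float) -> float:
    """How many times more effective a marginal unit of safety work looks.

    marginal_ratio is a HYPOTHETICAL stand-in for "orders of magnitude":
    the marginal impact of a resource in the small safety field relative to
    the same resource spent on AI research overall.
    """
    return marginal_ratio / net_expected_fraction(p_better)


if __name__ == "__main__":
    fraction = net_expected_fraction(0.6)
    print(f"net fraction of impact at 60:40 odds: {fraction:.2f}")
    # 100x is an assumed illustrative value, not a figure from the thread.
    advantage = safety_advantage(100, 0.6)
    print(f"safety advantage with a 100x marginal ratio: {advantage:.0f}x")
```

At 60:40 odds the net fraction is 0.6 - 0.4, about a fifth, and dividing an assumed 100x marginal ratio by that fifth gives a factor in the hundreds, which is the shape of the claim in point (2).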

Regarding your point (2), couldn't this count as an argument for trying to slow down AI research? I.e., given that the amount of general AI research done is so enormous, even changing community norms around safety a little bit could result in dramatically narrowing the gap between the rates of general AI research and AI safety research?

I don't think I'm following your argument. Are you saying that we should care about the absolute size of the difference in effort in the two areas rather than proportions?

Research has diminishing returns because of low-hanging fruit. Going from $1 MM to $10 MM makes a much bigger difference than going from $10,001 MM to $10,010 MM.

I guess the argument is that, if it takes (say) the same amount of effort/resources to speed up AI safety research by 1000% and to slow down general AI research by 1% via spreading norms of safety/caution, then plausibly the latter is more valuable due to the sheer volume of general AI research being done (with the assumption that slowing down general AI research is a good thing, which as you pointed out in your original point (1) may not be the case). The tradeoff might be more like going from $1 million to $10 million in safety research, vs. going from $10 billion to $9.9 billion in general research.

This does seem to assume that absolute size in difference is more important than proportions. I'm not sure how to think about whether or not this is the case.
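
One way to see the "absolute size vs. proportions" question is to put both framings side by side. The sketch below assumes a logarithmic returns curve as one simple model of diminishing returns; nobody in the thread commits to that model, and the function and constant names are hypothetical. The dollar figures are the ones quoted above, in millions.

```python
import math


def log_gain(before: float, after: float) -> float:
    """Change in value under an ASSUMED log returns curve: ln(after / before)."""
    return math.log(after / before)


# Figures from the thread, in millions of dollars.
SAFETY_BOOST = (1.0, 10.0)            # safety research: $1M -> $10M
GENERAL_CUT = (10_000.0, 9_900.0)     # general research: $10B -> $9.9B
GENERAL_BUMP = (10_001.0, 10_010.0)   # general research: $10,001MM -> $10,010MM


def compare() -> dict:
    """Return both the proportional (log) and absolute size of each change."""
    return {
        "safety_log": log_gain(*SAFETY_BOOST),
        "general_cut_log": abs(log_gain(*GENERAL_CUT)),
        "general_bump_log": log_gain(*GENERAL_BUMP),
        "safety_abs": SAFETY_BOOST[1] - SAFETY_BOOST[0],
        "general_cut_abs": abs(GENERAL_CUT[1] - GENERAL_CUT[0]),
    }


if __name__ == "__main__":
    for name, value in compare().items():
        print(f"{name}: {value:.4f}")
```

Under the log model the safety boost is a far larger change than the 1% cut to general research, while in absolute dollars the cut is the larger of the two. Which framing matters depends on the shape of the returns curve, which is exactly what is in dispute.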

This is a tacit claim about the shape of the search space, granted a reasonable one given most search spaces show decreasing marginal utility. Some search spaces have threshold effects or other features that make them have increasing marginal utility per resources spent, at least in some localized areas. AI is weird enough this seems worth thinking about.

Yeah, this is kind of weird, I had a similar experience. I had a friend who was also interested in AI risks and worked on the board of a good university international relations publication. I tried to find someone interested in writing a paper, but no dice.

Slowing down AGI

Strongly against, this is the #1 reason that AI scientists in academia and government are adversarial towards public awareness of AI risks, they are worried about a loss of funding and research progress which is needed to combat short and medium term problems.

Pushing AI progress is extremely important to everyone working in AI whereas slowing it is only vaguely/uncertainly important to people worried about AI. So it's a poor point to try and fight over.

Honest question: can we invest in making more Stuart Russells? (e.g. Safety-oriented authority figures in AI). Can we use our connections in academia to give promising EAs big prestige-building opportunities (conference invites, publication opportunities, scholarships, research and teaching positions, co-authorships) in academia etc.? (Also can we do this more in general?)

It's already a problem that AI safety researchers are only cited by other AI safety researchers and are perceived as an island community.

It would of course be good for more AI safety people to enter computer science research and academia however.

I think it's a really bad idea to try to slow down AI research. In addition to the fact that you'll antagonize almost all of the AI community and make them not take AI safety research as seriously, consider what would happen on the off chance that you actually succeeded.

There are a lot of AI firms, so if you're able to convince some to slow down, then the ones that don't slow down would be the ones that care less about AI safety. Much better idea to get the ones who care about AI safety to focus on AI safety than to potentially cede their cutting-edge research position to others who care less.

I think creating more Stuart Russells is just about the best thing that can be done for AI Safety. What sets him apart from others who care about AI Safety is that he's a prestigious CS professor, while many who focus on AI Safety, even if they have good ideas, aren't affiliated with a well-known and well-respected institution. Even when Nick Bostrom or Stephen Hawking talk about AI, they're often dismissed by people who say "well sure they're smart, but they're not computer scientists, so what do they know?"

I'm actually a little surprised that they seemed so resistant to your idea. It seems to me that there is so much noise on this topic, that the marginal negative from creating more noise is basically zero, and if there's a chance you could cut through the noise and provide a platform to people who know what they're talking about here then that would be good.

In general, I think that people are being too conservative about addressing the issue. I think we need some "radicals" who aren't as worried about losing some credibility. Whether or not you want to try and have mainstream appeal, or just be straightforward with people about the issue is a strategic question that should be considered case-by-case.

Of course, it is a big problem that talking about AIS makes a good chunk of people think you're nuts. It's been my impression that most of those people are researchers, not the general public, who are actually quite receptive to the idea (although maybe for the wrong reasons...)

I don't think the issue is that we don't have any people willing to be radicals and lose credibility. I think the issue is that radicals on a certain issue tend to also mar the reputations of their more level-headed counterparts. Weak men are superweapons, and groups like PETA and Greenpeace and Westboro Baptist Church seem to have attached lasting stigma to their causes because people's pattern-matching minds associate their entire movement with the worst example.

Since, as you point out, researchers specifically grow resentful, it seems really important to make sure radicals don't tip the balance backward just as the field of AI safety is starting to grow more respectable in the minds of policymakers and researchers.

Sure, but the examples you gave are more about tactics than content. What I mean is that there are a lot of people who are downplaying their level of concern about Xrisk in order to not turn off people who don't appreciate the issue. I think that can be a good tactic, but it also risks reducing the sense of urgency people have about AI-Xrisk, and can also lead to incorrect strategic conclusions, which could even be disastrous when they are informing crucial policy decisions.

TBC, I'm not saying we are lacking in radicals ATM, the level is probably about right. I just don't think that everyone should be moderating their stance in order to maximize their credibility with the (currently ignorant, but increasingly less so) ML research community.

It probably wouldn't hurt if AI inclined EAs focused more on getting experts on board. It's a very bad situation to be in if the vast majority of experts on a given topic think that a given issue you are interested in is overblown, because 1) tractability goes down the tubes, since most experts actively contradict you, 2) your ability to collaborate with other experts is greatly hampered, since most experts won't work with you, and 3) it becomes really easy for people to assume that you're a crackpot. I'm also not sure if it's even 'rational' for non experts to get involved in this until a majority of experts in the field is on board. I mean, if person A has no experience with a topic, and the majority of experts say one thing, but person A gets convinced that the opposite is true by an expert in the minority, am I wrong in thinking that that's not a great precedent to set?

I agree this is really strange. I agree many AI people supposedly into safety don't seem to give much thought to the more obvious policies, at least publicly (unless someone can signpost).

Why not move national security research funding from AI development and application to safety research?

Why not call out the risks and bring more skepticism to (a) the hope of ever achieving aligned AI, and (b) the idea that aligned AI would really improve the human condition anyway, while reminding people of the risks?

Why not ask all companies or industry researchers to apply for a permit, with some prior training in risks or safety, before they work on anything more advanced than basic statistical algorithms? Or even professional registration? Just slow it down and make it more expensive. These bodies can be set up internationally without having to be passed into law.

Why not tempt coders and researchers who are making particularly good progress to work on something else? This could be done around the world, like counter-recruitment in espionage or competitive industries.

Right, I was going to mention the fact that AIS concerned people are very interested in courting the ML community, and very averse to anything which might alienate them, but it's already come up.

I'm not sure I agree with this strategy. I think we should maybe be more "good cop / bad cop" about it. I think the response so far from ML people is almost indefensible, and the AIS folks have won every debate so far, but there is of course this phenomenon with debate where you think that your side won ;).

If it ends up being necessary to slow down research, or, more generally, carefully control AI technology in some way, then we might have genuine conflicts of interest with AI researchers which can't be resolved solely by good cop tactics. This might be the case if, e.g., using SOTA AIS techniques significantly impairs performance or research, which I think is likely.

It's still a huge instrumental good to get more ML people into AIS and supportive of it, but I don't like to see AIS people bending over backwards to do this.