Existential risk
Discussions of risks which threaten the destruction of the long-term potential of life

Quick takes

98
10d
32
(COI note: I work at OpenAI. These are my personal views, though.) My quick take on the "AI pause debate", framed in terms of two scenarios for how the AI safety community might evolve over the coming years:

1. AI safety becomes the single community that's the most knowledgeable about cutting-edge ML systems. The smartest up-and-coming ML researchers find themselves constantly coming to AI safety spaces, because that's the place to go if you want to nerd out about the models. It feels like the early days of hacker culture. There's a constant flow of ideas and brainstorming in those spaces; the core alignment ideas are standard background knowledge for everyone there. There are hackathons where people build fun demos, and people figuring out ways of using AI to augment their research. Constant interaction with the models allows people to gain really good hands-on intuitions about how they work, which they leverage into doing great research that helps us actually understand them better. When the public ends up demanding regulation, there's a large pool of competent people who are broadly reasonable about the risks, and can slot into the relevant institutions and make them work well.

2. AI safety becomes much more similar to the environmentalist movement. It has broader reach, but alienates a lot of the most competent people in the relevant fields. ML researchers who find themselves in AI safety spaces are told they're "worse than Hitler" (which happened to a friend of mine, actually). People get deontological about AI progress; some hesitate to pay for ChatGPT because it feels like they're contributing to the problem (another true story); others overemphasize the risks of existing models in order to whip up popular support. People are sucked into psychological doom spirals similar to how many environmentalists think about climate change: if you're not depressed then you obviously …
32
5d
1
Vasili Arkhipov is discussed less on the EA Forum than Petrov is (see also this thread of less-discussed people). I thought I'd post a quick take describing that incident.

Arkhipov & the submarine B-59's missile

On October 27, 1962 (during the Cuban Missile Crisis), the Russian diesel-powered submarine B-59 started experiencing[1] nearby depth charges from US forces above them; the submarine had been detected and US ships seemed to be attacking. The submarine's air conditioning was broken,[2] CO2 levels were rising, and B-59 was out of contact with Moscow. Two of the senior officers on the submarine, thinking that a global war had started, wanted to launch their "secret weapon," a 10-kiloton nuclear torpedo. The captain, Valentin Savitsky, apparently exclaimed: "We're gonna blast them now! We will die, but we will sink them all — we will not become the shame of the fleet."

The ship was authorized to launch the torpedo without confirmation from Moscow, but all three senior officers on the ship had to agree.[3] Chief of staff of the flotilla Vasili Arkhipov refused. He convinced Captain Savitsky that the depth charges were signals for the Soviet submarine to surface (which they were) — if the US ships really wanted to destroy the B-59, they would have done it by now. (Part of the problem seemed to be that the Soviet officers were used to different signals than the ones the Americans were using.) Arkhipov calmed the captain down[4] and got him to surface the submarine to get orders from the Kremlin, which eventually defused the situation.

(Here's a Vox article on the incident.)

[Image: The B-59 submarine]

1. ^ Vadim Orlov described the impact of the depth charges as being like sitting inside an oil drum getting struck with a sledgehammer.
2. ^ Temperatures were apparently above 45ºC (113ºF).
3. ^ The B-59 was apparently the only submarine in the flotilla that required three officers' approval in order to fire the "special weapon" …
4
2d
2
I am a researcher in the space community and I recently wrote a post introducing the links between outer space and existential risk. I'm thinking about developing this into a sequence of posts on the topic. I plan to cover:

1. Cosmic threats: what they are, how they are currently managed, and what work is needed in this area. Cosmic threats include asteroid impacts, solar flares, supernovae, gamma-ray bursts, aliens, rogue planets, pulsar beams, and the Kessler Syndrome. I think it would be useful to provide a summary of how cosmic threats are handled, and to assess their importance relative to other existential threats.

2. Lessons learned from the space community. The space community has been very open with data sharing; the utility of this for tackling climate change, nuclear threats, ecological collapse, animal welfare, and global health and development cannot be overstated. I may include perspective shifts here, provided by views of Earth from above and the limitless potential that space shows us.

3. How to access the space community's expertise, technology, and resources to tackle existential threats.

4. The role of the space community in global politics. Space has a big role in preventing great power conflicts and building international institutions and connections. With the space community growing a lot recently, I'd like to provide a briefing on the role of space internationally to help people who are working on policy and war.

Would a sequence of posts on space and existential risk be something that people would be interested in? (Please agree- or disagree-vote on the post.) I haven't seen much on space on the Forum (apart from on space governance), so it would be something new.
10
8d
1
One thing the AI Pause Debate Week has made salient to me: there appears to be a mismatch between the kind of slowing that on-the-ground AI policy folks talk about, versus the type that AI policy researchers and technical alignment people talk about.

My impression from talking to policy folks who are in or close to government—admittedly a sample of only five or so—is that the main[1] coordination problem for reducing AI x-risk is about ensuring the so-called alignment tax gets paid (i.e., ensuring that all the big labs put some time/money/effort into safety, and that none "defect" by skimping on safety to jump ahead on capabilities). This seems to rest on the assumption that the alignment tax is a coherent notion and that technical alignment people are somewhat on track to pay this tax.

On the other hand, my impression is that technical alignment people, and AI policy researchers at EA-oriented orgs,[2] are not at all confident in there being a viable level of time/money/effort that will produce safe AGI on the default trajectory. The type of policy action that's needed, so they seem to say, is much more drastic. For example, something in the vein of global coordination to slow, limit, or outright stop development and deployment of AI capabilities (see, e.g., Larsen's,[3] Bensinger's, and Stein-Perlman's debate week posts), whilst alignment researchers scramble to figure out how on earth to align frontier systems.

I'm concerned by this mismatch. It would appear that the game plans of two adjacent clusters of people working to reduce AI x-risk are at odds. (Clearly, this is an oversimplification and there are a range of takes within both clusters, but my current epistemic status is that this oversimplification gestures at a true and important pattern.) Am I simply mistaken about there being a mismatch here? If not, is anyone working to remedy the situation? Or does anyone have thoughts on how this arose, how it could be rectified, or how to prevent similar mismatches …
23
2mo
2
Has there been any formal probabilistic risk assessment on AI X-risk? e.g. fault tree analysis or event tree analysis — anything of that sort?
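For context on the techniques named above: fault tree analysis decomposes a top-level failure into basic events combined through AND/OR gates and propagates probabilities up the tree. Below is a minimal sketch of that arithmetic, assuming independent basic events; the event names and probabilities are entirely hypothetical and are not drawn from any actual assessment.

```python
# Minimal sketch of a fault tree calculation (illustrative only).
# Event names and probabilities are hypothetical, and basic events are
# treated as independent; a real probabilistic risk assessment would need
# justified estimates and would model dependencies between events.

def and_gate(*probs: float) -> float:
    """All child events must occur (independence assumed)."""
    p = 1.0
    for q in probs:
        p *= q
    return p

def or_gate(*probs: float) -> float:
    """At least one child event must occur (independence assumed)."""
    p_none = 1.0
    for q in probs:
        p_none *= 1.0 - q
    return 1.0 - p_none

# Hypothetical basic events:
p_training_failure = 0.06  # training produces a model with unintended goals
p_spec_failure     = 0.05  # the specified objective itself is subtly wrong
p_evals_miss       = 0.20  # pre-deployment evaluations fail to catch the problem
p_monitor_miss     = 0.30  # post-deployment monitoring also fails to catch it

# OR gate: the model ends up misaligned if either failure mode occurs.
p_misaligned = or_gate(p_training_failure, p_spec_failure)   # ~0.107

# AND gate: the problem goes undetected only if both safeguards fail.
p_undetected = and_gate(p_evals_miss, p_monitor_miss)        # 0.060

# Top event: a misaligned model is deployed and the problem goes undetected.
p_top = and_gate(p_misaligned, p_undetected)
print(f"P(top event) = {p_top:.4f}")                         # ~0.0064
```

Event tree analysis works in the other direction: it traces forward from an initiating event through a sequence of branch points (each safeguard succeeds or fails) to a set of outcome probabilities.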
42
4mo
2
On Twitter and elsewhere, I've seen a bunch of people argue that AI company execs and academics are only talking about AI existential risk because they want to manufacture concern to increase investments and/or as a distraction away from near-term risks and/or regulatory capture. This is obviously false.

However, there is a nearby argument that is likely true: incentives drive how people talk about AI risk, as well as which specific regulations or interventions they ask for. This is likely to happen both explicitly and unconsciously. It's important (as always) to have extremely solid epistemics, and to understand that even apparent allies may have (large) degrees of self-interest and motivated reasoning.

Safety-washing is a significant concern; similar things have happened a bunch in other fields, it has likely already happened a bunch in AI, and it will likely happen again in the months and years to come, especially if/as policymakers and/or the general public become increasingly uneasy about AI.
2
16d
People talk about AI resisting correction because successful goal-seekers "should" resist their goals being changed. I wonder if this also acts as an incentive for AI to attempt takeover as soon as it's powerful enough to have a chance of success, instead of (as many people fear) waiting until it's powerful enough to guarantee it. Hopefully the first AI powerful enough to potentially figure out that it wants to seize power and has a chance of succeeding is not powerful enough to passively resist value change, so acting immediately will be its only chance.
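The timing argument above can be made concrete with a toy expected-value comparison. The sketch below is not from the original quick take; the model structure and every probability are hypothetical illustrations, not claims about any real system.

```python
# Toy model of "attempt takeover now vs. wait for higher capability".
# All numbers are made up; the point is only the structure of the trade-off:
# waiting pays off only if the system's goals aren't corrected in the meantime.

def compare_act_now_vs_wait(p_success_now: float,
                            p_success_later: float,
                            p_corrected_before_later: float) -> tuple[float, float]:
    """Return (EV of acting at the first opportunity, EV of waiting)."""
    ev_now = p_success_now
    ev_wait = (1.0 - p_corrected_before_later) * p_success_later
    return ev_now, ev_wait

# Illustrative numbers: a small chance of success now, near-certainty later,
# but a substantial chance of being corrected before "later" arrives.
ev_now, ev_wait = compare_act_now_vs_wait(
    p_success_now=0.10,
    p_success_later=0.95,
    p_corrected_before_later=0.95,
)
print(f"EV(act now) = {ev_now:.3f}")   # 0.100
print(f"EV(wait)    = {ev_wait:.3f}")  # 0.048
```

On these made-up numbers, acting early has the higher expected value precisely because waiting exposes the system to correction, which is the dynamic the quick take points at.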
4
1mo
4
Seems like there's room in the ecosystem for a weekly update on AI that does a lot of contextualization / "here's where we are on ongoing benchmarks". I'm familiar with:

* a weekly newsletter on AI media (that has a section on important developments that I like)
* Jack Clark's substack, which I haven't read much of but seems more about going in depth on new developments (though it does have a "Why this matters" section). Also, I love this post in particular for the way it talks about humility and confusion.
* Doing Westminster Better on UK politics and AI / EA, which seems really good but again I think goes in depth on new stuff
* I could imagine spending time on aggregation of prediction markets for specific topics, which Metaculus and Manifold are doing better and better over time.

I'm interested in something that says "we're moving faster / less fast than we thought we would 6 months ago" or "this event is surprising because…" and kind of gives a "you are here" pointer on the map. The Planned Obsolescence post "Language models surprised us" is, I think, the closest I've seen. Seems hard; maybe it's not worth it enough to do; maybe it's already happening and I'm not familiar with it (would love to hear), but it's what I'd personally find most useful and I suspect I'm not alone.