Existential risk
Discussions of risks which threaten the destruction of the long-term potential of life

Quick takes

Two sources of human misalignment that may resist a long reflection: malevolence and ideological fanaticism
(Alternative title: Some bad human values may resist idealization[1])
The values of some humans, even if idealized (e.g., during some form of long reflection), may be incompatible with an excellent future. Thus, solving AI alignment will not necessarily lead to utopia. Others have raised similar concerns before.[2] Joe Carlsmith puts it especially well in the post “An even deeper atheism”:
What makes human hearts bad?
What, exactly, makes some human hearts bad drivers? If we better understood what makes hearts go bad, perhaps we could figure out how to make bad hearts good, or at least learn how to prevent hearts from going bad. It would also allow us to better spot potentially bad hearts and coordinate our efforts to prevent them from taking the driving seat. As of now, I’m most worried about malevolent personality traits and fanatical ideologies.[3]
Malevolence: dangerous personality traits
Some human hearts may be corrupted due to elevated malevolent traits like psychopathy, sadism, narcissism, Machiavellianism, or spitefulness.
Ideological fanaticism: dangerous belief systems
There are many suitable definitions of “ideological fanaticism”. Whichever definition we use, it should describe ideologies that have caused immense harm historically, such as fascism (Germany under Hitler, Italy under Mussolini), (extreme) communism (the Soviet Union under Stalin, China under Mao), religious fundamentalism (ISIS, the Inquisition), and most cults. See this footnote[4] for a preliminary list of defining characteristics.
Malevolence and fanaticism seem especially dangerous
Of course, there are other factors that could corrupt our hearts or driving ability, for example cognitive biases, limited cognitive ability, philosophical confusions, or plain old selfishness.[5] I’m most concerned about malevolence and ideological fanaticism for two reasons
Mildly against the Longtermism --> GCR shift
Epistemic status: Pretty uncertain, somewhat rambly
TL;DR: Replacing longtermism with GCRs might get more resources to longtermist causes, but at the expense of non-GCR longtermist interventions and broader community epistemics.
Over the last ~6 months I've noticed a general shift amongst EA orgs to focus less on reducing risks from AI, bio, nukes, etc. based on the logic of longtermism, and more based on Global Catastrophic Risks (GCRs) directly. Some data points on this:
* Open Phil renaming its EA Community Growth (Longtermism) Team to GCR Capacity Building
* This post from Claire Zabel (OP)
* Giving What We Can's new Cause Area Fund being named "Risk and Resilience," with the goal of "Reducing Global Catastrophic Risks"
* Longview-GWWC's Longtermism Fund being renamed the "Emerging Challenges Fund"
* Anecdotal data from conversations with people working on GCRs / X-risk / Longtermist causes
My guess is these changes are (almost entirely) driven by PR concerns about longtermism. I would also guess these changes increase the number of people donating to / working on GCRs, which is (by longtermist lights) a positive thing. After all, no-one wants a GCR, even if only thinking about people alive today. Yet, I can't help but feel something is off about this framing. Some concerns (no particular ordering):
1. From a longtermist (~totalist classical utilitarian) perspective, there's a huge difference between ~99% and 100% of the population dying, if humanity recovers in the former case, but not the latter. Just looking at GCRs on their own mostly misses this nuance.
   * (see Parfit, Reasons and Persons, for the full thought experiment)
2. From a longtermist (~totalist classical utilitarian) perspective, preventing a GCR doesn't differentiate between "humanity prevents GCRs and realises 1% of its potential" and "humanity prevents GCRs and realises 99% of its potential"
   * Preventing an extinction-level GCR might move u
Y Combinator wants to fund Mechanistic Interpretability startups "Understanding model behavior is very challenging, but we believe that in contexts where trust is paramount it is essential for an AI model to be interpretable. Its responses need to be explainable. For society to reap the full benefits of AI, more work needs to be done on explainable AI. We are interested in funding people building new interpretable models or tools to explain the output of existing models." Link: https://www.ycombinator.com/rfs (scroll to 12) What they look for in startup founders: https://www.ycombinator.com/library/64-what-makes-great-founders-stand-out
(COI note: I work at OpenAI. These are my personal views, though.) My quick take on the "AI pause debate", framed in terms of two scenarios for how the AI safety community might evolve over the coming years:
1. AI safety becomes the single community that's the most knowledgeable about cutting-edge ML systems. The smartest up-and-coming ML researchers find themselves constantly coming to AI safety spaces, because that's the place to go if you want to nerd out about the models. It feels like the early days of hacker culture. There's a constant flow of ideas and brainstorming in those spaces; the core alignment ideas are standard background knowledge for everyone there. There are hackathons where people build fun demos, and people figuring out ways of using AI to augment their research. Constant interactions with the models allow people to gain really good hands-on intuitions about how they work, which they leverage into doing great research that helps us actually understand them better. When the public ends up demanding regulation, there's a large pool of competent people who are broadly reasonable about the risks, and can slot into the relevant institutions and make them work well.
2. AI safety becomes much more similar to the environmentalist movement. It has broader reach, but alienates a lot of the most competent people in the relevant fields. ML researchers who find themselves in AI safety spaces are told they're "worse than Hitler" (which happened to a friend of mine). People get deontological about AI progress; some hesitate to pay for ChatGPT because it feels like they're contributing to the problem (another true story); others overemphasize the risks of existing models in order to whip up popular support. People are sucked into psychological doom spirals similar to how many environmentalists think about climate change: if you're not depressed then you obviously don't take it seriously enough. Just like environmentalists often block some of the most valua
Not that we can do much about it, but I find the idea of Trump being president at a time when we're getting closer and closer to AGI pretty terrifying. A second Trump term is going to have a lot more craziness and far fewer checks on his power, and I expect it would have significant effects on the global trajectory of AI.
Longtermist shower thought: what if we had a campaign to install Far-UVC in poultry farms? Seems like it could:
1. Reduce a bunch of diseases in the birds, which is good for:
   a. the birds’ welfare;
   b. the workers’ welfare;
   c. therefore maybe the farmers’ bottom line?;
   d. preventing/suppressing human pandemics (e.g. avian flu)
2. Hopefully drive down the cost curve of Far-UVC
3. Generate safety data in chickens, which could be helpful for derisking it for humans
Insofar as one of the main obstacles is people's concern about health effects, this would at least raise those concerns only for a small group of workers.
Vasili Arkhipov is discussed less on the EA Forum than Petrov is (see also this thread of less-discussed people). I thought I'd post a quick take describing that incident.
Arkhipov & the submarine B-59’s missile
On October 27, 1962 (during the Cuban Missile Crisis), the Soviet diesel-powered submarine B-59 started experiencing[1] nearby depth charges from US forces above them; the submarine had been detected and US ships seemed to be attacking. The submarine’s air conditioning was broken,[2] CO2 levels were rising, and B-59 was out of contact with Moscow. Two of the senior officers on the submarine, thinking that a global war had started, wanted to launch their “secret weapon,” a 10-kiloton nuclear torpedo. The captain, Valentin Savitsky, apparently exclaimed: “We’re gonna blast them now! We will die, but we will sink them all — we will not become the shame of the fleet.”
The ship was authorized to launch the torpedo without confirmation from Moscow, but all three senior officers on the ship had to agree.[3] Chief of staff of the flotilla Vasili Arkhipov refused. He convinced Captain Savitsky that the depth charges were signals for the Soviet submarine to surface (which they were): if the US ships really wanted to destroy the B-59, they would have done it by now. (Part of the problem seemed to be that the Soviet officers were used to different signals than the ones the Americans were using.) Arkhipov calmed the captain down[4] and got him to surface the submarine to get orders from the Kremlin, which eventually defused the situation.
(Here's a Vox article on the incident.)
The B-59 submarine.
1. ^ Vadim Orlov described the impact of the depth charges as like being inside an oil drum that is being struck with a sledgehammer.
2. ^ Temperatures were apparently above 45ºC (113ºF).
3. ^ The B-59 was apparently the only submarine in the flotilla that required three officers’ approval in order to fire the “special weapon”; the othe
After talking and working for some time with non-EA organisations in the AI policy space, I believe that we also need to give more credence to the here-and-now of AI safety policy, in order to get the attention of policymakers and get our foot in the door. That also gives us space to collaborate with other think tanks and organisations outside of the x-risk space that are proactive and committed to AI policy. Right now, a lot of those people also see x-risks as being fringe and radical (and these are people who are supposed to be on our side). Governments tend to move slowly, with due process, and in small increments (think, "We are going to first maybe do some risk monitoring, only then auditing"). Policymakers are visionaries only with horizons that extend to the end of their terms (hmm, no surprise). Usually, broad strokes in policy require precedents of a similar size to be feasible within a policymaker's agenda and the Overton window. Every group that comes to a policy meeting thinks that their agenda item is the most pressing because, by definition, most of the time, contacting and getting meetings with policymakers means that you are proactive and have done your homework. I want to see more EAs respond to Public Voice Opportunities, for instance, something I rarely hear about on the EA Forum or via EA channels/material.