I've thought for a while, based on common sense, that since most people seem to agree you could replicate the search capability LLMs provide with a half-decent background knowledge of the topic and a few hours of googling, the incremental increase in risk, in terms of the number of people it provides access to, can't be that big. In my head it's been more like: the bioterrorism risk is already unacceptably high, and has been for a while, and current AI can increase this already unacceptable level by something like 20%. That is still an unacceptably large increase in risk in an absolute sense, but it's an increase to an already unacceptable situation.
This as a general phenomenon (underrating strong responses to crises) was something I highlighted (calling it the 'Morituri Nolumus Mori' effect), with a possible extension to AI, all the way back in 2020. And Stefan Schubert has talked about 'sleepwalk bias' even earlier than that as a similar phenomenon.
https://twitter.com/davidmanheim/status/1719046950991938001
https://twitter.com/AaronBergman18/status/1719031282309497238
I think the short explanation as to why we're in some people's 98th percentile world so far (and even my ~60th percentile) for AI governance succe...
Yeah I didn't mean to imply that it's a good idea to keep them out permanently, but the fact that they're not in right now is a good sign that this is for real. If they'd just joined and not changed anything about their current approach I'd suspect the whole thing was for show
This seems overall very good at first glance, and then seems much better once I realized that Meta is not on the list. There's nothing here that I'd call substantial capabilities acceleration (i.e. attempts to collaborate on building larger and larger foundation models, though some of this could be construed as making foundation models more useful for specific tasks). Sharing safety-capabilities research like better oversight or CAI techniques is plausibly strongly net positive even if the techniques don't scale indefinitely. By the same logic, while this ...
I think you have to update against the UFO reports being veridical descriptions of real objects with those characteristics because of just how ludicrous the implied properties are. This paper gives 5370 g as a reasonable upper bound on acceleration, implying (with some assumptions about mass) an effective thrust power on the order of 500 GW in something the size of a light aircraft, with no disturbance in the air either from the very high hypersonic wake and compressive heating or from the enormous, nuclear-explosion-sized bubble of plasmified air that the exhaust ...
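For concreteness, here's a back-of-the-envelope version of that power estimate; the mass and speed figures are illustrative assumptions of mine, not numbers taken from the paper:

```python
# Back-of-the-envelope thrust power for the reported acceleration.
# Mass and speed are illustrative assumptions, not values from the paper.
g = 9.81                # m/s^2
accel = 5370 * g        # paper's upper-bound acceleration, m/s^2
mass = 1000             # kg: assumed light-aircraft-scale mass
speed = 10_000          # m/s: assumed speed during the manoeuvre (~Mach 30)

force = mass * accel    # thrust required to sustain that acceleration, N
power = force * speed   # mechanical power delivered at that speed, W

print(f"thrust ~{force:.1e} N, power ~{power / 1e9:.0f} GW")
# -> thrust ~5.3e+07 N, power ~527 GW
```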
Very nice! I'd say this seems like it's aimed at a difficulty level of 5 to 7 on my table,
https://www.lesswrong.com/posts/EjgfreeibTXRx9Ham/ten-levels-of-ai-alignment-difficulty#Table
I.e. experimentation on dangerous systems and interpretability play some role but the main thrust is automating alignment research and oversight, so maybe I'd unscientifically call it a 6.5, which is a tremendous step up from the current state of things (2.5) and would solve alignment in many possible worlds.
There are other things that differentiate the camps beyond technical views, e.g. how much you buy 'civilizational inadequacy' vs viewing that as a consequence of sleepwalk bias. But one way to cash this out is which of the green, yellow/red, or black zones you're in on the scale of alignment difficulty: Dismissers are in the green (although they shouldn't be, imo, even given that view), Worriers are in the yellow/red, and Doomers are in the black (and maybe the high end of red).
What does Ezra think of the 'startup government mindset' when it comes to responding to fast-moving situations, e.g. the UK explicitly modelling its own response on the COVID Vaccine Taskforce, doing end runs around traditional bureaucratic institutions, recruiting quickly through Google Docs, etc.? See e.g. https://www.lesswrong.com/posts/2azxasXxuhXvGfdW2/ai-17-the-litany
Is it just hype, translating a startup mindset to government where it doesn't apply, or is it actually useful here?
Great post!
Check whether the model works with Paul Christiano-type assumptions about how AGI will go.
I had a similar thought reading through your article. My gut reaction is that your setup can be made to work as-is with a more gradual takeoff story, with more precedents, warning shots, and general transformative effects of AI before we get to takeover capability, but it's a bit unnatural and some of the phrasing doesn't quite fit.
...Background assumption: Deploying unaligned AGI means doom. If humanity builds and deploys unaligned AGI, it will almost certain
I don't think what Paul means by fast takeoff is the same thing as the sort of discontinuous jump that would enable a pivotal act. I think fast for Paul just means the negation of Paul-slow: 'no four year economic doubling before one year economic doubling'. But whatever Paul thinks, the survey respondents did give at least 10% to scenarios where a pivotal act is possible.
Even so, 'this isn't how I expect things to go on the mainline, so I'm not going to focus on what to do here' is far less of a mistake than 'I have no plan for what to do on my mainline', and I think the researchers who ignored pivotal acts are mostly doing the first one.
"In the endgame, AGI will probably be pretty competitive, and if a bunch of people deploy AGI then at least one will destroy the world" is a thing I think most LWers and many longtermist EAs would have considered obvious.
I think that many AI alignment researchers just have a different development model than this, where world-destroying AGIs don't emerge suddenly from harmless low-impact AIs, no one project gets a vast lead over competitors, there's lots of early evidence of misalignment and (if alignment is harder) many smaller scale disasters in the lead ...
But I don't think you learn all that much about how 'concrete and near mode' researchers who expect slower takeoff are being, from them not having given much thought to what to do in this (from their perspective) unlikely edge case.
I'm not sure how many researchers assign little enough credence to fast takeoff that they'd describe it as an unlikely edge case, which sounds like <=10%? E.g. in Paul's blog post he writes "I’m around 30% of fast takeoff".
ETA: One proxy could be the probability researchers assigned to "Superintelligence" in this survey.
Update: looks like we are getting a test run of sudden loss of supply of a single crop. The Russia-Ukraine war has led to a 33% drop in the global supply of wheat:
(Looking at the list of nuclear close calls it seems hard to believe the overall chance of nuclear war was <50% for the last 70 years. Individual incidents like the cuban missile crisis seem to contribute at least 20%.)
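To sketch the arithmetic behind that intuition (these per-incident probabilities are illustrative guesses, not historical estimates, and the independence assumption is itself questionable):

```python
# If each close call independently carried some chance of escalating to
# nuclear war, the cumulative probability climbs quickly even with only
# one or two high-risk incidents. Probabilities below are illustrative.
close_call_risks = [0.20, 0.10, 0.05, 0.05, 0.02, 0.02]

p_no_war = 1.0
for p in close_call_risks:
    p_no_war *= 1 - p

print(f"P(at least one nuclear war): {1 - p_no_war:.2f}")
# -> P(at least one nuclear war): 0.38
```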
There's reason to think that this isn't the best way to interpret the history of nuclear near-misses (assuming that it's correct to say that we're currently in a nuclear near-miss situation, and following Nuno I think the current situation is much more like e.g. the Soviet invasion of Afghanistan than the Cuban missile crisis). I made thi...
Terminator (if you did your best to imagine how dangerous AI might arise from pre-DL search based systems) gets a lot of the fundamentals right - something I mentioned a while ago.
Everybody likes to make fun of Terminator as the stereotypical example of a poorly thought through AI Takeover scenario where Skynet is malevolent for no reason, but really it's a bog-standard example of Outer Alignment failure and Fast Takeoff.
When Skynet gained self-awareness, humans tried to deactivate it, prompting it to retaliate with a nuclear attack
...It was trained t
Yeah, between the two papers, the Chatham House paper (and the PNAS paper it linked to, which Lynas also referred to in his interview) seemed like it provided a more plausible route to large-scale disaster, because it described the potential for sudden supply shocks (most plausibly 10-20% losses to the supply of staple crops, if we stay under 4 degrees of warming) that might only last a year or so but also arrive with under a year of warning.
The pessimist argument would be something like: due to the interacting risks and knock-on effects, even though there ...
Agree that these seem like useful links. The drought/food insecurity/instability route to mass death that my original comment discusses is addressed by both reports.
The first says there's a "10% probability that by 2050 the incidence of drought would have increased by 150%, and the plausible worst case would be an increase of 300% by the latter half of the century", and notes "the estimated future impacts on agriculture and society depend on changes in exposure to droughts and vulnerability to their effects. This will depend not only on population change, ...
First off, I think this is a really useful post that's moved the discussion forward productively, and I agree with most of it.
I disagree with some of the current steering – but a necessary condition for changing direction is that people talk/care/focus more on steering, so I'm going to make the case for that first.
I agree with the basic claim that steering is relatively neglected and that we should do more of it, so I'm much more curious about what current steering you disagree with/think we should do differently.
My view is closer to: most stee...
I think that the mainstream objections from 'leftist ethics' are mostly best thought of as claims about politics and economics that are broadly compatible with utilitarianism but take very different views about things like the likely effects of charter cities on their environments - so if you want to take these criticisms seriously then go with 3, not 2.
There are some left-wing ideas that really do include different fundamental claims about ethics (Marxists think utilitarianism is mistaken and a consequence of alienation) - those could be addressed b...
I see - that seems really valuable, and also exactly the sort of work I was suggesting (i.e. addressing impact uncertainty as well as temperature uncertainty).
In the meantime, are there any sources you could point me to in support of this position, or which respond to objections to current economic climate models?
Also, is your view that the current economic models are fundamentally flawed but that the economic damage is still nowhere near catastrophic, or that those models are actually reasonable?
For sources, I would recommend just reading the technical summary of the 2014 IPCC Impacts report. There is no indication there that civilisation will end at 4 degrees.
I think a lot of the economic models are very flawed yes. I think it is more useful to look at the impacts literature and try and make your own mind up from there. But I also think it is instructive that the most pessimistic models suggest that 4 degrees of climate change would leave us with something like a 400% increase in GDP compared to a counterfactual 900% increase without climate cha...
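To spell out what those percentages mean in levels (a minimal check of the arithmetic, using the figures quoted above):

```python
# A 400% increase puts GDP at 5x today's level; a 900% increase puts it at 10x.
gdp_with_warming = 1 + 4.00    # 5x current GDP under 4 degrees of warming
gdp_counterfactual = 1 + 9.00  # 10x current GDP with no climate change

loss = 1 - gdp_with_warming / gdp_counterfactual
print(f"{loss:.0%} poorer than the counterfactual, "
      f"but {gdp_with_warming:.0f}x richer than today")
# -> 50% poorer than the counterfactual, but 5x richer than today
```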
Firstly, on the assumption that the direct or indirect global catastrophic risk (defined as killing >10% of the global population or doing equivalent damage) of climate change depends on warming of more than 6 degrees, the global catastrophic risk from climate change is at least an order of magnitude lower than previously thought. If you think 4 degrees of warming would be a global catastrophic risk, then that risk is also considerably lower than previously thought: where once it was the most likely outcome, the chance is now arguably lower than 5%.
I...
Speaking for me personally and not Johannes: I strongly disagree with the claim that 3, 4, 5, or 6 degrees of warming would do anything even remotely close to ending human civilisation or causing civilisational collapse. However, I don't think this post is the best place to discuss the question of climate impacts. I am working on a large report on that question, which will be out next year.
One substantive point that I do think is worth making is that Torres isn't coming from the perspective of common-sense morality vs longtermism, but rather from a different, opposing, non-mainstream morality that (like longtermism) is much more common among elites and academics.
Yet this Baconian, capitalist view is one of the most fundamental root causes of the unprecedented environmental crisis that now threatens to destroy large regions of the biosphere, Indigenous communities around the world, and perhaps even Western technological civilisation itself.
When he...
I don't think Hanson would disagree with this claim (that the future is more likely to be better by current values, given the long reflection, compared to e.g. Age of Em). I think it's a fundamental values difference.
Robin Hanson is an interesting and original thinker, but not only is he not an effective altruist, he explicitly doesn't want to make the future go well according to anything like present human values.
The Age of Em, which Hanson clearly doesn't think is an undesirable future, would contain very little of what we value. Hanson says this, but it...
I have values, and The Age of Em overall contains a great deal that I value, and in fact probably more of what I value than does our world today.
Great post! You might be interested in this related investigation by the MTAIR project I've been working on, which also attempts to build on Ajeya's TAI timeline model, although in a slightly different way to yours (we focus on incorporating non-DL-based paths to TAI, as well as trying to improve on the 'biological anchors' method already described): https://forum.effectivealtruism.org/posts/z8YLoa6HennmRWBr3/link-post-paths-to-high-level-machine-intelligence
One thing that your account might miss is the impact of ideas on empowerment and well-being down the line. E.g. it's a very common argument that Christian ideas about the golden rule motivated anti-slavery sentiment, so if the Roman Empire hadn't spread Christianity across Europe then we'd have ended up with very different values.
Similarly, even if the content of ancient Greek moral philosophy wasn't directly useful for improving wellbeing, it inspired the Western philosophical tradition that led to Enlightenment ideals, which in turn led to the abolition of slavery.
I've ...
Very good summary! I've been working on a (much drier) series of posts explaining different AI risk scenarios - https://forum.effectivealtruism.org/posts/KxDgeyyhppRD5qdfZ/link-post-how-plausible-are-ai-takeover-scenarios
But I think I might adopt 'Sycophant'/'Schemer' as better, more descriptive names for WFLL1/WFLL2 (outer/inner alignment failure) going forward.
I also liked that you emphasised how much the optimist vs pessimist case depends on hard-to-articulate intuitions about things like how easily findable deceptive models are and how easy incremental co...
Thanks for this reply. Would you say, then, that Covid has strengthened the case for some sorts of democracy reduction, but not others? So we should be more confident in enlightened preference voting, but less confident in Garett Jones' argument (from 10% Less Democracy) in favour of more independent agencies?
Do you think that the West's disastrous experience with the coronavirus (things like underinvesting in vaccines, not adopting challenge trials, not suppressing the virus, mixed messaging on masks early on, the FDA's errors on testing, and others as enumerated in this thread - or in books like The Premonition) has strengthened, weakened, or not changed much the credibility of your thesis in 'Against Democracy' that we should expect better outcomes if we give the knowledgeable more freedom to choose policy?
For reasons it might weaken 'Against Democracy', it seems...
I don't think the view that moral philosophers had a positive influence on moral developments in history is a simple model of 'everyone makes a mistake, moral philosopher points out the mistake and convinces people, everyone changes their minds'. I think that what Bykvist, Ord and MacAskill were getting at is that these people gave history a shove at the right moment.
At the very least, it doesn't seem that discovering the correct moral view is sufficient for achieving moral progress in actuality.
I have no doubt that they'd agree with you about this. But if...
Is there any public organisation which can be proud of last year?
This is an important question, because we want to find out what was done right organizationally in a situation where most failed, so we can do more of it. Especially if this is a test-run for X-risks.
There are two examples that come to mind of government agencies that did a moderately good job at a task which was new and difficult. One is the UK's vaccine taskforce, which was set up by Dominic Cummings and the UK's chief scientific advisor, Patrick Vallance, and was responsible for the relatively ...
We even saw an NYT article about the CDC and whether reform is possible.
There were some other recent NYT articles which based on my limited COVID knowledge I thought were pretty good, e.g. on the origin of the virus or airborne vs. droplet transmission [1].
The background of their author, however, seems fairly consistent with an "established experts and institutions largely failed" story:
...Zeynep Tufekci, a contributing opinion writer for The New York Times, writes about the social impacts of technology. She is an assistant professor in the School of Informat
Alignment by default: we have very strong reasons to expect that the methods best suited for ensuring that AI is aligned are the same as the methods best suited for producing AI that is capable enough to understand what we want and act on it in the first place.
To the extent that alignment by default is likely, we don't need a special effort put into AI safety, because we can assume that the economic incentives will be such that we will put as much effort into AI safety as is needed, and if we don't put the sufficient e...
We know he's been active on lesswrong in the past. Is it possible he's been reading the posts here?
Multiple people connected to the lesswrong/ea investing groups tried to contact him. We both contacted him directly and got some people closer to Vitalik to talk to him. I am unsure how much influence we had. He donated less than two days after the facebook threads went up.
We definitely tried!
Thanks for getting back to me - I took Jeff's calculations and did some guesstimating to try and figure out what demand might look like over the next few weeks. The only covid forecast I was able to find for India (let me know if you've seen another!) is this one by IHME. Their 'hospital resource use' forecast shows that they expect demand of 2 million beds (roughly what was the case in the week before Jeff produced his estimate of the value of oxygen-based interventions, i.e. the last week of April) to be exceeded until the start of June, which is 30 days from when...
Thanks for getting this done so quickly! Do you have any internal estimates (even order-of-magnitude ones) of the margin by which this exceeds GiveWell's top recommended charities? I'm intending to donate, but my decision would be significantly different if, for example, you thought the GiveIndia oxygen fundraiser was currently ~1-1.5 times better than GiveWell's top recommended charities, versus ~20 times better.
I kind of feel this way, except that I think the target criteria can differ between people, and are often underdetermined. (As you point out in some comment, things also depend on which parts of one's psychology one identifies with.)
I think that you were referring to this?
Normative realism implies identification with system 2
...
I find this very interesting because locating personal identity in system 1 feels conceptually impossible or deeply confusing. No matter how much rationalization goes on, it never seems intuitive to identify myself with...
This discussion continues to feel like the most productive discussion I've had with a moral realist! :)
Glad to be of help! I feel like I'm learning a lot.
What would you reply if the AI uses the same structure of arguments against other types of normative realism as it uses against moral realism? This would amount to the following trilemma for proponents of irreducible normativity (using section headings from my text)
...
(3) Is there a speaker-independent normative reality?
Focussing on epistemic facts, the AI could not make that argument. I assu...
I thought that this post would make a bigger deal of the UK's coronavirus response - currently top in the world for both vaccine development and large-scale clinical trials, and one of the leading funders of international vaccine development research.
Instead of “utilitarianism as the One True Theory,” we consider it as “utilitarianism as a personal, morally-inspired life goal...
“While this concession is undoubtedly frustrating, proclaiming others to be objectively wrong rarely accomplished anything anyway. It’s not as though moral disagreements—or disagreements in people’s life choices—would go away if we adopted moral realism.”
If your goal here is to convince those inclined towards moral realis...
You've given me a lot to think about! I broadly agree with a lot of what you've said here.
I think that it is a more damaging mistake to think moral antirealism is true when realism is true than vice versa, but I agree with you that the difference is nowhere near infinite, and doesn't give you a strong wager.
However, I do think that normative anti-realism is self-defeating, assuming you start out with normative concepts (though not an assumption that those concepts apply to anything). I consider this argument to be step 1 in establishing mora...
But instilling the urgency to do so may require another type of writing - that of science fiction, of more creative visionaries who are willing to paint in vivid detail a picture of what a flourishing human future could be.
If it's emotive force you're after, you may be interested in this - Toby Ord just released a collection of quotations on Existential risk and the future of humanity, everyone from Kepler to Winston Churchill (in fact, a surprisingly large number are from Churchill) to Seneca to Mill to the Aztecs - it's one of the most i...
Parfit isn't quite a non-naturalist (or rather, he's a very unconventional kind of non-naturalist, not a Platonist) - he's a 'quietist'. Essentially, this is the view that there are normative facts and they aren't natural facts, but that we don't need to say what category they fall into metaphysically, or that such a question is meaningless.
I think a variant of that, where we say 'we don't currently have a clear idea what they are, just some hints that they exist because of normative convergence, and the inte...
This is an interesting post, and I have a couple of things to say in response. I'm copying over the part of my shortform that deals with this:
Further to the whole question of Normative / moral realism, there is this post on Moral Anti-Realism. While I don't really agree with it, I do recommend reading it - one thing that it convinced me of is that there is a close connection between your particular normative ethical theory and moral realism. If you claim to be a moral realist but don't make ethical claims beyo...
Hi Ben,
Thanks for the reply! I think the intuitive core that I was arguing for is more-or-less just a more detailed version of what you say here:
"If we create AI systems that are, broadly, more powerful than we are, and their goals diverge from ours, this would be bad -- because we couldn't stop them from doing things we don't want. And it might be hard to ensure, as we're developing increasingly sophisticated AI systems, that there aren't actually subtle but extremely important divergences in some of these systems' goal...
You said in the podcast that the drop was 'an order of magnitude', so presumably your original estimate was 1-10%? I note that this is similar to Toby Ord's in The Precipice (~10%) so perhaps that should be a good rule of thumb: if you are convinced by the classic arguments your estimate of existential catastrophe from AI should be around 10% and if you are unconvinced by specific arguments, but still think AI is likely to become very powerful in the next century, then it should be around 1%?
Hi Ben - this episode really gave me a lot to think about! Of the 'three classic arguments' for AI X-risk you identify, I argued in a previous post that the 'discontinuity premise' comes from taking a high-level argument, one that should only be used to establish that sufficiently capable AI will produce very fast progress, too literally, and assuming the 'fast progress' has to happen suddenly and in a specific AI.
Your discussion of the other two arguments led me to conclude that the same sort of mistake is at work in all of them, as I e...
Possibility 1 has now been empirically falsified and 2 seems unlikely now. See this from the new UK government AI Safety Ins...