Greg_Colbourn ⏸️

5659 karma · Joined
Interests: Slowing down AI

Bio

Global moratorium on AGI, now (Twitter). Founder of CEEALAR (née the EA Hotel; ceealar.org)

Comments (1128)

Having the superpowers on board is the main thing. If others opt out, enforcement against them can then be effective.

No, but it's far better than what we have now.

It's not "longtermist" or "fanatical" at all (or even altruistic) to try to prevent yourself and everyone else on the planet (humans and animals) from being killed in the near future by uncontrollable ASI[1] (quite possibly in a horrible, painful[2] way[3]).

  1. ^

    Indeed, there are many non-EAs who care a great deal about this issue now.

  2. ^

    I mention this as it's a welfarist consideration, even if one doesn't care about death in and of itself.

  3. ^

    Ripped apart by self-replicating computronium-building nanobots, anyone?

My model looks something like this:

There are a bunch of increasingly hard questions on the Alignment Test. We need to get enough of the core questions right to avoid the ASI -> everyone quickly dies scenario. This is the 'passing grade'. There are some bonus/extra credit questions that we also need to get right to get an A (a flourishing future).

I think the bonus/extra credit questions are part of the main test - if you don't get them right everyone still dies, but maybe a bit more slowly.

All the doom flows through the cracks of imperfect alignment/control. And we can asymptote toward, but never reach, existential safety[1].

  1. ^

    Of course this applies to all other x-risks too. It's just that ASI x-risk is very near term and acute (in absolute terms, and relative to all the others), and we aren't even starting in earnest with the asymptoting yet (and likely won't if we don't get a Pause).

Hi Niel, what I'd like to see is an argument for the tractability of successfully "navigating the transition to a world with AGI" without a global catastrophe (or extinction) (i.e. an explanation for why your p(doom|AGI) is lower). I think this is much less tractable than getting a (really effective) Pause! (Even if a Pause itself is somewhat unlikely at this point.)

I think most people in EA have relatively low (but still macroscopic) p(doom)s (e.g. 1-20%), and hold the view that "by default, everything turns out fine". I don't think this has ever been sufficiently justified. The common view is that alignment will just somehow be solved well enough to keep us alive, and maybe even let us thrive (if we just keep directing more talent and funding to research). But then the extrapolation to the ultimate implications of such imperfect alignment (e.g. gradual disempowerment -> existential catastrophe) never happens.

Ok, so in the spirit of "EA’s focus on collaborativeness and truthseeking has meant that people encouraged us to interrogate whether our previous plans were in line with our beliefs" [about p(doom|AGI)], and "we aim to be prepared to change our minds and plans if the evidence" [is lacking], I ask: have you seriously considered whether "safely navigating the transition to a world with AGI" is even possible? (Let alone at all likely from where we stand.)

You (we all) should be devoting a significant fraction of resources toward slowing down/pausing/stopping AGI (e.g. pushing for a well enforced global non-proliferation treaty on AGI/ASI), if we want there to be a future at all.

Reposting this from Daniel Kokotajlo:

This is probably the most important single piece of evidence about AGI timelines right now. Well done! I think the trend should be superexponential, e.g. each doubling takes 10% less calendar time on average. Eli Lifland and I did some calculations yesterday suggesting that this would get to AGI in 2028. Will do more serious investigation soon.

Why do I expect the trend to be superexponential? Well, it seems like it sorta has to go superexponential eventually. Imagine: We've got to AIs that can with ~100% reliability do tasks that take professional humans 10 years. But somehow they can't do tasks that take professional humans 160 years? And it's going to take 4 more doublings to get there? And these 4 doublings are going to take 2 more years to occur? No, at some point you "jump all the way" to AGI, i.e. AI systems that can do any length of task as well as professional humans -- 10 years, 100 years, 1000 years, etc.

Also, zooming in mechanistically on what's going on, insofar as an AI system can do tasks below length X but not above length X, it's gotta be for some reason -- some skill that the AI lacks, which isn't important for tasks below length X but which tends to be crucial for tasks above length X. But there are only a finite number of skills that humans have that AIs lack, and if we were to plot them on a horizon-length graph (where the x-axis is log of horizon length, and each skill is plotted on the x-axis where it starts being important, such that it's not important to have for tasks less than that length) the distribution of skills by horizon length would presumably taper off, with tons of skills necessary for pretty short tasks, a decent amount necessary for medium tasks (but not short), and a long thin tail of skills that are necessary for long tasks (but not medium), a tail that eventually goes to 0, probably around a few years on the x-axis. So assuming AIs learn skills at a constant rate, we should see acceleration rather than a constant exponential. There just aren't that many skills you need to operate for 10 days that you don't also need to operate for 1 day, compared to how many skills you need to operate for 1 hour that you don't also need to operate for 6 minutes.

There are two other factors worth mentioning which aren't part of the above: One, the projected slowdown in capability advances that'll come as compute and data scaling falters due to becoming too expensive. And two, pointing in the other direction, the projected speedup in capability advances that'll come as AI systems start substantially accelerating AI R&D.
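To make the superexponential claim above concrete, here is a minimal sketch (not Kokotajlo and Lifland's actual calculation): it extrapolates a task-horizon trend in which each successive doubling takes 10% less calendar time than the last, as in the quote. The starting horizon, baseline doubling time, and "AGI-level" horizon threshold are placeholder assumptions chosen purely for illustration.

```python
# Illustrative sketch of a superexponential task-horizon extrapolation.
# Assumptions (not from the quoted comment): starting horizon ~1 hour,
# first doubling takes ~7 months, "AGI-level" horizon ~2000 hours (~1 working year).
# The only modelled claim from the quote is that each doubling is 10% faster than the last.

def months_to_horizon(start_horizon_hours: float = 1.0,
                      target_horizon_hours: float = 2000.0,
                      first_doubling_months: float = 7.0,
                      shrink_per_doubling: float = 0.10) -> float:
    """Calendar months until the task horizon reaches the target, where
    doubling i takes first_doubling_months * (1 - shrink_per_doubling)**i."""
    horizon = start_horizon_hours
    doubling_time = first_doubling_months
    elapsed = 0.0
    while horizon < target_horizon_hours:
        elapsed += doubling_time
        horizon *= 2.0
        doubling_time *= (1.0 - shrink_per_doubling)
    return elapsed

if __name__ == "__main__":
    print(f"~{months_to_horizon():.0f} months to the target horizon "
          "under these illustrative assumptions")
```

Because the doubling times shrink geometrically, their sum converges, so under this model the horizon reaches any finite target in bounded calendar time; that is the "jump all the way" behaviour described above, in contrast to a constant-doubling-time (plain exponential) trend.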

This is going viral on X (2.8M views as of posting this comment).

On this view, why not work to increase extinction risk? (It would be odd if doing nothing was the best course of action when the stakes are so high either way.)

Even if it seems net-negative now, we don't know that it always will be (and we can work to make it net-positive!).

Also, on this view, why not work to increase our chance of extinction?
