
GPT-5 training is probably starting around now. It seems very unlikely that GPT-5 will cause the end of the world. But it’s hard to be sure. I would guess that GPT-5 is more likely to kill me than an asteroid, a supervolcano, a plane crash or a brain tumor. We can predict fairly well what the cross-entropy loss will be, but pretty much nothing else.

Maybe we will suddenly discover that the difference between GPT-4 and superhuman level is actually quite small. Maybe GPT-5 will be extremely good at interpretability, such that it can recursively self improve by rewriting its own weights.

Hopefully model evaluations can catch catastrophic risks before wide deployment, but again, it’s hard to be sure. GPT-5 could plausibly be devious enough to circumvent all of our black-box testing. Or it may be that it’s too late as soon as the model has been trained. These are small but real possibilities, and it’s a significant milestone of failure that we are now taking these kinds of gambles.

How do we do better for GPT-6?

Governance efforts are mostly focussed on relatively modest goals. Few people are directly aiming at the question: how do we stop GPT-6 from being created at all? It’s difficult to imagine a world where governments actually prevent Microsoft from building a $100 billion AI training data center by 2028.

In fact, OpenAI apparently fears governance so little that they just went and told the UK government that they won’t give it access to GPT-5 for pre-deployment testing [Edit - 17 May 2024: I now think this is probably false]. And the number of safety focussed researchers employed by OpenAI is dropping rapidly.

Hopefully there will be more robust technical solutions for alignment available by the time GPT-6 training begins. But few alignment researchers actually expect this, so we need a backup plan.

Plan B: Mass protests against AI

In many ways AI is an easy thing to protest against. Climate protesters are asking to completely reform the energy system, even if it decimates the economy. Israel / Palestine protesters are trying to sway foreign policies on an issue where everyone already holds deeply entrenched views. Social justice protesters want to change people’s attitudes and upend the social system.

AI protesters are just asking to ban a technology that doesn’t exist yet. About 0% of the population deeply cares that future AI systems are built. Most people support pausing AI development. It doesn’t feel like we’re asking normal people to sacrifice anything. They may in fact be paying a large opportunity cost on the potential benefits of AI, but that’s not something many people will get worked up about. Policy-makers, CEOs and other key decision makers that governance solutions have to persuade are some of the only groups that are highly motivated to let AI development continue.

No innovation required

Protests are the most unoriginal way to prevent an AI catastrophe - we don’t have to do anything new. Previous successful protesters have made detailed instructions for how to build a protest movement.

This is the biggest advantage of protests compared to other solutions - it requires no new ideas (unlike technical alignment) and no one's permission (unlike governance solutions). A sufficiently large number of people taking to the streets forces politicians to act. A sufficiently large and well organized special interest group can control an issue:

I walked into my office while this was going on and found a sugar lobbyist hanging around, trying to stay close to the action. I felt like being a smart-ass so I made some wise-crack about the sugar industry raping the taxpayers. Without another word, I walked into my private office and shut the door. I had no real plan to go after the sugar people. I was just screwing with the guy.

My phone did not stop ringing for the next five weeks….I had no idea how many people in my district were connected to the sugar industry. People were calling all day, telling me they made pumps or plugs or boxes or some other such part used in sugar production and I was threatening their job. Mayors called to tell me about employers their towns depended on who would be hurt by a sugar downturn. It was the most organized effort I had ever seen.

And that’s why you don’t fuck with sugar.

The discomfort of doing something weird

If we are correct about the risk of AI, history will look kindly upon us (assuming we survive). Already people basically know about AI x-risk and understand that it is not a ridiculous conspiracy theory. But for now protesting about AI is kind of odd. This doesn’t have to be a bad thing - PauseAI protests are a great way to meet interesting, unusual people. Talking about PauseAI is a conversation starter because it’s such a surprising thing to do.

When AI starts to have a large impact on the economy, it will naturally move up the priority list of the general population. But people react too late to exponentials. If AI continues to improve at the current rate, the popular reaction may come too late to avoid the danger. PauseAI’s aim is to bring that reaction forward.

Some AI researchers think that they should not go to protests because it is not their comparative advantage. But this is wrong: the key skill required is the ability to do something weird - to take ideas seriously and to actually try to fix important problems. The protests are currently so small that the marginal impact of an extra person showing up for a couple of hours once every few months is very large.

Preparing for the moment

I think a lot about this post from just after ChatGPT came out, asking why the alignment community wasn’t more prepared to seize the moment when everyone suddenly noticed that AI was getting good. I think this is a good question and one of the reasons is that most alignment researchers did not see it coming.

There will be another moment like that, when people realize that AI is coming for their job imminently and that AI is an important issue affecting their lives. We need to be prepared for that opportunity and the small movement that PauseAI builds now will be the foundation which bootstraps this larger movement in the future.

To judge the value of AI protests by the current, small protests would be to judge the impact of AI by the current language models (a mistake which most of the world appears to be making). We need to build the mass movement. We need to become the Sugar Lobby.

PauseAI’s next protest is on Monday 13 May, in 8 cities around the world.


Comments (36)

One concern I haven't seen raised elsewhere: PauseAI was initiated during the 'era' of scaling training-time compute, and seems predicated on the assumption that we can stop the development of more advanced AIs by stopping big labs from training bigger models. 

However, the paradigm has shifted. As Ilya Sutskever discussed in his 2024 NeurIPS talk, we've hit slowdowns on scaling pre-training and post-training compute. (You could say that AI development along these lines has already naturally 'paused', not due to political pressure, but due to technical and economic factors.)

Instead, gains are increasingly coming from scaling inference-time compute. And this is no longer in the future - it can be straightforwardly done now, as long as you have the compute to run queries. So I feel like there's a need to also govern inference-time compute.

(FWIW: I recognise two things can be important at once) 

Great point! Off the cuff I don't think this massively changes considerations for PauseAI, but I'll need to think about this.

I'd like to make clear to anyone reading that you can support the PauseAI movement right now, only because you think it is useful right now. And then in the future, when conditions change, you can choose to stop supporting the PauseAI movement. 

AI is changing extremely fast (e.g. technical work was probably our best bet a year ago, I'm less sure now). Supporting a particular tactic/intervention does not commit you to an ideology or team forever!

I'd like to add an asterisk. It is true that you can and should support things that seem good while they seem good and then retract support, or express support on the margin but not absolutely. But sometimes supporting things for a period has effects you can't easily take back. This is especially the case if (1) added marginal support summons some bigger version of the thing that, once in place, cannot be re-bottled, or (2) increased clout for that thing changes the culture significantly (I think cultural changes are very hard to reverse; culture generally doesn't go back, only moves on).

I think there are many cases where, before throwing their lot in with a political cause for instrumental reasons, people should've first paused to think more about whether this is the type of thing they'd like to see more of in general. Political movements also tend to have an enormous amount of inertia, and often end up very influenced by path-dependence and memetic fitness gradients.

Thanks for your comment Rudolf! I predict that my comment is going to be extremely downvoted but I'm writing it partly because I think it is true and partly because it points to a meta issue in EA:

I think it is unrealistic to ask people to internalise the level of ambiguity you're proposing. This is how EAs turn themselves into mental pretzels of inaction.

Yup.

the small movement that PauseAI builds now will be the foundation which bootstraps this larger movement in the future

Is one of the main points of my post. If you support PauseAI today you may unleash a force which you cannot control tomorrow.

A few questions:

  • What is the risk level below which you'd be OK with unpausing AI?
  • What do you think about the potential benefits from AI?
  • How do you interpret models of AI pause, such as this one from Chad Jones?
  • What is the risk level below which you'd be OK with unpausing AI?

I think approximately 1 in 10,000 chance of extinction for each new GPT would be acceptable given the benefits of AI. This is approximately my guess for GPT-5, so if we could release that model and then pause, I'd be okay with that.

A major consideration here is the use of AI to mitigate other x-risks. Some of Toby Ord's x-risk estimates:

  • AI - 1 in 10
  • Engineered pandemic - 1 in 30
  • Unforeseen anthropogenic risks (e.g. dystopian regime, nanotech) - 1 in 30
  • Other anthropogenic risks - 1 in 50
  • Nuclear war - 1 in 1000
  • Climate change - 1 in 1000
  • Other environmental damage - 1 in 1000
  • Supervolcano - 1 in 10,000

If there was a concrete plan under which AI could be used to mitigate pandemics and anthropogenic risks, then I would be ok with a higher probability of AI extinction, but it seems more likely that AI progress would increase these risks before it decreased them.

AI could be helpful for climate change and eventually nuclear war. So maybe I should be willing to go a little higher on the risk. But we might need a few more GPTs to fix these problems and if each new GPT is 1 in 10,000 then it starts to even out.
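To make the "starts to even out" point concrete, here is a rough sketch of how a fixed per-model risk compounds over several releases. It assumes each release carries the same independent 1 in 10,000 risk, which is a simplification rather than anything claimed above; the function names are hypothetical.

```python
def cumulative_risk(per_model_risk: float, n_models: int) -> float:
    """Chance of at least one catastrophe across n independent releases."""
    return 1.0 - (1.0 - per_model_risk) ** n_models


def models_until(threshold: float, per_model_risk: float) -> int:
    """Smallest number of releases whose cumulative risk reaches the threshold."""
    if not 0.0 < per_model_risk < 1.0 or not 0.0 < threshold < 1.0:
        raise ValueError("risks must be strictly between 0 and 1")
    n = 0
    while cumulative_risk(per_model_risk, n) < threshold:
        n += 1
    return n


PER_MODEL_RISK = 1e-4   # the 1 in 10,000 figure per new GPT
NUCLEAR_OR_CLIMATE = 1e-3  # Ord's 1 in 1000 estimate for either risk

for n in (1, 5, 10, 20):
    print(f"{n:>2} releases: cumulative risk {cumulative_risk(PER_MODEL_RISK, n):.5f}")

print("releases to match a 1 in 1000 risk:",
      models_until(NUCLEAR_OR_CLIMATE, PER_MODEL_RISK))
```

Under these assumptions, roughly ten or so releases accumulate about as much risk as nuclear war or climate change on Ord's numbers, so the benefit of fixing those problems has to arrive within a handful of model generations.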

  • What do you think about the potential benefits from AI?

I'm very bullish about the benefits of an aligned AGI. Besides mitigating x-risk, I think curing aging should be a top priority and is worth taking some risks to obtain.

I've read the post quickly, but I don't have a background in economics, so it would take me a while to fully absorb. My first impression is that it is interesting but not that useful for making decisions right now. The simplifications required by the model offset the gains in rigor. What do you think? Is it something I should take the time to understand?

My guess would be that the discount rate is pretty cruxy. Intuitively I would expect almost any gains over the next 1000 years to be offset by reductions in x-risk since we could have zillions of years to reap the benefits. (On a meta-level I believe moral questions are not "truthy" so this is just according to my vaguely total utilitarian preferences, not some deeper truth).

I think approximately 1 in 10,000 chance of extinction for each new GPT would be acceptable given the benefits of AI. This is approximately my guess for GPT-5, so I think if we could release that model and then pause, I'd be okay with that.

To me, this is wild. 1/10,000 * 8 billion people = 800,000 current lives lost in expectation, not even counting future lives. If you think GPT-5 is worth 800k+ human lives, you must have high expectations. :)

When you're weighing existential risks (or other things which steer human civilization on a large scale) against each other, effects are always going to be denominated in a very large number of lives. And this is what OP said they were doing: "a major consideration here is the use of AI to mitigate other x-risks". So I don't think the headline numbers are very useful here (especially because we could make them far far higher by counting future lives).

Thanks for the comment, Richard.

So I don't think the headline numbers are very useful here (especially because we could make them far far higher by counting future lives).

I used to prefer focussing on tail risk, but I now think expected deaths are a better metric.

  • Interventions in the effective altruism community are usually assessed under 2 different frameworks, existential risk mitigation, and nearterm welfare improvement. It looks like 2 distinct frameworks are needed given the difficulty of comparing nearterm and longterm effects. However, I do not think this is quite the right comparison under a longtermist perspective, where most of the expected value of one’s actions results from influencing the longterm future, and the indirect longterm effects of saving lives outside catastrophes cannot be neglected.
  • In this case, I believe it is better to use a single framework for assessing interventions saving human lives in catastrophes and normal times. One way of doing this, which I consider in this post, is supposing the benefits of saving one life are a function of the population size.
  • Assuming the benefits of saving a life are proportional to the ratio between the initial and final population, and that the cost to save a life does not depend on this ratio, it looks like saving lives in normal times is better to improve the longterm future than doing so in catastrophes.

Thanks for pointing that out, Ted!

1/10,000 * 8 billion people = 800,000 current lives lost in expectation

The expected death toll would be much greater than 800 k assuming a typical tail distribution. This is the expected death toll linked solely to the maximum severity, but lower levels of severity would add to it. Assuming deaths follow a Pareto distribution with a tail index of 1.60, which characterises war deaths, the minimum deaths would be 25.3 M (= 8*10^9*(10^-4)^(1/1.60)). Consequently, the expected death toll would be 67.5 M (= 1.60/(1.60 - 1)*25.3*10^6), i.e. 1.11 (= 67.5/61) times the number of deaths in 2023, or 111 (= 67.5/0.608) times the number of malaria deaths in 2022. I certainly agree undergoing this risk would be wild.

Side note. I think the tail distribution will eventually decay faster than that of a Pareto distribution, but this makes my point stronger. In this case, the product between the deaths and their probability density would be lower for higher levels of severity, which means the expected deaths linked to such levels would represent a smaller fraction of the overall expected death toll.
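For anyone who wants to check the Pareto arithmetic above, here is a minimal sketch. It assumes an untruncated Pareto distribution whose survival function is P(X > x) = (x_min/x)^alpha, calibrated so that the chance of exceeding the full population is 1 in 10,000. The function names are hypothetical, and the inputs are the ones quoted in the comment.

```python
def pareto_minimum(max_deaths: float, tail_prob: float, alpha: float) -> float:
    """Solve (x_min / max_deaths) ** alpha == tail_prob for x_min."""
    return max_deaths * tail_prob ** (1.0 / alpha)


def pareto_mean(x_min: float, alpha: float) -> float:
    """Mean of a Pareto distribution; only finite when alpha > 1."""
    if alpha <= 1.0:
        raise ValueError("the mean is infinite for alpha <= 1")
    return alpha / (alpha - 1.0) * x_min


POPULATION = 8e9      # current world population
P_EXTINCTION = 1e-4   # 1 in 10,000 chance of everyone dying
TAIL_INDEX = 1.60     # tail index used in the comment for war deaths

naive_expected = POPULATION * P_EXTINCTION
x_min = pareto_minimum(POPULATION, P_EXTINCTION, TAIL_INDEX)
expected = pareto_mean(x_min, TAIL_INDEX)

print(f"extinction-only expected deaths: {naive_expected:,.0f}")
print(f"Pareto minimum deaths:           {x_min / 1e6:.1f} M")
print(f"Pareto expected deaths:          {expected / 1e6:.1f} M")
```

This reproduces the figures in the comment to within rounding, and makes it easy to see how sensitive the result is to the tail index.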

Thanks for elaborating, Joseph!

A major consideration here is the use of AI to mitigate other x-risks. Some of Toby Ord's x-risk estimates

I think Toby's existential risk estimates are many orders of magnitude higher than warranted. I estimated an annual extinction risk of 5.93*10^-12 for nuclear wars, 2.20*10^-14 for asteroids and comets, 3.38*10^-14 for supervolcanoes, a prior of 6.36*10^-14 for wars, and a prior of 4.35*10^-15 for terrorist attacks. These values are already super low, but I believe existential risk would still be orders of magnitude lower. I think there would only be a 0.0513 % (= e^(-10^9/(132*10^6))) chance of a repetition of the last mass extinction 66 M years ago, the Cretaceous–Paleogene extinction event, being existential. I got my estimate assuming:

  • An exponential distribution with a mean of 132 M years (= 66*10^6*2) represents the time between i) human extinction in such catastrophe and ii) the evolution of an intelligent sentient species after such a catastrophe. I supposed this on the basis that:
    • An exponential distribution with a mean of 66 M years describes the time between:
      • 2 consecutive such catastrophes.
      • i) and ii) if there are no such catastrophes.
    • Given the above, i) and ii) are equally likely. So the probability of an intelligent sentient species evolving after human extinction in such a catastrophe is 50 % (= 1/2).
    • Consequently, one should expect the time between i) and ii) to be 2 times (= 1/0.50) as long as that if there were no such catastrophes.
  • An intelligent sentient species has 1 billion years to evolve before the Earth becomes uninhabitable.
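The 0.0513 % figure follows from the survival function of an exponential distribution. A minimal sketch under the assumptions listed above (66 M years mean, 50 % chance of a successor species evolving before the next catastrophe, 1 billion year window); the variable and function names are hypothetical.

```python
import math


def prob_no_successor(window_years: float, mean_years: float) -> float:
    """P(T > window) for an exponentially distributed waiting time T."""
    return math.exp(-window_years / mean_years)


MEAN_BETWEEN_CATASTROPHES = 66e6   # years since the last mass extinction
P_EVOLVE_FIRST = 0.5               # successor species evolves before next catastrophe
HABITABLE_WINDOW = 1e9             # years left before Earth becomes uninhabitable

# The expected waiting time doubles once interruptions by catastrophes are counted.
mean_wait = MEAN_BETWEEN_CATASTROPHES / P_EVOLVE_FIRST

p_existential = prob_no_successor(HABITABLE_WINDOW, mean_wait)
print(f"mean wait for a successor species: {mean_wait / 1e6:.0f} M years")
print(f"chance no successor evolves in time: {p_existential:.4%}")
```

The result is driven almost entirely by the ratio of the habitable window to the mean waiting time, so halving the window or doubling the mean changes the answer by more than an order of magnitude.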

Hi Matthew! I'd be curious to hear your thoughts on a couple of questions (happy for you to link if you've posted elsewhere): 

1/ What is the risk level above which you'd be OK with pausing AI?

2/ Under what conditions would you be happy to attend a protest? (LMK if you have already attended one!)

What is the risk level above which you'd be OK with pausing AI?

My loose off-the-cuff response to this question is that I'd be OK with pausing if there was a greater than 1/3 chance of doom from AI, with the caveats that:

  • I don't think p(doom) is necessarily the relevant quantity. What matters is the relative benefit of pausing vs. unpausing, rather than the absolute level of risk.
  • "doom" lumps together a bunch of different types of risks, some of which I'm much more OK with compared to others. For example, if humans become a gradually weaker force in the world over time, and then eventually die off in some crazy accident in the far future, that might count as "humans died because of AI" but it's a lot different than a scenario in which some early AIs overthrow our institutions in a coup and then commit genocide against humans.
  • I think it would likely be more valuable to pause later in time during AI takeoff, rather than before AI takeoff.

Under what conditions would you be happy to attend a protest? (LMK if you have already attended one!)

I attended the protest against Meta because I thought their approach to AI safety wasn't very thoughtful, although I'm still not sure it was a good decision to attend. I'm not sure what would make me happy to attend a protest, but these scenarios might qualify:

  • A company or government is being extremely careless about deploying systems that pose great risks to the world. (This doesn't count situations in which the system poses negligible risks but some future system could pose a greater risk.)
  • The protesters have clear, reasonable demands that I broadly agree with (e.g. they don't complain much about AI taking people's jobs, or AI being trained on copyrighted data, but are instead focused on real catastrophic risks that are directly addressed by the protest).

There's a crux which is very important. If you only want to attend protests where the protesters are reasonable and well informed and agree with you, then you implicitly only want to attend small protests.

It seems pretty clear to me that most people are much less concerned about x-risk than job loss and other concerns. So we have to make a decision - do we stick to our guns and have the most epistemically virtuous protest movement in history and make it 10x harder to recruit new people and grow the movement? Or do we compromise and welcome people with many concerns, form alliances with groups we don't agree with in order to have a large and impactful movement?

It would be a failure of instrumental rationality to demand the former. This is just a basic reality about solving coordination problems.

[To provide a counter argument: having a big movement that doesn't understand the problem is not useful. At some point the misalignment between the movement and the true objective will be catastrophic.

I don't really buy this because I think that pausing is a big and stable enough target and it is a good solution for most concerns.]

This is something I am actually quite uncertain about so I would like to hear your opinion.

I think it's worth trying hard to stick to strict epistemic norms. The main argument you bring against is that it's more effective to be more permissive about bad epistemics. I doubt this. It seems to me that people overstate the track record of populist activism at solving complicated problems. If you're considering populist activism, I would think hard about where, how, and on what it has worked.

Consider environmentalism. It seems quite uncertain whether the environmentalist movement has been net positive (!). This is an insane admission to have to make, given that the science is fairly straightforward, environmentalism is clearly necessary, and the movement has had huge wins (e.g. massive shift in public opinion, pushing governments to make commitments, & many mundane environmental improvements in developed country cities over the past few decades). However, the environmentalist movement has repeatedly spent enormous efforts on directly harming their stated goals through things like opposing nuclear power and GMOs. These failures seem very directly related to bad epistemics.

In contrast, consider EA. It's not trivial to imagine a movement much worse along the activist/populist metrics than EA. But EA seems quite likely positive on net, and the loosely-construed EA community has gained a striking amount of power despite its structural disadvantages.

Or consider nuclear strategy. It seems a lot of influence was had by e.g. the staff of RAND and other sober-minded, highly-selected, epistemically-strong actors. Do you want more insiders at think-tanks and governments and companies, and more people writing thoughtful pieces that swing elite opinion, all working in a field widely seen as credible and serious? Or do you want more loud activists protesting on the streets?

I'm definitely not an expert here, but by thinking through what I understand about the few cases I can think of, the impression I get is that activism and protest have worked best to fix the wrongs of simple and widespread political oppression, but that on complex technical issues higher-bandwidth methods are usually how actual progress is made.

I think there are also some powerful but abstract points:

  1. Choosing your methods is not just a choice over methods, but also a choice over who you appeal to. And who you appeal to will change the composition of your movement, and therefore, in the long run, the choice of methods. Consider carefully before summoning forces you can't control (this applies both to superhuman AI as well as epistemically-shoddy charismatic activist-leaders).
  2. If we make the conversation about AIS more thoughtful, reasonable, and rational, it increases the chances that the right thing (whatever that ends up being - I think we should have a lot of intellectual humility here!) ends up winning. If we make it more activist, political, and emotional, we privilege the voice of whoever is better at activism, politics, and narratives. I think you basically always want to push the thoughtfulness/reasonableness/rationality. This point is made well in one of Scott Alexander's best essays (see section IV in particular, for the concept of asymmetric vs symmetric weapons). There is a spirit here, of truth-seeking and liberalism and building things, of fighting Moloch rather than sacrificing our epistemics to him for +30% social clout. I admit that this is partly an aesthetic preference on my part. But I do believe in it strongly.

Thanks, Rudolf, I think this is a very important point, and probably the best argument against PauseAI. It's true in general that The Ends Do Not Justify the Means (Among Humans).

My primary response is that you are falling for status-quo bias. Yes this path might be risky, but the default path is more risky. My perception is the current governance of AI is on track to let us run some terrible gambles with the fate of humanity.

Consider environmentalism. It seems quite uncertain whether the environmentalist movement has been net positive (!).

We can play reference class tennis all day but I can counter with the example of the Abolitionists, the Suffragettes, the Civil Rights movement, Gay Pride or the American XL Bully.

It seems to me that people overstate the track record of populist activism at solving complicated problems
...
the science is fairly straightforward, environmentalism is clearly necessary, and the movement has had huge wins

As I argue in the post, I think this is an easier problem than climate change. Just as most people don't need a detailed understanding of the greenhouse effect, most people don't need a detailed understanding of the alignment problem ("creating something smarter than yourself is dangerous").

The advantage with AI is that there is a simple solution that doesn't require anyone to make big sacrifices, unlike with climate change. With PauseAI, the policy proposal is right there in the name, so it is harder to become distorted than vaguer goals of "environmental justice".

fighting Moloch rather than sacrificing our epistemics to him for +30% social clout

I think to a significant extent it is possible for PauseAI leadership to remain honest while still having broad appeal. Most people are fine if you say that "I in particular care mostly about x-risk, but I would like to form a coalition with artists who have lost work to AI."

There is a spirit here, of truth-seeking and liberalism and building things, of fighting Moloch rather than sacrificing our epistemics to him for +30% social clout. I admit that this is partly an aesthetic preference on my part. But I do believe in it strongly.

I'm less certain about this but I think the evidence is much less strong than rationalists would like to believe. Consider: why has no successful political campaign ever run on actually good, nuanced policy arguments? Why do advertising campaigns not make rational arguments for why you should prefer their product, instead appealing to your emotions? Why did it take until 2010 for people to have the idea of actually trying to figure out which charities are effective? The evidence is overwhelming that emotional appeals are the only way to persuade large numbers of people.

If we make the conversation about AIS more thoughtful, reasonable, and rational, it increases the chances that the right thing (whatever that ends up being - I think we should have a lot of intellectual humility here!) ends up winning.

Again, this seems like it would be good, but the evidence is mixed. People were making thoughtful arguments for why pandemics are a big risk long before Covid, but the world's institutions were sufficiently irrational that they failed to actually do anything. If there had been an emotional, epistemically questionable mass movement calling for pandemic preparedness, that would have probably been very helpful.

Most economists seem to agree that European monetary policy is pretty bad and significantly harms Europe, but our civilization is too inadequate to fix the problem. Many people make great arguments about why aging sucks and it should really be a top priority to fix, but it's left to Silicon Valley to actually do something. Similarly for shipping policy, human challenge trials and starting school later. There is a long list of preventable, disastrous policies which society has failed to fix due to a lack of political will, not a lack of sensible arguments.

>in the long run

What if we don't have very long? You aren't really factoring in the time crunch we are in (the whole reason that PauseAI is happening now is short timelines).

I think that AI Safety is probably neglected in the public consciousness, simply because most people still don't understand what AI even "is". This lack of understanding obviously precludes people from caring about AI safety, because they don't appreciate that AI is a qualitatively different technology to any technology hitherto created. And if they're not interfacing with the current LLMs (I suspect most older people aren't) then they can't appreciate the exponential progress in sophistication. By now, people have some visceral understanding of the realities of progressive climate change. But AI is still an abstract concept, and an exponential technology in its infancy, so it's hard to viscerally grok the idea of AI-x-risk.

Let's say that the proportion of adults in a developed country who know of, or have used, an LLM is 20%. Of that 20%, perhaps half (10% of the population) have a dim premonition of the profundity of AI. But, anecdotally, no-one I know is really thinking of AI's trajectory, except perhaps with a sense of vague foreboding.

I am fairly new to the EA and rationality communities, but I sense that members of EA/rationality are on average cerebral, and perhaps introverted or have an unassuming demeanor. Moreover, the mindset is one of epistemic humility. EA rarely attracts the extroverted, disagreeable, outspoken "activist" types that other movements attract--for example, Israel-Palestine causes or Extinction Rebellion. Due to this, I'm predicting that we have a scarcity of EAs with the comparative advantage of organising and attending protests, and making noise in public. However, I think that protests are necessary to raise public awareness about AI safety and galvanise an educated mass response. The key considerations are: what demands do we set, based on what evidence/reasoning? And in broadcasting AI-safety, how do we balance the trade-off between:

  • Trying to be as comprehensive and rational as possible in explaining AI, AI-x-risk and the need for safety research / a pause; and
  • Marketing the protest in a salient way, and explaining the cause in a way for the "average Joe" or "average Jane" to understand.

I support the PauseAI protests, will continue learning more about the issue, and hopefully will contribute to a protest in Melbourne, Australia, soon.

"GPT-5 training is probably starting around now." I'm pretty sure it's already been trained, and they're now running evals.

the number of safety focussed researchers employed by OpenAI is dropping rapidly

Is this true? The links only establish that two safety-focused researchers have recently left, in very different circumstances.

It seemed like OpenAI made a big push for more safety-focused researchers with the launch of Superalignment last July; I have no idea what the trajectory looks like more recently.

Do you have other information that shows that the number of safety-focused researchers at OpenAI is decreasing?

Now that Jan Leike has left and the Superalignment team has been disbanded, OAI has really lost most of its safety-focused researchers.

Yes, based on the last month, "the number of safety-focused researchers is dropping rapidly" certainly seems true.

I'd guess "most" is still an overstatement; I doubt the number of people has actually dropped by >50%. But the departures, plus learning that Superalignment never got their promised compute, have caused me to revise my fuzzy sense of "how much core safety work OpenAI is doing" down by a lot, probably over 50%.

I agree this is slightly hyperbolic. If you include the disappearance of Ilya Sutskever, there are three. And I know of two more that are less widely reported. Depending on how narrow your definition of a "safety-focused researcher" is, five people leaving in less than 6 months is fairly significant.

I know of one that is less widely reported; not sure if they're counted in the two Joseph Miller knows of that are less widely reported, or if separate.

Good reasoning, well written. Reading this post convinced me to join the next NYC protest. Unfortunately I missed the one literally two days ago because I waited too long to read this. But I plan to be there in September.

Maybe we will suddenly discover that the difference between GPT-4 and superhuman level is actually quite small. Maybe GPT-5 will be extremely good at interpretability, such that it can recursively self improve by rewriting its own weights.

This paragraph seems too weak for how important it is in the argument. Notably, I doubt we'll discover the difference between GPT4 and superhuman to be small and I doubt GPT5 will be extremely good at interpretability. 

That we have several years before these become maybes is, to me, quite an important part of why I don't advocate for pausing now.

I respect the unpopularity of protesting though.

Notably, I doubt we'll discover the difference between GPT4 and superhuman to be small and I doubt GPT5 will be extremely good at interpretability.

I also doubt it, but I am not 1 in 10,000 confident.

This paragraph seems too weak for how important it is in the argument. Notably, I doubt we'll discover the difference between GPT4 and superhuman to be small and I doubt GPT5 will be extremely good at interpretability.

The important question for the argument is whether GPT-6 will pose an unacceptable risk.

Is it? Are PauseAI clear that they think GPT5 will almost certainly be fine?

The main message of this post is that the primary purpose of the current PauseAI protests is to build momentum for a later point.

This post is just my view. As with Effective Altruism, PauseAI does not have a homogenous point of view or a specific required set of beliefs to participate. I expect that the main organizers of PauseAI agree that GPT-5 is very unlikely to end the world. Whether they think it poses an acceptable risk, I'm not sure.

Idk, I'd put GPT-5 at a ~1% x-risk, or crossing-the-point-of-no-return risk (unacceptably high).

Executive summary: The author argues that mass protests against AI development are necessary as a backup plan to prevent potential catastrophic risks from future AI systems like GPT-6, given the uncertainty and limitations of current governance and technical solutions.

Key points:

  1. GPT-5 training is starting soon, and while catastrophic risks are unlikely, they are hard to predict and mitigate with certainty.
  2. Governance efforts and technical solutions for AI alignment may not be sufficient to prevent the development of potentially dangerous AI systems like GPT-6 by 2028.
  3. Mass protests against AI are a viable "Plan B" because they require no new ideas or permissions, and most people support pausing AI development without feeling like they are sacrificing anything.
  4. Building a small protest movement now through efforts like PauseAI can lay the foundation for a larger, more impactful movement when the general public becomes more aware of AI's imminent effects on society and the economy.


This comment was auto-generated by the EA Forum Team. Feel free to point out issues with this summary by replying to the comment, and contact us if you have feedback.

If we are correct about the risk of AI, history will look kindly upon us (assuming we survive).

Perhaps not. It could be more like Y2K, where some believe problems were averted only by a great deal of effort and others believe there would have been minimal problems anyway.

Cross posting from LessWrong:

I absolutely sympathize, and I agree that, with the world view / information you have, advocating for a pause makes sense. I would get behind 'regulate AI' or 'regulate AGI', certainly. I think, though, that pausing is an incorrect strategy which would do more harm than good, so despite being aligned with you in being concerned about AGI dangers, I don't endorse that strategy.

Some part of me thinks this oughtn't matter, since there's approximately a 0% chance of the movement achieving that literal goal. The point is to build an anti-AGI movement, and to get people thinking about what it would be like for the government to be able to issue an order to pause AGI R&D, or turn off datacenters, or whatever. I think that's a good aim, and your protests probably (slightly) help that aim.

I'm still hung up on the literal 'Pause AI' concept being a problem though. Here's where I'm coming from: 

1. I've been analyzing the risks of current-day AI. I believe (but will not offer evidence for here) that current-day AI is already capable of providing small-but-meaningful uplift to bad actors intending to use it for harm (e.g. weapon development). I think that having stronger AI in the hands of government agencies designed to protect humanity from these harms is one of our best chances at preventing such harms.

2. I see the 'Pause AI' movement as being targeted mostly at large companies, since I don't see any plausible way for a government or a protest movement to enforce what private individuals do with their home computers. Perhaps you think this is fine because you think that most of the future dangers posed by AI derive from actions taken by large companies or organizations with large amounts of compute. This is emphatically not my view. I think that actually more danger comes from the many independent researchers and hobbyists who are exploring the problem space. I believe there are huge algorithmic power gains which can, and eventually will, be found. I furthermore believe that beyond a certain threshold, AI will be powerful enough to rapidly self-improve far beyond human capability. In other words, I think every AI researcher in the world with a computer is like a child playing with matches in a drought-stricken forest. Any little flame, no matter how small, could set it all ablaze and kill everyone. Are the big labs playing with bonfires dangerous? Certainly. But they are also visible, and can be regulated and made to be reasonably safe by the government. And the results of their work are the only feasible protection we have against the possibility of FOOM-ing rogue AGI launched by small independent researchers. Thus, pausing the big labs would, in my view, place us in greater danger rather than less danger. I think we are already well within the window of risk from independent-researcher-project-initiated-FOOM. Thus, the faster we get the big labs to develop and deploy worldwide AI-watchdogs, the sooner we will be out of danger.

I know these views are not the majority views held by any group (that I know of). These are my personal inside views from extensive research. If you are curious about why I hold these views, or more details about what I believe, feel free to ask. I'll answer if I can.
 
