To make things more specific:
Lot of money = $1B+; lot of power = CEO of $10B+ org; lot of influence = 1M+ followers, or an advisor to someone with a lot of money or power.
AI timelines = time until an AI-mediated existential catastrophe
Very short = ≥ 10% chance of it happening in ≤ 2 years.

Please don’t use this space to argue that AI x-risk isn’t possible/likely, or that timelines aren’t that short. There are plenty of other places to do that. I want to know what you would do conditional on being in this scenario, not whether you think the scenario is likely.





Hm, if I felt timelines were that short I would probably feel like I knew which company/government was going to be responsible for actually building AGI (or could at least narrow it down to a few). My plan would be to convince such orgs to ask me for advice, then have a team ready to research and give them the best possible advice, and hope that is good enough.

To convince them: I would be trying to leverage my power/influence to get to a position where leaders of the AGI-building organization would see me as someone to consult for help if they had a promising AGI-looking-thing and were trying to figure out how best to deploy it.


  • If rich, donate lots of money to causes that such people care about, and thus buy invitations to conferences and parties where they might hang out.
  • If otherwise influential, use my influence to get their attention, with similar results.
  • There might be other leveraged projects (like blogs, etc.) that could generate lots of influence and admiration among the leaders of AGI-building orgs.

Simultaneously, I would also be trying to join (or create, if necessary) some sort of think tank made up of the best people to advise on short-term AGI strategy. Again, power and money seem useful for putting together such a group: you should be able to recruit the best possible people with star power, and/or pay them well, to start thinking about such things full time. The hard part here is shaping the group in the right way, so that they are both smart and thoughtful about high-stakes decisions, and so that their advice will be listened to and trusted by the AGI-building organization.

Assumptions / how does this strategy fail?

  • I cannot build the influence required:
    • I have to influence too many AGI builders (because I don't know which one is most likely to succeed), so my influence is too diluted
    • They are not influenceable in this way
  • AGI builders don't ask for the advice even if they want to:
    • maybe the project is too secret
  • advice can't solve the problem:
    • maybe there is an internal deadline - things are moving too fast and they don't have time to ask
    • maybe there are external deadlines, like competition between AGI builders, such that even if they get the advice they choose not to heed it
    • maybe the AGI building leadership doesn't have sufficient control over the organization, so even if they get advice, their underlings fail to heed it
  • advice is too low quality
    • I wasn't able to recruit the people for the think tank
    • They just didn't come up with the answer

Interesting and hopefully very hypothetical question. :')

Hmm, hard to say what an AI-mediated existential catastrophe within 2 years might look like; that's so fast. Maybe a giant leap towards a vastly superintelligent system that is so unconstrained and given so much access that it very quickly overpowers any attempt to constrain it? Feels kinda like it requires surprising carelessness… Or maybe a narrower system that is deliberately used to cause an existential catastrophe?


  • Ask people at FHI/MIRI/GovAI/OpenPhil/… what I should and shouldn’t do.
  • Confidentially talk to other rich and influential and trustworthy people to coordinate a joint effort

Some concrete not-well-thought-through ideas

  • talk to potential AI breakthrough companies about my worries, and what I can do to slow down the pace and increase safety/alignment/testing/cooperation with Safety organizations 
  • if there are race dynamics, try anything to get some cooperation between racing parties going (probably they will already have tried that, I suppose... but maybe there will be possibilities)

Increase the number of safety researchers:

  • ask AI Safety researchers who are in broad agreement with you to set up big research prizes for small projects/answers to questions that might be helpful (with a short time horizon, maybe 3 months, and make the prizes widely known among CS people)
  • try to hire everyone who contributed solid work to work full time on Safety Research in a research institute you set up, give them all info you have and let them think and work on whatever makes sense... hire the best science managers you can find
  • offer to support the potential AI breakthrough companies on safety issues with your hopefully somewhat impressive group of hires who won the prizes

Use of pressure/involve government:

  • especially unsure: if they decline for reasons that seem irresponsible, talk to the government and try to convince them that AI research is on the brink of developing an AI capable of causing a huge catastrophe?

One thing I'd want to do is to create an organisation that builds networks with as many AI research communities as possible, monitors AI research as comprehensively as possible, and assesses the risk posed by different lines of research.

Some major challenges:

  • a lot of labs want to keep substantial parts of their work secret, even more so for e.g. military
  • encouraging sharing of more knowledge might inadvertently spread knowledge of how to do risky stuff
  • even if we know someone is doing something risky, it might be hard to get them to change
  • might be hard to see in advance what lines of research are risky

I think networking + monitoring + risk assessing together can help with some of these challenges. Risk assessing + monitoring: we have a better idea of what we do and don't need to know, which helps with the first and second issues. Also, if we have good relationships with labs we are probably better placed to come up with proposals that reduce risk while not hindering lab goals too much.

Networking might also help know where relatively unmonitored research is taking place, even if we can't find out much more about it.

It would still be quite hard to have a big effect, but I think even knowing partially who is taking risks is pretty valuable in your scenario.

I think the answer to this depends very specifically on why this scenario exists. Are the people creating this AI not aware of the risk? If so, why? Are they creating it because they want extinction to happen*? Are they creating it because they believe it is the least bad option?**

*maybe they're extremists who think humans deserve to die, or maybe they just think populating the cosmos with intelligent beings that are like us but not us is an ethical thing to do. Or some other philosophical stance.

**maybe they're in competition with another group who they know is more likely to deploy a misaligned AGI.

Imagine it's just the standard AGI scenario where the world ends "by accident", i.e. the people making the AI don't heed the standard risks or solve the Control Problem, as outlined in books like Human Compatible and Superintelligence, in a bid to be first to make AGI (perhaps because of economic incentives, or perhaps because of your ** scenario). I imagine it will also be hard to know who exactly the actors are, but you could have some ideas (e.g. the leading AI companies, certain governments, etc.).

Okay. I'm still going to assume that they have at least read some AI alignment theory. Then I think some options are:

  • convincing them that they are close to AGI, or that their AGI is misaligned, whichever of the two is important
  • getting them stopped by force

Convincing them requires opening a dialogue with them, not being hostile, and convincing them that you're at least somewhat aligned with them. Money might or might not help with this. Getting them to stop could mean appealing to the public, or funding a militia that enters their facility by force. Appealing to the govt and public is too slow unless the govt already has people very aware of the problem. It can work though. If you do use force and manage to steal their code, one desperate option is to attempt to get a powerful but not superintelligent AI that gives you personally, or the US govt, or anyone else trustworthy, a decisive strategic advantage to prevent future people from working on it (burn all copies, etc.).

I think the main problem is that you don't know for sure that they're close to AGI, or that it is misaligned, beyond saying that all AGIs are misaligned by default, and what they have looks close to one. If they don't buy this argument, which I'm assuming they won't given they're otherwise proceeding, then you probably won't get very far. As for using force (let's assume this is legal/governmental force), we might then find ourselves in a "whack-a-mole" situation, and how do we get global enforcement (/cooperation)?

Agreed that you don't know for sure, but "≥ 10% chance of it happening in ≤ 2 years" must have some concrete reasoning backing it. Maybe this reasoning doesn't convince them at first, but it may still be high EV to "explain harder" if you have a personal connection with them. Bring in more expert opinions, appeal to authority, or use whatever other mode of reasoning they prefer. You're right that the second option is hard and messy. I think it depends on a) whether you are able to use the code to get any form of strategic advantage without unleashing misaligned AGI, and b) what form of strategic advantage you get. For instance, if you can use the AI to get tech for global surveillance or nanotech for yourself, that would be a good scenario. You can then establish yourself as a world dictator and prevent other attempts at AGI. You may not need dictatorial power in all domains, just the ones most relevant to enforcement and the ones that can be built out in very short timeframes. I'm a bit skeptical of govts wielding this power; I figure if you're a billionaire you may want to consider wielding this power yourself, with the support of people you trust and/or hire. Worst case, the govt takes it from you by force and you're back to where you started. If the US govt takes it by force and is in control of the sole copy of the code, that again becomes a whole scenario to be analysed. There are a lot of scenarios here, so it's hard to draw generalised conclusions, I guess. Edit: Looks like there are pages on Arbital about it.

This seems much more relevant now. I actually think we are (post GPT-4+planners) at a ≥ 10% chance of an existential catastrophe happening in ≤ 2 years (and maybe 50% within 5 years).

My expectation is that having $1 billion is more in money terms than being CEO of a $1 billion company is in power terms.

Ok, changed to $10B.