I tried doing a Fermi estimation of the impact I would have if I worked on AI safety, and I realized it wasn't easy to do with only a calculator. So I built a website which does this Fermi estimation given your beliefs about AGI, AI safety, and your impact on AI safety progress.

You can try it out here: https://xriskcalculator.vercel.app/

This tool focuses on technical work, and assumes that progress on AGI and progress on AI safety are independent. This is obviously a very inaccurate approximation, but for now I can't think of a simple way of taking into account the fact that advanced AI could speed up AI safety progress. Other limitations are outlined on the website.

What do you think of this tool? Can you think of a way it could be improved?

Note: this is still a work in progress. If you want to use this tool to make important decisions, please contact me so that I can increase its reliability.

11 comments

Maybe add ways that working on it can backfire, either explicitly in the model or by telling people to keep the potential for backfire in mind when taking expectations, and allow for the possibility that you do more harm than good in the final estimate.

How would you model these effects? I have two ideas:

  1. add a section on how much you speed up AGI (but I'm not sure how I could break this down further)
  2. add a section on how likely you would be to take resources away from other actions that could be used to save the world (either through better AI safety, or something else)

Is one of them what you had in mind? Do you have other ideas?

Ya, those were some of the kinds of things I had in mind, and also the possibility of contributing to or reducing s-risks, and adjustable weights to s-risks vs extinction:

https://arbital.com/p/hyperexistential_separation/

https://reducing-suffering.org/near-miss/

Because of the funding situation, taking resources away from other actions to reduce extinction risks would probably mostly come in people's time, e.g. the time of the people supervising you, reading your work or otherwise engaging with you. If an AI safety org hires you or you get a grant to work on something, then presumably they think you're worth the time, though! And one more person going through the hiring or grant process is not that costly for those managing it.

I've discovered something that is either a bug in the code, or a parameter that isn't explained super well. 

Under "How likely is it to work" I assume "it" refers to AGI safety. If so, this parameter is reversed - the more likely I say AGI safety is to work, the higher the x-risk becomes. If I set it to 0%, the program reliably tells me there's no chance the world ends.

I made the text a bit clearer. As for the bug, it didn't affect the end result of the Fermi estimation, but the way I computed the intermediate "probability of doom" was wrong: I forgot to take into account situations where AGI safety ends up being impossible. It is fixed now.

Thank you for the feedback!
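For readers curious how a bug like this can arise, here is a minimal sketch. It is not the site's actual code: the function names, the three parameters, and the assumption that AGI arriving without solved safety always means doom are all hypothetical and chosen only to reproduce the symptom reported above.

```python
def p_doom_buggy(p_agi, p_safety_possible, p_solved_in_time):
    """Hypothetical buggy version: only counts worlds where safety is
    possible but not solved in time, and forgets worlds where safety
    is impossible."""
    return p_agi * p_safety_possible * (1 - p_solved_in_time)


def p_doom_fixed(p_agi, p_safety_possible, p_solved_in_time):
    """Hypothetical fixed version: the world is safe only if safety is
    possible AND solved in time; every other AGI world counts as doom
    (a simplifying assumption of this sketch)."""
    p_safe = p_safety_possible * p_solved_in_time
    return p_agi * (1 - p_safe)


# With safety set to 0% likely to work, the buggy version reports no doom,
# while the fixed version reports doom whenever AGI arrives.
print(p_doom_buggy(0.8, 0.0, 0.5))
print(p_doom_fixed(0.8, 0.0, 0.5))
```

In the buggy version the doom estimate rises as safety becomes more likely to work, which matches the reversed behaviour described in the report; in the fixed version it falls.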

I like the tool! One thing I would like to have added is total impact. I ended up using a calculator on a different webpage, but it would be nice to include something like "Expected lives saved", even if that's just 7 billion * P(world saved by you) that updates whenever P(world saved) does.

At first, I thought this would be distracting, as there are many orders of magnitude between the lowest "lives saved if you avoid extinction" estimates and the highest ones. But given that you're not the first to ask for that, I think it would be a good idea to add this feature! I will probably add that soon.

I added this feature!
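The "expected lives saved" figure discussed in this thread is a one-line computation. The sketch below is only an illustration, not the site's implementation: the function name and the `lives_at_stake` parameter are hypothetical, and the 7 billion default is the figure suggested in the comment above.

```python
def expected_lives_saved(p_world_saved_by_you, lives_at_stake=7e9):
    """Expected lives saved = lives at stake * P(world saved by you).

    lives_at_stake is adjustable because estimates of what is lost in
    an extinction differ by many orders of magnitude.
    """
    if not 0.0 <= p_world_saved_by_you <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    return lives_at_stake * p_world_saved_by_you


# Example: a one-in-a-million chance of being the person who tips the outcome.
print(expected_lives_saved(1e-6))
```

Keeping the number of lives at stake as an explicit input makes it easy to compare a present-population estimate with estimates that also count future generations.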

Great to see tools like this that make assumptions clear. I think it's useful not only as a calculator but also as a concrete operationalisation of your model of AI risk, which is a good starting point for discussion. Thanks for creating it!

This tool is impressive, thanks! I like the framing you use of safety as a race against capabilities, though I don't really know what it would look like to have "solved" AGI safety 20 years before AGI. I also appreciate all the assumptions being listed at the end of the page.

Some minor notes

  • the GitHub link in the webpage footer points to the wrong page
  • I think two of the prompts "How likely is it to work?" and "How much do you speed it up?" would be made clearer if "it" was replaced by AGI safety (if that is what it is referring to).

Thank you for the feedback. It's fixed now!