I tried doing a Fermi estimation of the impact I would have if I worked on AI safety, and I realized it wasn't easy to do with only a calculator. So I built a website that does this Fermi estimation given your beliefs about AGI, AI safety, and your impact on AI safety progress.

You can try it out here: https://xriskcalculator.vercel.app/

This tool focuses on technical work, and assumes that progress on AGI and progress on AI safety are independent. This is obviously a crude approximation, but for now I can't think of a simple way to take into account the fact that advanced AI could speed up AI safety progress. Other limitations are outlined on the website.
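To give an idea of the structure, here is a minimal sketch of how such an estimate could be combined under the independence assumption. The parameter names and the exact decomposition are illustrative only, not the actual code behind the site:

```typescript
// Minimal sketch of an independence-based Fermi estimate.
// The parameter names and decomposition are illustrative; they are not
// taken from the source of xriskcalculator.vercel.app.

interface Beliefs {
  pAgiThisCentury: number;             // P(AGI is built this century)
  pSafetySolvedInTime: number;         // P(AGI safety is solved before AGI), without your contribution
  fractionOfFieldFromSubfield: number; // share of overall safety progress your subfield accounts for
  fractionOfSubfieldFromOrg: number;   // share of that subfield done by your organization
  speedupOfOrg: number;                // relative speedup you give your organization
}

// P(doom) if you do not work on safety: AGI arrives and safety is not solved in time.
function pDoomWithoutYou(b: Beliefs): number {
  return b.pAgiThisCentury * (1 - b.pSafetySolvedInTime);
}

// Crude model: your work raises the chance that safety is solved in time,
// in proportion to your share of the overall effort.
function pDoomWithYou(b: Beliefs): number {
  const yourShare =
    b.fractionOfFieldFromSubfield * b.fractionOfSubfieldFromOrg * b.speedupOfOrg;
  const pSolvedWithYou = Math.min(1, b.pSafetySolvedInTime * (1 + yourShare));
  return b.pAgiThisCentury * (1 - pSolvedWithYou);
}

// P(world saved by you): the reduction in P(doom) attributable to your work.
export function pWorldSavedByYou(b: Beliefs): number {
  return pDoomWithoutYou(b) - pDoomWithYou(b);
}
```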

What do you think of this tool? Can you think of ways it could be improved?

Note: this is still a work in progress. If you want to use this tool to make important decisions, please contact me so that I can increase its reliability.


Maybe add ways working on it can backfire, either explicitly in the model, or by telling people to take expectations with the potential for backfire in mind, and allow for the possibility that you do more harm than good in the final estimate.

How would you model these effects? I have two ideas:

  1. add a section with how much you speed up AGI (but I'm not sure how I could break this down further)
  2. add a section with how likely it would be for you to take resources away from other actions that could be used to save the world (either through better AI safety, or something else)

Is one of them what you had in mind? Do you have other ideas?

Ya, those were some of the kinds of things I had in mind, along with the possibility of contributing to or reducing s-risks, and adjustable weights for s-risks vs. extinction:

https://arbital.com/p/hyperexistential_separation/

https://reducing-suffering.org/near-miss/

Because of the funding situation, taking resources away from other actions to reduce extinction risks would probably mostly come in the form of people's time, e.g. the time of the people supervising you, reading your work, or otherwise engaging with you. If an AI safety org hires you or you get a grant to work on something, then presumably they think you're worth the time, though! And one more person going through the hiring or grant process is not that costly for those managing it.

Describe what fraction of the AGI safety progress your field will be responsible for, and how much you think you will speed up your field's progress

Describe what fraction of the AGI safety work your organization is doing, and how much you think you will speed up your organization's progress in this direction

These should have lower defaults, I think

I talked to people who think defaults should be higher. I really don't know where they should be.

I put "fraction of the work your org. is doing" at 5% because I was thinking about a medium-sized AGI safety organization (there are around 10 of them, so 10% each seems sensible), and because I expect that there will be many more in the future, I put 5%.

I put "how much are you speeding up your org." at 1%, because there are around 10 people doing core research in each org., but you are only slightly better than the second-best candidate who would have taken the job, so 1% seemed reasonable. I don't expect this percentage to go down, because as the organization scale up, senior members become more important. Having "better" senior researchers, even if there are hundreds of junior researchers, would probably speed up progress quite a lot.

Where do you think the defaults should be, and why?

I've discovered something that is either a bug in the code, or a parameter that isn't explained super well. 

Under "How likely is it to work" I assume "it" refers to AGI safety. If so, this parameter is reversed - the more likely I say AGI safety is to work, the higher the x-risk becomes. If I set it to 0%, the program reliably tells me there's no chance the world ends.

I made the text a bit clearer. As for the bug, it didn't affect the end result of the Fermi estimation, but the way I computed the intermediate "probability of doom" was wrong: I forgot to take into account situations where AGI safety ends up being impossible... It is fixed now.

Thank you for the feedback!
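For reference, here is a minimal sketch of the kind of decomposition involved, with the previously missing "safety turns out to be impossible" branch included (hypothetical variable names, not the tool's actual code):

```typescript
// Illustrative sketch only: hypothetical inputs, not the tool's actual code.

interface DoomInputs {
  pAgi: number;            // P(AGI is built)
  pSafetyCanWork: number;  // P(AGI safety can work at all) -- "How likely is it to work?"
  pSolvedInTime: number;   // P(safety is solved before AGI, given that it can work)
}

// Doom happens if AGI arrives and either safety turns out to be impossible,
// or it is possible but not solved in time.
function pDoom(i: DoomInputs): number {
  const pNotSolved =
    (1 - i.pSafetyCanWork) + i.pSafetyCanWork * (1 - i.pSolvedInTime);
  return i.pAgi * pNotSolved;
}
```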

I like the tool! One thing I would like to have added is total impact. I ended up using a calculator on a different webpage, but it would be nice to include something like "Expected lives saved", even if that's just 7 billion * P(world saved by you) that updates whenever P(world saved) does.

At first, I thought this would be distracting, as there are many orders of magnitude between the lowest "lives saved if you avoid extinction" estimates and the highest ones. But given that you're not the first to ask for this, I think it would be a good idea to add this feature! I will probably add it soon.

I added this feature!
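For reference, the metric is a single multiplication on top of the tool's existing output (a sketch, using the round population figure suggested above, not the exact implementation):

```typescript
// Sketch of the suggested "Expected lives saved" metric.
const worldPopulation = 7e9; // the round figure suggested above

function expectedLivesSaved(pWorldSavedByYou: number): number {
  return worldPopulation * pWorldSavedByYou;
}

// e.g. a 1-in-a-million chance of saving the world:
console.log(expectedLivesSaved(1e-6)); // 7000 expected lives saved
```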

Great to see tools like this that make assumptions clear - I think it's useful not only as a calculator but as a concrete operationalisation of your model of AI risk, which is a good starting point for discussion. Thanks for creating it!

This tool is impressive, thanks! I like the framing you use of safety as a race against capabilities, though I don't really know what it would look like to have "solved" AGI safety 20 years before AGI. I also appreciate all the assumptions being listed at the end of the page.

Some minor notes

  • the GitHub link in the webpage footer points to the wrong page
  • I think two of the prompts, "How likely is it to work?" and "How much do you speed it up?", would be clearer if "it" were replaced with "AGI safety" (if that is what it refers to).

Thank you for the feedback. It's fixed now!
