This is a special post for quick takes by JohanEA. Only they can create top-level comments. Comments here also appear on the Quick Takes page and All Posts page.

AI safety is largely about ensuring that humanity can reap the benefits of AI in the long term. To effectively address the risks of AI, it's useful to keep in mind what we haven't yet figured out.

I am currently exploring the implications of our situation and the best ways to contribute to the positive development of AI. I am eager to hear your perspective on the gaps we have not yet addressed. Here is my quick take on things we do not seem to have figured out yet:

  1. We have not figured out how to solve the alignment problem. We don't even know whether alignment is solvable in the first place; we hope it is, but it may not be.
  2. We don’t know the timelines (I define 'timelines' here as the moment when an AI system becomes capable of recursively self-improving). That moment might range from already having happened to 100 years or more away.
  3. We don’t know what takeoff will look like once we develop AGI.
  4. We don’t know how likely it is that AI will become uncontrollable, and if it does become uncontrollable, how likely it is to cause human extinction.
  5. We haven't figured out the most effective ways to govern and regulate AI development and deployment, especially at an international level.
  6. We don't know how likely it is that rogue actors will use sophisticated open-source AI to cause large-scale harm to the world.

I think it is fair to say "we have not figured x out" when there is no consensus on it. People in the community hold very different probability estimates for each of these points, spanning the whole range.

Do you disagree with any of these points? And what are other points we might want to add to the list?

I hope to read your take! 

I agree that none of these seem figured out (there is no broad consensus, and personally I am not hugely confident either way).

Some notes

We have not figured out how to solve the alignment problem

It seems useful to distinguish the problem of alignment from the problem of ensuring that a given AI is safe and useful. It also seems worth distinguishing safety issues posed by wildly superhuman AI from those posed by the first AIs which are transformatively useful.

It seems plausible to me that you can adequately control and utilize transformatively useful (but not wildly superhuman) AIs even if these AIs are hugely misaligned (e.g. deceptive alignment). See here for a bit more discussion. By transformatively useful, I mean AIs capable of radically accelerating (e.g. 30x speed up) R&D on key topics like AI safety. It's not clear that using these AIs to speed up cognitive work will suffice for solving the next problems, but it at least seems relevant.

We don’t know the timelines (I define 'timelines' here as the moment when an AI system becomes capable of recursively self-improving). That moment might range from already having happened to 100 years or more away.

I think publicly known AI is already capable of recursively self-improving via contributing to normal ML research; thus, there is just a quantitative question of how quickly. So, I would use a different operationalization of timelines. See here for more discussion.

(As for "already happened": it seems very unlikely to me that there are non-publicly known AI systems which are much more capable than current publicly known systems, but much more capable systems might be trained over the next year.)

Ryan, thank you for your thoughts! The distinctions you bring up are something I had not thought about yet, so I am going to take a look at the articles you linked in your reply. If I have more to add to this point, I'll follow up here. There is lots of work ahead to figure out these important things. I hope we have enough time.

Would you consider taking 2 minutes to add your ideas? - Creating a comprehensive overview of AI x-risk reduction strategies
------

Motivation: To identify the highest impact strategies for reducing the existential risk from AI, it’s important to know what options are available in the first place.

I’ve just started creating an overview and would love for you to take a moment to contribute and build on it with the rest of us!

Here is the work page: https://workflowy.com/s/making-sense-of-ai-x/NR0a6o7H79CQpLYw

Some thoughts on how we collaborate:

  • Please don’t delete others’ bullet points; instead, use the comment feature to suggest changes or improvements.
  • If you’re interested in discussing this further, feel free to add your name and contact details here. I may organize a follow-up discussion.
     