
This is part of a weekly series summarizing the top posts on the EA and LW forums - you can see the full collection here. The first post includes some details on purpose and methodology. Feedback, thoughts, and corrections are welcomed.

If you'd like to receive these summaries via email, you can subscribe here.

Podcast version: Subscribe on your favorite podcast app by searching for 'EA Forum Podcast (Summaries)'. A big thanks to Coleman Snell for producing these!

Author's note: because of travel / leave, this post covers the past three weeks of posts. We'll be back to our usual weekly schedule from here.

Object Level Interventions / Reviews


FLI open letter: Pause giant AI experiments

by Zach Stein-Perlman

Linkpost for this open letter by Future of Life Institute, which calls for “all AI labs to immediately pause for at least 6 months the training of AI systems more powerful than GPT-4.” It has over 2000 signatures, including Yoshua Bengio, Stuart Russell, Elon Musk, Steve Wozniak, and other well-known academics, entrepreneurs and public figures. It’s been covered by NYT, BBC, and many other media outlets.

Time Ideas also published an article by Eliezer Yudkowsky which argues the letter’s ask doesn’t go far enough and 6 months isn’t enough time to solve for safety.


New survey: 46% of Americans are concerned about extinction from AI; 69% support a six-month pause in AI development

by Akash

YouGov America (a reputable pollster) released a survey of 20,810 American adults which found:

  • 46% reported they are “very concerned” or “somewhat concerned” about the possibility of AI causing the end of the human race on Earth.
  • When prompted with information that >1K tech leaders signed a letter calling for a pause to certain large-scale AI systems for 6 months worldwide, 69% supported this.

There were no meaningful differences by region, gender, or political party.


Announcing Epoch’s dashboard of key trends and figures in Machine Learning

by Jaime Sevilla

Epoch has launched a new dashboard, covering key numbers from their research to help understand the present and future of machine learning. These include training compute requirements, model size, availability and use of training data, hardware efficiency, algorithmic improvements and investment in training runs over time.


New blog: Planned Obsolescence

by Ajeya, Kelsey Piper

Kelsey Piper and Ajeya Cotra have launched a new blog on AI futurism and alignment. It’s aiming to clearly communicate thoughts on the biggest challenges in technical work and policy to make AI go well, and is targeted at a broader audience than EA or AI Safety communities.


Hooray for stepping out of the limelight

by So8res

Celebrates that since ~2016, DeepMind has stepped out of the limelight and hyped their developments a lot less than they could have. The author suspects this was a purposeful move to avoid bringing AGI capabilities to the forefront of public attention.


GPT-4 Plugs In

by Zvi

OpenAI has launched the ability for ChatGPT to browse the internet for up to date information, run Python in a walled sandbox without internet access, and integrate with third party apps. For instance, it can integrate with Slack and Zapier to access personal data and put responses in context. It’s also been trained to know when to reach out to plug-ins like Wolfram Alpha to improve its responses.


If interpretability research goes well, it may get dangerous

by So8res

Interpretability research is important, but the author argues it should be kept private (or to a limited group) if it allows understanding of AIs that could significantly advance capabilities. They don’t think we’re close to that yet but want to highlight the potential trade-off.


AISafety.world is a map of the AIS ecosystem

by Hamish Doodles

Aisafety.world is a reasonably comprehensive map of organizations, people, and resources in the AI safety space (including research organizations, blogs / forums, podcasts, youtube channels, training programs, career support and funders).


Policy discussions follow strong contextualizing norms

by Richard_Ngo

Claims like "X is worse than Y" can often be interpreted as a general endorsement of causing Y in order to avoid X. This is particularly true in areas with strong ‘contextualizing norms’ (ie. which expect implications of statements to be addressed) like policy.


GPTs are Predictors, not Imitators

by Eliezer Yudkowsky

A predictor needs to be more intelligent than an imitator or simulator. For instance, predicting <Hash, plaintext> pairs requires cracking the hash algorithm - but generating typical instances of <Hash, plaintext> pairs does not. A lot of what ChatGPT predicts has complex causal chains behind it - for instance, the results of a science paper. To accurately predict this, you need to understand the science at play. Therefore the task GPTs are being trained on (next-token-prediction) is in many ways harder than being an actual human.
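The asymmetry in the hash example can be made concrete with a small sketch. It is an illustration only, not something from the original post: it assumes SHA-256 as the hash, a deliberately tiny search space (lowercase strings of up to 3 letters), and the hypothetical helper names `generate_pair` and `predict_plaintext`. Generating a pair takes one forward hash, while predicting the plaintext that follows a given hash means searching for a preimage.

```python
import hashlib
import itertools
import string
from typing import Optional, Tuple

# Assumption for illustration: plaintexts are short lowercase strings.
ALPHABET = string.ascii_lowercase


def sha256_hex(plaintext: str) -> str:
    """Return the SHA-256 digest of a string as hex."""
    return hashlib.sha256(plaintext.encode("utf-8")).hexdigest()


def generate_pair(plaintext: str) -> Tuple[str, str]:
    """Generating a typical <hash, plaintext> pair costs one forward hash."""
    return sha256_hex(plaintext), plaintext


def predict_plaintext(target_hash: str, max_len: int = 3) -> Optional[str]:
    """Predicting the plaintext from the hash means inverting the hash.

    Here that is a brute-force search, feasible only because the search
    space is tiny; the cost grows exponentially with plaintext length.
    """
    for length in range(1, max_len + 1):
        for chars in itertools.product(ALPHABET, repeat=length):
            candidate = "".join(chars)
            if sha256_hex(candidate) == target_hash:
                return candidate
    return None


if __name__ == "__main__":
    digest, plaintext = generate_pair("cat")
    print("generated:", digest[:16], plaintext)
    print("predicted:", predict_plaintext(digest))
```

With realistic plaintext lengths the search in `predict_plaintext` becomes infeasible while `generate_pair` stays cheap, which is the gap between predicting and imitating that the summary describes.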


On AutoGPT

by Zvi

AutoGPT uses GPT-4 to generate, prioritize and execute sub-tasks toward a given objective or larger task, using plug-ins for internet browsing and other access. It quickly became #1 on GitHub and generated excitement. So far though, it hasn’t achieved much - it has a tendency to get distracted, confused, or caught in loops. However, it’s the first version and it’s likely it will get significantly better over time. The author suggests AutoGPT happening now could be good, as we were always going to get agents eventually, and this gives us a chance to gradually face more capable AI agents.

Relatedly, Stanford and Google researchers put 25 ChatGPT characters into a Sims-like game world, with starting personas and relationships to see how they would interact. The author suggests taking this further, and putting one into a game world with an economy with the goal to make money (and seeing if it takes over the game world in the process).

The author also talks about arguments for and against AutoGPT being a ‘true agent’, and predictions on what they expect next.
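The generate, prioritize and execute loop described above can be sketched in a few lines. This is a hypothetical illustration and not AutoGPT's actual code: `run_agent`, `stub_planner` and `stub_executor` are made-up names, and the stubs stand in for the language-model and plug-in calls a real agent would make. The `max_steps` cap is one simple guard against the looping behavior mentioned in the summary.

```python
from typing import Callable, List

Planner = Callable[[str, List[str]], List[str]]
Executor = Callable[[str], str]
Prioritizer = Callable[[List[str]], List[str]]


def run_agent(
    objective: str,
    plan: Planner,
    execute: Executor,
    prioritize: Prioritizer = list,  # default: keep tasks in the order given
    max_steps: int = 10,
) -> List[str]:
    """Repeatedly execute the top task, then ask the planner for follow-ups."""
    pending = prioritize(plan(objective, []))
    completed: List[str] = []
    for _ in range(max_steps):  # hard cap so a looping planner cannot run forever
        if not pending:
            break
        task = pending.pop(0)
        completed.append(f"{task}: {execute(task)}")
        # Generate new sub-tasks from results so far, skipping ones already queued.
        new_tasks = [t for t in plan(objective, completed) if t not in pending]
        pending = prioritize(pending + new_tasks)
    return completed


def stub_planner(objective: str, completed: List[str]) -> List[str]:
    """Hypothetical planner: proposes two tasks up front, nothing afterwards."""
    if completed:
        return []
    return [f"research {objective}", f"summarise {objective}"]


def stub_executor(task: str) -> str:
    """Hypothetical executor: a real agent would call tools or plug-ins here."""
    return "done"


if __name__ == "__main__":
    for line in run_agent("tea", stub_planner, stub_executor):
        print(line)
```

Swapping `stub_planner` for one that always proposes another task shows why the step cap matters: without it, the loop would never terminate.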


Critiques of prominent AI safety labs: Redwood Research

by Omega

The first of a series of critiques of AI safety organizations that have received >$10M per year in funding. 

Redwood Research launched in 2021 and is focused on technical AI safety (TAIS) alignment research. They have strong connections and reputation within EA and have received ~$21M in funding, primarily from Open Philanthropy (OP). So far they list six research projects on their website, have run programs (MLAB and REMIX) for junior TAIS researchers, and run the Longtermist office Constellation. They employed ~6-15 FTE researchers at any given time in the past 2 years.

The author offers several observations and suggestions:

  1. Their willingness to pivot research focus if something isn’t impactful is great, but also led to disruption and multiple staff being let go.
    1. Initially Redwood focused on adversarial training, but largely canned it and pivoted to interpretability research. They may pivot again.
    2. Senior ML research staff could de-risk some research agendas upfront or help pivots happen faster. They do have senior advisors like Paul Christiano and Jacob Steinhardt, but have limited ML expertise in-house.
  2. Lack of dissemination of research (particularly outside EA) could be limiting feedback loops, ability of others to build on their work, and access to talent and impact pathways outside EA.
  3. The work that is public is useful, but the author argues it is underwhelming for the money & time invested. Eg. they’re able to find a few papers by independent researchers of similar quality.
  4. Burnout, turnover, diversity, inclusivity and work culture need more focus as important areas to improve. Discussions with current and former staff suggest significant issues.
  5. There are some conflicts of interest / relationships between Redwood leadership / board and OP leadership / grantmakers.

Giant (In)scrutable Matrices: (Maybe) the Best of All Possible Worlds

by 1a3orn

“Giant inscrutable matrices” are often referred to as the reason why deep learning models are hard to understand and control. Some have argued for work on a more easily-interpretable paradigm. However, the author argues the current state is the best we can expect, because:

  1. It’s probable that generally intelligent systems must be connectionist (ie. involve a massive number of uniform units connected to each other). All animals became smart this way, and neural nets surpassed other attempts to build artificial intelligence rapidly.
  2. Among connectionist systems, synchronous matrix operations are the most interpretable we know of. Eg. they are much simpler to interpret than spike-timing-dependent-plasticity neurons that biological brains use.
  3. The complexity is in the training data - any ML model trained on the domain of all human text will be complex.

They also suggest it’s possible something quite simple will work for scaling interpretability of these matrices - citing research where a team was able to easily identify and patch ‘cheese-seeking’ behavior out of a cheese seeking model, using the simplest first approach they came across.


Nobody’s on the ball on AGI alignment

by leopold

The author argues we’re not on track to solve alignment of superhuman AGI systems, but could be with a large-scale effort similar to the first moon landing.

Few people are working on AI alignment: ~300 technical alignment researchers vs. 100,000 ML capabilities researchers. Of those, many aren’t tackling the core difficulties. Reinforcement learning from human feedback relies on human supervision, which isn’t reliable when models become superhuman. Mechanistic interpretability is producing interesting findings but is like “trying to engineer nuclear reactor security by doing fundamental physics research, 2 hours before starting the reactor”. There isn’t a lot else happening in technical AI safety outside of abstract / theoretical work like decision theory and mathematical proofs. However, alignment is increasingly becoming a ‘real science’ where experimentation is possible, meaning it’s possible to make substantial progress with enough money and attention.

Other Existential Risks (eg. Bio, Nuclear)

U.S. is launching a $5 billion follow-up to Operation Warp Speed

by Juan Cambeiro

Linkpost for this article. Author’s summary: “The Biden administration is launching a $5 billion follow-up to Operation Warp Speed called "Project Next Gen." It has 3 goals, of which the most relevant for future pandemic preparedness is development of pan-coronavirus vaccines. The $5 billion seems to be coming from unspent COVID funds, so no new appropriations are needed.”


Polio Lab Leak Caught with Wastewater Sampling

by Cullen

Linkpost for this article, describing a polio lab leak in the Netherlands caught by wastewater sampling.


Global Health and Development

How much funding does GiveWell expect to raise through 2025?

by GiveWell

Medians and 90% confidence intervals for expected funds raised in 2023 to 2025:

  • 2023: $581M ($416M - $774M)
  • 2024: $523M ($313M - $846M)
  • 2025: $578M ($308M - $1.07B)

These are relatively constant due to a decrease in expected funding from GiveWell’s biggest funder Open Philanthropy (OP), offset by an expected increase from other donors. OP donated ~$350M in 2022, and tentatively plans to give ~$250M in 2023, with possible further decreases. This substantially increases uncertainty in the total funding level for GiveWell. Their strategy will continue to focus on finding outstanding giving opportunities, but they may smooth spending year to year to maintain a stable 10x cash cost-effectiveness bar, and plan to increase fundraising efforts (with a goal of $500M in funds raised from non-OP donors by 2025).


Introducing the Maternal Health Initiative

by Ben Williamson, Sarah H

The Maternal Health Initiative was launched in 2022 via the Charity Entrepreneurship incubation programme. It’s conducting an initial pilot in Ghana through local partner organisations, where it trains health providers to provide family planning counseling during the postpartum period - with the first training sessions taking place in the next month. Their initial estimate is that this could avert a DALY (disability adjusted life year) per $166, though more will be known after the pilot. Long-term, the plan is to demonstrate efficacy and then shift from direct delivery into supporting the Ghana Health Service to make this a standard long-term practice. You can see more on their website, sign up to their mailing list, or reach out directly.


Lead exposure: a shallow cause exploration

by JoelMcGuire, Samuel Dupret, MichaelPlant, Ryan Dwyer

2-week investigation into the impact of lead exposure in childhood on subjective wellbeing in adulthood. Two correlational longitudinal studies in New Zealand and Australia suggest an additional microgram of lead per deciliter of blood throughout 10 years of childhood leads to a loss of 1.5 WELLBYs (or an estimated 3.8 including household spillovers). Back of the envelope calculations suggest this means lead-reducing interventions could be 1 to 107 times more cost-effective than cash transfers. The authors are unsure if top organisations working to reduce lead exposure (eg. Pure Earth, Lead Exposure Elimination Project) have funding gaps.


Animal Welfare

Announcing a new animal advocacy podcast: How I Learned to Love Shrimp

by James Özden, AmyOdene

“How I Learned to Love Shrimp is a podcast about promising ways to help animals and build the animal advocacy movement. We showcase interesting and exciting ideas within animal advocacy and will release bi-weekly, hour-long interviews with people who are working on these projects.”


Healthier Hens Y1.5 Update and scaledown assessment

by lukasj10, Isaac_Esparza

Author’s tl;dr: “Healthier Hens has scaled down due to not being able to secure enough funding to provide a sufficient runway to pilot dietary interventions effectively. We will continue through mini-projects and refining our plan for a feed pilot on the ground until our next organisational assessment at the end of summer 2023. Most efforts will now be spent on reporting, dissemination and fundraising. In this post we share updates, show what went well, less so and what others can learn from our attempts.”



Planned Updates to U.S. Regulatory Analysis Methods are Likely Relevant to EAs

by MHR

The U.S. Office of Management and Budget (OMB) has proposed an update to Circular A-4, which provides guidance to federal agencies regarding methods of regulatory analysis. Relevant updates include:

  1. Allowing for consideration of impacts on future generations when analyzing the benefits of policies that reduce the chance of catastrophic risks.
  2. In certain contexts, allowing consideration of impacts on non-U.S. citizens residing abroad. (Eg. where this supports a cooperative international approach / leads other countries in doing the same).
  3. Reducing the default social discount rate over the next 30 years from 3% to 1.7%.
  4. Discussion of long-term discounting, with reference to work by Derek Parfit. The updated guidance leaves substantial flexibility in picking an approach to this.
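To see why the change in discount rate matters, here is a minimal worked example. It assumes simple annual compounding at a constant rate and a hypothetical $100 benefit arriving 30 years from now; the actual methods in Circular A-4 may differ, and `present_value` is an illustrative helper rather than anything from the guidance.

```python
def present_value(future_value: float, annual_rate: float, years: int) -> float:
    """Discount a future benefit back to today at a constant annual rate."""
    return future_value / (1.0 + annual_rate) ** years


# Rates taken from the summary above; the benefit size is a made-up example.
OLD_RATE = 0.03
NEW_RATE = 0.017
HORIZON_YEARS = 30
BENEFIT = 100.0

pv_old = present_value(BENEFIT, OLD_RATE, HORIZON_YEARS)  # roughly 41.2
pv_new = present_value(BENEFIT, NEW_RATE, HORIZON_YEARS)  # roughly 60.3

if __name__ == "__main__":
    print(f"Present value at 3.0%: ${pv_old:.2f}")
    print(f"Present value at 1.7%: ${pv_new:.2f}")
    print(f"Ratio new/old: {pv_new / pv_old:.2f}")
```

Under these assumptions, the same benefit 30 years out counts for nearly half again as much in today's terms at the lower rate, so policies with long-run payoffs look more attractive in a cost-benefit analysis.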

Public comments can now be submitted here and here until June 6th.


GWWC's 2020–2022 Impact evaluation (executive summary)

by Michael Townsend, Sjir Hoeijmakers, Giving What We Can

Giving What We Can (GWWC) estimates their counterfactual impact on donations in 2020 - 2022, based largely on self-reported data. Key results include:

  • $1 spent on GWWC caused ~$9 (conservatively) to ~$30 (best guess) to go to highly effective charities.
  • Total counterfactual generation of ~$62M in value for highly effective charities. ~60% of donations go to improving human wellbeing, ~10% to animal welfare, ~10% to improving the future, and ~20% is unknown.
  • An average of ~$22K is generated for each full GWWC pledge, and ~$2K per trial pledge (of which ~9% convert into full pledges).
  • Pledge donations increase with age of the pledge ie. pledgers give more per year over time, more than compensating for those who stop donating entirely.
  • <1% of donors (a few dozen) account for >50% of recorded donations.
  • Pledges generate 1.5-4.5x as much value as non-pledge donations.


The billionaires’ philanthropy index

by brb243

A spreadsheet of 2.5K billionaires, containing information on current wealth, donation amounts and cause areas donated to.



Apply to >30 AI safety funders in one application with the Nonlinear Network

by Drew Spartz, Kat Woods, Emerson Spartz

Nonlinear has built a network of >30 (and growing) earn-to-givers who are interested in funding good AI safety-related projects. Apply for funding or sign up as a funder by May 17th. Funders will get access to a database of applications relevant to their specific interests (eg. interpretability, moonshots, forecasting etc.) and can then get in touch directly or via Nonlinear with those they’d like to fund.


Write more Wikipedia articles on policy-relevant EA concepts

by freedomandutility

The impact of writing Wikipedia articles on important EA concepts is hard to estimate, but it potentially has high upside with little downside risk. The author suggests 23 pages (with more ideas in comments) that could be created or expanded, including ‘Existential Risk’, ‘Alternative Proteins’, and ‘Political Representation of Future Generations’.


SERI MATS - Summer 2023 Cohort

by Aris Richardson

Applications are open for the Summer 2023 Cohort of the SERI ML Alignment Theory Scholars Program, due May 7th. It aims to help scholars develop as alignment researchers. The program will run from ~June - August (including 2 months in-person in Berkeley), with an optional extension through to December.


Community & Media

EA is three radical ideas I want to protect

by Peter Wildeford

Argues that EA contains three important ideas rarely found elsewhere, and which are important enough to protect:

  1. Radical Empathy - the idea that there are many groups of people, or other entities, that are worthy of moral concern even if they don't look or act like us.
  2. Scope Sensitivity - willingness to systematically search for the best ways to use our resources, from a cause neutral and scope sensitive standpoint.
  3. Scout Mindset - the view that we should be open, collaborative, and truth-seeking in our understanding of what to do.


Run Posts By Orgs

by Jeff Kaufman

The author is very happy to see posts about EA orgs which point out errors or ask hard questions. However, they suggest letting orgs review a draft first. This allows the org to prepare a response (potentially including important details not accessible to those outside the organization) and post it as a comment when you publish. Without this, staff may have to choose between scrambling to respond (potentially working out of hours), or delaying their response and risking that many people will never see the follow-up and may downgrade their perception of the org without knowing those details.


Things that can make EA a good place for women

by lilly

EA is a subpar place for women in some ways, but the author argues it also does well on many gender issues relative to other communities, including:

  1. Defying conventional norms like pressuring women to spend time on makeup, hair etc.
  2. Valuing important things about people (like their kindness and work quality).
  3. Intentionally connecting people who can help each other, creating an alternative to ‘old boys club’ networking.
  4. Caring about community building and representation (eg. funding Magnify Mentoring).
  5. Looking out for each other.
  6. High prevalence of good allies - EA has a lot of good actors.
  7. Organizations that walk the walk, with progressive policies.
  8. People are willing to fix things that are broken, even at high cost.


My updates after FTX

by Benjamin_Todd

A long list of updates the author is and isn’t making after reflecting on the events of the past 6 months, including FTX.

Updates include (not a comprehensive list - see post for more):

  • EA may attract some dangerous people, and EAs shouldn’t be automatically assumed to be trustworthy and competent - particularly if they are “aggressive optimizers”.
  • Governance and not doing things that are wrong from a common sense viewpoint are even more important than they previously thought.
  • The value (and bar) for object-level funding has gone up ~2x, and the long-term cost-effectiveness of community building has gone down.
  • EA as a set of ideas and values seems more important to engage with over EA as an identity / subculture / community.

Updates not made include (again not comprehensive):

  • Anything on the core ideas / values of effective altruism (eg. helping others, prioritizing, donating, caring about x-risk) - these still make sense.
  • Any distrust of longtermism or x-risk - that would be a weird leap, as the arguments for these weren’t related to what happened.
  • EA relying too much on billionaires / giving billionaires too much influence on cause prioritization / not being democratic enough in funding decisions. These are separate discussions and it remains the case that one billionaire could donate more than the entire community.


Announcing CEA’s Interim Managing Director

by Ben_West

Ben West is the new Interim Managing Director of CEA, after Max Dalton stepped down as Executive Director a few weeks ago. This position will likely last ~6-12 months until a new Executive Director is hired.

They use this post to reflect on some wins in CEA over the past year, with plentiful jokes and memes (click into the post itself for those!):

  • Approximately doubling the connections made at EA events in 2022 vs. 2021.
  • Quintupling engagement with object-level posts on the EA forum over 2 years.
  • Rapidly growing the University Group Accelerator Program and EA virtual programs.


EA & “The correct response to uncertainty is *not* half-speed”

by Lizka

“When we're unsure about what to do, we sometimes naively take the "average" of the obvious options — despite the fact that a different strategy is often better.”

For example:

  • Being unsure if a role is a good fit, so continuing in it but with less effort.
  • Being unsure about the value of community / field building, so thinking those working on it should slow it down.
  • Thinking that animal welfare is significantly more important than you used to, but it’s hard to switch careers, so you make your work slightly animal-welfare related.

Going ‘half-speed’ in this way can make sense if speed itself is part of the problem, if you’re being careful / preserving option value, or if it’s a low cost way of getting capacity for figuring out what to do. Otherwise it’s often not the best option.



Thanks for posting this, Zoe, this was really useful! I've subscribed to the Newsletter, too.
