
This series explains my part in the EA response to COVID, my reasons for switching from AI alignment work to the COVID response for a full year, and some new ideas the experience gave me. While it is written from my (Jan Kulveit's) personal perspective, I co-wrote the text with Gavin Leech, with input from many others.

The first post covers my main motivation: experimental longtermism.

Feedback loop

Possibly the main problem with longtermism and x-risk reduction is the weak and slow feedback loop. 

(You work on AI safety; at some unknown time in the future, an existential catastrophe happens, or doesn’t happen, as a result of your work, or not as a result of your work.)

Most longtermists and existential risk people openly admit that the area doesn't have good feedback loops. Still, I think the community at large underappreciates how epistemically tricky our situation is. Disciplines that lack feedback from reality are exactly the ones that can easily go astray.

But most longtermist work is based on models of how the world works - or doesn’t work. These models try to explain why such large risks are neglected, the ways institutions like government or academia are inadequate, how various biases influence public perception and decision making, how governments work during crises, and so on. Based on these models, we take further steps (e.g. writing posts like this, uncovering true statements in decision theory, founding organisations, working at AI labs, going into policy, or organising conferences where we explain to others why we believe the long-term future is important and x-risk is real). 

Covid as opportunity

Claim: COVID presented an unusually clear opportunity to put some of our models and theory in touch with reality, thus getting more "experimental" data than is usually possible, while at the same time helping to deal with the pandemic. The impact of the actions I mentioned above is often unclear even after many years, whereas in the case of COVID, the impact of similar actions was observable within weeks and months.

For me personally, there was one more pull. My background is in physics, and in many ways, I still think like a physicist. Physics - in contrast to most of maths and philosophy - has the advantage of being able to put its models in touch with reality, and to use this signal as an important driver in finding out what's true. In modern maths, (basically) whatever is consistent is true, and a guiding principle for what's important to work on is a sense of beauty. To a large extent, the feedback signal in philosophy is what other philosophers think. (Except when a philosophy turns into a political movement - then the signal comes from outcomes such as greater happiness, improved governance, large death tolls, etc.) In both maths and philosophy, the core computation mostly happens "in” humans. Physics has the advantage that in its experiments, "reality itself" does the computation for us.

I miss this feedback from reality in my x-risk work. Note that many of the concrete things longtermists do, like posting on the Alignment Forum or explaining things at conferences, actually do have feedback loops. But these are usually more like maths or philosophy: they provide social feedback, including intuitions about what kinds of research are valuable. One may wonder about the problems with these feedback loops, and what kind of blind-spots or biases they entail.

At the beginning of the COVID crisis, it seemed to me that some of our "longtermist" models were making fairly strong predictions about specific things that would fail - particularly about inadequate research support for executive decision-making. After some hesitation, I decided that if I trusted these models for x-risk mitigation, it made sense to use them on COVID as well. And in pretty much every scenario, I would learn something.

Over the next year, I and many collaborators tried a number of interventions to limit the damage associated with COVID. While we were motivated by trying to help, basically every intervention was also an experiment, putting some specific model in touch with reality, or attempting to fix some perceived inadequacy. Our efforts have had some direct impact, but from the longtermist perspective, the main source of value is 'value of information'.

A more detailed description of our work is forthcoming, but briefly: we focused on inadequacies in the world's modeling, forecasting, and decision support. Legible outputs include our research on non-pharmaceutical interventions; advising major vaccine manufacturers; advising multiple governments, sometimes at the executive level; consulting with bodies such as the European CDC and multiple WHO offices; and reaching millions of educated readers with our arguments, with mostly unknowable effects. I'm fairly confident the efforts made at least one country's COVID policy not suck during at least one epidemic wave, and moderately confident our efforts influenced multiple countries toward marginally better decisions.

Concretely, here’s a causal graph of some of our efforts:

[Figure: causal graph of some of our efforts]

(Every edge has value of information.)


The sequence of posts, to be released over the next couple of weeks, will cover more detail:

  1. Experimental longtermism (you are here)
  2. Hinges and crises 
    1. An exemplar crisis with a timescale of months; 
    2. Crisis and opportunity; 
    3. Default example for humanity thinking about large-scale risk;
    4. Yet another drowning child thought experiment
  3. What we tried
  4. How we failed
  5. The case for emergency response teams
  6. Static and dynamic prioritisation: effective altruism should switch from argmax() to softmax() (see the sketch after this list)
  7. Different forms of capital
  8. Miscellaneous lessons
    1. Evidence in favour of trespassing
    2. Evidence for crises as opportunities
    3. Research distillation is neglected
  9. Call to Action
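
For item 6, here is a minimal sketch of the argmax/softmax distinction as applied to prioritisation (the cause areas, scores, and temperature are illustrative placeholders, not figures from the sequence): argmax puts all effort on the single highest-scoring option, while softmax spreads effort in proportion to exponentiated scores, preserving some exploration.

```python
import math

# Hypothetical expected-value scores for three options (illustrative only).
scores = {"AI safety": 1.0, "biosecurity": 0.8, "covid response": 0.6}

# argmax-style prioritisation: all effort on the single top-scoring option.
best = max(scores, key=scores.get)
argmax_allocation = {k: (1.0 if k == best else 0.0) for k in scores}

# softmax-style prioritisation: effort proportional to exp(score / temperature).
# Higher temperature -> more exploration; lower temperature -> closer to argmax.
temperature = 0.5
weights = {k: math.exp(v / temperature) for k, v in scores.items()}
total = sum(weights.values())
softmax_allocation = {k: w / total for k, w in weights.items()}

print(argmax_allocation)   # {'AI safety': 1.0, 'biosecurity': 0.0, 'covid response': 0.0}
print(softmax_allocation)  # roughly {'AI safety': 0.47, 'biosecurity': 0.32, 'covid response': 0.21}
```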

 

Part of the value of my COVID year depends on whether I can pass on the data I collected, and the updates I made from them. The posts to come discuss some of these.
 

Conclusion

A year of intense work on COVID likely gave me more macrostrategy ideas, governance insights, and general world-modelling skills than the counterfactual (which would have been mostly solo research from my home office and occasional zoom calls with colleagues from FHI). My general conclusion is that such "experimental longtermist" work is useful, and relatively neglected. 

One reason for neglectedness may be the type of reasoning where a longtermist compares the "short-term direct impacts" of similar work with the potential "long-term direct impacts" of a clearly longtermist project, and neglects the value of information term. (Note that a longtermist prioritisation taking value of information into account will often look different from a prioritisation focused on maximising direct impact - e.g. optimising for the value of information will lead to exploring more possible interventions). 
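
As a rough sketch of that comparison (the notation is mine, just to make the missing term explicit): the total expected value of an action $a$ can be split as

$$
\mathrm{EV}(a) \;\approx\; \underbrace{\mathbb{E}\big[\text{direct impact}(a)\big]}_{\text{what the comparison usually counts}} \;+\; \underbrace{\mathrm{VoI}(a)}_{\text{often dropped}}
$$

and a prioritisation that optimises only the first term will systematically under-weight exploratory projects of this kind.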

My rough guess of the total value of information is a >10% improvement in my decision-making ability about large matters. Adding in what I hope you learn from me, it seems a clearly good investment. 

On the margin, more longtermists should do experiments in this spirit; for the future, seize the day.

Comments



Some brain noise: "Hard data are the opposite of memes. Nature abhors a vacuum: if you don't have data then memes will rush in to replace it."

In general, I'm a big fan of approaches that are optimized around Value of Information. Given EA/longtermism's rapidly growing resources (people and $), I expect that acquiring information to make use of resources in the future is a particularly high EV use of resources today.

I'm excited for this series! I'm a big believer in EAs doing more things out in the world, both for the direct impacts but probably even more for the information value.

For example, I'm thrilled that Longview is getting into nuclear security grantmaking. I think this is:

  1. good on its own terms
  2. will teach us more about how international relations, coordination, and treaties work, which seems essential to ensure AI and synthetic bio advances go well
  3. gives us something concrete to point to that almost everyone can agree is valuable

(disclosure that I contract for Longview on something totally different and learned about this when everyone else did). 

I think the sociology of EA will make us overly biased towards research and away from action, even when action would be more effective, in the near and long term. For example, I think there are major limitations to developing AI governance strategies in the absence of working with and talking to governments.

TBC, research is extremely important, and I'm glad the community is so focused on asking and answering important questions, but I'd be really happy to see more people "get after it" the way you have. 

Very interesting! I'm often worried about these facets of longtermism. I also find it a bit weird that lots of longtermists are aware of the explore/exploit trade-off and the importance of encouraging exploration for minimising regret - yet don't encourage it nearly enough. Not that I'm immune, of course.

As a side note, I also liked your writing.

Edit: I find it fitting to link to this xkcd: https://xkcd.com/2353/

Wow, thanks a lot for the work, and for sharing your insights here, I'm really impressed you were able to get involved and contribute on such a massive scale!

Minor thing I stumbled on: 
> reaching millions of educated readers with our arguments

If this is based on the upper bound of 20 million followers of the accounts who tweeted about the paper, I'm somewhat sceptical that more than 10% of those have actually read even one of your arguments. I'd expect that maybe 5% have read the specific tweet and 0.1% have gone more in depth on the paper?

Also, I wonder what you think of forecasting as another route to tie longtermists to the reality-mast. It seems much less effortful, but also won't provide nearly as much high-quality feedback compared to when you actually interact with the systems that you're interested in understanding better.

[Epistemic status: massive extrapolation from a few hard numbers.]

Yeah, that paper is just the big one, and just its Twitter audience; there are 7 papers, 100 or so major newspaper spots, and a dozen big Wiki spots. (e.g. the masks paper was on the BBC, ACX, NYT, Wired, Guardian, Mail, MR...) I've not actually estimated the total audience, but I would eyeball a 95% CI as something like [6m, 300m], using a weak operational definition of audience: "people who read 1+ of our main claims presented as having good evidence".

As for in-depth readers: ~160k people downloaded the papers, 6k of whom saved them to Mendeley, the poor sods. 20k deep readers sounds about right.

Millions is probably a safe bet / lower bound: the majority won't be via direct Twitter reads, but via mainstream media using it in their writing.

With Twitter, we have a better overview in the case of our other research, on seasonality (still in review!). The Altmetric estimate is that it was shared by accounts with an upper bound of 13M followers. However, in this case, almost all the shares were due to people retweeting my summary. Per Twitter stats, it got 2M actual impressions. Given that the NPI research was shared and referenced more, it's probably >1M reads just on Twitter.
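
As a back-of-the-envelope version of the numbers above (a sketch using only the figures quoted in this thread, not a separate measurement):

```python
# Fermi estimate of Twitter reach, using figures quoted above (seasonality paper).
followers_upper_bound = 13_000_000   # Altmetric upper bound on sharers' followers
actual_impressions = 2_000_000       # per Twitter stats for the summary thread

impression_rate = actual_impressions / followers_upper_bound  # ~15%

# Applying a similar rate to the NPI paper's larger audience (20M upper bound, from
# the comment above) gives the ">1M reads just on Twitter" ballpark.
npi_followers_upper_bound = 20_000_000
estimated_npi_impressions = npi_followers_upper_bound * impression_rate
print(f"{impression_rate:.0%}, ~{estimated_npi_impressions / 1e6:.0f}M impressions")
```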

Re: forecasting (or bets). In a broad sense, I do agree. In practice, I'm a bit skeptical that a forecasting mindset is that good for generating ideas about "what actions to take". "Successful planning and strategy" is often something like "making a chain of low-probability events happen", which seems distinct from, or even in tension with, typical forecasting reasoning. Also, empirically, my impression is that forecasting skills can be broadly decomposed into two parts - building good models / aggregates of other people's models, and converting those models into numbers. For most people, the "improving at converting non-numerical information into numbers" part initially has much better marginal returns (e.g. just do calibration training...), but I suspect it doesn't do that much for the "model feedback".

 

Thanks for the response, seems like a safe bet, yeah. :)

Re forecasting, "making low-probability events happen" is a very interesting framing, thanks! I still am maybe somewhat more positive about forecasting:

  • many questions involve the actions of highly capable agents, and therefore require at least some thinking in the direction of this framing
  • the practice of deriving concrete forecasting questions from my models seems very valuable for my own thinking; and getting feedback from a generalist crowd on how likely some event is, seeing in the comments which variables they believe are relevant, and having people post new information related to the question all seem fairly valuable too, because you can easily miss important things