[I wrote this back in August, but it never felt finished enough to post. I don't have any plans to come back to this, so I'm posting it here so it doesn't just gather dust.]
This is the third post in a three-part series summarising, critiquing, and suggesting extensions of Leopold Aschenbrenner's paper Existential Risk and Growth. I summarised the model and results from the paper in the first post, and Ben critiqued the model in the second post. The series is written by me and Ben Snodin, with input from Phil Trammell. Any errors in this post are mine.
1. Introduction
The global factors contributing to existential risk form a complex system. One way to understand a complex system is to build a granular, gears-level understanding of it: for example, forecasting computer hardware trends and their effects on AI capabilities, thinking about how risks from advanced biotechnology might shift from sophisticated to unsophisticated actors, or speculating on the safest order in which specific technologies could be developed. Currently, most work on understanding existential risk makes granular and specific arguments about particular technologies or political situations.
Another approach to understanding complex systems is to describe them with simple, abstract models that capture the most important behavior without being attached to specific mechanisms. These models can give us a different flavor of insight about existential risk. An example of this approach, besides Existential Risk and Growth, is Modelling the Human Trajectory. For some systems this abstract approach works, and for some it doesn't. We don't yet know how well it works for existential risk. If it does, we could get insights that are more robust and more general than those we can currently reach.
Whichever approach you prefer, a mixed strategy that uses both might give more robust conclusions, particularly where the two converge. Exploring both approaches in parallel also has high information value, given the high stakes and our uncertainty about which approach will be more fruitful. To help others build the new field, this post presents some promising directions for further research.
Section 2 posits some plausibly useful extensions and variations. Section 3 discusses some extensions and variations that didn't seem promising to us. These judgements are our current best guesses and we wouldn't be at all surprised if one of the 'less promising' extensions actually turned out to be very promising.
2. Promising Extensions and Variations
This section presents some extensions and variations to try. The general idea here is to work out what the most important factors contributing to existential risk might be, and then find a way to model them which is simple enough to work with, but preserves the most important relationships between variables.
An asterisk* indicates I'm particularly excited about the idea.
What Does The Risk Depend On?
- Lots of qualitative work has been done on understanding existential risks and risk factors. Are there any common themes among this work that could be modelled well quantitatively?
- Some people and organisations that come to mind are Toby Ord, Nick Bostrom, Nick Beckstead, Owen Cotton-Barratt, Will MacAskill, CLR, and OpenPhil
- What happens if we model different sectors (like AI, synthetic biology, APM) that each have their own potency parameters in the hazard rate?
- What happens if we model the potency parameters as functions of time, to capture particular facts about the world? For example, if we thought that wisdom was growing over time for significant reasons other than economic growth, we might assume the potency of safety work to be increasing over time.
- The model assumes that all consumption goods carry equal risk. It could be worth considering risk that varies across goods. For example, what happens if we include goods that can be discovered by research in the consumption sector, but are then seen to be dangerous and so not produced?
- It could then be worth extending the representative agent to include some malicious actors who might deliberately use these new dangerous goods to cause catastrophe.
- A further extension in this direction could give malicious actors the option of doing research into dangerous goods, distinct from consumption research.
- Alternatively, building on Bostrom's urn analogy, we could model a probability that each technology will cause catastrophe as soon as it is discovered, rather than having to be produced to cause catastrophe. (See the Russian Roulette model in Life and Growth.)
- How do the results change if risk depends on per capita safety production? This would model the increased safety spending needed to regulate lots of actors.
- How do the results change if the risk depends on the level of consumption technology?
- This comes from the intuition that what matters for risk is whether civilisation is able to create some unfriendly AI, synthetic pathogen etc. and not how many of those things there actually are. Some technologies are dangerous as soon as they are discovered, and don't need to be produced at scale.
- Note that this extension might not be quite as promising as it appears on first glance, because the choice to model risk as depending on production rather than technology is more reasonable than might be obvious. Ideas printed in academic papers or R&D reports don't do direct damage. It's only when these technologies are actually implemented that the risk occurs. For example, it seems quite safe for a civilisation to have the knowledge to create nuclear weapons, advanced AI, or atomically precise manufacturing but never actually build them. This is a speculative judgement call, though, and I'm not sure about all the details (eg. are nuclear weapons really well modelled as a consumption good?).
- How do the results change if the risk depends on the rate of change of consumption technology? This would weakly model the idea of transition risks, or Bostrom's urn analogy. (This variant and the technology-level variant above are sketched after this list.)
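As a rough illustration (the notation here is mine, not the paper's), one could let the hazard rate depend on the level or the growth rate of consumption technology rather than on consumption production. Writing $A_t$ for consumption technology, $H_t$ for safety production, $\bar{\delta}$ for a scale constant, and $\epsilon_A$, $\epsilon_g$, $\epsilon_H$ for elasticities, two candidate forms are

$$\delta_t = \bar{\delta} \, A_t^{\epsilon_A} H_t^{-\epsilon_H} \qquad \text{and} \qquad \delta_t = \bar{\delta} \left( \frac{\dot{A}_t}{A_t} \right)^{\epsilon_g} H_t^{-\epsilon_H}.$$

The first captures 'dangerous as soon as it is discovered'; the second captures transition risks, where fast change itself is what is dangerous.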
What Does The Growth Depend On?
The model assumes exogenous exponential population growth. How do the results change if this is replaced by:
- An endogenous fertility rate (see The End of Economic Growth?), or
- An exogenous growth rate in total factor productivity, so that a unit of labor becomes more productive over time? (Both variants are sketched below.)
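As a minimal illustration (the symbols here are mine, not the paper's), write $N_t$ for population and $c_t$ for per-capita consumption. The baseline and the two variants might look like

$$N_t = N_0 e^{nt} \quad \text{(exogenous exponential growth, as in the paper)},$$
$$\dot{N}_t = n(c_t)\, N_t \quad \text{(endogenous fertility, with the fertility rate depending on per-capita consumption)},$$
$$N_t = N_0, \;\; \text{effective labor } Z_t N_0 \text{ with } \dot{Z}_t / Z_t = g_Z \quad \text{(exogenous TFP growth)}.$$

In the last variant, a unit of labor becomes more productive over time even though headcount is fixed.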
There are existing models of learning by doing where technology is discovered as a side effect of production. What happens if we replace the existing production function for safety technologies with a model where:
- Safety technologies are discovered as a side effect of consumption research, with a new parameter representing the spillover from consumption research to safety technologies;
- Safety technologies are discovered as a side effect of consumption technology, with a new parameter representing the spillover from consumption technologies to safety technologies (both of these variants are sketched after this list);
- Or where a safety technology can only be developed once the associated consumption technology has been discovered?
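A rough sketch of the first two variants, using a generic semi-endogenous research production function (the functional form and symbols are my assumptions, chosen to mirror standard models of this kind rather than copied from the paper): write $B_t$ for safety technology, $S_{a,t}$ and $S_{b,t}$ for consumption and safety scientists, and $\sigma \ge 0$ for the spillover parameter. Safety technology discovered partly as a side effect of consumption research could then look like

$$\dot{B}_t = \delta \, (S_{b,t} + \sigma S_{a,t})^{\lambda} B_t^{\phi},$$

while a spillover from the consumption technology level $A_t$ could look like

$$\dot{B}_t = \delta \, S_{b,t}^{\lambda} B_t^{\phi} A_t^{\sigma}.$$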
What Should A Philanthropist Do?
- *The current model includes only one resource, labor, which must be instantly spent across the four kinds of work - (safety, consumption) × (research, production). What happens if we introduce a patient philanthropist with some amount of money which she can spend on (a minimal sketch of this setup appears after this list):
- The four kinds of labor,
- Investment
- Lowering the discount rate by some mechanism?
- *The paper compares the effects of a shock to growth and a shock to the discount rate on the relative value of life. What are the relative effects of:
- A shock to safety technology,
- Moving towards the global impatient optimum from a decentralized allocation, whatever that might be (recall that the paper focuses on the dynamics of the optimal allocation. Risk mitigation is a global public good so it is unlikely resources are optimally allocated in reality, even for impatient preferences),
- A shock to growth,
- A shock to the discount rate?
- If catastrophe is eventually inevitable, civilisation might still have a chance at a relatively long future - say 10,000 years instead of 1 billion years. What are the relative effects on expected long-term welfare of accelerating growth compared to increasing safety spending in this case?
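As a minimal sketch of the patient philanthropist idea from the first bullet above (all notation mine and purely illustrative): the philanthropist holds wealth $W_t$, earns a return $r$, and chooses spending rates $x_{i,t}$ on hiring each of the four kinds of labor at wage $w_t$, plus spending $m_t$ on whatever mechanism lowers the discount rate:

$$\dot{W}_t = r W_t - \sum_{i=1}^{4} x_{i,t} - m_t, \qquad L^{\mathrm{phil}}_{i,t} = \frac{x_{i,t}}{w_t}.$$

The philanthropist then chooses these paths to maximise expected long-run welfare, taking the impatient economy's own allocation as given; comparing the marginal value of each spending margin is one way to get at the comparisons in the bullets above.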
Empirical Questions
- *Economic stagnation could lead to dawdling around in the ‘time of perils’. What are the most likely causes of extended economic stagnation?
- *What are the best ways to change a person or an organisation’s discount rate? How much can it change?
- The mechanism by which risk decreases is that, as people become richer, they value life relatively more than marginal consumption. Can we bypass increasing people's consumption and instead influence their welfare directly, so that they value life more? What might be the best ways to do this?
- This depends on whether increased safety spending with wealth is because people are richer specifically, or just because they are happier.
What about AI?
One might argue that AI risk is lower when growth is slower, because the slower growth gives safety researchers more time to prepare for future technological developments. Note, however, that this assumes growth preferentially speeds up capabilities work over safety work. This would run counter to Kuznets curves in other domains, and also doesn’t seem right empirically. Funding for AI safety seems to come from richer people (eg. Open Phil) than funding for AI capabilities (which comes indirectly from paying for, or advertisements on, consumption goods like Google search).
All that said, one might maintain (as done here) that there is a sense in which AI development parallelizes more than AI safety work, and that this overturns the conclusion above. An interpretation in the framework of the model would be that the λ for consumption research is larger than the λ for safety research. So it would be interesting to see whether plausible variation in λ across sectors changes the qualitative conclusions above, rather than merely the rate at which society shifts labor between the sectors.
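In a standard semi-endogenous form (assumed here for illustration; the paper's exact specification may differ), with $A_t$ and $B_t$ the consumption and safety technology levels and $S_{a,t}$, $S_{b,t}$ the scientists working on each, this variation would be

$$\dot{A}_t = \delta \, S_{a,t}^{\lambda_A} A_t^{\phi}, \qquad \dot{B}_t = \delta \, S_{b,t}^{\lambda_B} B_t^{\phi}, \qquad \lambda_A > \lambda_B,$$

so that adding researchers to capabilities work loses less to duplication than adding researchers to safety work.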
There are probably lots of useful AI-specific extensions or variations to the paper that I don't have the expertise to think of. For example, considering some of the following could be useful:
- How much AI safety work can happen in parallel to capabilities work, rather than having to happen afterwards?
- What's the most useful distinction between AI technologies and AI goods?
- Which AI systems should be modelled as capital, labor, sector-specific goods or technologies, or general purpose goods or technologies?
- For example, a particular neural net method like deep learning could be modelled as a technology, compute could be modelled as capital, and actual systems that use compute to run a neural net could be modelled as labor (just an example for provocation).
- Empirically, how fast is the AI safety sector growing compared to the AI capabilities sector?
I'm interested to see approaches that are more granular than this model, but still take a high-level approach. Bridging the gap between this abstract model, and other granular arguments about AI could be fruitful. I've presented two ways you could model AI in the framework of Existential Risk and Growth below.
Machines as Population
One interpretation of the exogenous exponential population growth in the model is that, though human population growth will stagnate, new robot labor will allow growth to continue. This would imply trying an extension where:
- The human population is constant, or tends to a constant;
- Civilisation can invest in developing AI which functions as a 'population-increasing' good;
- The risk of catastrophe depends on the level of this 'population-increasing' technology (corresponding to more advanced AI capabilities) as well as the amount of this 'population-increasing' good (corresponding to more AI agents being deployed).
This could capture the tradeoffs between researching more advanced AI systems; deploying existing AI systems to work for us; and the risks associated with both of these. As well as catastrophic events, this model might also predict positive singularity-type events.
One way of modelling this quantitatively would be by introducing a population sector analogous to the consumption and safety sectors, with its own level of population technology, its own scientists researching population technology, and a production function for population technology of the same form as those for consumption and safety technology. The (endogenous) population growth then comes from investment in labor-reproduction: population-producing labor combines with population technology to produce new workers.
With this variation, there would be six kinds of labor - (safety, consumption, population) × (research, production) - and the resource constraint would require labor across all six uses to sum to the total population.
At low levels of population technology, we can interpret population-producing labor as resources spent on standard biological human reproduction (including childcare and schooling). At high levels of population technology, we can interpret it as labor spent on creating artificial workers (digital or physical). One obvious problem with this approach is that, as the artificial population grows, consumption per capita decreases, even though consumption per human doesn't. We might not want to model artificial labor as caring about its own consumption, but in the current model artificial labor would be included in the representative agent.
What we're really interested in is consumption per unit of human labor, not consumption per unit of total labor. To model this, we can define total population as the sum of human labor and machine labor, and track consumption per human rather than consumption per worker. One possible formalisation is sketched below.
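A minimal formal sketch, with all notation my own rather than taken from the paper: let $A_{p,t}$ be the level of population technology, $S_{p,t}$ the scientists researching it, $L_{p,t}$ the labor spent producing new workers, $N^h$ the (constant) human population, and $M_t$ the stock of machine labor. Mirroring the structure of the other sectors, one could write

$$\dot{A}_{p,t} = \delta \, S_{p,t}^{\lambda} A_{p,t}^{\phi}, \qquad \dot{M}_t = A_{p,t}^{\alpha} L_{p,t},$$

$$N_t = N^h + M_t, \qquad L_{c,t} + L_{h,t} + L_{p,t} + S_{a,t} + S_{b,t} + S_{p,t} = N_t,$$

with per-human consumption $C_t / N^h$ rather than consumption per worker $C_t / N_t$.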
Note that there are other ways to model this general idea.
Modelling Computer Hardware
Following Hanson (2001), we might want production functions that model computer hardware specifically, with the total stock of computer hardware at a given time divided between consumption production and safety production. The allocation of hardware could be broken down further between all four kinds of labor in (safety, consumption) × (research, production). There's a choice here for the production function for hardware - one option is to follow Hanson, but there might be a simpler approach that works better.
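One illustrative way to write this down (my notation, not Hanson's or the paper's exact form) is to let hardware enter each sector alongside labor, for example in Cobb-Douglas form:

$$C_t = A_t^{\alpha} L_{c,t}^{1-\theta} K_{c,t}^{\theta}, \qquad H_t = B_t^{\alpha} L_{h,t}^{1-\theta} K_{h,t}^{\theta}, \qquad K_{c,t} + K_{h,t} \le K_t,$$

where $K_t$ is the total hardware stock, itself produced and improving over time (for instance growing at an exogenous rate as a crude stand-in for hardware progress, or endogenised along Hanson's lines).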
Extending the Representative Agent
The model assumes the centralised optimal impatient allocation.
- *What might the decentralised allocation be (where the global public good of safety tech is not provided efficiently)?
- What is the effect on risk of moving towards the centralised allocation?
- What interventions might best move us towards the centralised allocation?
This would also shed light on the relative value of improving international coordination, compared to spending on the four kinds of labor or on lowering the discount rate.
In the model, a representative agent must instantly allocate the endowment good, labor, between four types of work.
- What happens if agents are also able to save, or invest in capital?
- What if that capital is sector-specific?
Agents are assumed to know the values of the model's parameters, including the potency parameters in the hazard rate.
- *What happens if we allow for agent uncertainty about these parameter values?
Agents are also assumed to know the allocation of labor between the four sectors, and the resulting production of consumption and safety technologies and goods. In the centralized optimal allocation, this would probably be a fine assumption. But even then, complete knowledge of the hazard-rate parameters might not be.
Estimating Model Parameters
- Safety production is a crucial variable for reducing risk. Is there a way to track it (as GDP tracks production in the consumption sector)?
- Another critical variable is the discount rate. Whose discount rate are we modelling - that of the electorate, of governments, of firms like Google? Do these groups have significantly different discount rates?
- How can we better estimate the relevant parameters of the hazard rate? This question is useful because, according to the model, depending on their values:
- catastrophe is inevitable, and near-term work looks more attractive;
- catastrophe is likely but avoidable, and existential risk mitigation work looks more attractive; or
- catastrophe is very unlikely given some growth, and patient longtermist work looks more attractive.
However, this particular question might be intractable. One obvious way to approach the question would be to get historical data on the size of the consumption sector, the size of the safety sector, and the chance of anthropogenic existential catastrophe. The second seems very hard, and the third close to impossible.
3. Less Promising Extensions and Variations
Natural Risks
Only anthropogenic risks are considered. What happens if we include some natural hazard rate (either constant, or decreasing in population) and either extend the existing safety sector so that it mitigates both natural and anthropogenic risk, or add a natural-safety sector distinct from the anthropogenic-safety sector? This captures things like asteroid tracking and deflection, solar flare preparation, natural pandemic preparedness, etc. Snyder-Beattie et al. (2019) find that natural risks are negligible compared to potential anthropogenic extinction risks, which makes this extension seem less promising.
Overlapping Generations
Existing literature models overlapping generations, which might seem at first glance to be a useful extension to the model - it seems intuitively plausible that there could be some effect from agents considering welfare over their own finite lifetimes. However, we can already interpret the model's infinitely-lived selfish agent with exponentially discounted welfare as a sequence of finitely-lived, partially altruistic agents who discount the welfare of their descendants, so this extension seems less insightful. Indeed, in reality, we can see how the preferences of overlapping generations are actually aggregated: they form governments which exponentially discount future welfare in their analyses - just as if there were a single infinitely-lived agent exponentially discounting its own welfare.
Sector-dependent production functions
Recall that consumption goods and safety goods have analogous production functions, and that consumption technologies and safety technologies likewise have analogous production functions. It might seem like an unnecessarily strong assumption to give each sector the same parameter values in these production functions. But the key results of the model don't depend on the values being the same between sectors, so the assumption doesn't seem too problematic, and relaxing it seems less promising.
What about AI?
One way to model AI that doesn't seem especially promising would be as a general purpose technology that can affect production in both the consumption and the safety sector. This technology would have a production function of the same form as the other two technologies, and the production functions for consumption and safety goods would both depend on its level. It seems to me that this would just increase the effective scale of the economy, so the setup would probably need to be tweaked to give new insights.
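For illustration (the notation is mine), let $G_t$ be the general purpose technology, researched like the others, and let it multiply productivity in both sectors:

$$\dot{G}_t = \delta \, S_{g,t}^{\lambda} G_t^{\phi}, \qquad C_t = (G_t A_t)^{\alpha} L_{c,t}, \qquad H_t = (G_t B_t)^{\alpha} L_{h,t}.$$

Because $G_t$ enters the consumption and safety sectors symmetrically, it mainly rescales the whole economy rather than changing the consumption-safety trade-off, which is why this version seems unlikely to generate new insights without further tweaks.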
What does the risk depend on?
It’s assumed that risk depends only on instantaneous aggregate production in the consumption and safety sectors. What would happen if risk depended on (some combination of):
- Rate of change of consumption production
- This comes from the intuition that if we rapidly scaled up production of consumption goods, we might expect the hazard rate to increase; or if we suddenly stopped producing consumption goods, we might expect the hazard rate to drop. It seems that this is already captured by the model, though:
- If the instantaneous consumption production decreases, and safety production remains the same, the model predicts the hazard rate would drop.
- To further dissect the intuition that rate of progress itself is risky, imagine we increased resources spent on AI capabilities work by a factor of ten, but 80% of those extra resources went to safety work, like verifying that the new systems were working correctly. Or imagine that we increased spending on electricity production by a factor of ten, with 20% spent on burning fossil fuels but 80% spent on safety technologies like carbon capture. It seems like neither of these cases would actually increase risk, even though the rate of change of production in the consumption sector is high.
- Per capita consumption
- There could be a useful intuition here around more people being harder to kill. Note that Life and Growth models risk as depending on per capita safety spending and looks at risks to individuals, rather than to civilisation. However, it seems like aggregate consumption and safety production might be the more relevant variables for extreme civilisational disasters, which are the most concerning to longtermists.
What Should a Philanthropist Do?
- How does the above result change given a time-varying scale parameter that makes spending at some times produce relatively more utility than at others? (See Discounting for Patient Philanthropists.)
The reason this seems potentially less promising is that the 'time of perils' that emerges in 'Existential Risk and Growth' already captures the useful part of hingeyness - a philanthropist can have an outsized effect by spending during this time.
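For concreteness (notation mine), the idea would be to weight the philanthropist's objective by a time-varying scale factor $s(t)$:

$$\max \int_0^{\infty} e^{-\rho t} \, s(t) \, u(c_t) \, \mathrm{d}t,$$

where periods with high $s(t)$ are 'hingey' periods in which a unit of spending buys relatively more utility. The point above is that the model's endogenous time of perils already plays this role, so adding an exogenous $s(t)$ may be largely redundant.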