0. Abstract and main claims

[Skip this section if you want a minimally repetitive reading experience :)]

This post discusses an approach to x-risk management called ‘mitigation through modularity’. Roughly speaking, the approach involves decorrelating risks, as opposed to minimising them; its slogan would be "don’t put all your eggs in one basket", rather than "don’t drop the basket full of eggs". 

As you might suspect (given it can be summarised by a well-known proverb), this kind of thinking recurs in lots of places: diversification in finance, redundancy in software development, etc. To some extent, it has also already been applied to x-risk - most notably in the context of space settlement. But my impression is that its application to x-risk so far leaves quite a bit of value on the table (in particular, space settlement isn’t the only kind of strategy this approach points to). Below, I draw together what’s been said about the approach already, present a new frame through which this work could usefully be extended, and make a number of more speculative suggestions. My central claims are:

1. In principle, the strategy discussed below (‘mitigation through modularity’) is a robust way of achieving existential security.[1]

2. In practice, it is unlikely to be effective against:

  1. unaligned AI; and
  2. some other risks.[2]

3. It is likely to be (at least somewhat) effective against:

  1. asteroid or comet impact;
  2. supervolcanic eruption;
  3. nuclear war;
  4. climate change;
  5. other environmental damage;
  6. naturally arising pandemics;
  7. engineered pandemics; and
  8. some other risks.[3]

4. Near-term space settlement, as one possible implementation of this strategy, is not sufficient for existential security, and unlikely to be cost-effective. 

5. One promising reframing of the strategy is to take a ‘risk-first’, rather than a ‘proposal-first’, approach. By this, I mean taking a specific risk (e.g. climate change), and then thinking of ways to decorrelate this across human civilisation - rather than taking a specific proposal (e.g. Martian settlement), and then thinking of the ways it might reduce risk.

1. Mitigation through modularity

Superficially, certain species of butterfly appear to be quite bad at survival. They live in small, isolated groups (around different meadows, for example). They are relatively poor flyers, meaning reliable inter-group travel is not an option. In addition, the groups are individually vulnerable, such that each one can be obliterated by a passing squall, a disease affecting host plants, or just an unfortunate roll of the Darwinian die.[4] And yet, the butterflies persist.

We can explain their surprising resilience through the use of metapopulation models. In such a model, the total butterfly population is divided into numerous subpopulations, representing the isolated groups. For any given subpopulation, there is some risk of local extinction (e.g. squall). If the subpopulations were fully disconnected from one another, this process would eventually lead to the species’ global extinction, as each group of butterflies meets with its private doom. In practice, however, the metapopulation achieves global stability. While the butterflies cannot routinely travel between different meadows, there is nonetheless a small amount of exchange between subpopulations: every so often, a butterfly will be blown from one meadow to the next, or will happen to fly an unusually long distance in one direction. As a result, when one subpopulation becomes extinct, the area it previously occupied will be resettled by accidental pioneers of this sort. Moreover, the rate at which such resettlement events occur more than balances the rate of local extinction events. Even though each subpopulation will certainly go extinct, the metapopulation as a whole achieves effective immortality.[5]

One might be moved to wonder: can humanity pull off a similar trick? That is, can we develop into a kind of ‘metacivilization’, which achieves existential security by decorrelating risks across its several parts?[6]

More specifically, the example above serves to illustrate a distinct approach to risk mitigation. Rather than reducing the actual probability of catastrophic events, the approach works by (1) reducing the extent to which the effects of such events are correlated across different parts of the population, (2) allowing for ‘resettlement’ should one part of the population suffer a catastrophe, and (3) ensuring that the rate of resettlement is high enough to outweigh the rate of catastrophic collapse.[7] We might call this approach ‘mitigation through modularity’.    
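To make the balance between resettlement and local extinction concrete, here is a minimal sketch using the textbook Levins patch-occupancy model, dp/dt = c·p·(1 − p) − e·p, where p is the fraction of occupied meadows. The choice of the Levins model is an assumption (the post does not say which metapopulation model is meant), and the names `colonisation_rate` and `extinction_rate` are illustrative. Note that the model treats local extinctions as independent, so it takes condition (1) for granted and only illustrates conditions (2) and (3).

```python
def levins_equilibrium(colonisation_rate: float, extinction_rate: float) -> float:
    """Long-run fraction of occupied patches in the Levins model.

    The metapopulation persists only if colonisation outpaces local
    extinction; otherwise the occupied fraction decays to zero.
    """
    if colonisation_rate <= extinction_rate:
        return 0.0
    return 1.0 - extinction_rate / colonisation_rate


def simulate_levins(
    p0: float,
    colonisation_rate: float,
    extinction_rate: float,
    dt: float = 0.01,
    steps: int = 20_000,
) -> float:
    """Euler integration of dp/dt = c*p*(1 - p) - e*p, starting from p0."""
    p = p0
    for _ in range(steps):
        growth = colonisation_rate * p * (1.0 - p) - extinction_rate * p
        p += dt * growth
    return p


if __name__ == "__main__":
    # Hypothetical rates: resettlement five times faster than local extinction.
    print(levins_equilibrium(0.5, 0.1))
    print(simulate_levins(0.1, 0.5, 0.1))
    # Reversed rates: every meadow eventually empties.
    print(simulate_levins(0.9, 0.1, 0.5))
```

With the rates swapped, the same meadows and the same butterflies go globally extinct, which is the sense in which the resettlement rate has to outweigh the collapse rate.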

In what follows, my primary aim is just to draw together what has been said about this approach already. As a secondary aim, I suggest a frame through which this work could usefully be extended (Section 3). In Section 4, I include a number of more speculative thoughts, claims, and suggestions in the general vicinity of the approach.  

2. What has been said about this already?

As far as I am aware, there isn’t any published work that directly addresses this approach, at least in the general case, as it applies to the present-day risk landscape. The most directly relevant work is:

  1. The Precipice, p194 (and associated end-notes); and
  2. A section called ‘General survival using Meta-Civilizations’ in Anders Sandberg's Grand Futures manuscript.

Page 194 of The Precipice contains a standalone textbox entitled ‘Security Among the Stars?’. It provides a useful summary of how modularity-type reasoning plays out in the specific case of space settlement. The main points made are (1) that various luminaries have suggested space settlement as a means of achieving existential security[8], (2) that space settlement would in fact reduce certain x-risks[9], but (3) that it’s not obvious it would have much impact on many of the most important risks over the next century[10], and (4) that even for the risks it would reduce, it’s not obviously among the most cost-effective ways of achieving similar results.[11] I think it’s also worth noting that none of the pro-space advocates appears to have done a rigorous assessment of the approach.[12] 

My general impression is that x-risk comes up as a rhetorical point in support of space settlement, which trades on an implied false equivalence between the claims ‘we should avoid putting all our eggs in one basket’ and ‘we should settle space’. It also seems likely there’s some amount of backwards rationalisation involved, given that many of these advocates are passionate about space in a way that’s independent of its potential to reduce x-risk.[13]  

In point of fact, of course, mitigation through modularity isn’t wedded to any specific practical implementation, and the full set of proposals that could employ this approach extends beyond space settlement. 

Anders’ Grand Futures manuscript does consider the approach at a higher level of generality (i.e. not restricted to space settlement), but its focus is (understandably) on how this works at a grand scale; it’s implicitly considering an interstellar or intergalactic meta-civilization, rather than shovel-ready proposals to mitigate present-day x-risk. The section introduces a simple birth-death model, very similar to how you might model the butterflies mentioned earlier, and uses this to establish the in-principle viability of the approach. Specifically, the model shows that the annual rate of (global) x-risk can be reduced below that required for humanity’s expected lifespan to continue into the stelliferous era (and beyond), using fairly reasonable assumptions about the rate of new civilisations starting up, and ‘dead’ areas being resettled. In other words, minimising global existential risk does not require minimising local existential risk. Another point worth noting here is that the model assumes the risks facing subpopulations are fully independent. 
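As a rough illustration of why minimising global risk does not require minimising local risk, the sketch below uses a standard linear birth-death process. This is not the model from the Grand Futures manuscript, only a textbook stand-in: each settlement independently founds a new one at rate `birth_rate` and collapses at rate `death_rate` (both names are assumptions), and the risks are taken to be fully independent. Under those assumptions, the probability of eventual global extinction starting from n settlements is (death_rate / birth_rate)^n when births outpace deaths, and 1 otherwise.

```python
import random


def extinction_probability(
    birth_rate: float, death_rate: float, initial_settlements: int
) -> float:
    """Closed-form chance that a linear birth-death process ever hits zero."""
    if birth_rate < 0 or death_rate <= 0 or initial_settlements < 0:
        raise ValueError("need birth_rate >= 0, death_rate > 0, settlements >= 0")
    if birth_rate <= death_rate:
        return 1.0
    return (death_rate / birth_rate) ** initial_settlements


def simulate_extinction_fraction(
    birth_rate: float,
    death_rate: float,
    initial_settlements: int,
    trials: int = 5000,
    cap: int = 60,
    seed: int = 0,
) -> float:
    """Monte Carlo check of the closed form.

    Only the order of events matters for extinction, so each step is a
    founding with probability birth/(birth + death), else a collapse.
    Reaching `cap` settlements is treated as having escaped extinction,
    which is an approximation.
    """
    rng = random.Random(seed)
    p_birth = birth_rate / (birth_rate + death_rate)
    extinct = 0
    for _ in range(trials):
        n = initial_settlements
        while 0 < n < cap:
            n += 1 if rng.random() < p_birth else -1
        if n == 0:
            extinct += 1
    return extinct / trials


if __name__ == "__main__":
    # Hypothetical rates: settlements found new ones twice as fast as they collapse.
    print(extinction_probability(2.0, 1.0, 3))
    print(simulate_extinction_fraction(2.0, 1.0, 3))
```

Every individual settlement in this toy model still collapses eventually, yet the whole can survive with high probability. If collapses were correlated across settlements, the closed form would understate the risk.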

A quick word on terminology: I don’t think standard terms exist in this area. The term ‘meta-civilizations’ (‘a meta-civilizational approach’) looks fairly well-suited to the Grand Futures case, but strikes me as somewhat out of place when considering things like a self-sustaining settlement in Antarctica.[14] ‘Modularity’ has the disadvantage of being an overloaded term already, and is a bit nonspecific, but comes in degrees (which I like), and can be adapted to specific risks relatively easily (e.g. modularity_biorisk). Other terms that seem potentially promising: ‘redundancy-based approaches’; ‘x-risk decorrelation’; ‘existential firebreaks’; ‘distribution-focussed approaches’; ‘risk decoupling’. Further suggestions welcome.

Alongside the two sources mentioned so far, some of the more general x-risk literature is indirectly relevant here. In particular, I think work that looks like ‘different ways of categorising x-risks’ can provide useful conceptual frames for thinking about mitigation through modularity. Some examples of this I’m aware of:

  • ‘Defence in Depth Against Human Extinction: Prevention, Response, Resilience, and Why They All Matter’ (Cotton-Barratt, Daniel, & Sandberg; 2020)
  • ‘Classifying Global Catastrophic Risks’ (Avin et al., 2018); and
  • The Precipice, Ch 6 ‘The Risk Landscape’.

‘Defence in Depth’ considers x-risk mitigation at three stages: prevention, response, and resilience. The first asks ‘how can we prevent the catastrophe happening in the first place?’, the second asks ‘how can we stop it from reaching global scale?’ and the third asks ‘how can we ensure at least some people survive?’. Mitigation through modularity fits most cleanly into the second and third of these categories. 

‘Classifying GCRs’ presents three ways of categorising risks, of which I think the most relevant is the ‘critical systems’ classification. This divides risks up according to which of human civilisation’s ‘critical systems’ a risk affects (e.g. the air we breathe). In doing so, it mirrors the common division of medical problems by ‘affected organ’. In the next section, I talk about how this classification could be useful in the context of mitigation through modularity.   

Chapter 6 of The Precipice contains brief summaries of the previous two papers, divides risks up in a commonsensical fashion (i.e. by their primary causal mechanism), and offers several other useful conceptual tools. I’ve adopted this terminology for the names of specific risks (e.g. ‘engineered pandemics’, rather than ‘bioterror’, or ‘biorisk’). Some of the other ‘general risk’ material is broadly useful (e.g. defining existential risk factors, subtleties in combining risks), but it’s not immediately clear to me that it applies to the approach considered here.

My guess is that there’s also a wealth of relevant literature in other areas, but I haven’t done a comprehensive literature review for the purposes of writing this. Some examples of areas I have reason to believe are relevant: ecology (e.g. metapopulation models), public health, safety and risk engineering, systemic risk, nuclear deterrence. I’m sure there are many more, and would be especially interested if there’s other directly relevant work I’m not aware of. 

3. A risk-first approach

One way in which existing work in this space strikes me as deficient is the absence of what you might call ‘risk-first’ thinking.[15] By ‘risk-first’ I mean an approach that begins with a relatively well-defined risk, or perhaps a category of risk, and proceeds to consider ways in which humanity might increase its degree of modularity with respect to that. For example, you might begin with an asteroid impact, and consider ways in which human civilisation could decorrelate the risk of impact across two or more populations (e.g. by building a self-sustaining Lunar base). This contrasts with virtually all comments I’ve read relating to mitigation through modularity, in that these tend to be ‘proposal-first’.[16] In other words, the comments tend to take the form ‘one advantage of proposal X is that it also reduces extinction risks A, B, and C’, where X might be a Lunar or Martian settlement, or perhaps some sort of closed ecosystem on Earth. I think that a useful reframing would be: ‘given risk X, it seems like we could best decorrelate this across different populations by pursuing proposals A, B, or C’.  

I think this reframing highlights an important feature of mitigation through modularity: that a given ‘modularity-increasing’ plan need not increase modularity with respect to all risks. While it is certainly appealing to look for proposals that decorrelate all known risks in one fell swoop (and the fact that space settlement superficially looks like it would do this might explain some of its popularity), this is not really necessary. If we think of each risk as a unique sort of filter, and our objective is just for some part of civilization to pass through each one, we can still survive the full gamut of risks without any specific part surviving all the filters (provided there’s enough time to resettle between filter-events). What’s more, increasing risk-specific modularity is probably strictly easier than coming up with things that work in all cases.[17] 
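The filter picture can be sketched as a simple coverage check: civilisation passes the full gamut as long as every risk is survived by at least one part, even if no single part survives them all. The module names and risk assignments below are purely hypothetical, and the sketch ignores the proviso about having enough time to resettle between filter-events.

```python
from __future__ import annotations


def uncovered_risks(modules: dict[str, set[str]], risks: set[str]) -> set[str]:
    """Return the risks that no module is expected to survive."""
    covered: set[str] = set().union(*modules.values())
    return risks - covered


def survives_everything(modules: dict[str, set[str]], risks: set[str]) -> list[str]:
    """Return the modules that individually survive every listed risk."""
    return [name for name, survived in modules.items() if risks <= survived]


# Hypothetical example: two specialised modules, neither of which is universal.
RISKS = {"pandemic", "nuclear war", "asteroid impact"}
MODULES = {
    "quarantine city": {"pandemic"},
    "sunless-agriculture city": {"nuclear war", "asteroid impact"},
}

if __name__ == "__main__":
    print(uncovered_risks(MODULES, RISKS))      # no risk is left uncovered
    print(survives_everything(MODULES, RISKS))  # yet no module survives them all
```

Dropping the hypothetical quarantine city from `MODULES` leaves "pandemic" uncovered, which is the kind of gap a risk-first audit is meant to surface.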

To illustrate how I think this approach might work, consider the following thought experiment. Suppose that, for each of the specific risks given in Section 0, you are tasked with designing a single ‘safe city’ (as cheaply as possible). That is, you have to design one city for each kind of catastrophe in which the following holds: if the given catastrophe should occur outside of the city, the city has a high likelihood of surviving; if the given catastrophe should occur within the city, the outside world has a high likelihood of surviving. Anecdotally (i.e. ‘when I think about this’), the strategies that present themselves differ quite a lot between risks, and only a few of them involve space settlement.

For example, a ‘safe city’ designed to address (natural and engineered) pandemics might involve extremely strict import/export and immigration policies, such that the influx of plausible disease vectors (i.e. people and goods) is kept to a minimum. It might make sense for such a city to avoid economic dependence on industries that militate against this (e.g. tourism, trade goods in general), to specialise in areas where this is less of a problem (e.g. software development), and to work towards ‘material independence’ more broadly (i.e. independence from imported food and other goods). By contrast, a ‘safe city’ designed to address nuclear war would probably care a lot less about border restrictions. Instead, it might be concerned with achieving ecological independence from the Sun (assuming the main mechanism by which nuclear war acts as an existential risk is blocking the Sun, triggering a collapse of agriculture), and improving diplomatic relations (such that it is not directly targeted in a nuclear exchange).[18]

This second example points to another feature of the risk-first approach worth mentioning: it is useful in identifying strategies that might be effective against multiple risks. In iterating the ‘safe city’ thought experiment for each of the risks in Section 0, it becomes clear that several of these cities have overlapping design goals. For example, ‘achieving solar independence’ would plausibly be a goal of the safe cities for asteroid impact and supervolcanoes, as well as nuclear war.[19]

4. Speculative thoughts, claims, and suggestions

4.1 Cost-effectiveness work looks promising

Signal-boosting the relevant section of The Precipice: I am not aware of anyone working out the expected cost-effectiveness of e.g. specific Martian settlement plans, taking into account the associated reduction in x-risk. I think there are some relatively easy wins in this space (i.e. some risks are well-characterised, and would quite clearly be restricted to Earth).  

4.2 r- and K- strategies

In ecology, r-strategists are organisms that ‘have lots of offspring and hope some survive’, while K-strategists ‘have very few offspring and make sure they survive’. Humans are K-strategists; many insects are r-strategists.

When thinking about this modularity stuff, my guess is that most people are ‘native K-strategists’. In other words, people tend to think of things like self-sufficient Martian settlements, generation ships, Antarctic biodomes, etc. Each of these represents a relatively large investment in a relatively small number of ‘offspring’ (i.e. a single colony/ship/dome). We might wonder: are there plausible r-strategies?

This kind of move can be applied directly to the safe cities thought experiment. Instead of thinking of a single city for each risk, with a high likelihood of being effective, we might think of numerous different approaches a city might take, each with a lower likelihood of actually working. Suppose, for example, there was a widespread norm according to which different municipalities take on the role of being ‘a haven in case of [some specific catastrophe]’. In this circumstance, we might have a large number of cities spending some fraction of their budget to increase modularity with respect to, say, nuclear war. Presumably, they would implement a variety of policies, only some of which would actually work in the event of a nuclear war. This is more of an ‘r-type’ approach (though I expect the full sweep of r-strategies to be a lot broader).
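A back-of-the-envelope way to compare the two strategies is to ask how likely it is that at least one haven works. The sketch below assumes each haven succeeds or fails independently, which is a strong assumption (correlated failures are exactly what this post worries about), and the probabilities used are made up for illustration.

```python
from typing import Iterable


def prob_at_least_one_works(success_probs: Iterable[float]) -> float:
    """Chance that at least one haven works, assuming independent outcomes."""
    p_all_fail = 1.0
    for p in success_probs:
        if not 0.0 <= p <= 1.0:
            raise ValueError("each probability must lie in [0, 1]")
        p_all_fail *= 1.0 - p
    return 1.0 - p_all_fail


if __name__ == "__main__":
    # 'K-type': one expensive city with a (hypothetical) 90% chance of working.
    k_strategy = prob_at_least_one_works([0.9])
    # 'r-type': twenty cheap cities, each with a (hypothetical) 15% chance.
    r_strategy = prob_at_least_one_works([0.15] * 20)
    print(f"K-type: {k_strategy:.3f}, r-type: {r_strategy:.3f}")
```

With these made-up numbers the twenty long shots come out slightly ahead (roughly 0.96 against 0.90), but the comparison flips quickly if the cheap havens tend to fail for the same reason.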

4.3 ‘Running’ and ‘hiding’

Here’s an idea that came directly from taking a ‘risk-first’ approach to unaligned AI. Basically, I was trying to come up with ways of decorrelating AI risk across different populations (this seems hard). The idea has two parts:

Running: I remember reading somewhere that the Voyager probes are sufficiently far away from us, and travelling sufficiently quickly, that we couldn’t actually catch up to them using present-day tech. This strikes me as mildly surprising, given they were built in the 70s. The idea is that you could pull off a similar trick, only deliberately, and to a far higher standard: that is, you could send some object (i.e. a civilisation seed) away from Earth at a sufficiently high speed that it can’t be caught-up-with by any plausible advances in propulsion. In my head (which is thoroughly unschooled in astrodynamics) this involves leveraging some rare alignment of celestial bodies. Some problems with this are: (1) you can’t really predict such advances, (2) destructive tech (e.g. lasers) can go a lot faster than transport, (3) this feels less and less plausible when you think about longer timescales, and (4) naively, once we figure out how to expand at ~light speed, we’ll presumably be able to (eventually) catch all earlier, slower things  (H/T Max Daniel).

Hiding: I am also pretty ignorant of the various techniques used to detect stuff in space. However, I have some sense that this isn’t super easy, and that existing techniques, at least, might be spoof-able. Here’s some flimsy evidence: it took twenty years and $70 million to track 90% of the 1km+ asteroids orbiting near Earth; my understanding is that comets are substantially harder to track (having longer and more irregular (?) orbits); reports that ‘object x turned out not to be what we thought’ seem somewhat common. The idea here is that you might design a craft to be virtually undetectable from Earth (or near-Earth): perhaps it’s disguised as some other kind of object (e.g. an asteroid), is invisible to most known detection methods, or uses another celestial body as a hiding place.[20] This has similar problems to those mentioned for ‘running’. Anders’ take on this is that hiding something unpowered may be feasible (esp. if the hider knows where the seekers may be), but hiding while accelerating looks hard or impossible.

Overall, I think neither of these things looks especially promising: they would likely be expensive and unreliable. They also suffer from the ‘resettlement problem’ (see the next section).  

4.4 Trying to exploit physical limits doesn’t really work

One idea that might seem appealing is this: over long-enough timescales, parts of the universe become causally isolated from one another. So, if we can only spread civilisation across two or more such parts, we can successfully achieve a meta-civilisation (something that wipes out one part can’t possibly wipe out the other!).

In one of The Precipice’s end-notes, Toby Ord points out why this isn’t as useful as it might appear. For one thing, the relevant timescales are extremely long (it will be ~100 billion years before the universe is divided up like this). For another thing, causal isolation precludes the possibility of ‘dead’ areas being resettled by living ones (like the butterflies accidentally flying to meadows that have suffered local extinction events). This makes it significantly harder for a civilisation exploiting causal limits to achieve an infinite life expectancy: it would need to be expanding into new (causally isolated) areas at a sufficiently rapid rate.[21]

One way that causal limits could potentially be useful is as the scaffolding for a kind of ‘meta-meta-civilization’. If we suppose that each causally isolated area contains its own causally connected meta-civilization, and that some properties of these (e.g. the resettlement rate) are approximately ‘independently sampled’ from the set of possible values, then each one might provide an additional ‘bite of the apple (of existential security)’. For example, it could be true that the mean resettlement rate of these meta-civilizations falls below the required threshold, while still being true that at least one of them is stable (H/T Max Daniel for this suggestion).    

I think this reasoning also deserves a bit of a signal boost, because it underlines what strikes me as a counterintuitive aspect of this approach. Specifically, the approach generally requires that (1) resettlement is possible, and (2) the rate of resettlement is high enough that it outweighs the risk of each part collapsing. These points are perhaps non-obvious if your mental model of this involves eggs and baskets.  

4.5 Closed ecosystems are mostly a red herring

At least for me, biodome-type schemes are another thing that springs to mind when trying to think of ways to decorrelate x-risk. This seems like a red herring. CES stuff is undoubtedly interesting[22], but basically looks poorly tailored to x-risk mitigation. In some ways it looks like overkill (e.g. in many x-risk scenarios, independence from the surrounding ecosystem is irrelevant), but in other ways it doesn’t go far enough (e.g. most closed ecosystems require sunlight, which is the system that several x-risks target). 

Of course, being a closed ecosystem entails some form of self-sustainability, and this is highly relevant to x-risk mitigation. The problem is that the kind of self-sustainability closed ecosystems shoot for looks like it doesn’t match up very well with x-risks (with the possible exceptions of climate change and environmental damage). I think the ‘safe city’ thought experiment makes this point: in my head, at least,  very few of these cities look much like biodomes.

4.6 A few comments on the desirability of safe city-type things

Another consideration that seems highly relevant to any specific safe-city-esque proposal (including space settlement) is the real-world desirability of living in the relevant conditions. Regardless of how well a given proposal scores in terms of its effectiveness against x-risk, it could be undone by being sufficiently undesirable. Historically, my understanding is that settlement efforts have been at least partly incentivised by the prospect of genuine (e.g. economic) reward. It’s hard to see how ‘relocating to a self-sustaining base in Antarctica’ could hold similar appeal.

Note that this looks less true of actual space settlement, both because ‘space is cool’, and because early settlers could plausibly have privileged access to relevant resources (e.g. Martian real estate, asteroid mining rights). H/T Daniel Eth for these thoughts.

5. Notes

[1]  “A place where existential risk is low and stays low.” (Ord, 2020). This is ‘security’ in the sense of ‘freedom from danger’, rather than ‘the security services’.

[2]  E.g. stellar explosions, vacuum collapse, some ‘science-experiments-gone-wrong’.

[3]  E.g. reckless geoengineering, robust totalitarianism.

[4] I.e. for disparate reasons, a decisive fraction of one generation fails to reproduce.

[5] H/T Anders Sandberg for this example.

[6] In fact, one can reasonably model human history so far as a metacivilization (H/T Anders Sandberg) - but note that this doesn’t help us in the face of present-day x-risks. 

[7] It may also be important that the rate of e.g. ‘new meadows being settled’ is above some threshold. 

[8] E.g. Stephen Hawking, Elon Musk, Carl Sagan, Isaac Asimov, Derek Parfit. 

[9] E.g. asteroid or comet impact. However, it’s worth noting that this would plausibly require a self-sustaining space settlement, which is a far taller order than one dependent on regular supplies from Earth.

[10] E.g. unaligned AI, engineered pandemics. 

[11] E.g. you might achieve a similar reduction in risk, at much lower cost, by building a self-sustaining settlement in Antarctica.

[12] I.e. I haven’t seen claims like ‘this specific Martian settlement plan is expected to reduce x-risk by this amount’.

[13] For example, Hawking and Sagan were astrophysicists, and Musk owns a space company. H/T Daniel Eth for this point.

[14] I.e. the colony doesn’t seem like a distinct civilization. 

[15] As mentioned above, biorisk may well be a clear exception. 

[16] I’m not super happy with this terminology. Suggestions welcome. 

[17] Though it’s worth noting the advantage of proposals that work in a wide variety of cases, given the possibility of unanticipated risks (H/T Daniel Eth). 

[18] It might look like this example fails the reciprocal condition given above. This is true where the catastrophe is defined as ‘nuclear war’: if a nuclear war happened within the safe city, and were severe enough to constitute an x-risk, there is no reason to think the outside world would have a high likelihood of surviving. But nuclear war is an odd fit here, and the objection doesn’t work if we define the catastrophe as ‘removing the ecological keystone’ (i.e. blocking the Sun): if the safe city’s ecological keystone (e.g. chemotrophic agricultural base) were removed, the outside world could continue unfazed. 

[19]  This also brings to mind the ‘critical systems classification’ mentioned earlier, in that it might be useful to consider ‘how can we increase the modularity of critical system X?’ (the example here would be something like ‘the phototrophic foundations of ecosystems’).

[20]  In my head, I’m picturing a ship that starts off in the position of ‘counter-Earth’ and then spirals out from the Sun, such that it’s never directly visible from Earth, but gets further and further away. This wouldn’t really work for a bunch of reasons (e.g. the rate at which it recedes from Earth gets slower and slower, and you can see it if you just pop to e.g. Mars).

[21]  I think the rule would be something like ‘the average frequency with which each (causally isolated) sub-population splits into two new sub-populations (which are causally isolated from each other) is greater than the average extinction risk facing each part.’ …but I wouldn’t be surprised if I’m wrong about this. 

[22]  E.g. the Wikipedia page for Biosphere 2 makes for a fun read (esp. the second mission). 

4 comments

4.5 Closed ecosystems are mostly a red herring

[...] In some ways it looks like overkill (e.g. in many x-risk scenarios, independence from the surrounding ecosystem is irrelevant), but in other ways it doesn’t go far enough (e.g. most closed ecosystems require sunlight, which is the system that several x-risks target). 

I do think closed ecosystems would offer little or no benefit in many/most x-risk scenarios, and that some people seem to prioritise them a bit too much. 

But it also seems like they might be useful for a substantial subset of biorisk scenarios?[1] And biorisk in general seems to account for a substantial chunk of total existential risk (e.g., more than nuclear winter, asteroid/comet impacts, or super volcanoes, which I imagine are the sunlight-targeting risks you had in mind). 

Taken together, that seems to suggest closed ecosystems might still be fairly useful (even if not super cost-effective compared to alternative options at the moment), and something a civilization which "has its act together" and is pursuing existential security might want?

[1] I'm not saying closed ecosystems would be either necessary or sufficient to protect civilization from all biorisk scenarios, and it's possible it's a bad idea to brainstorm in detail here which biorisk scenarios closed ecosystems might or might not be relevant for.

Great post!  Thanks.

[...] I think work that looks like ‘different ways of categorising x-risks’ can provide useful conceptual frames for thinking about mitigation through modularity. Some examples of this I’m aware of:

  • ‘Defence in Depth Against Human Extinction: Prevention, Response, Resilience, and Why They All Matter’ (Cotton-Barratt, Daniel, & Sandberg; 2020)
  • ‘Classifying Global Catastrophic Risks’ (Avin et al., 2018); and
  • The Precipice, Ch 6 ‘The Risk Landscape’.

I previously made a collection of all such things that I was aware of. It only really has 5 items, the most notable of which you've already covered. The other two are:

I also note in that collection that "Cotton-Barratt also discusses [the Defence in Depth] model, and rationales for building such models, on the 80,000 Hours podcast."

Great post - thanks for writing it!

One way in which existing work in this space strikes me as deficient is the absence of what you might call ‘risk-first’ thinking.[15] By ‘risk-first’ I mean an approach that begins with a relatively well-defined risk, or perhaps a category of risk, and proceeds to consider ways in which humanity might increase its degree of modularity with respect to that. [...] This contrasts with virtually all comments I’ve read relating to mitigation through modularity, in that these tend to be ‘proposal-first’.[16] [...] I think that a useful reframing would be: ‘given risk X, it seems like we could best decorrelate this across different populations by pursuing proposals A, B, or C’. 

Something I felt unsure of here was: Do you think 'risk-first' thinking would in general be more useful than 'proposal-first' thinking? Or is it more that you think both perspectives seem useful, and so far we've pretty much only tried the latter perspective so we should add the former perspective to our toolkit?

FWIW, I agree with your arguments about some benefits of 'risk-first' thinking and some downsides of 'proposal-first' thinking. But I think the following point warrants more emphasis than a footnote:

[17] Though it’s worth noting the advantage of proposals that work in a wide variety of cases, given the possibility of unanticipated risks (H/T Daniel Eth). 

Reasons that might warrant more emphasis are:

  1. Unanticipated risks might account for a substantial portion of total existential risk
    • This seems prima facie plausible
    • This also seems in line with the one (only one!) directly relevant existential risk estimate I'm aware of us having
      • Namely, Ord estimates that "unforeseen anthropogenic risks" have about as high a chance of causing an existential catastrophe over the coming century as engineered pandemics, with only unaligned AI having a higher chance of causing that, and things like nuclear war and climate change having a notably lower chance
        • I wrote some thoughts on this here
      • (It's possible that there are other directly relevant estimates. If you're aware of any, please comment to say so in my database!)
    • If I recall correctly, the following post also made good-seeming arguments for this view: The Importance of Unknown Existential Risks
    • Of course, none of those points are very strong evidence, but I think the evidence for that claim being false would be similarly weak
  2. I think a key part of the appeal for a "modularity"-focused approach, or in general approaches focused on something like "resilience" or "recovery" rather than "prevention", is probably precisely that they might be better able to cover unforeseen existential risks than prevention-focused efforts can