Critical Review of 'The Precipice': A Reassessment of the Risks of AI and Pandemics

by Fods12, 11th May 2020

Introduction

In this essay I will present a critical response to Toby Ord’s recent book The Precipice (page numbers refer to the soft cover version of this book). Rather than attempting to address all of the many issues discussed by Ord, I will focus on what I consider to be one of the most critical claims of the book. Namely, Ord claims that the present century is a time of unprecedented existential risk, that “we stand at a crucial moment in the history of our species” (p. 3), a situation which is “unsustainable” (p. 4). Such views are encapsulated in Ord’s estimate of the probability of an existential catastrophe over the next century, which he places at one in six. Of this roughly seventeen percent chance, he attributes roughly ten percentage points to the risks posed by unaligned artificial intelligence, and another three percentage points to the risks posed by engineered pandemics, with most of the rest due to unforeseen and ‘other’ anthropogenic risks (p. 167). In this essay I will focus on the two major sources of risk identified by Ord, artificial intelligence and engineered pandemics. I will consider the analysis presented by Ord, and argue that by neglecting several critical considerations, Ord dramatically overestimates the magnitude of the risks from these two sources. This short essay is insufficient to provide a full justification for all of my views about these risks. Instead, my aim is to highlight some of what I believe to be the major flaws and omissions of Ord’s account, and also to outline some of the key considerations that I believe support a significantly lower assessment of the risks.

Why probability estimates matter

Before analysing the details of Ord’s claims about the risks of engineered pandemics and unaligned artificial intelligence, I will first explain why I think it is important to establish as accurate as possible estimates of the magnitude of these existential risks. After all, it could be argued that even if the risks are significantly less than those presented by Ord, nevertheless the risks are still far higher than we would like them to be, and causes such as unaligned AI and engineered pandemics are clearly neglected and require much more attention than they currently receive. As such, does it really matter what precise probabilities we assign to these risks? I believe it does matter, for a number of reasons.

First, Ord’s core thesis in his book is that humanity faces a ‘precipice’, a relatively short period of time with uniquely high and unsustainable levels of existential risk. To substantiate this claim, Ord needs to show not just that existential risks are high enough to warrant our attention, but that existential risk is much higher now than in the past, and that the risks are high enough to represent a ‘precipice’ at which humanity stands at the edge. Ord articulates this in the following passage:

“If I’m even roughly right about their (the risks’) scale, then we cannot survive many centuries with risk like this. It is an unsustainable level of risk. Thus, one way or another, this period is unlikely to last more than a small number of centuries. Either humanity takes control of its destiny and reduces the risk to a sustainable level, or we destroy ourselves.” (p. 31)

Critical here is Ord’s linkage of the scale of the risk with our inability to survive many centuries of this scale of risk. He goes on to argue that this is what leads to the notion of a precipice:

This comparatively brief period is a unique challenge in the history of our species... Historians of the future will name this time, and schoolchildren will study it. But I think we need a name now. I call it the Precipice. The Precipice gives our time immense meaning. (p. 31)

Given these passages, it is clear that there is a direct connection between the magnitude of the existential risks over the next century or so, and the existence of a ‘precipice’ that uniquely defines our time as historically special. This is a distinct argument from the weaker claim that existential risks are far higher than we should be comfortable with, and that more should be done to reduce them. My argument in this essay is that the main sources of the abnormally high risk identified by Ord, namely engineered pandemics and unaligned artificial intelligence, do not pose nearly as high a risk as Ord contends, and therefore his argument that the present period constitutes a ‘precipice’ is unpersuasive.
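
The link between the size of the per-century risk and the claim that it is ‘unsustainable’ can be made concrete with a small calculation. The sketch below is illustrative only: it takes Ord’s headline figure of one in six and assumes, purely for simplicity, that the same risk applies independently in every century. That assumption is mine and is not a model proposed in the book.

```python
# Ord's headline estimate of existential risk over the next century.
RISK_PER_CENTURY = 1 / 6


def survival_probability(centuries: int, risk: float = RISK_PER_CENTURY) -> float:
    """Probability of surviving `centuries` consecutive centuries.

    Assumes the same risk applies independently in each century,
    which is a simplifying assumption made for illustration.
    """
    return (1 - risk) ** centuries


if __name__ == "__main__":
    for n in (1, 2, 5, 10, 20):
        print(f"{n:>2} centuries: {survival_probability(n):.1%}")
```

Under these assumptions the chance of survival drops below one half by the fourth century. With a much lower per-century risk the decline is far slower, which is why the magnitude of the estimate bears directly on whether the present period can be called a ‘precipice’.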

Second, I think precise estimates of the probabilities matter because there is a very long history of predicting the end of the world (or the end of civilisation, or other existential catastrophes), so the baseline for accuracy of such claims is poor. As such it seems reasonable to exercise some scepticism and caution when evaluating such claims, and ensure that they are based on sufficiently plausible evidence and reasoning to be taken seriously. This is also important for convincing others of such risks, as exaggeration of risks to humanity is very common, and is likely to reduce the credibility of those attempting to raise awareness of such risks. Ord makes a similar argument when he advises:

Don’t exaggerate the risks. There is a natural tendency to dismiss claims of existential risk as hyperbole. Exaggerating the risks plays into that, making it much harder for people to see that there is sober, careful analysis amidst the noise. (p. 213)

Third, I think that accurate estimates of probabilities of different forms of existential risk are important because it helps us to align our efforts and resources in proportion to the amount of risk posed by different causes. For example, if one type of risk is estimated to pose one hundred times as much risk as another, this implies a different distribution of efforts compared to if both causes posed roughly comparable amounts of risk. Ord makes this argument as follows:

This variation (in risk) makes it extremely important to prioritise our efforts on the right risks. And it also makes our estimate of the total risk very sensitive to the estimates of the top few risks (which are among the least well understood). So getting better understanding and estimates for those becomes a key priority. (p. 168)

As such, I believe it is important to carefully consider the probability of various proposed existential risk scenarios. In the subsequent two sections I will consider risks of engineered pandemics and unaligned artificial intelligence.

Engineered Pandemics

Extinction level agent exists

One initial consideration that must be addressed is how likely it is that any biological pathogen can even kill enough people to drive humanity to extinction. This places an upper limit on what any biotechnology could achieve, regardless of how advanced. Note that here I am referring to an agent such as a virus or bacterium that is clearly biological in nature, even if it is engineered to be more deadly than any naturally-occurring pathogen. I am not including entities that are non-biological in nature, such as artificial nanotechnology or other chemical agents. Whilst it is impossible to determine the ultimate limits of biology, one relevant point of comparison is the most deadly naturally-occurring infectious disease. To my knowledge, the highest fatality rate for any infectious biological agent that is readily transmissible between living humans is that of the Zaire ebolavirus, at around 90%. It is unclear whether such a high fatality rate would be sustained outside of the social and climatic environment of the parts of Central and West Africa where outbreaks have occurred, but nevertheless we can consider this to be a plausible baseline for the most deadly known human infectious pathogen. Critically, it appears unlikely that the death of even 90% of the world population would result in the extinction of humanity. Death rates of up to 50% during the Black Death in Europe do not appear to have even come close to causing civilisational collapse in that region, while population losses of up to 90% in Mesoamerica over the course of the invasion and plagues of the 16th century did not lead to the end of civilisation in those regions (though the social and political disruption during these events was massive).

If we think the minimal viable human population is roughly 7,000 (which is near the upper end of the figures cited by Ord (p. 41), though rounded for simplicity), then a pathogen would need to directly or indirectly lead to the deaths of more than 99.9999% of the current world population in order to lead to human extinction. One could argue that the pathogen would only need to directly cause a much smaller number of deaths, with the remaining deaths caused by secondary disruptions such as war or famine. However, to me this seems very unlikely, considering that such a devastating pathogen would significantly impair the ability of nations to wage war, and it is hard to see how warfare would affect all areas of the globe sufficiently to bring about such significant population loss. Global famine also seems unlikely, given that the greater the number of pandemic deaths, the more food stores would be available to survivors. Perhaps the most devastating scenario would be a massive global pandemic followed by a full-scale nuclear war, though it is unclear why a nuclear exchange would follow a pandemic. One can of course devise various hypothetical scenarios, but overall it appears to me that a pathogen would have to have an extremely high fatality rate in order to have the potential to cause human extinction.
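
The arithmetic behind the 99.9999% figure can be checked with a short sketch. The minimum viable population of 7,000 is the figure used above; the world population of roughly 7.8 billion is my own assumption (an approximate 2020 value) and is not stated in the essay.

```python
MIN_VIABLE_POPULATION = 7_000      # figure used in the essay
WORLD_POPULATION = 7_800_000_000   # assumption: approximate 2020 world population


def required_death_fraction(population: int, survivors: int) -> float:
    """Fraction of the population that must die to leave only `survivors` alive."""
    return 1 - survivors / population


def survivors_after(population: int, fatality_rate: float) -> float:
    """People left alive if a fraction `fatality_rate` of the population dies."""
    return population * (1 - fatality_rate)


if __name__ == "__main__":
    fraction = required_death_fraction(WORLD_POPULATION, MIN_VIABLE_POPULATION)
    print(f"Required death fraction: {fraction:.5%}")
    print(f"Survivors at 90% fatality: {survivors_after(WORLD_POPULATION, 0.9):,.0f}")
```

On these numbers the required fraction comes out slightly above 99.9999%, while a pathogen that killed 90% of everyone alive would still leave hundreds of millions of survivors, many orders of magnitude above the assumed minimum viable population.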

In addition to a high fatality rate, an extinction-level pathogen would also have to be sufficiently infectious that it would be able to spread rapidly through human populations. It would need to have a long enough incubation time that infected persons can travel and infect more people before they can be identified and quarantined. It would also need to be able to survive and propagate in a wide range of temperatures and climatic conditions. Finally, it would also need to be sufficiently dangerous to a wide range of ages and genetic populations, since any pockets of immunity would render extinction considerably less likely. Overall, it is highly unclear whether any biological agent with all these properties is even possible. In particular, pathogens which are sufficiently virulent to cause 99% or more fatality rates are likely to place such a burden on human physiology that they would have a short incubation time, potentially rendering it easier to quarantine infected persons. Of course we do not know what is possible at the limits of biology, but given the extreme properties required of such an extinction-level pathogen, in my view it is very unlikely that such a pathogen is even possible.

Extinction level agent technologically feasible

Even if biological agents with the potential of wiping out humanity are theoretically possible, the question remains as to how long it will be until it becomes technologically feasible to engineer such an agent. While our current scientific understanding places significant limitations on what can be engineered, Ord argues that “it is not twentieth-century bioweaponry that should alarm us, but the next hundred years of improvements” (p. 133), which indicates that he believes that biotechnological advances over the next century are likely to enable the creation of a much wider range of dangerous biological agents. Of course, it is impossible to know how rapidly such technology will develop in the coming decades, however I believe that Ord overstates the current capabilities of such technology, and underestimates the challenges in developing pathogens of dramatically greater lethality than existing natural agents.

For example, Ord states that it is possible to “create entire functional viruses from their written code” (p. 128). I believe this claim is misleading, especially when read alongside Ord’s concern about the ease of obtaining synthesised DNA, as it can potentially be read as asserting that viruses can be created by entirely synthetic means using only their DNA. This is false, as the methods cited by Ord describe techniques in which synthesised viral DNA is cultured in cellular extracts, which as Ord also notes is not trivial and requires careful technique (p. 359). This approach still relies critically on utilising the ribosomes and other cellular machinery to translate viral DNA and produce the needed viral proteins. It does not involve the degree of control or understanding of the precise molecular processes involved that would be implied if an intact virus could be produced from its DNA using entirely synthetic means.

Ord also cites the 2012 experiments of Ron Fouchier, who conducted a gain-of-function experiment with H5N1 influenza in ferrets. Ord states that “by the time it passed to the final ferret, his strain of H5N1 had become directly transmissible between mammals” (p. 129). While technically correct, I believe this claim is misleading, since only a few sentences prior Ord states that this strain of influenza had an estimated 60% mortality rate in humans, implying that this would also apply to an airborne variant of the same virus. However in Fouchier’s study, it is reported that “although the six ferrets that became infected via respiratory droplets or aerosol also displayed lethargy, loss of appetite, and ruffled fur, none of these animals died within the course of the experiment.” Furthermore, the mere possibility of airborne transmission says nothing about the efficiency of this transmission mechanism. As reported in the paper:

Although our experiments showed that A/H5N1 virus can acquire a capacity for airborne transmission, the efficiency of this mode remains unclear. Previous data have indicated that the 2009 pandemic A/H1N1 virus transmits efficiently among ferrets and that naïve animals shed high amounts of virus as early as 1 or 2 days after exposure. When we compare the A/H5N1 transmission data with that of [another paper]..., the data shown in Figs. 5 and 6 suggest that A/H5N1 airborne transmission was less robust, with less and delayed virus shedding compared with pandemic A/H1N1 virus.

These qualifications illustrate the fundamental point that most biological systems exist as a set of tradeoffs and balances between competing effects and conflicting needs. Thus changing one aspect of a pathogen, such as its mode of transmission, is likely to have effects on other aspects of the pathogen, such as its lethality, incubation period, susceptibility to immune system attack, or survival outside a host. In theory it may be possible to design a pathogen with properties optimised to be as lethal to humans as possible, but doing so would require far greater understanding of protein folding pathways, protein-protein interactions, gene expression, mechanisms of pathogen invasion, immune system evasion strategies, and other such factors than is currently possessed. Thus it is by no means clear that Ord is correct when he states that “this progress in biotechnology seems unlikely to fizzle out soon: there are no insurmountable challenges looming; no fundamental laws blocking further developments” (p. 128). Indeed, I believe there are many fundamental challenges and gaps in our understanding which prevent the development of pathogens with arbitrarily specified properties.

Extinction level agent produced and delivered

Even if it were technologically possible to produce a pathogen capable of causing human extinction, the research, production, and distribution of such an infectious agent would still actually need to be carried out by an organisation with the capabilities and desire to do so. While Ord’s example of the Aum Shinrikyo cult does demonstrate that such groups exist, the very small number of such attacks historically appears to indicate that such groups do not exist in large numbers. Very few ideologies have an interest in bringing humanity to an end through violent means. Indeed as Ord notes:

For all our flirtation with biowarfare, there appear to have been relatively few deaths from either accidents or use... Exactly why this is so is unclear. One reason may be that bioweapons are unreliable and prone to backfiring, leading states to use other weapons in preference. (p. 132)

Ord partially counters this observation by arguing that the severity of events such as terrorist attacks and incidents of biowarfare follow a power law distribution, with very rare, very high impact events meaning that the average size of past events will underestimate the expected size of future events. However this response does not seem to address the core observation that bioweapons have proven very hard to control, and that very few agents or organisations have any interest in unleashing a pathogen that kills humans indiscriminately. This appears to be reflected in the fact that as far as is publicly known, very few attempts have even been made to deploy such weapons in modern times. I thus believe that we have good reason to think that the number of people and amount of effort devoted to developing such dangerous bioweapons is likely to be low, especially for non-state actors.

Furthermore, Ord fails to consider the practical difficulties of developing and releasing a pathogen sufficiently deadly to cause human extinction. In particular, developing a novel organism would require lengthy research and extensive testing. Even if all the requisite supplies, technology, and expertise over a period of time could be obtained without arousing enough suspicion for the project to be investigated and shut down, there still remains the challenge of how such a pathogen could be tested. No animal model is perfect, and so any novel pathogen would (just like vaccines and other medical treatments) need to be tested on large numbers of human subjects, and likely adjusted in response to results. It would need to be trialed in different environments and climates to determine whether it would spread sufficiently rapidly and survive outside a host long enough. Without such tests, it is virtually impossible that an untested novel pathogen would be sufficiently optimised to kill enough people across a wide enough range of environments to cause human extinction. However, it is hard to see how it would be possible to carry out such widespread testing with a diverse enough range of subjects without drawing the attention of authorities.

A rogue state such as North Korea might be able to circumvent this particular problem. However, that raises a range of new difficulties, such as why it would ever be in the interest of a state actor (as opposed to a death cult terrorist group) to develop such a deadly, indiscriminate pathogen. Ord raises the possibility of its use as a deterrent (akin to the deterrence function of nuclear weapons), but the analogy does not appear to hold up. Nuclear weapons work as a deterrent because their possession can be publicly demonstrated (by testing), their devastating impact is widely known, and there is no practical defence against them. None of these properties are true of an extremely lethal novel pathogen. A rogue state would have great difficulty proving that it possessed such a weapon without making available so much information about the pathogen that the world would likely be able to develop countermeasures to it. As such, it does not appear feasible to use bioweapons as effective deterrents, which may partly explain why despite extensive research into the possibility, no states have yet used them in this manner. As a result of these considerations, I conclude that even if it were technologically possible to develop a pathogen sufficiently lethal to cause human extinction, it is unlikely that anyone would actually have both the desire and the ability to successfully produce and deliver the pathogen.

Failure of timely public policy response

The release of a pathogen that has the potential to cause human extinction does not in itself imply that human extinction would inevitably occur. Whether this would follow depends on the extent of the governmental and societal responses to the outbreak of the novel pandemic, such as quarantines, widespread testing, and contact tracing. In considering the balance of positive and negative effects that organisational and civilisational advances have had on the ability to respond to the risk of pathogens, Ord states that “it is hard to know whether these combined effects have increased or decreased the existential risk from pandemics” (p. 127). This argument, however, seems implausible, since deaths from infectious diseases and pandemics in particular have decreased in recent centuries, with no major pandemics in Western Europe since the early eighteenth century. The disappearance of plague from Western Europe, while still not well understood, plausibly may have been caused at least in part by the improvement of quarantine and public policy responses to plague. In the US, the crude death rate from infectious diseases fell by about 90% over the course of the twentieth century. Furthermore, a successful public policy response to a pathogen outbreak in even a single country would likely be enough to prevent extinction, even if most countries failed to enact a sufficient public policy response. As such, I believe it is unlikely that even an extinction-level novel pathogen would be able to sufficiently evade all public health responses so as to cause human extinction.

Failure of timely biomedical response

In addition to the failure of public policy responses, extinction of humanity by a novel pathogen would also require the failure of any biomedical response to the pandemic. Ord believes that as biological techniques become easier and cheaper, they become accessible to more and more people, and hence represent a greater and greater risk. He argues:

As the pool of people with access to a technique grows, so does the chance it contains someone with malign intent. (p. 134)

This argument, however, appears to only consider one side of the issue. As the pool of people with access to a technique grows, so too does the number of people who wish to use that technique to do good. This includes developing techniques and technologies for more easily detecting, controlling, and curing infectious diseases. It surprises me that Ord never mentions this, since the development of biomedical technologies does not only mean that there is greater scope for use of the technology to cause disease, but also greater scope for the use of new techniques to prevent and cure disease. Indeed, since the prevention of disease receives far more research attention than causing disease, it seems reasonable to assume that our abilities to develop treatments, tests, and vaccines for diseases will develop more rapidly than our abilities to cause disease. There are a range of emerging biomedical technologies that promise to greatly improve our ability to fight existing and novel diseases, including transmissible vaccines, rational design of drugs, and reverse vaccinology. As such, I regard it as unlikely that if biomedical technology had advanced sufficiently to be able to produce an extinction-level pathogen, it would nevertheless fail to develop sufficient countermeasures to the pathogen to at least prevent full human extinction.
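
One way to read the five subsections above is as a chain of conditions, all of which must hold for an engineered pandemic to cause extinction. The following minimal sketch shows how such a chain compounds. The step labels follow the subsection headings, but every number is a hypothetical placeholder chosen only to show the mechanics. None of them is an estimate made in this essay or by Ord, and treating each step as conditional on the previous ones is an assumption of the sketch.

```python
from math import prod
from typing import Mapping

# Hypothetical placeholder values, for illustration only.
# They are NOT estimates taken from this essay or from The Precipice.
PLACEHOLDER_STEPS = {
    "extinction-level agent exists": 0.5,
    "agent is technologically feasible": 0.5,
    "agent is produced and delivered": 0.5,
    "public policy response fails": 0.5,
    "biomedical response fails": 0.5,
}


def joint_probability(steps: Mapping[str, float]) -> float:
    """Multiply the step probabilities together.

    Each value is read as the probability of that step given that all
    earlier steps have occurred (an assumption of this sketch).
    """
    return prod(steps.values())


if __name__ == "__main__":
    print(f"Joint probability: {joint_probability(PLACEHOLDER_STEPS):.3%}")
```

Even when every step is given even odds, the product of five such steps is only a few percent. This shows why a multi-step scenario can be much less likely than any single step considered on its own.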

Unaligned Artificial Intelligence

AI experts and AI timelines

Although Ord appeals to surveys of AI researchers as evidence of the plausibility of the development of superhuman artificial intelligence in the next century, experts in artificial intelligence do not have a good track record of predicting future progress in AI. Massively inflated expectations of the capabilities of symbolic AI systems in the 1950s and 1960s, and of expert systems in the 1980s, are well-known examples of this. More generally, it is unclear why we should even expect AI researchers to have any particular knowledge about the future trajectories of AI capabilities. Such researchers study and develop particular statistical and computational techniques to solve specific types of problems. I am not aware of any focus of their training on extrapolating technological trends, or on investigating historical case studies of technological change. Indeed, it would seem that cognitive psychologists or cognitive neuroscientists might be better placed (although probably still not very well placed) to make judgements about the boundaries of human capability and what would be required for these to be exceeded in a wide range of tasks, since AI researchers have no particular expertise in the limits of human ability. AI researchers generally only consider human-level performance in the context of baseline levels of performance on well-defined tasks such as image recognition, categorisation, or game-playing. This is far removed from being able to make judgements about when AIs would be able to outperform humans on ‘every task’. For example, do AI researchers really have any expertise on when AIs are likely to overtake human ability to do philosophy, serve as political leaders, compose a novel, or teach high school mathematics? These are simply not questions that are studied by AI researchers, and therefore I don’t see any reason why they should be regarded as having special knowledge about them.

These concerns are further emphasised by the inconsistency of researcher responses to AI timeline surveys:

Asked when an AI system would be ‘able to accomplish every task better and more cheaply than human workers’, on average they estimated a 50 percent chance of this happening by 2061. (p. 141)

However in a footnote Ord notes:

Note also that this estimate may be quite unstable. A subset of the participants were asked a slightly different question instead (emphasising the employment consequences by talking of all occupations instead of all tasks). Their time by which there would be a 50% chance of this standard being met was 2138, with a 10% chance of it happening as early as 2036. (p. 362)

Another factor highly pertinent to establishing the relevant set of experts concerns how the topics currently studied by AI researchers relate to the set of methods and techniques eventually used in building an AGI. Ord seems to think that developments of current methods may be sufficient to develop AGI:

One of the leading paradigms for how we might eventually create AGI combines deep learning with an earlier idea called reinforcement learning. (p. 143)

However such current methods, in particular deep learning, are known to be subject to a wide range of limitations. Major concerns include the ease with which adversarial examples can be used to ‘fool’ networks into misclassifying basic stimuli, the lack of established methods for integrating syntactically-structured information with neural networks, the fact that deep learning is task-specific and does not generalise well, the inability of deep learning systems to develop human-like ‘understanding’ that permits robust inferences about the world, and the requirement for very large datasets for deep learning algorithms to be trained on. While it remains possible that all these limitations may be overcome in the future, at present they represent deep theoretical limitations of current methods, and as such I see little reason to expect they can be overcome without the development of substantially new and innovative concepts and techniques. If this is correct, then there seems little reason to expect AI researchers to have any expertise in predicting when such developments are likely to take place. AI researchers study current techniques, but if (as I have argued) such techniques are fundamentally inadequate for the development of true AGI, then such expertise is of limited relevance in assessing plausible AI timelines.

One argument that Ord gives in apparent support of the notion that current methods may in principle be sufficient for the development of AGI relates to the success of using deep neural networks and reinforcement learning to train artificial agents to play Atari games:

The Atari-playing systems learn and master these games directly from the score and the raw pixels on the screen. They are a proof of concept for artificial general agents: learning to control the world from raw visual input; achieving their goals across a diverse range of environments. (p. 141)

I believe this is a gross overstatement. While these developments are impressive, they in no way provide a proof of concept for ‘artificial general agents’, any more than programs developed in the 1950s and 1960s to solve grammatical or geometric problems in simple environments provided such a proof of concept. Atari games are highly simplified environments with comparatively few degrees of freedom, a highly limited number of possible actions, and a clear measure of success (the score). Real-world environments are extremely complicated, with a vast number of possible actions, and often no clear measure of success. Uncertainty also plays little direct role in Atari games, since a complete picture of the current gamespace is available to the agent. In the real world, all information gained from the environment is subject to error, and must be carefully integrated to provide an approximate model of the environment. Given these considerations, I believe that Ord overstates how close we currently are to achieving superhuman artificial intelligence, and understates the difficulties that scaling up current techniques would face in attempting to achieve this goal.

AI has the power to usurp humanity

Ord argues that artificial intelligence that was more intelligent than humans would be able to usurp humanity’s position as the most powerful species on Earth:

What would happen if sometime this century researchers created an artificial general intelligence surpassing human abilities in almost every domain? In this act of creation, we would cede our status as the most intelligent entities on Earth. So without a very good plan to keep control, we should also expect to cede our status as the most powerful species, and the one that controls its own destiny. (p. 143)

The assumption behind this claim appears to be that intelligence alone is the critical determining factor behind which species or entity maintains control over Earth’s resources and future. This premise, however, conflicts with what Ord says earlier in the book:

What set us (humanity) apart was not physical, but mental - our intelligence, creativity, and language...each human’s ability to cooperate with the dozens of other people in their band was unique among large animals. (p. 12)

Here Ord identifies not only intelligence, but also creativity and the ability to cooperate with others as critical to the success of humanity. This seems consistent with the fact that human intelligence, as far as can be determined, has not fundamentally changed over the past 10,000 years, even while our power and capabilities have dramatically increased. What has changed is our ability to cooperate at much larger scales, and also our ability to build upon the achievements of previous generations to gradually increase our knowledge, and build up more effective institutions and practices. Given these considerations, it seems far from obvious to me that the mere existence of an agent more intelligent than an individual human would give that agent the ability to usurp humanity’s position. Indeed, Ord’s own examples seem to further emphasise this point:

History already involves examples of individuals with human-level intelligence (Hitler, Stalin, Genghis Khan) scaling up from the power of an individual to a substantial fraction of all global power. (p. 147)

Whilst we have no clear data on the intelligence of these three individuals, what does seem clear is that none of them achieved the positions they did by acts of profound intellect. They were capable men, with Stalin in particular being very widely read and Hitler known to have a sharp memory for technical details; nevertheless, they were far from being the greatest minds of their generation. Nor did they achieve their positions by ‘scaling up’ from an individual to a world superpower. I think it is more accurate to say that they used their individual talents (military leadership for Genghis Khan, administrative ability and political scheming for Stalin, and oratory and political scheming for Hitler) to gain control over existing power structures (respectively Mongol tribes, the Soviet government, and the German government). They did not build these things from scratch themselves (though Genghis Khan did establish a unified Mongol state, so comes closer than the others), but were able to hijack existing systems and convince enough people to follow their leadership. These skills may be regarded as a subset of a very broad notion of intelligence, but they do not seem to correspond very closely at all to the way we normally use the word ‘intelligence’, nor do they seem likely to be the sorts of things AIs would be very good at doing.

Lacking a physical body to interact with people, it is hard to see how an AI could inspire the same levels of loyalty and fear that these three leaders (and many others like them) relied upon in their subordinates and followers. Of course, AIs could manipulate humans to do this job for them, but this would raise the immense difficulty of ensuring that their human pawns do not usurp their authority, which would be very hard if the humans that the AI is attempting to control do not actually have any personal loyalty to the AI itself. Perhaps the AI could pit multiple humans against one another and retain control over them in this manner (indeed that is effectively what Hitler did with his subordinates); however, doing so generally requires some degree of trust and loyalty on the part of one’s subordinates to be sustainable. Such methods are also very difficult to manage (such as the need to prevent plots by subordinates against the leader), and place clear limits on how effectively the central ruler can personally control everything. Of course one could always say ‘if an AI is intelligent enough it can solve these problems’, but my argument is precisely that it is not at all clear to me that ‘intelligence’ is even the key factor determining success. A certain level of intelligence is needed, but various forms of subtle interpersonal skills distinct from intelligence seem far more important in acquiring and maintaining power, and these are skills which a non-embodied AI would face particular difficulty in acquiring.

Overall, I am not convinced that the mere existence of a highly-intelligent AI would imply anything about the ability of that AI to acquire significant power over humanity. Gaining power requires not only individual intelligence, but also the ability to coordinate large numbers of people, to exercise creativity, to inspire loyalty, to build upon past achievements, and much else. I am not saying that an AI could not do these things, only that it would not automatically be able to do them simply by being very intelligent, nor would it necessarily be able to do them very quickly.

AI has reason to usurp humanity

Although Ord’s general case for concern about AI does not appeal to any specific vision for what AI might look like, an analysis of the claims that he makes indicates that his arguments are mostly relevant to a specific type of agent based on reinforcement learning. He says:

One of the leading paradigms for how we might eventually create AGI combines deep learning with an earlier idea called reinforcement learning... unfortunately, neither of these methods can be easily scaled up to encode human values in the agent’s reward function. (p. 144)

While Ord presents this as merely a ‘leading paradigm’, subsequent discussion appears to assume that an AI would likely embody this paradigm. For example he remarks:

An intelligent agent would also resist attempts to change its reward function to something more aligned with human values. (p. 145)

Similarly he argues:

The real issue is that AI researchers don’t yet know how to make a system which, upon noticing this misalignment, updates its ultimate values to align with ours rather than updating its instrumental goals to overcome us. (p. 146)

While this seems plausible in the case of a reinforcement learning agent, it seems far less clear that it would apply to another form of AI. In particular, it is not even clear if humans actually possess anything that corresponds to a ‘reward function’, nor is it clear that such a thing would be immutable with experience or over the lifespan. To assume that an AI would have such a thing is therefore to make specific assumptions about the form such an AI would take. This is also apparent when Ord argues:

It (the AI) would seek to acquire additional resource, computational, physical or human, as these would let it better shape the world to receive higher reward. (p. 145)

Again, this remark seems explicitly to assume that the AI is maximising some kind of reward function. Humans often act not as maximisers but as satisficers, choosing an outcome that is good enough rather than searching for the best possible outcome. Often humans also act on the basis of habit or following simple rules of thumb, and are often risk averse. As such, I believe that to assume that an AI agent would be necessarily maximising its reward is to make fairly strong assumptions about the nature of the AI in question. Absent these assumptions, it is not obvious why an AI would necessarily have any particular reason to usurp humanity.

Related to this question about the nature of AI motivations, I was surprised that (as far as I could find) Ord says nothing about the possible development of artificial intelligence through the avenue of whole brain emulation. Although currently infeasible, simulation of the neural activity of an entire human brain is a potential route to AI which requires only very minimal theoretical assumptions, and no major conceptual breakthroughs. A low-level computer simulation of the brain would only require sufficient scanning resolution to measure neural connectivity and parameters of neuron physiology, and sufficient computing power to run the simulation in reasonable time. Plausible estimates have been made which indicate that, extrapolating from current trends, such technologies are likely to be developed by the second half of this century. Although it is by no means certain, I believe it is likely that whole brain emulation will be achievable before it is possible to build a general artificial intelligence using techniques that do not attempt to directly emulate the biology of the brain. This potentially results in a significantly different analysis of the potential risks than that presented by Ord. In particular, while misaligned values still represent a problem for emulated intelligences, we do at least possess an in-principle method for aligning their values, namely the same sort of socialisation that is used with general success in aligning the values of the next generation of humans. As a result of such considerations, I am not convinced that it is especially likely that an artificial intelligence would have any particular reason or motivation to usurp humanity over the next century.

AI retains permanent control over humanity

Ord seems to assume that once an AI attained a position of power over the destiny of humanity, it would inevitably maintain this position indefinitely. For instance he states:

Such an outcome needn’t involve the extinction of humanity. But it could easily be an existential catastrophe nonetheless. Humanity would have permanently ceded its control over the future. Our future would be at the mercy of how a small number of people set up the computer system that took over. If we are lucky, this could leave us with a good or decent outcome, or we could just as easily have a deeply flawed or dystopian future locked in forever. (p. 148)

In this passage Ord speaks of the AI as if it is simply a passive tool, something that is created and forever after follows its original programming. Whilst I do not say this is impossible, I believe that it is an unsatisfactory way to describe an entity that is supposedly a superintelligent agent, something capable of making decisions and taking actions on the basis of its own volition. Here I do not mean to imply anything about the nature of free will, only that we do not regard the behaviour of humans as simply the product of what evolution has ‘programmed into us’. While it must be granted that evolutionary forces are powerful in shaping human motivations and actions, nevertheless the range of possible sets of values, social arrangements, personality types, life goals, beliefs, and habits that is consistent with such evolutionary forces is extremely broad. Indeed, this is presupposed by Ord’s claim that “humanity is currently in control of its own fate. We can choose our future.” (p. 142).

If humanity’s fate is in our own hands and not predetermined by evolution, why should we not also say that the fate of a humanity dominated by an AI would be in the hands of that AI (or collective of AIs that share control), rather than in the hands of the designers who built that AI? The reason I think this is important is that it highlights the fact that an AI-dominated future is by no means one in which the AI’s goals, beliefs, motivations, values, or focus is static and unchanging. To assume otherwise is to assume that the AI in question takes a very specific form which, as I have argued above, I regard as being unlikely. This significantly reduces the likelihood that a current negative outcome with AI represents a permanent negative outcome. Of course, this is irrelevant if the AI has driven humans to extinction, but it becomes highly relevant in other situations in which an AI has placed humans in an undesirable, subservient position. I am not convinced that such a situation would be perpetuated indefinitely.

Probability Estimates

Taking into consideration the analysis I have presented above, I would like to close by presenting my best-guess estimates of the probability of an existential catastrophe occurring within the next century as a result of an engineered pandemic or unaligned artificial intelligence. These estimates should not be taken very seriously. I do not believe we have enough information to make sensible quantitative estimates about these eventualities. Nevertheless, I present my estimates largely in order to illustrate the extent of my disagreement with Ord’s estimates, and to illustrate the key considerations I examine in order to arrive at an estimate.

Probability of engineered pandemics

Considering the issue of how an engineered pandemic could lead to the extinction of humanity, I identify five separate things that must occur, which to a first approximation I will regard as being conditionally independent of one another:

1. There must exist a biological pathogen with the right balance of properties to have the potential of leading to human extinction.

2. It must become technologically feasible within the next century to evolve or engineer this pathogen.

3. The extinction-level agent must be actually produced and delivered by some person or organisation.

4. The public policy response to the emerging pandemic must fail in all major world nations.

5. Any biomedical response to the pandemic, such as developing tests, treatments, or vaccines, must fail to be developed within sufficient time to prevent extinction.

On the basis of the reasoning presented in the previous sections, I regard 1) as very unlikely, 2), 4), and 5) as unlikely, and 3) as slightly less unlikely. I will operationalise ‘very unlikely’ as corresponding to a probability of 1%, ‘unlikely’ as corresponding to 10%, and ‘slightly less unlikely’ as corresponding to 20%. Note that each of these probabilities is taken as conditional on all the previous elements; so for example my claim is that conditional on an extinction-level pathogen being possible, there is a 10% chance that it will be technologically feasible to produce this pathogen within the next century. Combining all these elements results in the following probability:

P(bio extinction) = P(extinction-level agent exists) × P(extinction-level agent technologically feasible) × P(extinction-level agent produced and delivered) × P(failure of timely public policy response) × P(failure of timely biomedical response)

P(bio extinction) = 0.01×0.1×0.2×0.1×0.1 = 2×10^(-6)

In comparison, Ord’s estimated risk from engineered pandemics is 1/30, or roughly 3×10^(-2). Ord’s estimated risk is thus more than 10,000 times larger than mine.
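As a quick check on the arithmetic, the product above can be reproduced in a few lines of Python. This is only a sketch: the stage labels and the names bio_stages, p_bio, and ord_bio are illustrative shorthand, the probabilities are the ones stated above, and the calculation simply multiplies the conditional probabilities together.

```python
import math

# Conditional probabilities for the five stages, as stated in the text.
# Each value is conditional on all the stages listed before it.
bio_stages = {
    "extinction-level agent exists": 0.01,
    "agent technologically feasible this century": 0.1,
    "agent produced and delivered": 0.2,
    "public policy response fails": 0.1,
    "biomedical response fails": 0.1,
}

# Chain the conditional probabilities by multiplying them together.
p_bio = math.prod(bio_stages.values())

# Ord's estimate for engineered pandemics (1 in 30).
ord_bio = 1 / 30

print(f"P(bio extinction) = {p_bio:.1e}")
print(f"Ord's estimate / this estimate = {ord_bio / p_bio:,.0f}")
```

Because the estimate is a simple product, changing any single stage by a factor of ten changes the overall result by the same factor, which is one reason the individual numbers should be read as rough orders of magnitude.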

Probability of unaligned artificial intelligence

Considering the issue of unaligned artificial intelligence, I identify four key stages that would need to happen for an existential catastrophe to occur, which again I will regard to a first approximation as being conditionally independent of one another:

1. Artificial general intelligence, or an AI which is able to out-perform humans in essentially all human activities, is developed within the next century.

2. This artificial intelligence acquires the power to usurp humanity and achieve a position of dominance on Earth.

3. This artificial intelligence has a reason/motivation/purpose to usurp humanity and achieve a position of dominance on Earth.

4. This artificial intelligence either brings about the extinction of humanity, or otherwise retains permanent dominance over humanity in a manner so as to significantly diminish our long-term potential.

On the basis of the reasoning presented in the previous sections, I regard 1) as roughly as likely as not, and 2), 3), and 4) as being unlikely. Combining all these elements results in the following probability:

P(AI x-risk) = P(AI of sufficient capability is developed) × P(AI gains power to usurp humanity) × P(AI has sufficient reason to usurp humanity) × P(AI retains permanent usurpation of humanity)

P(AI x-risk) = 0.5×0.1×0.1×0.1 = 5×10^(-4)

In comparison, Ord’s estimated risk from unaligned AI is 1/10, or 10^(-1). Ord’s estimated risk is thus roughly 200 times larger than mine.
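The same check can be run for the AI estimate. Again this is a minimal sketch: the stage labels and the names ai_stages, p_ai, and ord_ai are illustrative, with ‘roughly as likely as not’ taken as 0.5 and ‘unlikely’ as 0.1, matching the figures used above.

```python
import math

# Conditional probabilities for the four stages, as stated in the text.
# Each value is conditional on all the stages listed before it.
ai_stages = {
    "AGI developed within the next century": 0.5,
    "AI acquires power to usurp humanity": 0.1,
    "AI has reason to usurp humanity": 0.1,
    "extinction or permanent dominance": 0.1,
}

# Chain the conditional probabilities by multiplying them together.
p_ai = math.prod(ai_stages.values())

# Ord's estimate for unaligned AI (1 in 10).
ord_ai = 1 / 10

print(f"P(AI x-risk) = {p_ai:.1e}")
print(f"Ord's estimate / this estimate = {ord_ai / p_ai:,.0f}")
```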

Arriving at credible estimates

Although I do not think the specific numbers I present should be taken very seriously, I would like to defend the process I have gone through in estimating these risks. Specifically, I have identified the key processes I believe would need to occur in order for extinction or other existential catastrophe to occur, and then assessed how likely each of these processes would be to occur on the basis of the relevant historical, scientific, social, and other considerations that I believe to be relevant. I then combine these probabilities to produce an overall estimate.

Though far from perfect, I believe this process is far more transparent than the estimates provided by Ord, for which no explanation is offered as to how they were derived. This means that it is effectively impossible to subject them to critical scrutiny. Indeed, Ord even states that his probabilities “aren’t simply an encapsulation of the information and argumentation in the chapters on the risks” (p. 167), which seems to imply that it is not even possible to subject them to critical analysis on the basis of the information presented in this book. While he defends this on the basis that what he knows about the risks “goes beyond what can be distilled into a few pages” (p. 167), I do not find this a very satisfactory response given the total lack of explanation of these numbers in a book of over 400 pages.

Conclusion

In this essay I have argued that in his book The Precipice, Toby Ord has failed to provide a compelling argument that humanity faces a ‘precipice’ with unprecedentedly high and clearly unsustainable levels of existential risk. My main objective was to present an alternative analysis of the risks associated with engineered pandemics and unaligned artificial intelligence, highlighting issues and considerations that I believe Ord does not grant sufficient attention. Furthermore, on the basis of this analysis I presented an alternative set of probability estimates for these two risks, both of which are considerably lower than those presented by Ord. While far from comprehensive or free from debatable premises, I hope that the approach I have outlined here provides a different perspective on the debate, and helps in the development of a nuanced understanding of these important issues.
