The last several years have seen a sharp rise in activity on the topic of AI safety. Institutional and academic support has vindicated several elements of the embryonic Friendly AI research program. However, I believe that the degree of attention AI safety has received is disproportionate when compared to other aspects of artificial intelligence and the far future. The pattern resembles an “availability cascade”, defined by Wikipedia as follows:
An availability cascade is a self-reinforcing cycle that explains the development of certain kinds of collective beliefs. A novel idea or insight, usually one that seems to explain a complex process in a simple or straightforward manner, gains rapid currency in the popular discourse by its very simplicity and by its apparent insightfulness. Its rising popularity triggers a chain reaction within the social network: individuals adopt the new insight because other people within the network have adopted it, and on its face it seems plausible. The reason for this increased use and popularity of the new idea involves both the availability of the previously obscure term or idea, and the need of individuals using the term or idea to appear to be current with the stated beliefs and ideas of others, regardless of whether they in fact fully believe in the idea that they are expressing. Their need for social acceptance, and the apparent sophistication of the new insight, overwhelm their critical thinking.
In this post I’m going to argue for a different approach which should bring more balance to the futurist ecosystem. There are significant potential problems which are related to AI development but are not instances of value alignment and control, and I think that they are more deserving of additional effort at the margin.
The prospects for a single superintelligence
Bostrom (2016) says that a recursively self-improving artificial general intelligence with a sufficient lead over competitors would have a decisive strategic advantage that is likely to ensure that it controls the world. While this is plausible, it is not inevitable and may not be the most likely scenario.
Little argument has been given that this scenario should be our default expectation rather than merely a plausible one. Yudkowsky (2013) presents an argument that the history of human cognitive evolution indicates that an exponential takeoff in intelligence should be expected, though the argument has yet to be formally put together and presented. On the other side, computer scientists frequently point to complexity theory, which suggests that getting better at problem solving rapidly becomes very difficult as performance approaches asymptotic limits. In broader economic strokes, Bloom et al (2017) argue that there is a general trend of diminishing returns to research. Both of these points suggest that for an agent to acquire a decisive strategic advantage in cognition would either take a very long time or not happen at all.
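To make the contrast between these two pictures concrete, here is a toy sketch. It is not a model taken from Yudkowsky or from Bloom et al; it simply assumes a hypothetical agent whose capability grows each step by an amount proportional to its current capability raised to the power `returns`. The function name, the growth rule and all parameter values are illustrative assumptions.

```python
def steps_to_reach(target, returns, rate=0.1, start=1.0, max_steps=1_000_000):
    """Count discrete steps until capability first reaches `target`.

    Each step adds rate * capability**returns. This growth rule is an
    illustrative assumption, not a model from any cited source.
      returns = 1.0 -> improvements compound (exponential growth)
      returns < 1.0 -> diminishing returns (slower, polynomial growth)
      returns = 0.0 -> a fixed gain per step (linear growth)
    """
    capability = start
    for step in range(1, max_steps + 1):
        capability += rate * capability ** returns
        if capability >= target:
            return step
    return None  # target not reached within max_steps


if __name__ == "__main__":
    # Compare how long a 1000x improvement takes under each assumption.
    for returns in (1.0, 0.5, 0.0):
        print(returns, steps_to_reach(1000.0, returns))
```

Under these assumptions the compounding case reaches a thousandfold improvement in far fewer steps than either diminishing-returns case. The sketch says nothing about which exponent is realistic; that is exactly the point in dispute.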
It seems to me, intuitively, that if superintelligence is the sort of thing that one agent cannot obtain rapidly enough to outcompete all other agents, then it’s also the sort of thing which cannot be obtained rapidly enough by a small subset of agents, like three or four of them. So it will be widespread, or alternatively, it cannot be obtained at all, leaving billions of humans or other agents at the top of the hierarchy. So while I don’t think that a true multi-agent scenario (with scores or more agents, as is typically meant by the term in game theory) is inevitable in the event that there is no single superintelligence, I think it’s conditionally probable.
The importance of multi-agent analysis: three scenarios
Whole brain emulation and economic competition
Robin Hanson (2016) writes that the future of human civilization will be a fast-growing economy dominated by whole brain emulations. The future looks broadly good in this scenario given approximately utilitarian values and the assumption that ems are conscious, with a large and growing population of minds which are optimized for satisfaction and productivity and free of disease. Needless to say, if either of these premises fails, the em scenario looks very problematic. Even if both hold, other aspects of it could lead to suboptimal utility: social hierarchy, wealth inequality and economic competition. Also, while Hanson gives a very specific picture of the type of society which “ems” will inhabit, he notes that the conjunction of all his claims is extremely unlikely, so there is room for unforeseen issues to arise. It is plausible to me that the value of an em society is heavily contingent upon how ems are built, implemented and regulated.
However, the idea of whole brain emulation as a path to general artificial intelligence has been criticized and is a minority view. Bostrom (2016) argues that there seem to be greater technological hurdles to em development than to other kinds of progress in intelligence. The best current AI is far more capable than the best current emulation (OpenWorm). Industry and academia seem to be placing much more effort into even the very speculative strains of AI research than into emulation.
The future of evolution
If humans are not superseded by a monolithic race of ems, then trends in technological progress and evolution might have harmful effects upon the composition of the population. Bostrom (2009) writes that “freewheeling evolutionary developments, while continuing to produce complex and intelligent forms of organization, lead to the gradual elimination of all forms of being that we care about.” With the relaxation of contemporary human social and biological constraints, two possibilities are plausible: a Malthusian catastrophe where the population expands until welfare standards are neutral or negative, and the evolution of agents which outperform existing ones but without the same faculties of consciousness. Either of these scenarios would entail the extinction of most or all that we find valuable.
Andres Gomez Emilsson also writes that this is a possibility on his blog, saying:
I will define a pure replicator, in the context of agents and minds, to be an intelligence that is indifferent towards the valence of its conscious states and those of others. A pure replicator invests all of its energy and resources into surviving and reproducing, even at the cost of continuous suffering to themselves or others. Its main evolutionary advantage is that it does not need to spend any resources making the world a better place.
Bostrom does not believe that the problem is unavoidable, saying that a ‘singleton’ could combat this process. By singleton he refers not just to a superintelligence but also to any global governing body or even a set of moral codes with the right properties. He writes that such an institution should implement “a coordinated policy to prevent internal developments from ushering it onto an evolutionary trajectory that ends up toppling its constitutional agreement, and doing this would presumably involve modifying the fitness function for its internal ecology of agents.”
Augmented intelligence and military competition
Daniel McIntosh (2010) writes that the near-inevitable adoption of transhuman technologies poses a significant security dilemma due to the political, economic, and battlefield advantages provided by agents with augmented cognitive and physical capabilities. Critics who argue for restraint “tend to deemphasize the competitive and hedonic pressures encouraging the adoption of these products.” Not only is this a problem on its own, but I see no reason to think that the conditions described above wouldn’t apply to scenarios where AI agents, rather than transhumans or posthumans, turned out to be the primary actors and decision-makers.
Whatever the type of agent, arms races in future technologies would lead to opportunity costs in military expenditures and would interfere with the project of improving welfare. It seems likely that agents designed for security purposes would have preferences and characteristics which fail to optimize for the welfare of themselves and their neighbors. It’s also possible that an arms race would destabilize international systems and act as a catalyst for warfare.
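The structure of this dilemma can be sketched as a two-player game. The following is a minimal illustration, not McIntosh's own model: the two hypothetical states, the action names and the payoff numbers are all assumptions, and only the ordering of the payoffs matters.

```python
from itertools import product

ACTIONS = ("restrain", "arm")

# PAYOFFS[(a, b)] = (payoff to state A, payoff to state B).
# The numbers are made up for illustration; only their ordering matters.
PAYOFFS = {
    ("restrain", "restrain"): (3, 3),  # resources go to welfare instead
    ("restrain", "arm"): (0, 4),       # A is left vulnerable
    ("arm", "restrain"): (4, 0),       # B is left vulnerable
    ("arm", "arm"): (1, 1),            # costly arms race
}


def pure_nash_equilibria(payoffs, actions):
    """Return action pairs where neither player gains by deviating alone."""
    equilibria = []
    for a, b in product(actions, repeat=2):
        pay_a, pay_b = payoffs[(a, b)]
        # A's action is a best response if no alternative does better against b.
        a_is_best = all(payoffs[(alt, b)][0] <= pay_a for alt in actions)
        # B's action is a best response if no alternative does better against a.
        b_is_best = all(payoffs[(a, alt)][1] <= pay_b for alt in actions)
        if a_is_best and b_is_best:
            equilibria.append((a, b))
    return equilibria


if __name__ == "__main__":
    print(pure_nash_equilibria(PAYOFFS, ACTIONS))
```

Under these assumed payoffs the only pure-strategy equilibrium is mutual armament, even though both states would be better off under mutual restraint. Nothing in the calculation depends on whether the players are humans, transhumans or AI agents.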
These trends might continue indefinitely with technological progress. McIntosh rejects the assumption that a post-singularity world would be peaceful:
In a post-singularity, fifth-generation world, there would always be the possibility that the economic collapse or natural disaster was not the result of chance, but of design. There would always be the possibility that internal social changes are being manipulated by an adversary who can plan several moves ahead, using your own systems against you. The systems themselves, in the form of intelligences more advanced than we can match, could be the enemy. Or it might be nothing more than paranoid fantasies. The greatest problem that individuals and authorities might have to deal with may be that one will never be sure that war is not already under way. Just as some intelligence analysts cited the rule that “nothing is found that is successfully hidden” – leading to reports of missile gaps and Iraqi WMD – a successful fifth generation war would [be] one that an opponent never even realized he lost.
Almost by definition, we cannot precisely predict what will happen in a post-singularity world or develop policies and tools that will be directly applicable in such a world. But this possibility highlights the importance of building robust cooperative systems from the ground up, rather than assuming that technological changes will somehow remove these problems. A superintelligent agent with a sufficient advantage over other agents would presumably be able to control a post-singularity world sufficiently to avoid this, but as has been noted, it’s not clear that this is the most likely scenario.
Multi-agent systems are neglected
The initiatives and independent individuals close to the EA sphere who are working towards developing reliable, friendly AI include the Machine Intelligence Research Institute, the Future of Humanity Institute, Berkeley’s Center for Human-Compatible AI, Roman Yampolskiy, and, as far as I can tell, all the effective altruists who are students of AI. Less attention is paid to multi-agent outcomes: Robin Hanson, Nick Bostrom and Andres Gomez Emilsson seem to be the only ones who have done research on them (and Bostrom seems to be focused on superintelligence), while the Foundational Research Institute has given a general nod in this direction with its concerns over AI suffering, cooperation, and multipolar takeoffs.
The disparity is preserved as you look farther afield. Pragmatic industry-oriented initiatives to make individual AI systems safe, ethical and reliable include the Partnership on AI among the six major tech companies, some attention from the White House on the subject, and a notable amount of academic work at universities. The work in universities and industry from researchers on multi-agent systems and game theory seems to be entirely focused on pragmatic problems like distributed computational systems and traffic networks; only a few researchers have indicated the need for analyzing multi-agent systems of the future, let alone actually done so. Finally, in popular culture, Bostrom’s Superintelligence has received 319 Amazon reviews to Age of Em’s 30 despite being published at a similar time, and the disparity in general media and journalism on the two general topics seems comparably large.
I do not expect this to change in the future. Multi-agent outcomes are varied and complex, while superintelligence is highly available and catchy. My conclusion is that the former is significantly more neglected than the latter.
Is working on multi-agent systems of the future a tractable project?
The main point of Scott Alexander’s “Meditations on Moloch” is essentially that “the only way to avoid having all human values gradually ground down by optimization-competition is to install a Gardener over the entire universe who optimizes for human values.” In other words, given the problems which have been described above, the only way to actually achieve a really valuable society is to have a singleton which has the right preferences and keeps everyone in line.
This is not different from what Bostrom argues. But remember that the singleton need not be a superintelligence with a decisive strategic advantage. This is fortunate, since it is plausible that computational difficulties will prevent such an entity from ever existing. Instead, the Gardener of the universe might be a much more complex set of agents and institutions. For instance, Peter Railton and Steve Petersen are (I believe) both working on arguments that agents will be linked via a teleological thread where they accurately represent the value functions of their ancestors. We’ll need to think more carefully about how to implement this sort of thing in a way that reliably maximizes welfare.
This is why analysis in multi-agent game theory and mechanism design is important. The very idea behind game theory in general is that you can find useful conclusions by abstracting away from the details of a situation and only looking at players as abstract entities with basic preferences and strategies. This means that the resulting analyses and institutional designs are likely to be pertinent to a wide range of scenarios of technological progress, whichever kind of agent ends up dominant.
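As a rough illustration of what such an abstract analysis looks like, consider a toy public-goods game among `n` identical agents, with an institution that fines defectors. This is a sketch under stated assumptions rather than a model from any of the works cited here: the payoff rule, the function names and the parameter values are all hypothetical.

```python
def payoff(cooperates, num_cooperators, n, benefit, cost, fine):
    """Payoff to one agent in a toy public-goods game (assumed rule).

    Every cooperator produces `benefit`, shared equally among all n agents.
    Cooperators pay `cost`; defectors pay `fine` to the enforcing institution.
    """
    shared = benefit * num_cooperators / n
    return shared - cost if cooperates else shared - fine


def all_cooperate_is_equilibrium(n, benefit, cost, fine):
    """True if no single agent gains by defecting when everyone cooperates."""
    stay = payoff(True, n, n, benefit, cost, fine)
    deviate = payoff(False, n - 1, n, benefit, cost, fine)
    return stay >= deviate


def all_defect_is_equilibrium(n, benefit, cost, fine):
    """True if no single agent gains by cooperating when everyone defects."""
    stay = payoff(False, 0, n, benefit, cost, fine)
    deviate = payoff(True, 1, n, benefit, cost, fine)
    return stay >= deviate


if __name__ == "__main__":
    n, benefit, cost = 50, 10.0, 1.0
    for fine in (0.0, 1.0):
        print(
            fine,
            all_cooperate_is_equilibrium(n, benefit, cost, fine),
            all_defect_is_equilibrium(n, benefit, cost, fine),
        )
```

With no fine, universal defection is the stable outcome in this toy game; with a fine larger than `cost - benefit / n`, universal cooperation becomes stable instead. The analysis never asks whether the agents are ems, augmented humans or AIs, which is the sense in which such results can carry over between scenarios.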
While ideas of preventing evolution, economic competition and arms races sound extremely difficult, there is some historical precedent for human institutions to install robust regulations and international agreements on this type of issue. Admittedly, none of it has been on nearly the same scale that would be required to solve the problems described above. But because this line of research is at a preliminary stage, I think that additional research, or at minimum a literature review, is needed to investigate the various possibilities which we might pursue. Also, there is a similar problem with cooperation when it comes to ordinary AI safety anyway (Armstrong et al 2016).
Conclusion and proposal
I believe I have shown that recent interest in AI and the future of humanity has disproportionately neglected the idea of working on a broader range of futures in which society is not controlled by a single agent. There is still value in AI safety work insofar as alignment and control would help us with building the right agents in multi-agent scenarios, but there are other parts of the picture which need to be explored.
First, there are specific questions which should be answered. How likely are the various scenarios described above, and how can we ensure that they turn out well? Should we prefer that society is governed by a superintelligence with a decisive strategic advantage, and if so, then how much of a priority is it?
Second, there are specific avenues where practical work now can uncover the proper procedures and mindsets for increasing the probability of a positive future. Aside from setting precedents for international cooperation on technical issues, we can start steering the course of machine ethics as it is implemented in modern-day systems. Better systems of machine ethics which don’t require superintelligence to be implemented (as coherent extrapolated volition does) are likely to be valuable for mitigating potential problems involved with AI progress, although they won’t be sufficient (Brundage 2014). Generally speaking, we can apply tools of game theory, multi-agent systems and mechanism design to issues of artificial intelligence, value theory and consciousness.
Given the multiplicity of the issues and the long timeline from here to the arrival of superhuman intelligence, I would like to call for a broader, multifaceted approach to the long term future of AI and civilization. Rather than a single-minded focus on averting a particular failure mode, this should be a more ambitious and positive project towards a pattern of positive and self-reinforcing interactions between social institutions and intelligent systems, supported by a greater amount of human and financial capital.
References
Armstrong, Stuart et al (2016). Racing to the Precipice: a Model of Artificial Intelligence Development. AI & Society.
Bloom, Nicholas et al (2017). Are Ideas Getting Harder To Find?
Bostrom, Nick (2009). The Future of Human Evolution. Bedeutung.
Bostrom, Nick (2016). Superintelligence. Oxford University Press.
Brundage, Miles (2014). Limitations and Risks of Machine Ethics. Journal of Experimental & Theoretical Artificial Intelligence.
Hanson, Robin (2016). Age of Em. Oxford University Press.
McIntosh, Daniel (2010). The Transhuman Security Dilemma. Journal of Evolution and Technology.
Yudkowsky, Eliezer (2013). Intelligence Explosion Microeconomics.