When people first began to discuss advanced artificial intelligence, existing AI was rudimentary at best, and we had to rely on ideas about human thinking and extrapolate. Now, however, we've developed many different advanced AI systems, some of which outperform human thinking on certain tasks. In this talk from EA Global 2018: London, Eric Drexler argues that we should use this new data to rethink our models for how superintelligent AI is likely to emerge and function.
A transcript of Eric's talk is below, which CEA has lightly edited for clarity. You can also watch this talk on YouTube, or read the transcript on effectivealtruism.org.
I've been working in this area for quite a while. The chairman of my doctoral committee was one Marvin Minsky. We had some discussions on AI safety around 1990. He said I should write them up. I finally got around to writing up some developed versions of those ideas just very recently, so that's some fairly serious procrastination. Decades of procrastination on something important.
For years, one couldn't talk about advanced AI. One could talk about nanotechnology. Now it's the other way around. You can talk about advanced AI, but not about advanced nanotechnology. So this is how the Overton window moves around.
What I would like to do is to give a very brief presentation which is pretty closely aligned with talks I've given at OpenAI, DeepMind, FHI, and Bay Area Rationalists. Usually I give this presentation to a somewhat smaller number of people, and structure it more around discussion. But what I would like to do, still, is to give a short talk, put up points for discussion, and encourage something between Q&A and discussion points from the audience.
Okay so, when I say "Reframing Superintelligence," what I mean is thinking about the context of emerging AI technologies as a process rolling forward from what we see today. And asking, "What does that say about likely paths forward?" Such that whatever it is that you're imagining needs to emerge from that context or make sense in that context. Which I think reframes a lot of the classic questions. Most of the questions don't go away, but the context in which they arise, the tools available for addressing problems, look different. That's what we'll be getting into.
Once upon a time, when we thought about advanced AI, we didn't really know what AI systems were likely to look like. It was very unknown. People thought in terms of developments in logic and other kinds of machine learning, different from the deep learning that we now see moving forward with astounding speed. And people reached for an abstract model of intelligent systems. And what intelligent systems do we know? Well, actors in the world like ourselves. We abstract from that very heavily and you end up with rational, utility-directed agents.
Today, however, we have another source of information beyond that abstract reasoning, which applies to a certain class of systems. And information that we have comes from the world around us. We can look at what's actually happening now, and how AI systems are developing. And so we can ask questions like, "Where do AI systems come from?" Well, today they come from research and development processes. We can ask, "What do AI systems do today?" Well, broadly speaking, they perform tasks. Which I think of, or will describe, as "performing services." They do some approximation or they do something that someone supposedly wants in bounded time with bounded resources. What will they be able to do? Well, if we take AI seriously, AI systems will be able to automate asymptotically all human tasks, and more, at a piecemeal and asymptotically general superintelligent level. So we said AI systems come from research and development. Well, what is research and development? Well, it's a bunch of tasks to automate. And, in particular, they're relatively narrow technical tasks which are, I think, uncontroversially automate-able on the path to advanced AI.
So the picture is of AI development moving forward broadly along the lines that we're seeing. Higher-level capabilities. More and more automation of the AI R&D process itself, which is an ongoing process that's moving quite rapidly. AI-enabled automation and also classical software techniques for automating AI research and development. And that, of course, leads to acceleration. Where does that lead? It leads to something like recursive improvement, but not the classic recursive improvement of an agent that is striving to be a more intelligent, more capable agent. But, instead, recursive improvement where an AI technology base is being advanced at AI speed. And that's a development that can happen incrementally. We see it happening now as we take steps toward advanced AI that is applicable to increasingly general and fast learning. Well, those are techniques that will inevitably be folded into the ongoing AI R&D process. Developers, given some advance in algorithms and learning techniques, and a conceptualization of how to address more and more general tasks, will pounce on those, and incorporate them into a broader and broader range of AI services.
So where that leads is to asymptotically comprehensive AI services. Which, crucially, includes the service of developing new services. So increasingly capable, increasingly broad, increasingly piecemeal and comprehensively superintelligent systems that can work with people, and interact with people in many different ways to provide the service of developing new services. And that's a kind of generality. That is a general kind of artificial intelligence. So a key point here is that the C in CAIS, C in Comprehensive AI Services does the work of the G in AGI. Why is it a different term? To avoid the implication... when people say AGI they mean AGI agent. And we can discuss the role of agents in the context of this picture. But I think it's clear that a technology base is not inherently in itself an agent. In this picture agents are not central, they are products. They are useful products of diverse kinds for providing diverse services. And so with that, I would like to (as I said, the formal part here will be short) point to a set of topics.
They kind of break into two categories. One is about short paths to superintelligence, and I'll argue that this is the short path. The topic of AI services and agents, including agent services, versus the concept of "The AI" which looms very large in people's concepts of future AI. I think we should look at that a little bit more closely. Superintelligence as something distinct from agents, superintelligent non-agents. And the distinction between general learning and universal competence. People have, I think, misconstrued what intelligence means and I'll take a moment on that. If you look at definitions from I. J. Good in the 1960s on ultra-intelligence, and more recently from Bostrom and so on (I work across the hall from Nick) on superintelligence, the definition is something like "a system able to outperform any person in any task whatsoever." Well, that implies general competence, at least as ordinarily read. But there's some ambiguity over what we mean by the word "intelligence" more generally. We call children intelligent and we call senior experts intelligent. We call a child intelligent because the child can learn, not because the child can perform at a high level in any particular area. And we call an expert who can perform at a high level intelligent not because the expert can learn - in principle you could turn off learning capacity in the brain - but because the expert can solve difficult problems at a high level.
So learning and competence are dissociable components of intelligence. They are in fact quite distinct in machine learning. There is a learning process and then there is an application of the software. And when you see discussion of intelligent systems that does not distinguish between learning and practice, and treats action as entailing learning directly, there's a confusion there. There's a confusion about what intelligence means and that's, I think, very fundamental. In any event, looking toward safety-related concerns, there are things to be said about predictive models of human concerns. AI-enabled solutions to AI-control problems. How this reframes questions of technical AI safety. Issues of services versus addiction, addictive services and adversarial services. Services include services you don't want. Taking superintelligent services seriously. And a question of whether faster development is better.
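Note: the separation between a learning process and the later application of what was learned can be illustrated with a minimal sketch. This is a toy example added for illustration, not something from the talk; the function names `train` and `predict`, the data, and the choice of a one-dimensional linear model are all assumptions.

```python
# Toy illustration only: a 1-D linear model fit by gradient descent.
# The names train and predict are hypothetical, not from the talk.

def train(examples, steps=2000, lr=0.05):
    """Learning phase: adjust parameters to fit (x, y) pairs."""
    w, b = 0.0, 0.0
    n = len(examples)
    for _ in range(steps):
        # Gradients of mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in examples) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in examples) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return (w, b)  # returned as an immutable tuple: learning is finished


def predict(params, x):
    """Application phase: reads the parameters and never changes them."""
    w, b = params
    return w * x + b


params = train([(0, 1), (1, 3), (2, 5), (3, 7)])  # data follow y = 2x + 1
answer = predict(params, 10)  # competence is exercised with learning switched off
```

In this sketch, calling `predict` any number of times leaves `params` untouched, which is the sense in which learning and competence are separable components.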
And, with that, I would like to open for questions, discussion, comment. I would like to have people come away with some shared sense of what the questions and comments are. Some common knowledge of thinking in this community in the context of thinking about questions this way.
Question: Is your model compatible with end-to-end reinforcement learning?
Eric: To say a little bit more. By the way, I've been working on a collection of documents for the last two years. It's now very large, and it will be an FHI technical report soon. It's 30,000 words structured to be very skim-able. Top-down, hierarchical, declarative sentences expanding into longer ones, expanding into summaries, expanding into fine-grained topical discussion. So you can sort of look at the top level and say, hopefully, "Yes, yes, yes, yes, yes. What about this?" And not have to read anything like 30,000 words. So, what I would say is that reinforcement learning is a technique for AI system development. You have a reinforcement learning system. Through a reinforcement learning process, which is a way of manipulating the learning of behaviors, it produces systems that are shaped by that mechanism. So it's a development mechanism for producing systems that provide some service. Now, if you turned reinforcement learning loose in the world, open-ended, with read-write access to the internet, as a money-maximizer, and did not have checks in place against that, there are some nasty scenarios. So basically it's a development technique, but it could also be turned loose to produce some real problems. "Creative systems trying to manipulate the world in bad ways" scenarios are another sector of reinforcement learning. So not a problem per se, but one can create problems using that technique.
Question: What does asymptotic improvement of AI services mean?
Eric: I think I'm abusing the term asymptotic. What I mean is increasing scope and increasing level of capability in any particular task to some arbitrary limit. Comprehensive is sort of like saying infinite, but moving toward comprehensive and superintelligent level services. What it's intended to say is, ongoing process going that direction. If someone has a better word than asymptotic to describe that I'd be very happy.
Question: Can the tech giants like Facebook and Google be trusted to get alignment right?
Eric: Google more than Facebook. We have that differential. I think that questions of alignment look different here. I think more in terms of questions of application. What are the people who wield AI capabilities trying to accomplish? So there's a picture which, just background to the framing of that question, and a lot of these questions I think I'll be stepping back and asking about framing. As you might think from the title of the talk. So picture a rising set of AI capabilities: image recognition, language understanding, planning, tactical management in battle, strategic planning for patterns of action in the world to accomplish some goals in the world. Rising levels of capability in those tasks. Those capabilities could be exploited by human decision makers or could, in principle, be exploited by a very high-level AI system. I think we should be focusing more, not exclusively, but more on human decision makers using those capabilities than on high-level AI systems. In part because human decision makers, I think, are going to have broad strategic understanding more rapidly. They'll know how to get away with things without falling afoul of what nobody had seen before, which is intelligence agencies watching and seeing what you're doing. It's very hard for a reinforcement learner to learn that kind of thing.
So I tend to worry about not the organizations making aligned AI so much as whether the organizations themselves are aligned with general goals.
Question: Could you describe the path to superintelligent services with current technology, using more concrete examples?
Eric: Well, we have a lot of piecemeal examples of superintelligence. AlphaZero is superintelligent in the narrow domain of Go. There are systems that outperform human beings in playing these very different kinds of games, like Atari games. Face recognition. Speech recognition recently surpassed human ability to map from human speech to transcribed words. Just more and more areas piecemeal. A key area that I find impressive and important is the design of neural networks at the core of modern deep learning systems. The design of, and learning to use appropriately, hyperparameters. So, as of a couple of years ago, if you wanted a new neural network, a convolutional network for vision, or some recurrent network (though recently they're going for convolutional networks for language understanding and translation), that was a hand-crafted process. You had human judgment and people were building these networks. A couple of years ago people started (this is not AI in general, but it's a chunk that a lot of attention went into) getting superhuman performance in neural network design by automated, AI-flavored methods like, for example, reinforcement learning systems. So developing reinforcement learning systems that learn to put together the building blocks to make a network that outperforms human designers in that process. So we now have AI systems that are designing a core part of AI systems at a superhuman level. And this is not revolutionizing the world, but that threshold has been crossed in that area.
And, similarly, automation of another labor-intensive task that I was told very recently by a senior person at DeepMind would require human judgment. And my response was, "Do you take AI seriously or not?" And, out of DeepMind itself, there was then a paper that showed how to outperform human beings in hyperparameter selection. So those are a few examples. And the way one gets to an accelerating path is to have more and more, faster and faster implementation of human insights into AI architectures, training methods, and so on. Less and less human labor required. Higher and higher level human insights being turned into application throughout the existing pool of resources. And, eventually, fewer and fewer human insights being necessary.
Question: So what are the consequences of this reframing of superintelligence for technical AI safety research?
Eric: Well, re-contexting. If in fact one can have superintelligent systems that are not inherently dangerous, then one can ask how one can leverage high-level AI. So a lot of the classic scenarios of misaligned powerful AI involve AI systems that are taking actions that are blatantly undesirable. And, as Shane Legg said when I was presenting this at DeepMind last Fall, "There's an assumption that we have superintelligence without common sense." And that's a little strange. So Stuart Russell has pointed out that machines can learn not only from experience, but from reading. And, one can add, watching video and interacting with people and through questions and answers in parallel over the internet. And we see in AI that a major class of systems is predictive models. Given some input you predict what the next thing will be. In this case, given a description of a situation or an action, you try to predict what people will think of it. Is it something that they care about or not? And, if they do care about it, is there widespread consensus that that would be a bad result? Widespread consensus that it would be a good result? Or strongly mixed opinion?
Note that this is a predictive model trained on many examples; it's not an agent. That is an oracle that, in principle, could operate with reasoning behind the prediction. That could in principle operate at a superintelligent level, and would have common sense about what people care about. Now think about having AI systems that you intend to be aligned with human concerns, where this oracle is available to a system that's planning action. It can say, "Well, if such and such happened, what would people think of it?" And you'd have a very high-quality response. That's a resource that I think one should take account of in technical AI safety. We're very unlikely to get high-level AI without having this kind of resource. People are very interested in predicting human desires and concerns if only because they want to sell you products or brainwash you in politics or something. And that's the same underlying AI technology base. So I would expect that we will have predictive models of human concerns. That's an example of a resource that would reframe some important aspects of technical AI safety.
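Note: the arrangement described above, where a planning system consults a predictive model of human concerns, can be sketched roughly as follows. This is an illustrative toy added for clarity, not a design from the talk. The names `Reaction`, `predict_reaction`, and `acceptable_options` are hypothetical, the lookup table merely stands in for a trained model, and treating unknown descriptions as "mixed opinion" is an arbitrary assumption.

```python
from enum import Enum


class Reaction(Enum):
    APPROVE = "widespread consensus that the result is good"
    OBJECT = "widespread consensus that the result is bad"
    MIXED = "strongly mixed opinion"
    INDIFFERENT = "not something people care about"


# Hypothetical stand-in for a learned predictive model. A real one would be
# trained on many examples; this table exists only to make the sketch run.
_TOY_PREDICTIONS = {
    "reroute delivery trucks around the school": Reaction.APPROVE,
    "drain the town reservoir to cool servers": Reaction.OBJECT,
    "repaint the warehouse": Reaction.INDIFFERENT,
}


def predict_reaction(description: str) -> Reaction:
    """Oracle: maps a description to a predicted reaction. It takes no actions."""
    # Assumption: anything the toy table does not cover is reported as MIXED.
    return _TOY_PREDICTIONS.get(description, Reaction.MIXED)


def acceptable_options(candidates):
    """Planner-side filter: drop plans the oracle predicts people would object to."""
    return [c for c in candidates if predict_reaction(c) is not Reaction.OBJECT]


options = acceptable_options(list(_TOY_PREDICTIONS))
```

The point of the sketch is the division of labor: `predict_reaction` only answers questions, while whatever system does the planning decides how to use those answers.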
Question: So, making AI services more general and powerful involves giving them higher-level goals. At what point of complexity and generality do these services then become agents?
Eric: Well, many services are agent-services. A chronic question that arises, people will be at FHI or DeepMind and someone will say, "Well, what is an agent anyway?" And everybody will say, "Well, there is no sharp definition. But over here we're talking about agents and over here we're clearly not talking about agents." So I would be inclined to say that if a system is best thought of as directed toward goals and it's doing some kind of planning and interacting with the world I'm inclined to call it an agent. And, by that definition, there are many, many services we want, starting with autonomous vehicles, autonomous cars and such, that are agents. They have to make decisions and plan. So there's a spectrum from there up to higher and higher level abilities to do means-ends analysis and planning and to implement actions. So let's imagine that your goal is to have a system that is useful in military action and you would like to have the ability to execute tactics with AI speed and flexibility and intelligence, and have strategic plans for using those tactics that are superintelligent level.
Well, those are all services. They're doing something in bounded time with bounded resources. And I would argue that that set of systems would include many systems that we would call agents, but they would be pursuing bounded tasks with bounded goals. But the higher levels of planning would naturally be structured as systems that would give options to the top level decision makers. These decision makers would not want to give up their power, they don't want a system guessing what they want. At a strategic level they have a chance to select, since strategy unfolds relatively slowly. So there would be opportunities to say, "Well, don't guess, but here's the trade-off I'm willing to make between having this kind of impact on opposition forces with this kind of lethality to civilians and this kind of impact on international opinion. I would like options that show me different trade-offs. All very high quality but within that trade-off space." And here I'm deliberately choosing an example which is about AI resources being used for projecting power in the world. I think that's a challenging case, so it's a good place to go.
I'd like to say just a little bit about the opposite end, briefly. Superintelligent non-agents. Here's what I think is a good paradigmatic example of superintelligence and non-agency. Right now we have systems that do natural language translation. You put in sentences or, if you had a somewhat smarter system that dealt with more context, books, and out comes text in a different language. Well, I would like to have systems that know a lot to do that. You do better translations if you understand more about history, chemistry if it's a chemistry book, human motivations. Just, you'd like to have a system that knows everything about the world and everything about human beings to give better quality translations. But what is the system? Well, it's a product of R&D and it is a mathematical function of type character string to character string. You put in a character string, things happen, and out comes a translation. You do this again and again and again. Is that an agent? I think not. Is it operating at a superintelligent level with general knowledge of the world? Yes. So I think that one's conceptual model of what high-level AI is about should have room in it for that system and for many systems that are analogous.
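Note: the description of a translator as "a mathematical function of type character string to character string" can be made concrete with a small sketch. This is a toy added for illustration only. The `Translator` alias, the `translate` function, and the two-word lexicon are hypothetical; a real system would embody vastly more knowledge, but the interface would have the same shape.

```python
from typing import Callable

# Hypothetical type alias: a translator is just a mapping from text to text.
Translator = Callable[[str], str]

# Stand-in for whatever a trained system has learned. It is fixed at call time.
_TOY_LEXICON = {"hello": "bonjour", "world": "monde"}


def translate(text: str) -> str:
    """Pure function: same input, same output, no state kept between calls."""
    words = text.lower().split()
    # Words missing from the toy lexicon are passed through unchanged.
    return " ".join(_TOY_LEXICON.get(word, word) for word in words)


translator: Translator = translate
first = translator("Hello world")
second = translator("Hello world")  # calling again changes nothing
```

However much knowledge sits behind `translate`, each call takes a string and returns a string, with no goals pursued across calls.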
Question: Would a system service that combines general learning with universal competence not be more useful or competitive than a system that displays either alone? So does this not suggest that agents might be more useful?
Eric: Well, as I said, agents are great. The question is what kind and for what scope. So, as I was saying, distinguishing between general learning and universal competence is an important distinction. I think it is very plausible that we will have general learning algorithms. And general learning algorithms may be algorithms that are very good at selecting algorithms that are good at selecting algorithms for learning a particular task and inventing new algorithms. Now, given an algorithm for learning, there's a question of what you're training it to do. What information? What competencies are being developed? And I think that the concept of a system being trained on and learning about everything in the world with some objective function, I don't think that's a coherent idea. Let's say you have a reinforcement learner. You're reinforcing the system to do what? Here's the world and it's supposed to be getting competence in organic chemistry and ancient Greek and, I don't know, control of the motion of tennis-playing robots and on and on and on and on. What's the reward function, and why do we think of that as one task?
I don't think we think of it as one task. I think we think of it as a bunch of tasks which we can construe as services. Including the service of interacting with you, learning what you want, nuances. What you are assumed to want, what you're assumed not to want as a person. More about your life and experience. And very good at interpreting your gestures. And it can go out in the world and, subject to constraints of law and consulting an oracle on what other people are likely to object to, implement plans that serve your purposes. And if the actions are important and have a lot of impact, within the law presumably, what you want is for that system to give you options before the system goes out and takes action. And some of those actions would involve what are clearly agents. So that's the picture I would like to paint that I think reframes the context of that question.
Question: So on that is it fair to say that the value-alignment problem still exists within your framework? Since, in order to train a model to build an agent that is aligned with our values, we must still specify our values.
Eric: Well, what do you mean by "train an agent to be aligned with our values"? See, the classic picture says you have "The AI" and "The AI" gets to decide what the future of the universe looks like and it had better understand what we want or would want or should want or something like that. And then we're off into deep philosophy. And my card says philosophy on it, so I guess I'm officially a philosopher or something according to Oxford. I was a little surprised. "It says philosophy on it. Cool!" I do what I think of as philosophy. So, in a services model, the question would instead be, "What do you want to do?" Give me some task that is completed in bounded time with bounded resources and we could consider how to avoid making plans that stupidly cause damage that I don't want. Plans that, by default, automatically do what I could be assumed to want. And that pursue goals in some creative way that is bounded, in the sense that it's not about reshaping the world; other forces would presumably try to stop you. And I'm not quite sure what value alignment means in that context. I think it's something much more narrow and particular.
By the way, if you think of an AI system that takes over the world, keep in mind that a sub-task of that, part of that task, is to overthrow the government of China. And, presumably, to succeed the first time because otherwise they're going to come after you if you made a credible attempt. And that's in the presence of unknown surveillance capabilities and unknown AI that China has. So you have a system and it might formulate plans to try to take over the world, well, I think an intelligent system wouldn't recommend that because it's a bad idea. Very risky. Very unlikely to succeed. Not an objective that an intelligent system would suggest or attempt to pursue. So you're in a very small part of a scenario space where that attempt is made by a high-level AI system. And it's a very small part of scenario space because it's an even smaller part of scenario space where there is substantial success. I think it's worth thinking about this. I think it's worth worrying about it. But it's not the dominant concern. It's a concern in a framework where I think we're facing an explosive growth of capabilities that can amplify many different purposes, including the purposes of bad actors. And we're seeing that already and that's what scares me.
Question: So I guess, in that vein, could the superintelligent services be used to take over the world by a state actor? Just the services?
Eric: Well, you know, services include tactical execution of plans and strategic planning. So could there be a way for a state actor to do that using AI systems in the context of other actors with, presumably, a comparable level of technology? Maybe so. It's obviously a very risky thing to do. One aspect of powerful AI is an enormous expansion of productive capacity. Partly through, for example, high-level, high quality automation. More realistically, physics-limited production technology, which is outside today's sphere of discourse or Overton window.
Security systems, I will assert, could someday be both benign and effective, and therefore stabilizing. So the argument is that, eventually it will be visibly the case that we'll have superintelligent level, very broad AI, enormous productive capacity, and the ability to have strategic stability, if we take the right measures beforehand to develop appropriate systems, or to be prepared to do that, and to have aligned goals among many actors. So if we distribute the much higher productive capacity well, we can have an approximately strongly Pareto-preferred world, a world that looks pretty damn good to pretty much everyone.
Note: for a more thorough presentation on this topic, see Eric Drexler's other talk from this same conference.
Question: What do you think the greatest AI threat to society in the next 10, 20 years would be?
Eric: I think the greatest threat is instability. Sort of either organic instability from AI technologies being diffused and having more and more of the economic relationships and other information-flow relationships among people be transformed in directions that increase entropy, generate conflict, destabilize political institutions. Who knows? If you had the internet and people were putting out propaganda that was AI-enabled, it's conceivable that you could move elections in crazy directions in the interest of either good actors or bad actors. Well, which will that be? I think we will see efforts made to do that. What kinds of counter-pressures could be applied to bad actors using linguistically and politically competent AI systems to do messaging? And, of course, there's the perennial problem of states engaging in an arms race which could tip into some unstable situation and lead to a war. Including the long-postponed nuclear war that people are waiting for and might, in fact, turn up some day. And so I primarily worry about instability. Some of the modes of instability are because some actor decides to do something like turn loose a competent hacking, reinforcement-learning system that goes out there and does horrible things to global computational infrastructure that either do or don't serve the intentions of the parties that released it. But take a world that's increasingly dependent on computational infrastructure and just slice through that, in some horribly destabilizing way. So those are some of the scenarios I worry about most.
Question: And then maybe longer term than 10, 20 years? If the world isn't over by then?
Eric: Well, I think all of our thinking should be conditioned on that. If one is thinking about the longer term, one should assume that we are going to have superintelligent-level general AI capabilities. Let's define that as the longer term in this context. And, if we're concerned with what to do with them, that means that we've gotten through the process to there then. So there's two questions. One is, "What do we need to do to survive or have an outcome that's a workable context for solving more problems?" And the other one is what to do. So, if we're concerned with what to do, we need to assume solutions to the preceding problems. And that means high-level superintelligent services. That probably means mechanisms for stabilizing competition. There's a domain there that involves turning surveillance into something that's actually attractive and benign. And the problems downstream, therefore, one hopes to have largely solved. At least the classic large problems and now problems that arise are problems of, "What is the world about anyway?" We're human beings in a world of superintelligent systems. Is trans-humanism in this direction? Uploading in this direction? Developing moral patients, superintelligent-level entities that really aren't just services, and are instead the moral equivalent of people? What do you do with the cosmos? It's an enormously complex problem. And, from the point of view of having good outcomes, what can I say? There are problems.
Question: So what can we do to improve diversity in the AI sector? And what are the likely risks of not doing so?
Eric: Well, I don't know. My sense is that what is most important is having the interests of a wide range of groups be well represented. To some extent, obviously, that's helped if you have in the development process, in the corporations, people who have these diverse concerns. To some extent it's a matter of politics, regulation, cultural norms, and so on. I think that's a direction we need to push in. To put this in the Paretotopian framework, your aim is to have objectives, goals that really are aligned, so, possible futures that are strongly goal-aligning for many different groups. For many of those groups, we won't fully understand them from a distance. So we need to have some joint process that produces an integrated, adjusted picture of, for example, how do we have EAs be happy and have billionaires maintain their relative position? Because if you don't do that they're going to maybe oppose what you're doing, and the point is to avoid serious opposition. And also have the government of China be happy. And I would like to see the poor in rural Africa be much better off, too. Billionaires might be way up here, competing not to build orbital vehicles but instead starships. And the poor in rural Africa of today merely have orbital space capabilities convenient for families, because they're poor. Nearly everyone much, much better off.