Some Background on Open Philanthropy's Views Regarding Advanced Artificial Intelligence

Holden Karnofsky

This is a linkpost for https://www.openphilanthropy.org/blog/some-background-our-views-regarding-advanced-artificial-intelligence

We’re planning to make potential risks from advanced artificial intelligence a major priority in 2016. A future post will discuss why; this post gives some background.

Summary:

I first give our definition of “transformative artificial intelligence,” our term for a type of potential advanced artificial intelligence we find particularly relevant for our purposes. Roughly and conceptually, transformative AI refers to potential future AI that precipitates a transition comparable to (or more significant than) the agricultural or industrial revolution. I also provide (below) a more detailed definition. The concept of “transformative AI” has some overlap with concepts put forth by others, such as “superintelligence” and “artificial general intelligence.” However, “transformative AI” is intended to be a more inclusive term, leaving open the possibility of AI systems that count as “transformative” despite lacking many abilities humans have.
I then discuss the question of whether, and when, we might expect transformative AI to be developed. This question has many properties (long timelines, relatively vague concepts, lack of detailed public analysis) I associate with developments that are nearly impossible to forecast, and I don’t think it is possible to make high-certainty forecasts on the matter. With that said, I am comfortable saying that I think there is a nontrivial likelihood (at least 10% with moderate robustness, and at least 1% with high robustness) of transformative AI within the next 20 years. I can’t feasibly share all of the information that goes into this view, but I try to outline the general process I have followed to reach it.
Finally, I briefly discuss whether there are other potential future developments that seem to have similar potential for impact on similar timescales to transformative AI, in order to put our interest in AI in context.

The ideas in this post overlap with some arguments made by others, but I think it is important to lay out the specific views on these issues that I endorse. Note that this post is confined in scope to the above topics; it does not, for example, discuss potential risks associated with AI or potential measures for reducing them. I will discuss the latter topics more in the future.

Defining “transformative artificial intelligence” (transformative AI)

There are many ways to classify potential advanced AI systems. For our purposes, we prefer to focus on the particular classifications that are most relevant to AI’s potential impact on the world, while putting aside many debates that don’t relate to this (for example, whether and when an AI system might have human-like consciousness and emotions). “Transformative AI” is our term for a particular classification we find important. In this section, after some basic background, I will give two definitions of the term as we’re using it: one is a relatively simple, rough sketch of the concept we’re trying to capture, and the other is a more specific (though still far from precise) definition meant to help give a more detailed picture of what I would and wouldn’t include in this classification.

One of the main things we seek to assess about any given cause is its importance: how many people are affected, and how deeply? All else equal, we’re more interested in artificial intelligence developments that would affect more people and more deeply. And consistent with our philosophy of hits-based giving, we think it’s productive for any given cause to ask: “What’s the highest imaginable impact here? What are the most extreme scenarios, importance-wise, even if they’re unlikely?”

When asking these sorts of questions for US policy, we’ve discussed potential policy changes whose impact could be equivalent to hundreds of billions of dollars per year in economic value. These are high-impact, low-probability scenarios. In some contexts, however, I think it is appropriate to think about changes on an even larger scale. When asking just how significant a development could be, I think it’s worth starting with the question: “What were the most significant developments in history, and could a development in cause X compare?” I think the answer will usually be “no” (I feel this way about most issues we work on), but when it is “yes,” it would be a mistake not to consider such a scenario.

When thinking of the most significant developments in history, my thinking jumps to the agricultural (neolithic) revolution and the industrial revolution, both of which I believe brought about fundamental, permanent changes in the nature of civilization.^[1] I believe that there is a serious possibility that progress in artificial intelligence could precipitate a transition comparable to (or more significant than) these two developments. One way in which this could happen would be if future AI systems became very broad in their abilities, to the point of being able to outperform humans in a wide array of jobs and thus fundamentally (and possibly quickly) transforming the economy. Another way would be if future AI systems proved capable of making major contributions to science and/or engineering, in one field or many, and thus caused a dramatic, possibly unexpected acceleration in the development of some transformative technology. (A later section of this post lists some potential transformative technologies that I believe are possible in principle, though this is based only on my own impressions; I expect others to have different lists of the most potentially transformative technologies, but I expect fairly wide agreement on the basic point that some advance(s) in science and/or engineering could be transformative.)

More broadly, I note that the agricultural and industrial revolutions both seem to have been driven largely by the discovery and proliferation of new technologies and ideas, developed through applied human intelligence. And with two such epochal events in the last ~10,000 years (one within the past ~300), I think it would be mistaken to dismiss such a dramatic transition as unprecedented or impossible.

With this in mind, we define transformative AI as follows:

Definition #1: Roughly and conceptually, transformative AI is AI that precipitates a transition comparable to (or more significant than) the agricultural or industrial revolution.

Definition #2: Since definition #1 leaves a great deal of room for judgment, I provide a more detailed definition that I feel would likely (though not certainly) satisfy the first. Under this more detailed definition, transformative AI is anything that fits one or more of the following descriptions:

AI systems capable of fulfilling all the necessary functions of human scientists, unaided by humans, in developing another technology (or set of technologies) that ultimately becomes widely credited with being the most significant driver of a transition comparable to (or more significant than) the agricultural or industrial revolution. Note that just because AI systems could accomplish such a thing unaided by humans doesn’t mean they would; it’s possible that human scientists would provide an important complement to such systems, and could make even faster progress working in tandem than such systems could achieve unaided. I emphasize the hypothetical possibility of AI systems conducting substantial unaided research to draw a clear distinction from the types of AI systems that exist today. I believe that AI systems capable of such broad contributions to the relevant research would likely dramatically accelerate it.
AI systems capable of performing tasks that currently (in 2016) account for the majority of full-time jobs worldwide, and/or over 50% of total world wages, unaided and for costs in the same range as what it would cost to employ humans. Aside from the fact that this would likely be sufficient for a major economic transformation relative to today, I also think that an AI with such broad abilities would likely be able to far surpass human abilities in a subset of domains, making it likely to meet one or more of the other criteria laid out here.
Surveillance, autonomous weapons, or other AI-centric technology that becomes sufficiently advanced to be the most significant driver of a transition comparable to (or more significant than) the agricultural or industrial revolution. (This contrasts with the first point because it refers to transformative technology that is itself AI-centric, whereas the first point refers to AI used to speed research on some other transformative technology.)

Definition #2 is far from precise, and still leaves plenty of room for individual judgment. And neither of the two definitions strictly implies the other. That said, my view is that anything meeting definition #2 would quite likely meet definition #1, and definition #2 provides more clarity regarding what sorts of developments would and would not seem sufficiently different from today’s technology to qualify as transformative AI from our perspective. Definition #2 is also intended to help a person imagine how they might judge for themselves whether transformative AI as I envision it has been developed at a given point in the future. The remainder of this post proceeds from definition #2, since it discusses comparisons and predictions where more detail is helpful.

Note that these definitions of transformative AI are agnostic to many possible comparisons between AI systems and human minds. For example, they leave open the possibility of AI systems that count as “transformative” despite not having human-like consciousness or emotions. They also leave open the possibility of AI systems that count as “transformative” despite lacking many abilities humans have - it is necessary only that such systems have sufficient ability to bring about major changes in the world.

Relationship to some other AI-related terms

It’s worth addressing the relationship of the “transformative AI” concept to some other terms for potential high-impact advanced AI:

In Superintelligence, Nick Bostrom defines a superintelligence as “any intellect that greatly exceeds the cognitive performance of humans in virtually all domains of interest.”^[2]
Artificial general intelligence (AGI) is currently defined, according to Wikipedia, as “the intelligence of a (hypothetical) machine that could successfully perform any intellectual task that a human being can.”
High-level machine intelligence refers to an AI system “that can carry out most human professions at least as well as a typical human.”

We intend “transformative AI” to be, for the most part, a less restrictive term than any of these. Anything fitting one of the above three descriptions would likely meet at least the second condition of the detailed definition of transformative AI, and it would therefore likely (though not definitely) meet the broader, more conceptual definition we gave.^[3] But it is possible to imagine transformative AI that does not qualify as superintelligence, AGI or high-level machine intelligence, because its impact comes from a relatively small number of domains in which AI systems achieve better-than-human performance. To give a few illustrative examples:

Future AI systems might prove capable of analyzing scientific literature, generating new hypotheses, and designing experiments to test these hypotheses, resulting in a speedup of scientific progress comparable to what happened in the Scientific Revolution and/or a speedup of technological progress comparable to what happened in the Industrial Revolution.
Future AI systems might prove capable of accelerating progress on particular, highly important areas of science, even if they are limited in other areas.
Future AI systems might bring about a dramatic leap in surveillance capabilities, e.g. by reducing the labor necessary to interpret large amounts of data. (Whether this would count as “transformative” in our sense would depend on the details of how it played out.)

It’s important to us to include these sorts of possibilities in “transformative AI,” for two reasons. First, the potential benefits and risks could be different from those posed by e.g. superintelligence, while still being highly worthy of our consideration. Second, below I discuss my views on when we might expect transformative AI to be developed, and it’s important to my views that there are a large number of possible paths to transformative AI - not all of which require replicating all (or even most) of the functions of human brains.

When should we expect transformative artificial intelligence?

So far, I have discussed the in-principle possibility of an extremely powerful technology that could bring about extremely important changes. But I haven’t addressed the thornier, and very important, question of whether there is any way to anticipate how soon we might expect such a development.

I think the only defensible position on this question is one of very high uncertainty. I’ve seen no signs of data or arguments that should give us confidence about timelines to transformative AI.

However, having thought hard about this question and put a fair amount of time into investigating it over the last year, one claim I am comfortable making is that I think there is a nontrivial likelihood of transformative AI within the next 20 years. Specifically, when I say “nontrivial likelihood,” I mean:

I believe the probability to be at least 10%, and consider this view to be moderately robust and stable. I’m fairly (not highly) confident that a maximally thorough investigation would result in an estimate of at least 10%.
I am highly confident that a maximally thorough investigation would put the probability at 1% or higher.

This view is important to my stance on the importance of potential risks from advanced artificial intelligence. If I did not hold it, this cause would probably still be a focus area of the Open Philanthropy Project, but holding this view is important to prioritizing the cause as highly as we’re planning to.

I recognize that it is an extremely difficult claim to evaluate. And my current view is based on a large number of undocumented conversations, such that I don’t think it is realistic to aim for being highly convincing on this point in this post. However, I will attempt to lay out the general process I’ve followed to reach my current views.

I also note that we are doing work on several fronts to further refine our thinking about likely timelines. These include working toward a broader discussion of relevant technical issues (discussed below) and continuing an ongoing survey of the literature on forecasting, particularly the work of Philip Tetlock (whom we have funded), as well as seeking to understand the performance of past long-term predictions about technology.

Expert surveys and trend extrapolations

When asking how likely transformative AI is to be developed in the next 20 years, I think a natural first approach is to ask: (a) what do relevant experts believe about the likelihood, and (b) can we learn anything from extrapolation of relevant trends?

We’ve done our best to examine the available information about these questions. Findings are summarized by Luke Muehlhauser here, and I discuss the takeaways below. Unfortunately, I believe that these lines of inquiry give relatively little to go on, and they do not represent the only (or even primary) inputs into my thinking. With that said, the information we do have along these lines reinforces the view that there is a nontrivial likelihood of transformative AI within the next 20 years.

I’ve chosen to present this information first because I think many readers would instinctively expect it to be the most useful information, even though - with matters as they stand - I have ended up putting much more weight on the arguments presented in later sections.

Expert surveys: Some prominent AI researchers have made public statements seemingly implying that certain kinds of AI are a long way away. For instance, in the context of discussing potential risks, Andrew Ng has said, “I don’t work on preventing AI from turning evil for the same reason that I don’t work on combating overpopulation on the planet Mars… Maybe hundreds of years from now, maybe thousands of years from now—I don’t know—maybe there will be some AI that turn evil, but that’s just so far away that I don’t know how to productively work on that.” As mentioned elsewhere in this post, I think it’s plausible that he is right to think that certain forms of advanced AI are hundreds or thousands of years away. However, I haven’t been able to identify systematic arguments for this view, and (as discussed below) I believe there are many researchers with relevant expertise who disagree. When assessing expert opinion, I am inclined to attempt to rely on surveys rather than on a small number of brief public statements.

Most attempts to survey relevant experts have had major methodological issues. For reasons laid out in Luke’s writeup, I believe the most useful available survey is the “TOP100” survey from Müller and Bostrom 2014, which asked a number of researchers for the year by which they estimated a 10%, 50% and 90% chance of high-level machine intelligence (HLMI, defined in the previous section). Taking either the mean or median of responses implies a 10% probability within 20 years of today (median 2024, mean 2034); also note that the survey (again based on the mean and median of responses) implies a 90% chance of high-level machine intelligence well within the next 200 years. And as discussed above, I feel that “high-level machine intelligence” is mostly a more restrictive concept than “transformative AI.”
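
As a rough illustration of the arithmetic in the paragraph above, the sketch below checks the two aggregate figures quoted there (median 2024, mean 2034) against a 20-year horizon. It assumes "today" means 2016, the year this post refers to elsewhere. The helper names (`within_horizon`, `aggregate`) and the list `toy_years` are hypothetical: `toy_years` is made-up input, not survey data, included only to show why a mean can sit well above a median when a few respondents give very late years.

```python
from statistics import mean, median

POST_YEAR = 2016      # assumption: "today" in the post means 2016
HORIZON_YEARS = 20    # the 20-year window discussed in the post

# Aggregate years quoted in the text for a 10% chance of HLMI.
reported_10pct_year = {"median": 2024, "mean": 2034}


def within_horizon(year, start=POST_YEAR, horizon=HORIZON_YEARS):
    """Return True if `year` is no more than `horizon` years after `start`."""
    return year - start <= horizon


def aggregate(years):
    """Return the median and mean of a list of per-respondent years."""
    return {"median": median(years), "mean": mean(years)}


for name, year in reported_10pct_year.items():
    print(f"{name}: {year} is {year - POST_YEAR} years out; "
          f"within horizon: {within_horizon(year)}")

# Made-up responses (NOT survey data): a single late answer raises the
# mean far more than it raises the median.
toy_years = [2020, 2022, 2025, 2030, 2100]
print(aggregate(toy_years))
```

Both quoted figures fall inside the window (8 and 18 years out, respectively), which is all the claim in the text requires. The toy input is only meant to convey why the mean and median are reported separately.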

However, I have major reservations about all the surveys that have been done, including the one just cited. Most importantly, I believe that the people surveyed are in many, if not all, cases giving essentially off-the-cuff responses, with little or no attempt to make detailed models of key factors or break the question into smaller pieces that can then be investigated. I think that these practices are generally accepted as important for difficult forecasting challenges,^[4] and my personal experience supports this; for example, I’ve found the practice of doing cost-effectiveness analysis to be important in raising crucial considerations for GiveWell’s top charities.

With this point noted, I have several further reservations about the forecasts made in surveys, some of which derive from this high-level point. Most echo Luke’s. A few seem particularly worth highlighting:

I am concerned that the people surveyed may be biased toward shorter timelines, because the fact that they engage in these questions may indicate that they’re unusually enthusiastic about the relevant technologies.
I fear that those surveyed are not accounting for growth in the relevant fields. The conversations I’ve had with machine learning researchers, as well as with Daniel Dewey, have led me to believe that - partly due to excitement over relatively recent results (discussed below) - there is a fairly rapid influx of researchers into AI- and machine-learning-related work. For example, attendance at the Conference on Neural Information Processing Systems (NIPS) appears to have been growing rapidly over the past decade. I note that multiple major, heavily funded AI labs have been started since 2010: DeepMind (acquired by Google for $500 million), Google Brain, Facebook AI Research (announced in 2013), Baidu Silicon Valley AI Lab, Vicarious (which has reportedly raised $72 million),^[5] and OpenAI (whose funders have committed $1 billion).^[6] One prominent AI researcher has stated, “Industry [has probably invested] more in the last 5 years than governments have invested since the beginning of the field [in the 1950s].” Someone giving an off-the-cuff projection of progress in AI research might be extrapolating from past progress, without accounting for the far greater interest and funding in the field now.
I think it’s quite possible that the path to transformative AI will involve many technical challenges, and that different challenges will be best addressed by different fields and different intellectual traditions. As long as our main source of information is off-the-cuff estimates rather than detailed discussions, I fear there could be distortions introduced by people’s intuitions about fields they are relatively unfamiliar with. These distortions could cause a bias toward shorter timelines, if people are over-optimistic about fields they aren’t familiar with or are extrapolating their own field’s progress to that of other fields that could prove both necessary and substantially slower than their own. There could also be a bias toward longer timelines, if people are overlooking the fact that many of the problems that look difficult to them could prove more tractable to other approaches. (In particular, if a small number of approaches look like they may be highly general and could prove sufficient to develop transformative AI, a survey average will miss this dynamic by counting estimates from people working on these approaches the same way it counts estimates from everyone else.)

Some of the above considerations would imply that surveys underestimate how far we are from developing transformative AI; some would imply that they overestimate it. I think the issues in both directions are significant and seriously undermine the idea of relying on this data.

Trend extrapolation: In general, I believe it is often useful to look for relevant trends in quantifiable data when making predictions.^[7] Unfortunately, I believe there is little to go on in this category. The most relevant-seeming trend-extrapolation-based work seems to be the various attempts to answer the question, “When will affordable [by various definitions] computers be capable of matching the processing power of a human brain?” Luke’s review of this work implies (in my view) that this capability may already have been reached, and in any case has a reasonable chance of being reached in the next 20 years, while noting a variety of reasons that it has very limited relevance for forecasting overall AI capabilities.

Another approach to forecasting transformative AI

This section will discuss a separate case I see for expecting a nontrivial probability of transformative AI in the next 20 years. This case is based on reasoning through - with a small set of technical advisors - the details of relatively recent progress in AI, and attempting to inventory the most crucial technical challenges that will need to be addressed in order to develop transformative AI.

The technical advisors I have spoken with the most on this topic are close friends I’ve met through GiveWell and effective altruism: Dario Amodei, Chris Olah and Jacob Steinhardt. They are all relatively junior (as opposed to late-career) researchers; they do not constitute a representative sample of researchers; there are therefore risks in leaning too heavily on their thinking. With that said, talking to them has brought the advantage of being able to conduct - and listen in on - a large number of very detailed discussions, and I consider all three to be clearly on the cutting edge of various aspects of AI and machine learning research.^[8] It’s possible that there are a large number of other researchers having similar discussions - even that similar discussions have informed the survey responses discussed above - but the only discussions along these lines I have access to are the ones these technical advisors have been having. I feel fortunate to have good enough relationships with relevant researchers to have access to these sorts of discussions, and as I’ve written previously, I don’t think it would be advisable to discard our observations simply because they are friends and/or not a representative set. As discussed below, we have made some attempts to supplement their thinking with outside perspectives, and hope to do more on this front. (We have also discussed our high-level conclusions with a significant number of AI and ML researchers. Conversations were in confidence and often time-constrained, but we saw few signs that our take is clearly unreasonable.)

The rest of this section will discuss:

The basic question of whether AI research is likely to proceed via a very large number of highly specialized insights, or whether there may turn out to be a few broadly applicable AI approaches that lead to rapid progress on an extremely wide variety of intellectual tasks. At this point, this seems to be very much an open question.
Recent progress in AI and ML (particularly deep learning), which has provided some suggestive evidence for the idea of broadly applicable breakthroughs.
Work that the technical advisors mentioned above have been doing to summarize “core open problems” in AI - types of intellectual reasoning that seem important, but that researchers have not yet had success in reproducing - and potential research paths that could imaginably lead to progress on these problems. Based on this work, it is easy to imagine (though far from certain) that headway on a relatively small number of core problems could prove broadly applicable, and could lead to AI systems equalling or surpassing human performance in a very large number of domains. Discussions of this possibility, and the subjective estimates of the technical advisors involved, are important to my view that there is a nontrivial probability of transformative AI in the next 20 years.

How diverse are necessary AI advances?

In what follows, I will distinguish between (a) “intellectual functions” - human intellectual activities as we would generally describe them, such as “studying physics” or “writing a memo”; and (b) “underlying algorithms” - the specific mechanistic manipulations that are performed on raw data^[9] to perform intellectual functions.^[10]

Many of the people I’ve spoken with seem to instinctively assume that AI research is likely to proceed via a very large number of highly specialized insights, and that there will be few or no breakthroughs that lead to rapid progress on many fronts at once. It seems to me that this view is usually correlated with (though not the same as) an intuition that the human brain has an extraordinarily complex and varied architecture, and that each of the many intellectual functions humans perform requires fundamentally different underlying algorithms. For example, in order to build a computer system that can conduct scientific research at the level of human experts, the algorithm and training procedure used for reading existing scientific literature might need to be fundamentally different from the algorithm and training procedure for identifying important scientific questions. Other relevant tasks like experimental design, manipulating objects, writing quality expositions, etc., might all require fundamentally different algorithmic approaches.

Others have a different intuition. There may turn out to be a few broadly applicable AI approaches that lead to rapid progress on an extremely wide variety of intellectual tasks. This intuition seems correlated with (though again, not the same as) an intuition that the human brain makes repeated use of a relatively small set of underlying algorithms, and that by applying these algorithms, with small modifications, in a variety of contexts, it generates a wide variety of different predictive models, which can end up looking like very different intellectual functions.

My impression is that the current state of both neuroscience^[11] and AI research is highly compatible with both possibilities (and a range of things in between). With respect to AI research, I believe that much historical progress has come from relatively specialized approaches and has thus looked more like what would be expected under the first hypothesis - but that recent progress has provided more evidence of broadly applicable breakthroughs. The next section discusses this recent progress.

Recent progress in AI and ML

Certain areas of AI and machine learning, particularly related to deep neural networks and other deep learning methods, have recently experienced rapid and impressive progress. In some cases this progress has been surprisingly fast relative to what practitioners 10 years ago likely would have expected, has been strongly applicable to real-world problems, or both.

Daniel Dewey provides the following overview list. These examples of progress were chosen based on our technical advisors’ impressions of how impressive, significant, and/or surprising they were to academic AI researchers. We excluded some examples that might seem significant to non-researchers, but that did not meet these criteria (for example, recent progress in self-driving cars and IBM Watson’s Jeopardy! win).

Computer vision has improved significantly over the last 5 years, and now matches or exceeds human performance in some tasks. For example, in the ImageNet Large Scale Visual Recognition Challenge’s image recognition task, the best team’s top-5 error^[12] dropped from 28.2% in 2010 (before the adoption of deep learning) to 3.6% in 2015, beating a trained human’s error of 5.1%.^[13] In 2012, a breakthrough in training deep convolutional networks achieved a 9.4-percentage-point improvement over the previous year (typical year-over-year improvements were closer to 3 percentage points).^[14]
Speech recognition has shown similar progress, again largely due to adoption of and advances in deep learning. Benchmarks are more varied in speech recognition, but a few illustrative cases are: the drop from ~24% transcription error for Gaussian mixture models in 2011 on the “Switchboard” data set to ~16% error for deep neural network models that same year;^[15] the decrease to ~8% error on the same task by 2015;^[16] and most recently the development of a distributed deep learning system, Deep Speech 2, that outperformed trained humans by 1.5%-3% on 3 out of 4 transcription tasks from the WSJ and Librispeech data sets in late 2015.^[17]
Go: In October 2015, the Go-playing system AlphaGo defeated a professional in 8 out of 10 games; in March 2016, an improved version of the system defeated top-tier professional Lee Sedol in 4 out of 5 games. It is debatable whether this performance should have been considered “surprising,” or a major leap relative to previous capabilities, when accounting for the high level of investment and hardware involved. However, it was another case in which deep learning seems to have made a major contribution to surpassing performance of top humans in a particular domain.
Expanding applicability of deep learning: more generally, there has been a proliferation of work applying existing deep learning methods to an expanding set of tasks. For example, new ideas in deep Q-learning and increased R&D and hardware investment resulted in 2015 in a deep-learning-based system that achieved human-like performance across many Atari games without specialized game-by-game tuning;^[18] sequence-to-sequence learning, encoder-decoder networks, attention models, and multimodal embeddings have enabled progress in using deep learning to perform tasks like sentence-level image description,^[19] phrase translation,^[20] and image generation;^[21] and neural Turing machines, memory networks, and other architectures augmenting deep neural networks with external memory have been proposed as ways of applying deep learning to question-answering^[22] and learning algorithms from examples.^[23] Unlike the previous three examples, this expansion of deep learning beyond traditional classification tasks has not often yielded human-comparable performance, but it does suggest that deep learning may be broadly applicable to many problems. Deep learning is a general approach to fitting predictive models to data that can lead to automated generation of extremely complex non-linear models. It seems to be, conceptually, a relatively simple and cross-domain approach to generating such models (though it requires complex computations and generates complex models, and hardware improvements of past decades have been a key factor in being able to employ it effectively). My impression is that the field is still very far away from exploring all the ways in which deep learning might be applied to challenges in AI. In light of the excitement over recent progress (and increased investment, as noted above), there will be increasing attempts to do such exploration.
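
For readers unfamiliar with the “top-5 error” figure quoted in the computer vision item above, here is a minimal sketch of how such a metric is commonly computed. It assumes the standard convention that a prediction counts as correct when the true label is among the model’s k highest-scoring classes. The function name `top_k_error` and the toy scores are hypothetical and are not drawn from ImageNet or from any system discussed in this post.

```python
def top_k_error(scores, labels, k=5):
    """Fraction of examples whose true label is NOT among the k
    highest-scoring classes.

    scores: list of per-example lists, one score per class
    labels: list of true class indices, one per example
    """
    misses = 0
    for row, label in zip(scores, labels):
        # Class indices ordered from highest score to lowest.
        ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)
        if label not in ranked[:k]:
            misses += 1
    return misses / len(labels)


# Toy example with 4 classes and k=2 (made-up numbers, for illustration only).
toy_scores = [
    [0.60, 0.30, 0.05, 0.05],  # true class 1 is ranked 2nd -> counted as correct
    [0.10, 0.20, 0.30, 0.40],  # true class 0 is ranked last -> counted as a miss
    [0.30, 0.20, 0.40, 0.10],  # true class 2 is ranked 1st -> counted as correct
    [0.70, 0.15, 0.10, 0.05],  # true class 3 is ranked last -> counted as a miss
]
toy_labels = [1, 0, 2, 3]

print(top_k_error(toy_scores, toy_labels, k=2))
```

Under this convention the metric is more forgiving than plain (top-1) error, so the two kinds of figure should not be compared directly.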

As an aside, deep learning — like most modern approaches to machine learning, which rely heavily on statistics and approximations — produces systems with strengths and weaknesses that don’t fit some common popular stereotypes of AI systems. They are often strong on activities commonly associated with “intuition” (playing Go, recognizing images),^[24] but my understanding is that symbolic and logical reasoning have proven difficult to deeply and satisfyingly integrate into such systems.

In my view, there is a live possibility that with further exploration of the implications and applications of deep learning - and perhaps a small number (1-3) of future breakthroughs comparable in scope and generality to deep learning - researchers will end up being able to achieve better-than-human performance in a large number of intellectual domains, sufficient to produce transformative AI. As stated above, I don’t believe existing knowledge about either AI or neuroscience can rule out this possibility; a key question is how plausible it looks following thorough discussion and reflection by people well-positioned to understand the strengths and weaknesses of deep learning and other established and emerging approaches to AI.

Core open problems

Over the past several months, the technical advisors mentioned above have been working on a document summarizing “core open problems” in AI - types of intellectual reasoning that seem important, and that humans seem to be able to do, but that researchers have not yet had success in reproducing. They have also been discussing potential research paths that could imaginably lead to progress on these problems.

In order to get wider input, we organized a meeting at our offices that included Dario, Jacob and Chris as well as three other early-career academic and industry researchers at leading institutions, and since then the six of them have been collaborating on refining the document summarizing core open problems.

We haven’t yet determined whether and when there will be public output from these discussions; it will ultimately be the choice of the people involved. My hope is that the researchers will make a document public and start a conversation in the wider community. For now, however, their views are the best available (to us) approximation of what kind of picture we might get from maximally informed people.

I don’t intend to go into the details of individuals’ views. But broadly speaking, based on these conversations, it seems to me that:

It is easy to imagine (though far from certain) that headway on a relatively small number of core problems could lead to AI systems equalling or surpassing human performance in a large number of domains.
The total number of core open problems is not clearly particularly large (though it is highly possible that there are many core problems that the participants simply haven’t thought of).
Many of the identified core open problems may turn out to have overlapping solutions. Many may turn out to be solved by continued extension and improvement of deep learning methods.
None clearly appears to require large numbers of major breakthroughs, large (decade-scale) amounts of trial and error, or further progress on directly studying the human brain. There are examples of outstanding technical problems, such as unsupervised learning, that could turn out to be very difficult, leading to a dramatic slowdown in progress in the near future, but it isn’t clear that we should confidently expect such a slowdown. I note that this situation is in contrast to many challenges in life sciences - such as producing meat from stem cells - where it seems that we can identify multiple challenging steps, some of which would involve clear significant lags due to time-consuming experiments and regulatory processes.
An aggregated picture of the subjective views of the people who have been working on the “core open problems” document would point to a 10% or greater probability of transformative AI in the next 20 years.

In discussing this work with technical advisors, I’ve sometimes informally shared my own intuitions about which sorts of intellectual functions seem most challenging, mysterious, or impressive, and therefore likely hard to replicate. I’ve generally found - unsurprisingly - that my thoughts (and common assumptions I hear from others) on these matters are far behind those of the technical advisors. For example, creativity in problem solving (e.g. generating novel ideas) may seem mysterious and very “human” at first blush, but my understanding is that when an AI system has a strong model of what problem it is trying to solve and how to evaluate potential solutions, coming up with ideas that can be called “creative” does not remain a major challenge. (As an example, commentators have remarked on the creativity of AlphaGo’s play.) I say all of this because I believe a common reaction to speculation about AI is to point to particular human modes of thought that seem hard to replicate, and I believe it’s worth noting that I think AI researchers carry out unusually sophisticated versions of this exercise.

Ideally, we would continue investigation on this topic by involving leading researchers from a diverse set of AI- and machine-learning-related fields, facilitating truly extensive discussion, and perhaps eventually using a Delphi method (or similar approach) to arrive at forecasts. But getting such a group to participate in such a time-consuming process might not be feasible at all, and might take years if it were. (We have also discussed our high-level conclusions with a significant number of AI and ML researchers. Conversations were in confidence and often time-constrained, but we saw few signs that our take is clearly unreasonable.)

At the moment, I feel we have gotten as far as we will for some time, in terms of assembling people who can combine (a) very strong knowledge of the cutting edge of AI and machine learning research; (b) a willingness to engage in the process - time-consuming, intellectually demanding, and highly speculative such that it is unlikely to lead directly to career advancement - of laying out and analyzing core open problems and potential research paths; and (c) the ability to make probability estimates that are informed both by this analysis and by a general familiarity with probability-based forecasts.

Some notes on past “false alarms” and the burden of argumentation

When discussing the topics in this post, I’ve sometimes encountered the claim that we should heavily discount the analysis of today’s researchers, in light of the history of “false alarms” in the past - cases where researchers made overconfident, overly aggressive forecasts about how AI would develop. Luke Muehlhauser has looked into the history of past AI forecasts, and written up takeaways at some length here.

Based on this work, I think there has indeed been at least one past period during which researchers overestimated how quickly AI would improve, and I think there’s a substantial chance that we would have bought into the over-aggressive forecasts at that time. However, I see little evidence of similar dynamics after that period. In fact, since the 1970s, it appears that researchers have been fairly circumspect in their forecasts. (And the forecasts prior to the mid-1970s may well have been rational, if ultimately inaccurate, forecasts.) Overall, I see good reason to expect researchers to be overenthusiastic about their field, and thus to discount their claims to some degree, but I don’t think that the history of past forecasts gives us much additional reason, and I certainly don’t think there have been enough “false alarms” to provide strong evidence against my view about timelines to transformative AI.

Another argument against my view would be to claim that it should face a very high burden of argumentation, since transformative AI would be such an extreme development. I think this is true to some degree; however,

Developments of this magnitude are not unprecedented. The events that I’ve used as reference points for transformative AI - the agricultural (neolithic) revolution and the industrial revolution - both occurred within the past 10,000 years, and the industrial revolution occurred within the past 300. As each increased the pace of innovation and growth, each arguably raised the background likelihood of further, comparable transitions.
As noted above, expert surveys seem to imply a 90%+ chance of transformative AI well within the next 200 years, and this seems consistent with other conversations I’ve had: I’ve encountered very few people who seem to think that transformative AI is highly unlikely within the next few centuries.

With all of the above points in mind, I do think one should discount the views of researchers to some degree, and I do think one should consider transformative AI forecasts to face a reasonably high burden of argumentation. However, I don’t think one should discount researchers’ views to an extreme degree, or have an overwhelmingly strong prior against transformative AI in the medium term.

Bottom line

There’s no solid basis for estimating the likelihood of transformative AI in the coming decades. Trying to do so may be an entirely futile exercise; certainly it has many properties (long timelines, relatively vague concepts, lack of detailed public analysis) I’d associate with developments that are nearly impossible to forecast.

That said:

At this point we’ve tried to examine the problem from every angle we can, and further improvement in our picture of the situation would be quite time-consuming to obtain. (Nonetheless, we do hope to pursue such improvement.)
Machine learning (particularly deep learning) is a dynamic field: it has seen impressive progress and major growth in researchers and resources over the last few years. It isn’t clear where, if anywhere, the limitations of deep learning lie, and the highest-quality discussion we’ve been able to participate in on this topic has not led to identifying any clear limitations or obstacles. The current picture is consistent with a real possibility that only a small number of major breakthroughs will be needed for AI systems to achieve human-level or better performance on large numbers of intellectual functions.
All the information we’ve collected - from surveys of experts, available analysis based on trend extrapolation, and from more detailed, analytical discussions with technical advisors - is consistent with this possibility. All of these categories of information have major flaws and limitations.
Assigning less than 10% probability to “transformative AI within the next 20 years” would not seem supported by any of these categories of evidence, and would have no solid justification I can identify at this time. I don’t expect to come across such a justification within the next 6 months, though with enough effort we might encounter one within the next year.

There have been vivid false projections of AI in the past, and we’re aware that today’s projections of future AI could look misguided in retrospect. This is a risk I think we should accept, given our philosophy of hits-based giving. Overall, I think it is appropriate to act as though there is at least a 10% chance of “transformative AI” within the next 20 years; as though this “10%” figure is somewhat stable/robust; and as though we can be quite confident that the probability is at least 1%.

In the past, I have argued against investing resources based on arguments that resemble “Pascal’s Mugging” (and I’ve done so specifically in the context of reducing potential risks from advanced artificial intelligence). If one has little information about a debate, and no idea how to assign a probability to a given proposition, I don’t think it’s appropriate to make arguments along the lines of “We should assign a probability above X simply because our brains aren’t capable of confidently identifying probabilities lower than that.” Put differently, “I can’t prove that the probability is very low” is not sufficient to argue “The probability is reasonably high.” But I think the arguments I’ve given above present a markedly different situation:

We have put a great deal of effort into becoming informed about the relevant issues, and feel that we’ve explored essentially all of the available angles.
There are many outputs of our investigations that could have led - but did not, in fact, lead - me to assign a <10% probability of transformative AI in the next 20 years. For example, if the number of core open problems identified by technical advisors appeared larger and more diverse, or if the technical advisors working on identifying these problems had given different subjective bottom lines about the odds of transformative AI in the next 20 years, I could assign a much lower probability. Secondarily, despite the reservations I’ve expressed about expert surveys and trend extrapolations, I would be thinking about this issue quite differently if the expert surveys pointed to a much longer “10% probability” timeline or if trend extrapolations made it seem clear that we are still several decades from having computers that can match available estimates of the raw processing power of a human brain.
My best-guess probability for “transformative AI within the next 20 years” (>=10%) is within the range where I feel mentally capable of distinguishing between different probabilities. I would assign a 5% or lower probability to many statements that might, in the absence of our investigations, look about as likely as “transformative AI within the next 20 years.” For example, I’d assign a <=5% probability to the proposition that meat derived from stem cells will be competitive (in the sense of real-world demand) with traditional meat within the next 20 years, or that a broad-spectrum, extremely effective (comparable with Gleevec) cancer drug (as in pill or protein) will come to market within the next 20 years.^[25]
I think it’s very unlikely that my best-guess probability for “transformative AI within the next 20 years” could fall below 10% without a great deal more investigation or new information. Specifically, I think it would take at least 6-12 months for this to occur.

I believe there are important differences between probabilities assigned out of sheer guesswork and avoidance of overconfidence, vs. probabilities assigned as the result of substantive investigation that it would be hard to improve on. At this point, I have a view about the likelihood of transformative AI that - while the thinking behind it has many limitations and sources of uncertainty - fits more in the second category.

How convincing should this be to a reader outside the Open Philanthropy Project?

As stated above, my view is based on a large number of undocumented conversations, such that I don’t think it is realistic to aim for being highly convincing in this post. Instead, I have attempted to lay out the general structure of the inputs into my thinking.

For further clarification, I will now briefly go over which parts of my argument I believe are well-supported and/or should be uncontroversial, vs. which parts rely crucially on information I haven’t been able to fully share.

Definition and in-principle feasibility of transformative AI. I believe that my view that transformative AI is possible in principle would not be straightforward to vet because it would require access to either deep expertise or conversations with experts on a variety of subjects. However, I believe such conversations would lead relatively straightforwardly to the idea that transformative AI seems possible in principle.
Expert surveys and trend extrapolations. I believe that our take on available data from expert surveys and trend extrapolation is on relatively solid ground. Our methodology hasn’t been as systematic as it could have been, but I would be quite surprised if it turned out that there were relevant, high-value-added data in these categories that we haven’t considered, or that our take on the strengths and weaknesses of such data were missing crucial considerations. I believe an interested reader could perform their own searches and analysis to verify this. However, I don’t believe this analysis alone would justify the view I’m arguing, for reasons given above.
Possibility that a small number of broadly applicable insights might be sufficient for transformative AI. I believe that the above discussion of whether AI research is likely to proceed by a large number of specialized insights or a smaller number of more general ones, including my statement that “the current state of both neuroscience and AI research is highly compatible with both possibilities (and a range of things in between),” is on similar ground to the ideas in the first bullet point. It would not be straightforward to verify because it would require access to either deep expertise or conversations with experts on a variety of subjects, but I believe my relatively agnostic take on this topic would be fairly uncontroversial among such experts.
Recent progress in deep learning. Daniel Dewey’s summary of recent progress should give a general sense that there have been major advances recently in a fairly broad array of domains, though technical background or technical advisors would be necessary to verify the claims. I think it is uncontroversial that there have been major advances, though just how major and impressive is a question with a great deal of room for debate.
Core open problems in AI and their implications for likely timelines. I haven’t shared the specifics of the core open problems we’ve discussed, and even if I did, it would be very hard to get a sense for whether these core problems have been reasonably chosen, how likely various research paths are to lead to progress on these problems, and how significant progress on these problems would be. My views on this question rely on a variety of beliefs about the particular technical advisors we’ve worked with to understand these issues: that these advisors are among the stronger researchers in their fields, that they reflect intelligently and reasonably on what intellectual functions seem challenging for AI (and for what reasons), that they have enough connections and engagement with the rest of the community to notice most directly relevant major insights from other researchers, that they accurately report which beliefs are commonly held in the field vs. held by some researchers vs. commonly rejected, etc. I have formed my read on these advisors through a large amount of interaction. I don’t expect outside readers to come to the same views I have on the nature of core open problems in AI; the only way I can think of to do so would be to form their own high-trust relationships with technical advisors (or to become technical experts themselves).

What other possible future developments might deserve similar attention?

It generally seems to me that since philanthropy is well-suited to long-term, low-probability-of-success work that aims to benefit the world as a whole rather than a particular organization or interest, it is a good idea for philanthropists to ask “What are the most dramatic worldwide changes that could occur in the relatively near future?” We’ve put a fair amount of informal effort into doing so,^[26] and at this point we feel the possibility of transformative AI stands out.

To give some more context on this view, I will first list a few potential developments that seem to me to be particularly strong candidates for being transformative in principle, then briefly discuss which I find most relevant in the relatively near (~20 years) future.

Some possible developments:

Progress in biology to the point of having dramatically better ability to understand, and modify, the functions of a human body and/or brain.
Development of extremely cheap, clean, scalable energy sources.
Development of radically improved manufacturing methods (atomically precise manufacturing is one possibility that we have written about, though far from the only one in this category).
Progress in social science to the point of being able to design institutions and interventions that dramatically change the way people think and live.
A variety of global catastrophic risks, particularly pandemics and worst-case climate change.
Dramatic shifts in geopolitical relations.
Dramatic shifts in cultural values.

There are many of these possibilities we haven’t investigated to my satisfaction. However:

I don’t believe the bulk of these are - or would seem after further investigation - much more likely over the next 20 years than the development of transformative AI. (The major exception is pandemics, a focus area of ours; the last two points are also possible for certain definitions of “dramatic.”) The combination of (a) a dynamic field with striking recent progress; (b) a live hypothesis for how this progress might be extended, with a relatively small number of further breakthroughs, to transformative technology; and (c) a lack of concrete, clearly relevant “obstacles” one can name for this hypothesis seems fairly unique to AI, relative to the other possibilities, at this time.
I believe that unexpectedly strong progress on AI could lead to unexpectedly fast progress on almost any of the above developments, and possibly to unexpectedly fast progress on several of these fronts at once.
I think there are additional reasons - largely related to neglectedness and tractability - that AI is especially in need of philanthropic attention. I will discuss these in a future post.

I believe this view is relatively uncontroversial among people who have studied the matter, and would be supported by most literature on the topic. I may put out some content in the next couple of months that gives a sense of the literature I’ve found most informative for this view. ↩︎
Start of chapter 2. ↩︎
The way in which a superintelligence, artificial general intelligence, or high-level machine intelligence might not meet my more detailed definition would be if it failed to be cost-competitive with employing humans. I think this is a fairly minor discrepancy in the scheme of things, since computing technology has generally tended to decrease fairly quickly in cost over time. ↩︎
For example, the following is the 2nd of the “Ten Commandments for Aspiring Superforecasters” in Philip Tetlock’s Superforecasting:

(2) Break seemingly intractable problems into tractable sub-problems.

Channel the playful but disciplined spirit of Enrico Fermi who—when he wasn’t designing the world’s first atomic reactor—loved ballparking answers to head-scratchers such as “How many extraterrestrial civilizations exist in the universe?” Decompose the problem into its knowable and unknowable parts. Flush ignorance into the open. Expose and examine your assumptions. Dare to be wrong by making your best guesses. Better to discover errors quickly than to hide them behind vague verbiage.

Superforecasters see Fermi-izing as part of the job. How else could they generate quantitative answers to seemingly impossible-to-quantify questions about Arafat’s autopsy, bird-flu epidemics, oil prices, Boko Haram, the Battle of Aleppo, and bond-yield spreads.

We find this Fermi-izing spirit at work even in the quest for love, the ultimate unquantifiable. Consider Peter Backus, a lonely guy in London, who guesstimated the number of potential female partners in his vicinity by starting with the population of London (approximately six million) and winnowing that number down by the proportion of women in the population (about 50%), by the proportion of singles (about 50%), by the proportion in the right age range (about 20%), by the proportion of university graduates (about 26%), by the proportion he finds attractive (only 5%), by the proportion likely to find him attractive (only 5%), and by the proportion likely to be compatible with him (about 10%). Conclusion: roughly twenty-six women in the pool, a daunting but not impossible search task.

There are no objectively correct answers to true-love questions, but we can score the accuracy of the Fermi estimates that superforecasters generate in the IARPA tournament. The surprise is how often remarkably good probability estimates arise from a remarkably crude series of assumptions and guesstimates. ↩︎
Note that Good Ventures is an investor. ↩︎
Additional recent investments in AI include Toyota’s $1 billion commitment to AI R&D over the next 5 years, South Korea’s announced $863 million fund to support AI R&D over the next 5 years, Rethink Robotics’ $100+ million raised since November 2010, Sentient Technologies’ $130+ million raised since April 2010, and Goldman Sachs’ planned investments in AI. One report (that we have not vetted) estimates 2015 venture capital investments in AI at $1.9 billion (see table 1). ↩︎
Two data points on the utility of trend extrapolation as one input to successful forecasting, provided by Luke Muehlhauser:

(1) A randomized controlled trial conducted by Barbara Mellers and colleagues as part of a recent large forecasting tournament, reported in Mellers et al. (2015), found that “training in probabilistic reasoning,” among other factors, improved forecasting success (see the paper for details). They describe the training this way: “Forecasters were taught to consider comparison classes and take the “outside” view. They were told to look for historical trends and update their beliefs by identifying and extrapolating persistent trends and accounting for the passage of time. They were told to average multiple estimates and use previously validated statistical models when available. When not available, forecasters were told to look for predictive variables from formal models that exploit past regularities. Finally, forecasters were warned against judgmental errors, such as wishful thinking, belief persistence, confirmation bias, and hindsight bias. This training module was informed by a large literature that investigates methods of debiasing…” It is impossible to know how much of the measured improvement in forecasting accuracy was due to training on how to use comparison classes and trend extrapolations in particular (apart from the effects of the rest of the training), but it seems plausible that the training on comparison classes and trend extrapolation had some effect. The experiment’s training materials are provided in the supplemental materials for Mellers et al. (2014).

(2) The most exhaustive retrospective analysis of historical technology forecasts we have yet found, Mullins (2012), categorized thousands of published technology forecasts by methodology, using eight categories including “multiple methods” as one category. Of all these methodology categories, quantitative trend analyses had the highest success rate (see Table 3). The authors summarize this finding, and add some important caveats, on page 18: “For success rates, quantitative trend analysis… outperforms all other methodologies across all time frames except for models… in the medium term. Forecasts generated from quantitative trend analysis do have a significantly high percentage of predictions about computer technologies (the technology area with the highest statistically significant success rate), with 46% of quantitative trend analysis forecasts falling into this technology area tag, compared to 21% for expert analysis methods, the methodology with the second highest percentage of forecasts about computer technologies. However, when comparing success rates for methodologies solely within the computer technology area tag, quantitative trend analysis performs slight below average, indicating the predominance of forecasts about computer technologies is not influencing the success rates of associated forecast methodologies. When quantitative trend analysis was compared to all other methodologies while correcting for technology area tag and time frame, it did not demonstrate a statistically better success rate. This is due in part to the small sample size of quantitative forecasts that did not project over the short term or make predictions about computer technologies.”

We have not yet evaluated either of these studies closely, but suspect the view that it is often useful to look for relevant trends in quantifiable data when making predictions would be uncontroversial among forecasting experts. ↩︎
Dario works at Google Brain, and previously co-authored a paper at Baidu Silicon Valley AI lab that MIT Technology Review listed as a top breakthrough of 2015. Chris works at Google Brain, and co-authored a paper last year that drew significant popular attention for introducing a way of visualizing the “thought process” of neural networks. Jacob is a PhD student at Stanford who has published 9 papers in prominent computer science conferences and is lead organizer of a workshop at ICML this year. ↩︎
By “raw data”, I mean sensory inputs, e.g. light and sound. Most humans use eyes and ears to receive light and sound; computers can receive light and sound using cameras and microphones or (more commonly and simply) via files. ↩︎
The terms “intellectual functions” and “underlying algorithms” map fairly well to David Marr’s “computational level” and “algorithmic/representational level,” respectively. I use different terms only in an attempt to be clear for readers not familiar with those terms. Note that underlying algorithms can be opaque to the person carrying them out. For example, I can’t describe mechanistic rules that my brain uses to perform the intellectual functions of looking at a person and determining, from the pattern of light I perceive plus background knowledge, whether I know them. ↩︎
See notes from an email exchange with Adam Marblestone for more on relevant neuroscience specifically. ↩︎
“Top-5 error” is the percentage of images where the correct label did not appear in the system’s top 5 predicted labels. It’s worth noting that these classifiers generally do not perform well when presented with images that are very different from ImageNet photos (such as random noise), and so it would be inaccurate to say that they perform at human-like levels for general image recognition. ↩︎
See Russakovsky et al. 2015 pages 19-21 for 2010-2014 performance, this page for 2015 results (task 2a, team MSRA). He et al. 2015 is the paper underlying the latter. This blog post has a comparison to human performance. ↩︎
See Russakovsky et al. 2015 pages 19-21 for 2010-2014 performance. More on the 2012 breakthrough: Hinton et al. 2012, Krizhevsky, Sutskever and Hinton 2012, Srivastava et al. 2014. ↩︎
Seide, Li and Yu 2011. ↩︎
Saon et al. 2015. ↩︎
Amodei et al. 2015. In the fourth task, trained humans outperformed the system by less than 1%. ↩︎
Mnih et al. 2015. Note that a separate instance of the system was trained and evaluated on each game, instead of a single instance being trained and evaluated on every game; the latter would be an extremely impressive result. Also note that different games have different numbers of input actions, and that some games have a “life counter” displayed; DQN instances were modified to make use of that many actions for each game, and were given the number of lives remaining in games where a life counter was available, but according to the paper no other modifications were made between games. ↩︎
E.g. Karpathy and Li 2015, Xu et al. 2015. ↩︎
E.g. Cho et al. 2014. ↩︎
E.g. Gregor et al. 2015. ↩︎
E.g. Bordes et al. 2015. ↩︎
E.g. Vinyals, Fortunato and Jaitly 2015, Kurach, Andrychowicz, and Sutskever 2016. ↩︎
E.g. Geoff Hinton has said, “The really skilled players just sort of see where a good place to put a stone would be. They do a lot of reasoning as well, which they call reading, but they also have very good intuition about where a good place to go would be, and that’s the kind of thing that people just thought computers couldn’t do. But with these neural networks, computers can do that too. They can think about all the possible moves and think that one particular move seems a bit better than the others, just intuitively. That’s what the feed point neural network is doing: it’s giving the system intuitions about what might be a good move. It then goes off and tries all sorts of alternatives. The neural networks provides you with good intuitions, and that’s what the other programs were lacking, and that’s what people didn’t really understand computers could do.” ↩︎
Excluding cases where advances in AI are a primary driver behind these developments’ coming about. ↩︎
We’ve had many informal conversations on these topics, including with scientific advisors. Below are some sources I’ve read or skimmed and that others might find useful in getting a sense for the broader space of potential future developments: Global Catastrophes and Trends: The Next 50 Years by Vaclav Smil; Global Trends 2030: Alternative Worlds by the U.S. National Intelligence Council; Anticipating 2025 by David Wood, Mark Stevenson and others; Global Catastrophic Risks edited by Nick Bostrom and Milan Cirkovic; Wikipedia’s list of emerging technologies; Edge.org’s 2016 question on interesting/important recent scientific news (also see Scott Aaronson’s summary); a Quora contest we sponsored around the question, “What are the biggest ways in which the world 20 years from now will probably be different from today? What are the biggest “X factors” (changes that are not probable, but are possible and could be huge)?”; and a variety of other sources that I haven’t reviewed, but were skimmed by Luke Muehlhauser in his search for literature on this general topic: Abundance, The Next Fifty Years, The Future of the Brain, Deep Shift: Technology Tipping Points and Social Impact, 2052: A Global Forecast for the Next Forty Years, The Next 100 Years, and the timeline at FutureTimeline.net, starting with the present decade. ↩︎
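
As a concrete illustration of the “top-5 error” metric defined in the notes above (the percentage of images where the correct label did not appear in the system’s top 5 predicted labels), here is a minimal Python sketch. The function name top_k_error and the toy labels are hypothetical and chosen only for illustration; this is not the evaluation code used for ImageNet or by any of the cited papers.

```python
def top_k_error(ranked_predictions, true_labels, k=5):
    """Fraction of examples whose true label is missing from the top-k predictions.

    ranked_predictions: one list of labels per example, best guess first.
    true_labels: the correct label for each example, in the same order.
    """
    if len(ranked_predictions) != len(true_labels):
        raise ValueError("need exactly one true label per prediction list")
    if not true_labels:
        raise ValueError("need at least one example")
    misses = sum(
        1
        for predictions, truth in zip(ranked_predictions, true_labels)
        if truth not in predictions[:k]
    )
    return misses / len(true_labels)


# Toy data (hypothetical): four "images", each with six ranked guesses.
ranked = [
    ["cat", "dog", "fox", "wolf", "lynx", "bear"],   # truth ranked 1st: hit
    ["car", "bus", "van", "tram", "bike", "truck"],  # truth ranked 6th: miss
    ["oak", "elm", "ash", "fir", "yew", "pine"],     # truth ranked 5th: hit
    ["iris", "lily", "tulip", "daisy", "poppy", "aster"],  # truth absent: miss
]
truth = ["cat", "truck", "yew", "rose"]

print(top_k_error(ranked, truth))       # 0.5 (2 misses out of 4)
print(top_k_error(ranked, truth, k=1))  # 0.75 (only the first example is a top-1 hit)
```

The metric is lenient by design: a system is credited whenever the correct label appears anywhere in its top five guesses, which is one reason a low top-5 error on ImageNet-style photos should not be read as human-like performance at general image recognition.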
