Artificial intelligence alignment is believed by many to be one of the most important challenges we face right now. I understand the argument that once AGI is developed it's game over unless we have solved alignment, and I am completely convinced by this. However, I have never seen anyone explain the reasoning that leads experts in the field to believe that AGI could be here in the near future. Claims that there is an X% chance of AGI in the next Y years (where X is fairly large and Y fairly small) are rarely supported by an actual argument.

I realize that for the EA community to dedicate so many resources to this topic there must be good reasons to believe that AGI really is not too far away or that alignment is such a hard problem it will take a long time to solve. It seems like the former is the more widely held view.

Could someone either present or point me in the direction of a clear explanation for why many believe AGI is on the horizon? In addition, please correct me if this question demonstrates some misunderstanding on my part.





It can seem strange that people act decisively about speculative things. So the first piece to understand is expected value: if something would be extremely important if it happened, then you can place quite low probability on it and still have warrant to act on it. (This is sometimes accused of being a decision-theory "mugging", but it isn't: we're talking about subjective probabilities in the range of 1% - 10%, not infinitesimals like those involved in Pascal's mugging.)

I think the most-defensible outside-view argument is: it could happen soon; it could be dangerous; aligning it could be very hard; and the product of these probabilities is not low enough to ignore.

1. When you survey general AI experts (not just safety or AGI people), they give a very wide distribution of predictions for when we will have human-level AI (HLAI), with a central tendency of "10% chance of human-level AI... in the 2020s or 2030s". (This is weak evidence, since technology forecasting is very hard and these surveys are not random samples, but it still seems like some evidence.)

2. We don't know what the risk of HLAI being dangerous is, but we have a couple of analogous precedents:

* the human precedent for world domination through intelligence / combinatorial generalisation / cunning

* the human precedent for 'inner optimisers': evolution was heavily optimising for genetic fitness, but produced a system, us, which optimises for a very different objective ("fun", or "status", or "gratification" or some bundle of nonfitness things).

* goal space is much larger than the human-friendly part of goal space (suggesting that a random objective will not be human-friendly, which combined with assumptions about goal maximisation and instrumental drives implies that most goals could be dangerous).

* there's a common phenomenon of very stupid ML systems still developing "clever" unintended / hacky / dangerous behaviours

3. We don't know how hard alignment is, so we don't know how long it will take to solve. It may involve certain profound philosophical and mathematical questions, which have been worked on by some of the greatest thinkers for a long time. Here's a nice nontechnical statement of the potential difficulty. Some AI safety researchers are actually quite optimistic about our prospects for solving alignment, even without EA intervention, and work on it to cover things like the "value lock-in" case instead of the x-risk case.
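As a rough illustration of how the three steps above combine, here is a minimal Python sketch. It treats them as a chain of conditional probabilities and multiplies them; that framing, the function name `joint_risk`, and the 0.5 values are assumptions made purely for illustration, not estimates from any survey. Only the 10% input echoes the survey central tendency quoted in point 1.

```python
def joint_risk(p_soon: float, p_dangerous: float, p_unsolved: float) -> float:
    """Probability that HLAI arrives soon, is dangerous, and alignment is unsolved in time.

    Each argument after the first is read as conditional on the earlier
    steps holding, so the product is a chained estimate.
    """
    for p in (p_soon, p_dangerous, p_unsolved):
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"probability out of range: {p}")
    return p_soon * p_dangerous * p_unsolved


# p_soon echoes the "10% chance" survey figure; the other two values are
# hypothetical placeholders, not anyone's actual estimate.
risk = joint_risk(p_soon=0.10, p_dangerous=0.5, p_unsolved=0.5)
print(f"joint probability: {risk:.1%}")
```

With these placeholder inputs the product is 2.5%, inside the 1% - 10% range mentioned earlier. No single step has to be likely for the conjunction to stay far above Pascal's-mugging territory.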

Just wanted to note that while I am quoted as being optimistic, I am still working on it specifically to cover the x-risk case and not the value lock-in case. (But certainly some people are working on the value lock-in case.)

(Also I think several people would disagree that I am optimistic, and would instead think I'm too pessimistic, e.g. I get the sense that I would be on the pessimistic side at FHI.)

Also, for posterity, there's some interesting discussion of that interview with Rohin here. And some other takes on "Why AI risk might be solved without additional intervention from longtermists" are summarised, and then discussed in the comments, here. But very much in line with technicalities' comment, it's of course totally possible to believe that AI risk will probably be solved without additional intervention from longtermists, and yet still think that serious effort should go into raising that probability further. Great quote from The Precipice on that general idea, in the context of nuclear weapons:
I realize that for the EA community to dedicate so many resources to this topic there must be good reasons to believe that AGI really is not too far away

First, a technicality: you don't have to strongly believe that the median estimate for AGI/Transformative AI is soonish, just that the probability is high enough to be worth working on[1].

But in general, there are several points of evidence for relatively soon AGI:

1. The first is that we can look at estimates from AI experts (not necessarily AI Safety people). It seems like survey estimates for when Human Level AI/AGI/TAI will happen are all over the place, but roughly speaking, the median is <60 years, so expert surveys say that it seems more likely than not to happen in our lifetimes[2]. You can believe that AI researchers are overconfident about this, but bias could be in either direction (eg, there are plenty of examples in history where famous people in a field dramatically underestimated progress in that field).

2. People working specifically on building AGI (eg, people at OpenAI, DeepMind) seem especially bullish about transformative AI happening soon, even relative to AI/ML experts not working on AGI. Note that this is not uncontroversial, see eg, criticisms from Jessica Taylor, among others. Note also that there's a strong selection effect for the people who're the most bullish on AGI to work on it.

3. Within EA, people working on AI Safety and AI Forecasting have more specific inside view arguments. For example, see this recent talk by Buck and a bunch of stuff by AI Impacts. I find myself confused about how much to update on believable arguments vs. just using them as one number among many of "what experts believe".

4. A lot of people working in AI Safety seem to have private information that updates them towards shorter timelines. My knowledge of a small(?) subset of them does lead me to believe in somewhat shorter timelines than expert consensus, but I'm confused about whether this information (or the potential of this information) feeds into expert intuitions for forecasting, so it's hard to know if this is in a sense already "priced in." (see also information cascades, this comment on epistemic modesty). Another point of confusion is how much you should trust people who claim to have private information; a potentially correct decision-procedure is to ignore all claims of secrecy as BS.


[1] Eg, if you believe with probability 1 that AGI won't happen for 100 years, I think a few people might still be optimistic about working now to hammer out the details of AGI safety, but most people won't be that motivated. Likewise, if you believe (as I think Will MacAskill does) that the probability of AGI/TAI in the next century is 1%, I think many people may believe there are marginally more important long-termist causes to work on. How high X has to be for "X% chance of AGI in the next Y years", in your words, is a harder question.

[2] "Within our lifetimes" is somewhat poetic but obviously the "our" is doing a lot of the work in that phrase. I'm explicitly saying that as an Asian-American male in my twenties, I expect that if the experts are right, transformative AI is more likely than not to happen before I die of natural causes.

Good answer.

People working specifically on AGI (eg, people at OpenAI, DeepMind) seem especially bullish about transformative AI, even relative to experts not working on AGI. Note that this is not uncontroversial, see eg, criticisms from Jessica Taylor, among others. Note also that there's a strong selection effect for the people who're the most bullish on AGI to work on it.

I have several uncertainties about what you meant by this:

  • Do you include in "People working specifically on AGI" people working on AI safety, or just capabilities?
  • ...
  • Just capabilities (in other words, people working to create AGI), although I think the safety/capabilities distinction is less clear-cut outside of a few dedicated safety orgs like MIRI.
  • Yes. AI people who aren't explicitly thinking of AGI when they do their research (I think this correctly describes well over 90% of ML researchers at Google Brain, for example).
  • Because it might be surprising (to people asking or reading this question who are imagining long timelines) to see timelines as short as the ones AI experts believe; the second point then adds that AGI experts believe they're even shorter.

In general it looks like my language choice was more ambiguous than desirable, so I'll edit my answer to be clearer!
Ah, ok. The edits clear everything up for me except that the "even" is meant to be highlighting that this is even shorter than the timelines given in the above paragraph. (Not sure that matters much, though.)
I edited that section, let me know if there are remaining points of confusion!

(I agree with other commenters that the most defensible position is that "we don't know when AGI is coming", and I have argued that AGI safety work is urgent even if we somehow knew that AGI is not soon, because of early decision points on R&D paths; see my take here. But I'll answer the question anyway.) (Also, I seem to be almost the only one coming from this following direction, so take that as a giant red flag...)

I've been looking into the possibility that people will understand the brain's algorithms well enough to make an AGI by copying them (at a high level). My assessment is: (1) I don't think the algorithms are that horrifically complicated, (2) Lots of people in both neuroscience and AI are trying to do this as we speak, and (3) I think they're making impressive progress, with the algorithms powering human intelligence (i.e. the neocortex) starting to crystallize into view on the horizon. I've written about a high-level technical specification for what neocortical algorithms are doing, and in the literature I've found impressive mid-level sketches of how these algorithms work, and low-level sketches of associated neural mechanisms (PM me for a reading list). The high-, mid-, and low-level pictures all feel like they kinda fit together into a coherent whole. There are plenty of missing details, but again, I feel like I can see it crystallizing into view. So that's why I have a gut feeling that real-deal superintelligent AGI is coming in my lifetime, either by that path or another path that happens even faster. That said, I'm still saving for retirement :-P

Other answers have made what I think of as the key points. I'll try to add by pointing in the direction of some resources I've found on this matter which weren't mentioned already by others. Note that:

  • Some of these sources suggest AGI is on the horizon, some suggest it isn't, and some just discuss the matter
  • The question of AGI timelines (things like "time until AGI") is related to, but distinct from, the question of "discontinuity"/"takeoff speed"/"foom" (I mention the last of those terms only for historical reasons; I think it's unnecessarily unprofessional). Both questions are relevant when determining strategies for handling AI risk. It would probably be good if the distinction were made explicit more often. The sources I'll mention may sometimes be more about discontinuity-type questions than about AGI timelines.

With those caveats in mind, here are some sources:

I've also made a collection of so far around 30 "works that highlight disagreements, cruxes, debates, assumptions, etc. about the importance of AI safety/alignment, about which risks are most likely, about which strategies to prioritise, etc." Most aren't primarily focused on timelines, but many relate to that matter.

Oh, also, on the more general question of what to actually do, given a particular belief about AGI timelines (or other existential risk timelines), this technical report by Owen Cotton-Barratt is interesting. One quote:

There are two major factors which seem to push towards preferring more work which focuses on scenarios where AI comes soon. The first is nearsightedness: we simply have a better idea of what will be useful in these scenarios. The second is diminishing marginal returns: the expected effect of an extra year of work on a problem tends to declin
...

Note also that your question has a selection filter: you'd also want to figure out where the best arguments for longer timelines are. In an ideal world these two sets of things would tend to live in the same place; in our world this isn't always the case.