
Artificial intelligence alignment is believed by many to be one of the most important challenges we face right now. I understand the argument that once AGI is developed it's game over unless we have solved alignment, and I am completely convinced by this. However, I have never seen anyone explain the reasoning that leads experts in the field to believe that AGI could be here in the near future. Claims that there is an X% chance of AGI in the next Y years (where X is fairly large and Y fairly small) are rarely supported by an actual argument.

I realize that for the EA community to dedicate so many resources to this topic, there must be good reasons to believe that AGI really is not too far away, or that alignment is such a hard problem that it will take a long time to solve. It seems like the former is the more widely held view.

Could someone either present, or point me in the direction of, a clear explanation of why many believe AGI is on the horizon? In addition, please correct me if this question demonstrates some misunderstanding on my part.

4 Answers

It can seem strange that people act decisively about speculative things. So the first piece to understand is expected value: if something would be extremely important if it happened, then you can place quite low probability on it and still have warrant to act on it. (This is sometimes accused of being a decision-theory "mugging", but it isn't: we're talking about subjective probabilities in the range of 1% - 10%, not infinitesimals like those involved in Pascal's mugging.)

I think the most-defensible outside-view argument is: it could happen soon; it could be dangerous; aligning it could be very hard; and the product of these probabilities is not low enough to ignore.
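
As a purely illustrative sketch of that product argument, here is the arithmetic with hypothetical placeholder numbers (these are not anyone's published estimates; the point is only that the product of several modest probabilities can remain non-negligible):

```python
# Illustrative only: all probabilities below are hypothetical placeholders,
# not anyone's actual estimates.
p_agi_soon = 0.10            # human-level AI arrives within a few decades
p_dangerous = 0.50           # if it arrives, it is dangerous by default
p_alignment_unsolved = 0.50  # alignment is not solved in time

p_bad_outcome = p_agi_soon * p_dangerous * p_alignment_unsolved
print(f"Combined probability: {p_bad_outcome:.1%}")  # -> 2.5%
```

Even a few percent, multiplied by an extremely large downside, gives an expected loss big enough to act on, and these are ordinary subjective probabilities rather than the infinitesimals involved in a Pascal's mugging.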

1. When you survey general AI experts (not just safety or AGI people), they give a very wide distribution of predictions for when we will have human-level AI (HLAI), with a central tendency of "10% chance of human-level AI... in the 2020s or 2030s". (This is weak evidence: technology forecasting is very hard and these surveys are not random samples, but it still seems like some evidence.)


2. We don't know what the risk of HLAI being dangerous is, but we have a couple of analogous precedents:

* the human precedent for world domination through intelligence / combinatorial generalisation / cunning

* the human precedent for 'inner optimisers': evolution was heavily optimising for genetic fitness, but produced a system, us, which optimises for a very different objective ("fun", or "status", or "gratification" or some bundle of nonfitness things).

* goal space is much larger than the human-friendly part of goal space (suggesting that a random objective will not be human-friendly, which, combined with assumptions about goal maximisation and instrumental drives, implies that most goals could be dangerous).

* there's a common phenomenon of very stupid ML systems still developing "clever" unintended / hacky / dangerous behaviours


3. We don't know how hard alignment is, so we don't know how long it will take to solve. It may involve certain profound philosophical and mathematical questions, which have been worked on by some of the greatest thinkers for a long time. Here's a nice nontechnical statement of the potential difficulty. Some AI safety researchers are actually quite optimistic about our prospects for solving alignment, even without EA intervention, and work on it to cover things like the "value lock-in" case instead of the x-risk case.

Just wanted to note that while I am quoted as being optimistic, I am still working on it specifically to cover the x-risk case and not the value lock-in case. (But certainly some people are working on the value lock-in case.)

(Also I think several people would disagree that I am optimistic, and would instead think I'm too pessimistic, e.g. I get the sense that I would be on the pessimistic side at FHI.)

MichaelA🔸:
Also, for posterity, there's some interesting discussion of that interview with Rohin here. And some other takes on "Why AI risk might be solved without additional intervention from longtermists" are summarised, and then discussed in the comments, here. But very much in line with technicalities' comment, it's of course totally possible to believe that AI risk will probably be solved without additional intervention from longtermists, and yet still think that serious effort should go into raising that probability further. There's a great quote from The Precipice on that general idea, in the context of nuclear weapons.
> I realize that for the EA community to dedicate so many resources to this topic there must be good reasons to believe that AGI really is not too far away

First, a technicality: you don't have to believe that AGI/Transformative AI will most likely happen soonish, just that the probability is high enough to be worth working on[1].

But in general, several points of evidence of a relatively soon AGI:

1. The first is that we can look at estimates from AI experts (not necessarily AI Safety people). Survey estimates for when Human-Level AI/AGI/TAI will happen are all over the place, but roughly speaking the median is <60 years, so expert surveys say it is more likely than not to happen within our lifetimes[2]. You can believe that AI researchers are overconfident about this, but the bias could be in either direction (eg, history offers plenty of examples where famous people in a field dramatically underestimated progress in that field).

2. People working specifically on building AGI (eg, people at OpenAI and DeepMind) seem especially bullish about transformative AI happening soon, even relative to AI/ML experts not working on AGI. Note that this is not uncontroversial; see, eg, criticisms from Jessica Taylor, among others. Note also that there's a strong selection effect for the people who're the most bullish on AGI to work on it.

3. Within EA, people working on AI Safety and AI Forecasting have more specific inside view arguments. For example, see this recent talk by Buck and a bunch of stuff by AI Impacts. I find myself confused about how much to update on believable arguments vs. just using them as one number among many of "what experts believe".

4. A lot of people working in AI Safety seem to have private information that updates them towards shorter timelines. My knowledge of a small(?) subset of them does lead me to believe in somewhat shorter timelines than expert consensus, but I'm confused about whether this information (or the potential of this information) feeds into expert intuitions for forecasting, so it's hard to know if this is in a sense already "priced in." (see also information cascades, this comment on epistemic modesty). Another point of confusion is how much you should trust people who claim to have private information; a potentially correct decision-procedure is to ignore all claims of secrecy as BS.


[1] Eg, if you believe with probability 1 that AGI won't happen for 100 years, I think a few people might still be optimistic about working now to hammer out the details of AGI safety, but most people won't be that motivated. Likewise, if you believe (as I think Will MacAskill does) that the probability of AGI/TAI in the next century is 1%, I think many people may believe there are marginally more important long-termist causes to work on. How high X has to be, in your phrase "X% chance of AGI in the next Y years", is a harder question.

[2] "Within our lifetimes" is somewhat poetic but obviously the "our" is doing a lot of the work in that phrase. I'm explicitly saying that as an Asian-American male in my twenties, I expect that if the experts are right, transformative AI is more likely than not to happen before I die of natural causes.

Good answer.

> People working specifically on AGI (eg, people at OpenAI, DeepMind) seem especially bullish about transformative AI, even relative to experts not working on AGI. Note that this is not uncontroversial, see eg, criticisms from Jessica Taylor, among others. Note also that there's a strong selection effect for the people who're the most bullish on AGI to work on it.

I have several uncertainties about what you meant by this:

  • Do you include in "People working specifically on AGI" people working on AI safety, or just capabilities?
  • ...
Linch:
  • Just capabilities (in other words, people working to create AGI), although I think the safety/capabilities distinction is less clear-cut outside of a few dedicated safety orgs like MIRI.
  • Yes: AI people who aren't explicitly thinking of AGI when they do their research (I think this correctly describes well over 90% of ML researchers at Google Brain, for example).
  • Because it might be surprising (to people asking or reading this question who are imagining long timelines) to see timelines as short as the ones AI experts believe, so the second point is qualifying that AGI experts believe it's even shorter.

In general it looks like my language choice was more ambiguous than desirable, so I'll edit my answer to be clearer!

Ah, ok. The edits clear everything up for me, except I'm still not sure whether the "even" is meant to highlight that this is even shorter than the timelines given in the above paragraph. (Not sure that matters much, though.)

Linch:
I edited that section, let me know if there are remaining points of confusion!

(I agree with other commenters that the most defensible position is that "we don't know when AGI is coming", and I have argued that AGI safety work is urgent even if we somehow knew that AGI is not soon, because of early decision points on R&D paths; see my take here. But I'll answer the question anyway.) (Also, I seem to be almost the only one coming from this following direction, so take that as a giant red flag...)

I've been looking into the possibility that people will understand the brain's algorithms well enough to make an AGI by copying them (at a high level). My assessment is: (1) I don't think the algorithms are that horrifically complicated, (2) Lots of people in both neuroscience and AI are trying to do this as we speak, and (3) I think they're making impressive progress, with the algorithms powering human intelligence (i.e. the neocortex) starting to crystallize into view on the horizon. I've written about a high-level technical specification for what neocortical algorithms are doing, and in the literature I've found impressive mid-level sketches of how these algorithms work, and low-level sketches of associated neural mechanisms (PM me for a reading list). The high-, mid-, and low-level pictures all feel like they kinda fit together into a coherent whole. There are plenty of missing details, but again, I feel like I can see it crystallizing into view. So that's why I have a gut feeling that real-deal superintelligent AGI is coming in my lifetime, either by that path or another path that happens even faster. That said, I'm still saving for retirement :-P

Other answers have made what I think of as the key points. I'll try to add by pointing in the direction of some resources I've found on this matter which weren't mentioned already by others. Note that:

  • Some of these sources suggest AGI is on the horizon, some suggest it isn't, and some just discuss the matter
  • The question of AGI timelines (things like "time until AGI") is related to, but distinct from, the question of "discontinuity"/"takeoff speed"/"foom" (I mention the last of those terms only for historical reasons; I think it's unnecessarily unprofessional). Both questions are relevant when determining strategies for handling AI risk. It would probably be good if the distinction was more often made explicit. The sources I'll mention may sometimes be more about discontinuity-type-questions than about AGI timelines.

With those caveats in mind, here are some sources:

I've also made a collection of so far around 30 "works that highlight disagreements, cruxes, debates, assumptions, etc. about the importance of AI safety/alignment, about which risks are most likely, about which strategies to prioritise, etc." Most aren't primarily focused on timelines, but many relate to that matter.

Oh, also, on the more general question of what to actually do, given a particular belief about AGI timelines (or other existential risk timelines), this technical report by Owen Cotton-Barratt is interesting. One quote:

> There are two major factors which seem to push towards preferring more work which focuses on scenarios where AI comes soon. The first is nearsightedness: we simply have a better idea of what will be useful in these scenarios. The second is diminishing marginal returns: the expected effect of an extra year of work on a problem tends to decline ...
Comments

Note also that your question has a selection filter: you'd also want to figure out where the best arguments for longer timelines are. In an ideal world those two sets of arguments would live in the same place; in our world that isn't always the case.
