I previously summarized Ajeya Cotra’s “biological anchors” method for forecasting transformative AI, aka “Bio Anchors.” Here I want to try to clarify why I find this method so useful, even though I agree with the majority of the specific things I’ve heard people say about its weaknesses (sometimes from people who can’t see why I’d put any stock in it at all).
A couple of preliminaries:
- This post is probably mostly of interest to skeptics of Bio Anchors, and/or people who feel pretty confused/agnostic about its value and would like to see a reply to skeptics.
- I don’t want to give the impression that I’m leveling new criticisms of “Bio Anchors” and pushing for a novel reinterpretation. I think the author of “Bio Anchors” mostly agrees with what I say both about the report’s weaknesses and about how to best use it (and I think the text of the report itself is consistent with this).
Summary of what the framework is about
Just to re-establish context, here are some key quotes from my main post about biological anchors:
The basic idea is:
- Modern AI models can "learn" to do tasks via a (financially costly) process known as "training." You can think of training as a massive amount of trial-and-error. For example, voice recognition AI models are given an audio file of someone talking, take a guess at what the person is saying, then are given the right answer. By doing this millions of times, they "learn" to reliably translate speech to text.
- The bigger an AI model and the more complex the task, the more the training process [or “training run”] costs. Some AI models are bigger than others; to date, none are anywhere near "as big as the human brain" (what this means will be elaborated below).
- The biological anchors method asks: "Based on the usual patterns in how much training costs, how much would it cost to train an AI model as big as a human brain to perform the hardest tasks humans do? And when will this be cheap enough that we can expect someone to do it?"
...The framework provides a way of thinking about how it could be simultaneously true that (a) the AI systems of a decade ago didn't seem very impressive at all; (b) the AI systems of today can do many impressive things but still feel far short of what humans are able to do; (c) the next few decades - or even the next 15 years - could easily see the development of transformative AI.
Additionally, I think it's worth noting a couple of high-level points from Bio Anchors that don't depend on quite so many estimates and assumptions:
- In the coming decade or so, we're likely to see - for the first time - AI models with comparable "size" to the human brain.
- If AI models continue to become larger and more efficient at the rates that Bio Anchors estimates, it will probably become affordable this century to hit some pretty extreme milestones - the "high end" of what Bio Anchors thinks might be necessary. These are hard to summarize, but see the "long horizon neural net" and "evolution anchor" frameworks in the report.
- One way of thinking about this is that the next century will likely see us go from "not enough compute to run a human-sized model at all" to "extremely plentiful compute, as much as even quite conservative estimates of what we might need." Compute isn't the only factor in AI progress, but to the extent other factors (algorithms, training processes) become the new bottlenecks, there will likely be powerful incentives (and multiple decades) to resolve them.
Things I agree with about the framework’s weaknesses/limitations
Bio Anchors “acts as if” AI will be developed in a particular way, and it almost certainly won’t be
Bio Anchors, in some sense, “acts as if” transformative AI will be built in a particular way: simple brute-force trial-and-error of computationally intensive tasks (as outlined here). Its main forecasts are based on that picture: it estimates when there will be enough compute to run a certain amount of trial and error, and calls that the “estimate for when transformative AI will be developed.”
I think it’s unlikely that if and when transformative AI is developed, the way it’s developed will resemble this kind of blind trial-and-error of long-horizon tasks.
If I had to guess how transformative AI will be developed, it would be more like:
- First, narrow AI systems prove valuable at a limited set of tasks. (This is already happening, to a limited degree, with e.g. voice recognition, translation and search.)
- This leads to (a) more attention and funding in AI; (b) more integration of AI into the economy, such that it becomes easier to collect data on how humans interact with AIs that can be then used for further training; (c) increased general awareness of what it takes for AI to usefully automate key tasks, and hence increased awareness of (and attention to) the biggest blockers to AI being broader and more capable.
- Different sorts of narrow AIs become integrated into different parts of the economy. Over time, the increased training data, funding and attention leads to AIs that are less and less narrow, taking on broader and broader parts of the tasks they’re doing. These changes don’t just happen via AI models (and training runs) getting bigger and bigger; they are also driven by innovations in how AIs are designed and trained.
- At some point, some combination of AIs is able to automate enough of scientific and technological advancement to be transformative. There isn’t a single “master run” where a single AI is trained to do the very hardest, broadest tasks via blind trial-and-error.
Bio Anchors “acts as if” compute availability is the only major blocker to transformative AI development, and it probably isn’t
As noted in my earlier post:
Bio Anchors could be too aggressive due to its assumption that "computing power is the bottleneck":
- It assumes that if one could pay for all the computing power to do the brute-force "training" described above for the key tasks (e.g., automating scientific work), transformative AI would (likely) follow.
- Training an AI model doesn't just require purchasing computing power. It requires hiring researchers, running experiments, and perhaps most importantly, finding a way to set up the "trial and error" process so that the AI can get a huge number of "tries" at the key task. It may turn out that doing so is prohibitively difficult.
It is very easy to picture worlds where transformative AI takes much more or less time than Bio Anchors implies, for reasons that are essentially not modeled in Bio Anchors at all
As implied above, transformative AI could take a very long time for reasons like “it’s extremely hard to get training data and environments for some crucial tasks” or “some tasks simply aren’t learnable even by large amounts of trial-and-error.”
Transformative AI could also be developed much more quickly than Bio Anchors implies. For example, some breakthrough in how we design AI algorithms - perhaps inspired by neuroscience - could lead to AIs that are able to do ~everything human brains can, without needing the massive amount of trial-and-error that Bio Anchors estimates (based on extrapolation from today’s machine learning systems).
I’ve listed more considerations like these here.
Bio Anchors is not “pinpointing” the most likely year transformative AI will be developed
My understanding of climate change models is that they try to examine each major factor that could cause the temperature to be higher or lower in the future; produce a best-guess estimate for each; and put them all together into a prediction of where the temperature will be.
In some sense, you can think of them as “best-guess pinpointing” (or even “simulating”) the future temperature: while they aren’t certain or precise, they are identifying a particular, specific temperature based on all of the major factors that might push it up or down.
Many other cases where someone estimates something uncertain (e.g., the future population) have similar properties.
Bio Anchors isn’t like that. There are factors it ignores that are identifiable today and almost certain to be significant. So in some important sense, it isn’t “pinpointing” the most likely year for transformative AI to be developed.
(Not the focus of this piece) The estimates in Bio Anchors are very uncertain
Bio Anchors estimates some difficult-to-estimate things, such as:
- How big an AI model would have to be to be “as big as the human brain” in some relevant sense. (For this it adapts Joe Carlsmith’s detailed report.)
- How fast we should expect algorithmic efficiency, hardware efficiency, and “willingness to spend on AI” to increase in the future - all of which affect the question of “how big an AI training run will be affordable.” Its estimates here are very simple and I think there is lots of room for improvement, though I don’t expect the qualitative picture to change radically.
I acknowledge significant uncertainty in these estimates, and I acknowledge that (all else equal) uncertainty means we should be skeptical.
That said:
- I think these estimates are probably reasonably close to the best we can do today with the information we have.
- I think these estimates are good enough for the purposes of what I’ll be saying below about transformative AI timelines.
I don’t plan to defend this position more here, but may in the future if I get a lot of pushback on it.
Bio Anchors as a way of bounding AI timelines
With all of the above weaknesses acknowledged, here are some things I believe about AI timelines that are largely based on the Bio Anchors analysis:
- I would be at least mildly surprised if transformative AI weren’t developed by 2060. I put the probability of transformative AI by then at 50% (I explain below how the connection works between "mild surprise" and "50%"); I could be sympathetic to someone who said it was 25% or 75%, but would have a hard time seeing where someone was coming from if they went outside that range.
- I would be significantly surprised if transformative AI weren’t developed by 2100. I put the probability of transformative AI by then at 2 in 3; I could be sympathetic to someone who said it was 1 in 3 or 80-90%, but would have a hard time seeing where someone was coming from if they went outside that range.
- Transformative AI by 2036 seems plausible and concretely imaginable, but doesn’t seem like a good default expectation. I think the probability of transformative AI by then is at least 10%; I could be sympathetic to someone who said it was 40-50%, but would have a hard time seeing where someone was coming from if they said it was <10% or >50%.
I’d be at least mildly surprised if transformative AI weren’t developed by 2060
This is mostly because, according to Bio Anchors, it will then be affordable to do some absurdly big training runs - arguably the biggest ones one could imagine needing to do, based on using AI models 10x the size of human brains and tasks that require massive numbers of computations to do even once. In some important sense, we’ll be “swimming in compute.” (More on this intuition at Fun with +12 OOMs of compute.)
But it also matters that 2060 is 40 years from now, which is 40 years to:
- Develop ever more efficient AI algorithms, some of which could be big breakthroughs.
- Increase the number of AI-centric companies and businesses, collecting data on human interaction and focusing increasing amounts of attention on the things that currently block broad applications.
Given the already-rising amount of investment, talent, and potential applications for today’s AI systems, 40 years seems like a pretty long time to make big progress on these fronts. For context, 40 years is around the amount of time that has elapsed between the Apple IIe release and now.
When it comes to translating my “sense of mild surprise” into a probability (see here for a sense of what I’m trying to do when talking about probabilities; I expect to write more on this topic in the future):
- On most topics, I equate “I’d be mildly surprised if X didn’t happen” with something like a 60-65% chance of X. But on this topic, I do think there's a burden of proof (which I consider significant though not overwhelming), and I'm inclined to shade my estimates downward somewhat. So I am saying there's about a 50% chance of transformative AI by 2060.
- I’d be sympathetic if someone said “40 years doesn’t seem like enough to me; I think it’s more like a 25% chance that we’ll see transformative AI by 2060.” But if someone put it at less than 25%, I’d start to think: “Really? Where are you getting that? Why think there’s a <25% chance that we’ll develop transformative AI by a year in which it looks like we’ll be swimming in compute, with enough for the largest needed runs according to our best estimates, with 40 years elapsed between today’s AI boom and 2060 to figure out a lot of the other blockers?”
- On the flip side, I’d be sympathetic if someone said “This estimate seems way too conservative; 40 years should be easily enough; I think it’s more like a 75% chance we’ll have transformative AI by 2060.” But if someone put it at more than 75%, I’d start to think: “Really? Where are you getting that? Transformative AI doesn’t feel around the corner, so this seems like kind of a lot of confidence to have about a 40-year-out event.”
I would be significantly surprised if transformative AI weren’t developed by 2100
By 2100, Bio Anchors projects that it will be affordable not only to do almost comically large-seeming training runs (again based on the hypothesized size of the models and cost-per-try of the tasks), but to do as many computations as all animals in history combined, in order to re-create the progress that was made by natural selection.
In addition, 2100 is 80 years from now - longer than the time that has elapsed since programmable digital computers were developed in the first place. That’s a lot of time to find new approaches to AI algorithms, integrate AI into the economy, collect training data, tackle cases where the current AI systems don’t seem able to learn particular tasks, etc.
To me, it feels like 2100 is something like “About as far out as I could tell a reasonable-seeming story for, and then some.” Accordingly, I’d be significantly surprised if transformative AI weren’t developed by then, and I assign about a 2/3 chance that it will be. And:
- I’d be sympathetic if someone said “Well, there’s a lot we don’t know, and a lot that needs to happen - I only think there’s a 50% chance we’ll see transformative AI by 2100.” I’d even be somewhat sympathetic if they gave it a 1 in 3 chance. But if someone put it at less than 1/3, I’d really have trouble seeing where they were coming from.
- I’d be sympathetic if someone put the probability for “transformative AI by 2100” at more like 80-90%, but given the difficulty of forecasting this sort of thing, I’d really have trouble seeing where they were coming from if they went above 90%.
Transformative AI by 2036 seems plausible and concretely imaginable, but doesn’t seem like a good default expectation
Bio Anchors lays out concrete, plausible scenarios in which there is enough affordable compute to train transformative AI by 2036. I know some AI researchers who feel these scenarios are more than plausible - their intuitions tell them that the giant training runs envisioned by Bio Anchors are unnecessary and that the more aggressive anchors in the report are being underrated.
I also think Bio Anchors understates the case for “transformative AI by 2036” a bit, because it’s hard to tell what consequences the current boom of AI investment and interest will have. If AI is about to become a noticeably bigger part of the economy (definitely an “if”, but compatible with recent market trends), this could result in rapid improvements along many possible dimensions. In particular, there could be a feedback loop in which new profitable AI applications spur more investment in AI, which in turn spurs faster-than-expected improvements in the efficiency of AI algorithms and compute, which in turn leads to more profitable applications … etc.
With all of this in mind, I think the probability of transformative AI by 2036 is at least 10%, and I don't have a lot of sympathy for someone saying it is less.
And that said, all of the above is a set of “coulds” and “mights” - every case I’ve heard for “transformative AI by 2036” seems to require a number of uncertain pieces to click into place.
- If “long-horizon” tasks turn out to be important, Bio Anchors shows that it’s hard to imagine there will be enough compute by 2036 for the needed training runs.
- Even if there is plenty of compute, 15 years might not be enough time to resolve challenges like assembling the right training data and environments.
- It’s certainly possible that some completely different paradigm will emerge - perhaps inspired by neuroscience - and transformative AI will be developed in ways that don’t require Bio-Anchors-like “training runs” at all. But I don’t see any particular reason to expect that to happen in the next 15 years.
So I also don’t have a lot of sympathy for people who think that there’s a >50% chance of transformative AI by 2036.
Bottom line
Bio Anchors is a bit different from the “usual” approach to estimating things. It doesn’t “pinpoint” likely dates for transformative AI; it doesn’t model all the key factors.
But I think it is very useful - in conjunction with informal reasoning about the factors it doesn’t model - for “bounding” transformative AI timelines: making a variety of statements along the lines of “It would be surprising if transformative AI weren’t developed by ___” or “You could defend a ___% probability by such a date, but I think a ___% probability would be hard to sympathize with.”
And that sort of “bounding” seems quite useful for the purpose I care most about: deciding how seriously to take the possibility of the most important century. My take is that this possibility is very serious, though far from a certainty, and Bio Anchors is an important part of that picture for me.
Thanks for the thoughtful reply, that's a good list! I'll make a list of my own below. Warning: Wall of text incoming, I won't be offended if you don't read it!
This is the crux I guess, haha. Here's a stab:
Let's suppose it's 2030 and algorithmic and hardware progress have continued at the rates Ajeya projects and so has willingness-to-spend. Also let's suppose the scaling laws have continued to hold.
Here is a disjunctive list of paths-to-AI-PONR:
a. Some PONR-inducing task turns out to be short-horizon
b. Some PONR-inducing task turns out to work with smallish brains and medium horizons
c. Some PONR-inducing task can be reached via generalization (in short-horizon-pre-trained human-size brains)
d. Some PONR-inducing task can be reached via task decomposition (e.g. bureaucracies of AIs of the aforementioned types)
e. New algorithmic advancements appear that make it possible to do long-horizon training a few OOMs more data-efficiently (I guess I mean this to also be the catch-all category for paradigm shifts and the like)
I should now say what the main PONR-inducing tasks are in my opinion. They are:
--APS-AI [EDIT: Advanced, Planning, Strategically aware. See this report.]
--Persuasion tools good enough to cause major ideological strife and/or major degradation of public epistemology
--R&D acceleration
--Unknown/catchall
Technically R&D acceleration isn't PONR-inducing but it would lead to something PONR-inducing pretty quickly so I include it.
Ok, credences:
a. I think APS-AI is probably not short-horizon, but persuasion and R&D acceleration and unknown might be. (Maybe if we did AlphaFold but bigger and for AI R&D it would make a kickass tool for designing new AI architectures. Input hyperparameters, it predicts what training curve and performance on benchmarks will be!) Let's say 50% chance for persuasion, 25% for R&D acceleration, 15% for unknown, and 65% for all three combined.
b. I worry that maybe a small neural net trained long-horizon-style to be APS-AI might actually succeed at some PONR-inducing task even though it is smaller than the human brain. I don’t worry too much about this, but… think about how GPT-2 is able to write sensible English even though it’s 5 OOMs smaller than the human brain. Or how AlphaStar can go toe-to-toe with human experts despite being 7 OOMs smaller! Let’s say 20%.
c. I’m more worried about big pre-trained brains generalizing (perhaps with a bit of fine-tuning.) I know there has been some research done into scaling laws for transfer, and Rohin extrapolated to calculate that this would only knock off 1.5 OOMs of cost from a hypothetical long-horizon training run… but I’m still nervous. Put it this way: Humans are FAR from optimal at long-horizon tasks anyway. There is no reason to think that we are as good as a human-brain-sized neural net trained for 10^14 data points each one the length of a subjective human lifetime. There’s every reason to think that neural net would instead be dramatically better than us. What sorts of things does an AI need to do to be APS-AI? Planning, strategically aware… arguably GPT-3 can already do those things, it just can’t do them well. But once it’s bigger, and fine-tuned… maybe it’ll be able to go toe-to-toe with humans, while still being far from optimal. Or even if it can’t be APS-AI, maybe it can be smart enough to accelerate AI R&D. (One could also imagine making a brain bigger than the human brain, and then pre-training it, and then using it as an oracle… ask it to predict which AI architecture will yield the best results, etc.) I say 60%.
d. I think bureaucracies of neural nets are pretty brittle and finicky now, but (a) that might change in the future as we get more practice with them, and (b) I get the impression that they do reasonably well when you can fine-tune them / retrain them into their new roles. See e.g. the recent OpenAI crawl-the-internet-and-do-research-with-which-to-answer-questions bot. I say 25%.
e. Let's suppose there have been 2 paradigm shifts in the last 60 years of AI research. Seems like the recent shift to deep learning was one. Seems very plausible that if we have a new shift that is to deep learning what deep learning was to the previous shitty stuff in the early 2000s, then we are going to get AI-PONR very shortly thereafter. So anyhow maybe this suggests something like a 33% chance of another such shift by 2030, going on base rates? Could go down if you think there have been fewer paradigm shifts in the past, could go up if you think there have been more. I'd love to see someone measure the recent increase in investment and calculate whether we are more likely to get paradigm shifts now than at any time in the past, taking into account ideas-getting-harder-to-find effects. (Huh, you know, I don't think I realized how high the chance of paradigm shift is until now... I guess this means my timelines should be shorter...)
f. I’m not sure which category this fits in, but what about just scaling up EfficientZero? As far as I know its architecture is pretty damn general, not game-specific at all. You should be able to hook it up to a robot or a chatbot (perhaps with a pre-trained model like GPT-3 as a seed) and let rip. Napkin math time: Instead of spending 1 day training on hardware that costs $10,000, let's make a custom supercomputer that is 6 OOMs bigger. Cost: $10B. Run it for 100 days instead of 1. That gives us 8 OOMs more compute to work with than EfficientZero had. Use 5 OOMs to increase the subjective training time from 2 hours to 22 years. Use 3 OOMs to increase parameter count. Maybe this setup would work for something much more complex than Atari… I’m gonna say 20%.
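The napkin math in (f) can be rechecked with a few lines of Python. This is only a sketch of the arithmetic as stated: it assumes, as the text implicitly does, that available compute scales with hardware size times run length. The function name `napkin_math` and the `HOURS_PER_YEAR` constant are illustrative, not taken from any source.

```python
import math

HOURS_PER_YEAR = 24 * 365  # assumption: calendar years, ignoring leap days


def napkin_math(base_cost_usd=10_000, hardware_ooms=6, base_days=1,
                scaled_days=100, base_subjective_hours=2,
                ooms_for_training_time=5, ooms_for_parameters=3):
    """Recompute the scale-up figures quoted in item (f)."""
    # Hardware that is 6 OOMs bigger is assumed to cost 6 OOMs more.
    scaled_cost_usd = base_cost_usd * 10 ** hardware_ooms
    # Running 100 days instead of 1 adds log10(100) = 2 OOMs of compute.
    duration_ooms = math.log10(scaled_days / base_days)
    total_ooms = hardware_ooms + duration_ooms
    # Spending 5 OOMs on subjective training time, starting from 2 hours.
    subjective_hours = base_subjective_hours * 10 ** ooms_for_training_time
    return {
        "scaled_cost_usd": scaled_cost_usd,
        "total_ooms": total_ooms,
        "subjective_years": subjective_hours / HOURS_PER_YEAR,
        # True when the two ways of spending compute use up the whole budget.
        "ooms_budget_fully_spent": math.isclose(
            total_ooms, ooms_for_training_time + ooms_for_parameters
        ),
    }


if __name__ == "__main__":
    for name, value in napkin_math().items():
        print(name, value)
```

With the default inputs this gives a cost of $10B, 8 OOMs of extra compute, and a bit under 23 subjective years, which matches the rounded "22 years" in the text.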
Anyhow, all of this is off the cuff, out of my ass, etc. but it really does feel like it adds up to significantly more than 50% to me, more like 80% or so. So then why aren’t my timelines 80% by 2030? Well, remember all of this was conditioning on “algorithmic and hardware progress have continued at the rates Ajeya projects and so has willingness-to-spend. Also let's suppose the scaling laws have continued to hold.” Also I wish to be humble etc. and defer to people like yourself and Ajeya and Paul at least a little bit.
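As a rough sanity check on how credences like these combine, here is a minimal Python sketch. It is not the method used above: treating the paths as statistically independent is an assumption made purely for illustration, as is the 10-year window used for the base-rate calculation in (e). All function names are hypothetical.

```python
from math import exp, prod


def prob_at_least_one(probs):
    """P(at least one event occurs), ASSUMING the events are independent."""
    return 1 - prod(1 - p for p in probs)


def base_rate_linear(events, span_years, window_years):
    """Simple proportional base rate: events per year times window length."""
    return events / span_years * window_years


def base_rate_poisson(events, span_years, window_years):
    """P(at least one event in the window) under a constant-rate Poisson model."""
    return 1 - exp(-events / span_years * window_years)


# Credences quoted in the text for paths (a) through (f).
path_credences = {"a": 0.65, "b": 0.20, "c": 0.60, "d": 0.25, "e": 0.33, "f": 0.20}
# Sub-credences quoted in (a): persuasion, R&D acceleration, unknown.
sub_credences_a = [0.50, 0.25, 0.15]

combined_a = prob_at_least_one(sub_credences_a)
independent_total = prob_at_least_one(path_credences.values())
# If the paths were perfectly correlated, the disjunction could be no more
# likely than its most likely member, so this is a lower bound.
correlated_floor = max(path_credences.values())

# Assumption: 2 shifts in 60 years, projected over a 10-year window.
shift_linear = base_rate_linear(2, 60, 10)
shift_poisson = base_rate_poisson(2, 60, 10)

if __name__ == "__main__":
    print(f"(a) combined, independent: {combined_a:.3f}")
    print(f"all paths, independent:    {independent_total:.3f}")
    print(f"all paths, lower bound:    {correlated_floor:.2f}")
    print(f"(e) linear base rate:      {shift_linear:.3f}")
    print(f"(e) Poisson base rate:     {shift_poisson:.3f}")
```

Under independence the six paths combine to roughly 95%, while perfect correlation would give 65%, so the "80% or so" figure sits between the two. Likewise the 65% quoted for (a) is slightly below the 68% that independence would give, and the 33% in (e) matches the linear base rate, with the Poisson version coming out a little lower at about 28%.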
My promised list: Here are some example observations that would go a long way towards lengthening my timelines, e.g. to 20-30 years instead of 10:
1. AI winter. Progress slows, investment dries up. People generally agree that the amount of compute used for the largest training runs will stop growing for the next decade or so, rather than grow by a couple OOMs as is currently expected.
2. Roadblock that doesn't quickly fall: My brief (5-year) experience watching AI progress is a story of many repeated instances of purported roadblocks being smashed through almost as soon as I hear about them. E.g. transfer learning, imperfect-information games, common sense understanding, reasoning, real-time games, sim-2-real, ... the list goes on. Most recently people I respect a lot (Ajeya, Paul, etc.) taught me about horizon lengths and data inefficiency and I came to believe that modern AI methods were fundamentally less data-efficient than the human brain... but then along came EfficientZero! So, I'd lengthen my timelines if someone clearly articulates a major roadblock to all important milestones (AGI/TAI/APS-AI/etc.), DeepMind and OpenAI etc. throw themselves at overcoming it for a few years, and fail. (Maybe this has already happened and I haven't heard about it because of publication bias?) (Also it's important that the roadblock plausibly blocks us from AGI/TAI/APS-AI/etc. Data-efficiency is on thin ice by this metric because plausibly even if AI is dramatically less data-efficient than humans there might still be a way to make AGI/TAI/APS-AI/etc. out of it. Causal reasoning and common sense and imperfect-information games do much better by this metric; too bad we smashed through them so easily.)
3. Solid evidence that human intelligence comes from "special sauce" that needs to either be painstakingly imitated via much greater knowledge of neuroscience, or brute-force rediscovered via at least genome-anchor-like levels of artificial evolution. As far as I know there isn't really any solid evidence for the special sauce hypothesis; if actually AGI is really easy and there is no special sauce whatsoever, my brain would still look exactly the way it does. (To date there has been no experiment along the lines of “make a 100T parameter dense model and train it for a billion time steps,” not even close.) The best piece of evidence I know of is along the lines of "If there's no special sauce, then we should be able to make AIs as smart as animal brains of similar size, and we can't." Except that so far it seems like we can actually? We can make image recognizers better than bee brains, for example, as OpenPhil's investigation showed. I haven't yet heard of an intellectual task tiny-brained animals can do that we know current AI methods can't also do.
4. People trying to build AGI with a track record of success change their minds and start disagreeing with me about timelines: My impression is that the people actually trying to build AGI, especially the ones at the cutting edge with the best track records, tend to have even shorter timelines than me!