Over at The 80,000 Hours Podcast we just published an interview that is likely to be of particular interest to people who identify as involved in the effective altruism community: Tom Davidson on how quickly AI could transform the world.
You can click through for the audio, a full transcript and related links. Below is the episode summary and some key excerpts.
By the time that the AIs can do 20% of cognitive tasks in the broader economy, maybe they can already do 40% or 50% of tasks specifically in AI R&D. So they could have already really started accelerating the pace of progress by the time we get to that 20% economic impact threshold.
At that point you could easily imagine that really it’s just one year, you give them a 10x bigger brain. That’s like going from chimps to humans — and then doing that jump again. That could easily be enough to go from [AIs being able to do] 20% [of cognitive tasks] to 100%, just intuitively. I think that’s kind of the default, really.
It’s easy to dismiss alarming AI-related predictions when you don’t know where the numbers came from.
For example: what if we told you that within 15 years, it’s likely that we’ll see a 1,000x improvement in AI capabilities in a single year? And what if we then told you that those improvements would lead to explosive economic growth unlike anything humanity has seen before?
You might think, “Congratulations, you said a big number — but this kind of stuff seems crazy, so I’m going to keep scrolling through Twitter.”
But this 1,000x yearly improvement is a prediction based on real economic models created by today’s guest Tom Davidson, Senior Research Analyst at Open Philanthropy. By the end of the episode, you’ll either be able to point out specific flaws in his step-by-step reasoning, or have to at least consider the idea that the world is about to get — at a minimum — incredibly weird.
As a teaser, consider the following:
Developing artificial general intelligence (AGI) — AI that can do 100% of cognitive tasks at least as well as the best humans can — could very easily lead us to an unrecognisable world.
You might think having to train AI systems individually to do every conceivable cognitive task — one for diagnosing diseases, one for doing your taxes, one for teaching your kids, etc. — sounds implausible, or at least like it’ll take decades.
But Tom thinks we might not need to train AI to do every single job — we might just need to train it to do one: AI research.
And building AI capable of doing research and development might be a much easier task — especially given that the researchers training the AI are AI researchers themselves.
And once an AI system is as good at accelerating future AI progress as the best humans are today — and we can run billions of copies of it round the clock — it’s hard to make the case that we won’t achieve AGI very quickly.
To give you some perspective: 17 years ago we saw the launch of Twitter, the release of Al Gore’s An Inconvenient Truth, and your first chance to play the Nintendo Wii.
Tom thinks that if we have AI that significantly accelerates AI R&D, then it’s hard to imagine not having AGI 17 years from now.
Host Luisa Rodriguez gets Tom to walk us through his careful reports on the topic, and how he came up with these numbers, across a terrifying but fascinating three hours.
Luisa and Tom also discuss:
- How we might go from GPT-4 to AI disaster
- Tom’s journey from finding AI risk to be kind of scary to really scary
- Whether international cooperation or an anti-AI social movement can slow AI progress down
- Why it might take just a few years to go from pretty good AI to superhuman AI
- How quickly the number and quality of computer chips we’ve been using for AI have been increasing
- The pace of algorithmic progress
- What ants can teach us about AI
- And much more
Get this episode by subscribing to our podcast on the world’s most pressing problems and how to solve them: type ‘80,000 Hours’ into your podcasting app. Or read the transcript below.
Producer: Keiran Harris
Audio mastering: Simon Monsour and Ben Cordell
Transcriptions: Katy Moore
Going from GPT-4 to AI takeover
Tom Davidson: We can try and think about this system which is trying to solve these math problems. Maybe the first version of the AI, you just say, “We want you to solve the problem using one of these four techniques. We want to use one of these seven methodologies on those techniques to get to an answer.” And that system is OK, but then someone comes along and realises that if you let the AI system do an internet search and plan its own line of attack on the problem, then it’s able to do a better job in solving even harder and harder problems. So you say, “OK, we’ll allow the AI to do that.”
Then over time, in order to improve performance, you give it more and more scope to kind of be creative in planning how it’s going to attack each different kind of problem. One thing that might happen internally, inside the AI’s own head, is that the AI may end up developing just an inherent desire to just get the answer to this math question as accurately as possible. That’s something which it always gets rewarded for when it’s being trained. Maybe it could be thinking, “I actually want the humans to be happy with my answer.” But another thing it might end up thinking is, “You know what? What I really want is just to get the answer correct.” And the kind of feedback that we humans are giving that system doesn’t distinguish between those two possibilities.
So maybe we get unlucky, and maybe the thing that it wants is to just really get the answer correct. And maybe the way that the AI system is working internally is, it’s saying, “OK, that’s my goal. What plan can I use to achieve that goal?” It’s creatively going and looking for new approaches by googling information. Maybe one time it realises that if it hacked into another computing cluster, it could use those computations to help it solve the problem. And it does that, and no one realises — and then that reinforces the fact that it is now planning on such a broad scale to try and achieve this goal.
Maybe it’s much more powerful at a later time, and it realises that if it kills all humans, it could have access to all the supercomputers — and then that would help it get an even more accurate answer. Because the thing it cares about is not pleasing the humans — the thing it happened to care about internally was actually just getting an accurate answer — then that plan looks great by its own lights. So it goes and executes the plan.
Luisa Rodriguez: Why couldn’t you just give the system an instruction that didn’t also come with rewards? Is it impossible to give an AI system a reward for every problem it solves by not hurting anyone?
Tom Davidson: I think that would help somewhat. The problem here is that there are kind of two possibilities, and it’s going to be hard for us to give rewards that ensure that one of the possibilities happens and not the second possibility.
Here are the two possibilities: One possibility is the AI really doesn’t want to hurt humans, and it’s just going to take that into account when solving the math problem. That’s what we want to happen. The other possibility is that the AI only cares about solving the math problem and doesn’t care about humans at all, but it understands that humans don’t like it when it hurts them, and so it doesn’t hurt humans in any obvious way.
Why AGI could lead to explosive economic growth
Tom Davidson: Today there are maybe tens of millions of people whose job it is to discover new and better technologies, working in science and research and development. They’re able to make a certain amount of progress each year. It’s their work that helps us get better computers and phones, and discover better types of solar panels, and drives all these improvements that we’re seeing.
But like we’ve been talking about, shortly after AGI there are going to be billions of top-human-researcher equivalents — a scientific workforce of AIs. And if you imagine that workforce — or half of it, or just 10% of it — working on trying to advance technology and come up with new ideas, then you now have 10 or 100 times the effort going into that activity. And these AIs are also able to think maybe 10 or 100 times as quickly as humans can think.
And you’re able to take the very best AI researchers and copy them. So if you think that scientific progress is overwhelmingly driven by a smaller number of really brilliant people with brilliant ideas, then we just need one of them and we can copy them. They might be happy to just work much harder than humans work. It might be possible to focus them much more effectively on the most important types of R&D, whereas humans maybe are more inclined to follow their interests, even when it’s not the most useful thing to be researching.
All of those things together just mean that we’ll be generating 100 times as many new good ideas and innovations each year compared with today, and then that would drive the development of technologies to be at least 10 times faster than today.
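The rough multiplication above can be made explicit. The numbers below are illustrative placeholders from the conservative end of the ranges Tom mentions, not outputs of his actual model:

```python
# Back-of-the-envelope arithmetic only; every number here is an assumption
# taken from the conservative end of the ranges mentioned in the interview.
workforce_multiplier = 10   # 10x as many researcher-equivalents as today
thinking_speed = 10         # each thinks ~10x as fast as a human

# Multiplying the factors gives ~100x today's effective research effort.
effective_research_input = workforce_multiplier * thinking_speed
print(effective_research_input)  # 100
```

Taking the high end of both ranges (100x each) would give 10,000x instead — which is why even heavy discounting of these factors still leaves a dramatic speedup.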
Tom Davidson: I think this is the default. You could raise objections to the argument I gave, but I think it’s mostly possible to answer them. So you could say that discovering new technologies isn’t just about thinking and coming up with new ideas; you also need to do experiments. I think you can answer that objection by saying: that’s right, we will need to do experiments.
Luisa Rodriguez: And that’s like testing a drug on humans, and maybe it takes five years or something to really check that it’s safe and effective?
Tom Davidson: Right. Or you’ve designed a new solar panel, and you want to test its performance in a variety of conditions. Or you’re running some experiments to see what happens when you combine these two chemicals together, because you’re not able to predict it in advance.
But if you have a billion AIs trying to push forward R&D, and they’re bottlenecked on needing to do these experiments, then they’ll be putting in a huge amount of effort to make those experiments happen as efficiently as possible. Whereas today we might be using the lab 50% of the time we could be, and running a whole bunch of experiments, analysing them afterwards, and learning a little from each one — without trying to cram as much into each experiment as is humanly possible. If these AIs are limited on experiments, then they’re going to be spending months and months meticulously planning the micro details of every single experiment, so they can get as much information as possible out of each one.
Why explosive growth is plausible despite sounding crazy
Tom Davidson: I agree it seems really crazy, and I think it’s very natural and understandable to just not believe it when you hear the arguments.
I think what’s at the heart of it for me is that the human brain is a physical system. There’s nothing magical about it. It isn’t surprising that we develop machines that can do what the human brain can do at some point in the process of technological discovery. To be honest, naively, the next couple of decades is roughly when you might have expected it to happen. We’ve had computers for 70-odd years. It’s been a decade since we started pouring loads and loads of compute into training AI systems, and we’ve realised that that approach works really, really well. If you ask, “When might humans develop machines that can do what the human brain can do?” you kind of think it might be in the next few decades.
I think if you just sit with that fact — that there are going to be machines that can do what the human brain can do; and you’re going to be able to make those machines much more efficient at it; and you’re going to be able to make even better versions of those machines, 10 times better versions; and you’re going to be able to run them day and night; and you’re going to be able to build more — when you sit with all that, I do think it gets pretty hard to imagine a future that isn’t very crazy.
Another perspective is just zooming out even further, and looking at the whole arc of human history. If you’d told hunter-gatherers — who only knew the 50 people in their group, and who had been hunting using techniques and tools that, as far as they knew, had been passed down for eternity, generation to generation, doing their rituals — that in a few thousand years there were going to be huge empires building the Egyptian pyramids, and massive armies, and the ability to go to a market and give people pieces of metal in exchange for all kinds of goods, it would have seemed totally crazy.
And if you’d have told those people in those markets that there’s going to be a future world where every 10 years major technological progress is going to be coming along, and we’re going to be discovering drugs that can solve all kinds of diseases, and you’re going to be able to get inside a box and land on the other side of the Earth — again, they would have just thought you were crazy.
While it seems that we understand what’s happening, and that progress is pretty steady, that has only been true for the last 200 years — and zooming out, it’s actually the norm throughout the longer run of history for things to go in a totally surprising and unpredictable direction, or a direction that would have seemed totally bizarre and unpredictable to people naively at that time.
Why AI won't go the way of nuclear power
Tom Davidson: I don’t have a good understanding of what happened [with nuclear power], but I think there were some big catastrophes with nuclear power, and then it became very stigmatised. And the regulatory requirements around it, the safety requirements, became very large — much larger, really, than was reasonable, given that fossil fuel energy has damaging health consequences as well through air pollution. As a result, a mixture of stigma and the additional cost from all that regulation prevented it from being rolled out. But I do think there are a fair few very significant disanalogies between that case and the case of AI.
One thing is that there were other sources of energy that were available, and so it wasn’t too costly to be like, “We’re not going to use nuclear; we’re going to use fossil fuels instead.” Even the green, climate-change-concerned people could think about developing solar panels and renewable energies. In the AI case, there is going to be no alternative: there’s going to be no alternative technology which can solve all illness, and which can grant your nation massive national security and military power, and that can solve climate change. This is going to be the only option. So that’s one disanalogy.
Another disanalogy is the cost factor. With nuclear power, it’s become more expensive over time due to regulations, and that’s been a big factor in it not being pursued. But we’ve been discussing the specifics around these cost curves with compute and these algorithmic progress patterns, which suggest that the upfront cost of training AGI is going to be falling really pretty quickly over time. Even if initially, you put in loads of regulations which make it very expensive, it’s really not going to be long until it’s 10x cheaper. So permanently preventing it, when it’s becoming cheaper and cheaper at such a high rate, is going to be really difficult.
Third is the size of the gains from this technology compared to nuclear power. France adopted nuclear power and it was somewhat beneficial — it now gets a lot of its power from nuclear energy, with no climate impacts, and that’s great — but it’s not as if France is visibly and indisputably doing amazingly well as a country because it’s got nuclear power. It’s a modest addition. Maybe it makes things look a little bit better.
By contrast, if one country is progressing technology at the normal rate, and another country comes along and starts using these AIs and robots even a little, you’re going to see very significant differences in how their overall technology and prosperity and military power is progressing. As countries dial up how much they’re allowing AIs to do this work, you’re going to see bigger and bigger differences. Ultimately, advancing technology at our pace versus advancing it 30 times faster becomes, over the course of just a few years, a massive difference in the sophistication of your country’s technology and its ability to solve all kinds of social and political problems.
Why AI takeoff might be shockingly fast
Tom Davidson: The conclusion from my report is pretty scary. The bottom line is that my median guess is that it would take just a small number of years to go from that 20% to the 100%. I think it’s equally likely to happen in less than three years as in more than three years. So a pretty abrupt and quick change is the median.
Some quick things about why it’s plausible. Each year, once you take better algorithms and increased compute into account, we’re currently training AIs that have three times bigger brains than the year before. So — and this is a really rough way to think about it — a brain three times smaller than a human’s is about chimpanzee size.
And right now it’s humans doing all the work to improve those AI systems. As we get close to AIs that match humans, we’ll increasingly be using AI systems to improve AI algorithms and design better AI chips. Overall, I expect that pace to accelerate, absent a specific effort to slow down. Rather than three times bigger brains each year, it’s going to go faster and faster: five times bigger brains each year, 10 times bigger brains each year. I think that already makes it plausible that there could be just a small number of years where this transition happens — where AIs go from much worse than humans to much better.
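The compounding Tom describes is easy to check with a quick sketch. The multipliers here are the rough figures quoted above — illustrative only, not outputs of Davidson’s takeoff model:

```python
# Illustrative compounding only — the yearly multipliers are the rough
# figures quoted in the interview, not outputs of a real forecasting model.
def cumulative_growth(yearly_multipliers):
    """Cumulative factor on 'effective brain size' after each year."""
    total = 1
    trajectory = []
    for m in yearly_multipliers:
        total *= m
        trajectory.append(total)
    return trajectory

# A steady 3x/year versus a hypothetical accelerating schedule.
steady = cumulative_growth([3, 3, 3, 3, 3])           # [3, 9, 27, 81, 243]
accelerating = cumulative_growth([3, 5, 10, 20, 40])  # [3, 15, 150, 3000, 120000]

print(steady[-1], accelerating[-1])  # 243 120000
```

Five years of a steady 3x gives roughly a 250x jump; letting the multiplier itself grow turns the same five years into a jump of five orders of magnitude — which is the intuition behind “faster and faster.”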
To add in another factor, I think it’s likely that AIs are going to be automating AI research itself before they’re automating things in most of the economy. Because those are the tasks and workflows that AI researchers themselves really understand, they’d be best placed to use AIs effectively there — there aren’t going to be delays in rolling it out, or trouble finding customers. And the task of AI research is quite similar to what language models are currently trained to do. They’re currently trained to predict the next token on the internet, which means they’re particularly well suited to text-based tasks. The task of writing code is one such task, and there is lots of data on examples of code writing.
Already we’re seeing that with GPT-4 and other systems like that, people are becoming much more interested in AI, much more willing to invest in AI. The demand for good AI researchers is going up. The wages for good AI researchers are going up. AI research is going to be a really financially valuable thing to automate.
If you’re paying $500,000 a year to one of your human research engineers — which is a lot lower than what some of these researchers are earning — then if you can manage to get your AI system to double their productivity, that’s way better than doubling the productivity of someone who works in a random other industry. Just the straightforward financial incentive as the power of AI becomes apparent will be towards “Let’s see if we can automate this really lucrative type of work.”
That’s another reason to think that we get the automation much earlier on the AI side than on the general economy side — and that by the time we’re seeing big economic impacts, AI is already improving at a blistering pace, potentially.
Why it's so important to build trust between labs
Tom Davidson: In terms of plans for making the whole thing go well, it’s especially scary, because a really important part of the plan, from my perspective, would be to go especially slowly when we’re around the human level — so that we can do loads of experiments and loads of scientific investigation into this human-level AI: “Is it aligned if we use this technique? What about if we try this other alignment technique — does it then seem aligned?” Just really making sure we fully understand the science of alignment, can try out lots of different techniques, and can develop reliable tests — ones that are hard to game — for whether an alignment technique has worked.
Luisa Rodriguez: The kind of thing that ARC has done with GPT-4, for example.
Tom Davidson: Exactly. I think if we only have a few months through the human-level stage, that stuff becomes really difficult to do without significant coordination in advance by labs. I think that there are really important implications of this fast transition in terms of setting up a kind of governance system, which can allow us to go slowly despite the technical possibilities existing to go very fast.
Luisa Rodriguez: That makes sense. I feel like I’ve had some background belief that was like, obviously when we’ve got AI systems that can do things humans can do, people are going to start freaking out, and they’re going to want to make sure those systems are safe. But if it takes months to get there and then within another few months we’re already well beyond human capabilities, then no one’s going to have time to freak out — or it’ll be too late. I mean, even if we had the seven years left in the decade, that sounds hard enough.
Tom Davidson: Yeah. I agree.
Luisa Rodriguez: So a takeaway is that we really need to start slowing down or planning now. Ideally both.
Tom Davidson: Yeah. And we’ll need the plans we make to really enable there to be mutual trust that the other labs are also slowing down. Because if it only takes six months to make your AIs 10 or 100 times as smart, then you’re going to need to be really confident that the other labs aren’t doing that in order to feel comfortable slowing down yourself.
Luisa Rodriguez: Right. If it was going to take 10 years and you noticed three months in that another lab is working on it, you’d be like, “Eh, we can catch up.” But if it’s going to take six months and you’re three months in, you’ve got no hope — so maybe you’ll just spend those first three months secretly working on it to make sure that doesn’t happen, or just not agree to do the slowdown.
Tom Davidson: Yeah.
Luisa Rodriguez: Oh, these are really hard problems. I mean, it feels very prisoner’s dilemma-y.
Tom Davidson: I’m hoping it’s going to be more like an iterated prisoner’s dilemma, where there are multiple moves that the labs make, one after the other, and they can see whether the other labs are cooperating. In an iterated prisoner’s dilemma, it ultimately makes sense for everyone to cooperate — because that way, the others can see you cooperating, then they cooperate, and everyone ends up cooperating.
One thing is if you could set up ways for labs to easily know whether the other labs are indeed cooperating or not, kind of week by week. That turns it into a more iterated prisoner’s dilemma, and makes it easier to achieve a kind of good outcome.
Luisa Rodriguez: Yeah, that makes sense. I imagine it’s the case that the more iteration you get in an iterated prisoner’s dilemma, the better the incentives are to cooperate. And so just by making the timelines shorter, you make it harder to get these iterations that build trust.
Tom Davidson: Yeah, I think that’s right.
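To see concretely why more observable rounds favour cooperation, here is a minimal iterated prisoner’s dilemma using the standard textbook payoff matrix — nothing in this sketch is specific to AI labs:

```python
# Standard prisoner's dilemma payoffs: (my_move, their_move) -> my payoff.
# 'C' = cooperate, 'D' = defect.
PAYOFF = {
    ('C', 'C'): 3, ('C', 'D'): 0,
    ('D', 'C'): 5, ('D', 'D'): 1,
}

def tit_for_tat(opponent_moves):
    """Cooperate first, then copy whatever the opponent did last."""
    return 'C' if not opponent_moves else opponent_moves[-1]

def always_defect(opponent_moves):
    return 'D'

def play(strategy_a, strategy_b, rounds):
    """Total payoffs for two strategies over `rounds` observable moves."""
    a_seen, b_seen = [], []  # each player's record of the opponent's moves
    a_score = b_score = 0
    for _ in range(rounds):
        a_move = strategy_a(a_seen)
        b_move = strategy_b(b_seen)
        a_score += PAYOFF[(a_move, b_move)]
        b_score += PAYOFF[(b_move, a_move)]
        a_seen.append(b_move)
        b_seen.append(a_move)
    return a_score, b_score

print(play(tit_for_tat, tit_for_tat, 10))    # (30, 30): sustained cooperation
print(play(always_defect, tit_for_tat, 10))  # (14, 9): one exploit, then mutual defection
```

In a single round, defecting dominates (5 beats 3). But over ten rounds of visible moves, mutual cooperation earns 30 apiece, while defection yields a one-off gain followed by a low-payoff rut — which is why mechanisms letting labs verify each other’s behaviour week by week would push the game toward the cooperative outcome.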
What ants might teach us about deploying AI safely
Tom Davidson: In an ant colony, ants are smarter than like a human cell is: they’re kind of self-contained units that eat and do tasks by themselves, and they’re pretty autonomous. But the ants are still pretty dumb: no ant really knows that it’s part of a colony, or knows that the colony has certain tasks that it needs to do, and that it has to help out with the colony efforts. It’s more like a little robot that’s bumping into other ants and getting signals and then adjusting its behaviour based on that interaction.
Luisa Rodriguez: It’s not like a company, where the different people in the company are like, “My job is marketing,” and they have a basic picture of how it all fits together. They’re much more like if a person at a company doing marketing was just like, “I don’t know why I do it, I just do it.”
Tom Davidson: Yeah, exactly. Another disanalogy with a company is that in a company, there’s someone at the top coordinating the whole thing — whereas with ants, no one is coordinating it, including the queen. There’s no management system; it’s just that all the hundreds of thousands of ants have their individual instincts for what they do when they bump into each other, what they do when they bump into food, and what they do when they realise there’s not as much food as there needs to be.
And by all of the ants following their own individual instincts, it turns out that they act as if they were a fairly well-coordinated company that’s ensuring that there are some ants going to get food, and some ants that are keeping the nest in order, and some ants that are feeding the young. That coordination happens almost magically, and emerges out of those individual ant interactions.
One example of how this works: if an ant comes across the body of a dead ant, and there’s another dead body nearby, it will tend to move the body to be close to the other one. That’s just an instinct it has. If there’s one pile of three dead ants and another pile of two, it will tend to carry the extra body towards the bigger pile — the pile of three. If all the ants have those instincts, then a sprawling mass of dead bodies scattered everywhere will gradually be collected into a small number of piles.
They don’t have to know that the whole point of this instinct is to clear the ground so that it’s easier to do work in the future; it’s just an instinct they have. They don’t have to know that when everyone follows that instinct, this is the resultant pattern of behaviour.
This is an example of a system where lots of less-clever individuals follow their local rules and do their local tasks, and what emerges is a very coherent and effective system for gathering food, defending against predators, and raising the young.
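A toy simulation makes this emergence visible. The sketch below is a deliberately oversimplified model of the corpse-piling instinct — an “ant” repeatedly encounters two piles at random and moves one body from the smaller towards the larger — not a faithful biological simulation:

```python
import random

def step(piles, rng):
    """One ant encounter: between two randomly found piles, move a body
    from the smaller pile towards the bigger one (the instinct above)."""
    i, j = rng.sample(range(len(piles)), 2)
    small, big = (i, j) if piles[i] <= piles[j] else (j, i)
    piles[big] += 1
    piles[small] -= 1
    if piles[small] == 0:
        piles.pop(small)  # an emptied spot is no longer a pile

def simulate(n_bodies, steps, seed=0):
    """Start with every body as its own 'pile' of one, then let the
    local rule run; bodies consolidate with no central coordination."""
    rng = random.Random(seed)
    piles = [1] * n_bodies
    for _ in range(steps):
        if len(piles) < 2:
            break
        step(piles, rng)
    return piles

final = simulate(40, 200)
print(len(final))  # fewer piles than the 40 scattered bodies we began with
```

No ant in the model knows a pile exists, let alone why piling is useful — each encounter applies one local rule, yet the scattered bodies end up concentrated into a handful of large piles.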
An analogy would be: maybe we think it’s pretty dangerous to train individual AIs that are very smart, but it might be safer to set up a team of AIs, where each AI does its own part and doesn’t necessarily know how its work fits into the broader whole. You can maybe still get a lot out of that kind of disconnected team of specialised AIs that just take their inputs and produce their outputs, without much understanding of the broader context. And maybe that would be a safer way to develop advanced AI capabilities than training one super-smart AI megabrain.