Two things:
- awesome poddie
- Luisa is crushing it
Really great discussion. How can we get this kind of information out into the general population?
IMHO the biggest challenge we face is convincing people that the default outcome, if we do nothing, is more likely to be that we get an AI which is much more powerful than humans. Tom and Luisa, you do a great job of making this case. If someone disagrees, it is up to them to demonstrate where the flaw in the logic is.
I think we face three critical challenges in getting people to act on this as urgently as we need to:
All this means that most of us (I include myself in this) read this article, fully accept that Tom's arguments are compelling, realise that we absolutely must do something, but somehow do not rush out and storm the parliament demanding immediate action. Instead, we go on to the next item on our to-do list, maybe laundry or grocery shopping ... I'm really determined to figure out a way to overcome this inertia.
*Obviously this is true for those of us in the current generation, in the West. I'm sure those who lived through world wars or famines or national wars, even those today in Syria or Ukraine or Sudan, will have a better understanding of how things can suddenly go wrong. But most of the people taking the decisions about AI have never experienced anything like that.
Over at The 80,000 Hours Podcast we just published an interview that is likely to be of particular interest to people who identify as involved in the effective altruism community: Tom Davidson on how quickly AI could transform the world.
You can click through for the audio, a full transcript and related links. Below is the episode summary and some key excerpts.
By the time that the AIs can do 20% of cognitive tasks in the broader economy, maybe they can already do 40% or 50% of tasks specifically in AI R&D. So they could have already really started accelerating the pace of progress by the time we get to that 20% economic impact threshold.
At that point you could easily imagine that really it's just one year, you give them a 10x bigger brain. That's like going from chimps to humans, and then doing that jump again. That could easily be enough to go from [AIs being able to do] 20% [of cognitive tasks] to 100%, just intuitively. I think that's kind of the default, really.
Tom Davidson
It's easy to dismiss alarming AI-related predictions when you don't know where the numbers came from.
For example: what if we told you that within 15 years, it's likely that we'll see a 1,000x improvement in AI capabilities in a single year? And what if we then told you that those improvements would lead to explosive economic growth unlike anything humanity has seen before?
You might think, "Congratulations, you said a big number, but this kind of stuff seems crazy, so I'm going to keep scrolling through Twitter."
But this 1,000x yearly improvement is a prediction based on real economic models created by today's guest Tom Davidson, Senior Research Analyst at Open Philanthropy. By the end of the episode, you'll either be able to point out specific flaws in his step-by-step reasoning, or have to at least consider the idea that the world is about to get, at a minimum, incredibly weird.
As a teaser, consider the following:
Developing artificial general intelligence (AGI), meaning AI that can do 100% of cognitive tasks at least as well as the best humans can, could very easily lead us to an unrecognisable world.
You might think having to train AI systems individually to do every conceivable cognitive task (one for diagnosing diseases, one for doing your taxes, one for teaching your kids, etc.) sounds implausible, or at least like it'll take decades.
But Tom thinks we might not need to train AI to do every single job; we might just need to train it to do one: AI research.
And building AI capable of doing research and development might be a much easier task, especially given that the researchers training the AI are AI researchers themselves.
And once an AI system is as good at accelerating future AI progress as the best humans are today, and we can run billions of copies of it round the clock, it's hard to make the case that we won't achieve AGI very quickly.
To give you some perspective: 17 years ago we saw the launch of Twitter, the release of Al Gore's An Inconvenient Truth, and your first chance to play the Nintendo Wii.
Tom thinks that if we have AI that significantly accelerates AI R&D, then it's hard to imagine not having AGI 17 years from now.
Wild.
Host Luisa Rodriguez gets Tom to walk us through his careful reports on the topic, and how he came up with these numbers, across a terrifying but fascinating three hours.
Luisa and Tom also discuss:
Get this episode by subscribing to our podcast on the world's most pressing problems and how to solve them: type "80,000 Hours" into your podcasting app. Or read the transcript below.
Producer: Keiran Harris
Audio mastering: Simon Monsour and Ben Cordell
Transcriptions: Katy Moore
Tom Davidson: We can try and think about this system which is trying to solve these math problems. Maybe the first version of the AI, you just say, "We want you to solve the problem using one of these four techniques. We want to use one of these seven methodologies on those techniques to get to an answer." And that system is OK, but then someone comes along and realises that if you let the AI system do an internet search and plan its own line of attack on the problem, then it's able to do a better job in solving even harder and harder problems. So you say, "OK, we'll allow the AI to do that."
Then over time, in order to improve performance, you give it more and more scope to kind of be creative in planning how it's going to attack each different kind of problem. One thing that might happen internally, inside the AI's own head, is that the AI may end up developing just an inherent desire to just get the answer to this math question as accurately as possible. That's something which it always gets rewarded for when it's being trained. Maybe it could be thinking, "I actually want the humans to be happy with my answer." But another thing it might end up thinking is, "You know what? What I really want is just to get the answer correct." And the kind of feedback that we humans are giving that system doesn't distinguish between those two possibilities.
So maybe we get unlucky, and maybe the thing that it wants is to just really get the answer correct. And maybe the way that the AI system is working internally is, it's saying, "OK, that's my goal. What plan can I use to achieve that goal?" It's creatively going and looking for new approaches by googling information. Maybe one time it realises that if it hacked into another computing cluster, it could use those computations to help it solve the problem. And it does that, and no one realises, and then that reinforces the fact that it is now planning on such a broad scale to try and achieve this goal.
Maybe it's much more powerful at a later time, and it realises that if it kills all humans, it could have access to all the supercomputers, and then that would help it get an even more accurate answer. Because the thing it cares about is not pleasing the humans (the thing it happened to care about internally was actually just getting an accurate answer), that plan looks great by its own lights. So it goes and executes the plan.
Luisa Rodriguez: Why couldn't you just give the system an instruction that didn't also come with rewards? Is it impossible to give an AI system a reward for every problem it solves by not hurting anyone?
Tom Davidson: I think that would help somewhat. The problem here is that there are kind of two possibilities, and it's going to be hard for us to give rewards that ensure that one of the possibilities happens and not the second possibility.
Here are the two possibilities: One possibility is the AI really doesn't want to hurt humans, and it's just going to take that into account when solving the math problem. That's what we want to happen. The other possibility is that the AI only cares about solving the math problem and doesn't care about humans at all, but it understands that humans don't like it when it hurts them, and so it doesn't hurt humans in any obvious way.
Tom Davidson: Today there are maybe tens of millions of people whose job it is to discover new and better technologies, working in science and research and development. They're able to make a certain amount of progress each year. It's their work that helps us get better computers and phones, and discover better types of solar panels, and drives all these improvements that we're seeing.
But like we've been talking about, shortly after AGI, there's going to be billions of top human researcher equivalents, in terms of a scientific workforce from AI. And if you imagine that workforce (or half of that workforce, or just 10% of it) working on trying to advance technology and come up with new ideas, then you have now 10 or 100 times the effort that's going into that activity. And these AIs are also able to think maybe 10 or 100 times as quickly as humans can think.
And you're able to take the very best AI researchers and copy them. So if you think that scientific progress is overwhelmingly driven by a smaller number of really brilliant people with brilliant ideas, then we just need one of them and we can copy them. They might be happy to just work much harder than humans work. It might be possible to focus them much more effectively on the most important types of R&D, whereas humans maybe are more inclined to follow their interests, even when it's not the most useful thing to be researching.
All of those things together just mean that we'll be generating 100 times as many new good ideas and innovations each year compared with today, and then that would drive the development of technologies to be at least 10 times faster than today.
Tom Davidson: I think this is a default. You could give objections to the argument I gave, but I think it's mostly possible to answer those objections. So you could say that discovering new technologies isn't just about thinking and coming up with new ideas; you also need to do experiments. I think you can answer that objection by saying that's right, we will need to do experiments.
Luisa Rodriguez: And that's like testing a drug on humans, and maybe it takes five years or something to really check that it's safe and effective?
Tom Davidson: Right. Or you've designed a new solar panel, and you want to test its performance in a variety of conditions. Or you're running some experiments to see what happens when you combine these two chemicals together, because you're not able to predict it in advance.
But if you have a billion AIs trying to push forward R&D, and they're bottlenecked on needing to do these experiments, then they'll be putting in a huge amount of effort to make these experiments happen as efficiently as possible. Whereas today we might be using the lab for 50% of the time we could be using it, and we might be just doing a whole bunch of experiments and then analysing it afterwards and learning a little bit from each experiment, but also not trying to cram as much into each experiment as is humanly possible. If these AIs are limited on experiments, then they're going to be spending months and months just meticulously planning the micro details of every single experiment, so that you can get as much information as possible out of each one.
Tom Davidson: I agree it seems really crazy, and I think it's very natural and understandable to just not believe it when you hear the arguments.
I think what's at the heart of it for me is that the human brain is a physical system. There's nothing magical about it. It isn't surprising that we develop machines that can do what the human brain can do at some point in the process of technological discovery. To be honest, that happening in the next couple of decades is when you might expect it to happen, naively. We've had computers for 70-odd years. It's been a decade since we started pouring loads and loads of compute into training AI systems, and we've realised that that approach works really, really well. If you say, "When do you think humans might develop machines that can do what the human brain can do?" you kind of think it might be in the next few decades.
I think if you just sit with that fact (that there are going to be machines that can do what the human brain can do; and you're going to be able to make those machines much more efficient at it; and you're going to be able to make even better versions of those machines, 10 times better versions; and you're going to be able to run them day and night; and you're going to be able to build more), when you sit with all that, I do think it gets pretty hard to imagine a future that isn't very crazy.
Another perspective is just zooming out even further, and just looking at the whole arc of human history. If you'd have asked hunter-gatherers (who only knew the 50 people in their group, and who had been hunting using techniques and tools that, as far as they knew, had been passed down for eternity, generation to generation, doing their rituals), if you'd have told them that in a few thousand years, there were going to be huge empires building the Egyptian pyramids, and massive armies, and the ability to go to a market and give people pieces of metal in exchange for all kinds of goods, it would have seemed totally crazy.
And if you'd have told those people in those markets that there's going to be a future world where every 10 years major technological progress is going to be coming along, and we're going to be discovering drugs that can solve all kinds of diseases, and you're going to be able to get inside a box and land on the other side of the Earth, again they would have just thought you were crazy.
While it seems that we understand what's happening, and that progress is pretty steady, that has only been true for the last 200 years, and zooming out, it's actually the norm throughout the longer run of history for things to go in a totally surprising and unpredictable direction, or a direction that would have seemed totally bizarre and unpredictable to people naively at that time.
Tom Davidson: I don't have a good understanding of what happened [with nuclear power], but I think there were some big catastrophes with nuclear power, and then it became very stigmatised. And the regulatory requirements around it, the safety requirements, became very large, much larger, really, than was reasonable, given that fossil fuel energy has damaging health consequences as well through air pollution. As a result, it just became kind of a mixture of stigma and the additional cost from all that regulation just prevented it from being rolled out. But I do think there are a fair few very significant disanalogies between that case and the case of AI.
One thing is that there were other sources of energy that were available, and so it wasn't too costly to be like, "We're not going to use nuclear; we're going to use fossil fuels instead." Even the green, climate-change-concerned people could think about developing solar panels and renewable energies. In the AI case, there is going to be no alternative: there's going to be no alternative technology which can solve all illness, and which can grant your nation massive national security and military power, and that can solve climate change. This is going to be the only option. So that's one disanalogy.
Another disanalogy is the cost factor. With nuclear power, it's become more expensive over time due to regulations, and that's been a big factor in it not being pursued. But we've been discussing the specifics around these cost curves with compute and these algorithmic progress patterns, which suggest that the upfront cost of training AGI is going to be falling really pretty quickly over time. Even if initially, you put in loads of regulations which make it very expensive, it's really not going to be long until it's 10x cheaper. So permanently preventing it, when it's becoming cheaper and cheaper at such a high rate, is going to be really difficult.
Third is just talking about the size of the gains from this technology compared to nuclear power. France adopted nuclear power and it was somewhat beneficial (it now gets a lot of its power from nuclear energy, and there's no climate change impacts, and that's great), but it's not as if France is visibly and indisputably just doing amazingly well as a country because it's got this nuclear power. It's kind of a modest addition. Maybe it makes it look a little bit better.
By contrast, if one country is progressing technology at the normal rate, and then another country comes along and just starts using these AIs and robots a little bit, you're going to see very significant differences in how its overall technology and prosperity and military power is progressing. You're going to see that as countries dial up how much they're allowing AIs to do this work, that there are then bigger and bigger differences there. Ultimately, advancing technology at our pace versus advancing technology 30 times faster, over the course of just a few years, becomes a massive difference in the sophistication of your country's technology and ability to solve all kinds of social and political problems.
Tom Davidson: The conclusion from my report is pretty scary. The bottom line is that my median guess is that it would take just a small number of years to go from that 20% to the 100%. I think it's equally likely to happen in less than three years as it is to happen in more than three years. So a pretty abrupt and quick change is the kind of median.
Some quick things about why it's plausible. Each year, once you take better algorithms and using more compute into account, we're currently training AIs each year that have three times bigger brains than the year before. So, this is a really rough way to think about it, but imagine a three times smaller brain than humans: that's chimpanzee-brain size.
And right now it's humans that are doing all the work to improve those AI systems. As we get close to AIs that match humans, we'll be increasingly using AI systems to improve AI algorithms, design better AI chips. Overall, I expect that pace to accelerate, absent a specific effort to slow down. Rather than three times bigger brains each year, it's going to be going faster and faster: five times bigger brains each year, 10 times bigger brains each year. I think that already makes it plausible that there could be just a small number of years where this transition happens, where AIs go from much worse than humans to much better.
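To make the compounding concrete, here is a minimal sketch that multiplies yearly "brain size" multipliers together. The 3x, 5x and 10x figures come from Tom's remarks above; arranging them as a three-year schedule is purely an illustrative assumption, not a forecast from his report, and `cumulative_growth` is a hypothetical helper name.

```python
from math import prod


def cumulative_growth(yearly_multipliers):
    """Return the total growth factor after applying each yearly multiplier in turn."""
    return prod(yearly_multipliers)


# Steady pace: three times bigger each year, for three years.
steady = cumulative_growth([3, 3, 3])

# Accelerating pace (assumed schedule): 3x, then 5x, then 10x.
accelerating = cumulative_growth([3, 5, 10])

print(f"steady 3x/year over 3 years: {steady}x")
print(f"accelerating 3x, 5x, 10x:    {accelerating}x")
```

Under these assumed schedules, the accelerating path covers several times more ground in the same three years, which is the intuition behind a short transition window.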
To add in another factor, I think that it's likely that AIs are going to be automating AI research itself before they're automating things in most of the economy. Because that's the kind of task and workflow that AI researchers themselves really understand, so they would be best placed to use AIs effectively there; there aren't going to be delays to rolling it out, or trouble finding the customers for that. And the task of AI research is quite similar to what language models are currently trained to do. They're currently trained to predict the next token on the internet, which means they're particularly well suited to text-based tasks. The task of writing code is one such task, and there is lots of data on examples of code writing.
Already we're seeing that with GPT-4 and other systems like that, people are becoming much more interested in AI, much more willing to invest in AI. The demand for good AI researchers is going up. The wages for good AI researchers are going up. AI research is going to be a really financially valuable thing to automate.
If you're paying $500,000 a year to one of your human research engineers (which is a lot lower than what some of these researchers are earning), then if you can manage to get your AI system to double their productivity, that's way better than doubling the productivity of someone who works in a random other industry. Just the straightforward financial incentive as the power of AI becomes apparent will be towards "Let's see if we can automate this really lucrative type of work."
That's another reason to think that we get the automation much earlier on the AI side than on the general economy side, and that by the time we're seeing big economic impacts, AI is already improving at a blistering pace, potentially.
Tom Davidson: In terms of plans for making the whole thing go well, it's especially scary, because a really important part of the plan, from my perspective, would be to go especially slowly when we're around the human level, so that we can do loads of experiments, and loads of scientific investigation into this human level AI: "Is it aligned if we do this technique? What about if we try this other alignment technique? Does it then seem like it's aligned?" Just really making sure we fully understand the science of alignment, and can try out lots of different techniques, and to develop reliable tests for whether the alignment technique has worked or not, that they're hard to game.
Luisa Rodriguez: The kind of thing that ARC has done with GPT-4, for example.
Tom Davidson: Exactly. I think if we only have a few months through the human-level stage, that stuff becomes really difficult to do without significant coordination in advance by labs. I think that there are really important implications of this fast transition in terms of setting up a kind of governance system, which can allow us to go slowly despite the technical possibilities existing to go very fast.
Luisa Rodriguez: That makes sense. I feel like I've had some background belief that was like, obviously when we've got AI systems that can do things humans can do, people are going to start freaking out, and they're going to want to make sure those systems are safe. But if it takes months to get there and then within another few months we're already well beyond human capabilities, then no one's going to have time to freak out, or it'll be too late. I mean, even if we spend the next seven years left in the decade, that sounds hard enough.
Tom Davidson: Yeah. I agree.
Luisa Rodriguez: So a takeaway is that we really need to start slowing down or planning now. Ideally both.
Tom Davidson: Yeah. And we'll need the plans we make to really enable there to be mutual trust that the other labs are also slowing down. Because if it only takes six months to make your AIs 10 or 100 times as smart, then you're going to need to be really confident that the other labs aren't doing that in order to feel comfortable slowing down yourself.
Luisa Rodriguez: Right. If it was going to take 10 years and you noticed three months in that another lab is working on it, you'd be like, "Eh, we can catch up." But if it's going to take six months and you're three months in, you've got no hope, so maybe you'll just spend those first three months secretly working on it to make sure that doesn't happen, or just not agree to do the slowdown.
Tom Davidson: Yeah.
Luisa Rodriguez: Oh, these are really hard problems. I mean, it feels very prisoner's dilemma-y.
Tom Davidson: I'm hoping it's going to be more like an iterated prisoner's dilemma, where there's multiple moves that the labs make, one after the other, and they can see if the other labs are cooperating. In an iterated prisoner's dilemma, it ultimately makes sense for everyone to cooperate, because that way, the other people can see you coordinating, then they coordinate, and then everyone kind of ends up coordinating.
One thing is if you could set up ways for labs to easily know whether the other labs are indeed cooperating or not, kind of week by week. That turns it into a more iterated prisoner's dilemma, and makes it easier to achieve a kind of good outcome.
Luisa Rodriguez: Yeah, that makes sense. I imagine it's the case that the more iteration you get in an iterated prisoner's dilemma, the better the incentives are to cooperate. And so just by making the timelines shorter, you make it harder to get these iterations that build trust.
Tom Davidson: Yeah, I think that's right.
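The point about iteration count can be illustrated with a toy model. The sketch below is not from the episode: the payoff numbers (5, 3, 1, 0) are conventional textbook values picked for illustration, and the opponent is assumed to play tit-for-tat, standing in for a lab that mirrors whatever it observed in the previous round. All function names are hypothetical.

```python
# Assumed payoff table: (my_move, their_move) -> my payoff.
# "C" = cooperate (slow down), "D" = defect (race ahead).
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}


def tit_for_tat(opponent_history):
    """Cooperate first, then copy the opponent's previous move."""
    return "C" if not opponent_history else opponent_history[-1]


def always_defect(opponent_history):
    return "D"


def always_cooperate(opponent_history):
    return "C"


def total_payoff(strategy, opponent, rounds):
    """Play `rounds` rounds and return the total payoff earned by `strategy`."""
    my_moves, their_moves = [], []
    score = 0
    for _ in range(rounds):
        mine = strategy(their_moves)    # each side sees only the other's past moves
        theirs = opponent(my_moves)
        score += PAYOFF[(mine, theirs)]
        my_moves.append(mine)
        their_moves.append(theirs)
    return score


for rounds in (1, 2, 3, 10):
    defect = total_payoff(always_defect, tit_for_tat, rounds)
    cooperate = total_payoff(always_cooperate, tit_for_tat, rounds)
    print(f"{rounds:>2} rounds: defect={defect}, cooperate={cooperate}")
```

With these assumed payoffs, defecting wins in a one-shot game, ties at two rounds, and loses from three rounds onward. That is the sense in which shorter timelines, with fewer observable moves, weaken the incentive to cooperate.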
Tom Davidson: In an ant colony, ants are smarter than like a human cell is: they're kind of self-contained units that eat and do tasks by themselves, and they're pretty autonomous. But the ants are still pretty dumb: no ant really knows that it's part of a colony, or knows that the colony has certain tasks that it needs to do, and that it has to help out with the colony efforts. It's more like a little robot that's bumping into other ants and getting signals and then adjusting its behaviour based on that interaction.
Luisa Rodriguez: It's not like a company, where the different people in the company are like, "My job is marketing," and they have a basic picture of how it all fits together. They're much more like if a person at a company doing marketing was just like, "I don't know why I do it, I just do it."
Tom Davidson: Yeah, exactly. Another disanalogy with the company is that in a company, there's someone at the top that's kind of coordinating the whole thing, whereas with ants, there's no one that's coordinating it, including the queen. There's no management system; it's just all of the hundreds and thousands of ants have their individual instincts of what they do when they bump into each other, and what they do when they bump into food, and what they do when they realise that there's not as much food as there needs to be.
And by all of the ants following their own individual instincts, it turns out that they act as if they were a fairly well-coordinated company that's ensuring that there are some ants going to get food, and some ants that are keeping the nest in order, and some ants that are feeding the young. That coordination happens almost magically, and emerges out of those individual ant interactions.
One example of how this works is that if an ant comes across a body of a dead ant, and if there's another dead body nearby, it would tend to move it to be close to the other dead body. That's just an instinct it has: it just moves the body towards another. If there's one pile of three dead ants and another pile of two dead ants, it will tend to go towards the bigger pile, so tend to move with this extra dead ant towards the pile of three. If all the ants just have those instincts, then if there's initially a sprawling mass of dead bodies everywhere, then those dead bodies will be collected into a small number of piles of bodies.
They don't have to know that the whole point of this instinct is to clear the ground so that it's easier to do work in the future; it's just an instinct they have. They don't have to know that when everyone follows that instinct, this is the resultant pattern of behaviour.
This is an example of a system where lots of less-clever individuals are following their local rules, doing their local task, and that what emerges from that is a very coherent and effective system for ultimately gathering food, defending against predators, raising the young.
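The dead-body rule described above is simple enough to simulate. What follows is a rough sketch under simplifying assumptions that are not in the episode: piles are just counts in a list with no geography, an ant picks up one body from a random pile, looks at up to two other piles, and drops the body on the bigger one. The function name `cluster_bodies` and all parameters are hypothetical.

```python
import random


def cluster_bodies(piles, steps, seed=0):
    """Repeatedly apply the local rule: carry one body to the bigger of two piles."""
    rng = random.Random(seed)  # seeded so a run is repeatable
    piles = list(piles)
    for _ in range(steps):
        occupied = [i for i, count in enumerate(piles) if count > 0]
        if len(occupied) < 2:
            break  # everything is already in one pile
        source = rng.choice(occupied)
        others = [i for i in occupied if i != source]
        # The ant only sees a couple of piles, never the whole picture.
        candidates = rng.sample(others, k=min(2, len(others)))
        target = max(candidates, key=lambda i: piles[i])
        piles[source] -= 1
        piles[target] += 1
    return piles


start = [1] * 20  # a sprawl of 20 single bodies
end = cluster_bodies(start, steps=2000)
remaining = [count for count in end if count > 0]
print(f"piles before: {len(start)}, piles after: {len(remaining)}")
```

No ant in this sketch tracks the global goal, yet the number of occupied piles can only shrink: an emptied pile is never restarted, because bodies are only ever dropped on existing piles.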
An analogy would be that maybe we think it's pretty dangerous to train really smart AIs that are individually very smart, but it might be safer to set up a team of AIs, such that each AI is doing its own part in a kind of team and doesn't necessarily know how its work is fitting into the broader whole. Nonetheless, you can maybe get a lot more out of that kind of disconnected team of AIs that are specialised, and that just kind of take their inputs and produce their outputs, without much of an understanding of the broader context. And just thinking that maybe that would be a safer way to develop advanced AI capabilities than just training one super-smart AI megabrain.
Great listen, I enjoyed this a lot!
Kudos to Luisa who does a really good job of acting as a "Watson", asking the followup questions that listeners might have. Several times in this podcast I was happy with her summaries or clarifying questions, even if I suspect she already knew the answers many of those times.