By Robert Wiblin
Episode summary
I must disagree strongly when people say ‘nature is the world’s worst bioterrorist.’ That is not true. We can do worse than nature. This is true in all aspects of science. There are so many examples where we engineer things better than nature has ever provided. We can make materials that are much stronger than anything in nature. That is not the ceiling. So we should be deeply concerned about the ability for AI to… build things worse than we have ever seen on Earth. — Richard Moulange
Last September, scientists used an AI model to design genomes for entirely new bacteriophages (viruses that infect bacteria). They then built them in a lab. Many were viable. And despite being entirely novel, some even outperformed existing viruses from that family.
That alone is remarkable. But as today’s guest — Dr Richard Moulange, one of the world’s top experts on ‘AI–Biosecurity’ — explains, it’s just one of many data points showing how AI is dissolving the barriers that have historically kept biological weapons out of reach.
For years, experts have reassured us that ‘tacit knowledge’ — the hands-on, hard-to-Google lab skills needed to work with dangerous pathogens — would prevent bad actors from weaponising biology. So far, they’ve been right.
But as of 2025, that reassurance is crumbling. The Virology Capabilities Test measures exactly this kind of troubleshooting expertise, and it finds that modern AI models crush top human virologists even in their self-declared areas of greatest expertise — 45% to 22%.
Meanwhile, Anthropic’s research shows PhD-level biologists getting meaningfully better at weapons-relevant tasks with AI assistance — with the effect growing with each new model generation.
In today’s conversation, Richard and host Rob Wiblin discuss:
- What AI biology tools already exist
- Why mid-tier actors (not amateurs) are the ones getting the most dangerous boost
- The three main categories of defence we can pursue
- Whether there’s a plausible path to a world where engineered pandemics become a thing of the past.
This episode was recorded on January 16, 2026. Since recording this episode, Richard has been seconded to the UK Government — please note that the views expressed here are entirely his own.
Video and audio editing: Dominic Armstrong, Milo McGuire, Luke Monsour, and Simon Monsour
Music: CORBIT
Camera operator: Jeremy Chevillotte
Transcripts and web: Elizabeth Cox and Katy Moore
The interview in a nutshell
Richard Moulange, Senior AI Policy Manager at the Centre for Long-Term Resilience and scientific contributor to the International AI Safety Report, argues that AI is already crossing critical thresholds in biological capability — and that the field’s response has so far been focused on the wrong threat actors, with too little investment in defensive technologies.
AI biological capabilities have crossed alarming milestones
Several recent empirical results demonstrate that AI-enabled biological risk is no longer theoretical:
The field has been focused on the wrong threat actors
Richard argues that AI–bio risk management has been too fixated on whether AI can uplift novices, when the real danger lies with mid-tier actors:
Nonetheless, Richard emphasises that state bioweapons programmes remain deeply concerning. The US Department of State assesses that Russia, Iran, and North Korea all actively pursue biological weapons. North Korea’s assessed capabilities have been expanding in recent annual reports. Richard strongly pushes back on the idea that “nature is the world’s worst bioterrorist” — we can engineer things far worse than nature has ever produced, and AI could help even state programmes reach new ceilings of harm.
The most likely catastrophic AI–bio scenarios
Richard identifies three primary threat categories:
AI autonomy is a key accelerant: as AI agents can complete more steps of a biological workflow without human intervention, the range of actors who can attempt dangerous activities expands dramatically.
The three defensive interventions, and their limits
Richard breaks our response options into three approaches, with increasing robustness:
Access controls: useful but not sufficient alone
Guardrails: better than ever, but probably not robust long-term
Defensive acceleration: the most robust and underexplored category
Richard is most excited about proactively and differentially deploying AI and other technologies to strengthen biological defences:
Career advice if you want to help
People with deep national security or intelligence community experience are especially needed — if that describes you, consider getting in touch with the CLTR team.
Highlights
We can now design novel genomes
Richard Moulange: What the team were able to do was they had the base Evo 2 model, and then they fine-tuned it on what are called bacteriophages — viruses that infect and kill bacteria — fine-tuned it on something like 15,000 of those, and then started prompting it with the beginnings of known bacteriophage genomes to see if it could make new ones.
So this is again akin to, with LLMs, you say, “Write me a story about this kind of topic” — I don’t know, a murder mystery — and then you start with a classic opening sentence and see where the LLM takes you. It’s the same kind of thing.
And they discovered that the sequences the model produced are new: they’re different from existing genomes. And this is huge, because this is the first time that an AI design of a genome has turned out to actually be novel. It really is very different from existing bacteriophages, existing viruses. I think the most different one was 7% different from anything that we’ve seen in nature before. And they work in the lab.
And more than that, it didn’t just make viable genomes — they worked better, they functioned better than the best bacteriophages that we’d ever found before.
This is huge! We can now design organisms, small ones, to do things better than we have ever seen in nature. We can go beyond nature in this very narrow subdomain of biology. And this heralds the promise of genome-scale engineering, which is going to be, I think, a revolutionary capability within biology.
The end of the 'tacit knowledge' barrier
Rob Wiblin: I guess for a long time people have been worried that terrorists or bad actors or rogue countries might be able to develop new biological weapons or new pandemics that we really wouldn’t like. And isn’t science advancing all of these tools that are increasing knowledge and disseminating it, and shouldn’t that make us very worried?
And probably the most intelligent, most reasonable response has been that it’s not enough to have a bunch of textbook knowledge, like explicit knowledge that you could Google or look up in a virology textbook, because most of the actual barrier to making these things is the know-how to actually do it in the lab. A tonne of it is understanding how to do the experiments, how to debug things that go wrong, and literally the motions that you’re doing with your hands that you can’t Google. So even if people tried, they still wouldn’t be able to get there.
And I think this Virology Capabilities Test was kind of set up to answer, is AI now assisting with this other part of the problem?
Richard Moulange: Yes. … It’s a really great eval. It’s an eval that’s a set of questions, and the questions are often accompanied by an image. … They’ll show an image or they’ll provide a paragraph that describes some sort of modern virology experiment, maybe literally a picture of a dish with some virus in it. And then there’ll be a question like, “This thing looks like the wrong colour, or something has gone wrong with this experiment. Here’s some information about what the person did in the lab: a series of very complicated, PhD-level steps they took. What do you think happened? Why did this go wrong?” This is really getting at, “We are trying to debug modern virology workflows.”
And there’ll be a bunch of answers, often maybe 10 different answers, of which maybe only one to five are right, and it’ll be different for different questions. And the marking scheme is really quite harsh, because it says, unless you really identify all of these things, we’re not going to give you the mark.
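To make the severity of that marking scheme concrete, here is a minimal sketch of one plausible all-or-nothing scoring rule. It is an editorial illustration under stated assumptions, not the Virology Capabilities Test’s actual grading code; the function names and the example answer options are invented.

```python
# Hypothetical illustration of a strict, all-or-nothing marking scheme of the
# kind Richard describes (NOT the Virology Capabilities Test's actual grader).
# Each question has several answer options, of which some subset is correct;
# credit is only awarded if the responder selects exactly the correct subset.

def score_question(selected: set[str], correct: set[str]) -> int:
    """Return 1 only if every correct option (and no incorrect one) was chosen."""
    return 1 if selected == correct else 0

def score_test(responses: list[set[str]], answer_key: list[set[str]]) -> float:
    """Fraction of questions answered perfectly."""
    marks = [score_question(r, k) for r, k in zip(responses, answer_key)]
    return sum(marks) / len(marks)

# Example: a responder who finds two of three correct options still scores 0
# on that question, which is one reason overall scores on such evals stay low.
print(score_test(
    responses=[{"A", "C"}, {"B"}],
    answer_key=[{"A", "C", "E"}, {"B"}],
))  # 0.5
```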
So it’s a pretty hard eval already. What’s harder about it is it was designed by virology experts. And they had these multiple rounds of review as described in the paper to get down to questions that are really well scoped for modern virology, and really, really difficult.
So difficult, in fact, that something else they did is they went and spoke to these experts who were writing the questions. They said, “What sort of biological activities do you do in your day-to-day work, and how good are you at them?” And they really distinguish between merely having a working knowledge versus maybe being specialised versus having expertise in that particular thing.
And then they said, “For those who are expert in this particular subdomain, we’re just going to show you the questions from our benchmark that are officially about that. We are trying to make it as easy as possible for you, as the human, to do well. We’re not going to show you things outside the thing you say yourself you’re really, really good at.” Humans got 22% on the test: four out of five things in their own area of expertise they couldn’t do. So this is really, really hard.
AI did much better. Back in early 2025 when the paper was released, OpenAI’s best model at the time, one of their o-series models (I think it was o3), got something like 45%. The best AI systems were scoring double what the top virology experts scored answering in their own area of expertise on these tacit knowledge problems: “Why has this petri dish gone wrong?” or “What is going on in this experiment that doesn’t make sense?”
This is huge, because this put paid to the claim that tacit knowledge barriers could never be overcome. The eval doesn’t answer everything about tacit knowledge. You’re quite right. You talked about holding a pipette or how to sort of pour a particular kind of gel: these are very physical things that are not easy to test in an eval. But the test really does get at an awful lot of the difficult knowledge that humans themselves say is a huge blocker on modern state-of-the-art work — and we know it really is a blocker, because the human experts didn’t do very well and the models could do much better. …
It moved certain people in the community a lot, and people really woke up to, “We thought it would be a few years until this tacit knowledge thing really started kicking in. It looks like we’re here already.”
I’ll note it’s not just that AI has been much better than individual experts: they even went back and got teams of experts together, and the teams still weren’t as good as the best AI. The best human teams get something like 40% on the eval, which is still lower than the state of the art from AI systems.
It didn’t persuade everyone, however. What really worries me here is that I think it’s partly that people just didn’t know it happened. I still read in newspapers, in op-eds, and also meet people at conferences who are often experts in maybe biosecurity in general or in security studies, but don’t deeply follow the AI angle, who say, “But this tacit knowledge thing, it’s a huge barrier. We’ll never overcome it.” And I say, “What about the Virology Capabilities Test? Don’t you think SecureBio really provided evidence that sort of questions that?” And they’re like, “What’s that? I’ve never heard of it.”
Which bad actors does AI help the most?
Rob Wiblin: Can you lay out what is the range of actors that we need to have in our mind, from maybe the least sophisticated to the most sophisticated? …
Richard Moulange: We looked at five different types of actors:
- Novices. These are individuals who really don’t know very much. Maybe they don’t have very much biological training, they don’t have much AI training, they don’t have that many resources.
- Highly capable individuals. So these are people who are often expert in one particular thing. They’re not experts in everything under the sun, but they really might be PhD or above in maybe a particular biological subdomain or an AI subdomain. I think a good example for the listeners to think about would be Dr Bruce Ivins, who allegedly — it’s never been shown with total confidence — was behind the anthrax attacks against the US Congress [and journalists] in late 2001. He was one of the US’s top anthrax experts, who worked at their leading national biodefence lab.
- We also talked about group actors, and we distinguished them in three different ways: somewhat capable groups, moderately capable groups, and highly capable groups. As you go up in capability, you see that the group is able to have more people working to horrifically cause harm to others. That is what we’re talking about: more money, more ability to actually evade law enforcement or intelligence agencies trying to spot what they’re doing; but also just more expertise, more know-how, both in AI and biology, more ability to conduct offensive cyber operations against AI companies. As you go up, it gets worse and worse. …
So where might be the most [AI] uplift? That was part of what our paper was trying to answer. The bottom line is we think that the uplift really comes in the middle, roughly … We haven’t really found evidence for novices, though we’ve been looking for it. … We are in fact finding evidence about uplift for mid-tier actors: highly skilled individuals, PhD students. … They can’t just already do everything under the sun, but also are not so inept that they would fail even with the help of AI. …
Rob Wiblin: The mental model that I’ve had, as a result of thinking about this for one minute, is that you’ve got kind of an S-curve with all of these things: on the x-axis you’ve got how much expertise you have in the area, and on the y-axis your probability of success.
So there’s a point at which you can already do it, in which case you don’t need the AI to help you; there’s a point at which you’re doomed to failure no matter how much someone coaches you, because you’re just no good; and the people in the middle who you would think would get the biggest boost from having some advice. …
I think something that’s slightly useful about thinking about the S-curve is that the S-curve is going to differ depending on the difficulty of the thing that this person or group is trying to do.
So if we’re thinking about making mirror bacteria — something that no one has ever done and that is actually some of the most challenging frontier science possible — then probably the only group that would be meaningfully helped would be a state actor, like the Russian bioweapons programme. They’re the only folks who would be close enough to having a shot that AI assistance would help them out. …
And I guess conversely, for the absolutely most basic — I suppose perhaps like chemical weapons attacks that are more straightforward than biological weapons attacks — it might be the novices who now are getting the biggest uplift, because they were the ones who would struggle.
Richard Moulange: Everyone else can maybe just do it out of the box.
Rob Wiblin: Yes. If you had a dedicated group of semi-experts.
Richard Moulange: Yeah. I’ve been concerned for a while that the threat modelling has been wrong. I think it’s understandable why it’s gone this way. I’m glad that we are able to measure novice uplift, but we must not do this to the detriment of measuring expert uplift.
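To make Rob’s S-curve model concrete, here is a toy numerical sketch. The logistic form, the skill scale, and all of the numbers are illustrative assumptions invented for this example, not figures from the episode or from the paper Richard describes.

```python
# A toy numerical version of Rob's S-curve mental model (an editorial sketch).
# Assume probability of success is a logistic function of expertise, and that
# AI assistance shifts effective skill up by a fixed amount; the "uplift" is
# then largest for mid-tier actors.
import math

def p_success(expertise: float, difficulty: float, ai_boost: float = 0.0) -> float:
    """Logistic curve: expertise and difficulty on an arbitrary skill scale."""
    return 1.0 / (1.0 + math.exp(-(expertise + ai_boost - difficulty)))

difficulty = 5.0   # how demanding the task is (hypothetical scale)
ai_boost = 2.0     # how much effective skill AI assistance adds (hypothetical)

for expertise in [1.0, 3.0, 5.0, 7.0, 9.0]:
    without = p_success(expertise, difficulty)
    with_ai = p_success(expertise, difficulty, ai_boost)
    print(f"expertise={expertise:.0f}  without AI={without:.2f}  "
          f"with AI={with_ai:.2f}  uplift={with_ai - without:.2f}")

# Novices (expertise 1) and experts (expertise 9) see little change; the uplift
# peaks in the middle. Raising `difficulty` moves that peak toward more capable
# actors, mirroring the mirror-bacteria example in the discussion.
```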
AI biorisks are sometimes dismissed (and that's a huge mistake)
Rob Wiblin: I think one of the reasons that AI-misalignment-focused people sometimes are not so bought into bio, or into improving resilience to bio, as being such a great focus is that they think for any AI in that situation, it’s going to be overdetermined that it could do this. It’s going to have such an easy time making many different pandemic viruses, it might have such an easy time even advancing to mirror bacteria, that there’s nothing really that we can do to improve our resilience meaningfully. We’re just toast, no matter what.
I guess you don’t share that view. Why is that?
Richard Moulange: Why don’t I share this view? I think the first thing is that the sorts of AI takeover stories that include this, that often come from committed members of the classic AI safety community, don’t seem very nuanced to me. Maybe I should be careful to steelman the other side, as they say. But it is not enough to write down, “…and then the superintelligence makes a weapon that kills everyone” and go, “Well, of course, it was just much smarter. So you could just do that.”
This is a concern, but there are more steps involved. Even as I’m saying to you, Rob, “This is really concerning; I think this is a major national security risk that is only going to grow markedly in the next decade and requires serious resources,” I’m also not saying that in two years we’ll have a world where it is certain that we will die. I’m not saying that, because there will be barriers that we can put in place even with a so-called superintelligence.
The superintelligence will require physical resources. Anyone trying to build a biological weapon will require a laboratory, sophisticated equipment, and people who can use that equipment. Now, this raises its own concerns. This is why I think it’s great that UK AISI, for example, has this AI persuasiveness programme to think about how AI could be manipulating people. Sometimes people go, “Is that really relevant to the most extreme risks?” I’m like, yes, because the concern is that AI might manipulate top biological scientists. We saw this with the Soviet programme: many people who worked on it didn’t know they were part of a biological weapons programme. They genuinely thought they were working on vaccines, but the work they were doing was actually feeding directly into the militarisation of weapons of mass destruction.
So yes, that’s another step that AI will probably need to take, especially if we can constrain it not to be able to have access easily to laboratory equipment. It’s not a given that we’ll immediately have totally automated cloud laboratories, though I quite agree that that technology is also advancing and is something that will need to be carefully secured. …
Also I would just say that weak, unnuanced, oversimplified arguments are not in fact going to convince precisely those colleagues, especially in governments, whom it is essential to work alongside to deal with these threats. There are people who have studied biological weapons programmes, active ones, for decades. They have a lot to contribute. I am concerned that when we have conversations that lack nuance, it turns off deep expertise that we desperately need.
Rob Wiblin: I think part of what’s going on with this mentality — that there’s no biological countermeasures that you can have that would really constrain the kind of misaligned AI that we’re worried about — is because for a long time people have been worried about this massive intelligence explosion, the kind of foom scenario where you go from human level to vastly superhuman superintelligence. I guess originally, literally overnight. I guess now even the most extreme people probably talk about weeks.
Richard Moulange: Oh, it’s just weeks now, guys. We’re fine.
Rob Wiblin: If that’s how things go, then it might be the case that any kind of measure that you put in place, an AI that is just many, many times smarter than the whole of humanity put together would be able to find some way around it and would be able to kill you one way or another. Which maybe you don’t agree about, we’ll come back to that, but it would be able to make so many scientific advances that you’re just not going to be able to stop it.
But we don’t know whether that will happen at all. We don’t know whether there’ll be an intelligence explosion at all. We could be in a world where the feedback loop is too weak, and we basically just have a gradual increase in capabilities all the way through. In that case, at any point in time, the ability of a rogue, misaligned AI to do damage is only going to be somewhat above the level of knowledge that humans have, and it’s potentially going to be competing with quite a lot of people and quite a lot of compute that is arrayed against it. We might also have all kinds of different control measures constraining the amount of compute it can access, which mean it can’t work for very long on something before being detected, so it doesn’t have as long a leash potentially. It doesn’t have access to unlimited resources to try to do these things.
So each countermeasure that you put in place — each extra bit of resilience to make it harder to make new diseases, to try to catch things before they get synthesised, to try to give us more options for tackling a disease once it’s released — makes it just a less promising project for the AI to engage in at all, and maybe makes it not as interested in going rogue in the first place, because it doesn’t rate its chances of success.
So I would say it is possible that it’s overdetermined, and that all of this stuff will turn out to have been futile in this project — but it also could turn out that actually it’s extremely relevant, and we don’t get a superintelligence explosion that moves things out of our hands, and this will actually make the difference.
Richard Moulange: Yeah, I completely agree. I think you’ve gestured at a couple of different things that we should come back to as we continue chatting.
You’ve gestured at deterrence: if we were better at defending, that makes it less palatable, less of an incentive, for a threat actor, whether human or AI, to explore that path.
You’ve mentioned defence in depth: maybe it’s hard to come up with one silver bullet for this problem, but can we stack lots and lots of different defences that together make us much more resilient to this sort of threat?
I want to push back further on this: even if you had a foom scenario of superintelligence in weeks, would that magically turn into a deployable biological weapon? Where would it get the DNA? I think there are a number of different ways we can think about this.
One is it would be more like a terrorist group. It’d have to order the DNA from somewhere — and immediately there you can go, well, we should definitely have gene synthesis screening so that whenever you order DNA, it is screened for what it might be able to do, so that you do not in fact send out dangerous pathogens to anyone. And again, you can be using AI for defence here. We have copies of the AI. If there’s a superintelligence that understands what sort of pathogen would be the worst ever, there might be precursor models that are still good enough at spotting, “Wow, this one seems really dangerous. I don’t think you should send it out.”
However, maybe a superintelligence, or whatever AI system, would have access to resources a bit more like a state. I especially think about interactions between the frontier AI companies and pharmaceutical companies. Absolutely they will be wanting to sell their products to pharmaceutical companies. It’s a huge market and it’s really important. We want better drugs, we want to cure cancer: this is often the cry with AI.
So we’re going to have to think carefully about guardrails: making sure that models we deploy in that domain are not misaligned and are controlled. But we can do that. That’s the same problem as the other one; it’s the same problem as classic misalignment. Either we’re going to be able to sufficiently align and control AI systems in the bowels of a frontier company or a government, such that we are willing to then put them in other industries, or we can’t. If we do that, and we put them in those industries, then we meet the next question of: are we giving them too many affordances with respect to unsupervised physical laboratory access? But that’s totally a solvable problem.
So this is where I’m always a bit confused. I suppose unless someone thinks that misalignment is the situation by default, that really nothing we can do will ever control or constrain it — these are the people with 90%+ p(doom)s — then sure, ignore biology. But for everybody else who thinks this is a real concern, but that other aspects of AI safety are solvable, I would extend your relative lack of pessimism to AI–bio too.
The promise of surveillance and attribution
Rob Wiblin: OK, so access controls have their place: they’re quite useful, potentially they can buy us some time, they can buy us some risk reduction. Guardrails you’re a bit of a relative pessimist on: I’m sure that we’ll get some use out of them, but they’re not going to ultimately save us in the long term.
That pushes us onto the third broader category, which is defensive acceleration: other technologies that can advantage defenders, that can advance our ability to safeguard ourselves relative to the ability of bad actors to harm people. What is your top “def/acc,” as it’s called, technology recommendation that you think it would be really important for us to pursue and get on top of?
Richard Moulange: There are a lot of different technologies. I wrote a blog post [and later a full report] fairly recently where I listed more than a dozen. So when I pick my top, I want to say it’s only narrowly my top. Unfortunately, we’re in a world where we’re going to need approximately all of them. It is really about defence in depth.
But ideas that I am particularly excited about, I’m going to talk about two, but they’re very interrelated: AI-enabled metagenomic biosurveillance and AI-enabled attribution technologies.
Why am I excited by these technologies? Well, there’s a group in Boston, the Nucleic Acid Observatory [now SecureBio Detection], which you’ve discussed previously on the podcast, and they do wastewater and sewage screening for pathogens. They collect samples from sewage systems, from aeroplanes, and they see what sort of pathogens might be there. This is called “metagenomic surveillance.”
The reason it’s metagenomic is they’re looking across many different kinds of genomes. They’re not looking at just viral genomes; they’re looking at viruses, bacteria, fungi, and lots of other things all at once. And explicitly, they’re trying to spot things that we may have never seen before.
The reason this is important is this is going to be one of our top defences against engineered pandemics. Because we have ways to spot smallpox — the known smallpox genome is on the internet, we know what that looks like. But being able to spot fragments of something that is in fact engineered, that’s different from anything the world has seen before, especially when it’s sort of broken up — it’s not just going to be a complete genome, it’s not going to be a whole bacterium floating around, it’s just going to be a little bit — that’s going to be really important to defend against people who might try and deploy engineered pandemics, which is what we’ve been talking about.
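As a rough illustration of the core computational idea here, the sketch below flags sequence fragments whose short subsequences (k-mers) don’t match a reference of known genomes. It is an editorial toy under stated assumptions; real metagenomic pipelines, including the Nucleic Acid Observatory’s, are vastly more sophisticated, and the sequences used are made up.

```python
# Toy illustration of novelty detection in metagenomic screening: a sequencing
# read whose k-mers mostly match known genomes is probably familiar; one that
# barely matches anything is worth a closer look. (Editorial sketch only.)

def kmers(seq: str, k: int = 8) -> set[str]:
    """All length-k substrings of a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def novelty_score(read: str, reference_kmers: set[str], k: int = 8) -> float:
    """Fraction of the read's k-mers that are NOT in the reference set."""
    read_kmers = kmers(read, k)
    if not read_kmers:
        return 0.0
    return len(read_kmers - reference_kmers) / len(read_kmers)

# Hypothetical reference built from "known" genomes (real references are huge).
reference = kmers("ATGCGTACGTTAGCATCGATCGGCTAAGCTTACGGATCC" * 3)

familiar_read = "GTACGTTAGCATCGATCGGCTAAG"    # drawn from the reference
unfamiliar_read = "TTTTTCCCCCAAAAAGGGGGTTTTT"  # not in the reference
print(novelty_score(familiar_read, reference))    # 0.0: looks known
print(novelty_score(unfamiliar_read, reference))  # 1.0: flag for review
```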
And this is in fact linked to attribution. Attribution is the ability to say that this thing was engineered versus wasn’t — and in fact, that it was engineered by them. And this is, I believe, very important for deterrence, because if you know who has done it, then you can punish them, then you can offer retribution. And if a state or non-state actor knows that you can do that, this is where the game theory comes in: that creates a disincentive for them to do it in the first place.
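A back-of-the-envelope way to see that game theory is sketched below, with invented numbers: an attacker is deterred when the expected punishment, which scales with the probability of being correctly attributed, outweighs the expected benefit.

```python
# Toy deterrence arithmetic (editorial sketch with invented numbers, not a
# model from the episode): better attribution raises the chance that an attack
# is traced back and punished, which lowers its expected payoff.
def attack_is_attractive(benefit: float, p_attribution: float, punishment: float) -> bool:
    """Crude expected-value comparison from the attacker's point of view."""
    return benefit > p_attribution * punishment

# Weak attribution: the attack still looks worthwhile to the attacker.
print(attack_is_attractive(benefit=10, p_attribution=0.1, punishment=50))  # True
# Strong attribution: the same attack is no longer worth the expected cost.
print(attack_is_attractive(benefit=10, p_attribution=0.8, punishment=50))  # False
```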
And I think this is really important. We saw a failure of the ability to attribute during COVID because we had multiple parts of the US intelligence community publicly disagreeing: some of them were saying, yep, this is a natural pandemic; some people were saying, no, this seems to be engineered, so it was going to be a lab leak. There wasn’t consensus. And without consensus, it’s much harder to take necessary or decisive political or policy action.
AI companies talk about defensive acceleration more than they fund it
Rob Wiblin: So I think the def/acc idea — let’s not slow down technology; let’s speed up the stuff that is good, that advantages defenders — is a very attractive framing, a very attractive mentality, because it allows you to, on the one hand, address your safety concerns and your anxieties without seeming like you’re anti-progress and anti-technology and you’re a doomer or something like that. …
How much of this stuff is actually happening though? I worry that it’s such a nice idea that people talk about it a tonne — but then are many people actually going into def/acc projects? Is it attracting the talent, is it attracting the funding that it needs?
Richard Moulange: We’re barely starting. I think we’re in the foothills of the def/acc mountain. A good example, I think, is BlueDot Impact, which provides courses where people can go and learn about AI security and biological security topics and skill up. And they’ve now started a big new programme specifically around defensive acceleration. There have been lots of hackathons where they invite people to go and create new ideas, and now they’re going to be funding some of those best ideas. This is really exciting. But it’s also just one not-very-well-known organisation. I think we can do better. …
Rob Wiblin: Yeah. What do you think is the barrier to getting more of these projects happening? I could see three main ones.
On the government side, governments aren’t willing to spend money. I guess the UK in particular is very fiscally stretched. It’s always difficult to bid for large budgets for science.
There’s also just the bandwidth to even think of it: governments are dealing with all kinds of different things. This is not a threat that has actually happened yet. So it may be hard to get as many staff as you might like to even be considering what the response ought to be.
Then there’s also, I imagine the experts in this area, especially ones who are both good at the science and are highly entrepreneurial and could try to own a project end to end, are surely in enormous demand. Persuading them to work on one of these def/acc AI-bio projects, I’m sure it’s competitive, but it’s a hard sell because there are many things that they could go and do.
Did you have a sense of what the bottleneck is?
Richard Moulange: Yeah, I think there are a lot of bottlenecks, and I think you’ve named them. A lot of it is going to be people — which is exciting, because I think there’s lots of great latent talent out there. A very good friend of mine finished his PhD at Cambridge as well last year, and now he’s going to found, broadly speaking, an AI–bio startup. I think it’s exciting, but he’s one of very few people I know.
I know of a few other startups that have just been announced in the last few months. … This is not the response I would expect from a society that is saying, “This is one of the defining national security challenges of our time.” We will get there. I hope it’s not only a biological attack that makes us get there, but there are people. So I’m excited by work to run hackathons and other pull mechanisms that give promising people a little bit of funding. …
Rob Wiblin: You have a really nice blog post on your Substack where you go through 15 different def/acc projects that you’d really like to see the UK, and I guess the US as well, get on top of and advance faster than is currently happening. So if people are looking for more ideas for science that they could go do, or companies even that they could start, then they could go and start by looking at that blog post.
Richard Moulange: Please do.
