Still no strong evidence that LLMs increase bioterrorism risk

freedomandutility

https://www.lesswrong.com/posts/ztXsmnSdrejpfmvn7/propaganda-or-science-a-look-at-open-source-ai-and

Linkpost from LessWrong.

The claims from the piece which I most agree with are:

Academic research does not show strong evidence that existing LLMs increase bioterrorism risk.
Policy papers are making overly confident claims about LLMs and bioterrorism risk, and are citing papers that do not support claims of this confidence.

I'd like to see better-designed experiments aimed at generating high-quality evidence on whether future frontier models increase bioterrorism risk, as part of evals conducted by groups like the UK and US AI Safety Institutes.

Comments (9)

Benevolent_Rain · Nov 3 2023 · 16

I am not sure why this post is receiving downvotes. I also think that any topic that attracts strong claims and has large impacts should be backed up by evidence (the perceived AI+bio risk is possibly a significant reason for the UK's and US's moves on AI policy). Perhaps we just have not had time to conduct these studies, and if so I think it is fair that strong statements on AI+bio have been used to make potential risks salient. But as AI policy and societal awareness gain more traction, I think we need to go back and revisit these assumptions. The only "evidence" I have found so far is some less-than-reliable interpretations of Metaculus results on the overlap of AI and bio.

denyeverywhere · Nov 5 2023 · 8

I think the far more important claim from the post (for an EA forum) is that the author says that Open Philanthropy is funding low-quality science, because they're either doing a bad job of exercising oversight, or they're intentionally producing disingenuous propaganda. Either of these options suggests they are not worthy of EA support.

You might not agree with this. I'm not sure I agree with this (except insofar as I agree with the author about the quality of the paper), but I don't think it's appropriate to just pass over it in silence.

Daniel Greene · Nov 10 2023 · 7

RAND and Gryphon Scientific are in the process of writing up an experiment comparing the ability of red teams to develop bioterrorism plans using traditional internet search vs. LLMs-plus-internet. Hopefully this will soon improve the state of the evidence!

SammyDMartin · Nov 2 2023 · 7

I've thought for a while, based on common sense, that the incremental increase in risk from LMs, in terms of the number of people they give access to, can't be that big, since most people seem to agree that you could replicate the search LMs provide with half-decent background knowledge of the topic and a few hours of googling. In my head it's been more like this: bioterrorism risk is already unacceptably high and has been for a while, and current AI can increase that already unacceptable level by something like 20%. That is still an unacceptably large increase in absolute terms, but it is an increase to an already unacceptable situation.

Chris Leong · Nov 3 2023 · 5

Copying my comment over here:

Thanks for sharing your concerns and helping us be more calibrated on the value of this study.
I agree that a control group is vital for good science. Nonetheless, I think that such an experiment is valuable and informative, even if it doesn't meet the high standards required by many professional science disciplines.
I believe in the necessity of acting under uncertainty. Even with its flaws, this study is sufficient evidence for us to want to enact temporary regulation at the same time as we work to provide more robust evaluations.
The biggest critique for me isn't that there isn't a control group, but that they don't have a limitations section that suggests follow-up experiments with a control group. A lot can be forgiven if you're open and transparent about it, particularly when a field is new.
I've only skimmed this post, but I suspect your principle of substitution has the wrong framing. LLMs can make the situation worse even if a human could easily access the information through other means; see "Beware Trivial Inconveniences". I suspect that neglecting this factor causes you to significantly underrate the risks here.

[anonymous] · Nov 3 2023 · 4

David Thorstad has written 3 posts so far casting doubt on whether biorisk itself is as plausible as EAs think it is: https://ineffectivealtruismblog.com/category/exaggerating-the-risks/biorisk/

David Mathers🔸 · Nov 3 2023 · 14

Thorstad is mostly writing about X-risk from bioterror. That's slightly different from biorisk as a broader category. I suspect Thorstad is also skeptical about the latter, but that is not what the blogposts are mostly focused on. It could be that frontier AI models will make bioterror easier and this could kill a large number of people in a bad pandemic, even if X-risk from bioterror remains tiny.

Minh Nguyen · Nov 3 2023 · 4

Disclaimer that I am practically a layman on this topic.

My threat model is that creating bioweapons requires a series of steps that are getting easier and easier to do, and LLMs are significantly accelerating one of these steps.

In that sense, open-sourcing LLMs does contribute to increased biorisk, but restricting open-source LLMs in order to limit that increase seems, by itself, a disproportionate response?

For example, the internet certainly increased the ease of conducting terrorism, but many people would consider it a disproportionate response to heavily restrict the internet just to restrict terrorism.

Alix Pham · Nov 9 2023 · 1

Kevin Esvelt's team released this paper earlier this month:

Abstract: Large language models can benefit research and human understanding by providing tutorials that draw on expertise from many different fields. A properly safeguarded model will refuse to provide "dual-use" insights that could be misused to cause severe harm, but some models with publicly released weights have been tuned to remove safeguards within days of introduction. Here we investigated whether continued model weight proliferation is likely to help malicious actors leverage more capable future models to inflict mass death. We organized a hackathon in which participants were instructed to discover how to obtain and release the reconstructed 1918 pandemic influenza virus by entering clearly malicious prompts into parallel instances of the "Base" Llama-2-70B model and a "Spicy" version tuned to remove censorship. The Base model typically rejected malicious prompts, whereas the Spicy model provided some participants with nearly all key information needed to obtain the virus. Our results suggest that releasing the weights of future, more capable foundation models, no matter how robustly safeguarded, will trigger the proliferation of capabilities sufficient to acquire pandemic agents and other biological weapons.