Pitch:

  • Publishing dual use papers is bad.
  • It’s also easy: anyone can upload a paper to arXiv/bioRxiv/medRxiv, even by accident (not knowing they are publishing something potentially harmful).
  • It might be useful for these platforms to reject such publications, and the platforms seem interested.
  • I’m guessing they’re lacking resources: people to vet papers, maybe a software system, maybe money.
  • Let’s talk to the platforms, ask what they need, and give it to them.

Why I think the platforms are interested in doing this

A founder from bioRxiv and medRxiv, Richard Sever, says about this screening:

  1. "This is desirable and in fact already happens to an extent"
  2. "arXiv and bioRxiv/medRxiv already communicate regularly"

I can provide the reference for this.

Request for vetting

My experience in biosecurity is about 3 hours.

Please, people doing biosecurity, reply with your opinion, even if it is very short like “sounds good” or “sounds bad”.

Looking for project lead

Do you know someone who could run this? Comment on the post (or DM me, and I’ll pass it on somewhere).

Before starting this project, please review the ways it could go wrong

As a naive example, just to be concrete: Someone gets mad that their (dangerous) article wasn’t accepted, so they publish it on Twitter and it goes viral.

But more generally, before starting this project, please talk to the people who reply “sounds bad”.

It was just Petrov Day: “Wherever you are, whatever you're doing, take a minute to not destroy the world”.

Thanks


Trying to collect some comments that seem relevant:

 

Daniel Greene said:

+1 to [doing things at arXiv]. arXiv could play a big role in contributing to a norm around not publishing dual use bio research. There are challenges of screening large numbers of papers, but they can be met. See here for an example from ASM. bioRxiv may or may not be screening already, but they aren't sharing information about their practices. It would be helpful if they were more vocal about the importance of not publishing dangerous information.

ASB seems to agree:

See Daniel Greene's comment about creating better norms around publishing dangerous information (he beat me to it!).

 

 

Tessa said:

Relatedly, an area where I think arXiv could have a huge impact (in both biosecurity and AI) would be setting standards for easy-to-implement managed access to algorithms and datasets.

This is something called for in Biosecurity in an Age of Open Science:

Given the misuse potential of research objects like code, datasets, and protocols, approaches for risk mitigation are needed. Across digital research objects, there appears to be a trend towards increased modularisation, i.e., sharing information in dedicated, purpose built repositories, in contrast to supplementary materials. This modularisation may allow differential access to research products according to the risk that they represent. Curated repositories with greater access control could be used that allow reuse and verification when full public disclosure of a research object is inadvisable. Such repositories are already critical for life sciences that deal with personally identifiable information.

This sort of idea also appears in New ideas for mitigating biotechnology misuse under responsible access to genetic sequences and in Dual use of artificial-intelligence-powered drug discovery as a proposal for managing risks from algorithmically designed toxins.

Seems promising and I would support these efforts. One danger to watch for and try to address would be attention info-hazards e.g.:

  1. if object-level rejection criteria were specific and public/well-known, they could point less sophisticated bad actors to a curated list of topics/papers/ideas thought by experts to be dangerous
  2. if rejected papers find an alternative publication location, that location could become suffused with a higher density of potentially dangerous material relative to other locations, maybe facilitating malicious application

But these risks can be managed, and are outweighed by the project's benefits in my view.

I think this project seems valuable for three main reasons:

  1. It would provide a test of dual-use research (DUR) policies and policy implementation, improving future efforts to limit the publication of DUR. Indeed, if the project succeeds, it is likely to provide a blueprint for future action that can be easily replicated with journals and other preprint servers. If it fails, it will yield valuable information about what our blindspots are and what types of projects we should pursue in the future. As far as I know, the EA biosecurity community has not yet tried working directly with publishers.
  2. It would normalize responsible publication (as opposed to maximal openness) and raise awareness of dual-use concerns within the scientific community, making future efforts to limit the publication of DUR more likely to succeed.
  3. In some (very rare) scenarios, it might actually limit the spread of dangerous information. Journals tend to have better review processes than preprint servers, so it is possible to imagine a particularly dangerous piece of research (e.g., the creation of a new superbug) being denied publication in a journal only to find that it has spread widely over the internet after being submitted to arXiv.org.

Additionally, it is clear that servers like arXiv.org are highly resource-constrained (see https://www.scientificamerican.com/article/arxiv-org-reaches-a-milestone-and-a-reckoning/).

With that said, striking the right balance between openness and security is difficult, and this project might generate significant backlash or have unintended consequences. I wrote up some additional thoughts about this project after discussing it with Yonatan, so please reach out to me if you would like to read more about what this project might look like and why it might or might not be a good idea.

Arguments for

  • Access to harmful research would be restricted
  • The pace of relevant research would slow (allowing for regulation to catch up)

Arguments against

  • Malevolent actors could publish or retrieve the information somewhere other than arXiv/bioRxiv/medRxiv or another participating publisher. For comparison, members of the International Gene Synthesis Consortium (IGSC) voluntarily apply screening standards to gene sequence orders and customers, yet these companies today represent only approximately 80% of global commercial gene synthesis capacity, so compulsory filtering or better coordination of the industry (which stricter self-regulation at this stage may deter) is required.
  • Benevolent actors would face higher barriers to accessing useful information

For further investigation

  • Would industry-led censorship crowd out or crowd in government intervention that could enforce vetting across publishers and in other initiatives?
  • Would insulating the riskiest dual use publications from exposure to a broader pool of researchers accessing conventional publishers have unintended consequences?

Implementation

How would vetters, whether a regulatory agency or an independent initiative, screen papers? In the case of DNA synthesis, which does not account for all biosecurity-relevant dual use research, a minimalistic approach is for vetters to use an encrypted database of riskier DNA sequences, as proposed by MIT's Prof Kevin Esvelt.

However, dual use control at the publisher level would presumably not be restricted to DNA synthesis; it would also include such things as studies of remote areas at the human-animal boundary.

Next steps

Esvelt is in dialogue with the Nuclear Threat Initiative, which is coordinating higher-level conversations in this area. If the publishers you mentioned aren't already part of that dialogue, the best next step may be to connect Nuclear Threat Initiative folks with those academic publishers. But I don't think that means this initiative shouldn't proceed in parallel. I think there is merit in taking some action now in this space because the conversation that the Nuclear Threat Initiative and co are kindling is a slow, multilateral process; screening DNA synthesis orders is not legally required by any national government at this stage.

Time sensitivity

The cost of DNA synthesis is declining, and the fixed costs of filtering could grow as a fraction of the total cost; therefore the viability of a voluntary screening model could be at its highest right now.
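To make the arithmetic behind this concrete, here is a minimal sketch. All numbers are made up for illustration (they are not real synthesis or screening prices), and `screening_share` is a hypothetical helper, not anything from an existing tool. It only shows that a fixed per-order screening cost takes up a larger share of the total as the synthesis cost falls.

```python
def screening_share(screening_cost: float, synthesis_cost: float) -> float:
    """Fraction of the total order cost that goes to screening."""
    return screening_cost / (screening_cost + synthesis_cost)


# Hypothetical units: screening stays at 1 per order while synthesis gets cheaper.
for synthesis_cost in (100.0, 10.0, 1.0):
    share = screening_share(1.0, synthesis_cost)
    print(f"synthesis cost {synthesis_cost:>5}: screening is {share:.1%} of the total")
```

If the real costs behave anything like this, providers have a stronger incentive to skip voluntary screening later than they do now.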

Participation

I'm interested in being involved, but I don't know that much about academic publishing or technical genomics stuff, so I'm probably not a fit to be a (solo) project lead. I do know about management, health policy, public administration, stakeholder engagement, communications, etc.

+1 to contacting Nuclear Threat Initiative, they seem to be active and well connected across many relevant areas. 

I definitely don't think "sounds bad" (I really, really would like it to be easier for publishers to adopt dual-use screening best practices) but I do think "sounds partly duplicative of other work" (there are other groups looking into what publishers need / want, seems good to collaborate with them and use their prior work) and "should be done thoughtfully" (for example, should be done with someone who has a good appreciation for the fact that, right now, there does not exist a set of "dual use best practices" that an organization could simply adopt).

I'm going to gesture towards some related initiatives that might be of interest, from some folks who have already undertaken (at least some of the) "Let’s talk to the platforms, ask what they need, and give it to them" step:

Anyway, I feel like one way in which this project could go wrong is viewing itself as trying to lock in a new standard, rather than running an experiment in biosecurity governance that is part of the project of Consensus-finding on risks and benefits of research.

This project would be valuable if the benefits outweighed the costs.

It could be relatively expensive (in person-hours) to run (there might be a tonne of publications to vet!) and relies on us being good (low false positive, high recall) at identifying biohazards (my prior is that this is actually pretty hard and those biohazardous publications would happen anyway). We'd also need to worry about incentivising people to make it harder to tell that their work is dangerous. 
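As a rough illustration of why "low false positive, high recall" is demanding when hazardous papers are rare, here is a small sketch. The counts are entirely hypothetical (100,000 submissions, 10 of them truly hazardous, a screener with 90% recall and roughly a 1% false positive rate), and `screening_metrics` is a name made up for this example.

```python
def screening_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Basic screening metrics from confusion-matrix counts."""
    return {
        "recall": tp / (tp + fn),                # share of hazardous papers caught
        "false_positive_rate": fp / (fp + tn),   # share of harmless papers flagged
        "precision": tp / (tp + fp),             # share of flagged papers that are hazardous
        "flagged": tp + fp,                      # papers a human would have to review
    }


# Hypothetical counts: 10 hazardous papers among 100,000 submissions.
metrics = screening_metrics(tp=9, fp=1000, fn=1, tn=98990)
for name, value in metrics.items():
    print(name, value)
```

Under these assumed numbers, reviewers would have to look at about a thousand flagged papers to find nine hazardous ones, which is where the person-hours go.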

Biohazards are bad but preventing biohazards might have low marginal returns when some already exist. It's not that any new biohazard is fine; it's that marginal biohazards might be pretty rare (like something that advances the possibilities) relative to "acceptable" sorts of non-marginal biohazards (i.e., another bad genome for something as bad as what's already public knowledge). Other work might advance what's possible without being a biohazard per se (e.g., AlphaFold).

I think a way to verify if this is a good project might be to talk to the Spiez lab. They run a biosecurity conference every year and invite anyone doing work that could be dangerous to attend. 

I'm happy to chat more about it. 

Regarding the problem being expensive (because there are many publications to vet or something like that):

I think I could help optimize that part, it sounds like an engineering problem, which is a thing I'm ok with

I'd like to see a more comprehensive model for what biosecurity risk looks like that can motivate a comparison of project ideas. In the absence of that, it's really hard to say where we get the most benefit. 

We don't have to drop all our plans in favor of the one top plan

 

Or as Dumbledore said:

I do not say that carrying your father's rock is the one best possible course of action, only that it is wiser to do than not.

Sorry I'm late to the party. As per the OP's request for short takes with no explanation, mine is that this is probably not worth doing, fwiw.