No, there are not hundreds of cloud labs in biology

ljusten

This is a linkpost for https://substack.com/home/post/p-192022274

In a recent appearance on Sam Harris’s Making Sense podcast (#463, “Privatizing the Apocalypse”), Rob Reid claimed that there are “dozens, maybe a hundred” cloud labs and suggested that cloud labs are a major emerging source of biorisk.

While I agree with many points made on the podcast, I think Rob is mistaken here, both on the number of cloud labs and the current magnitude of the risk they pose.

What is a cloud lab?

Cloud labs^[1] are highly automated and flexible biological laboratories that users can operate remotely, without needing their own hardware.

On paper, the commercial and scientific motivation for cloud labs is compelling: building a wet lab is expensive, both in terms of equipment and the staff to operate it, so why not outsource those experiments to a centralized “foundry” that can do it all with robots?

In theory, this approach lowers the barriers to doing biology, since you can control things with code (easy) and don’t have to get involved in the business of manipulating molecules (hard). And since everything is automated, logged, and not subject to the whims of human performance, the experiments should be more precise and more reproducible, which is good for science.

Where are the cloud labs?

The vision of the cloud lab remains largely just that – a vision. As of today, there are only a handful of commercial cloud labs accessible to scientists or startups. The primary ones are Emerald Cloud Lab (ECL), Strateos (formerly Transcriptic), and Ginkgo Bioworks. Ginkgo’s cloud lab product launched just this month, as part of a broader pivot into lab automation as a service. Strateos runs two automated labs in California and has partnered with Eli Lilly, but has been pivoting toward deploying its software on-site at client facilities rather than expanding its own hosted cloud labs.

That’s about it for cloud labs available as a service. A handful of providers, not dozens.^[2]

As far as I’m aware, both Ginkgo and ECL are still serving only a small number of customers (ECL contracts reportedly start above $250k/year), and the reality is that getting your protocol running on any lab automation system is going to take more than clicking a button or asking your favorite AI agent. Each cloud lab has its own niche software layer for specifying experiments: ECL uses a proprietary Symbolic Lab Language and Ginkgo uses its own Catalyst software. Getting your protocol running will mean verifying compatibility with their instrumentation, developing the programmatic protocol in their software environment, and iterating on an actual automation setup. The teams at these companies will work with you on this, but it is an involved process that’s going to require consultation, and will probably only be worth the investment if you need to screen a large number of molecules or run the same protocol many times. For bespoke, low-throughput workflows, a cloud lab is probably more trouble than it’s worth.

In the words of Tessa Alexanian, right now, getting started with a cloud lab looks more like “establish a client relationship” than “open an account and make an order.”

Part of the hype around cloud labs has been driven by flashy examples, like the recent OpenAI-Ginkgo collaboration in which GPT-5 helped direct thousands of cell-free protein synthesis experiments, optimizing reaction conditions and reducing costs by 40%. But these are precisely the kind of high-throughput experiments where lab automation excels, and the process was far from fully autonomous, requiring manual reagent adjustments by Ginkgo engineers and some model nudging along the way, no doubt.

In addition to these commercial efforts, there are more decentralized approaches to automation, like $16k liquid handlers and open-source Python libraries to control them. While this means you can use Claude Code to control your liquid handler or set up a mini self-driving lab, you’re not removing the need for biological know-how. Liquid handlers are still quite constrained in what types of instrumentation they support, and if you thought lab automation was just about writing code, you’re going to inevitably face some annoying hardware challenges, and might wish you had just learned how to pipette yourself.

What are the risks?

The primary threat model I hear around cloud labs is that they could lower the barrier to making dangerous pathogens by circumventing the need for wet lab skills and expensive lab equipment, and lowering the risk of self-infection or detection.

This is a concern worth taking seriously. But most proponents of this threat model, like Rob, overestimate both the number of cloud labs and their maturity, treating them as a sort of black box where arbitrary protocols go in and molecules or data come out. That is not the current state of affairs. If you want to use a cloud lab to make a pathogen, you’re going to have some awkward conversations about your goals and desires with the cloud lab providers, who are not neutral black boxes and will, in fact, ask questions about your protocol and end goals.

For a lot of core dual-use biology workflows, like reverse genetics, I also don’t think cloud labs are going to be the best fit, since the workflows are fairly bespoke and you’re not running enormous high-throughput experiments. For these workflows, I would be more concerned about the risks from contract research organizations (CROs), which are sort of like cloud labs except humans do the work.

I do think there are real risks around using cloud labs to generate valuable data for pathogen optimization, and cloud lab operators should take meaningful steps to reduce those risks. But when I survey the current, highly neglected state of biorisk, the cloud lab risk is not among my top concerns. Still, lab automation is going to get better, and it’s reasonable to start developing and integrating safeguards now, even if the overall risk is low.

Safeguards for cloud labs

The main safeguards for cloud labs are akin to those in the commercial DNA synthesis world, with providers trying to prevent malicious actors from obtaining dangerous pathogen DNA. These safeguards are: (1) screening incoming requests for hazardous features, (2) performing know-your-customer (KYC) screening to ensure the end users are legit, and (3) developing the requisite standards, legislation, and industry groups so that this type of screening and KYC is enforced nationally and ideally globally, so evasion is harder than just switching suppliers.

For (1), it seems reasonable for cloud labs to screen both the nucleic acid substrates and the actual protocols, to ensure that they are not facilitating the creation or modification of dangerous pathogens. In addition to being good for biosecurity, this kind of screening is also important for biosafety – if your lab automation engineers are walking around and interacting with equipment, you probably want to ensure nobody is making prions or pathogens on your setup without your knowledge.
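
To make safeguards (1) and (2) concrete, here is a minimal sketch of what an intake check could look like in principle. Everything in it is hypothetical: the `Submission` fields, the `review_submission` function, and the placeholder flag lists are invented for illustration and do not describe how any actual provider screens requests. A real system would rely on curated databases and expert review rather than string matching.

```python
from dataclasses import dataclass

# Hypothetical placeholders only; real screening would use curated
# databases and human experts, not hard-coded strings.
FLAGGED_SEQUENCE_IDS = {"flagged-seq-001"}
FLAGGED_PROTOCOL_TERMS = {"prion", "reverse genetics"}


@dataclass
class Submission:
    customer_verified: bool      # outcome of a KYC check performed elsewhere
    sequence_ids: list[str]      # identifiers of nucleic acid inputs
    protocol_steps: list[str]    # free-text protocol steps


def review_submission(sub: Submission) -> list[str]:
    """Return reasons to escalate to human review (empty if none were found)."""
    reasons = []
    # Safeguard (2): know-your-customer.
    if not sub.customer_verified:
        reasons.append("customer not verified (KYC)")
    # Safeguard (1a): screen the nucleic acid inputs.
    for seq_id in sub.sequence_ids:
        if seq_id in FLAGGED_SEQUENCE_IDS:
            reasons.append(f"flagged sequence: {seq_id}")
    # Safeguard (1b): screen the protocol text itself.
    for step in sub.protocol_steps:
        lowered = step.lower()
        for term in sorted(FLAGGED_PROTOCOL_TERMS):
            if term in lowered:
                reasons.append(f"flagged protocol term: {term}")
    return reasons


if __name__ == "__main__":
    request = Submission(True, ["seq-123"], ["Dilute sample 1:10"])
    print(review_submission(request))
```

The sketch returns reasons for escalation rather than a yes/no verdict, reflecting the point above that providers ask questions about protocols and end goals rather than acting as a neutral black box.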

Conclusion

I’m enthusiastically on board with making the world safer against biological threats and unlocking all the beneficial applications of biology – it’s the mission around which I’ve built my PhD and career up to this point. But given the neglected state of biosecurity, we should focus resources and attention where they matter most and remain evidence-based about the risks, lest we alienate much-needed allies by crying wolf where there’s only a pup.

^[1]
In addition to the RAND report I link above, Tessa Alexanian wrote this great primer on cloud labs back in 2023 for policymakers, which is still relevant today. The Biden-era Framework for Nucleic Acid Synthesis Screening also includes an official definition of cloud labs as a “highly automated research laboratory possessing a diversity of analytical and synthesis capabilities across the life sciences and that can be remotely operated by specifying experimental protocols via software.”
^[2]
There are other types of automated biology labs that I’m hesitating to call cloud labs but others might. A 2025 RAND report identified 15 “cloud lab organizations” globally, but from what I can tell, most of these are internal platforms for private entities, not something a startup or a scientist would have access to. Some service companies like Arctoris and Adaptyv Bio brand themselves as cloud labs, but are more specialized and less autonomous. For the purposes of biosecurity, I think we should apply much the same security principles, treating cloud labs as a subset of CROs subject to less direct human intervention and oversight.

