
It's been a busy season at the Nucleic Acid Observatory, and we have a lot to share since our last update. As always, if anything here is particularly interesting or if you're working on similar problems, please reach out!

Wastewater Sequencing

We performed an initial analysis of untargeted sequencing data from aggregated airplane lavatory waste and municipal treatment plant influent that we collected and processed during our Fall 2023 partnership with CDC's Traveler-based Genomic Surveillance program and Ginkgo Biosecurity. We've now analyzed viral abundance and diversity in sequencing data across multiple sample types and wastewater-processing and sequencing protocols. Next steps include further investigating how protocol and sample type affect specific viruses and bacteria, as well as understanding the differences in pathogen temporal dynamics between airport and treatment-plant samples.

We have continued to work with Pardis Sabeti's Lab at the Broad Institute to develop an optimized protocol for preparing RNA sequencing libraries from nucleic acids extracted from wastewater. By the end of the year, we plan to use this protocol to sequence the samples collected during our Fall 2023 effort described above.

We have also been collaborating with Jason Rothman, formerly of Katrine Whiteson's lab at the University of California, Irvine, and now at his own lab at the University of California, Riverside, to sequence and analyze Southern California wastewater. The sequencing is complete, with a total of 45B read pairs. We're aiming to make the raw sequencing data public with a data paper later this fall.

We've scaled up our collaboration with Marc Johnson's group at the University of Missouri. Marc's group is now running a full flow cell every other week. While the MU Genomics Core had been sequencing these on a NovaSeq 6000 with the S4 flow cell, they've moved to the NovaSeq X+ with the 10B flow cell, which lowers sequencing costs by about 30% due to cheaper reagents. These runs include samples from sewersheds in multiple major metropolitan areas, among them four Chicago-area sewersheds through a new collaboration with Rachel Poretsky at the University of Illinois Chicago. We're still looking for additional partnerships: if you have a relationship with a treatment plant in a city with a large amount of international travel, we can potentially sequence your influent and share the data with you.

Pooled Individual Sequencing

We are continuing our exploration of pooled individual sequencing, where we go to busy public places and collect nasal swab samples. We've been out to sample eight times, collecting an average of 33 samples per hour at compensation levels of $5/person and $2/person, and we plan to evaluate other compensation options.

We recently received permission to sample on the MIT campus, including inside buildings, and have applied for permission to sample inside Boston's public transit stations. This is important for our effort because we expect outdoor sampling in the winter to have a lower participation rate, in addition to being less pleasant for the experimenters.

We're developing methods for extracting the nucleic acids from these samples and running Nanopore sequencing. We're iterating on approaches to maximize the number of viral reads, which primarily means minimizing the fraction of human genome reads while maximizing overall yield.

Other Sampling Strategies

While we are primarily working with wastewater and nasal swabs, we think other sampling approaches are still worth exploring; air and blood samples seem especially promising.

Nucleic Acid Tracers

We now have permission to use our nucleic acid tracers for an experiment where we deposit them in a toilet and measure concentration at a wastewater treatment plant, which we call a "deposition experiment". This is a great outcome from a multi-year process involving collaboration with multiple regulatory bodies. This fall we will be characterizing sequencing efficacy via spike-ins to estimate the amount we would need to deposit to be detectable in our ongoing surveillance, and if efficacy is sufficiently high we hope to run a deposition experiment.
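To make the detectability question concrete, here is a back-of-envelope sketch of the kind of estimate the spike-in work feeds into. All numbers are illustrative assumptions of ours (plant flow, detection limit, recovery efficiency), not NAO measurements:

```python
# Hypothetical estimate of how much tracer a deposition experiment would need.
# Every input below is an assumed round number for illustration only.

def required_deposit_copies(
    daily_flow_liters: float,                 # plant influent volume per day
    detection_limit_copies_per_liter: float,  # what spike-in efficacy work would characterize
    recovery_efficiency: float,               # fraction of deposited tracer surviving transit/processing
) -> float:
    """Copies to deposit so the diluted tracer still exceeds the detection limit."""
    copies_needed_at_plant = detection_limit_copies_per_liter * daily_flow_liters
    return copies_needed_at_plant / recovery_efficiency

# Example: a ~100 MGD plant (~3.8e8 L/day), a 10 copies/L detection limit,
# and 10% end-to-end recovery.
deposit = required_deposit_copies(3.8e8, 10, 0.1)
print(f"{deposit:.2e} copies")  # → 3.80e+10 copies
```

The point of the spike-in characterization is to pin down the detection-limit and recovery terms, which dominate this calculation.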

Analysis of Sequencing Data

We've now moved our primary analysis over to our completely rewritten Nextflow-based metagenomic sequencing pipeline. The new pipeline is much more scalable, using AWS Batch to distribute processing across many machines. We've also made good progress in measuring and optimizing costs, to the point where we can now analyze a billion read pairs for under $10. This is under 1% of the cost of producing this data: the cost in flow cells alone is around $1k per billion read pairs. With costs at an acceptable level, we've now run all our internal sequencing data through the new pipeline.
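The cost comparison above is simple, but worth making explicit using the round numbers quoted in the text (under $10 of analysis versus ~$1k of flow cells per billion read pairs):

```python
# Sanity check on the quoted cost ratio, using the round numbers from the text.
analysis_cost_per_gbp = 10.0      # dollars per billion read pairs (upper bound)
sequencing_cost_per_gbp = 1000.0  # flow-cell cost alone, same unit

fraction = analysis_cost_per_gbp / sequencing_cost_per_gbp
print(f"analysis is {fraction:.0%} of sequencing cost")  # → analysis is 1% of sequencing cost
```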

Our genetic engineering detection pipeline is now operational, after months of work on reducing false positives. In addition to the genetically engineered viral vector we described in our initial announcement, we have since detected two additional HIV-based viral vectors. We ran a preliminary positive control experiment where our collaborators at MU spiked engineered lentiviral particles into wastewater influent, and are currently writing up results to share. We'll be speaking about this work at the AI-powered Diagnostics session of CBD S&T 2024.

We are also exploring other approaches to detecting novel pathogens in metagenomic data. This summer we resumed work developing a method for reference-free detection based on flagging and assembling exponentially increasing sequences in wastewater data. We plan to investigate additional strategies this fall.

We've recently shared an updated version of our preprint on the relationship between relative abundance in published metagenomic sequencing data and incidence or prevalence in public health data. Among other changes, the preprint now uses our new pipeline for analysis. We also applied these methods to the unpublished sequencing data collected by our collaborators at MU and UCI to generate an estimate for the sequencing depth necessary to detect influenza A and B.
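The depth estimate rests on a simple relationship: if you know a virus's relative abundance at some reference prevalence, you can estimate the read pairs needed to expect a given number of viral reads at the current prevalence. The sketch below uses assumed numbers, not the preprint's estimates:

```python
# Illustrative depth-for-detection calculation. The relative-abundance and
# prevalence values are assumptions, not figures from the NAO preprint.

def depth_for_detection(
    ra_at_1pct: float,         # relative abundance when 1% of people are infected
    prevalence: float,         # current fraction of the population infected
    reads_wanted: float = 1.0, # expected viral read pairs we want to see
) -> float:
    """Read pairs needed so expected viral read pairs >= reads_wanted,
    assuming relative abundance scales linearly with prevalence."""
    ra = ra_at_1pct * (prevalence / 0.01)
    return reads_wanted / ra

# Example: RA(1%) of 1e-7 at 0.5% prevalence → 2e7 read pairs per expected viral read.
print(f"{depth_for_detection(1e-7, 0.005):.1e}")  # → 2.0e+07
```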

Our collaborators Willie Neiswanger and Oliver Liu at the University of Southern California have a paper accepted at an upcoming NeurIPS workshop. This paper, A Foundation Model for Metagenomic Sequences, describes their work training a metagenomic foundation model on the sequencing data our MU and UCI collaborators collected. Their goal is to apply this model to pathogen-agnostic detection.

A range of groups have told us they're potentially interested in drawing on our expertise in pathogen-agnostic detection. If you're interested in our analysis services, please let us know.

Organizational Updates

After comparing many candidate representations of our work in scoping and building a Nucleic Acid Observatory, we now have a logo!


In preparation for scaling up our wet lab operations, we've secured additional wet lab space outside of MIT, at BioLabs Tufts Boston. This is a shared lab facility in downtown Boston, and our first day with the additional space is today, October 17th.

We recently hired two new Research Scientists, Vanessa Smilansky (at MIT) and Evan Fields (at SecureBio). Vanessa has broad research experience in pathogen detection and nucleic acid sequencing. At the University of Exeter, she developed targeted genomics methods for surveillance of amphibian-infecting protists. Now she will be joining our Near-Term First group to work on untargeted metagenomic methods for surveillance of human-infecting viruses. Evan joins us from Zoba, where he led data science and software teams optimizing shared mobility, and he will be working in our Robust Detection group. Outside of work he's an avid baker.

The NAO is a collaboration between the Sculpting Evolution group at MIT and SecureBio, and the latter recently said goodbye to Operations Manager Tiff Tzeng. She leaves to help start a new AI Safety organization which will publicly develop tools for the responsible deployment of artificial intelligence. With her departure, SecureBio is hiring a Director of Operations (job posting).

