
Cross-posted from my NAO Notebook. Thanks to Evan Fields and Mike McLaren for editorial feedback on this post.

In Detecting Genetically Engineered Viruses With Metagenomic Sequencing we have:

our best guess is that if this system were deployed at the scale of approximately $1.5M/y it could detect something genetically engineered that shed like SARS-CoV-2 before 0.2% of people in the monitored sewersheds had been infected.

I want to focus on the last bit: "in the monitored sewersheds". The idea is, if a system like this is tracking wastewater from New York City, its ability to raise an alert for a new pandemic will depend on how far along that pandemic is in that particular city. This is closely related to another question: what fraction of the global population would have to be infected before it could raise an alert?

There are two main considerations pushing in opposite directions, both based on the observation that the pandemic will be farther along in some places than others:

  • With so many places in the world where a pandemic might start, the chance that it starts in NYC is quite low. To take the example of COVID-19, when the first handful of people were sick they were all in one city in China. Initially, prevalence in monitored sewersheds in other parts of the world will be zero, while global prevalence will be greater than zero. This effect should diminish as the pandemic progresses, but at least in the <1% cumulative incidence situations I'm most interested in it should remain a significant factor. This pushes prevalence in your sample population to lag prevalence in the global population.

  • NYC is a highly connected city: lots of people travel between there and other parts of the world. Since pandemics spread as people move around, places with many long-distance travelers will generally be infected before places with few. If you were monitoring an isolated sewershed, you'd expect this factor to add further lag to your sample prevalence. If you specifically choose places like NYC, though, we expect the high connectivity to reduce lag relative to global prevalence, and potentially even to put sample prevalence ahead of global prevalence.

My guess is that with a single monitored city, even the optimal one (which one is that even?), your sample prevalence will significantly lag global prevalence in most pandemics, but by carefully choosing a few cities to monitor around the world you can probably get to where it leads global prevalence. But I would love to see some research and modeling on this: qualitative intuitions don't take us very far. Specifically:

  • How does prevalence at a highly-connected site compare to global prevalence during the beginning of a pandemic?

  • What if you instead are monitoring a collection of highly-connected sites?

  • What does the diminishing returns curve look like for bringing additional sites up? Does it go negative at some point, where you are sampling so many excellent sites that the marginal site is mostly dilutative?

  • If you look at the initial spread of SARS-CoV-2, how much of the variance in when places were infected is explained by how connected they are?

  • What about with data from the spread of influenza and SARS-CoV-2 variants?

  • Are there other major factors aside from connectedness that lead to earlier infection? Can we model how valuable different sites are to sample, in a way that can be combined with how operationally difficult it is to sample in various places?
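To make the first few questions concrete, here is a minimal toy sketch of the kind of model I have in mind, not a calibrated one. It is a deterministic metapopulation SIR model in which each site mixes a small fraction of its contacts through a shared pool of travelers. Everything in it is an assumption made for illustration: the function names, the four made-up sites, the `connectivity` fractions, and the `beta` and `gamma` values. It only shows how you would compare when a monitored site and the world as a whole cross a given cumulative incidence.

```python
def simulate(populations, connectivity, seed_site,
             beta=0.3, gamma=0.1, days=300, seed_infections=10.0):
    """Toy metapopulation SIR model with one-day time steps.

    connectivity[k] is the (hypothetical) fraction of site k's contacts
    that happen in a shared pool of travelers instead of locally.
    Returns one list per day of cumulative infections at each site.
    """
    n = len(populations)
    susceptible = [float(p) for p in populations]
    infected = [0.0] * n
    cumulative = [0.0] * n
    susceptible[seed_site] -= seed_infections
    infected[seed_site] = seed_infections
    cumulative[seed_site] = seed_infections

    traveler_pop = sum(c * p for c, p in zip(connectivity, populations))
    history = []
    for _ in range(days):
        # Prevalence among travelers, computed from the start-of-day state.
        traveler_inf = sum(c * x for c, x in zip(connectivity, infected))
        pool_prev = traveler_inf / traveler_pop
        for k in range(n):
            local_prev = infected[k] / populations[k]
            force = beta * ((1 - connectivity[k]) * local_prev
                            + connectivity[k] * pool_prev)
            new_infections = min(susceptible[k], force * susceptible[k])
            recoveries = gamma * infected[k]
            susceptible[k] -= new_infections
            infected[k] += new_infections - recoveries
            cumulative[k] += new_infections
        history.append(list(cumulative))
    return history


def first_day_at_or_above(history, populations, sites, threshold):
    """First day the pooled cumulative incidence of `sites` reaches
    `threshold`, or None if it never does."""
    pop = sum(populations[k] for k in sites)
    for day, cumulative in enumerate(history):
        if sum(cumulative[k] for k in sites) / pop >= threshold:
            return day
    return None


# Made-up scenario: origin city, connected hub, isolated city, rest of world.
populations = [11e6, 8e6, 8e6, 1e9]
connectivity = [0.005, 0.01, 0.0001, 0.001]
history = simulate(populations, connectivity, seed_site=0)

for label, sites in [("hub", [1]), ("isolated", [2]), ("global", [0, 1, 2, 3])]:
    day = first_day_at_or_above(history, populations, sites, threshold=0.01)
    print(label, "reaches 1% cumulative incidence on day", day)
```

Changing which sites you pass to `first_day_at_or_above`, or where the outbreak is seeded, is the toy version of asking how a collection of monitored sites compares to global prevalence. A real answer would need actual travel data and a stochastic model.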

If you know of good work on these sorts of modeling questions or are interested in collaborating on them, please get in touch! My work email is jeff at securebio.org.


Comments

Jeff, your notes on NAO are fascinating to read! I have nothing to add other than that I hope you keep posting them

Great post - I really enjoyed reading this.

I would have thought the standard way to resolve some of the questions above would be to use a large agent-based model, simulating disease transmission among millions of agents and then observing how successful some testing scheme is within the model (you might be able to backtest the model against well-documented outbreaks).

I'm not sure how much you'd trust these models over your intuitions, but I'd guess they'd have quite a lot of mileage.

I've only skimmed these papers, but these seem promising and illustrative of the direction to me: 

The best work on global-scale analysis of epidemics is probably by GLEAM. I doubt full agent-based modelling at small scales gives you much beyond massively complicating the model.

This effect should diminish as the pandemic progresses, but at least in the <1% cumulative incidence situations I'm most interested in it should remain a significant factor.

1% cumulative incidence is quite high, so I think this is probably far along you're fine. E.g. we've estimated London hit this point for COVID around 22 Mar 2020 when it was pretty much everywhere.

[This comment is no longer endorsed by its author]

I think this is probably far along you're fine

I'm not sure what you mean by this?

(Yes, 1% cumulative incidence is high -- I wish the NAO were funded to the point that we could be talking about whether 0.01% or 0.001% was achievable.)

Sorry, I answered the wrong question, and am slightly confused what this post is trying to get out. I think your question is: will NYC hit 1% cumulative incidence after global 1% cumulative incidence?

I think this is almost never going to be the case for fairly indiscriminately-spreading respiratory pathogens, such as flu or COVID.

The answer is yes only if NYC's cumulative incidence is lower than the population-weighted global mean. Due to connectedness, I expect NYC to always be hit pretty early, as you point out, definitely before most rural communities. I think the key point here is that NYC doesn't need to be ahead of the epicentre of the disease, only the global mean.

One way of looking at this is how early on does NYC get hit compared to other cities/regions. This analysis (pdf) orders cities by connectedness to Wuhan to answer this question for COVID. It looks like they've released an online tool that lets you specify different origin locations and epidemiological parameters. So you could rank how early NYC gets hit for a range of different scenarios.

by carefully choosing a few cities to monitor around the world you can probably get to where it leads global prevalence

This would surprise me. It's hard to imagine a scenario where the arrival time at different major travel hubs is very desynchronized as these locations are highly connected to each other. So you'd probably then end up looking at a long tail of locations which are poorly connected to the main travel hubs.

[I] am slightly confused what this post is trying to get out. I think your question is: will NYC hit 1% cumulative incidence after global 1% cumulative incidence?

That's one of the main questions, yes.

The core idea is that our efficacy simulations are in terms of cumulative incidence in a monitored population, but what people generally care about is cumulative incidence in the global (or a specific country's) population.

online tool

Thanks! The tool is neat, and it's close to the approach I'd want to see.

I think this is almost never ... would surprise me

I don't see how you can say both that it will "almost never" be the case that NYC will "hit 1% cumulative incidence after global 1% cumulative incidence" but also that it would surprise you if you can get to where your monitored cities lead global prevalence?

I don't see how you can say both that it will "almost never" be the case that NYC will "hit 1% cumulative incidence after global 1% cumulative incidence" but also that it would surprise you if you can get to where your monitored cities lead global prevalence?

Sorry, this is poorly phrased by me. I meant that it would surprise me if there's much benefit from adding a few additional cities.

Possibly! That would certainly be a convenient finding (from my perspective) if it did end up working out that way.

Thank you, this is fascinating. Is there an option to monitor wastewater just from airports (as well as generally for a whole city)? Then anything brought in on international flights might be less diluted and you might be able to detect it sooner, idk?

I realise that the world is a little bit different than in 1918, but given that the Spanish Flu was spread by troop movements, I wonder what the various militaries are doing and if they see themselves as having a role in pandemic prevention?

The NAO ran a pilot where we worked with the CDC and Ginkgo to collect and sequence pooled airplane toilet waste. We haven't sequenced these samples as deeply as we would like to yet, but initial results look very promising.

Militaries are generally interested in this kind of thing, but primarily as biodefense: protecting the population and service members.

Thanks for the post! This may not be helpful, but one thing I would be curious to see is how the dispersion coefficient k (discussed here; I'm sure there's a better reference source) affects the importance of having many sites. With COVID, a lot of transmission came from superspreader events, which intuitively would increase the variance of how quickly it spread in different sites. On the other hand, the flu has a low proportion of superspreader events, so testing in a well-connected site might explain more of the variance?

I haven't done or seen any modeling on this, but intuitively I would expect the variance due to superspreading to have most of its impact in the very early days, when single superspreading events can meaningfully accelerate the progress of the pandemic in a specific location, and to be minimal by the time you get to ~1% cumulative incidence?
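For anyone who wants to poke at this intuition, here is a toy branching-process sketch, offered as an illustration and not as a result. Each case infects a negative binomial number of others, drawn as a gamma-Poisson mixture with mean `r0` and dispersion `k`. The function names, `r0 = 2.5`, the two `k` values, and the 200-case target are all assumptions chosen for the example, and time is counted in generations, not days.

```python
import math
import random
import statistics


def poisson(rng, lam):
    """Poisson draw using Knuth's method; fine for the small rates here."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def offspring(rng, r0, k):
    """Secondary cases from one case: gamma-Poisson mixture, which is a
    negative binomial with mean r0 and dispersion k (small k means more
    superspreading)."""
    return poisson(rng, rng.gammavariate(k, r0 / k))


def generations_to_size(rng, r0, k, target=200, max_generations=60):
    """Generations until cumulative cases reach `target`, starting from
    one case. Returns None if the outbreak dies out or never gets there."""
    active, total = 1, 1
    for generation in range(1, max_generations + 1):
        active = sum(offspring(rng, r0, k) for _ in range(active))
        total += active
        if active == 0:
            return None
        if total >= target:
            return generation
    return None


rng = random.Random(0)
for k in (0.1, 10.0):  # illustrative: heavy vs. light superspreading
    runs = [generations_to_size(rng, 2.5, k) for _ in range(300)]
    took_off = [g for g in runs if g is not None]
    spread = statistics.pstdev(took_off) if took_off else None
    print("k =", k,
          "| fraction that fizzled:", 1 - len(took_off) / len(runs),
          "| spread in generations to target:", spread)
```

Raising `target` toward something like 1% of a city's population is the toy way to check whether the variance from superspreading washes out by then. Seeding each site with a single case ignores repeated importations, which a real model would need to include.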
