I like to ask questions and talk about ideas. www.tinarwhite.com
I posted on EA Forum about a month ago with an idea for a privacy-focused contact tracing app and started a volunteer team to build it. We now have about a 20-person team working on Slack. We've made great progress and it's almost done. You can see our progress and arguments for the intervention here:
And my original EA forum post is here:
If you're interested in collaborating, please see our collaborate page here:
[Edit: Formerly at https://www.covid19risk.com]
We have a website now too. Please check out our collaborate page if you'd like to help: https://www.covid19risk.com/
CoEpi is a great team too! We're helping each other out now, with a lot of cross-communication between our Slack channels.
Wow, what a great dataset! If you have some colleagues who might be interested, please link them here to the forum. I've also made a couple of public Facebook posts about it, looking for collaborators in COVID-19 discussion groups and publicly on my personal page here and here, with more information about who can help.
This app idea is focused on privacy, but not for the reasons people may assume. It isn't just an abstract ideal. It's privacy-focused because it would need to be. The app would rely on people voluntarily sharing data. In many countries, people do not feel safe sharing certain kinds of information because of their government. And if it's not safe to share your information, people won't. Making this as safe as possible would be crucial, and that's what the privacy focus is about.
3) You mention that Google traffic data is still useful, even when few people use it. I am not familiar with that part of the app, but if it involves some form of prediction, it is important to note that Google has had years to get this right. With a pandemic, you have at best months(!), and on top of that the situation changes constantly.
Yes. This is another reason why working with someone like Google Maps or some other mapping app could be crucial, because they have accumulated domain / tribal knowledge that no one else might have.
[Edit: I've received some private feedback that neither of us might be right here. The calculation for both (density of traffic and density of "infectiousness") may be quite straightforward. The reason traffic updates got so much better might just have been more data.]
2) Regarding the computation of the risk score: If you only use confirmed cases with voluntary sign up, you might not get enough data; if you use suspected cases by symptoms, you will get a lot of false positives due to worried people with the flu. In the absence of data on how to properly account for that, this is a very difficult problem.
These are significant challenges and I talk a bit about how they can be addressed in The Incentives Align and at the end of the section Example App Questionnaire. I imagine there would always be more confidence put in a confirmed case with a code than someone who just answers yes to having cold or flu symptoms recently.
Also, for confirmed cases, as part of contact tracing, the CDC sometimes identifies a site of concern where a patient might recall that something particularly infectious happened before they were aware they were sick. For example: "Oh. I remember that a few days ago, I sneezed quite forcefully and unexpectedly at my favorite buffet, so I couldn't cover my nose. Oh, and then again on the way home on the BART! I'm so sorry." Tracing multiple paths backwards, you might get a lot of data from a single event.
1) How likely are you to catch the virus at all just by being in the same area/frequenting the same shops as somebody infected? My impression from the Western cases so far was that infections generally occurred with close contacts; this risk obviously changes when more infected people are around, but it should still be estimated to decide whether such an app would be worth it.
I imagine the CDC could already have some estimates for this, which they might use in contact tracing. And it might turn out that contact tracing is enough to solve the problem. It seems to be working well right now in the U.S.
But if not, a not-so-educated guess at a general outline of the app's risk calculation might be: (1) Close contact. You were in the same location where a possibly infectious person spent some time, and you were there within maybe 10 minutes of them. This is higher risk. (2) Semi-close contact. If the virus might live on surfaces for a few hours, the close-contact risk distribution tapers off over a few hours. (3) Infectiousness. This isn't binary, so the infectious person's distribution also peaks sometime around their first symptoms and tapers off over a few days, until the tail reaches some maximum.
A heat map that changes with time could reflect this information, and any individual's risk could be calculated by integrating over it. And if a locality is suddenly found to have had multiple infectious people in it in the last few weeks, and that level of precision were possible, you could run multiple iterations of this, accounting for each user's (small) risk of having contracted the virus from their own interactions, given an estimate of the time between when a person contracts the virus and when they become infectious. This is harder, but if everything else works, it's possible.
Writing this out in words is long, but the actual summation/integral is not too complicated. It's a combination of science and hack-y guesses. But this seems true to me of almost all engineering.
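To make the outline above a bit more concrete, here is a minimal sketch of the summation in Python. Everything numeric in it is an assumption for illustration only, not an epidemiological estimate: the two-hour surface half-life, the Gaussian shape and two-day spread of the infectiousness curve, and the function names (time_gap_weight, infectiousness, risk_score) are all hypothetical. Only the "within maybe 10 minutes" window comes from the outline itself.

```python
import math

# All constants are illustrative assumptions, not measured values.
CLOSE_WINDOW_MIN = 10.0        # "within maybe 10 minutes" from the outline
SURFACE_HALF_LIFE_MIN = 120.0  # assumed: surface risk halves every 2 hours
INFECTIOUS_SPREAD_DAYS = 2.0   # assumed width of the infectiousness curve


def time_gap_weight(gap_minutes: float) -> float:
    """Weight for how long after the infectious person the user was there."""
    if gap_minutes < 0:
        return 0.0  # the user was there before the infectious person
    if gap_minutes <= CLOSE_WINDOW_MIN:
        return 1.0  # (1) close contact
    # (2) semi-close contact: risk tapers off over a few hours
    return 0.5 ** ((gap_minutes - CLOSE_WINDOW_MIN) / SURFACE_HALF_LIFE_MIN)


def infectiousness(days_from_symptom_onset: float) -> float:
    """(3) Non-binary infectiousness, peaking at first symptoms.

    A symmetric Gaussian is assumed here purely for simplicity.
    """
    return math.exp(-0.5 * (days_from_symptom_onset / INFECTIOUS_SPREAD_DAYS) ** 2)


def risk_score(visits) -> float:
    """Sum exposure over visits.

    Each visit is (gap_minutes, days_from_symptom_onset, minutes_stayed).
    """
    return sum(
        time_gap_weight(gap) * infectiousness(days) * stayed
        for gap, days, stayed in visits
    )


# Two 30-minute visits on the day of symptom onset:
# one 5 minutes after the infectious person, one 130 minutes after.
print(risk_score([(5, 0.0, 30), (130, 0.0, 30)]))  # 45.0
```

The output is an unnormalized exposure total, not a probability. Turning it into a coarse risk level would need real data on transmission, which is exactly the hack-y part.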
0) Even basic questions about the virus and how it spreads are still unanswered, like how infectious one is during the incubation period. This makes more advanced questions regarding a risk score difficult to answer.
I agree. And there are a lot of resources being put into research on this right now, so in time I hope we have better answers. But even imperfect information could be helpful. See the Q&A with Sukrit Silas. At first I imagined the app could only give a risk score that was very coarse. Just levels 1-7. I've commented separately, header Example Score Levels, with an example of what I mean, which I didn't put in the post because I have no confidence in what it should be. But you might be able to show a decimal point too. I think it's good not to start out being committed to any kind of risk scoring system until you have a sense of what's possible.
Thank you for raising the concerns that you did. I appreciate the opportunity to explain more about how I've been thinking about these concerns. This is the kind of feedback I was hoping for from posting on EA Forum. I'll try to address each one.