Cross-posted to the Oxford Prioritisation Project blog.
Read and comment on the Google Document version of this post here. Feedback from the EA community is how we improve.
I consider open science as a cause area, by reviewing Open Phil’s published work, as well as some popular articles and research, and assessing the field for scale, neglectedness, and tractability. I conclude that the best giving opportunities will likely be filled by foundations such as LJAF and Open Phil, and recommend that the Oxford Prioritisation Project focusses elsewhere.
Open Science (OS) is a movement to make the process of scientific research more transparent and accessible. Its aims include encouraging researchers to share their workflows, data, and code so that they are available for independent checking; pre-registering studies and analysis methods to combat post-hoc analysis (e.g. p-hacking) and publication bias; and changing incentives in favour of these practices, as well as in favour of publishing negative results, reproducing others’ experiments, and typically less prestigious but hugely valuable tasks like providing quality peer review or generating data for others to analyze.
In the current system, journals prefer to publish attention-grabbing and novel results as opposed to less sexy, but more careful, studies. In academia one is primarily judged by one’s track record of publications, particularly in top journals, and by one’s number of citations, which aren’t necessarily good proxies for accuracy. By straying off this narrow path of publish-or-perish, for example by taking the time to design more careful studies and disseminate the material, or by pursuing a more risky but potentially more rewarding research topic, one is likely to be outcompeted for prestige and advancement by other people in the field, even if these are things one really values.
A priori, it seems like Open Science could have very large returns, plausibly by:
● Improving productivity. If more material from studies is publicly available, scientists can build on others’ research, data, and tools, as well as collaborate and verify each other’s data, more easily. Even a slightly higher annual rate of progress might, compounded over time, grow a field enough to bring important advances such as new drugs and vaccines forward by many years.
● Better reproducibility, because more replications are published and researchers take more care over their work.
● Increasing public engagement and trust in science generally, leading to better policy choices.
● Better prioritization of resources and funding, as it will be easier to see which studies should be replicated, or whether and where to use the resources to innovate instead.
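To make the compounding claim in the productivity point concrete, here is a minimal sketch. It assumes that a field's cumulative progress can be modelled as growing at a constant annual rate, which is a simplification of mine rather than something the sources claim, and the growth rates in the usage note are purely illustrative. The function names are hypothetical.

```python
import math


def years_to_reach(target_multiple: float, annual_growth: float) -> float:
    """Years for a quantity compounding at `annual_growth` per year
    to reach `target_multiple` times its starting level."""
    return math.log(target_multiple) / math.log(1.0 + annual_growth)


def years_brought_forward(baseline_years: float,
                          baseline_growth: float,
                          boosted_growth: float) -> float:
    """How much earlier a milestone arrives under the boosted growth rate.

    The milestone is defined as the level of progress the field would
    reach after `baseline_years` at `baseline_growth` (an assumption of
    this toy model, not an empirical claim).
    """
    milestone = (1.0 + baseline_growth) ** baseline_years
    return baseline_years - years_to_reach(milestone, boosted_growth)


if __name__ == "__main__":
    # Illustrative only: 3.0% vs 3.3% annual progress, milestone 50 years out.
    saved = years_brought_forward(50, 0.03, 0.033)
    print(f"Milestone arrives about {saved:.1f} years earlier")
```

Under these made-up numbers, a milestone 50 years away arrives between four and five years sooner. The result is very sensitive to the assumed rates and horizon, so it should be read as an illustration of the mechanism and not as an estimate.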
The strongest arguments I have found against seem to be:
● Danger of misuse. There may be some research which should not be made publicly available, for example that which could be used for bioterror. On the other hand, organisations like OpenAI hope to encourage open research into potentially risky artificial intelligence, under assumptions such as that developments in AI will occur anyway due to the economic incentives, and it is better for them to be out in the open so that we can better prepare for the outcomes, rather than in secret where a small group of people have access to the information. In general, this seems a poor argument against Open Science in most other cases, and steps could be taken to ensure research in especially risky areas isn’t publicly available.
● Too much information and data. There may be much more low-quality data, making it harder to verify all of it. In general, being more transparent and careful takes more time and effort, which could hold back certain fields. The data could also be deliberately cherry-picked by non-scientists, particularly in the media, reducing public understanding of science. Overall, this consideration seems outweighed by the benefits.
In doing this research, I have mainly looked at the Open Philanthropy Project’s writing on the topic, as well as other posts by the effective altruism community.
Note that related to Open Science, there are also issues of neglected goals and breakthrough fundamental science (high risk, high reward research), translational science (e.g. finding ways to apply theoretical knowledge), and science policy and infrastructure (such as government funding). These also seem worth considering separately.
Broadly, my guess is that this is an important area in which there is potential space for new organizations and funding in future, but that today there are only a few players (Center for Open Science, METRICS, and recently SSMART) in the area, with little room for more funding, due to the interest in the area by, and funding from, the Laura and John Arnold Foundation (LJAF).
From Open Phil’s “Our Landscape of the Open Science Community”:
● There is a lot of for-profit funding, particularly in the journal system, but it is unclear how aligned their incentives are with making science more open.
● “In general, it seems to us that there is currently much more organizational activity on the “building tools and platforms” front than on the “changing incentives and advocating for better practices” front.”
From “Meta-Research Innovation Center at Stanford (METRICS)”:
“At this time, our take is that:
● There is more philanthropic interest in promoting reproducibility and open science (which had initially been the focus of our investigations) than we had initially guessed. (Note that LJAF became interested in these areas around the time that we did.)
● Given the players that are already present, and having explored some other areas, our sense is that there are higher-leverage causes for us to enter.
● We are trying to explore the broader concept behind “meta-research” - thinking about how to improve the incentives that academics face, both as they relate to reproducibility and as they relate to other matters - as part of our scientific research investigations.
● We are extremely happy to see the launch of METRICS, which we believed to be probably the single most exciting giving opportunity we saw while exploring meta-research (and one we might have considered for a recommendation if not for LJAF’s interest).”
From a recent profile in Wired of John Arnold of the Laura and John Arnold Foundation, describing the public and scientific awareness of the problem (article also details LJAF’s interest in funding Open Science orgs):
● “Denis Calabrese, the Arnold Foundation’s president, says they don’t expect immediate results. The Arnolds have a “multiple-decade timeline to work on problems.” Yet the most remarkable thing about the Arnold Foundation’s research integrity projects is that they already appear to be paying off. For one thing, the problems plaguing scientific research are now increasingly well known. Of 1,576 researchers who responded to a recent online survey from Nature, more than half agreed there is “a significant crisis” of reproducibility. The comedian John Oliver spent 20 prime-time minutes on HBO last May mocking the reign of terrible science on TV news shows and in public debate: “After a certain point, all that ridiculous information can make you wonder: Is science bullshit? To which the answer is clearly no, but there’s a lot of bullshit masquerading as science.” (Some of the background footage in the segment came from the Arnold Foundation.)
Ioannidis, whose name is almost synonymous with scientific skepticism, says he has seen immense progress in recent years. The journals Science and Nature have started bringing in statisticians to review their papers. The National Institutes of Health is moving forward with new requirements for data sharing; starting as early as this year, all NIH-funded training programs must include plans for teaching researchers the principles of reproducibility. “Now everybody says we need replication; we need reproducibility,” Ioannidis tells me. “Otherwise our field is built on thin air.””
gwern on the Wired article:
● “I had definitely noticed all the different nutrition, psychology, and biological initiatives like OSF or the Reproducibility Project, and how expensive they all are, but I didn't realize that they all owed their funding to a single source. (I had only ever briefly heard of Arnold in the context of pension reform.) I'm very glad Arnold is doing this, but I now feel more pessimistic about academia than when I assumed that the funding for all this was coming from a broad coalition of universities and nonprofits etc....”
● It may be difficult to convince people to publish in more open but less prestigious journals, since the more prestigious journals better advance their careers. If these are mostly for-profit (I’m not sure of the breakdown on these), it seems likely they aren’t incentivized for openness in the accessibility sense, since they make money from subscriptions. It seems like prestige would need to go to the journals known for their openness, but this may be hard since it is a big industry.
● My sense is that there is a definite awareness of the issues surrounding Open Science in the scientific community (most recently from the replication crisis in psychology and the “worm wars” over deworming), and that scientists value openness in their work, but just do not have the right incentives.
● I agree with gwern that it is concerning that the small number of Open Science orgs are mostly funded by LJAF, despite the general awareness of the problems. For our purposes, this probably means that OS has little room for more funding (RFMF) right now, because the opportunities are already filled by LJAF.
The benefits of Open Science seem potentially large, and there is overwhelming evidence that the scale of the problem is large and the current system could be improved.
● Ioannidis’ famous paper, “Why Most Published Research Findings Are False”:
● “There is increasing concern that most current published research findings are false. The probability that a research claim is true may depend on study power and bias, the number of other studies on the same question, and, importantly, the ratio of true to no relationships among the relationships probed in each scientific field. In this framework, a research finding is less likely to be true when the studies conducted in a field are smaller; when effect sizes are smaller; when there is a greater number and lesser preselection of tested relationships; where there is greater flexibility in designs, definitions, outcomes, and analytical modes; when there is greater financial and other interest and prejudice; and when more teams are involved in a scientific field in chase of statistical significance. Simulations show that for most study designs and settings, it is more likely for a research claim to be false than true. Moreover, for many current scientific fields, claimed research findings may often be simply accurate measures of the prevailing bias. In this essay, I discuss the implications of these problems for the conduct and interpretation of research.”
● A paper from 1975, “Consequences of Prejudice Against the Null Hypothesis”:
● “Particularly, the model indicates that there may be relatively few publications on problems for which the null hypothesis is (at least to a reasonable approximation) true, and of these, a high proportion will erroneously reject the null hypothesis. The case studies provide additional support for this conclusion. Accordingly, it is concluded that research traditions and customs of discrimination against accepting the null hypothesis may be very detrimental to research progress.”
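The framework Ioannidis describes in the abstract above can be written down in a few lines. The sketch below uses the positive predictive value (PPV) formulas from his paper: the probability that a claimed finding is true, given the pre-study odds R of true to no relationships, the significance threshold α, the power 1 − β, and a bias term u. The function and parameter names are mine, and the numbers in the usage note are illustrative rather than taken from the paper.

```python
def ppv(prior_odds: float,
        alpha: float = 0.05,
        power: float = 0.8,
        bias: float = 0.0) -> float:
    """Post-study probability that a claimed (significant) finding is true.

    prior_odds: R, ratio of true relationships to no relationships
                among those probed in the field.
    alpha:      type I error rate (significance threshold).
    power:      1 - beta, probability of detecting a true relationship.
    bias:       u, share of would-be non-findings that end up reported
                as findings anyway (0 means no bias).
    """
    beta = 1.0 - power
    # Claimed findings that reflect true relationships.
    true_claims = (1.0 - beta) * prior_odds + bias * beta * prior_odds
    # Claimed findings where no relationship actually exists.
    false_claims = alpha + bias * (1.0 - alpha)
    return true_claims / (true_claims + false_claims)


if __name__ == "__main__":
    # Illustrative settings, not figures from the paper.
    print(f"Well-powered, even odds:       {ppv(1.0):.2f}")
    print(f"Underpowered, 1:10 prior odds: {ppv(0.1, power=0.2):.2f}")
    print(f"Even odds with bias u = 0.3:   {ppv(1.0, bias=0.3):.2f}")
```

With no bias, a finding is more likely true than false only when (1 − β)R > α, which is why small studies, long-shot hypotheses, and flexible analyses all push the PPV down in this model.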
Open Phil lists some broad interventions which orgs are currently working on:
● “Altmetrics - metrics for evaluating the use/influence/importance of research that go beyond the traditional measures of “where a paper is published and how many citations it has.”
● Post-publication peer review - tools that allow online critique and discussion of research, beyond the traditional journal-based prospective peer review process.
● Innovative open access publishing, including preprints - models that facilitate sharing research publicly rather than simply publishing it in closed journals, sometimes prior to any peer review occurring.
● Sharing data and code - projects that encourage researchers to share more information about their research, by providing tools to make sharing easier or by creating incentives to share.
● Reproducibility - projects that focus on assessing and improving the reproducibility of research, something that the traditional journal system has only very limited mechanisms to address.
● Attribution - tools allowing researchers to cite each other’s work in nontraditional ways, thus encouraging nontraditional practices (such as data-sharing).
● Advocacy - public- or government-focused campaigns aiming to encourage open access, data/code sharing, and other practices that might have social benefits but private costs for researchers or publishers.
● Alternative publication and peer review models - providing novel ways for researchers to disseminate their research processes and findings and have them reviewed (pre-publication).
● Social networks - platforms encouraging researchers to connect with each other, and in the process to share their research in nontraditional forums.”
More concretely, The Center for Open Science worked with the journal Psychological Science to introduce a badge programme to study whether it encouraged openness:
● “Beginning January 2014, Psychological Science gave authors the opportunity to signal open data and materials if they qualified for badges that accompanied published articles. Before badges, less than 3% of Psychological Science articles reported open data. After badges, 23% reported open data, with an accelerating trend; 39% reported open data in the first half of 2015, an increase of more than an order of magnitude from baseline. There was no change over time in the low rates of data sharing among comparison journals. Moreover, reporting openness does not guarantee openness. When badges were earned, reportedly available data were more likely to be actually available, correct, usable, and complete than when badges were not earned. Open materials also increased to a weaker degree, and there was more variability among comparison journals. Badges are simple, effective signals to promote open practices and improve preservation of data and materials by using independent repositories.”
That said, Nosek (COS) notes that the success might not translate to other fields:
● “Yeah, there are good reasons to think that the impact when badges are adopted across journals and disciplines won't be quite as strong. The particular reason to think that is that the concerns about reproducibility are at the forefront of researchers minds, particularly psychologists’ minds.”
COS are also running prediction markets where researchers bet on their confidence that a study would replicate, which could help prioritize different research areas:
● “What we found was that the market was quite well calibrated for anticipating the results that were observed in the replications — indicating on a substance level that researchers have some knowledge about what's likely to replicate or not. That's useful to know, that when people have priors that say, "Oh, I'm not so sure about that result," that those are worth at least taking seriously. Whether or not they end up being true or not, we don't know, but at least paying attention to that skepticism or non skepticism if people really believe it.
Then other opportunities emerge if prediction markets become quite effective at anticipating replication success. For example, prioritizing which things to replicate. We can't replicate everything. Resources are limited and the more we put resources into replication, the less we put resources into innovation. We need to be as efficient as possible between the two.
The opportunity with doing some markets is to identify those projects that are, or those findings that are very important, that the community feels very uncertain about. And prioritize funding for those where it would be devastating for a field, or actually very useful to know that this isn't actually a viable direction, so that the resources on innovation can be placed in other directions to really advance them more quickly.
That's been the real success of that. Now we have a number of subsequent prediction markets ongoing for other projects that are replication projects to see how viable this is as an approach.”
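As a rough illustration of the two ideas in Nosek's description, the sketch below scores how well market prices anticipated replication outcomes and ranks findings by "important but uncertain". This is not COS's method: the Brier score is a standard accuracy measure I have swapped in, and the `importance` rating, the priority formula, and all names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Finding:
    name: str
    market_price: float  # market's implied probability of replication, 0..1
    importance: float    # hypothetical 0..1 rating of how much a field relies on it


def brier_score(prices: Sequence[float], outcomes: Sequence[int]) -> float:
    """Mean squared error between market prices and outcomes
    (1 = replicated, 0 = did not). Lower is better; always guessing
    0.5 scores 0.25."""
    return sum((p - o) ** 2 for p, o in zip(prices, outcomes)) / len(prices)


def replication_priority(finding: Finding) -> float:
    """Importance weighted by uncertainty. The uncertainty term
    4p(1-p) is 1 when the market is at 0.5 and 0 when it is certain."""
    uncertainty = 4.0 * finding.market_price * (1.0 - finding.market_price)
    return finding.importance * uncertainty


def rank_for_replication(findings: Sequence[Finding]) -> List[Finding]:
    """Findings sorted from highest to lowest replication priority."""
    return sorted(findings, key=replication_priority, reverse=True)


if __name__ == "__main__":
    # Entirely made-up findings, for illustration only.
    candidates = [
        Finding("widely cited, market unsure", 0.50, 0.9),
        Finding("widely cited, market confident", 0.95, 0.9),
        Finding("niche, market unsure", 0.50, 0.2),
    ]
    for f in rank_for_replication(candidates):
        print(f"{replication_priority(f):.2f}  {f.name}")
```

The design choice here is that a finding the market is already confident about gets low priority even if it is important, since a replication would be unlikely to change anyone's mind.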
● I find it difficult to assess how entrenched the forces and for-profit incentives in the current system are, and so how open journals would be to these sorts of changes. That said, it seems plausible that there are many possible interventions similar to the COS badge scheme which would not greatly conflict with journals’ motives, and I could imagine other journals adopting them.
● Again, I think researchers are open to, and would value, these kinds of changes, if only the incentives were different.
Also, Holden on funding a new org focussed on researching and advocating for changing science policy and infrastructure (not necessarily Open Science specifically, but the arguments carry over to some extent):
● “In conversations about this idea so far, I’ve encountered a mix of enthusiasm and skepticism. (I’ve also generally heard from science funders that it would be outside of their model, regardless of merits, because of the focus on influencing policy rather than directly supporting research.) Most of the skepticism has been along the lines of, “The current system’s cultural norms and practices are too deeply entrenched; it’s futile to try to change them, and better to support the best research directly.”
● This may turn out to be true, but I’m not convinced:
○ There is already a great deal of private money attempting to support the best research directly (including ~$700 million per year from Howard Hughes Medical Institute). Directly supporting research is generally expensive, especially in biomedical sciences. The NIH’s research project grants cost an average of ~$500,000 per year. We have previously remarked that policy-oriented philanthropy seems to “cost less” in some broad sense than scientific research funding. As a very rough argument along these lines, if an organization the size of Center for Global Development (~$15 million per year) or Center on Budget and Policy Priorities (~$30 million per year) could make the NIH (~$30 billion per year) 1% more efficient, it seems it would be more than justifying its existence.
○ In general, I think it is best to avoid putting too much weight on arguments of the form “It’s futile to try to influence policy X.” We have argued previously against placing excessive weight on seeming political tractability. My general impression is that policy change has often quickly moved from “seemingly futile” to “inevitable”; for examples of this, see our conversation with Frank Baumgartner. I’m particularly inclined to think that change is possible when (a) there is a high degree of agreement on what needs to change; (b) the key institutions are largely technocratic, with strong scope to do what they judge best rather than what a particular constituency supports; (c) there is no existing dedicated effort to optimize and build momentum around specific proposals; (d) there is no clear opposing constituency for many of the possible changes outlined above.”
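Holden's "very rough argument" can be checked with simple arithmetic. The sketch below uses only the budget figures he quotes (approximate annual budgets of ~$15 million, ~$30 million, and ~$30 billion) and his hypothetical 1% efficiency gain; the function names are mine, and it treats "efficiency gain" as directly equivalent to dollars of NIH budget, which is a simplifying assumption.

```python
def leverage_ratio(org_budget: float,
                   target_budget: float,
                   efficiency_gain: float) -> float:
    """Dollars of value freed up at the target institution per dollar
    spent by the policy organization."""
    return target_budget * efficiency_gain / org_budget


def breakeven_gain(org_budget: float, target_budget: float) -> float:
    """Efficiency gain at which the policy organization just pays for itself."""
    return org_budget / target_budget


if __name__ == "__main__":
    NIH_BUDGET = 30e9  # ~$30 billion per year, as quoted
    for name, budget in [("CGD-sized org", 15e6), ("CBPP-sized org", 30e6)]:
        ratio = leverage_ratio(budget, NIH_BUDGET, 0.01)
        needed = breakeven_gain(budget, NIH_BUDGET)
        print(f"{name}: {ratio:.0f}x return at a 1% gain, "
              f"breaks even at a {needed:.2%} gain")
```

On these figures a 1% gain is worth about $300 million per year, or 10 to 20 times the cost of such an organization, and it would break even at a gain of roughly 0.05% to 0.1%.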
Overall, my sense is that Open Science is an important field, but with few orgs working directly within it, and a lack of room for more funding. I would guess that there are more promising giving opportunities for the Oxford Prioritisation Project to pursue elsewhere, and that the best ones in Open Science are already, and will in future be, filled by large foundations (e.g. LJAF).
What would change my mind?
● The journal system seemed more amenable to change, and the current system less entrenched (the majority of scientists seem to dislike it, but relatively little seems to be changing)
● There were more promising organizations working in the area, or promising funding gaps which I am not aware of for current orgs
● Evidence or just ideas for plausible-seeming interventions
● Less interest from large foundations
 “The hotter [e.g. more competitive] a scientific field (with more scientific teams involved), the less likely the research findings are to be true.” https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1182327/
 These people don’t even necessarily need to have harmful goals for this to go badly, just the ability to create a strong AI with goals which aren’t precisely aligned with humanity’s, so the argument goes that it would be better for the research to be in the open where others can critique it. “The AI does not hate you, nor does it love you, but you are made out of atoms which it can use for something else.” https://wiki.lesswrong.com/wiki/Paperclip_maximizer