
Note: This payout report was published on the EA Funds website on April 27, 2020, but it hadn’t been published on the Forum before now.

## Matt Wage

### 80,000 Hours ($100,000)

General support

80,000 Hours is an organization that “provides research and support to help people switch into careers that effectively tackle [the world’s most pressing problems]”. They primarily focus on longtermist cause areas. You can see their most recent self-review here.

80,000 Hours is one of the organizations in the EA community that I am most optimistic about. This is partly because they seem to be addressing an important bottleneck with a high-caliber team, and partly because of a promising track record (e.g. “From 2016, the numbers of EAs reporting first hearing about EA from 80,000 Hours explode from 9% to 13% to 25%, making them the largest single source of EA recruitment in 2018.”)

80,000 Hours had not managed to hit their fundraising target and seemed to be running out of plausible funders, so it seemed like a good time for us to make a grant and help fill their remaining gap.

### Rethink Priorities ($68,200)

Supporting research on nuclear weapons and AI arms control.

This is a restricted grant to Rethink Priorities to support their research on nuclear weapons and AI arms control.  You can see Rethink Priorities’ past work and most recent self-review here.

The main case for funding them (as I see it) is:

- I think their approach of working on tractable research questions and frequently, publicly writing up their results is a promising model which I’d like to see more of in EA.
- External references I trust have been positive on the quality of their past work.
- Due to the public nature of their output, I think it will be pretty clear whether they're doing good research. (This allows them to learn faster what research is useful and also allows the EA community to learn faster whether we should stop investing in the project.)
- Their longtermist work seems to be less well-funded than that of most other organizations in the space, and it’s unlikely they would be able to grow that work without this grant from us.

## Jonas Vollmer

Note: Jonas is an advisor to the Long-Term Future Fund. Jonas's reasoning aligned well with our own, and he seemed best-placed to provide the basic case for the below grant (a case which caused us to ultimately make the grant).

### Pablo Stafforini ($17,000)

Writing the preliminary content for an encyclopedia of effective altruism.

Pablo Stafforini applied for a grant of $17,000 over six months to write the preliminary content for an encyclopedia of effective altruism.

I advised in favor of this grant because I believe it is important for the EA community to organize its knowledge and make it easily accessible to the community. If successful, this resource would make it significantly easier for community members to:

- familiarize themselves with EA knowledge,
- introduce newcomers to EA ideas (see also some of the arguments outlined in The Value of Wikipedia Contributions in Social Sciences), and
- easily retrieve publications on a particular topic (currently spread out across many journals, blogs, research agendas, and websites).

In Stafforini’s own words: “It is widely accepted that having existing research in a particular field organized systematically and succinctly is of high value (…).”

Several previous EA encyclopedia projects have not taken off or stopped being maintained for various reasons. In the application, Stafforini did not address the implementation and long-term maintenance of the project, which led to initial skepticism among fund managers. I tried to address these potential issues by sending follow-up questions and suggestions to Stafforini. I also suggested he reach out to the Centre for Effective Altruism so his project can serve to update or replace EA Concepts. The responses reduced my skepticism substantially: I believe Pablo will deliver high-quality written content and integrate it appropriately with existing resources. I assign a credence of 35% that this project will be maintained well in the longer term, which seems high enough for this to be worth trying.

Stafforini’s track record suggests he is well-suited for this project: He was one of the main authors and the project manager of EA Concepts (in my view the most successful EA encyclopedia so far), one of the main editors of EA-related Wikipedia articles, and a research assistant for Will MacAskill’s book Doing Good Better. Based on in-person interactions, I believe Stafforini to be particularly familiar with the academic and EA-adjacent literature on many EA topics and unusually truth-seeking and unlikely to present one-sided perspectives on topics he feels strongly about. Also, this appears to be a relatively inexpensive grant.

## Oliver Habryka

[Meta]: These grant rationales are somewhat less detailed than in previous LTFF payout reports that I’ve contributed to. I was also hoping to have published a set of longer grant writeups for last round by now, but sadly a global pandemic happened and threw off a lot of my plans. I’ve also decided to reduce my time investment in the Long-Term Future Fund, since I’ve become less excited about the value that the fund can provide at the margin (for a variety of reasons, which I also hope to have time to expand on at some point). I do still hope that I can find time to produce longer grant writeups, but would now assign only around 40% credence that I create longer writeups for either this round or next round.

As a result, the following report consists of a relatively straightforward list of the grants we made, with short explanations of the reasoning behind them.

### Will Bradshaw ($25,000)

Exploring crucial considerations for decision-making around information hazards.

Will Bradshaw has been working with Anders Sandberg from the Future of Humanity Institute (FHI) on analysis of good decision-making protocols around information hazards. I trust the judgement of many of the people who have reviewed his work at FHI, and they expressed significant excitement about his work. I also personally think that work on better information hazard protocols is quite valuable, and that there has been relatively little public work on analyzing various sources of infohazards and how to navigate them, making marginal work quite valuable.

After we made this recommendation, Will reached out to us and asked whether it’s possible for him to broaden the grant to also include some immediate crisis response relating to the coronavirus pandemic, in particular trying to make sure that this crisis translates into more long-term work on biorisk. We decided that it would be fine for him to work on either this or his work on infohazards, depending on his judgement and the judgement of his collaborators.

### Sofia Jativa Vega ($7,700)

Developing a research project on how to infer humans' internal mental models from their behaviour.

Sofia Jativa Vega wants to work together with Stuart Armstrong (from the Future of Humanity Institute) on developing methods for AI agents to infer human mental models and use those to predict human preferences. I have many disagreements with Stuart’s agenda, but overall trust his methodology and judgement, and have been following the research he has been posting online for a long time. Stuart was very excited about this opportunity, and Sofia seems to have the relevant background to make progress on these problems (with a PhD in neuroscience).

### Anthony Aguirre ($65,000)

Making Metaculus useful and available to EA and other organizations

I’ve written up rationales for grants to Metaculus in the past, which you can see here. In my last writeup on Metaculus, I said the following:

> My current model is that Metaculus will struggle as a platform without a fully dedicated team or at least individual champion, though I have not done a thorough investigation of the Metaculus team and project, so I am not very confident of this.
>
> One of the major motivations for this grant is to ensure that Metaculus has enough resources to hire a potential new champion for the project (who ideally also has programming skills or UI design skills to allow them to directly work on the platform). That said, Metaculus should use the money as best they see fit.
>
> I am also concerned about the overlap of Metaculus with the Good Judgment Project, and currently have a sense that it suffers from being in competition with it, while also having access to substantially fewer resources and people.
>
> The requested grant amount was for $150k, but I am currently not confident enough in this grant to recommend filling the whole amount. If Metaculus finds an individual new champion for the project, I can imagine strongly recommending that it gets fully funded, if the new champion seems competent.

I’ve since thought a lot more about Metaculus, have used the platform more myself, and have broadly been very happy with the progress that the platform has made since I wrote the summary above. As far as I know, Metaculus now has a full-time champion for the project (Tamay Besiroglu), and has demonstrated to me significant advantages over platforms like the Good Judgment Project, in particular with its ability to enter probability distributions over events and a question curation team that is much better at producing forecasts that I care about and that seem important to me (and, I would guess, to others working on global catastrophic risk).

This overall makes me excited about Metaculus’s future and the effects of this grant.

### Tushant Jha (TJ) ($40,000)

Working on long-term macrostrategy and AI Alignment, and up-skilling and career transition towards that goal.

Tushant Jha wants to visit multiple top x-risk organizations while working on a broad range of research questions. They were also accepted to the FHI Research Scholars Program (though were unable to participate due to immigration process related delays), and have also received a large number of highly positive references for their work. They sadly haven’t produced much public work, though I expect that to change over the coming months.

I recommended this grant mostly on the basis of those strong references, and a small number of conversations I had with TJ in which they said reasonable things and generally displayed (as far as I can tell) good judgement on some open research questions.

## Helen Toner

### Shin-Shin Hua and Haydn Belfield ($32,000)

Identifying and resolving tensions between competition law and long-term AI strategy.

Shin-Shin Hua and Haydn Belfield proposed a research project on the implications of competition law for long-term AI governance. One important set of questions in AI governance revolves around the number and type of actors (e.g. companies, governments, international organizations) involved in developing advanced AI systems. Competition law likely affects the types of intervention that are possible on this axis, e.g. whether multiple research groups could combine into a consortium-like structure. This broad topic has been discussed in the long-term AI governance space for a while, but so far very little serious work has been done on it.

Shin-Shin has 7 years of experience as a competition lawyer, and also has experience in academic legal research involving AI. Haydn is an Academic Project Manager at the Centre for the Study of Existential Risk.

I think good quality work on this topic would be valuable for the field of long-term AI governance, and I believe this is a promising team to undertake such work.

### MIRI ($100,000)

General Support

The Machine Intelligence Research Institute (MIRI) does computer science and math research on AI alignment. MIRI is a challenging organization to evaluate, both because their public work is hard to evaluate (because much of it does not fit neatly within an existing discipline, and because much of the case for why the research matters is based on hypotheses about how AI research will develop in the future) and because they have decided to make much of their research non-public, for strategic reasons.

Nonetheless, I believe there is sufficient evidence that MIRI is doing good work that it makes sense for the Fund to support them. This evidence includes the track record of their team, the quality of some recent hires, and some promising signs about their ability to produce work that well-established external reviewers consider to be very high-quality, most notably the acceptance of one of their decision theory papers to a top philosophy journal, The Journal of Philosophy.

## Adam Gleave

Adam Gleave trialled as a member of the Fund management team during this grant round. Adam has subsequently been confirmed as a permanent member of the Long-Term Future Fund management team.

### Dan Hendrycks ($55,000)

A human value modeling benchmark

Dan Hendrycks is a second-year AI PhD student at UC Berkeley, advised by Dawn Song and Jacob Steinhardt. This is a restricted grant to support creation of a benchmark for NLP models' predictive power for human models. In particular, it supports paying contractors via platforms such as Mechanical Turk to generate and validate question-answer pairs.

I am generally excited about benchmarks as a route for progress on AI safety, especially for focusing the attention of the AI research community. Their impact is heavy-tailed, with many benchmarks seeing little adoption while others are extremely influential, so this is definitely a "hits-based" grant. However, Dan does have a strong track record of creating several benchmarks in the field of robust machine learning, which makes me optimistic.

The topic, testing whether NLP models implicitly capture notions of human morality, is very novel. Language is a natural channel by which to express preferences, especially over more abstract concepts, so it seems important that NLP models can represent preferences. It is common in the field to train unsupervised language models on large corpora and then fine-tune for specific applications. So testing whether preexisting models have already learned some information about preferences is a natural starting point.

A key concern regarding this project is that being able to predict human moral judgements in text does not directly translate to better aligned AI systems. We expect most of the project's impact to come from gaining a better qualitative understanding of language models' deficiencies, and from increased attention on human values in general in the NLP community. However, there is some risk that the limitations of the benchmark are not sufficiently recognized, and the NLP community wrongly believes that value learning is "solved".

### Vincent Luczkow ($10,000)

Counterfactual impact minimization

Vincent Luczkow is an MSc student at Mila, soon to start a PhD in AI, interested in conducting research on AI safety. This grant is for a research project on counterfactual impact minimization. We anticipate Vincent spending it in part on attending conferences, and in part on freeing up time for this research project (for example, by not needing to TA to supplement his income).

My experience interacting with US PhD students suggests many students are less productive due to financial constraints, especially those without savings or other income sources. Mila seems to have below-average stipends, of between 25,000 and 30,000 CAD (~18,000 to 21,500 USD). By contrast, US universities typically offer a package of at least 40,000 USD/year. While Montreal has lower costs of living, overall we find it likely that a student at Mila would benefit from supplemental funding.

Since Vincent is an early-stage researcher, he has a limited track record to evaluate, making this a slightly risky grant. However, getting into Mila (one of the leading AI labs) is a strong signal, and referees spoke positively about his motivation. Since we view there being little downside risk (beyond the opportunity cost of the donation) and a significant chance of a larger upside at little cost, we decided to make the grant.

### Michael Dickens ($33,000)

Conducting independent research on cause prioritization

This is a grant to Michael Dickens to conduct independent research on cause prioritization, focusing on investment strategies for effective altruists and long-termists. Specifically, he intends to refine his work on how altruistic investors may want to invest differently than self-interested market participants. Additionally, he intends to focus more on giving now vs later: that is, whether to donate in the short term, or save and donate later.

We were impressed by his prior essays analysing investment strategies. I previously worked as a quantitative trader, and I saw Michael’s essays as describing straightforward but useful applications of asset pricing models and providing a good literature review of investment advice. While I disagreed with some of the claims he made, he had explicitly acknowledged in the essay that those claims were controversial.

Whether to give now or later is an important question for long-termists that has received limited attention. Based on his track record, we expect Michael will both be able to make research progress and communicate these results clearly.

some promising signs about their ability to produce work that well-established external reviewers consider to be very high-quality—most notably, the acceptance of one of their decision theory papers to a top philosophy journal, The Journal of Philosophy.

I get that this is not the main case for the grant, and that MIRI generally avoids dealing with academia so this is not a great criterion to evaluate them on, but getting a paper accepted does not imply "very high-quality", and having a single paper accepted in (I assume) a couple of years is an extremely low bar (e.g. many PhD students exceed this in terms of their solo output).

I'd heard that the particular journal had quite a high quality bar. Do you have a sense of whether that's true or how hard it is to get into that journal? I guess we could just check the number of PhD students who get published in an edition of the journal to check the comparison.

I'd say most PhD students don't publish in the Journal of Philosophy or other journals of a similar or better quality (it's the fourth best general philosophy journal according to a poll by Brian Leiter).

This blog post seems to suggest it has an acceptance rate of about 5%.

I don't know for sure, but at least in most areas of Computer Science it is pretty typical for at least Berkeley PhD students to publish in the top conferences in their area. (And they could publish in top journals; that just happens not to be as incentivized in CS.)

I generally dislike using acceptance rates -- I don't see strong reasons that they should correlate strongly with quality or difficulty -- but top CS conferences have maybe ~25% acceptance rates, suggesting this journal would be 5x "harder". This is more than I thought, but I don't think it brings me to the point of thinking it should be a significant point in favor in an outside evaluation, given the size of the organization and the time period over which we're talking.

Not an expert but, fwiw, my impression is that this is more common in CS than philosophy and the social science areas I know best.

Do you mean CS or ML? Because (I believe) ML is an especially new and 'flat' field where it doesn't take as long to get to the cutting edge, so it probably isn't representative.

I do mean CS and not just ML. (E.g. PLDI and OSDI are top conferences with acceptance rates of 27% and 18% respectively according to the first Google result, and Berkeley students do publish there.)