Please note that the following grants are only recommendations, as all grants are still pending an internal due diligence process by CEA.
This post contains our allocation and some explanatory reasoning for our Q1 2019 grant round. We opened an application for grant requests earlier this year and kept it open for about one month, after which we received a large, unanticipated donation of about $715k. This caused us to reopen the application for another two weeks. We then used a mixture of independent voting and consensus discussion to arrive at our current grant allocation.
What is listed below is only a set of grant recommendations to CEA, which will run them through a set of due-diligence tests to ensure that they are compatible with CEA’s charitable objectives and that making these grants will be logistically feasible.
Each grant recipient is followed by the size of the grant and their one-sentence description of their project.
- Anthony Aguirre ($70,000): A major expansion of the Metaculus prediction platform and its community
- Tessa Alexanian ($26,250): A biorisk summit for the Bay Area biotech industry, DIY biologists, and biosecurity researchers
- Shahar Avin ($40,000): Scaling up scenario role-play for AI strategy research and training; improving the pipeline for new researchers
- Lucius Caviola ($50,000): Conducting postdoctoral research at Harvard on the psychology of EA/long-termism
- Connor Flexman ($20,000): Performing independent research in collaboration with John Salvatier
- Ozzie Gooen ($70,000): Building infrastructure for the future of effective forecasting efforts
- Johannes Heidecke ($25,000): Supporting aspiring researchers of AI alignment to boost themselves into productivity
- David Girardo ($30,000): A research agenda rigorously connecting the internal and external views of value synthesis
- Nikhil Kunapuli ($30,000): A study of safe exploration and robustness to distributional shift in biological complex systems
- Jacob Lagerros ($27,000): Building infrastructure to give X-risk researchers superforecasting ability with minimal overhead
- Lauren Lee ($20,000): Working to prevent burnout and boost productivity within the EA and X-risk communities
- Alex Lintz ($17,900): A two-day, career-focused workshop to inform and connect European EAs interested in AI governance
- Orpheus Lummis ($10,000): Upskilling in contemporary AI techniques, deep RL, and AI safety, before pursuing a ML PhD
- Vyacheslav Matyuhin ($50,000): An offline community hub for rationalists and EAs
- Tegan McCaslin ($30,000): Conducting independent research into AI forecasting and strategy questions
- Robert Miles ($39,000): Producing video content on AI alignment
- Anand Srinivasan ($30,000): Formalizing perceptual complexity with application to safe intelligence amplification
- Alex Turner ($30,000): Building towards a “Limited Agent Foundations” thesis on mild optimization and corrigibility
- Eli Tyre ($30,000): Broad project support for rationality and community building interventions
- Mikhail Yagudin ($28,000): Giving copies of Harry Potter and the Methods of Rationality to the winners of EGMO 2019 and IMO 2020
- CFAR ($150,000): Unrestricted donation
- MIRI ($50,000): Unrestricted donation
- Ought ($50,000): Unrestricted donation
Total distributed: $923,150
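As a quick arithmetic check, the listed amounts can be summed directly. The sketch below only copies the figures from the list above; the dictionary and its abbreviated keys are illustrative and not part of the fund's own process.

```python
# Grant amounts in USD, copied from the list above.
# Keys are abbreviated to surnames or organization names for illustration.
grants = {
    "Aguirre": 70_000,
    "Alexanian": 26_250,
    "Avin": 40_000,
    "Caviola": 50_000,
    "Flexman": 20_000,
    "Gooen": 70_000,
    "Heidecke": 25_000,
    "Girardo": 30_000,
    "Kunapuli": 30_000,
    "Lagerros": 27_000,
    "Lee": 20_000,
    "Lintz": 17_900,
    "Lummis": 10_000,
    "Matyuhin": 50_000,
    "McCaslin": 30_000,
    "Miles": 39_000,
    "Srinivasan": 30_000,
    "Turner": 30_000,
    "Tyre": 30_000,
    "Yagudin": 28_000,
    "CFAR": 150_000,
    "MIRI": 50_000,
    "Ought": 50_000,
}

total = sum(grants.values())
print(f"{len(grants)} grants, total ${total:,}")  # 23 grants, total $923,150
```

The sum of the 23 listed amounts agrees with the stated total.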
Here we explain the purpose of each grant and summarize the reasoning behind our recommendation. Each summary is written by the fund member who was most excited about recommending the relevant grant (subject to some constraints on who had time available to write up their reasoning). The summaries differ a lot in length, depending on how much time each fund member had to explain their reasoning.
Writeups by Helen Toner
Alex Lintz ($17,900)
A two-day, career-focused workshop to inform and connect European EAs interested in AI governance
Alex Lintz and some collaborators from EA Zürich proposed organizing a two-day workshop for EAs interested in AI governance careers, with the goals of giving participants background on the space, offering career advice, and building community. We agree with their assessment that this space is immature and hard to enter, and believe their suggested plan for the workshop looks like a promising way to help participants orient to careers in AI governance.
Writeups by Matt Wage
Tessa Alexanian ($26,250)
A biorisk summit for the Bay Area biotech industry, DIY biologists, and biosecurity researchers
We are funding Tessa Alexanian to run a one-day biosecurity summit, immediately following the SynBioBeta industry conference. We have also put Tessa in touch with some experienced people in the biosecurity space who we think can help make sure the event goes well.
Shahar Avin ($40,000)
Scaling up scenario role-play for AI strategy research and training; improving the pipeline for new researchers
We are funding Shahar Avin to help him hire an academic research assistant and for other miscellaneous research expenses. We think positively of Shahar’s past work (for example this report), and multiple people we trust recommended that we fund him.
Lucius Caviola ($50,000)
Conducting postdoctoral research at Harvard on the psychology of EA/long-termism
We are funding Lucius Caviola for a 2-year postdoc at Harvard working with Professor Joshua Greene. Lucius plans to study the psychology of effective altruism and long-termism, and an EA academic we trust had a positive impression of him. We are splitting the cost of this project with the EA Meta Fund because some of Caviola’s research (on effective altruism) is a better fit for the Meta Fund while some of his research (on long-termism) is a better fit for our fund.
Ought ($50,000)
Unrestricted donation
We funded Ought in our last round of grants, and our reasoning for funding them in this round is largely the same. Additionally, we wanted to help Ought diversify its funding base because it currently receives almost all its funding from only two sources and is trying to change that.
Our comments from last round:
Ought is a nonprofit aiming to implement AI alignment concepts in real-world applications. We believe that Ought’s approach is interesting and worth trying, and that they have a strong team. Our understanding is that hiring is currently more of a bottleneck for them than funding, so we are only making a small grant. Part of the aim of the grant is to show Ought as an example of the type of organization we are likely to fund in the future.
Writeups by Alex Zhu
Nikhil Kunapuli ($30,000)
A study of safe exploration and robustness to distributional shift in biological complex systems
Nikhil Kunapuli is doing independent deconfusion research for AI safety. His approach is to develop better foundational understandings of various concepts in AI safety, like safe exploration and robustness to distributional shift, by exploring these concepts in complex systems science and theoretical biology, domains outside of machine learning to which these concepts also apply. To quote an illustrative passage from his application:
When an organism within an ecosystem develops a unique mutation, one of several things can happen. At the level of the organism, the mutation can either be neutral in terms of fitness, maladaptive and leading to reduced reproductive success and/or death, or adaptive. For an adaptive mutation, the upgraded fitness of the organism will change the fitness landscape for all other organisms within the ecosystem, and in response, the structure of the ecosystem will either be perturbed into a new attractor state or destabilized entirely, leading to ecosystem collapse. Remarkably, most mutations do not kill their hosts, and most mutations also do not lead to ecosystem collapse. This is actually surprising when one considers the staggering complexity present within a single genome (tens of thousands of genes deeply intertwined through genomic regulatory networks) as well as an ecosystem (billions of organisms occupying unique niches and constantly co-evolving). One would naïvely think that a system so complex must be highly sensitive to change, and yet these systems are actually surprisingly robust. Nature somehow figured out a way to create robust organisms that could respond to and function in a shifting environment, as well as how to build ecosystems in which organisms could be free to safely explore their adjacent possible new forms without killing all other species.
Nikhil spent a summer doing research for the New England Complex Systems Institute. He also spent 6 months as the cofounder and COO of an AI hardware startup, which he left because he decided that direct work on AI safety is more urgent and important.
I recommended that we fund Nikhil because I think Nikhil’s research directions are promising, and because I personally learn a lot about AI safety every time I talk with him. The quality of his work will be assessed by researchers at MIRI.
Anand Srinivasan ($30,000)
Formalizing perceptual complexity with application to safe intelligence amplification
Anand Srinivasan is doing independent deconfusion research for AI safety. His angle of attack is to develop a framework that will allow researchers to make provable claims about what specific AI systems can and cannot do, based on factors like their architectures and their training processes. For example, AlphaGo can “only have thoughts” about patterns on Go boards and lookaheads, which aren’t expressive enough to encode thoughts about malicious takeover.
AI researchers can build safe and extremely powerful AI systems by relying on intuitive judgments of their capabilities. However, these intuitions are non-rigorous and prone to error, especially since powerful optimization processes can generate solutions that are totally novel and unexpected to humans. Furthermore, competitive dynamics will incentivize rationalization about which AI systems are safe to deploy. Under fast takeoff assumptions, a single rogue AI system could lead to human extinction, making it particularly risky for us to rely exclusively on intuitive judgments about which AI systems are safe. Anand’s goal is to develop a framework that formalizes these intuitions well enough to permit future AI researchers to make provable claims about what future AI systems can and can’t internally represent.
Anand was the CTO of an enterprise software company that he cofounded with me, where he managed a six-person engineering team for two years. Upon leaving the company, he decided to refocus his efforts toward building safe AGI. Before dropping out of MIT, Anand worked on Ising models for fast image classification and fuzzy manifold learning (which was later independently published as a top paper at NIPS).
I recommended that we fund Anand because I think Anand’s research directions are promising, and I personally learn a lot about AI safety every time I talk with him. The quality of Anand’s work will be assessed by researchers at MIRI.
David Girardo ($30,000)
A research agenda rigorously connecting the internal and external views of value synthesis
David Girardo is doing independent deconfusion research for AI safety. His angle of attack is to elucidate the ontological primitives for representing hierarchical abstractions, drawing from his experience with type theory, category theory, differential geometry, and theoretical neuroscience.
I recommended that we fund David because I think David’s research directions are very promising, and because I personally learn a lot about AI safety every time I talk with him. Tsvi Benson-Tilsen, a MIRI researcher, has also recommended that David get funding. The quality of David’s work will be assessed by researchers at MIRI.
Writeups by Oliver Habryka
I have a broad sense that funders in EA tend to give little feedback to organizations they are funding, as well as to organizations that they explicitly decided not to fund (usually due to time constraints). So in my writeups below I tried to be as transparent as possible in explaining the real reasons that caused me to believe a grant was a good idea and what my biggest hesitations are. I also took a lot of opportunities to explain background models of mine that might help others get better at understanding my future decisions in this space.
For some of the grants below, I think there exist more publicly defensible (or easier to understand) arguments for the grants that I recommended. However, I tried to explain the actual models that drove my decisions for these grants, which are often hard to put into a few paragraphs of text, so I apologize in advance for some of the explanations below almost certainly being a bit hard to understand.
Note that when I’ve written about how I hope a grant will be spent, this was in aid of clarifying my reasoning and is in no way meant as a restriction on what the grant should be spent on. The only restriction is that it should be spent on the project they applied for in some fashion, plus any further legal restrictions that CEA requires.
Mikhail Yagudin ($28,000)
Giving copies of Harry Potter and the Methods of Rationality to the winners of EGMO 2019 and IMO 2020
From the application:
EA Russia has the oral agreements with IMO [International Math Olympiad] 2020 (Saint Petersburg, Russia) & EGMO [European Girls’ Mathematical Olympiad] 2019 (Kyiv, Ukraine) organizers to give HPMORs [copies of Harry Potter and the Methods of Rationality] to the medalists of the competitions. We would also be able to add an EA / rationality leaflet made by CFAR (I contacted Timothy Telleen-Lawton on that matter).
My thoughts and reasoning
[Edit & clarification: The books will be given out by the organisers of the IMO and EGMO as prizes for the 650 people who got far enough to participate, all of whom are "medalists".]
My model for the impact of this grant roughly breaks down into three questions:
- What effects does reading HPMOR have on people?
- How good of a target group are Math Olympiad winners for these effects?
- Is the team competent enough to execute on their plan?
What effects does reading HPMOR have on people?
My models of the effects of HPMOR stem from my empirical observations and my inside view on rationality training.
- Empirically, a substantial number of top people in our community have (a) entered due to reading and feeling a deep connection to HPMOR and (b) attributed their approach to working on the long term future in substantial part to the insights they learned from reading HPMOR. This includes some individuals receiving grants on this list, and some individuals on the grant-making team.
- I also weigh here my inside view of the skills that HPMOR helps to teach. I’ll try to point at the things I think HPMOR does exceptionally and uniquely well, though I find it a bit hard to make my models fully explicit here in an appropriate amount of space.
- The most powerful tools that humanity has discovered so far are methods for thinking quantitatively and scientifically about how our universe works, and using this understanding to manipulate the universe. HPMOR attempts to teach the fundamental skills behind this thinking in three main ways:
  - The first way HPMOR teaches science is that the reader is given many examples of the inside of someone’s mind when they are thinking with the goal of actually understanding the world and are reasoning with the scientific and quantitative understanding humanity has developed. HPMOR is a fictional work, containing a highly detailed world with characters whose experience a reader empathises with and storylines that evoke responses from a reader. The characters in HPMOR demonstrate the core skills of quantitative, scientific reasoning: forming a hypothesis, making a prediction, throwing out the hypothesis when the prediction does not match reality, and otherwise updating probabilistically when they don’t yet have decisive evidence.
  - The second way HPMOR teaches science is that key scientific results and mechanisms are woven into the narrative of the book. Studies in the heuristics and biases literature, genetic selection, programming loops, Bayesian reasoning, and more are all explained in an unusually natural manner. They aren’t just added on top of the narrative in order for there to be science in the book; instead, the story’s universe is in fact constrained by these theories in such a way that they are naturally brought up by characters attempting to figure out what they should do.
  - This contributes to the third way HPMOR helps teach scientific thinking: HPMOR is specifically designed to be understandable in advance of the end of the book, and many readers have used the thinking tools taught in the book to do just that. One of the key bottlenecks in individuals’ ability to affect the long-term future is the ability to deal with the universe as though it is understandable in principle, and HPMOR creates a universe where this is so and includes characters doing their best to understand it. This sort of understanding is necessary for being able to take actions that will have large, intended effects on important and difficult problems 10^n years down the line.
- The book also contains characters who viscerally care about humanity, other conscious beings, and our collective long-term future, and take significant actions in their own lives to ensure that this future goes well.
- It is finally worth noting that HPMOR does all of the above things while also being a highly engaging book that has been read by hundreds of thousands of readers (if not more) primarily for pleasure. It is the most reviewed Harry Potter fan fiction on fanfiction.net, which is a remarkable state of affairs.
How good of a target group are Math Olympiad winners for these effects?
I think that Math Olympiad winners are a very promising demographic within which to find individuals who can contribute to improving the long-term future. I believe Math Olympiads select strongly on IQ as well as (weakly) on conscientiousness and creativity, which are all strong positives. Participants are young and highly flexible; they have not yet made too many major life commitments (such as which university they will attend), and are in a position to use new information to systematically change their lives’ trajectories. I view handing them copies of an engaging book that helps teach scientific, practical and quantitative thinking as a highly asymmetric tool for helping them make good decisions about their lives and the long-term future of humanity.
I’ve also visited and participated in a variety of SPARC events, and found the culture there (which is likely to be at least somewhat representative of Math Olympiad culture) very healthy in a broad sense. Participants displayed high levels of altruism, a lot of willingness to help one another, and an impressive amount of ambition to improve their own thinking and affect the world in a positive way. These observations make me optimistic about efforts that build on that culture.
I think it’s important when interacting with minors, and attempting to improve (and thus change) their life trajectories, to make sure to engage with them in a safe way that is respectful of their autonomy and does not put social pressures on them in ways they may not yet have learned to cope with. In this situation, Mikhail is working with/through the institutions that run the IMO and EGMO, and I expect those institutions to (a) have lots of experience with safeguarding minors and (b) have norms in place to make sure that interactions with the students are positive.
Is the team competent enough to execute on their plan?
I don’t have a lot of information on the team, don’t know Mikhail, and have not received any strong endorsement for him and his team, which makes this the weakest link in the argument. However, I know that they are coordinating both with SPARC (which also works to give books like HPMOR to similar populations) and the team behind the highly successful Russian printing of HPMOR, two teams who have executed this kind of project successfully in the past. So I felt comfortable recommending this grant, especially given its relatively limited downside.
Alex Turner ($30,000)
Building towards a “Limited Agent Foundations” thesis on mild optimization and corrigibility
From the application:
I am a third-year computer science PhD student funded by a graduate teaching assistantship; to dedicate more attention to alignment research, I am applying for one or more trimesters of funding (spring term starts April 1).
Last summer, I designed an approach to the “impact measurement” subproblem of AI safety: “what equation cleanly captures what it means for an agent to change its environment, and how do we implement it so that an impact-limited paperclip maximizer would only make a few thousand paperclips?”. I believe that my approach, Attainable Utility Preservation (AUP), goes a long way towards answering both questions robustly, concluding:
> By changing our perspective from “what effects on the world are ‘impactful’?” to “how can we stop agents from overfitting their environments?”, a natural, satisfying definition of impact falls out. From this, we construct an impact measure with a host of desirable properties […] AUP agents seem to exhibit qualitatively different behavior […]
Primarily, I aim both to output publishable material for my thesis and to think deeply about the corrigibility and mild optimization portions of MIRI’s machine learning research agenda. Although I’m excited by what AUP makes possible, I want to lay the groundwork of deep understanding for multiple alignment subproblems. I believe that this kind of clear understanding will make positive AI outcomes more likely.
My thoughts and reasoning
I’m excited about this because:
- Alex’s approach to finding personal traction in the domain of AI Alignment is one that I would want many other people to follow. On LessWrong, he read and reviewed a large number of math textbooks that are useful for thinking about the alignment problem, and sought public input and feedback on what things to study and read early on in the process.
- He wasn’t intimidated by the complexity of the problem, but started thinking independently about potential solutions to important sub-problems long before he had “comprehensively” studied the mathematical background that is commonly cited as being the foundation of AI Alignment.
- He wrote up his thoughts and hypotheses in a clear way, sought feedback on them early, and ended up making a set of novel contributions to an interesting sub-field of AI Alignment quite quickly (in the form of his work on impact measures, on which he recently collaborated with the DeepMind AI Safety team)
These intuitions, however, are a bit in conflict with some of the concrete research that Alex has actually produced. My inside views on AI Alignment make me think that work on impact measures is very unlikely to result in much concrete progress on what I perceive to be core AI Alignment problems, and I have talked to a variety of other researchers in the field who share that assessment. I think it’s important that this grant not be viewed as an endorsement of the concrete research direction that Alex is pursuing, but only as an endorsement of the higher-level process that he has been using while doing that research.
As such, I think it was a necessary component of this grant that I have talked to other people in AI Alignment whose judgment I trust, who do seem excited about Alex’s work on impact measures. I think I would not have recommended this grant, or at least this large of a grant amount, without their endorsement. I think in that case I would have been worried about a risk of diverting attention from what I think are more promising approaches to AI Alignment, and a potential dilution of the field by introducing a set of (to me) somewhat dubious philosophical assumptions.
Overall, while I try my best to form concrete and detailed models of the AI Alignment research space, I don’t currently devote enough time to it to build detailed models that I trust enough to put very large weight on my own perspective in this particular case. Instead, I am mostly deferring to other researchers in this space that I do trust, a number of whom have given positive reviews of Alex’s work.
In aggregate, I have a sense that the way Alex went about working on AI Alignment is a great example for others to follow. I’d like to see him continue, and I am excited about the LTF Fund giving out more grants to others who try to follow a similar path.
Orpheus Lummis ($10,000)
Upskilling in contemporary AI techniques, deep RL and AI safety, before pursuing a ML PhD
From the application:
Notable planned subprojects:
- Engaging with David Krueger’s AI safety reading group at Mila
- Starting & maintaining a public index of AI safety papers, to help future literature reviews and to complement https://vkrakovna.wordpress.com/ai-safety-resources/, as a standalone wiki-page (eg at http://aisafetyindex.net)
- From-scratch implementation of seminal deep RL algorithms
- Going through textbooks: Goodfellow Bengio Courville 2016, Sutton Barto 2018, …
- Possibly doing the next AI Safety camp
- Building a prioritization tool for English Wikipedia using NLP, building on the literature of quality assessment (https://paperpile.com/shared/BZ2jzQ)
- Studying the AI Alignment literature
My thoughts and reasoning
We funded Orpheus in our last grant round to run an AI Safety Unconference just after NeurIPS. We’ve gotten positive testimonials from the event, and I am overall happy about that grant.
I do think that of the grants I recommended this round, this is probably the one I feel least confident about. I don’t know Orpheus very well, and while I have received generally positive reviews of their work, I haven’t yet had the time to look into any of those reviews in detail, and haven’t seen clear evidence about the quality of their judgment. However, what I have seen seems pretty good, and if I had even a tiny bit more time to spend on evaluating this round’s grants, I would probably have spent it reaching out to Orpheus and talking with them more in person.
In general, I think time for self-study and reflection can be exceptionally important for people starting to work in AI Alignment. This is particularly true if they are following a more conventional academic path which could easily cause them to try to immediately work on contemporary AI capabilities research, because I generally think this has negative value even for people concerned about safety (though I do have some uncertainty here). I think giving people working on more classical ML research the time and resources to explore the broader implications of their work on safety, if they are already interested in that, is a good use of resources.
I am also excited about building out the Montreal AI Alignment community, and having someone who both has the time and skills to organize events and can understand the technical safety work seems likely to have good effects.
This grant is also the smallest grant we are funding this round, making me more comfortable with a bit less due diligence than the other grants, especially since this grant seems unlikely to have any large negative consequences.
Tegan McCaslin ($30,000)
Conducting independent research into AI forecasting and strategy questions
From the application:
1) I’d like to independently pursue research projects relevant to AI forecasting and strategy, including (but not necessarily limited to) some of the following:
- Does the trajectory of AI capability development match that of biological evolution?
- How tractable is long-term forecasting?
- How much compute did evolution use to produce intelligence?
- Benchmarking AI capabilities against insects
- Collect and summarize the views of people other than Paul, Dario and Jacob on timelines
- Short doc with more detail on the first two projects here: https://docs.google.com/document/d/1hTLrLXewF-_iJiefyZPF6L677bLrUTo2ziy6BQbxqjs/edit?usp=sharing
I am actively pursuing opportunities to work with or under more senior AI strategy researchers [..], so my research focus within AI strategy is likely to be influenced by who exactly I end up working with. Otherwise I expect to spend some short period of time at the start generating more research ideas and conducting pilot tests on the order of several hours into their tractability, then choosing which to pursue based on an importance/tractability/neglectedness framework.
2) There are relatively few researchers dedicated full-time to investigating AI strategy questions that are not immediately policy-relevant. However, there nonetheless exists room to contribute to the research on existential risks from AI with approaches that fit into neither technical AI safety nor AI policy/governance buckets.
My thoughts and reasoning
Tegan has been a member of the X-risk network for several years now, and recently left AI Impacts. She is now looking for work as a researcher. Two considerations made me want to recommend that the LTF Fund make a grant to her.
- It’s easier to relocate someone who has already demonstrated trust and skills than to find someone completely new.
  - This is (roughly) advice given by YCombinator to startups, and I think it’s relevant to the X-risk community. It’s cheaper for Tegan to move around and find the place for her to do her best work relative to an outsider who has not already worked within the X-risk network. A similarly skilled individual who is not already part of the network will need to spend a few years understanding the community and demonstrating that they can be trusted. So I think it is a good idea to help Tegan explore other parts of the community to work in.
- It’s important to give good researchers runway while they find the right place.
  - For many years, the X-risk community has been funding-bottlenecked, keeping salaries low. A lot of progress has been made on this front and I hope that we’re able to fix this. Unfortunately, the current situation means that when a hire does not work out, the individual often doesn’t have much runway while reorienting, updating on what didn’t work out, and subsequently trialing at other organizations.
  - This moves them much more quickly into an emergency mode, where everything must be optimized for short-term income, rather than long-term updating, skill building, and research. As such, I think it is important for Tegan to have a comfortable amount of runway while doing solo research and trialling at various organizations in the community.
While I haven’t spent the time to look into Tegan’s research in any depth, the small amount I did read looked promising. The methodology of this post is quite exciting, and her work there and on other pieces seems very thorough and detailed.
That said, my brief assessment of Tegan’s work was not the reason why I recommended this grant, and if Tegan asks for a new grant in 6 months to focus on solo research, I will want to spend significantly more time reading her output and talking with her, to understand how these questions were chosen and what precise relation they have to forecasting technological progress in AI.
Overall, I think Tegan is in a good place to find a valuable role in our collective X-risk reduction project, and I’d like her to have the runway to find that role.
Anthony Aguirre ($70,000)
A major expansion of the Metaculus prediction platform and its community
From the application:
The funds would be used to expand the Metaculus prediction platform along with its community. Metaculus.com is a fully-functional prediction platform with ~10,000 registered users and >120,000 predictions made to date on more than >1000 questions. The goals of Metaculus are:
- Short-term: Provide a resource to science, tech, and (especially) EA-related communities already interested in generating, aggregating, and employing accurate predictions, and training to be better predictors.
- Medium-term: Improve decision-making by individuals and groups by providing well-calibrated numerical predictions.
- Long-term: encourage and backstop a widespread culture of accountable and accurate predictions and scenario planning.
There are two major high-priority expansions possible with funding in place. The first would be an integrated set of extensions to improve user interaction and information-sharing. This would include private messaging and notifications, private groups, a prediction “following” system to create micro-teams within individual questions, and various incentives and systems for information-sharing.
The second expansion would link questions into a network. Users would express links between questions, from very simple (“notify me regarding question Y when P(X) changes substantially”) to more complex (“Y happens only if X happens, but not conversely”, etc.). Information can also be gleaned from what users actually do. The strength and character of these relations can then generate different graphical models that can be explored interactively, with the ultimate goal of a crowd-sourced quantitative graphical model that could structure event relations and propagate new information through the network.
My thoughts and reasoning
For this grant, and also the grants to Ozzie Gooen and Jacob Lagerros, I did not have enough time to write up my general thoughts on forecasting platforms and communities. I hope to later write a post with my thoughts here. But for a short summary, see my thoughts on Ozzie Gooen’s grant.
I am generally excited about people building platforms for coordinating intellectual labor, particularly on topics that are highly relevant to the long-term future. I think Metaculus has been providing a valuable service for the past few years, both in improving our collective ability to forecast a large variety of important world events and in allowing people to train and demonstrate their forecasting skills, which I expect to become more relevant in the future.
I am broadly impressed with how cooperative and responsive the Metaculus team has been in helping organizations in the X-risk space get answers to important questions and in providing software services to them (e.g. I know that they are helping Jacob Lagerros and Ben Goldhaber set up a private Metaculus instance focused on AI).
I don’t know Anthony well, and overall I am quite concerned that there is no full-time person on this project. My model is that projects like this tend to go a lot better if they have one core champion who has the resources to fully dedicate themselves to the project, and it currently doesn’t seem that Anthony is able to do that.
My current model is that Metaculus will struggle as a platform without a fully dedicated team or at least an individual champion, though I have not done a thorough investigation of the Metaculus team and project, so I am not very confident of this. One of the major motivations for this grant is to ensure that Metaculus has enough resources to hire a potential new champion for the project (who ideally also has programming skills or UI design skills to allow them to directly work on the platform). That said, Metaculus should use the money as best they see fit.
I am also concerned about the overlap of Metaculus with the Good Judgment Project, and currently have a sense that it suffers from being in competition with it, while also having access to substantially fewer resources and people.
The requested grant amount was $150k, but I am currently not confident enough in this grant to recommend filling the whole amount. If Metaculus finds a new individual champion for the project, and that champion seems competent, I can imagine strongly recommending that it gets fully funded.
Lauren Lee ($20,000)
Working to prevent burnout and boost productivity within the EA and X-risk communities
From the application:
(1) After 2 years as a CFAR instructor/researcher, I’m currently in a 6-12 month phase of reorienting around my goals and plans. I’m requesting a grant to spend the coming year thinking about rationality and testing new projects.
(2) I want to help individuals and orgs in the x-risk community orient towards and achieve their goals.
(A) I want to train the skill of dependability, in myself and others.
This is the skill of a) following through on commitments and b) making prosocial / difficult choices in the face of fear and aversion. The skill of doing the correct thing, despite going against incentive gradients, seems to be the key to virtue.
One strategy I’ve used is to surround myself with people with shared values (CFAR, Bay Area) and trust the resulting incentive gradients. I now believe it is also critical to be the kind of person who can take correct action despite prevailing incentive structures.
Dependability is also related to thinking clearly. Your ability to make the right decision depends on your ability to hold and be with all possible realities, especially painful and aversive ones. Most people have blindspots that actively prevent this.
I have some leads on how to train this skill, and I’d like both time and money to test them.
(B) Thinking clearly about AI risk
Most people’s decisions in the Bay Area AI risk community seem model-free. They themselves don’t have models of why they’re doing what they’re doing; they’re relying on other people “with models” to tell them what to do and why. I’ve personally carried around such premises. I want to help people explore where their ‘placeholder premises’ are and create safety for looking at their true motivations, and then help them become more internally and externally aligned.
Speaking of “not getting very far.” My personal opinion is that most ex-CFAR employees left because of burnout; I’ve written what I’ve learned here, see top 2 comments: [https://forum.effectivealtruism.org/posts/NDszJWMsdLCB4MNoy/burnout-what-is-it-and-how-to-treat-it#87ue5WzwaFDbGpcA7]. I’m interested in working with orgs and individuals to prevent burnout proactively.
(3) Some possible measurable outputs / artifacts:
- A program where I do 1-on-1 sessions with individuals or orgs; I’d create reports based on whether they self-report improvements
- X-risk orgs (e.g. FHI, MIRI, OpenPhil, BERI, etc.) deciding to spend time/money on my services may be a positive indicator, as they tend to be thoughtful with how they spend their resources
- Writings or talks
- Workshops with feedback forms
- A more effective version of myself (notable changes = gaining the ability to ride a bike / drive a car / exercise—a PTSD-related disability, ability to finish projects to completion, others noticing stark changes in me)
My thoughts and reasoning
Lauren worked as an instructor at CFAR for about 2 years, until Fall 2018. I review CFAR’s impact as an institution below; in general, I believe it has helped set a strong epistemic foundation for the community and been successful in recruitment and training. I have a great appreciation for everyone who helps them with their work.
Lauren is currently in a period of reflection and reorientation around her life and the problem of AGI, in part due to experiencing burnout in the months before she left CFAR. To my knowledge, CFAR has never been well-funded enough to offer high salaries to its employees, and I think it is valuable to ensure that people who work at EA orgs and burn out have the support to take the time for self-care after quitting due to long-term stress. Ideally, I think this should be improved by higher salaries that allow employees to build significant runway to deal with shocks like this, but I think that the current equilibrium of salary levels in EA does not make that easy. Overall, I think it’s likely that staff at highly valuable EA orgs will continue burning out, and I don’t currently see it as an achievable target to not have this happen (though I am in favor of people working on solving the problem).
I do not know Lauren well enough to evaluate the quality of her work on the art of human rationality, but multiple people I trust have given positive reviews (e.g. see Alex Zhu above), so I am also interested to read her output on the subjects she is thinking about.
I think it’s very important that people who work on developing an understanding of human rationality take the time to add their knowledge into our collective understanding, so that others can benefit from and build on top of it. Lauren has begun to write up her thoughts on topics like burnout, intentions, dependability, circling, and curiosity, and her having the space to continue to write up her ideas seemed like a significant additional positive outcome of this grant.
I think that she should probably aim to make whatever she does valuable enough that individuals and organizations in the community wish to pay her directly for her work. It’s unlikely that I would recommend renewing this grant for another 6-month period in the absence of a relatively exciting new research project/direction, and if Lauren were to reapply, I would want to have a much stronger sense that the projects she was working on were producing lots of value before I decided to recommend funding her again.
In sum, this grant hopefully helps Lauren to recover from burning out, get the new rationality projects she is working on off the ground, potentially identify a good new niche for her to work in (alone or at an existing organization), and write up her ideas for the community.
Ozzie Gooen ($70,000)
Build infrastructure for the future of effective forecasting efforts
From the application:
What I will do
I applied a few months ago and was granted $20,000 (thanks!). My purpose for this money is similar but greater in scope to the previous round. The previous funding has given me the security to be more ambitious, but I’ve realized that additional guarantees of funding should help significantly more. In particular, engineers can be costly and it would be useful to secure additional funding in order to give possible hires security.
My main overall goal is to advance the use of predictive reasoning systems for purposes most useful for Effective Altruism. I think this is an area that could eventually make use of a good deal of talent, so I have come to see my work at this point as foundational.
This work is in a few different areas that I think could be valuable. I expect that after a while a few parts will emerge as the most important, but think it is good to experiment early when the most effective route is not yet clear.
I plan to use additional funds to scale my general research and development efforts. I expect that most of the money will be used on programming efforts.
Foretold is a forecasting application that handles full probability distributions. I have begun testing it with users and have been asked for quite a bit more functionality. I’ve also mapped out the features that I expect people will eventually desire, and think there is a significant amount of work that would be significantly useful.
One particular challenge is figuring out the best way to handle large numbers of questions (1000 active questions plus, at a time.) I believe this requires significant innovations in the user interface and backend architecture. I’ve made some wireframes and have experimented with different methods, and believe I have a pragmatic path forward, but will need to continue to iterate.
I’ve talked with members of multiple organizations at this point who would like to use Foretold once it has a specific set of features, and cannot currently use any existing system for their purposes. […]
Ken is a project to help organizations set up and work with structured data, in essence allowing them to have private versions of Wikidata. Part of the project is Ken.js, a library which I’m beginning to integrate with Foretold.
The main aim of EA forecasting would be to better prioritize EA actions. I think that if we could have a powerful system set up, it could make us better at predicting the future, better at understanding what things are important and better at coming to a consensus on challenging topics.
In the short term, I’m using heuristics like metrics regarding user activity and upvotes on LessWrong. I’m also getting feedback by many people in the EA research community. In the medium to long term, I hope to set up evaluation/estimation procedures for many projects and would include this one in that process.
My thoughts and reasoning
This grant is to support Ozzie Gooen in his efforts to build infrastructure for effective forecasting. Ozzie requested $70,000 to hire a software engineer who would support his work on the prediction platform www.foretold.io.
- When thinking about how to improve the long-term future, I think we are confused about what counts as progress and what specific problems need solving. We can already see that there are a lot of technical and conceptual problems that have to be solved to make progress on a lot of the big problems we think are important.
- I think that in order to make effective intellectual progress, you need some way for many people to collaborate on solving problems and to document the progress they have made so far.
- I think there is potentially a lot of low-hanging fruit in designing better online platforms for making intellectual progress (which is why I chose to work on LessWrong + AI Alignment Forum + EA Forum). Ozzie works in this space too, and previously built Guesstimate (a spreadsheet where every cell is a probability distribution), which I think displayed some real innovation in the way we can use technology to communicate and clarify ideas. It was also produced to a very high standard of quality.
- Forecasting platforms in particular have already displayed significant promise and tractability, with recent work by Philip Tetlock showing that a simple prediction platform can outperform major governmental institutions like the CIA, and older work by Robin Hanson showing ways that prediction markets could help us make progress on a number of interesting problems.
- The biggest concern I have with Ozzie’s work, as well as the work on other prediction and aggregation platforms, is that the problem of getting people to actually use the product turns out to be very hard. Matt Fallshaw’s team at Trike Apps built https://predictionbook.com/, but then found it hard to get people to actually use it. Ozzie’s last project, Guesstimate, seemed quite well-executed, but similarly faltered due to low user numbers and a lack of interest from potential customers in industry. As such, I think it’s important not to underestimate the difficulty of making the product good enough that people actually use it.
- I do think that the road to building knowledge aggregation platforms will include many failed projects and many experiments that never get traction; as such, I do think that one should not over-update on the lack of users for some of the existing platforms. As a positive counterexample, the Good Judgment Project seems to have a consistently high number of people making predictions.
- I’ve also frequently interacted with Ozzie in person, and generally found his reasoning and judgment in this domain to be good. I also think it is quite good that he has been writing up his thinking for the community to read and engage with, which will allow other people to build off of his thinking and efforts, even if he doesn’t find traction with this particular project.
Johannes Heidecke ($25,000)
Supporting aspiring researchers of AI alignment to boost themselves into productivity
From the application:
(1) We would like to apply for a grant to fund an upcoming camp in Madrid that we are organizing. The camp consists of several weeks of online collaboration on concrete research questions, culminating in a 9-day intensive in-person research camp. Participants will work in groups on tightly-defined research projects in strategy and technical AI safety. Expert advisors from AI Safety/Strategy organizations will help refine proposals to be tractable and relevant. This allows for time-efficient use of advisors’ knowledge and research experience, and ensures that research is well-aligned with current priorities. More information: https://aisafetycamp.com/
(2) The field of AI alignment is talent-constrained, and while there is a significant number of young aspiring researchers who consider focussing their career on research on this topic, it is often very difficult for them to take the first steps and become productive with concrete and relevant projects. This is partially due to established researchers being time-constrained and not having time to supervise a large number of students. The goals of AISC are to help a relatively large number of high-talent people to take their first concrete steps in research on AI safety, connect them to collaborate, and efficiently use the capacities of experienced researchers to guide them on their path.
(3) We send out evaluation questionnaires directly after the camp and in regular intervals after the camp has passed. We measure impact on career decisions and collaborations and keep track of concrete output produced by the teams, such as blog posts or published articles.
We have successfully organized two camps before and are in the preparation phase for the third camp taking place in April 2019 near Madrid. I was the main organizer for the second camp and am advising the core team of the current camp, as well as organizing funding.
An overview of previous research projects from the first 2 camps can be found here:
We have evaluated the feedback from participants of the first two camps in the following two documents:
My thoughts and reasoning
I’ve talked with various participants of past AI Safety camps and heard broadly good things across the board. I also generally have a positive impression of the people involved, though I don’t know any of the organizers very well.
The material and testimonials that I’ve seen so far suggest that the camp successfully points participants towards a technical approach to AI Alignment, focusing on rigorous reasoning and clear explanations, which seems good to me.
I am not really sure whether I’ve observed significant positive outcomes of camps in past years, though this might just be because I am less connected to the European community these days.
I also have a sense that there is a lack of opportunities for people in Europe to productively work on AI Alignment related problems, and so I am particularly interested in investing in infrastructure and events there. This does however make this a higher-risk grant, since I think this means this event and the people surrounding it might become the main location for AI Alignment in Europe, and if the quality of the event and the people surrounding it isn’t high enough, this might cause long-term problems for the AI Alignment community in Europe.
- I think organizing long in-person events is hard, and conflict can easily have outsized negative effects. The reviews that I read from past years suggest that interpersonal conflict negatively affected many participants. Learning how to deal with conflict like this is difficult. The organizers seem to have considered this and thought a lot about it, but the most likely way I expect this grant to have large negative consequences is still if there is some kind of conflict at the camp that results in more serious problems.
- I think it’s inevitable that some people won’t get along with organizers or other participants at the camp for cultural reasons. If that happens, I think it’s important for these people to have some other way of getting connected to people working on AI Alignment. I don’t know the best way to arrange this, but I would want the organizers to think about ways to achieve it.
I also coordinated with Nicole Ross from CEA’s EA Grants project, who had considered also making a grant to the camp. We decided it would be better for the LTF Fund team to make this grant, though we wanted to make sure that some of the concerns Nicole had with this grant were summarized in our announcement:
- AISC could potentially turn away people who would be very good for AI Safety or EA, if those people have negative interactions at the camp or if they are much more talented than other participants (and therefore develop a low opinion of AI Safety and/or EA).
- Some negative interactions with people at the camp could, as with all residential programs, lead to harm and/or PR issues (for example, if someone at the camp were sexually harassed). Being able to handle such issues thoughtfully and carefully is a hard task, and additional support or advice may be beneficial.
This seems to roughly mirror my concerns above.
I would want to engage with the organizers a fair bit more before recommending a renewal of this grant, but I am happy about the project as a space for Europeans to get engaged with alignment ideas and work on them for a week together with other technical and engaged people.
Broadly, the effects of the camp seem very likely to be positive, while the (financial) cost of the camp seems small compared to the expected size of the impact. This makes me relatively confident that this grant is a good bet.
Vyacheslav Matyuhin ($50,000)
An offline community hub for rationalists and EAs
From the application:
Our team is working on the offline community hub for rationalists and EAs in Moscow called Kocherga (details on Kocherga are here).
We want to make sure it keeps existing and grows into the working model for building new flourishing local EA communities around the globe.
Our key assumptions are:
- There’s a gap between the “monthly meetup” EA communities and the larger (and significantly more productive/important) communities. That gap is hard to close for many reasons.
- Solving this issue systematically would add a lot of value to the global EA movement and, as a consequence, the long-term future of humanity.
- Closing the gap requires a lot of infrastructure, both organizational and technological.
So we work on building such an infrastructure. We also keep in mind the alignment and goodharting issues (building a big community of people who call themselves EAs but who don’t actually share EA virtues would be bad, obviously).
Concretely, we want to:
- Add 2 more people to our team.
- Implement our new community building strategy (which includes both organizational tasks such as new events and processes for seeding new working groups, and technological tasks such as implementing a website which allows people from the community to announce new private meetups or team up for coaching or mastermind groups)
- Improve our rationality workshops (in terms of scale and content quality). Workshops are important for attracting new community members, for keeping the high epistemic standards of the community and for making sure that community members can be as productive as possible.
To be able to do this, we need to cover our current expenses somehow until we become profitable on our own.
My thoughts and reasoning
The Russian rationality community is surprisingly big, which suggests both a certain level of competence from some of its core organizers and potential opportunities for more community building. The community has:
- Successfully translated The Sequences and HPMOR into Russian, as can be seen at the helpful LessWrong.ru site.
- Executed a successful Kickstarter campaign to distribute physical copies of HPMOR (over 7,000 copies).
- Built a community hub in Moscow called Kocherga, which is a financially self-sustaining anti-cafe (a cafe where you pay for time spent there rather than drinks/snacks) that hosts a variety of rationality events for roughly 100 attendees per week.
This grant is to the team that runs the Kocherga anti-cafe.
Their LessWrong write-up suggests:
- They have good skills at building spaces, running events, and generally preserving their culture while still being financially sustainable
- They’ve seen steady increases over time in available funding and attendees
- They’ve succeeded at being largely self-sufficient for 4 years
- They’ve successfully engaged with other local intellectual communities
- Their culture seems to value careful thinking and good discourse a lot, and they seem to have put serious effort into developing the art of rationality, including caring about the technical aspects and incorporating CFAR’s work into their thinking
I find myself having slightly conflicted feelings about the Russian rationality community trying to identify and integrate more with the EA community. I think a major predictor of how excited I have historically been about community building efforts has been a group’s emphasis on improving members’ judgement and thinking skills, as well as the degree to which it emphasizes high epistemic standards and careful thinking. I am quite excited about how Kocherga seems to have focused on those issues so far, and I am worried that this integration and change of identity will reduce that focus (as I think it has for some local and student groups that made a similar transition). That said, I think the Kocherga group has shown quite good judgement on this dimension (see here), which addresses many of my concerns, though I am still interested in thinking and talking about these issues further.
I’m somewhat concerned that I’m not aware of any major insights or unusually talented people from this community, but I expect the language barrier to be a big part of what is preventing me from hearing about those things. And I am somewhat confused about how to account for interesting ideas that don’t spread to the projects I care most about.
I think there are benefits to having an active Russian community that can take opportunities that are only available for people in Russia, or at least people who speak Russian. This particularly applies to policy-oriented work on AI alignment and other global catastrophic risks, which is also a domain that I feel confused about and have a hard time evaluating.
For a lot of the work that I do feel comfortable evaluating, I expect the vast majority of intellectual progress to be made in the English-speaking world, and as such, the question of how talent can flow from Russia to the existing communities working on the long-term future seems quite important. I hope this grant can facilitate a stronger connection between the rest of the world and the Russian community, to improve that talent and idea flow.
This grant seemed like a slightly better fit for the EA Meta fund. They decided not to fund it, so we made it instead, since it still seemed like a strong proposal to us.
What I have seen so far makes me confident that this grant is a good idea. However, before we make more grants like this, I would want to talk more to the organizers involved and generally get more information on the structure and culture of the Russian EA and rationality communities.
Jacob Lagerros ($27,000)
Building infrastructure to give x-risk researchers superforecasting ability with minimal overhead
From the application:
Build a private platform where AI safety and policy researchers have direct access to a base of superforecaster-equivalents, and where aspiring EAs with smaller opportunity costs but excellent calibration perform useful work.
I previously received two grants to work on this project: a half-time salary from EA Grants, and a grant for direct project expenses from BERI. Since then, I dropped out of a Master’s programme to work full-time on this, seeing that was the only way I could really succeed at building something great. However, during that transition there were some logistical issues with other grantmakers (explained in more detail in the application), hence I applied to the LTF for funding for food, board, travel and the runway to make more risk-neutral decisions and capture unexpected opportunities in the coming ~12 months of working on this.
My thoughts and reasoning
There were three main factors behind my recommending this grant:
- My object-level reasons for recommending this grant are quite similar to my reasons for recommending Ozzie Gooen’s and Anthony Aguirre’s.
- Jacob has been around the community for about 3 years. The output of his that I’ve seen has included (amongst other things) competently co-directing EAGxOxford 2016, and some thoughtful essays on LessWrong (e.g. 1, 2, 3, 4).
- Jacob’s work seems useful to me, and is being funded on the recommendation of the FHI Research Scholars Programme and the Berkeley Existential Risk Initiative. He is also collaborating with others I’m excited about (Metaculus and Ozzie Gooen).
However, I did not assess the grant in detail, as the only reason Jacob asked for a grant was due to logistical complications with other grantmakers. Since FHI and BERI have already investigated the project in more detail, I was happy to suggest we pick up the slack to ensure Jacob has the runway to pursue his work.
Connor Flexman ($20,000)
Perform independent research in collaboration with John Salvatier
I am recommending this grant with more hesitation than most of the other grants in this round. The reasons for hesitation are as follows:
- I was the primary person on the grant committee on whose recommendation this grant was made.
- Connor lives in the same group house that I live in, which I think adds a complicating conflict of interest to my recommendation.
- I have generally positive impressions of Connor, but I have not personally seen concrete, externally verifiable evidence that clearly demonstrates his good judgment and competence, which in combination with the other two factors makes me more hesitant than usual.
However, despite these reservations, I think this grant is a good choice. The two primary reasons are:
- Connor himself has worked on a variety of research and community building projects and, by both my own assessment and that of other people I talked to, has significant potential to become a strong generalist researcher, which I think is an axis on which a lot of important projects are bottlenecked.
- This grant was strongly recommended to me by John Salvatier, who is funded by an EA Grant and whose work I am generally excited about.
John did some very valuable community organizing while he lived in Seattle and is now working on developing techniques to facilitate skill transfer between experts in different domains. I think it is exceptionally hard to develop effective techniques for skill transfer, and more broadly techniques to improve people’s rationality and reasoning skills, but am sufficiently impressed with John’s thinking that I think he might be able to do it anyway (though I still have some reservations).
John is currently collaborating with Connor and requested funding to hire him to collaborate on his projects. After talking to Connor I decided it would be better to recommend a grant to Connor directly, encouraging him to continue working with John but also allowing him to switch towards other research projects if he finds he can’t contribute as productively to John’s research as he expects.
Overall, while I feel some hesitation about this grant, I think it’s very unlikely to have any significant negative consequences, and I assign some significant probability that this grant can help Connor develop into an excellent generalist researcher of a type that I feel like EA is currently quite bottlenecked on.
Eli Tyre ($30,000)
Broad project support for rationality and community building interventions
Eli has worked on a large variety of interesting and valuable projects over the last few years, many of them too small to have much payment infrastructure, resulting in him doing a lot of work without appropriate compensation. I think his work has been a prime example of picking low-hanging fruit by using local information and solving problems that aren’t worth solving at scale, and I want him to have resources to continue working in this space.
Concrete examples of projects he has worked on that I am excited about:
- Facilitating conversations between top people in AI alignment (I’ve in particular heard very good things about the 3-day conversation between Eric Drexler and Scott Garrabrant that Eli helped facilitate)
- Organizing advanced workshops on Double Crux and other key rationality techniques
- Doing a variety of small independent research projects, like this evaluation of birth order effects in mathematicians
- Providing many new EAs and rationalists with advice and guidance on how to get traction on working on important problems
- Helping John Salvatier develop techniques around skill transfer
I think Eli has exceptional judgment, and the goal of this grant is to allow him to take actions with greater leverage by hiring contractors, paying other community members for services, and paying for other varied expenses associated with his projects.
Robert Miles ($39,000)
Producing video content on AI alignment
From the application:
My goals are:
- To communicate to intelligent and technically-minded young people that AI Safety:
- is full of hard, open, technical problems which are fascinating to think about
- is a real existing field of research, not scifi speculation
- is a growing field, which is hiring
- To help others in the field communicate and advocate better, by providing high quality, approachable explanations of AIS concepts that people can share, instead of explaining the ideas themselves, or sharing technical documents that people won’t read
- To motivate myself to read and internalise the papers and textbooks, and become a technical AIS researcher in future
My thoughts and reasoning
I think video is a valuable medium for explaining a variety of different concepts (for the best examples of this, see 3Blue1Brown, CGP Grey, and Khan Academy). While there are a lot of people working directly on improving the long term future by writing explanatory content, Rob is the only person I know who has invested significantly in getting better at producing video content. I think this opens a unique set of opportunities for him.
The videos on his YouTube channel pick up an average of ~20k views. His videos on the official Computerphile channel often pick up more than 100k views, including for topics like logical uncertainty and corrigibility (incidentally, a term Rob came up with).
More things that make me optimistic about Rob’s broad approach:
- He explains that AI alignment is a technical problem. AI safety is not primarily a moral or political position; the biggest chunk of the problem is a matter of computer science. Reaching out to a technical audience to explain that AI safety is a technical problem, and thus directly related to their profession, is a type of ‘outreach’ that I’m very happy to endorse.
- He does not make AI safety a politicized matter. I am very happy that Rob is not needlessly tribalising his content, e.g. by talking about something like “good vs bad ML researchers”. He seems to simply portray it as a set of interesting and important technical problems in the development of AGI.
- His goal is to create interest in these problems from future researchers, and not to simply get as large an audience as possible. As such, Rob’s explanations don’t optimize for views at the expense of quality explanation. His videos are clearly designed to be engaging, but his explanations are simple and accurate. Rob often interacts with researchers in the community (at places like DeepMind and MIRI) to discuss which concepts are in need of better explanations. I don’t expect Rob to take unilateral action in this domain.
Rob is the first skilled person in the X-risk community working full-time on producing video content. Being the very best we have in this skill area, he is able to help the community in a number of novel ways (for example, he’s already helping existing organizations produce videos about their ideas).
Rob made a grant request during the last round, in which he explicitly requested funding for a collaboration with RAISE to produce videos for them. I currently don’t think that working with RAISE is the best use of Rob’s talent, and I’m skeptical of the product RAISE is currently trying to develop. I think it’s a better idea for Rob to focus his efforts on producing his own videos and supporting other organizations with his skills, though this grant doesn’t restrict him to working with any particular organization and I want him to feel free to continue working on RAISE if that is the project he thinks is currently most valuable.
Overall, Rob is developing a new and valuable skill within the X-risk community, and executing on it in a very competent and thoughtful way, making me pretty confident that this grant is a good idea.
My thoughts and reasoning
- MIRI is a 20-year-old research organization that seeks to resolve the core difficulties in the way of AGI having a positive impact.
- My model of MIRI’s approach looks something like an attempt to join the ranks of Turing, Shannon, von Neumann and others, in creating a fundamental piece of theory that helps humanity to understand a wide range of powerful phenomena. Gaining an understanding of the basic theory of intelligent agents well enough to think clearly about them is plausibly necessary for building an AGI that ensures the long term future goes well.
- It seems to me that they are making real progress (although I’m not confident of the rate of that progress) - for example, MIRI has discovered a Solomonoff-induction-style algorithm that can reason well under logical uncertainty, learning reasonable probabilities for mathematical propositions before they can be proved, which I found surprising. While I am uncertain about the usefulness of this particular insight on the path to further basic theory, I take it as some evidence that they’re using methods that can in principle make progress, which is something that I have historically been pessimistic about.
- Only in recent years have there been routes to working on alignment that have also given you funding, status, and a stable social life. Nowadays many others are helping out with the work of solving alignment, but MIRI core staff worked on the problem while all the incentives pulled in other directions. For me this is a strong sign of their integrity, and makes me expect they will make good decisions in many contexts where the best action isn't the locally incentivized action. It is also evidence that even if I can’t understand why a weird action of theirs is good, they will often still be correct to do it, and this is an outside view in favor of funding them in cases where I don't have my own inside-view model of why the project they're working on is good.
- On that note, MIRI has also worked on a number of other projects that have attempted to teach the skills behind their general methodology for reasoning quantitatively and scientifically about the world and taking right action. I regret not having the time to detail all the impacts of these projects, but they include (and are not limited to): LessWrong, The Sequences, HPMOR, Inadequate Equilibria, Embedded Agency, and CFAR (an organization I discuss below). I view these as some of the main reasons the x-risk community exists.
- Another outside view to consider is the support of MIRI by so many others whom I trust. Their funders have included Open Phil, BERI, FLI, and Jaan Tallinn, plus a variety of smaller donors I trust, and they are advised by Stuart Russell and Nick Bostrom. They’ve also been supported by other people who I don’t necessarily trust directly, but who I do think have interesting and valuable perspectives on the world, like Peter Thiel and Vitalik Buterin.
- I also judge the staff to be exceptionally competent. Some examples:
- The programming team has taken very early hires from multiple good startups such as Triplebyte, Recursion Pharmaceuticals, and Quixey, and also includes the Haskell core-developer Edward Kmett.
- The ops staff are currently, in my evaluation, the most competent operations team of any of the organizations that I have personally interacted with.
In sum, I think MIRI is one of the most competent and skilled teams attempting to improve the long-term future, I have a lot of trust in their decision-making, and I’m strongly in favor of ensuring that they’re able to continue their work.
Thoughts on funding gaps
Despite all of this, I have not actually recommended a large grant to MIRI.
- This is due to MIRI’s funding situation being solid at its current level (I would be thinking very differently if I annually had tens of millions of dollars to give away). But MIRI’s marginal use of dollars at this point of funding seems lower-impact, so I only recommended $50k.
- I feel conflicted about whether it might be better to give MIRI more money. Historically, it has been common in the EA funding landscape to only give funding to organizations when they have demonstrated concrete room for more funding, or when funding is the main bottleneck for the organization. I think this has allowed us to start many small organizations that are working on a variety of different problems.
- A common way in which at least some funding decisions are made is to compare the effect of a marginal donation now with the effect of a marginal donation at an earlier point in the project’s lifecycle (i.e. not wanting to invest in a project after it has hit strongly diminishing marginal returns, aka “maxed out its room for more funding” or “filled the funding gap”).
- However, when I think about this from first principles, I think we should expect a heavy-tailed (probably log-normal) distribution in the impact of different cause areas, individuals, and projects. And while I can imagine that many good opportunities might hit strong diminishing marginal returns early on, it doesn’t seem likely for most projects. Instead, I expect factors that stay constant over the life of a project, like its broader organizational philosophy, core staff, and choice of problem to solve, to determine a large part of its marginal value. Thus, we should expect our best guesses to be worth investing significant further resources into.
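To make the heavy-tail intuition concrete, here is a minimal simulation sketch. It is purely illustrative and not based on any real grant data: the distribution parameters (a log-normal with sigma 1.5, a normal with mean 1.0 and spread 0.2), the sample size, and the helper name `top_share` are all assumptions chosen for the example. It compares how much of the total "impact" the top 10% of projects account for under a heavy-tailed versus a thin-tailed distribution.

```python
import random


def top_share(impacts, fraction=0.1):
    """Share of total impact contributed by the top `fraction` of projects."""
    ranked = sorted(impacts, reverse=True)
    k = max(1, int(len(ranked) * fraction))
    return sum(ranked[:k]) / sum(ranked)


rng = random.Random(0)  # fixed seed so the sketch is reproducible
n = 10_000              # hypothetical number of projects

# Heavy-tailed case: log-normal impacts (sigma is an assumed value;
# larger sigma means a heavier right tail).
heavy = [rng.lognormvariate(0.0, 1.5) for _ in range(n)]

# Thin-tailed comparison: impacts clustered around 1.0, clipped at zero.
thin = [max(0.0, rng.gauss(1.0, 0.2)) for _ in range(n)]

print(f"log-normal: top 10% account for {top_share(heavy):.0%} of total impact")
print(f"normal:     top 10% account for {top_share(thin):.0%} of total impact")
```

Under these assumed parameters, the top decile dominates the total in the log-normal case and only modestly exceeds its proportional share in the thin-tailed case. The sketch says nothing about whether we can identify which projects sit in that top decile in advance.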
However, this is all complicated by a variety of countervailing considerations, such as the following three:
- Power law distributions of impact only really matter in this way if we can identify which interventions we expect to be in the right tail of impact, and I have a lot of trouble properly bounding my uncertainty here.
- If we are faced with significant uncertainty about cause areas, and we need organizations to have worked in an area for a long time before we can come to accurate estimates about its impact, then it’s a good idea to invest in a broad range of organizations in an attempt to get more information. This is related to common arguments around “explore/exploit tradeoffs”.
- Sometimes, making large amounts of funding available to one organization can have negative consequences for the broader ecosystem of a cause area. Also, giving an organization access to more funding than it can use productively may cause it to make too many hires or lose focus by trying to scale too quickly. Having more funding often also attracts adversarial actors and increases competitive stakes within an organization, making it a more likely target for attackers.
I can see arguments that we should expect additional funding for the best teams to be spent well, even accounting for diminishing margins, but on the other hand I can see many meta-level concerns that weigh against extra funding in such cases. Overall, I find myself confused about the marginal value of giving MIRI more money, and will think more about that between now and the next grant round.
[Edit: It seems relevant to mention that LessWrong is currently receiving operational support from CFAR, in a way that makes me technically an employee of CFAR (similar to how ACE and 80K were/are part of CEA for a long time). However, LessWrong operates as a completely separate entity with its own fundraising and hiring procedures, and I don't feel any hesitation or pressure to critique CFAR openly because of that relation. Though I find myself a tiny bit more hesitant to speak harshly of specific individuals, simply because I am only working a floor away from the CFAR offices and that does have some psychological effect on me. Though the same was true for CEA while LessWrong was located in the CEA office for a few months, and was true for residents of my group house while LessWrong was located in the living room of my group house for most of the past two years, so I don't think this effect is particularly large.]
I think that CFAR’s intro workshops have historically had a lot of positive impact. I think they have done so via three pathways.
- Establishing epistemic norms: I think CFAR workshops are quite good at helping the EA and rationality community establish norms about what good discourse and good reasoning look like. As a concrete example of this, the concept of Double Crux has gotten traction in the EA and rationality communities, which has improved the way ideas and information spread throughout the community, how ideas get evaluated, and what kinds of projects get resources. More broadly, I think CFAR workshops have helped in establishing a set of common norms about what good reasoning and understanding look like, similar to the effect of the Sequences on LessWrong.
- I think that it’s possible that the majority of the value of the EA and rationality communities comes from having that set of shared epistemic norms that allows them to reason collaboratively in a way that most other communities cannot (in the same way that what makes science work is a set of shared norms around what constitutes valid evidence and how new knowledge gets created).
- As an example of the importance of this: I think a lot of the initial arguments for why AI risk is a real concern were “weird” in a way that was not easily compatible with a naive empiricist worldview that I think is pretty common in the broader intellectual world.
- In particular, the arguments for AI risk are hard to test with experiments or empirical studies, but hold up from the perspective of logical and philosophical reasoning and are generated by a variety of good models of broader technological progress, game theory, and related areas of study. But for those arguments to find traction, they required a group of people with the relevant skills and habits of thought for generating, evaluating, and having extended intellectual discourse about these kinds of arguments.
- Training: A percentage of intro workshop participants (many of whom were already working on important problems within X-risk) have seen significant improvements in competence; as a result, they became substantially more effective in their work.
- Recruitment: CFAR has helped many people move from passive membership in the EA and rationality community to having strong social bonds in the X-risk network.
While I do think that CFAR has historically caused a significant amount of impact, I feel hesitant about this grant because I am unsure whether CFAR can continue to create the same amount of impact in the future. I have a few reasons for this:
- First, all of its founding staff and many other early staff have left. I broadly expect organizations to get a lot worse once their early staff leaves.
- Some examples of people who left after working there:
- Julia Galef (left a few years ago to start the Update Project)
- Andrew Critch (left to join first Jane Street, then MIRI, then founded CHAI and BERI)
- Kenzi Askhie
- Duncan Sabien
- Anna Salamon has reduced her involvement in the last few years and seems significantly less involved with the broader strategic direction of CFAR (though she is still involved in some of the day-to-day operations, curriculum development, and more recent CFAR programmer workshops). [Note: After talking to Anna about this, I am now less certain of whether this actually applies and am currently confused on this point]
- Duncan Sabien is no longer involved in day-to-day work, but still does some amount of teaching at intro workshops and programmer workshops (though I think he is planning to phase that out) and will help with the upcoming instructor training.
- I think that Julia, Anna and Critch have all worked on projects of enormous importance, and their work over the last few years has clearly demonstrated a level of competence that makes me expect that CFAR will struggle to maintain its level of quality with their involvement significantly reduced.
- From recent conversations with CFAR, I’ve gotten a sense that the staff isn’t interested in increasing the number of intro workshops, that the intro workshops don’t feel particularly exciting for the staff, and that most staff are less interested in improving the intro workshops than other parts of CFAR. This makes it less likely that those workshops will maintain their quality and impact, and I currently think that those workshops are likely one of the best ways for CFAR to have a large impact.
- I have a general sense that CFAR is struggling to attract top talent, partially because some of the best staff left, and partially due to a general sense of a lack of forward momentum for the organization. This is a bad sign, because I think CFAR in particular benefits from having highly talented individuals teach at their workshops and serve as a concrete example of the skills they’re trying to teach.
- My impression is that while the intro workshops were historically focused on instrumental rationality and personal productivity, the original CFAR staff was oriented quite strongly around truth-seeking. Core rationality concepts were conveyed indirectly by the staff in smaller conversations and in the broader culture of the organization. The current staff seems less oriented around that kind of epistemic rationality, and so I expect that if they continue their current focus on personal productivity and instrumental rationality, the epistemic benefits of CFAR workshops will be reduced significantly, and those are the benefits I care about most.
However, there are some additional considerations that led me to recommend this grant.
- First, CFAR and MIRI are collaborating on a set of programmer-focused workshops that I am also quite positive on. I think those workshops are less directly influenced by counterfactual donations than the mainline workshops, since I expect MIRI to fund them in any case, but they do still rely on CFAR existing as an institution that can provide instructors. I am excited about the opportunities the workshops will enable in terms of curriculum development, since they can focus almost solely on epistemic rationality.
- Second, I think that if CFAR does not receive a grant now, there’s a good chance they’d be forced to let significant portions of their staff go, or take some other irreversible action. CFAR decided not to run a fundraiser last fall because they felt like they’d made significant mistakes surrounding a decision made by a community dispute panel that they set up and were responsible for, and they felt like it would be in poor taste to ask the community for money before they thoroughly investigated what went wrong and released a public statement.
- I think this was the correct course of action, and I think overall CFAR’s response to the mistakes they made last year has been quite good.
- The lack of a fundraiser led CFAR to have a much greater need for funding than usual, and a grant this round will likely make a significant difference in CFAR’s future.
In the last year, I had some concerns about the way CFAR communicated a lot of its insights, and I sensed an insufficient emphasis on a kind of robust and transparent reasoning that I don’t have a great name for. I don’t think the communication style I was advocating for is always the best way to make new discoveries, but it is very important for establishing broader community-wide epistemic norms, and it enables a kind of long-term intellectual progress that I think is necessary for solving the intellectual challenges we’ll need to overcome to avoid global catastrophic risks. I think CFAR is likely to respond to last year’s events by improving their communication and reasoning style in this respect (from my perspective).
My overall read is that CFAR is performing a variety of valuable community functions and has a strong enough track record that I want to make sure that it can continue existing as an institution. I didn’t have enough time this grant round to understand how the future of CFAR will play out; the current grant amount seems sufficient to ensure that CFAR does not have to take any drastic action until our next grant round. By the next grant round, I plan to have spent more time learning and thinking about CFAR’s trajectory and future, and to have a more confident opinion about what the correct funding level for CFAR is.