
[Epistemic status: strongly stated, weakly held]

When faced with problems that involve ongoing learning, most strategies involve a balance between "exploration" and "exploitation". Exploration means taking opportunities that increase your knowledge about how good your opportunities are, whereas exploitation means putting resources into what you currently believe is the best opportunity. A good strategy will involve both: if you only explore, then you will never actually reap any rewards, whereas if you only exploit then you will likely spend all your resources on a poor opportunity.

When we in the EA community are thinking about how to spend our resources, we face an exploration/exploitation problem. In this post I'm going to argue that:

  1. The value of exploration is higher than ever
  2. The EA community is not doing enough exploration
  3. Exploration through experimentation is particularly neglected

Preface: What do exploration and exploitation look like for EA?

I'm deliberately going to leave these terms fairly vague - I'm not entirely sure that they can be made precise in this context.[1] However, here are some canonical exemplars of what exploration and exploitation look like in the setting of EA:

  • Doing or donating to the best direct work that you know of is exploitation.
  • Trying to work out what to work on or donate to is exploration.

We actually have two additional options that aren't available in traditional exploration/exploitation problems:

  • Increase our exploitation capacity. This represents how effectively we can exploit good opportunities, so includes things like increasing our available funds or people.
  • Increase our exploration capacity. This represents how effectively we can explore, so includes things like increasing our pool of available researchers or forming better exploration institutions.

The value of exploration is higher than ever

Exploration improves the effectiveness of exploitation, because we may discover that we can switch our resources to a better opportunity. If GiveWell discovered a new top charity that was significantly more effective than its current ones, then many EAs who are donating (exploiting) might switch their donations to that charity, increasing the amount of good they achieved with their exploitation.

"Smart money" (and, although it's less fungible, "smart capacity" like talented employees) corresponds to our exploitation capacity. The greater our exploitation capacity, the more benefit we gain from moving our resources to a better opportunity, and so the more valuable it is to explore.

We actually have a couple of quantitative models of how useful exploration is:

  1. Some old work of mine estimating the value of funding the Disease Control Priorities (DCP) project (part 1, part 2).[2]
  2. Peter Hurford's model of how valuable it is to create a new GiveWell-recommended charity.

Both of these models suggest that the cost-effectiveness of spending on exploration could be many times that of spending on exploitation (funding charities), and that the value of exploration is greatly increased by the amount of "smart money" that can be directed by improved information.
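To make the dependence on "smart money" concrete, here is a deliberately crude sketch. It is not the DCP model or Peter Hurford's model; the function name, the linear form, and the numbers are all illustrative assumptions. It treats the payoff from a successful piece of exploration as the resources that would move, multiplied by the improvement in effectiveness per unit of resources.

```python
def gain_from_switching(capacity, old_effectiveness, new_effectiveness):
    """Extra good done by moving `capacity` units of resources to a better opportunity.

    Hypothetical toy model: assumes good done is linear in resources, i.e. no
    diminishing returns and no limit on how much the new opportunity can absorb.
    """
    # Exploration that finds nothing better simply leaves resources where they are.
    improvement = max(new_effectiveness - old_effectiveness, 0.0)
    return capacity * improvement


# Illustrative numbers only: the same discovery (1.0 -> 1.5 units of good per
# dollar) evaluated against a small and a large pool of movable money.
small_pool_gain = gain_from_switching(1_000_000, 1.0, 1.5)
large_pool_gain = gain_from_switching(100_000_000, 1.0, 1.5)
print(small_pool_gain, large_pool_gain)
```

Under this linear assumption, a hundredfold increase in movable resources makes the same discovery a hundred times more valuable.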

The improvement from more exploration may not be a simple multiplier on the effectiveness of our exploitation (as is assumed in e.g. the DCP model), because opportunities do not have an unlimited capacity to absorb resources. If anything, this makes exploration more important: if we do not explore enough then we are at risk of running out of effective ways to spend our resources, in which case we would have to fall back on worse opportunities or just save our money.[3]

The EA community has recently seen a vast increase in its exploitation capacity. Not only has the community expanded a great deal, but the Open Philanthropy Project (OPP) now has the potential to provide a vast amount of funding. That means that exploration (and hence improving exploration capacity) is increasingly valuable, since it has the potential to enhance a huge budget.

The EA community is not doing enough exploration

The EA community has benefited from an excellent initial endowment of exploration done by the academic community (Glennerster and Kremer; DCP), and later GiveWell and Giving What We Can. That means that there has been plenty of low-hanging fruit available for collection, if we could get the resources to harvest it. So we have largely been able to focus on improving exploitation (outreach).

That doesn't mean that we haven't increased our exploration capacity: today there is more exploration happening than ever, with all of the existing research organizations having increased their capacity, and new organizations like OPP also doing exploration work.

However, the total amount of effort expended in this direction by the EA community is still small relative to how much our exploitation capacity has improved. I very roughly estimate the number of researchers doing exploratory work at <100.[4] These jobs also tend not to be incredibly highly paid, so at a generous assumption of ~$50k per person, and doubling to allow for support staff, that's still less than $10m being spent on research by the movement as a whole.[5]

In comparison, OPP alone expects to have disbursed about $100m in grants in 2016, and eventually to grant as much as $400m a year.
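As a quick check on the arithmetic, here is the same back-of-the-envelope calculation using only the rough figures quoted above. The variable names are mine, and the inputs are the post's upper-bound guesses, not measured data.

```python
# Upper-bound guesses taken from the text above.
researchers_upper_bound = 100        # "<100" exploratory researchers
cost_per_researcher = 50_000         # generous ~$50k per person
support_staff_multiplier = 2         # doubling to allow for support staff

research_spend_upper_bound = (
    researchers_upper_bound * cost_per_researcher * support_staff_multiplier
)

opp_grants_2016 = 100_000_000        # about $100m disbursed in 2016
opp_grants_eventual = 400_000_000    # as much as $400m a year eventually

ratio_2016 = research_spend_upper_bound / opp_grants_2016
ratio_eventual = research_spend_upper_bound / opp_grants_eventual

print(f"Research spend (upper bound): ${research_spend_upper_bound:,}")
print(f"Share of OPP 2016 grants: {ratio_2016:.1%}")
print(f"Share of OPP eventual annual grants: {ratio_eventual:.1%}")
```

On these figures, research spending is at most about 10% of OPP's 2016 grants and about 2.5% of its eventual annual grantmaking.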

I don't have a good idea of what kind of spending ratio would be ideal, but this seems low to me (more precise models welcome!). As a heuristic, I think most EAs would consider RCTs to be a cost-effective way of getting high-quality information about a possible solution. But the EA community has spent almost no money on funding RCTs![6] Hopefully this means that we are spending our money doing exploration that is even more valuable than RCTs, but that means that we have not even progressed far enough up the curve of diminishing marginal returns to exploration to reach "doing RCTs".

Exploration through experimentation is particularly neglected

From an exploitation point of view, all of these are equivalent:

  • Discovering a good opportunity
  • Discovering that a known opportunity is good
  • Creating a new good opportunity

In all of these cases we gain the ability to devote resources to a good opportunity that we could not previously have exploited.

The EA community has so far focussed on discovering and assessing existing funding opportunities, which is not surprising given its historical roots in charity evaluation. This is not an unreasonable strategy, even if the opportunities which you assess are simply generated randomly (as is the case in the DCP model).

But we might be able to do even better if we can generate high-quality opportunities deliberately, rather than having to wait for them to happen by accident. Matt Clifford, who runs EF, a London startup accelerator, describes traditional VC investing as waiting for high-value "lightning strikes", whereas an effective startup incubator should be like an electricity generator, producing value deliberately and predictably. I think we can aspire to something similar for high-impact organizations.

One way to try and create a high-impact organization is to sit down and think very hard about what organizations might be high-impact, and then once you think you have a good idea, go ahead and implement it. But we know that this is often a bad strategy! The modern startup industry functions more like an experimental laboratory. Sometimes you gain a great deal of information from actually trying things.[7] Experimentation has the added virtue that a successful experiment can gradually transition into an exploitation opportunity as it grows.

There are a number of organizations in or around EA that are working on exploration through experimentation.

However, I think there is room to do much more, both directly and in capacity-building. For example:

  • Charity Entrepreneurship found multiple promising ideas, but as far as I know nobody has picked up any of the remaining ones.
  • Charity Entrepreneurship has primarily targeted health so far, but we should be doing similar explorations in other promising areas.
  • The Good Technology Project is a "technique specialist" focussing on technology startups; analogously, we could have e.g. an organization that focussed on starting policy campaigns targeting important problems.
  • We could create other kinds of institution for systematically generating high-impact organizations, such as impact-focussed startup incubators.[8] This is essentially capacity-building for exploration through experimentation - I am particularly excited by this idea.
  • We could experiment with VC-style gated-funding models which encourage demonstrating results rapidly and aborting failed experiments.[9]


As a community, we should invest more in exploration.[10] Concretely:

  • Organizations doing research, like GiveWell and OPP, should invest even more in capacity-building.
  • We should start and fund experimental new organizations.
  • We should start and fund organizations that focus on systematically generating further high-impact organizations.

In particular, starting or funding new high-impact organizations does not seem to be a focus area for the "giga-funders" like OPP, so this may be a place where small donors can bet better.

Finally, we should start doing this as soon as possible. Exploration tends to take time, and there is already some possibility that OPP could saturate most of our best available opportunities. Time to expand the frontier!

[Cross-posted from my blog]

  1. If we consider value of information (VoI), then saying "we should do more exploration" just amounts to saying "the VoI of this (exploration) action is high enough to make it the best option", in which case we are only ever doing exploitation.

    I could probably have written this post and only talked about VoI, and I don't think it would have changed the conclusions much. However, I find the exploration/exploitation dichotomy to be intuitive, so I've stuck with it.

  2. I still think this model is plausible for many meta-charities, especially since we think that many of them are drawing from fat-tailed distributions.

  3. Of course, this is not a problem if we are already planning to give later.

  4. Making some very rough estimates of the numbers of exploratory research staff employed by various organizations: I believe OPP has ~10, CEA has ~5, GiveWell has ~7, ACE has ~3, Sentience Politics ~3. I don't have a good idea how many researchers are employed by other relevant organizations across all cause areas, but I'm fairly sure the number of full-time researchers must be less than 100.

  5. There is also exploration work relevant to EA that is being done outside the movement, e.g. by J-PAL, but I'm disregarding that as we have less influence over how that money is spent.

  6. The few exceptions that I know of are the studies relating to the effectiveness of vegan outreach (e.g. this one), although those have had some problems.

  7. Obviously the best strategy is a mixed one - pure experimentation is very slow and wasteful. But if we're anything like other domains we should expect to have quite a lot of experimentation in the mix.

  8. This is similar to what Spark Wave is doing, but there's room for more work in this space!

  9. The Global Innovation Fund does something like this, with different funding levels for organizations at the "Pilot", "Test" and "Scale" phases.

  10. Note that this is a bit more specific than just saying "fund meta", since I'm not including things like EA movement-building which have historically been considered part of "meta". I think these are helpful, but not the best ways to improve exploration.


Good post!

Animal Charity Evaluators also has the Animal Advocacy Research Fund which has $1,000,000 to give out over 3 years to fund research, which you should probably count as money spent on exploration.

Depending on what you mean by 'direct work', x-risk orgs could also be counted as currently doing mostly exploration, or at least don't fit very neatly into the dichotomy. Still even with these additions I doubt this would raise it above ~$20 million a year, which would probably not be enough to change your conclusion.

I'm unsure where the balance should lie quantitatively. I think that $100 million would probably be too much, and $10 million is probably too low.

I agree that x-risk work doesn't fit nicely into this: it's not even clear whether you'd want to count the output of research as "actually" what you want or as VoI.

Thanks for the post. I broadly agree.

There are some more remarks on "gaps" in EA here: https://80000hours.org/2015/11/why-you-should-focus-more-on-talent-gaps-not-funding-gaps/

Two quick additions:

1) I'm not sure spending on RCTs is especially promising. Well-run RCTs that actually have power to update you can easily cost tens of millions of dollars, so you'd need to be considering spending hundreds of millions for it to be worth it. We're only just getting to this scale. GiveWell has considered funding RCTs in the past and rejected it, I think for this reason (though I'm not sure).

2) It might be interesting for someone to think more about multi-arm bandit problems, since it seems like it could be a good analogy for cause selection. An approximate solution is to exploit your best opportunity 90% of the time, then randomly select another opportunity to explore 10% of the time. https://en.wikipedia.org/wiki/Multi-armed_bandit


An approximate solution is to exploit your best opportunity 90% of the time, then randomly select another opportunity to explore 10% of the time.

This is the epsilon-greedy strategy with epsilon = 0.1, which is probably a good rule of thumb for when one's prior for each of the causes has a slim-tailed distribution (e.g. Gaussian). The optimal value of epsilon increases with the variance in our prior for each of the causes. So if we have a cause and our confidence interval for its cost effectiveness goes over more than an order of magnitude (high variance), a higher value of epsilon could be better. Point is - the rule of thumb doesn't really apply when you think some causes are much better than others and you have plenty of uncertainty.
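For anyone who wants to play with this, here is a minimal sketch of epsilon-greedy on a toy bandit. Everything in it is an assumption made for illustration: the function name, the Gaussian reward noise, and the example "cause" payoffs. Note that the textbook version explores by picking uniformly among all arms, including the current best, which differs slightly from "randomly select another opportunity".

```python
import random


def epsilon_greedy(true_means, epsilon=0.1, pulls=10_000, noise_sd=1.0, seed=0):
    """Run epsilon-greedy on a bandit with Gaussian rewards (toy setup).

    Returns (counts, estimates, total_reward), where counts[a] is how often
    arm `a` was pulled and estimates[a] is its running mean reward.
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    counts = [0] * n_arms
    estimates = [0.0] * n_arms
    total_reward = 0.0

    for _ in range(pulls):
        if rng.random() < epsilon:
            # Explore: pick any arm uniformly at random.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pick the arm with the highest estimated mean so far.
            arm = max(range(n_arms), key=lambda a: estimates[a])

        reward = rng.gauss(true_means[arm], noise_sd)
        counts[arm] += 1
        # Incremental update of the running mean for the pulled arm.
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total_reward += reward

    return counts, estimates, total_reward


# Hypothetical payoffs for three "causes"; the third is truly the best.
counts, estimates, total_reward = epsilon_greedy([1.0, 2.0, 3.0], epsilon=0.1)
print(counts)
```

Varying `epsilon` and the spread of `true_means` is a quick way to get a feel for how much exploration pays off under different priors, though it is no substitute for a proper Gittins-index calculation.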

That said, if you had realistic priors for the effectiveness of each cause, you could calculate an optimal solution using Gittins indices.

It might be interesting for someone to think more about multi-arm bandit problems, since it seems like it could be a good analogy for cause selection. An approximate solution is to exploit your best opportunity 90% of the time, then randomly select another opportunity to explore 10% of the time. https://en.wikipedia.org/wiki/Multi-armed_bandit

I'm doing some research along these lines with Bayesian Bandits.

1) I nearly added a section about whether exploration is funding- or talent-constrained! In short, I'm not sure, and I suspect it's different in different places. It sounds like OPP is probably talent-constrained, but other orgs may differ. In particular, if we wanted to try some of my other suggestions for improving exploration, like building institutions to start new orgs, then that's potentially quite funding-intensive.

2) I'm not sure whether multi-armed bandits actually model our situation, since I'm not sure if you can incorporate situations where you can change the efficiencies of your actions. What does "improving exploration capacity" look like in a multi-armed bandit? There may also be complications because we don't even know the size of the option set.

What does "improving exploration capacity" look like in a multi-armed bandit?

You could potentially model this as (a) an increase in the number of bandit pulls you can do in parallel (simple models only assume one pull at a time), (b) a decrease in the amount of time it takes between a bandit pull and the information being received (simple bandit models assume this to be instantaneous), or (c) an increase in the accuracy of information received by each bandit pull (simple models assume the information received is perfectly accurate).
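A rough sketch of how (a) and (c) might show up in a simulation, under assumptions of my own: `pulls_per_arm` stands in for how many pulls you can afford per round, `noise_sd` for how inaccurate each pull is, and rewards are Gaussian. The delay in (b) is not modelled here. The function estimates how often one round of exploration correctly identifies the best arm.

```python
import random


def prob_best_identified(true_means, pulls_per_arm, noise_sd, trials=2000, seed=0):
    """Estimate the chance that sample means pick out the truly best arm.

    Toy setup: each arm is pulled `pulls_per_arm` times per trial, and each
    pull returns the arm's true mean plus Gaussian noise with sd `noise_sd`.
    """
    rng = random.Random(seed)
    arms = range(len(true_means))
    best_arm = max(arms, key=lambda a: true_means[a])
    hits = 0

    for _ in range(trials):
        sample_means = [
            sum(rng.gauss(mean, noise_sd) for _ in range(pulls_per_arm)) / pulls_per_arm
            for mean in true_means
        ]
        if max(arms, key=lambda a: sample_means[a]) == best_arm:
            hits += 1

    return hits / trials


# Two hypothetical opportunities whose true values are close together.
means = [1.0, 1.5]
baseline = prob_best_identified(means, pulls_per_arm=1, noise_sd=2.0)
more_pulls = prob_best_identified(means, pulls_per_arm=16, noise_sd=2.0)   # (a)
less_noise = prob_best_identified(means, pulls_per_arm=1, noise_sd=0.5)    # (c)
print(baseline, more_pulls, less_noise)
```

In this toy version, more pulls and more accurate pulls both raise the chance of spotting the better opportunity, which is one way to read "improving exploration capacity".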

It sounds like OPP is probably talent-constrained

This seems likely to me given that they certainly have more funding than they currently know how to spend, but given that they are not openly hiring right now, I imagine they are probably just not constrained by talent or money.

Yes! Probably when we think of Importance, Neglectedness, and Tractability, we should also consider informativeness!

We've considered wrapping it into the problem framework in the past, but it can easily get confusing. Informativeness is also more of a feature of how you go about working on the cause, rather than which cause you're focused on.

The current way we show that we think VOI is important is by listing Global Priorities Research as a top area (though I agree that doesn't quite capture it). I also talk about it often when discussing how to coordinate with the EA community (VOI is a bigger factor when considering the community perspective than individual perspective).

The 'Neglectedness' criterion gets you a pretty big tilt in favour of working on underexplored problems already. But value of information is an important factor in choosing what project to work on within a problem area.

I think I agree with this - it's usually the case that one particular sub-problem in an area is particularly informative to work on.

However, I think it's at least possible that some areas are systematically very informative to work on. For example, if the primary work is research, then you should expect to mainly be outputting information. AI research might be like this.

Great post. Completely agree with the general concept and have a few positive updates on the Charity Entrepreneurship front.

We are working with another team to get one of the other promising ideas from our initial CE research founded. A public post on this will come out sometime in the next month or so.

Additionally, we are in fact working on expanding the model we used on Charity Entrepreneurship for health to a much wider subset of causes and crucial considerations, to end up with some charities we/others can start in broader areas. Our first post on this, which is going up publicly very soon, is on explore/exploit and optimal stopping, but in the context of starting charities. We also talk about multi-armed bandit problems in it.

This is really exciting, looking forward to these posts.

The Charity Entrepreneurship model is interesting to me because you're trying to do something analogous to what we're doing at the Good Technology Project - cause new high impact organisations to exist. Whereas we started meta (trying to get other entrepreneurs to work on important problems) you started at the object level (setting up a charity and only later trying to get other people to start other charities). Why did you go for this depth-first approach?

So this response could also be a whole post in and of itself, but briefly, there were 3 big reasons:

1) We thought that it's generally quite hard to start an extremely effective charity and also quite hard to influence pre-existing ones. Additionally, it's quite easy to start something ineffective. GiveWell gives even us and New Incentives only a 10-20% chance of successfully starting a charity, and I think these are relatively high rates compared to what I would expect to happen if we only attempted to inspire others (our team already has experience founding an EA meta-charity, for example).

2) We were in a pretty good position to start something. We had a strong team that worked well together, the timing seemed quite good for starting a direct charity in the poverty space, and we thought this space was very high impact.

3) We figured that once we had started something we would be much stronger mentors and know the process a lot better. We have already found this to be very true as we are coaching other projects through this process.

In general, I could imagine switching to a strategy that is more hands off and tries to inspire folks in a very meta way (e.g. incubator or heavy mentoring). If we see a few people pick up our CE ideas and take a good shot at them our probabilities of doing something like this would go up a lot.

Agreed. You can also add the Effective Altruism Foundation to your list. One of its strategies is to try out many high-risk, high-reward interventions, especially in the animal advocacy space, to reap the value of information of these experiments and to profit from the potentially greater neglectedness due to the risk aversion of most other actors.

The Foundational Research Institute is also run by EAF.

(I used to work for EAF.)

GiveWell has made several big research-related grants, e.g. a $2M grant to IDinsight, part of which funds impact evaluations (which probably includes RCTs): http://www.givewell.org/charities/IDinsight/june-2016-grant

Exploration through experimentation might also be neglected because it's uncomfortable and unintuitive. EAs traditionally make a distinction between 'work out how to do the most good' and 'do it'. We like to work out whether something is good through careful analysis first, and once we're confident enough of a path we then optimise for exploitation. This is comforting because we then get to do work only when we're fairly confident of it being the right path. But perhaps we need to get more psychologically comfortable with mixing the two together in an experimental approach.

Agreed that we should be doing more exploration. I think one reason there hasn't been as much is it's a harder sell. "Give me money that I can use to save lives - I've already found a method that works" is a more convincing plea than "give me money so I can sit around and think of an altruistic way to spend other people's money - I swear I'll work effectively at this." Of course, big established organizations like OPP can do this, but I think the hard sell creates a barrier to entry.

Exploration also carries significant risk of failure, which can be offputting. I don't think there's any way around that but to be somewhat tolerant of failure. But not so tolerant that people don't try hard!

Yeah. Probably the fact that it carries an uncertain risk of failure, and that the level of success is also unknowable, makes it more off-putting. Especially since EA has a quantitative bent.

One possible area for exploration is around Schistosomiasis prevention, as reinfection rates appear to be high after deworming campaigns. PMA2020 has launched an annual survey to measure the impact of Schistosomiasis control programs in Uganda.

Johns Hopkins University/Center for Communication Programs in Uganda will be conducting a mass media campaign to promote Schistosomiasis prevention in fall 2017 before deworming day. The 2017 PMA2020 survey should be able to measure changes in knowledge, attitudes and practices after the mass media campaign. If there is funding in place, the 2018 PMA2020 survey may be able to measure the impact of the mass media campaign on actual infection rates.

Does anyone have ideas for exploration around Schistosomiasis prevention? With the PMA2020 survey, there is a unique opportunity for data collection to help evaluate potential Schistosomiasis prevention programs.

Disclosure: I am helping fund both the data collection and mass media program in Uganda

I think this is a case where we're unlikely to be able to offer anything beyond what the academic community is going to do. I think the best way to improve exploration around schistosomiasis prevention would probably be to just fund some more PhD students!

In practice, no one can stomach what exploration looks like: giving money away to things that look like they won't work (and aren't tax deductible). I started a company because I expect the number of dollars available for weird stuff to continue to be near zero for the foreseeable future.

This problem is exacerbated by the fact that the most valuable people to have doing exploration with each other are also very very valuable exploiters, so the perceived opportunity cost is high.

I really like this. In fact, I would take it a step further. I believe we should expand the multi-armed bandit model to cover exploring areas like:

  • Philosophy, particularly ethics. It would be nice to know whether Hedonic or Preference utilitarianism is correct without having to compute all of our Coherent Extrapolated Volition. Perhaps a few people doing such narrow, targeted research could make headway in our lifetimes with university funding rather than EA money. This seems likely to make a large impact in what EAs fund for generations to come.

  • Neuroscience, especially regarding qualia and types of thought with moral concern. Principia Qualia is an EA attempt to solve this. This could resolve the questions which divide EA between animal and human charities.

  • Finding new big ideas. There are already people working on big projects like x-risks, s-risk, space colonization/industrialization, curing aging, cryogenic freezing, simulating neurons digitally, brain-computer interfaces, AGI, nanotechnology, self-replicating machines, etc. Most of these will likely fail, but perhaps a few will succeed if we're not just deluding ourselves on all counts. Are there entirely new fields which no one has thought of yet? I suspect the answer is yes. Having a better understanding of our own utility function would narrow the search space of valuable ideas significantly, but I think we likely can make headway based on existing philosophy. Robin Hanson made some suggestions just a few days ago.

  • Improving EA thought, mental-tools, physical tools, methodologies, and other ways of exploring/exploiting more efficiently. (This was mentioned in the OP, but I wanted to highlight it.)

Poor matching of employees/employers in impoverished countries seems particularly neglected, tractable, and scalable.
