Feb 05, 2017
[Epistemic status: strongly stated, weakly held]
When faced with problems that involve ongoing learning, most strategies involve a balance between "exploration" and "exploitation". Exploration means taking opportunities that increase your knowledge about how good your opportunities are, whereas exploitation means putting resources into what you currently believe is the best opportunity. A good strategy will involve both: if you only explore, then you will never actually reap any rewards, whereas if you only exploit then you will likely spend all your resources on a poor opportunity.
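To make the trade-off concrete, here is a minimal sketch of an epsilon-greedy strategy on a toy three-option problem. Everything in it is an illustrative assumption rather than anything taken from the EA setting: the function name `run_bandit`, the payoff probabilities, and the choice of epsilon-greedy itself, which is just one simple way of mixing the two behaviours.

```python
import random


def run_bandit(true_means, epsilon, pulls, seed=0):
    """Play a toy multi-armed bandit with an epsilon-greedy strategy.

    true_means: hypothetical success probability of each option.
    epsilon:    fraction of pulls spent exploring (choosing at random).
    Returns (total reward, number of times each option was chosen).
    """
    rng = random.Random(seed)
    counts = [0] * len(true_means)
    estimates = [0.0] * len(true_means)  # current belief about each option
    total = 0
    for _ in range(pulls):
        if rng.random() < epsilon:
            # Explore: try an option at random to learn more about it.
            arm = rng.randrange(len(true_means))
        else:
            # Exploit: pick whatever currently looks best (first one on ties).
            arm = max(range(len(true_means)), key=lambda i: estimates[i])
        reward = 1 if rng.random() < true_means[arm] else 0
        counts[arm] += 1
        # Update the running average payoff for the chosen option.
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total += reward
    return total, counts


# Made-up payoffs; the first option happens to be the worst one.
means = [0.2, 0.5, 0.8]
for eps in (0.0, 0.1, 1.0):
    total, counts = run_bandit(means, eps, pulls=10_000)
    print(f"epsilon={eps}: total reward={total}, choices={counts}")
```

With these made-up numbers, the pure exploiter (epsilon of 0) never leaves the first option it tried, the pure explorer (epsilon of 1) earns roughly the average payoff across all options, and the mixed strategy should comfortably beat both.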
When we in the EA community are thinking about how to spend our resources, we face an exploration/exploitation problem. In this post I'm going to argue that:
I'm deliberately going to leave these terms fairly vague - I'm not entirely sure that they can be made precise in this context.[1] However, here are some canonical exemplars of what exploration and exploitation look like in the setting of EA:
We actually have two additional options that aren't available in traditional exploration/exploitation problems:
Exploration improves the effectiveness of exploitation, because we may discover that we can switch our resources to a better opportunity. If GiveWell discovered a new top charity that was significantly more effective than its current ones, then many EAs who are donating (exploiting) might switch their donations to that charity, increasing the amount of good they achieved with their exploitation.
"Smart money" (and, although it's less fungible, "smart capacity" like talented employees) corresponds to our exploitation capacity. The greater our exploitation capacity, the more benefit we gain from moving our resources to a better opportunity, and so the more valuable it is to explore.
We actually have a couple of quantitative models of how useful exploration is:
Both of these models suggest that the cost-effectiveness of spending on exploration could be many times that of spending on exploitation (funding charities), and that the value of exploration is greatly increased by the amount of "smart money" that can be directed by improved information.
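The dependence on "smart money" can be illustrated with a back-of-the-envelope calculation. This is not either of the models referred to above; it is a toy sketch in which the function name `exploration_leverage` and all of the numbers are hypothetical. It assumes that exploration has some chance of finding an opportunity that is better by a fixed fraction, and that all of the smart money then moves to it.

```python
def exploration_leverage(smart_money, exploration_cost, p_success, improvement):
    """Expected good from exploring, relative to donating the same money directly.

    Good is measured in "dollars donated to the current best opportunity".
    smart_money:      money that would follow a better recommendation (assumed).
    exploration_cost: money spent on the research (assumed).
    p_success:        chance the research finds something better (assumed).
    improvement:      fractional gain in effectiveness if it does, e.g. 0.5 = 50%.
    """
    # Expected extra good produced by redirecting the smart money.
    expected_gain = p_success * improvement * smart_money
    # Donating the research budget directly would have produced exploration_cost.
    return expected_gain / exploration_cost


# Hypothetical: $1m of research, 20% chance of finding something 50% better.
for smart_money in (10_000_000, 100_000_000, 1_000_000_000):
    ratio = exploration_leverage(smart_money, 1_000_000, 0.2, 0.5)
    print(f"smart money ${smart_money:,}: exploration is worth {ratio:.1f}x direct giving")
```

Under these assumptions the ratio grows in direct proportion to the amount of smart money, which is the qualitative point: the same research becomes more valuable as the budget it can redirect grows.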
The improvement from more exploration may not be a simple multiplier on the effectiveness of our exploitation (as is assumed in e.g. the DCP model), because opportunities do not have an unlimited capacity to absorb resources. If anything, this makes exploration more important: if we do not explore enough then we are at risk of running out of effective ways to spend our resources, in which case we would have to fall back on worse opportunities or just save our money.[3]
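A small sketch of the limited-capacity point, under loudly stated assumptions: the function `allocate`, the greedy "fill the best opportunity first" rule, and every number below are hypothetical, chosen only to show what happens when a budget outgrows its best opportunities.

```python
def allocate(budget, opportunities):
    """Greedily fund opportunities in order of effectiveness.

    opportunities: list of (good per unit of money, capacity to absorb money).
    Returns (total good achieved, money left unspent).
    """
    total_good = 0.0
    remaining = budget
    # Fund the most effective opportunity first, then spill over to worse ones.
    for effectiveness, capacity in sorted(opportunities, reverse=True):
        spend = min(remaining, capacity)
        total_good += spend * effectiveness
        remaining -= spend
    return total_good, remaining


# Hypothetical known opportunities: one excellent but small, one mediocre.
known = [(10, 50), (2, 30)]
print(allocate(100, known))                # (560.0, 20): 20 units have nowhere to go

# Suppose exploration uncovers a new opportunity: nearly as good, with room for 40.
print(allocate(100, known + [(8, 40)]))    # (840.0, 0): everything is put to use
```

In this toy case the extra opportunity matters not because it beats the current best, but because the budget had already exhausted the best option's capacity.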
The EA community has recently seen a vast increase in its exploitation capacity. Not only has the community expanded a great deal, but the Open Philanthropy Project (OPP) now has the potential to provide a vast amount of funding. That means that exploration (and hence improving exploration capacity) is increasingly valuable, since it has the potential to enhance a huge budget.
The EA community has benefited from an excellent initial endowment of exploration done by the academic community (Glennerster and Kremer; DCP), and later GiveWell and Giving What We Can. That means that there has been plenty of low-hanging fruit available for collection, if we could get the resources to harvest it. So we have largely been able to focus on improving exploitation (outreach).
That doesn't mean that we haven't increased our exploration capacity: today there is more exploration happening than ever, with all of the existing research organizations having increased their capacity, and new organizations like OPP also doing exploration work.
However, the total amount of effort expended in this direction by the EA community is still small relative to how much our exploitation capacity has improved. I very roughly estimate the number of researchers doing exploratory work at fewer than 100.[4] These jobs also tend not to be especially highly paid, so on a generous assumption of ~$50k per person, and doubling to allow for support staff, that's still less than $10m being spent on research by the movement as a whole.[5]
I don't have a good idea of what kind of spending ratio would be ideal, but this seems low to me (more precise models welcome!). As a heuristic, I think most EAs would consider RCTs to be a cost-effective way of getting high-quality information about a possible solution. But the EA community has spent almost no money on funding RCTs![6] Hopefully this means that we are spending our money on exploration that is even more valuable than RCTs, but it also means that we have not even progressed far enough up the curve of diminishing marginal returns to exploration to reach "doing RCTs".
From an exploitation point of view, all of these are equivalent:
In all of these cases we gain the ability to devote resources to a good opportunity that we could not previously have exploited.
The EA community has so far focussed on discovering and assessing existing funding opportunities, which is not surprising given its historical roots in charity evaluation. This is not an unreasonable strategy, even if the opportunities which you assess are simply generated randomly (as is the case in the DCP model).
But we might be able to do even better if we can generate high-quality opportunities deliberately, rather than having to wait for them to happen by accident. Matt Clifford, who runs EF, a London startup accelerator, describes traditional VC investing as waiting for high-value "lightning strikes", whereas an effective startup incubator should be like an electricity generator, producing value deliberately and predictably. I think we can aspire to something similar for high-impact organizations.
One way to try and create a high-impact organization is to sit down and think very hard about what organizations might be high-impact, and then once you think you have a good idea, go ahead and implement it. But we know that this is often a bad strategy! The modern startup industry functions more like an experimental laboratory. Sometimes you gain a great deal of information from actually trying things.[7] Experimentation has the added virtue that a successful experiment can gradually transition into an exploitation opportunity as it grows.
There are a number of organizations in or around EA who are working on exploration through experimentation:
However, I think there is room to do much more, both directly and in capacity-building. For example:
As a community, we should invest more in exploration.[10] Concretely,
In particular, starting or funding new high-impact organizations does not seem to be a focus area for the "giga-funders" like OPP, so this may be a place where small donors can bet better.
Finally, we should start doing this as soon as possible. Exploration tends to take time, and there is already some possibility that OPP could saturate most of our best available opportunities. Time to expand the frontier!
[Cross-posted from my blog]
If we consider value of information (VoI), then saying "we should do more exploration" just amounts to saying "the VoI of this (exploration) action is high enough to make it the best option", in which case we are only ever doing exploitation.
I could probably have written this post and only talked about VoI, and I don't think it would have changed the conclusions much. However, I find the exploration/exploitation dichotomy to be intuitive, so I've stuck with it.↩
I still think this model is plausible for many meta-charities, especially since we think that many of them are drawing from fat-tailed distributions.↩
Of course, this is not a problem if we are already planning to give later.↩
Making some very rough estimates of the numbers of exploratory research staff employed by various organizations: I believe OPP has ~10, CEA has ~5, GiveWell has ~7, ACE has ~3, Sentience Politics ~3. I don't have a good idea how many researchers are employed by other relevant organizations across all cause areas, but I'm fairly sure the number of full-time researchers must be less than 100.↩
There is also exploration work relevant to EA that is being done outside the movement, e.g. by J-PAL, but I'm disregarding that as we have less influence over how that money is spent.↩
Obviously the best strategy is a mixed one - pure experimentation is very slow and wasteful. But if we're anything like other domains we should expect to have quite a lot of experimentation in the mix.↩
This is similar to what Spark Wave is doing, but there's room for more work in this space!↩
Note that this is a bit more specific than just saying "fund meta", since I'm not including things like EA movement-building which have historically been considered part of "meta". I think these are helpful, but not the best ways to improve exploration.↩