This is a linkpost for https://confusopoly.com/2019/04/03/the-optimizers-curse-wrong-way-reductions/.
Summary
I spent about two and a half years as a research analyst at GiveWell. For most of my time there, I was the point person on GiveWell’s main cost-effectiveness analyses. I’ve come to believe there are serious, underappreciated issues with the methods the effective altruism (EA) community at large uses to prioritize causes and programs. While effective altruists approach prioritization in a number of different ways, most approaches involve (a) roughly estimating the possible impacts funding opportunities could have and (b) assessing the probability that possible impacts will be realized if an opportunity is funded.
I discuss the phenomenon of the optimizer’s curse: when assessments of activities’ impacts are uncertain, engaging in the activities that look most promising will tend to have a smaller impact than anticipated. I argue that the optimizer’s curse should be extremely concerning when prioritizing among funding opportunities that involve substantial, poorly understood uncertainty. I further argue that proposed Bayesian approaches to avoiding the optimizer’s curse are often unrealistic. I maintain that it is a mistake to try to understand all uncertainty in terms of precise probability estimates.
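The effect can be reproduced in a toy simulation. What follows is a minimal sketch under stated assumptions, not a model of any real funding decision: true impacts and estimation noise are both assumed to be normally distributed, and the function name `average_overestimate` and its parameters are hypothetical.

```python
import random
import statistics

def average_overestimate(noise_sd, n_options=10, n_trials=5000, seed=0):
    """Mean of (estimated - true) impact for the option that looks best."""
    rng = random.Random(seed)
    gaps = []
    for _ in range(n_trials):
        # Assumption: true impacts are drawn from a standard normal.
        true_values = [rng.gauss(0.0, 1.0) for _ in range(n_options)]
        # Each estimate is unbiased: the truth plus zero-mean noise.
        estimates = [v + rng.gauss(0.0, noise_sd) for v in true_values]
        # Fund whichever option has the highest estimate.
        best = max(range(n_options), key=lambda i: estimates[i])
        gaps.append(estimates[best] - true_values[best])
    return statistics.mean(gaps)

for noise_sd in (0.0, 0.5, 1.0, 2.0):
    gap = average_overestimate(noise_sd)
    print(f"noise_sd={noise_sd}: average overestimate = {gap:.2f}")
```

Every individual estimate here is unbiased, yet the estimate of the selected option is biased upward, and in this setup the bias grows as the estimates get noisier.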
I go into a lot more detail in the full post.
You seem to be using "people all agree" as a stand-in for "the optimizer's curse has been addressed". I don't follow. It has been mathematically demonstrated that the optimizer's curse can be addressed. People will disagree about the specific inputs, but that disagreement doesn't mean they haven't addressed the optimizer's curse.
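As a rough illustration of the kind of correction at issue, here is a sketch of Bayesian shrinkage in the simplest textbook case. It assumes a normal prior and normal noise whose variances are known exactly, which is precisely the kind of input people may disagree about; the function name `mean_surprise` and all parameter values are hypothetical.

```python
import random
import statistics

def mean_surprise(shrink, n_options=10, prior_sd=1.0, noise_sd=1.0,
                  n_trials=5000, seed=0):
    """Average (predicted - true) value of the top-ranked option.

    If `shrink` is True, each raw estimate is replaced by its posterior mean
    under a N(0, prior_sd^2) prior and N(0, noise_sd^2) estimation noise.
    """
    rng = random.Random(seed)
    # Weight on the raw estimate in the normal-normal posterior mean.
    weight = prior_sd**2 / (prior_sd**2 + noise_sd**2) if shrink else 1.0
    gaps = []
    for _ in range(n_trials):
        # Assumption: the prior matches how true values are actually generated.
        true_values = [rng.gauss(0.0, prior_sd) for _ in range(n_options)]
        raw = [v + rng.gauss(0.0, noise_sd) for v in true_values]
        predicted = [weight * r for r in raw]
        best = max(range(n_options), key=lambda i: predicted[i])
        gaps.append(predicted[best] - true_values[best])
    return statistics.mean(gaps)

print(f"raw estimates:    {mean_surprise(shrink=False):.2f}")
print(f"shrunk estimates: {mean_surprise(shrink=True):.2f}")
```

With raw estimates the chosen option disappoints on average; with shrunk estimates the average surprise is close to zero. That result depends on the prior and noise level being specified correctly, which the simulation guarantees by construction.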
I think combining into a single model is generally appropriate. And the sub-models need not be fully, explicitly laid out.
Suppose I'm demonstrating that poverty charity > animal charity. I don't have to build one model assuming "1 human = 50 chickens", another model assuming "1 human = 100 chickens", and so on.
Instead I just set a general standard for how robust my claims are going to be, and I feel sufficiently confident saying "1 human = at least 60 chickens", so I use that rather than my mean expectation (e.g. 90).
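A minimal sketch of that idea, assuming beliefs about the exchange rate can be represented as samples from a distribution. The lognormal shape and its parameters are made up so that the mean lands near 90 and the lower decile near 60; they are illustrative, not estimates of anything.

```python
import random
import statistics

def lower_decile(samples):
    """Value that roughly 90% of the samples exceed."""
    # statistics.quantiles with n=10 returns the nine decile cut points.
    return statistics.quantiles(samples, n=10)[0]

rng = random.Random(0)
# Hypothetical beliefs about how many chickens equal one human.
beliefs = [rng.lognormvariate(4.459, 0.285) for _ in range(10_000)]

mean_rate = statistics.mean(beliefs)
robust_rate = lower_decile(beliefs)
print(f"mean expectation: {mean_rate:.0f}, robust lower bound: {robust_rate:.0f}")
```

The robustness standard is the choice of quantile: a stricter standard means a lower quantile and a more conservative number to plug into the single model.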