I'm currently researching forecasting and epistemics as part of the Quantified Uncertainty Research Institute.
Thanks so much for sharing that, it adds a lot of context to the conversation.

I really, really hope this post doesn't act anything like "the last word" on this topic. This post was Nuno doing his best with only a few weeks of research based on publicly-accessible information (which is fairly sparse, and I could understand why). The main thing he was focused on was simple cost-effectiveness estimation of the key parameters, compared to GiveWell top charities, which I agree is a very high bar. I agree work on this scale really could use dramatically more comprehensive analysis, especially if other funders are likely to continue funding effectiveness-maximizing work here.

One small point: I read this analysis much more as suggesting that "CJR is really tough to make effective compared to top GiveWell charities, upon naive analyses" than anything like "the specific team involved did a poor job".
Quick point that I'm fairly suspicious of uniform distributions for such uncertainties. I'd agree that our format of a 90% CI can be deceptive, especially when people aren't used to it. I imagine it would eventually be really neat to have probability distribution support right in the EA Forum. Until then, I'm curious if there are better ways to write the statistical summaries of many variables.
To me, "0.01 to 0.1" doesn't suggest that 5.5% is the "mean", but I could definitely appreciate that others would think that.
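To make the gap between the two readings concrete, here is a rough sketch in Python. It assumes, purely for illustration, that "0.01 to 0.1" is read as the 5th and 95th percentiles of a lognormal distribution; that choice of distribution is my assumption here, not something stated in the post, and the function names are made up for this example.

```python
import math
from statistics import NormalDist


def lognormal_from_90ci(low, high):
    """Return (mu, sigma) of a lognormal whose 5th/95th percentiles are low/high.

    Assumption for illustration: the interval is a 90% CI of a lognormal.
    """
    z = NormalDist().inv_cdf(0.95)  # about 1.645
    mu = (math.log(low) + math.log(high)) / 2
    sigma = (math.log(high) - math.log(low)) / (2 * z)
    return mu, sigma


def lognormal_mean(mu, sigma):
    # Mean of a lognormal: exp(mu + sigma^2 / 2)
    return math.exp(mu + sigma ** 2 / 2)


def lognormal_median(mu, sigma):
    # Median of a lognormal: exp(mu), the geometric midpoint of the CI
    return math.exp(mu)


def uniform_mean(low, high):
    # Mean if the interval is instead read as the bounds of a uniform
    return (low + high) / 2


low, high = 0.01, 0.1
mu, sigma = lognormal_from_90ci(low, high)
print(f"uniform mean:     {uniform_mean(low, high):.4f}")
print(f"lognormal mean:   {lognormal_mean(mu, sigma):.4f}")
print(f"lognormal median: {lognormal_median(mu, sigma):.4f}")
```

Under these assumptions the uniform reading gives a mean of 5.5%, while the lognormal reading gives a mean of roughly 4% and a median of roughly 3.2%, so the summary a reader takes away depends a lot on which distribution they silently assume.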
I'll write more of a post about it in a few weeks. Right now it's not really meant for external use; the API isn't quite stable.

That said, you can see some older posts which give the main idea: https://www.lesswrong.com/s/rDe8QE5NvXcZYzgZ3
Good to hear it shifted your opinion!

> I can see why releasing information about personal incompetence for instance might be unusual in some cultures; I'm not sure why you can't build a culture where releasing such information is accepted.

I agree it's possible, but think it's a ton of work! Intense cultural change is really tough. Imagine an environment, for instance, where we had a public ledger of, for every single person:
There would be many positives of having such a list. However, it would also create a lot of problems, especially in the short-term. It would be pretty radical.

I think it's good for people/orgs to experiment with radical honesty, though OP should probably be late to that. (You'd want to do experiments with smaller groups without so many commitments). The company Bridgewater is noted as being particularly honest. It seems to produce some good results, but it's also an environment that seems terrible for most people. I recommend looking into reviews of it; they're pretty interesting. (Also, note that Elie and Holden both worked at Bridgewater).
Thanks for the comment! I think it's really useful to hear concerns and have public discussions about them.

As stated earlier, this post went through a few rounds of revisions. We're looking to strike balances between publishing useful evaluative takes while not being disrespectful or personally upsetting.

I think it's very easy to go too far on either side of this. It's very easy to not upset anyone, but also not say anything, for instance. We're still trying to find the best balances, as well as finding the best ways to achieve both candor and little offense. I'm sorry that this came off as containing personal attacks.

> Chloe and Jesse are competent and committed people working in a cause area that does not meet the 1000x threshold currently set by GiveWell top charities

Maybe the disagreement is partially in the framing? I think this post was agreeing with you that it doesn't seem to match the (incredibly high) bar of GiveWell top charities. I think many people came at this thinking that maybe criminal justice did meet that bar, so this post was mostly about flagging that in retrospect, it didn't seem to.

For what it's worth, I'd definitely agree that it is incredibly difficult to meet that bar. There are lots of incredibly competent people who couldn't do that.

If you have recommendations for ways this post and future evaluations could improve, I'd of course be very curious.
How do you balance your high opinion of OpenPhil with the assumption that there's information that cannot be made public, and which tips the scale in important decisions?
This is almost always the case for large organizations. All CEOs or government officials have a lot of private information that influences their decision making.
This private information does make it much more difficult for external evaluators to evaluate them. However, there's often still a lot that can be inferred. It's really important that these evaluators stay humble about their analysis in light of the fact that there's a lot of private information, but it's also important that evaluators still try, given the information available.
In fairness, the situation is a bit confusing. Open Phil came from GiveWell, which is meant for external donors. In comparison, as Linch mentioned, Open Phil mainly recommends donations just to Good Ventures (Cari Tuna and Dustin Moskovitz). My impression is that OP's main concern is directly making good grants, not recommending good grants to other funders. Therefore, a large amount of public research is not particularly crucial. I think the name is probably not quite ideal for this purpose. I think of it more like "Highly Effective Philanthropy"; it seems their comparative advantage / unique attribute is much more their choices of focus and their talent pool, than it is their openness, at this point. If there is frustration here, it seems like the frustration is a bit more "it would be nice if they could change their name to be more reflective of their current focus", than "they should change their work to reflect the previous title they chose".
In this case, the "mistakes" are often a list of things like, "This specific organization was much worse than we thought. The founders happened to have issues A, B, and C, which really hurt the organization's performance". Releasing such information publicly is a big pain and something our culture is not very well attuned to. If OP basically announced information like, "This small group we funded is terrible, in large part because their CEO, George, is very incompetent", that would be very unusual and there would likely be a large amount of resistance. I imagine OP would get a ton of heat for doing that. This issue is amplified by the fact that OP is a grantmaking organization much more than a grant recommendation organization. If it posted information publicly about an organization's competence, it's not clear which other orgs would actually use that information. My guess is that internally, people at OP admit a lot of mistakes. But making these mistakes public would create a lot of hazards, just because they contain a lot of information that specific organizations they work with would much prefer to be private.
I previously gave a fair bit of feedback to this document. I wanted to quickly give my take on a few things.
Overall, I found the analysis interesting and useful. However, I have a somewhat different take than Nuno did.
On OP:

- Aaron Gertler / OP were given a previous version of this that was less carefully worded. To my surprise, he recommended going forward with publishing it, for the sake of community discourse. I’m really thankful for that.
- This analysis didn’t get me to change my mind much about Open Philanthropy. I thought fairly highly of them before and after, and expect that many others who have been around would think similarly. I think they’re a fair bit away from being an “idealized utilitarian agent” (in part because they explicitly claim not to be), but still much better than most charitable foundations and the like.
On this particular issue:

- My guess is that in the case of criminal justice reform, there were some key facts of the decision-making process that aren’t public and are unlikely to ever be public. It’s very common in large organizations for compromises to be made for various political or social reasons, for example. I’ve previously written a bit about similar things [here](https://twitter.com/ozziegooen/status/1456992079326978052).
- I think Nuno’s quantitative estimates were pretty interesting, but I wouldn’t be too surprised if other smart people would come up with numbers that are fairly different. For those reading this, I’d take the quantitative estimates with a lot of uncertainty.
- My guess is that a “highly intelligent idealized utilitarian agent” probably would have invested a fair bit less in criminal justice reform than OP did, if at all.
On evaluation, more broadly:

- I’ve found OP to be a very intimidating target of critique or evaluation, mainly just because of their position. Many of us are likely to want funding from them in the future (or from people that listen to them), so the risk of getting people at OP upset is very high. From a cost-benefit position, publicly critiquing OP (or other high-status EA organizations) seems pretty risky. This is obviously unfortunate; these groups are often appreciative of feedback, and of course, they are some of the most useful groups to give feedback to. (Sometimes prestigious EAs complain about getting too little feedback; I think this is one reason why.)
- I really would hate for this post to be taken as “ammunition” by people with agendas against OP. I’m fairly paranoid about this. That wasn’t the point of this piece at all. If future evaluations are mainly used as “ammunition” by “groups with grudges”, then that makes it far more hazardous and costly to publish them. If we want lots of great evaluations, we’ll need an environment that doesn’t weaponize them.
- Similarly to the above point, I prefer these sorts of analysis and the resulting discussions to be fairly dispassionate and rational. When dealing with significant charity decisions I think it’s easy for some people to get emotional. “$200M could have saved X lives!”. But in the scheme of things, there are many decisions like this to make, and there will definitely be large mistakes made. Our main goals should be to learn quickly and continue to improve in our decisions going forward.
- One huge set of missing information is OP’s internal judgements of specific grants. I’m sure they’re very critical now of some groups they’ve previously funded (in all causes, not just criminal justice). However, it would likely be very awkward and unprofessional to actually release this information publicly.
- For many of the reasons mentioned above, I think we can rarely fully trust the public reasons for large actions by large institutions. When a CEO leaves to “spend more time with family”, there’s almost always another good explanation. I think OP is much better than most organizations at being honest, but I’d expect that they still face this issue to an extent. As such, I think we shouldn’t be too surprised when some decisions they make seem strange when evaluating them based on their given public explanations.
I appreciate the response here, but would flag that this came off, to me, as a bit mean-spirited. One specific part:

> If you think that not trusting you is good, because you are liable to certain suboptimal mechanisms established early on, then are you acknowledging that your recommendations are suboptimal? Where would you suggest that impact-focused donors in EA look?

1. He said "less trust", not "not trust at all". I took that to mean something like, "don't place absolute reverence in our public messaging."
2. I'm sure anyone reasonable would acknowledge that their recommendations are less than optimal.
3. "Where would you suggest that impact-focused donors in EA look" -> There's not one true source that you should only pay attention to. You should probably look at a diversity of sources, including OP's work.