November 2022 update: I wrote this post during a difficult period in my life. I still agree with the basic point I was gesturing towards, but regret some of the presentation decisions I made. I may make another attempt in the future.
"A system that ignores feedback has already begun the process of terminal instability."
– John Gall, Systemantics
(My request from last time still stands.)
jimrandomh wrote a great comment in response to my last post:
The core thesis here seems to be:
"I claim that [cluster of organizations] have collectively decided that they do not need to participate in tight feedback loops with reality in order to have a huge, positive impact."
There are different ways of unpacking this, so before I respond I want to disambiguate them. Here are four different unpackings:
1. Tight feedback loops are important, [cluster of organizations] could be doing a better job creating them, and this is a priority. (I agree with this. Reality doesn't grade on a curve.)
2. Tight feedback loops are important, and [cluster of organizations] is doing a bad job of creating them, relative to organizations in the same reference class. (I disagree with this. If graded on a curve, we're doing pretty well.)
3. Tight feedback loops are important, but [cluster of organizations] has concluded in their explicit verbal reasoning that they aren't important. (I am very confident that this is false for at least some of the organizations named, where I have visibility into the thinking of decision makers involved.)
4. Tight feedback loops are important, but [cluster of organizations] is implicitly deprioritizing and avoiding them, by ignoring/forgetting discouraging information, and by incentivizing positive narratives over truthful narratives.
(4) is the interesting version of this claim, and I think there's some truth to it. I also think that this problem is much more widespread than just our own community, and fixing it is likely one of the core bottlenecks for civilization as a whole.
I think part of the problem is that people get triggered into defensiveness; when they mentally simulate (or emotionally half-simulate) setting up a feedback mechanism, if that feedback mechanism tells them they're doing the wrong thing, their anticipations put a lot of weight on the possibility that they'll be shamed and punished, and not much weight on the possibility that they'll be able to switch to something else that works better. I think these anticipations are mostly wrong; in my anecdotal observation, the actual reaction organizations get to poor results followed by a pivot is usually at least positive about the pivot, at least from the people who matter. But getting people who've internalized a prediction of doom and shame to surface those models, and do things that would make the outcome legible, is very hard.
...
I replied:
Thank you for this thoughtful reply! I appreciate it, and the disambiguation is helpful. (I would personally like to do as much thinking-in-public about this stuff as seems feasible.)
I mean a combination of (1) and (4). I used to not believe that (4) was a thing, but then I started to notice (usually unconscious) patterns of (4) behavior arising in me, and as I investigated further I kept noticing more & more (4) behavior in me, so now I think it's really a thing (because I don't believe that I'm an outlier in this regard).
...
I agree with jimrandomh that (4) is the most interesting version of this claim. What would it look like if the cluster of EA & Rationality organizations I pointed to last time were implicitly deprioritizing getting feedback from reality?
I don't have a crisp articulation of this yet, so here are some examples that seem to me to gesture in that direction:
- Giving What We Can focusing on the number of pledges signed rather than on the amount of money donated by pledge-signers (or better yet, on the impact those donations have had on projects out in the world).
- Founders Pledge focusing on the amount of money pledged and the amount of money donated, rather than on the impact those donations have had out in the world.
- The Against Malaria Foundation and GiveWell focusing on the number of mosquito nets distributed, rather than on the change in malaria incidence in the regions where they have distributed nets.
- 80,000 Hours tracking the number of advising calls they make and the number of career plan changes they catalyze, rather than the long-run impacts their advisees are having in the world.
  - It's interesting to compare how 80,000 Hours and Emergent Ventures assess their impact; also, 80,000 Hours-style career coaching would plausibly be much more effective if it were coupled with small grants to support advisee exploration. (This would be more of an incubator model.)
- CFAR not tracking workshop participant outcomes in a standardized way over time.
- The Open Philanthropy Project re-granting funds to community members on the basis of reputation (1, 2), rather than on the basis of a track record of effectively deploying capital or on the basis of having concrete, specific plans.
- We could also consider the difficulties that EA Funds had re: deploying capital a few years ago, though as far as I know that situation has improved somewhat in the last couple of years (thanks in large part to the heroic efforts of a few individuals).
Please don't misunderstand – I'm not suggesting that the people involved in these examples are doing anything wrong. I don't think that they are behaving malevolently. The situation seems to me to be more systemic: capable, well-intentioned people begin participating in an equilibrium wherein the incentives of the system encourage drift away from reality.
There are a lot of feedback loops in the examples I list above... but those loops don't seem to connect back to reality, to the actual situation on the ground. Instead, they seem to spiral upwards – metrics tracking opinions, metrics tracking the decisions & beliefs of other people in the community. Goodhart's Law neatly sums up the problem.
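To make the Goodhart dynamic concrete, here is a minimal toy sketch (in Python, with made-up numbers and a hypothetical effort-allocation model; it is not a description of any actual organization): an organization that optimizes the proxy metric (pledges signed) ends up neglecting the follow-up work that the true objective (donations that actually happen) depends on.

```python
# Toy illustration of Goodhart's Law (all numbers are invented):
# an organization splits effort between recruiting new pledges and
# following up so that existing pledges turn into actual donations.

def pledges_signed(recruiting_effort: float) -> float:
    """Proxy metric: pledges scale with recruiting effort (diminishing returns)."""
    return 100 * recruiting_effort ** 0.5

def donations(recruiting_effort: float, followup_effort: float) -> float:
    """True objective: donations require both pledges and follow-up."""
    conversion_rate = 0.2 + 0.6 * followup_effort  # follow-up raises conversion
    return pledges_signed(recruiting_effort) * conversion_rate

def best_split(objective) -> float:
    """Grid-search the share of effort spent on recruiting (the rest on follow-up)."""
    candidates = [i / 100 for i in range(101)]
    return max(candidates, key=lambda r: objective(r, 1 - r))

# Optimizing the proxy pushes all effort into recruiting...
proxy_split = best_split(lambda r, f: pledges_signed(r))
# ...while optimizing the true objective balances the two.
true_split = best_split(donations)

print(f"proxy-optimal recruiting share: {proxy_split:.2f}, "
      f"donations: {donations(proxy_split, 1 - proxy_split):.1f}")
print(f"impact-optimal recruiting share: {true_split:.2f}, "
      f"donations: {donations(true_split, 1 - true_split):.1f}")
```

The specific numbers don't matter; the point is that a metric which is easy to move can come apart from the outcome it was meant to track, and the more single-mindedly you optimize the metric, the wider the gap gets.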
Why does this happen? Why do capable, well-intentioned people get sucked into equilibria that are deeply, obviously strange?
Let's revisit this part of jimrandomh's great comment:
I think part of the problem is that people get triggered into defensiveness; when they mentally simulate (or emotionally half-simulate) setting up a feedback mechanism, if that feedback mechanism tells them they're doing the wrong thing, their anticipations put a lot of weight on the possibility that they'll be shamed and punished, and not much weight on the possibility that they'll be able to switch to something else that works better. I think these anticipations are mostly wrong; in my anecdotal observation, the actual reaction organizations get to poor results followed by a pivot is usually at least positive about the pivot, at least from the people who matter. But getting people who've internalized a prediction of doom and shame to surface those models, and do things that would make the outcome legible, is very hard.
I don't have a full articulation yet, but I think this starts to get at it. The strange equilibria fulfill a real emotional need for the people who are attracted to them (see Core Transformation for discussion of one approach towards developing an alternative basis for meeting this need).
And from within an equilibrium like this, pointing out the dynamics by which it maintains homeostasis is often perceived as an attack...