This is an extract from a post called "Doing EA Better", which argued that EA's new-found power and influence obligates us to solve our movement's significant problems with respect to epistemics, rigour, expertise, governance, and power.
We are splitting DEAB up into a sequence to facilitate object-level discussion.
Each post will include the relevant parts of the list of suggested reforms. There isn't a perfect correspondence between the subheadings of the post and the reforms list, so not all reforms listed will be 100% relevant to the section in question.
Finally, we have tried (imperfectly) to be reasonably precise in our wording, and we ask that before criticising an argument of ours, commenters ensure that it is an argument that we are in fact making.
Summary: EA is highly culturally quantitative, which is optimal for some problem categories but not others. Trying to put numbers on everything causes information loss and triggers anchoring and certainty biases. Individual Bayesian Thinking, prized in EA, has significant methodological issues. Thinking in numbers, especially when those numbers are subjective “rough estimates”, allows one to justify anything comparatively easily, and can lead to wasteful and immoral decisions.
EA places an extremely high value on quantitative thinking, mostly focusing on two key concepts: expected value (EV) calculations and Bayesian probability estimates.
From the EA Forum wiki: “The expected value of an act is the sum of the value of each of its possible outcomes multiplied by their probability of occurring.” Bayes’ theorem is a simple mathematical tool for updating our estimate of the likelihood of an event in response to new information.
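In symbols, the two definitions above can be written roughly as follows. The notation is an assumption made for illustration rather than something taken from the Forum wiki: an act a, possible outcomes o_i with values V(o_i), a hypothesis H, and evidence E (P(E|H) is the same quantity referred to in footnote 29).

```latex
\mathrm{EV}(a) = \sum_{i} P(o_i \mid a)\, V(o_i)
\qquad\qquad
P(H \mid E) = \frac{P(E \mid H)\, P(H)}{P(E)}
```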
Individual Bayesian Thinking (IBT) is a technique inherited by EA from the Rationalist subculture, where one attempts to use Bayes’ theorem on an everyday basis. You assign each of your beliefs a numerical probability of being true and attempt to mentally apply Bayes’ theorem, increasing or decreasing the probability in question in response to new evidence. This is sometimes called “Bayesian epistemology” in EA, but to avoid confusing it with the broader approach to formal epistemology with the same name we will stick with IBT.
There is nothing wrong with quantitative thinking, and much of the power of EA grows from its dedication to the numerical. However, this is often taken to the extreme, where people try to think almost exclusively along numerical lines, causing them to neglect important qualitative factors or else attempt to replace them with doubtful or even meaningless numbers because “something is better than nothing”. These numbers are often subjective “best guesses” with little empirical basis.[27]
For instance, Bayesian estimates are heavily influenced by one’s initial figure (one’s “prior”), which, especially when dealing with complex, poorly-defined, and highly uncertain and speculative phenomena, can become subjective (based on unspecified values, worldviews, and assumptions) to the point of arbitrariness.[28] This is particularly true in existential risk studies where one may not have good evidence to update on.
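To make the dependence on the prior concrete, here is a minimal Python sketch. The function name and every number in it are hypothetical and chosen purely for illustration: three people see the same weak piece of evidence (twice as likely if the hypothesis is true as if it is false) but start from different priors.

```python
def bayes_update(prior: float, p_e_given_h: float, p_e_given_not_h: float) -> float:
    """Return P(H|E) from a prior P(H) and the two conditional likelihoods."""
    numerator = p_e_given_h * prior
    evidence = numerator + p_e_given_not_h * (1.0 - prior)
    return numerator / evidence


# Hypothetical likelihoods: the evidence is only twice as likely under H.
LIKELIHOOD_IF_TRUE = 0.2
LIKELIHOOD_IF_FALSE = 0.1

# Three hypothetical starting points, each a "rough guess" at P(H).
posteriors = {
    prior: bayes_update(prior, LIKELIHOOD_IF_TRUE, LIKELIHOOD_IF_FALSE)
    for prior in (0.001, 0.01, 0.1)
}

for prior, posterior in posteriors.items():
    print(f"prior={prior:.3f} -> posterior={posterior:.4f}")
```

With evidence this weak, the posteriors remain spread across roughly the same two orders of magnitude as the priors, so in this toy case the conclusion is driven almost entirely by the starting guess.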
We assume that, with enough updating in response to evidence, our estimates will eventually converge on an accurate figure. However, this is dependent on several conditions, notably well-formulated questions, representative sampling of (accurate) evidence, and a rigorous and consistent method of translating real-world observations into conditional likelihoods.[29] This process is very difficult even when performed as part of careful and rigorous scientific study; attempting to do it all in your head, using rough-guess or even purely intuitional priors and likelihoods, is likely to lead to more confidence than accuracy.
This is further complicated by the fact that probabilities are typically distributions rather than point values – often very messy distributions that we don’t have nice neat formulae for. Thus, “updating” properly would involve manipulating big and/or ugly matrices in your head. Perhaps this is possible for some people.
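As a rough sketch of what updating a whole distribution involves, the Python below uses a discretised grid of candidate values for an unknown rate and a binomial likelihood. The grid size, the flat prior, and the observed counts are all assumptions made for illustration, and real problems rarely come with a likelihood this tidy.

```python
def grid_update(grid, prior_weights, successes, trials):
    """Update a discretised belief over a rate using a binomial likelihood.

    The binomial coefficient is omitted because it is the same for every
    grid point and cancels when the weights are normalised.
    """
    failures = trials - successes
    unnormalised = [
        weight * (p ** successes) * ((1.0 - p) ** failures)
        for p, weight in zip(grid, prior_weights)
    ]
    total = sum(unnormalised)
    return [u / total for u in unnormalised]


# Hypothetical set-up: 101 candidate values for the rate, flat prior.
grid = [i / 100 for i in range(101)]
prior = [1.0 / len(grid)] * len(grid)

# Hypothetical observation: 3 successes in 10 trials.
posterior = grid_update(grid, prior, successes=3, trials=10)

mean = sum(p * w for p, w in zip(grid, posterior))
mode = grid[posterior.index(max(posterior))]
print(f"posterior mean={mean:.3f}, mode={mode:.2f}")
```

Even this deliberately simple case means tracking 101 weights per update, which gives some sense of why doing the equivalent mentally, for messier distributions, is a tall order.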
A common response to these arguments is that Bayesianism is “how the mind really works”, and that the brain already assigns probabilities to hypotheses and updates them similarly or identically to Bayes’ rule. There are good reasons to believe that this may be true. However, the fact that we may intuitively and subconsciously work along Bayesian lines does not mean that our attempts to consciously “do the maths” will work.
In addition, there seems to have been little empirical study of whether Individual Bayesian Updating actually outperforms other modes of thought, never mind how this varies by domain. It seems risky to put so much confidence in a relatively unproven technique.
The process of Individual Bayesian Updating can thus be critiqued on scientific grounds, but there is also another issue with it and hyper-quantitative thinking more generally: motivated reasoning. With no hard qualitative boundaries and little constraining empirical data, the combination of expected value calculations and Individual Bayesian Thinking in EA allows one to justify and/or rationalise essentially anything by generating suitable numbers.
Inflated EV estimates can be used to justify immoral or wasteful actions, and somewhat methodologically questionable subjective probability estimates translate psychological, cultural, and historical biases into truthy “rough estimates” to plug into scientific-looking graphs and base important decisions upon.
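The sensitivity described above can be sketched in a few lines of Python. All figures here are made up for illustration and do not come from any real estimate: a well-evidenced option is compared against a speculative one whose payoff is astronomically large and whose probability is a subjective guess.

```python
def expected_value(outcomes):
    """Sum of probability * value over (probability, value) pairs."""
    return sum(p * v for p, v in outcomes)


# Hypothetical, made-up figures for illustration only.
SURE_THING = [(0.9, 1_000.0), (0.1, 0.0)]  # well-evidenced option
PAYOFF_IF_RIGHT = 1e12                     # speculative, very large payoff

sure_ev = expected_value(SURE_THING)

# Three subjective guesses at the probability that the speculative bet pays off.
for guess in (1e-12, 1e-9, 1e-6):
    speculative_ev = expected_value([(guess, PAYOFF_IF_RIGHT), (1.0 - guess, 0.0)])
    verdict = "speculative option" if speculative_ev > sure_ev else "sure thing"
    print(f"p={guess:.0e}: EV={speculative_ev:,.0f} -> favours {verdict}")
```

In this toy example the ranking flips somewhere between a guess of one in a trillion and one in a billion, a distinction that no available evidence could plausibly settle, so the "result" is whatever the chosen guess makes it.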
We then try to optimise our activities using the numbers we have. Attempting to fine-tune estimates of the maximally impactful strategy is a great approach when operating within fairly predictable, well-described domains, but is a fragile and risky strategy when operating in complex and uncertain domains (like existential risk) even when you have solid reasons for believing that your numbers are good – what if you’re wrong? Robustness to a wide variety of possibilities is typically the objective of professionals in such areas, not optimality; we should ask ourselves why.
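One way to see the difference between optimising and aiming for robustness is the toy comparison below. The strategies, payoffs, and beliefs are entirely hypothetical, and the worst-case (maximin) rule is used here only as one simple stand-in for "robustness", not as the method any particular profession actually uses.

```python
# Hypothetical payoffs for two strategies under three possible worlds.
PAYOFFS = {
    "fine_tuned": {"model_right": 100.0, "model_slightly_off": 20.0, "model_wrong": -80.0},
    "robust": {"model_right": 60.0, "model_slightly_off": 50.0, "model_wrong": 30.0},
}

# Hypothetical subjective point estimates of how likely each world is.
BELIEFS = {"model_right": 0.8, "model_slightly_off": 0.15, "model_wrong": 0.05}


def best_by_expected_value(payoffs, beliefs):
    """Pick the strategy with the highest probability-weighted payoff."""
    return max(
        payoffs,
        key=lambda s: sum(beliefs[world] * value for world, value in payoffs[s].items()),
    )


def best_by_worst_case(payoffs):
    """Pick the strategy whose worst outcome is least bad (maximin)."""
    return max(payoffs, key=lambda s: min(payoffs[s].values()))


ev_choice = best_by_expected_value(PAYOFFS, BELIEFS)
robust_choice = best_by_worst_case(PAYOFFS)
print(f"expected-value choice: {ev_choice}; worst-case choice: {robust_choice}")
```

The expected-value rule is only as good as the beliefs fed into it, whereas the worst-case rule ignores those beliefs altogether; which of the two is appropriate depends on how far the numbers can be trusted.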
Such estimates can also trigger the anchoring bias, and imply to lay readers that, for example, while unaligned artificial intelligence may not be responsible for almost twice as much existential risk as all other factors combined, the ratio is presumably somewhere in that ballpark. In fact, it is debatable whether such estimates have any validity at all, especially when not applied to simple, short-term (i.e. within a year),[30] theoretically well-defined questions. Indeed, they do not seem to be taken seriously by existential risk scholars outside of EA.[31] The apparent scientific-ness of numbers can fool us into thinking we know much more about certain problems than we actually do.
This isn’t to say that quantification is inherently bad, just that it needs to be combined with other modes of thought. When a narrow range of thought is prized above all others, blind spots are bound to emerge, especially when untested and controversial techniques like Individual Bayesian Thinking are conflated (as they sometimes are by EAs) with “transparent reasoning” and even applied “rationality” itself.
Numbers are great, but they’re not the whole story.
Summary: Overly-numerical thinking lends itself to homogeneity and hierarchy. This encourages undue deference and opaque/unaccountable power structures. EAs assume they are smarter/more rational than non-EAs, which allows us to dismiss opposing views from outsiders even when they know far more than we do. This generates more homogeneity, hierarchy, and insularity.
Under number-centric thinking, everything is operationalised as (or is assigned) some value unless there is an overwhelming need or deliberate effort to think otherwise. A given value X is either bigger or smaller than another value Y, but not qualitatively different to it; ranking X with respect to Y is the only possible type of comparison. Thus, the default conceptualisation of a given entity is a point on a (homogeneous) number line. In a culture strongly focused on maximising value (that “line goes up”), one comes to assume that this model fits everything: put a number on something, then make the number bigger.
For instance, (intellectual) ability is implicitly assumed within much of EA to be a single variable[32], which is simply higher or lower for different people. Therefore, there is no need for diversity, and it feels natural to implicitly trust and defer to the assessments of prominent figures (“thought leaders”) perceived as highly intelligent. This in turn encourages one to accept opaque and unaccountable hierarchies.[33]
This assumption of cognitive hierarchy contributes to EA’s unusually low opinion of diversity and democracy, which reduces the input of diverse perspectives, which naturalises orthodox positions, which strengthens norms against diversity and democracy, and so on.
Moreover, just as prominent EAs are assumed to be authoritative, the EA community’s focus on individual epistemics leads us to think that we, with our powers of rationality and Bayesian reasoning, must be epistemically superior to non-EAs. Therefore, we can place overwhelming weight on the views of EAs and more easily dismiss the views of the outgroup, or even disregard democracy in favour of an “epistocracy” in which we are the obvious rulers.[34]
This is a generator function for hierarchy, homogeneity, and insularity. It is antithetical to the aim of a healthy epistemic community.
In fact, work on the philosophy of science in existential risk has convincingly argued that in a field with so few independent evidential feedback loops, homogeneity and “conservatism” are particularly problematic. This is because, unlike in other fields where we have a good idea of the epistemic landscape, the inherently uncertain and speculative nature of Existential Risk Studies (ERS) means that we are uncertain not only of whether we have discovered an epistemic peak, but also of what the topography of the epistemic landscape even looks like. Thus, we should focus on creating the conditions for creative science, rather than the conservative science that we (i.e. the EAs within ERS) are moving towards through our extreme focus on a narrow range of disciplines and methodologies.
Below, we have a preliminary non-exhaustive list of relevant suggestions for structural and cultural reform that we think may be a good idea and should certainly be discussed further.
It is of course plausible that some of them would not work; if you think so for a particular reform, please explain why! We would like input from a range of people, and we certainly do not claim to have all the answers!
In fact, we believe it important to open up a conversation about plausible reforms not because we have all the answers, but precisely because we don’t.
Italics indicate reforms strongly inspired by or outright stolen from Zoe Cremer’s list of structural reform ideas. Some are edited or merely related to her ideas; they should not be taken to represent Zoe’s views.
Asterisks (*) indicate that we are less sure about a suggestion, but sure enough that we think it is worth considering seriously, e.g. through deliberation or research. Otherwise, we have been developing or advocating for most of these reforms for a long time and have a reasonable degree of confidence that they should be implemented in some form or another.
Timelines are suggested to ensure that reforms can become concrete. If stated, they are rough estimates, and if there are structural barriers to a particular reform being implemented within the timespan we suggest, let us know!
Categorisations are somewhat arbitrary; we just needed to break up the text for ease of reading.
[27] The vast majority of researchers, professionals, etc. do not try to quantitatively reason from first principles in this way. There seems relatively little consideration within EA of why this might be. ↩︎
[28] This is known in the philosophy of probability as the Problem of Priors. ↩︎
[29] That is, into the probability of making a given observation assuming that the hypothesis in question is true: P(E|H). ↩︎
[30] “There is no evidence that geopolitical or economic forecasters can predict anything ten years out.” – Philip Tetlock ↩︎
[31] The conclusion of what is by far the most comprehensive and rigorous study of quantification in existential risk (Beard, Rowe, and Fox, 2020) is that all the methods we have at the moment are rather limited or flawed in one way or another, that the most popular methods are also the least rigorous, and that the best route forward is to learn from other fields by transparently laying out our reasoning processes for others to evaluate. ↩︎
[32] There’s probably a link to the Rationalist community’s emphasis on IQ here. [Editor’s note: see Bostrom]. ↩︎
[33] As Noah Scale puts it, “EAs [can] defer when they claim to argue.” ↩︎
[34] To clarify, we’re not saying that this sort of hierarchical sensibility is purely due to number-centric thinking: other cultural and especially class-political factors are likely to play a very significant role. ↩︎