Thank you to Sam Atis, Robert Harling, Guive Assadi, Pranay Mittal, and Francis Priestland for comments on earlier drafts. The research for this post was supported by a grant from the FTX Future Fund Regranting Program.
The train to crazy town
Sam Atis, following on from some arguments by Scott Alexander, writes of ‘the train to Crazy Town’. As Sam presents it, there is a series of escalating challenges to utilitarian-style (and more broadly consequentialist-style) reasoning, leading further and further into absurdity. Sam himself bites the bullet on some classic cases, like the transplant problem and the repugnant conclusion, but is put off by some more difficult examples:
Thomas Hurka’s St Petersburg Paradox: Suppose you are offered a deal—you can press a button that has a 51% chance of creating a new world and doubling the total amount of utility, but a 49% chance of destroying the world and all utility in existence. If you want to maximise total expected utility, you ought to press the button—pressing the button has positive expected value. But the problem comes when you are asked whether you want to press the button again and again and again—at each point, the person trying to maximise expected utility ought to agree to press the button, but of course, eventually they will destroy everything.
The Very Repugnant Conclusion: Once [utilitarians] assign some positive value, however small, to the creation of each person who has a weak preference for leading her life rather than no life, then how can they stop short of saying that some large number of such lives can compensate for the creation of lots of dreadful lives, lives in pain and torture that nobody would want to live?
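The arithmetic behind the St Petersburg case can be sketched in a few lines. This is only an illustration: the 51% figure comes from the example as quoted, while the function name, the starting utility of 1, and the assumption that presses are independent are mine.

```python
def press_repeatedly(presses, p_double=0.51, start_utility=1.0):
    """Return (expected_utility, survival_probability) after `presses` presses.

    Assumes each press is independent: with probability `p_double` total
    utility doubles, otherwise everything is destroyed (utility 0).
    """
    # One press multiplies expected utility by 2 * p_double (1.02 for 51%).
    expected_utility = start_utility * (2 * p_double) ** presses
    # The world only survives if every single press comes up "double".
    survival_probability = p_double ** presses
    return expected_utility, survival_probability


for n in (1, 10, 100):
    eu, alive = press_repeatedly(n)
    print(f"{n:>3} presses: expected utility x{eu:.2f}, survival chance {alive:.2e}")
```

Expected utility rises with every press while the chance that anything survives collapses towards zero, which is exactly the tension the example trades on.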
Most people, like Sam, try to get off the train before they reach the end of the line, trying to preserve some but not all of utilitarianism. And so the question is: how far are you willing to go?
As a list of challenges to utilitarianism, I think Sam’s post is lacking: he is very focussed on specific thought experiments, ignoring more theoretical objections that in my view are much more insightful. But as a provocation—how far will you go in your utilitarianism?—I think it’s an extraordinarily useful post. Sam takes it upon himself to pose the following question: ‘what are the principles by which we should decide when to get off the train?’
But something worries me about Sam’s presentation. Who said you could actually get off the train to crazy town? Each additional challenge to utilitarian logic—each stop on the route—does not seem to assume any new premises: every problem is generated by the same basic starting point of impartially weighing up all people’s experiences and preferences against each other. As such, there just might not be any principles you can use to justify biting this bullet but not that bullet—doing so might even be logically incoherent. Sam says that he does want to get off the train eventually, but it’s not clear how he would do that.
Alexander’s original post suggests something closer to this view. He explains why he dislikes the Repugnant Conclusion and various other situations in which ‘longtermist’-style consequentialism goes awry, and then lays out the position he would take to avoid these conclusions ‘[i]f I had to play the philosophy game’. But, he writes, ‘I’m not sure I want to play the philosophy game.’ He’s not confident that his own partial theoretical compromise actually will avoid the absurdity (and rightly so), and he cares more about avoiding absurdity than he does about getting on the train.
Mainstream commentators from outside Effective Altruism have made this point too. Stephen Bush, a Financial Times journalist whose political analysis I respect a lot, reviewed William MacAskill’s book What We Owe the Future a few weeks ago; his primary criticism was that the whole book seemed to be trying to ‘sell’ readers ‘on a thought experiment that even its author doesn’t wholly endorse’. As an example, Bush notes that arguments in the book straightforwardly entail that ‘my choice to remain childless so that my partner and I can fritter away our disposable income’ is immoral. Officially, MacAskill swears off this implication; but this just looks like he ‘explicitly sets out the case for children and then goes “now, of course I’m not saying that”’. MacAskill tries to avoid these inferences, but it is entirely reasonable for someone like Bush to look at where MacAskill’s logic is heading, decide they don’t like it, and reject the whole approach.
In other words, it’s not clear how anyone actually gets off the train to crazy town. Once you allow even a little bit of utilitarianism in, the unpalatable consequences follow immediately. The train might be an express service: once the doors close behind you, you can’t get off until the end of the line.
I want to get off Mr Bones’ Wild Ride
This seems like a pretty significant problem for Effective Altruists. Effective Altruists seemingly want to use a certain kind of utilitarian (or more generally consequentialist) logic for decision-making in a lot of situations; but at the same time, the Effective Altruism movement aims to be broader than pure consequentialism, encompassing a wider range of people and allowing for uncertainty about ethics. As Richard Y. Chappell has put it, they want ‘utilitarianism minus the controversial bits’. Yet it’s not immediately clear how the models and decision-procedures used by Effective Altruists can consistently avoid any of the problems for utilitarianism: as the examples above illustrate, it’s entirely possible that even the simplest utilitarian premises can lead to seriously difficult conclusions.
Maybe a few committed ‘EAs’ will bite the relevant bullets. But not everyone will, and this could potentially create bad epistemic foundations for Effective Altruism (if people end up being told to accept premises without worrying about their conclusions) as well as poor social dynamics (as with Alexander believing his position is ‘anti-intellectual’). And beyond the community itself, the general public isn’t stupid: if they can see that this is where Effective Altruist logic leads, they might simply avoid it. This could significantly hamper the effectiveness of Effective Altruist advocacy. In this post, I want to ask if this impression (that Effective Altruism can’t consistently get off the train to crazy town) is correct, and what it might mean if it is.
An impossibility result
Above, I introduced my main idea in an intuitive way, showing how a bunch of different writers have come to similar conclusions. I now want to try to be a bit more formal about it. This section draws heavily on a 1996 article by Tyler Cowen, which is called ‘What Do We Learn from the Repugnant Conclusion?’ but which is much broader in scope than the title would suggest. Cowen is a well-known thinker within Effective Altruism, and Effective Altruists are often interested in population ethics and the repugnant conclusion, but this article and Cowen’s other writings on population ethics (with the exception of the piece he co-authored with Derek Parfit on discounting) seem relatively unknown in these spaces.
Cowen’s argument is similar on the surface to ‘impossibility theorems’ in the population ethics literature, which prove that we cannot coherently combine all the things we intuitively want from a theory of population ethics. But on a deeper level, it’s quite different: it’s about problems for moral theories in general, not just population ethics. In particular, Cowen is discussing moral theories with ‘universal domain’, which just means systematic theories that are able to compare any two possible options. This is as opposed to moral particularism, which opposes the use of general principles in moral thought and favours individual case-by-case judgments, and especially to value pluralism, which is committed to there being multiple incommensurable values and thus insists that sometimes different situations are morally incomparable. Theories with universal domain include almost all consequentialist and deontological theories, as well as some forms of virtue ethics: these are all committed to comparing and weighing up different values (whether by reducing them all to a single overarching value, or by treating them as separate but commensurable ends-in-themselves), and can systematically evaluate all possible options.
Cowen restricts his attention to theories where one of the values that matters for ranking outcomes is utility (whether in its preference-satisfaction, hedonist, or welfarist guises). It’s not clear that this is strictly necessary—at least some theories that ignore utility, though not necessarily all of them, face similar problems anyway—but Cowen focusses on utility for simplicity and clarity. Importantly, this doesn’t overly limit our focus: Cowen’s condition includes all moral views that might support Effective Altruism, including non-consequentialist theories that include an account of impartial, aggregative do-gooding as well as pure consequentialism.
So, the problem is this. Effective Altruism wants to be able to say that things other than utility matter—not just in the sense that they have some moral weight, but in the sense that they can actually be relevant to deciding what to do, not just swamped by utility calculations. Cowen makes the condition more precise, identifying it as the denial of the following claim: given two options, no matter how other morally-relevant factors are distributed between the options, you can always find a distribution of utility such that the option with the larger amount of utility is better. The hope that you can have ‘utilitarianism minus the controversial bits’ relies on denying precisely this claim.
This condition doesn’t aim to make utility irrelevant, such that utilitarian considerations should never change your mind or shift your perspective: it just requires that they can be restrained, with utility co-existing with other valuable ends. It guarantees that utility won’t automatically swamp other factors, like partiality towards family and friends, or personal values, or self-interest, or respect for rights, or even suffering (as in the Very Repugnant Conclusion). This would allow us to respect our intuitions when they conflict with utility, which is just what it means to be able to get off the train to crazy town.
Now, at the same time, Effective Altruists also want to emphasise the relevance of scale to moral decision-making. The central insight of early Effective Altruists was to resist scope insensitivity and to begin systematically examining the numbers involved in various issues. ‘Longtermist’ Effective Altruists are deeply motivated by the idea that ‘the future is vast’: the huge numbers of future people that could potentially exist give us a lot of reason to try to make the future better. The fact that some interventions produce so much more utility (do so much more good) than others is one of the main grounds for prioritising them. So while it would technically be a solution to our problem to declare (e.g.) that considerations of utility become effectively irrelevant once the numbers get too big, that would be unacceptable to Effective Altruists. Scale matters in Effective Altruism (rightly so, I would say!), and it doesn’t just stop mattering after some point.
So, what other options are there? Well, this is where Cowen’s paper comes in: it turns out, there are none. For any moral theory with universal domain where utility matters at all, either the marginal value of utility diminishes rapidly (asymptotically) towards zero, or considerations of utility come to swamp all other values. The formal reasoning behind this impossibility result can be found in the appendix to Cowen’s paper; it relies on certain order properties of the real numbers. But the same general argument can be made without any mathematical technicality.
Consider two options, A and B, where A is arbitrarily preferable to B along all dimensions—except, possibly, utility. Now imagine we can continually increase the amount of utility in option B, while keeping everything else fixed. At some point in this process, one of two things must occur:
- These increases in utility eventually become so large in aggregate that their value swamps the value of everything else, and B becomes preferable to A on utility grounds alone.
- Each additional unit of utility begins to ‘matter’ less and less, with the marginal value of utility diminishing rapidly to zero, such that A remains preferable to B no matter how much utility is added.
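The two horns above can be made concrete with a toy model. This is a sketch under strong assumptions of my own, not Cowen’s formal argument: I assume total value is simply ‘value from utility’ plus ‘value from everything else’, and the two functional forms, the weights, and all names are hypothetical choices made for illustration.

```python
import math


def linear_value(utility, weight=0.001):
    # Unbounded case: every extra unit of utility adds the same tiny value.
    return weight * utility


def bounded_value(utility, cap=10.0, scale=100.0):
    # Bounded case: marginal value shrinks towards zero, so the total
    # value contributed by utility approaches `cap` and never exceeds it.
    return cap * (1.0 - math.exp(-utility / scale))


def b_overtakes_a(value_of_utility, a_advantage, max_exponent=12):
    """Return the first tested utility level at which B beats A, else None.

    `a_advantage` is how far A leads B on every non-utility dimension.
    Only powers of ten up to 10**max_exponent are tested; in the bounded
    case it is the cap, not the search, that guarantees B never catches up.
    """
    for exponent in range(max_exponent + 1):
        utility = 10.0 ** exponent
        if value_of_utility(utility) > a_advantage:
            return utility
    return None


print(b_overtakes_a(linear_value, a_advantage=50.0))
print(b_overtakes_a(bounded_value, a_advantage=50.0))
```

With the linear function, some amount of utility always closes the gap, however large A’s lead; with the bounded function, any lead larger than the cap is permanent. The toy model has no third behaviour, which is the shape of the dilemma.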
This is where the paradoxes that Sam discusses come in: they are concrete examples of exactly this sort of case. Take the Very Repugnant Conclusion as an example. You start with world A, containing a small population who live in bliss with all the good things in life, and world B, containing solely a huge quantity of suffering with nothing else of value. You then begin adding additional utility into world B, in the form of additional lives with imperceptibly positive value—one brief minute of ‘muzak and potatoes’ each. And then the inevitable problem: either the value of all these additional lives, added together, eventually swamps the negative value of suffering; or, eventually, the marginal value of an additional life becomes infinitesimally small, such that there is no number of lives you could add to make B better than A.
I hope the reasoning is clear enough from this sketch. If you are committed to the scope of utility mattering, such that you cannot just declare additional utility de facto irrelevant past a certain point, then there is no way for you to formulate a moral theory that can avoid being swamped by utility comparisons. Once the utility stakes get large enough—and, when considering the scale of human or animal suffering or the size of the future, the utility stakes really are quite large—all other factors become essentially irrelevant, supplying no relevant information for our evaluation of actions or outcomes.
Further discussion and some more cases
This structure does not just apply to population ethics problems, like the repugnant conclusion and the Very Repugnant Conclusion. As Cowen showed in a later paper, the same applies to both Pascal’s mugging and Hurka’s version of the St Petersburg paradox, both of which seem quite different due to their emphasis on probability and risk but which have the same fundamental structure: start with an obvious choice between two options (should I or shouldn’t I give my wallet to this random person who is promising me literally nothing?), then keep adding tiny amounts of utility to the bad option (until the person is promising you huge amounts of utility). Many of the problems of infinite ethics have this structure as well. While this structure doesn’t fit all of the standard challenges to utilitarianism (e.g., the experience machine), it fits many of them, and—relevantly—it fits many of the landmarks on the way to crazy town.
Indeed, in section five Cowen comes close to suggesting a quasi-algorithmic procedure for generating challenges to utilitarianism. You just need a sum over a large number of individually-imperceptible epsilons somewhere in your example, and everything else falls into place. The epsilons can represent tiny amounts of pleasure, or pain, or probability, or something else; the large number can be extended in time, or space, or state-space, or across possible worlds; it can be a one-shot or repeated game. It doesn’t matter. You just need some Σ ε and you can generate a new absurdity: you start with an obvious choice between two options, then keep adding additional epsilons to the worse option until either utility vanishes in importance or utility dominates everything else.
In other words, Cowen can just keep generating more and more absurd examples, and there is no principled way for you to say ‘this far but no further’. As Cowen puts it:
Once values are treated as commensurable, one value may swamp all others in importance and trump their effects… The possibility of value dictatorship, when we must weigh conflicting ends, stands as a fundamental difficulty.
A popular response in the Effective Altruist community to problems that seem to involve something like dogmatism or ‘value dictatorship’—indeed, the response William MacAskill gave when Cowen himself made some of these points in an interview—is to invoke moral uncertainty. If your moral view faces challenges like these, you should downweigh your confidence in it; and then, if you place some weight on multiple moral views, you should somehow aggregate their recommendations, to reach an acceptable compromise between ethical outlooks.
Various theories of moral uncertainty exist, outlining how this aggregation works; but none of them actually escape the issue. The theories of moral uncertainty that Effective Altruists rely on are themselves frameworks for commensurating values and systematically ranking options, and (as such) they are also vulnerable to ‘value dictatorship’, where after some point the choices recommended by utilitarianism come to swamp the recommendations of other theories. In the literature, this phenomenon is well-known as ‘fanaticism’.
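To see how this can happen, here is a minimal sketch of one simple aggregation rule: a credence-weighted sum of each theory’s verdict, in the spirit of ‘maximising expected choiceworthiness’. The credences, the scores, the theory labels, and the assumption that the two theories’ scores sit on a common scale are all hypothetical, chosen purely for illustration.

```python
def expected_choiceworthiness(option, credences, scores):
    # Credence-weighted sum of each theory's score for the option.
    return sum(credences[theory] * scores[theory][option] for theory in credences)


def preferred_option(utilitarian_stakes):
    """Return 'act' or 'refrain' for a given size of utilitarian stakes."""
    # Hypothetical credences: only 10% confidence in utilitarianism.
    credences = {"utilitarian": 0.1, "rights_based": 0.9}
    scores = {
        # Utilitarianism favours acting, by an amount that scales with stakes.
        "utilitarian": {"act": utilitarian_stakes, "refrain": 0.0},
        # The rival theory has a fixed, strong objection to acting.
        "rights_based": {"act": -100.0, "refrain": 0.0},
    }
    return max(
        ("act", "refrain"),
        key=lambda option: expected_choiceworthiness(option, credences, scores),
    )


for stakes in (10.0, 1_000.0, 1_000_000.0):
    print(stakes, preferred_option(stakes))
```

Because the rival theory’s objection is fixed while the utilitarian stakes can grow without limit, a low credence in utilitarianism only delays the point at which its recommendation wins; it does not prevent it.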
Once you let utilitarian calculations into your moral theory at all, there is no principled way to prevent them from swallowing everything else. And, in turn, there’s no way to have these calculations swallow everything without them leading to pretty absurd results. While some of you might bite the bullet on the repugnant conclusion or the experience machine, it is very likely that you will eventually find a bullet that you don’t want to bite, and you will want to get off the train to crazy town; but you cannot consistently do this without giving up the idea that scale matters, and that it doesn’t just stop mattering after some point.
Getting on a different train
Pluralism and universal domain
Is this post a criticism of Effective Altruism? I’m not actually sure. For some personal context: I’ve signed the Giving What We Can pledge, and I’m proud of that; I find large parts of Effective Altruism appealing, although I’m turned off by many other aspects; I think the movement has provably done a lot of good. It’s too easy, and completely uncharitable, to simply write the movement off as inconsistent. And yet, when it comes to the theory of Effective Altruism, I’m not sure how it gets off the ground at all without leading to absurdities. How do I square this?
In his recent interview with MacAskill, Cowen said the following (edited for clarity):
At the big macro level (like the whole world of nature versus humans, ethics of the infinite, and so on) it seems to me utilitarianism doesn’t perform that well. Isn’t the utilitarian part of our calculations only a mid-scale theory? You can ask: does rent control work? Are tariffs good? Utilitarianism is fine there. But otherwise, it just doesn’t make sense…
[W]hy not get off the train a bit earlier? Just say: ‘Well, the utilitarian part of our calculations is embedded within a particular social context, like, how do we arrange certain affairs of society. But if you try to shrink it down to too small (how should you live your life?) or to too large (how do we deal with infinite ethics on all of nature?) it just doesn’t work. It has to stay embedded in this context.’ Universal domain as an assumption doesn’t really work anywhere, so why should it work for the utilitarian part of our ethics?
… It’s not that there’s some other theory that’s going to tie up all the conundrums in a nice bundle, but simply that there are limits to moral reasoning, and we cannot fully transcend the notion of being partial because moral reasoning is embedded in that context of being partial about some things.
There are two ways to read this suggestion. The first is that Cowen just wants us to accept that, past a certain point, the value of additional utility vanishes quickly to zero: when we zoom out too far, utility becomes de facto meaningless. (This reading is supported especially by the first paragraph I quoted.) But there’s a different way to read his suggestion (which is supported more by the third paragraph I quoted), which is that rather than taking the logic of his own argument at face value, Cowen is urging MacAskill to take a step back and reject one of its presuppositions: universal domain.
If we accept a certain amount of incommensurability between our values, and thus a certain amount of non-systematicity in our ethics, we can avoid the absurdities directly. Different values are just valuable in different ways, and they are not systematically comparable: while sometimes the choices between different values are obvious, often we just have to respond to trade-offs between values with context-specific judgment. On these views, as we add more and more utility to option B, eventually we reach a point where the different goods in A and B are incommensurable and the trade-off is systematically undecidable; as such, we can avoid the problem of utility swallowing all other considerations without arbitrarily declaring it unimportant past a certain point.
MacAskill responds to Cowen by arguing that ‘we should be more ambitious than that with our moral reasoning’. He seems to think that we will eventually find a theory that ‘ties up all the conundrums’—perhaps if we hand it over to ‘specially-trained AIs’ for whom ethics is ‘child’s play’ (as he writes in What We Owe the Future). But it’s not clear that ‘ambition’ has anything to do with it. Even the smartest AI could not eliminate the logical contradictions we face in (say) population ethics; at most, it could give us a recommendation about which bullet to bite. Likewise, Alexander seems to think that (something like) this position is ‘anti-intellectual’. It is unsystematic, to be sure, but it’s no more anti-intellectual than Hume was: it’s not an unprincipled rejection of all thinking, but an attempt to figure out where we run up against the limits of moral theorising.
Such a position would rule out utilitarianism as a general-purpose theory of morality, or even its more limited role as a theory of the (supposed) part of morality philosophers call ‘beneficence’. But it wouldn’t stop us from using utilitarianism as a model for moral thinking, a framework representing certain ways we think about difficult questions. It might be especially relevant to thinking about trade-offs where we have to weigh up costs and benefits, especially if, as Barbara Fried has argued, it is the only rigorous ethical framework that is able to face up to uncertainty and scarcity. But, like all models, it would only be valid within a certain context. Utilitarianism can remain a really important aspect of moral reasoning, just not in the way that we are familiar with from universal moral theories.
To state my own personal view, I think I am probably >60% confident in something like this position being right. Some kind of consequentialist thinking seems pretty applicable in a lot of situations, and to often be very helpful. We can reject universal domain, and thus value commensurability, while retaining this insight. Cowen would not be the first to make this claim: Isaiah Berlin, the most famous 20th century defender of value incommensurability, was convinced that ‘Utilitarian solutions are sometimes wrong, but, I suspect, more often beneficent.’ Utilitarianism is not a general-purpose theory of the good; but it is an important framework that can generate important insights.
And this seems to be all Effective Altruism needs. Holden Karnofsky recently made a call for pluralism within Effective Altruism: the community needs to temper its central ‘ideas/themes/memes’ with pluralism and moderation. But Karnofsky argues, further, that the community already does this: ‘My sense is that many EAs’ writings and statements are much more one-dimensional … than their actions.’ In practice, Effective Altruists are not willing to purchase theoretical coherence at the price of absurdity; they place utilitarian reasoning in a pluralist context. They may do this unreflectively, and I think they do it imperfectly; but it is an existence proof of a version of Effective Altruism that accepts that utility considerations are embedded in a wider context, and tempers them with judgment.
The core of Effective Altruism
Effective Altruists can’t be entirely happy with Cowen’s position, of course. They think utilitarian reasoning should be applicable to examples beyond those drawn from economics textbooks; at the very least, they think it should be relevant to decisions around donations and career choices.
We can be more precise about what Effective Altruism asks of utilitarian reasoning. Effective Altruism places far more weight than even previous utilitarians had on the optimising or maximising aspects of utilitarianism. This goes back to my previous comments on scale, and opposition to scope-insensitivity: the ‘hard core’ of Effective Altruism is the idea that, at least for most of us, the ethically relevant differences between the options we face are huge, despite the fact that we often tend to act as if they were negligible or unknowable. Given this premise, the appeal of utilitarianism is immediate. Utilitarianism is an optimising framework: its focus is on achieving the best possibilities, rather than simply a selection of acceptable options. This is an immensely (even overwhelmingly) useful feature of the utilitarian framework for Effective Altruists; it gives them reason to use it in their moral reasoning even at large scales. MacAskill should not have challenged Cowen’s ambition; rather, he should have challenged his naïve position that the limits of utilitarianism should be defined with reference to scale.
But, even as Effective Altruists are excited about the optimisation implicit in utilitarianism, they have to be wary about its flip-side: the potential for fanaticism and value dictatorship. Utilitarian reasoning needs to be bounded or restrained in some way. But Cowen’s argument shows that there can be no principled, systematic account of these bounds and restraints. If we hope to represent questions of scope using utilitarian reasoning, without having utility swallow all other values, there will have to be ambiguities, incommensurabilities, and arbitrariness; as I worried when reading Sam’s post, there are no principles we can use to decide when to get off the train to crazy town.
I do not think this is a huge problem. To borrow a point from Bernard Williams, there is no particular principled place to draw a line, but it is nonetheless entirely principled to say we need a line somewhere. And Cowen suggests, rightly I think, that the line is drawn based on the (possibly arbitrary) social contexts within which our moral reasoning is embedded. But the problem is just that Effective Altruism doesn’t have a good account of the context of moral reasoning, and thus no understanding of its own limits.
To be sure, a story is sometimes told (largely unreflectively) about why the ‘hard core’ of Effective Altruism is true even as most people act as if it isn’t; this story could tell us something about the context for scale-sensitive reasoning. It is derived from Derek Parfit’s work, and goes something like this. In the small, pre-modern societies where many of our moral ideas were developed, we could affect only small numbers of people in our communities; in such societies, an ethic that focussed our attention locally and largely ignored scale was reasonable. In the globalised and interconnected modern world, however, human action could (potentially / in expectation) affect many millions; we might even be at the ‘hinge of history’. In such a situation, the ‘spread’ of possible actions is much larger: there are way more options available to us, and at the same time there are way more morally relevant differences between the actions. There is thus a mismatch between our values and our reality, and it is incumbent upon us to be more explicit and rigorous in thinking about the scope of our actions, using frameworks oriented towards maximisation.
There are variations on this story, of course, but I hope that some version of it is recognisable to at least some readers. Fleshed out with more details, I think there is every chance it could be a plausible historical narrative. But in Parfit’s work, it was no more than a conjectural just-so story; and it has only got more skeletal and bare-bones since him, leading to a lot of bad ‘explanations’ for why people’s moral judgments about large numbers are (supposedly) unreliable. So while Effective Altruists are committed to their ‘hard core’, they have no good explanation for why it is true—and thus no account of the context, and limits, of their own reasoning.
As it happens, I think the ‘hard core’ of Effective Altruism probably is true. It’s definitely true in the limited realm of charitable donations, where large and identifiable differences in cost-effectiveness between different charities have been empirically validated. It becomes murkier as we move outwards from there—issues of judgment, risk/reward trade-offs, and unknown unknowns make it less obvious that we can identify and act on interventions that are hugely better than others—but it’s certainly plausible. Yet, while Effective Altruism has made a prominent and potentially convincing case for the importance of maximisation-style reasoning, this style of reasoning is simultaneously dangerous and liable to fanaticism. The only real solution to this problem is a proper understanding of the context and limits of maximisation. And it is here that Effective Altruism has come up short.
Conclusion and takeaways
I believe that Effective Altruism’s use of rigorous, explicit, maximisation-oriented reasoning is both very novel and (often) good. But if Effective Altruists don’t want to end up in crazy town, they need to start getting on a different train. They need a different understanding of their own enterprise, one grounded less in grand systematic theories of morality and more in an account of the modern world and recent history. Precisely because they lack that, Effective Altruists are simultaneously drawn towards and repulsed by the most absurd outer limits of utilitarianism. I think this marks a failure of seriousness; it certainly marks a failure of self-understanding. It marks something the Effective Altruist community needs to rectify.
None of this means abandoning the weirder sides of the movement. Many of the parts of Effective Altruism that are considered ‘weird’ relative to the wider culture are unrelated to the dynamics discussed in this post: for example, Alexander has repeatedly emphasised that concerns about risks from AI are logically independent from the philosophy of ‘longtermism’ or utility calculations, and should be treated separately.
But it does mean getting clearer about what exactly Effective Altruism is, and the contexts in which it makes sense: being more rigorous and explicit about why it is important to use systematic maximisation frameworks, what such frameworks are intended to do, and what countervailing considerations are most important to pay attention to. And this will likely require facing up to the limits of consequentialism, and thinking about situations in which consequentialist reasoning harms moral thinking more than it helps.
I don’t know what the consequences of this might be for Effective Altruism. Maybe it would ‘leave everything as it is’, and have no practical ramifications for the movement; I doubt it. Maybe, as suggested by a recent post by Lukas Gloor which discusses similar themes, it would create space for alternatives to ‘longtermism’, via a rejection of some of the arguments-from-systematicity that underpin it. Maybe it would lead to a rethinking of Effective Altruist approaches to politics, policy, and other contexts where game-theoretic considerations are paramount and well-honed judgment is necessary, and where explicit consequentialism can thus potentially cause serious problems. I don’t know; one can’t typically predict the outcomes of reflection in advance.
Perhaps you, the reader, don’t feel any of this is necessary. Perhaps you follow Alexander: you think that too much sweeping criticism of Effective Altruism has been produced, and the movement should just get on with the object-level business of doing good while being aware of specific ‘anomalies’ that don’t fit with its assumptions which could suggest deeper problems. This is a reasonable position to take. Too much time in the armchair thinking of criticisms is almost never the best way to actually identify the problems in a movement or set of ideas. But the flip-side of this reasoning is that, when an anomaly does arise, Effective Altruism should be able to focus in on it; and it must be open to explanations of the anomaly that are able to unite it with other questions, even if those explanations are critical.
I think that the ‘train to crazy town’ phenomenon, the lack of clarity about whether and when utilitarian reasoning runs out, is just such an anomaly, one that hurts Effective Altruism’s ability to achieve its stated goals (both within and without the movement). I’ve tried to give an explanation that connects this anomaly to other problems in moral philosophy, and potentially suggests a way forward. You may disagree with my explanation; I am far from certain of my own core claims. But some diagnosis of the problem is necessary. Absent such a diagnosis, Effective Altruists will keep getting on the train to crazy town, and non–Effective Altruists will continue to recognise this (implicitly or explicitly) and be put off by it. It’s a problem worth engaging with.
I believe the metaphor is Ajeya Cotra’s, from her appearance on the 80,000 Hours Podcast. But its recent proliferation seems to be down to William MacAskill, who used it in response to one of Tyler Cowen’s arguments on the latter’s podcast.
Sam Atis, ‘The train to Crazy Town’. Sam attributes this example to Tyler Cowen, since Cowen has referred to it a number of times, but it originates in Thomas Hurka, ‘Value and Population Size’, p.499. Thank you to Cowen for pointing this out to me.
Christoph Fehige, ‘A Pareto Principle for Possible People’, pp.534–535.
For instance: John Rawls’ separateness of persons argument (discussion); Bernard Williams’ integrity objection (discussion); Thomas Nagel’s analysis of impartiality (discussion); T. M. Scanlon on aggregation (discussion).
Nota bene: in his paper, Cowen uses the term ‘pluralism’ to mean something different; nothing turns on this.
Cowen, writing in a population ethics context, writes about total utility specifically, but the same logic applies to average utility or person-affecting utility, which face similar problems (e.g., the Absurd Conclusion). I will equate utility with total utility for the rest of this post, with this footnote hopefully marking that this is without loss of generality.
It doesn’t include non-consequentialist theories with no room for purely impartial, aggregative beneficence. But this doesn’t matter much for my purposes, because such theories would not be compatible with Effective Altruism—since Effective Altruism is an attempt to pursue the good most effectively, it doesn’t make complete sense without an independent moral reason to pursue ‘the good’ in aggregate. Two papers on this theme which focus specifically on longtermism are Karri Heikkinen, ‘Strong Longtermism and the Challenge from Anti-Aggregative Moral Views’, and Emma J. Curran, ‘Longtermism, Aggregation, and Catastrophic Risk’ (public draft).
(As an aside, it is quite common for Effective Altruists to argue that, independent of all of these issues, any acceptable moral theory must include impartial beneficence in order to retain ‘basic moral decency’; but this seems to me to be simply false, as someone with no theory of beneficence can still end up recommending many of the same actions and dispositions, just so long as they recommend them for different reasons. This has recently been discussed in an Effective Altruist context by Lukas Gloor, ‘Population Ethics Without Axiology: A Framework’; for more detailed philosophical discussion, see Philippa Foot, ‘Utilitarianism and the Virtues’.)
These can be possible actions, or states of affairs, or intentions—whatever you want to evaluate.
Philosophers might appreciate an analogy with Linda Zagzebski’s great paper on ‘The Inescapability of Gettier Problems’; in a nearby possible world, Cowen might conceivably have written ‘The Inescapability of Repugnant Conclusions’.
While it might not be immediately obvious that the ‘moral parliament’ framework in particular falls victim to the problem of fanaticism, the issue here is that this framework as introduced is little more than a ‘metaphor’, and even fleshed out versions of the framework fail to meet the universal domain condition (cf. Newberry and Ord, ‘The Parliamentary Approach to Moral Uncertainty’, p.9). Thus, for those who seek to use moral uncertainty as a way to avoid problems for universal-domain moral theories, invoking the moral parliament is equivalent to an admission of defeat. I discuss the option of denying the universal domain assumption in the next section.
I think the latter reading is closer to the truth: in Cowen’s other paper on the repugnant conclusion, ‘Resolving the Repugnant Conclusion’, he quite explicitly rejects universal domain and only allows moral comparisons to be valid over bounded sets of options. But nothing I say in this section should be taken as representative of Cowen’s actual views.
Something like this, albeit stated in more explicitly numerical and consequentialist language, is captured in the three premises of Benjamin Todd’s ‘rigorous argument for Effective Altruism’ (about 29 minutes in).
See (e.g.) Parfit, Reasons and Persons, §31 ‘Rational Altruism’.
I don’t think I’m suggesting a new ‘paradigm’, whatever overblown meaning that word might have outside of the context of science; I am hoping to make a more modest suggestion than that, just a few new questions and arguments.
As far as I can tell, we don't know of any principles that satisfy both (1) they guide our actions in all or at least most situations and (2) when taken seriously, they don't lead to crazy town. So our options seem to be (a) don't use principles to guide actions, (b) go to crazy town, or (c) use principles to guide actions but be willing to abandon them when the implications get too crazy.
I wish this post had less of a focus on utilitarianism and more on whether we should be doing (a), (b) or (c).
(I am probably not going to respond to comments about how specific principles satisfy (1) and (2) unless it actually seems plausible to me that they might satisfy (1) and (2).)
Thanks for your comment.
It's not clear to me that (a), (b), and (c) are the only options - or rather, there are a bunch of different variants (c'), (c''), (c'''). Sure, you can say 'use principles to guide action until they get too crazy', but you can also say 'use a multiplicity of principles to guide action until they conflict', or 'use a single principle to guide action until you run into cases where it is difficult to judge how the principle applies', and so on. There are lots of different rules of thumb to tell you when and how principles run out, none of them entirely satisfactory.
And, importantly, there is no meta-principle that tells you when and how to apply object-level principles. Intuitively there's an infinite regress argument for this claim, but since infinite regress arguments are notoriously unreliable I also tried to explain in more detail why it's true in this post: any meta-principle would be vulnerable in the same way to the train to crazy town. And so, if you have the worries about the train to crazy town that are expressed in this post, you have to move away from the realm of pure rationalist moral philosophy and begin to use principles as context-specific guides, exercising particular judgments to figure out when they are relevant.
And so, in order to figure out when to use principles, you have to examine some specific principles in specific contexts, rather than trying very abstract philosophising to get the meta-principle. That was why I focussed so much on utilitarianism, and the specific context within which it is a useful way to think about ethics: I think this kind of specific analysis is the only way to figure out the limits of our principles. Your comment seems to assume that I could do some abstract philosophising to decide between (a), (b), (c), (c'), (c''), etc.; but for the very reasons discussed in this post, I don't think that's an option.
I think there plausibly are principles that achieve (1) and (2), but they'll give up either transitivity or the independence of irrelevant alternatives, and, if used to guide actions locally without anticipating your own future decisions and without the ability to make precommitments, they can lead to plausibly irrational behaviour (more so than in known hard problems like Newcomb's problem and Parfit's hitchhiker). I don't think those count as "crazy towns", but they're things people find undesirable or see as "inconsistent". Also, they might require more arbitrariness than usual, e.g. picking thresholds, or nonlinear monotonically increasing functions.
Principles I have in mind (although they need to be extended or combined with others to achieve 1):
Partial/limited aggregation, although I don't know if they're very well-developed, especially to handle uncertainty (and some extensions may have horrific crazy town counterexamples, like https://forum.effectivealtruism.org/posts/smgFKszHPLfoBEqmf/partial-aggregation-s-utility-monster). The Repugnant Conclusion and extensions (https://link.springer.com/article/10.1007/s00355-021-01321-2 ) can be avoided this way, and I think limiting and totally forbidding aggregation are basically the only ways to do so, but totally forbidding aggregation probably leads to crazy towns.
Difference-making risk aversion, to prevent fanaticism for an otherwise unbounded theory (objections of stochastic dominance and, if universalized, collective defeat here https://globalprioritiesinstitute.org/on-the-desire-to-make-a-difference-hilary-greaves-william-macaskill-andreas-mogensen-and-teruji-thomas-global-priorities-institute-university-of-oxford/ , https://www.youtube.com/watch?v=HT2w5jGCWG4 and https://forum.effectivealtruism.org/posts/QZujaLgPateuiHXDT/concerns-with-difference-making-risk-aversion , and some other objections and responses here https://forum.effectivealtruism.org/posts/sEnkD8sHP6pZztFc2/fanatical-eas-should-support-very-weird-projects?commentId=YqNWwzdpvmFbbXyoe#YqNWwzdpvmFbbXyoe ). There's also a modification that never locally violates stochastic dominance, by replacing the outcome distributions with their quantile functions (i.e. sorting each outcome distribution statewise), but it's also not implausible to me that we should just give up stochastic dominance and focus on statewise dominance, because stochastic dominance doesn't guarantee conditional worseness, difference-making may be fundamentally a statewise concern, and stochastic dominance is fanatical with the right background noise when (deterministic) value is additive (https://www.journals.uchicago.edu/doi/10.1086/705555).
A non-fanatical approach to normative uncertainty or similar, but (possibly) instead of weighing different theories by subjective credences, you think of it as weighing principles or reasons by all-things-considered intuitive weights within a single theory. This interpretation makes most sense if you're either a moral realist who treats ethics as fundamentally pluralistic or vague, or you're a moral antirealist. However, again, this will probably lead to violations of transitivity or the independence of irrelevant alternatives to avoid fanaticism, unless you assume reasons are always bounded in strength, which rules out risk-neutral EV-maximizing total utilitarianism.
Also, I normally think of crazy town as requiring an act, rather than an omission, so distinguishing the two could help, although that might be unfair, and I suppose you can imagine forced choices.
Why is criterion (1) something we want? Isn't it enough to find principles that would guide us in some or many situations? Or even just in "many of the currently relevant-to-EA situations"?
Thank you for the post which I really liked! Just two short comments:
On 2: I think the point is simply that, as noted in footnote 8, the 'train to crazy town' reasoning can apply quite directly to comparisons between states of affairs with no lingering uncertainty (Savagean consequences). When we apply the reasoning in this way, two features arise:
(a) Uncertainty, and frameworks for dealing with uncertainty, no longer have a role to play as we are certain about outcomes. This is the case with e.g. the Very Repugnant Conclusion.
(b) The absurdities that are generated apply directly at the level of axiology, rather than 'infecting' axiology via normative ethics. If we read multi-level utilitarianism as an attempt to insulate axiology from ethics, then it can't help in this case. Of course, multi-level utilitarians are often more willing to be bullet-biters! But the point is just that they do have to bite the bullet.
Thanks for your comment! I was considering writing much more about moral uncertainty, since I think it's an important topic here, but the post was long enough as it is. But you and other commenters below have all pulled me up on this one, so it's worth being more explicit. I thus hope it's OK for this reply to you to serve as a general reply to a lot of themes related to moral uncertainty in the comment section, to avoid repeating myself too much!
Starting with 1(b), the question of unconditional 'deontological constraints': this works in theory, but I don't think it applies in practice. The (dis)value placed on specific actions can't just be 'extremely high', because then it can still be swamped by utilitarianism over unbounded choice sets; it has to be infinite, such that (e.g.) intentional killing is infinitely disvaluable and no finite source of value, no matter how arbitrarily large, could outweigh it. This gets you around the impossibility proof, which as mentioned relies on order properties of the reals that don't hold for the extended reals - roughly, the value of utility is always already infinitesimal relative to the infinite sources of (dis)value, so the marginal value of utility doesn't need to decline asymptotically to avoid swamping them.
But in practice, I just don't see what marginally plausible deontological constraints could help a mostly-consequentialist theory avoid the train to crazy town. These constraints work to avoid counterexamples like e.g. the transplant case, where intuitively there is another principle at play that overrides utility considerations. In these cases, deontological constraints are simple, intuitive, and well-motivated. But in the cases I'm concerned with in this post, like Hurka's St Petersburg Paradox, it's not clear that Kantian-style constraints on murder or lying really help the theory - especially because of the role of risk in the example. To get around this example with deontological constraints, you either have to propose wildly implausible constraints like 'never accept any choices with downside risk to human life', or have an ad hoc restriction specifically designed to get around this case in particular - the latter of which seems a) epistemically dodgy and b) liable to be met with a slightly adjusted counterexample. I just don't see how you could avoid all such cases with any even mildly plausible deontological constraint.
Beyond these kinds of 'lexical' approaches, there are various other attempts to avoid fanaticism while respecting considerations of utility at scale - your 1(a). But by Cowen's proof, if these are indeed to work, they must deny the universal domain condition - as indeed, the theories mentioned tend to! I mentioned the moral parliament explicitly, but note also that (e.g.) if you accept that certain intertheoretical comparisons cannot be made, then you have ipso facto denied universal domain and accepted a certain level of incomparability and pluralism.
The difference between me and you is just that you've only accepted incomparability at the meta-level (it applies when comparing different moral theories), whereas I'm encouraging you to adopt it at the object level (it should apply to the act of thinking about ethics in the first instance). But I see no coherent way to hold 'we can have incomparability in our meta-level theorising, but it must be completely banned from our object-level theorising'! There are many potential rationalistic reasons you might offer for why incomparability and incommensurability should be banished from moral philosophy; but none of these are available to you if you take on a framework for moral uncertainty that avoids fanaticism by denying universal domain. So accepting these kinds of positions about moral uncertainty just seems to me like an unstable halfway house between true rationalistic moral philosophy (on the one hand) and pluralism (on the other).
Thank you for your replies! In essence, I don’t think I disagree much with any of your points. I will mainly add different points of emphasis:
I think one argument I was gesturing at is a kind of divide-and-conquer strategy where some standard moves of utilitarians or moral uncertainty adherents can counter some of the counterintuitive implications (walks to crazy town) you point to. For instance, the St. Petersburg Paradox seems to be an objection to expected value utilitarianism, not to every form of the view. Similarly, some of the classical counterexamples to utilitarianism (e.g., some variants of trolley cases) involve violations of plausible deontological constraints. Thus, if you have a non-negligible credence in a moral view which posits unconditional prohibitions of such behavior, you don’t need to buy the implausible implication (under moral uncertainty). But you are completely correct that there will remain some, maybe many, implications that many find counterintuitive or crazy, e.g., the (very) repugnant conclusion (if you are a totalist utilitarian). Personally, I tend to be less troubled by these cases and suspect that we perhaps should bite some of these bullets, but to justify this would of course require a longer argument (which someone with different intuitions won’t likely be tempted by, in any case).
The passage of your text which seemed most relevant to multi-level utilitarianism is the following: "In practice, Effective Altruists are not willing to purchase theoretical coherence at the price of absurdity; they place utilitarian reasoning in a pluralist context. They may do this unreflectively, and I think they do it imperfectly; but it is an existence proof of a version of Effective Altruism that accepts that utility considerations are embedded in a wider context, and tempers them with judgment." One possible explanation of this observation is that the EAs who are utilitarians are often multi-level utilitarians who consciously and intentionally use considerations beyond maximizing utility in practical decision situations. If that were true, it would raise the interesting question of what difference adopting a pluralist normative ethics, as opposed to a universal-domain utilitarianism, would make for effective altruist practice (I do not mean to imply that there aren’t differences).
With respect to moral uncertainty, I interpret you as agreeing that the most common effective altruist views actually avoid fanaticism. This then raises the question of whether accepting incomparability at the meta-level (between normative theories) gives you reasons to also (or instead) accept incomparability at the object-level (between first-order moral reasons for or against actions). I am not sure about that. I am sympathetic to your point that it might be strange to hold that 'we can have incomparability in our meta-level theorising, but it must be completely banned from our object-level theorising'. At the same time, at least some of the reasons for believing in meta-level incomparability are quite independent from the relevant object-level arguments, so you might have good reasons to believe in it only on the meta-level. Also, the sort of incomparability seems different. As I understand your view, it says that different kinds of moral reasons can favor or oppose a course of action such that we sometimes have to use our faculties for particular, context-sensitive moral judgements, without being able to resort to a universal meta-principle that tells us how to weigh the relevant moral reasons. By contrast, the moral uncertainty view posits precisely such a meta-principle, e.g. variance voting. So I can see how one might think that the second-order incomparability is acceptable but yours is unacceptable (although this is not my view).
This comment might seem somewhat tangential but my main point is that the problem you are trying to solve is unsolvable and we might be better off reframing the question/solution.
(1) anti-realism is true
(2) The EA community has a very nebulous relationship with meta-ethics
(3) I think the community should explicitly renounce its relationship with utilitarianism or any other ethical worldview.
'if anti-realism is true there is no logical way to pick a worldview'
This seems like an obvious 'crux' here - I don't think this is true? e.g., Simon Blackburn would insist that there are 'logical' ways to pick a worldview despite being committed to the view that value is just something we project onto the world, not something that is a property of the world in itself.
If it wasn’t clear I meant logical from first principles.
If you still disagree with the above statement I would definitely want to hear what your line of thinking is. I don’t know much about Blackburn or quasi-realism but I’ll check it out.
With the clarification that you specifically meant 'from first principles', I'm not sure how your point is supposed to be relevant. I agree that 'there is no logical way to pick a worldview from first principles', but just because you can't rationally conclude something from first principles doesn't mean you can't rationally conclude it at all. There's no way to get to science from first principles, but rational scientific argumentation still exists. Likewise, there might be no logical first-principles derivation of morality, but I can still apply logical reasoning to ethical questions to try to figure things out. So the idea that there's no first-principles derivation of morality - which the anti-realist and the realist can both accept, by the way, there's no connection here with anti-realism - is just irrelevant to your other points.
The implication here is you are rationally concluding it using your own values as a starting point. I don't think there is an easy way to adjudicate disagreements when this is how we arrive at views. You are trying to talk about meta-ethics but this will turn into a political philosophy question.
I have heard this argument multiple times and I still don't get it. Science is a process that we can check the results of in real life, because it deals with real quantities. The fact that science doesn't exist from first principles doesn't mean we can't use empirics to back up the validity of science in pushing forth our understanding of the world. We have seen time and time again that science does work for the purposes we want it to work for. Morality isn't a physical quantity. We can't use empiricism with respect to morals. Hence we only have logic to lean on.
Sort of agree but not really. Here are the options as I see them.
If you are saying you buy into bullet point 3 that's fine but I would be upset if EA accepted this explicitly or implicitly.
I think you've run together several different positions about moral epistemology and meta-ethics. Your three bullet points definitely do not describe the whole range of positions here. For example: RM Hare was an anti-realist (the anti-realist par excellence, even) but believed in a first principles derivation of morality; you may have come across his position in the earlier works of his most famous student, Peter Singer. (Singer has since become a realist, under the influence of Derek Parfit). Likewise, you can have those who are as realist as realists can be, and who accept that we can know moral truths, but not that we can prove them. This seems to be what you're denying in your comment - you think the only hope for moral epistemology is first-principles logic - but that's a strong claim, and pretty much all meta-ethical naturalists have accounts of how we can know morality through some kind of natural understanding.
For myself, I'm a pretty strong anti-realist, but for reasons that have very little to do with traditional questions of moral epistemology; so I actually have a lot of sympathy with the accounts of moral epistemology given by many different metaethical naturalists, as well as by those who have straddled the realist/anti-realist line (e.g. constructivists, or Crispin Wright, whatever name you want to give to his position), if their positions are suitably modified.
I'm confused. He thinks you can derive that morality doesn't exist or he thinks you can derive something that doesn't exist?
I mean it depends on what you mean by moral epistemology. If you just mean a decision tree that I might like to use for deciding my morals I think it exists. If you mean a decision tree that I Should follow then I disagree.
I deleted my first comment to post a better one.
Some of my comment is a bit long and may not be 100% directed at you, sorry for that. I still thought it was relevant enough to mention.
I am generally in favour of honesty and openness about one's views, including moral views. I'm also generally in favour of being open about power dynamics, such as the fact that two people (SBF and Moskowitz) control almost all EA funding. I also think moral antirealism is true in a strong sense with high probability.
However, some things I wanted to point out:
Re point 5: I would assume the subgroups would/could make moral trades to prevent this from happening.
[I just quickly listened to the post, and I'm not a philosopher, nor do I know much about Ergodicity Economics]
Maybe Ergodicity Economics (EE) [site, wikipedia] could be relevant here? It is relevant for the St. Petersburg experiment. It has to do with expected values of stochastic processes not being equal to their time averages (ensemble average ≠ time average).
I am sure there is much more to EE than this, but the one insight that I took from it when I got to know about EE is that when one of the outcomes of the game is to lose everything, expected value does not do a good job describing it. And, at least for x-risks this is exactly the case (and I consider catastrophic risks also in this domain).
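To make the ensemble-versus-time-average point concrete, here is a minimal sketch applied to the button from the post (51% chance of doubling total utility, 49% chance of destroying everything). The function names are hypothetical, and treating the geometric mean as the "time-average" growth factor is an assumption borrowed from the usual ergodicity-economics treatment of multiplicative gambles, not something stated in the post.

```python
def ensemble_multiplier(p_win=0.51, gain=2.0, loss=0.0):
    """Expected per-press multiplier of total utility (the ensemble average)."""
    return p_win * gain + (1 - p_win) * loss


def time_average_multiplier(p_win=0.51, gain=2.0, loss=0.0):
    """Per-press growth factor along one long history (geometric mean).

    Assumption: repeated presses compound multiplicatively.
    """
    return (gain ** p_win) * (loss ** (1 - p_win))


def after_presses(n, p_win=0.51, gain=2.0):
    """Expected utility (starting from 1) and survival probability after n presses.

    Assumes a losing press leaves nothing, as in the post's example.
    """
    expected_utility = (p_win * gain) ** n
    survival_probability = p_win ** n
    return expected_utility, survival_probability


if __name__ == "__main__":
    print("ensemble multiplier:", ensemble_multiplier())
    print("time-average multiplier:", time_average_multiplier())
    for n in (1, 10, 50):
        ev, alive = after_presses(n)
        print(f"n={n}: expected utility={ev:.3f}, P(world survives)={alive:.2e}")
```

Under these assumptions the expected multiplier stays above 1 on every press, while the chance that anything is left shrinks geometrically, which is one way of stating the tension described above.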
It seems that EE is not very well known in the EA community; at least, the term is almost never mentioned on the forum, so I thought I would mention it here in case anyone wants to dig in more. I'm certainly not the best placed to go deeper into it, nor do I have the time to attempt it.
One post that seems to address the issue of EE within EA is this one.
I hope this is a good lead!
Thanks for this fantastic post and your comments, I both enjoyed and learned from them.
Is the following a reasonable summary of your position? Is there anything you would change or add?
Thank you very much for writing this - I have found it incredibly useful and wish all complex philosophical texts would be followed by something like this. Just wanted to give you extra feel-good for doing this because it is well deserved as it helped me immensely to fit all that I read into place.
Thanks so much! I'm really glad this was helpful for you.
Thank you for your fantastic summary! Yes, I think that's a great account of what I'm saying in this post.
"Various theories of moral uncertainty exist, outlining how this aggregation works; but none of them actually escape the issue. The theories of moral uncertainty that Effective Altruists rely on are themselves frameworks for commensurating values and systematically ranking options, and (as such) they are also vulnerable to ‘value dictatorship’, where after some point the choices recommended by utilitarianism come to swamp the recommendations of other theories. In the literature, this phenomenon is well-known as ‘fanaticism’."
This seems too strong. Doesn't this only apply to maximizing expected choiceworthiness with intertheoretic comparisons, among popular approaches? My impression is that none of the other popular approaches are fanatical. You mention moral parliament in the footnote as a non-fanatical exception, but there's also MEC variance voting, formal bargain-theoretic approaches and my favourite theory.
Also, theories with value lexicality can swamp utilitarianism, although this also seems fanatical.
Hi Michael, thanks for your comment - since it overlapped quite significantly with Leonard's comment, I have tried to address both of your concerns together in one of my replies to Leonard. Let me know if I failed to reply sufficiently to some of your points!
I think the quote literally mischaracterizes approaches to normative uncertainty that EAs rely on as all leading to fanaticism, when almost none do, only really MEC with intertheoretical comparisons. Whether the ones that deny intertheoretical comparisons also reject universal domain is a separate issue.
Or maybe I misunderstood, and you're only referring to versions of MEC with intertheoretical comparisons?
I think that approaches other than MEC + ITT aren't typically clear enough about universal domain, and can be interpreted in several different ways because they're not completely formally specified. (Not itself a criticism!) But (a corollary of the argument in this post) these frameworks only actually avoid fanaticism if they deny universal domain. So it isn't clear whether they fall victim to fanaticism, pace your claim that 'almost none do'.
I think - purely personal impression - that most people who think very hard about moral uncertainty are committed to quite a rationalistic view of ethics, and that this leads quite naturally to universal domain; so I said that, interpreting this into their frameworks, they will fall victim to fanaticism. But I mentioned the moral parliament as an example of an approach where the proponents have explicitly denied universal domain, as an example of the other possibility. However you interpret it, though, it's an either/or situation.
(Lexical orders / deontological constraints are a separate issue, as you mention.)
Assuming all your moral theories are at least interval-scale, I think variance voting and similar normalization-based methods
I think formal bargain-theoretic approaches (here and here) satisfy 1 and 3, aren't fanatical, but I'm not sure about 4 and violations of IIA.
On variance voting: yeah, I think 4 is the point here. I don't think you can extend this approach to unbounded choice sets. I'm travelling atm so can't be more formal, but hopefully tomorrow I can write up something a bit more detailed on this.
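For readers unfamiliar with variance voting, the following is a rough sketch of the normalisation idea on a small, finite option set. It assumes one simple reading of the method (rescale each theory's scores to mean 0 and variance 1 with every option weighted equally, then take a credence-weighted sum); the theory names, scores, and credences are invented for illustration. Note that the normalisation step needs a mean and variance over the whole option set, which is why the sketch only covers the finite case.

```python
from statistics import mean, pstdev


def normalise(scores):
    """Rescale one theory's scores to mean 0 and variance 1 across the options."""
    mu = mean(scores.values())
    sigma = pstdev(scores.values())
    if sigma == 0:  # the theory is indifferent between all options
        return {option: 0.0 for option in scores}
    return {option: (value - mu) / sigma for option, value in scores.items()}


def raw_expected_choiceworthiness(theories, credences):
    """Credence-weighted sum of the unnormalised scores (for comparison only)."""
    options = next(iter(theories.values())).keys()
    return {o: sum(credences[t] * theories[t][o] for t in theories) for o in options}


def variance_vote(theories, credences):
    """Credence-weighted sum of the variance-normalised scores."""
    normed = {t: normalise(scores) for t, scores in theories.items()}
    options = next(iter(theories.values())).keys()
    return {o: sum(credences[t] * normed[t][o] for t in theories) for o in options}


# Hypothetical numbers: one theory sees astronomically large stakes in option B.
theories = {
    "high_stakes_theory": {"A": 0, "B": 10**12},
    "constraint_theory": {"A": 1, "B": 0},
}
credences = {"high_stakes_theory": 0.3, "constraint_theory": 0.7}

raw = raw_expected_choiceworthiness(theories, credences)
voted = variance_vote(theories, credences)
print("raw winner:", max(raw, key=raw.get))
print("variance-voting winner:", max(voted, key=voted.get))
```

In this toy case the raw weighted sum is dominated by the theory with the largest numbers, while the normalised version follows the credences. Whether anything similar can be defined over unbounded choice sets is exactly the open question raised in the comment.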
On bargaining-theory approaches, it actually isn't clear that they avoid fanaticism: see pp.24-26 of the Greaves and Cotton-Barratt paper you link, especially their conclusion that 'the Nash approach is not completely immune from fanaticism'. Again, I think constructive ambiguity in the way these theories get described often helps obscure their relationship to fanatical conclusions; but there's an impossibility result here, so ultimately there's no avoiding the choice.
Wow, this provided me with a lot of food for thought. For me personally, this was definitely one of the most simultaneously intriguing and relatable things I've ever read on the EA forum.
The arguments opening up this post aren't really biting or that relevant.
I made a "contest", where I'll write "answers" and challenge anyone to reply to me.
This seems related to this post.
I'm not sure, but my guess is that the post is framed/motivated in a way that makes its ideas seem much less tangential than they really are.
Thanks a lot for this post. I have found it a superb piece and well worth meditating about, even if I have to say that I am probably biased because, a priori, I am not too inclined towards Utilitarianism in the first place. But I think the point you make is complex and not necessarily against consequentialism as such, and would probably go some way to accommodate the views of those who find a lot of it too alien and unpalatable.
The St. Petersburg paradox is not a paradox. The expected value of the wager varies as the square root of the sample size. Because we sum over infinitely many outcomes, we get infinite EV, but this is irrelevant to someone without infinite time. So to the question of how much we should pay for the opportunity to wager, the answer is: something less than the square root of the number of times we get to run it. If you write a script to run this experiment, you can see how this is so.
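A minimal sketch of the kind of script suggested above, assuming the classic St Petersburg game (the pot starts at 2 and doubles on every tails, and the first heads pays out the pot). The function names `play_once` and `average_payoff` are hypothetical. The script only prints the average payoff per game for several sample sizes; how that average grows with the number of games is left to check against the claim above.

```python
import random


def play_once(rng):
    """Play one classic St Petersburg game and return the payoff.

    Assumption: the pot starts at 2 and doubles on every tails;
    the first heads ends the game and pays out the pot.
    """
    payoff = 2
    while rng.random() < 0.5:  # tails: double the pot and toss again
        payoff *= 2
    return payoff


def average_payoff(n_games, seed=0):
    """Average payoff per game over n_games independent games."""
    rng = random.Random(seed)  # seeded so that runs are repeatable
    return sum(play_once(rng) for _ in range(n_games)) / n_games


if __name__ == "__main__":
    for n in (10, 100, 1_000, 10_000, 100_000):
        print(f"{n:>7} games: average payoff per game = {average_payoff(n):.2f}")
```

Results vary a lot from seed to seed, because rare long runs of tails dominate the average, so it is worth trying several seeds before drawing conclusions.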
Another approach to thinking about these difficulties could be to take counsel from the Maxwell Demon problem in thermodynamics. There, it looks like you can get a "repugnant conclusion" in which the second law is violated, if you don't address the details of Maxwell's demon directly and carefully.
I suspect there is a corresponding gap in analyses of situations at the edges of population ethics. Call it the "repugnant demon." Meaning, in this hypothetical world full of trillions of people we're being asked to create, what powers do we have to bestow on our demon which is responsible for enforcing barely livable conditions? These trillions of people want better lives, otherwise by definition they would not be suffering. So the demon must be given the power to prevent them from having those improved lives. How?
Pretty clearly, what we're actually being asked is whether we want to create a totalitarian autocratic transgalactic prison state with total domination over its population. Is such a society one you wish to create or do you prefer to use the power of your demon which it would take to produce this result in a different way?
A much smaller scale check here is whether it is good to send altruistic donations to existing autocratic rulers or not. Their populations are not committing suicide, so the people must have positive life utility. The dictator can force the population to increase, so the implementation here would be finding dictators who will accept altruistic donations for setting up forced birth camps in their countries.
In other words, I suspect when you finish defining in detail what "repugnant demon" powers need to be created to build a world with awful conditions, even for comparatively very small populations, it becomes clear immediately where the "missing negative utility" is in these cases: the power needed to produce conditions of very low satisfaction is actually very large. Therefore using that power in the evil act of setting up a totalitarian prison camp instead of a different and morally preferable society is to be condemned.
Very interesting perspective I've never thought about.
Is your argument just that there do not exist choices with the logical structure of the repugnant conclusion, and that in any decision-situation that seems like the repugnant conclusion there is always a hidden source of negative utility to balance things out? Or is it more limited to the specific case that Parfit initially discussed?
If the latter, then I think your point is irrelevant to the philosophical questions about the repugnant conclusion, which are concerned only with its logical structure. (Consider Michael's comment above for some other situations with similar structural features.) But if the former, then you're going to need some general argument as to why no such structure ever exists, not just an argument as to why the specific case often proposed is a bit misleading. Just because Parfit's original thought experiment might not pump your intuitions, so long as some set of choices has a structurally similar distribution of utility across time the question still arises. Do you have a more general argument why no choices exist with these distributions?
More like the former.
In other words, the moral weight of the choice we're asked to make is about the use of power. An example that's familiar and more successful because the power being exercised is much more clear is the drowning child example. The power here is going into the pond to rescue the child. Should one exercise that power, or are there reasons not to?
The powers being appealed to in these population ethics scenarios are truly staggering. The question of how they should be used is (in my opinion) usually ignored in favor of treating them as preferences over states of affairs. I suspect this is the reason they end up being confusing--when you instead ask whether setting up forced reproduction camps is a morally acceptable use of the power to craft an arbitrary society, there's just very little room for moral people to disagree any more.
Relative to creating large numbers of beings with the property of being unable to experience any negative utility but only small amounts of positive utility, it isn't clear this power exists logically. (The same might be said about enforcing pan galactic totalitarianism but repugnant conclusion effects IMO start being noticeable on scales where we can be quite sure power does exist.)
If the power to create such beings exists, it implies a quite robust power to shape the minds and experiences of created beings. If it were used to prohibit the existence of beings with tremendous capacity for pleasure I think that would be an immoral application. Another scenario though might be the creation of large numbers of minimally-sentient beings who (in some sense) mildly "enjoy" being useful and supportive of such high-experience people. Do toasters and dishwashers and future helpful robots qualify here? It depends on what panpsychism ends up being like for hypothetical people with this kind of very advanced mind design power. I could see it being true that such a world is possible, but I think this framing in terms of power exercise removes the repugnance from the situation as well. Is a world of leisure supported by minimally-aware robots repugnant? Nah, not really. :-)
This confuses me. In the original context in which the Repugnant Conclusion was dreamed up (neo-Malthusian debates over population control), seeking a larger population was a kind of laissez-faire approach opposed to "population policy", while advocates for smaller populations such as Garrett Hardin explicitly embraced 'coercion'. When Parfit originally formulated the Repugnant Conclusion, I don't think he imagined 'forced reproductive camps'!
So how about being more specific. Suppose you are living in 1968, and as a matter of fact you know that the claims made in Ehrlich and Ehrlich's The Population Bomb are true. (This is a hypothetical, as those claims turned out to be false in the actual world.) And you have control over population policies - say, you are president of the United States. If you exercise coercive power over reproduction, you can ensure that the world will have a relatively small population of relatively happy people. If you don't, and you simply let things be, then the world will have an enormous population of people whose lives are barely worth living.
This case has exactly the same structure as the Repugnant Conclusion, and not by accident: this is exactly the kind of question that Parfit and other population ethicists were thinking about in the 1960s and 1970s. But in this case, the larger population is not produced by the exercise of power; it is the smaller population that would be produced by coercion, and the larger population would be produced through laissez-faire. Thus, your argument about 'the use of power' does not support the claim that there do not exist choices with the logical structure of the Repugnant Conclusion.
In general, I think you have confused the Repugnant Conclusion itself with a weirdly specific variant of it, perhaps inspired by versions of the astronomical waste argument.
Thanks! This is a great set of context and a great way to ask for specifics. :-)
I think the situation is like this: I'm hypothetically in a position to exercise a lot of power over reproductive choices -- perhaps by backing tax plans which either reward or punish having children. I think what you're asking is "suppose you know that your plan to offer a child tax credit will result in a miserable population, should you stay with the plan because there'll be so many miserable people that it'll be better on utilitarian grounds"? The answer is no, I should not do that. I shouldn't exercise power I have to make a world which I believe will contain a lot of miserable people.
I think a better power-inversion question is: "suppose you are given dictatorial control of one million miserable and hungry people. Should you slaughter 999,000 of them so the other 1000 can be well fed and happy?" My answer is, again, unsurprisingly, No. No, I shouldn't use dictatorial power to genocide this unhappy group. Instead I should use it to implement policies I think will lead over time to a sustainable 1000-member happy population, perhaps by the same kind of anti-natalist policies that would in other happier circumstances be abhorrent.
My suspicion I think I share with you: that consequentialism's advice is imperfect. My sense is it is imperfect mostly not because of unfamiliar galactic-scale reasons or other failures in reacting to odd situations involving unbelievably powerful political forces. If that's where it broke down it'd be mostly immaterial to considering alternatives to consequentialism in everyday situations (IMO).
You could create huge numbers of unlikely to be conscious beings or low moral weight beings who would not suffer at all and only experience pleasure, but each would only be barely above neutral in expected value. These beings may be more efficient to create and use to generate value, because their brains are simpler and more parallelizable.
The value of the far future seems vastly dominated by artificial sentience in expectation. The expected utility-maximizing artificial sentience could be huge numbers of beings with low average expected moral weight.
With current animals and ignoring the far future, it might be invertebrates, and the so-called Rebugnant Conclusion (https://jeffsebodotnet.files.wordpress.com/2021/08/the-rebugnant-conclusion-.pdf). To be clear, though, I think many invertebrates are not very unlikely to be conscious, and it's plausible they have similar or incomparable moral weight to humans, rather than much lower moral weight.
A bit repetitive to what I replied below but it isn't clear to me that minimally-conscious beings can't suffer (or be made to not be able to suffer).
On relatively more stable ground wrt power to choose between a world optimized for insects vs humans, I'm happy to report I'm a humanity partisan. :-)
In theory of mind terms, it sounds like we differ in estimating the likelihood that insects will be thought of as having conscious experience as we learn more. (For other invertebrates I think the analysis may be very different.) My sense given extraordinary capabilities of really-clearly-not-conscious ML systems is that pretty sophisticated behaviors are well within reach for unconscious organisms. Moreso than I might have thought a few years ago.
I think it's very likely that we can stack the deck in favour of positive welfare, even if it's still close to 0 due to low expected probability of consciousness or low moral weight. There are individual differences in average hedonic setpoints between humans that are influenced genetically and some extreme cases like Jo Cameron. The systems for pleasure and suffering don't overlap fully, so we could cut parts selectively devoted to suffering out or reduce their sensitivity.
I agree that many sophisticated behaviors are well within reach for unconscious systems, but it's not clear this counts that much against invertebrates. You can also come to it from the other side: it's hard to pick out capacities that humans have that are with very high probability necessary for consciousness (e.g. theory of mind, self-awareness to the level of passing the mirror test and the capacity for verbal report don't seem necessary), but aren't present in (some) invertebrates. I'd recommend Rethink Priorities' work on this topic (disclaimer: I work there, but didn't work on this, and am not speaking for Rethink Priorities) and Luke Muehlhauser's report for Open Phil.
Also, at what point would you start to worry about ML (or other AI) systems being conscious, especially ones that aren't capable of verbal report?
Completely agree it is difficult to find "uniquely human" behaviors that seem indicative of consciousness as animals share so many of them.
Any animals which don't rear young I am much more likely to believe have behaviors much more genetically determined and so therefore operating at time scales that don't really satisfy what I think makes sense to call consciousness. I'm thinking of the famous Sphex wasp hacks for instance where complex behavior turns out to be pretty algorithmic and likely not indicative of anything approximating consciousness. Thanks for the pointer to the report!
WRT AI consciousness, I work on ML systems and have a lot of exposure to sophisticated models. My sense is that we are not close to that threshold, even with sophisticated systems that are obviously able to pass naive Turing tests (and have). My sense is we have a really powerful approach to world-model-building with unsupervised noise prediction now, and that current techniques (including RL) are just nowhere near enough to provide the kind of interiority that AI systems need to start me worrying there's conscious elements there.
IOW, I'm not a "scale is all you need" person -- I don't think current ideas on memory/long-range augmentation or current planning type long-range state modeling are workable. I mean, maybe times 10^100 it is all you need? But that's just sort of another way of saying it isn't. :-) The sort of "self-talk" modularity that some LLMs are being experimented with strikes me as the most promising current direction for this (e.g. LaMDA paper) but currently the scale and ingredients are way too small for that to emerge IMO.
I do suspect that building conscious AI will teach us way more about non-verbal-report consciousness. We have some access to these mechanisms with neuroscience experiments but it is difficult going. My belief is we have enough of those to be quite certain many animals share something best called conscious experience.
I think the Train to Crazytown is a result of mistaken utilitarian calculations, not an intrinsic flaw in utilitarianism. If we can’t help but make such mistakes, then perhaps utilitarianism would insist we take that risk into account when deciding whether or not to follow through on such calculations.
Take the St. Petersburg Paradox. A one-off button push has positive expected utility. But no rational gambler would take such a series of bets, even if they’re entirely motivated by money.
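A minimal sketch of the arithmetic behind that tension, assuming the button works as described in the post (51% chance of doubling total utility, 49% chance of destroying everything) and that presses are independent. The function name `after_presses` and the starting utility of 1.0 are illustrative assumptions.

```python
P_WIN = 0.51  # chance that one press doubles total utility (from the post)


def after_presses(n, start_utility=1.0):
    """Return (survival probability, expected utility) after n presses.

    Assumes independent presses: the world survives only if every press
    wins, and in that case total utility has doubled n times.
    """
    survival = P_WIN ** n
    utility_if_survive = start_utility * 2 ** n
    expected = survival * utility_if_survive  # equals start_utility * 1.02 ** n
    return survival, expected


if __name__ == "__main__":
    for n in (1, 10, 100):
        survival, expected = after_presses(n)
        print(f"n={n:>3}: P(world survives)={survival:.3g}, expected utility={expected:.3g}")
```

Expected utility grows by a factor of 1.02 with every press while the chance that anything is left shrinks towards zero, which is the gap between the one-off bet and the series of bets.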
The Kelly criterion gives us the theoretically optimal bet size for an even money bet, which the St. Petersburg Paradox invokes (but for EV instead of money).
The Paradox proposes sizing the bet at 100% of bankroll. So to compute the proportion of the bet we’d have to win to make this an optimal bet, we plug it into the Kelly criterion and solve for B.
1 = .51 - .49/B
This gives B = -1, less than zero, so the Kelly criterion recommends not taking the bet.
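The calculation above can be sketched in a few lines, assuming the standard form of the Kelly formula, f = p - q/b, where f is the fraction of bankroll staked, p the win probability, q = 1 - p, and b the net odds won per unit staked. The function names are hypothetical.

```python
def kelly_fraction(p, b):
    """Kelly-optimal fraction of bankroll to stake: f = p - q/b, with q = 1 - p."""
    return p - (1 - p) / b


def odds_needed(p, f):
    """Solve f = p - (1 - p)/b for b: the net odds at which staking f is Kelly-optimal."""
    return (1 - p) / (p - f)


if __name__ == "__main__":
    # At even money (b = 1) with a 51% win chance, Kelly stakes about 2% of bankroll.
    print(f"Kelly fraction at even money: {kelly_fraction(0.51, 1.0):.2f}")
    # Staking 100% of bankroll (f = 1) would require net odds of about -1.
    print(f"Odds needed to justify f = 1: {odds_needed(0.51, 1.0):.2f}")
```

Under these assumptions the even-money fraction comes out at about 2% of bankroll, far from the 100% stake the button demands, and the negative value of B confirms that no positive odds make a full-bankroll stake Kelly-optimal at p = 0.51.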
The implicit utility function in Kelly (log of bankroll) amounts to rejecting additive aggregation/utilitarianism. That would be saying that doubling goodness from 100 to 200 would be of the same decision value as doubling from 100 billion to 200 billion, even though in the latter case the benefit conferred is a billion times greater.
It also absurdly says that loss goes to infinity as you go to zero. So it will reject any finite benefit of any kind to prevent even an infinitesimal chance of going to zero. If you say that the world ending has infinite disutility then of course you won't press a button with any chance of the end of the world, but you'll also sacrifice everything else to increment that probability downward, e.g. taking away almost everything good about the world for the last tiny slice of probability.
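Both points can be checked numerically. This is a minimal sketch, assuming utility is taken to be the natural log of total goodness (the function implicit in Kelly); the specific numbers and the name `log_utility_gain` are illustrative.

```python
import math


def log_utility_gain(before, after):
    """Change in log utility when total goodness moves from `before` to `after`."""
    return math.log(after) - math.log(before)


if __name__ == "__main__":
    # Doubling counts the same at every scale: both gains equal log(2).
    small = log_utility_gain(100, 200)
    large = log_utility_gain(100e9, 200e9)
    print(f"100 -> 200:       {small:.4f}")
    print(f"100bn -> 200bn:   {large:.4f}")

    # Log utility falls without bound as what remains approaches zero.
    for remaining in (1e-3, 1e-9, 1e-300):
        print(f"log({remaining:g}) = {math.log(remaining):.1f}")
```

The two doublings get identical scores even though the second confers a billion times more benefit, and the score for what remains keeps falling as it shrinks towards zero, so any nonzero chance of ending at exactly zero outweighs any finite gain.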
That's a helpful correction/clarification, thank you!
I suppose this is why it's important to be cautious about overapplying a particular utilitarian calculation - you (or in this case, I) might be wrong in how you're going about it, even though the right ultimate conclusion is justified on the basis of a correct utilitarian calculus.
I don't understand the relevance of the Kelly criterion. The Wikipedia page for the Kelly criterion states that "[t]he Kelly bet size is found by maximizing the expected value of the logarithm of wealth," but that's not relevant here, is it?
Forgive me for dropping a new and potentially shallow point in on this discussion. The intellectual stimulation from the different theoretical approaches, thought experiments and models is clear. It is great to stretch the mind and nurture new concepts – but otherwise I question their utility and priority, given our situation today.
We do not need to develop pure World A and World B style thought experiments on the application of EA concepts, for want of other opportunities to test the model. We (collectively, globally), have literally dozens of both broad and highly targeted issues where EA style thinking can be targeted and thereby may help unlock solutions. Climate change, ecosystem collapse, hybrid warfare, inequality, misinformation, internal division, fading democracy, self-regulated tech monopolies, biosecurity, pandemic responses etc... the individual candidate list is endless… We also need to consider the interconnected nature, and thereby the potential for interconnected solutions.
Surely, a sustained application of the minds demonstrated in this paper and comment train could both help on solutions, and as a bonus, provide some tough real life cases that can be mined for review and advancing the EA intellectual model.
I draw a parallel between the EA community and what I see in the foresight community (which is much older). The latter have developed an array of complex tools and ideas, (overly dependent on graphics rather than logic in some cases). They are intellectually satisfying to an elite, but some 30 years+ after the field evolved, foresight is simply not commonly used in practice. At present it remains a niche specialism – to date generally sold within a management consultancy package, or used in confidential defence wargaming.
With this type of benchmark in mind, I would argue that the value/utility of the EA models/theories/core arguments, (and perhaps even the larger forum) should be in part based on the practical or broader mindset changes that they have demonstrably driven or at least catalysed.
I have two gripes with this thought experiment. First, time is not modelled. Second, it's left implicit why we should feel uneasy about the thought experiment. And that doesn't work due to highly variable philosophical intuitions. I honestly don't feel uneasy about the thought experiment at all (only slightly annoyed). But maybe I would have, had it been completely specified.
I can see two ways to add a time dimension to the problem. First, you could let all the presses be predetermined and in one go, where we get into Satan's apple territory. Second, you could have a 30-second pause between presses. But in that case, we would accumulate massive amounts of utility in a very short time - just the seconds in-between presses would be invaluable! And who cares if the world ends in five minutes with probability 1−0.49^10 when every second it survives is so sweet? :p
People should be allowed to destroy the button (aka "x-risk reduction") ;-)
What is the "Satan's apple" reference? I don't get that, sorry.
I think I agree with you that the version of the example I offered did not fully specify the time dimension. But it's crucial to the original St Petersburg game that your winnings at stage n are not collected until the game is finished, so that if you lose at stage n+1 you don't get to enjoy your winnings. So presumably, in Hurka's variant, the extra world(s) should not be created until after you walk away from the button, so the seconds in-between presses would not be valuable at all. But I take your point that this wasn't specified well enough in OP.
Source: Satan, Saint Peter and Saint Petersburg
I'm going to bite the bullet of absurdity, and say this already happened.
Imagine a noble/priest 500-1000 years ago trying to understand our western society, and they would likely find it absurd as well. Some norms have survived primarily due to the human baseline not changing through genetic engineering, but overall it would be weird and worrying.
For example, the idea that people are relatively equal would be absurd to a medieval noble, let alone our tolerance for outgroups/dissent.
The idea that religion isn't an all-powerful panacea, or is even optional, would be absurd to a priest.
The idea that there are positive sum trades that are the majority, rather than zero or negative sum trades, would again be absurd to the noble.
Science would be worrying to the noble.
And much more. In general I think people underestimate just how absurd things can get, so I'm not surprised.
The problem with this line of reasoning is that it applies to anything.
For example, suppose I have a metaethical view that says that the most important thing in life is doing a handstand. Morally, we ought to maximize the number of people who stand on their hands and we ought to maximize the length of time they do it, too. So universities are turned into handstand mills. Tax dollars are diverted from social spending and used to promote the public uptake of new handstand norms. And so on.
This view is, of course, counterintuitive in both its starting assumptions and its consequences. And that is a perfectly good reason for rejecting it. The fact that the medievals would have found modern physics and mathematics counterintuitive doesn't change that.
So, in a similar way, it is conceivable that utilitarianism (or a related view) could have consequences so counterintuitive that they would give us grounds for rejecting utilitarianism.
I do not agree, in general. Now the exact example is incorrect more or less for humans because we don't value it, though importantly this may not be universal in the future, but I do not generally accept this argument, especially about the long-term future.
As to why, I generally am not a fan of bullet-biting too much, but we should generally be suspicious of moralities that claim to do no wrong or have no weird or counterintuitive claims, because we should generally be suspicious of a straightforward, easy option A that is just so clearly better than an uncomfortably costly, tradeoff-y option B. This usually doesn't exist in organizations with at least some efficiency. So that's why bullet-biting in some cases is the best option.
(Also I have high likelihood that the long-term future in say 1,000 years will look absurd compared to us in technology and society.)
I'm not sure I understand your response to Adrian here? The claim is not that we should search for a view that has no weird or counterintuitive claims in it, only that some views might reasonably be rejected on the basis of their weird and counterintuitive claims. There might be no view that is completely un-weird, but handstandism is nonetheless obviously too weird!
Basically, my point, coming back to my first comment, is that the future can get really absurd, and arguably will.
We can pick which absurdities we choose and which to reject, but the option of a non-absurd future isn't really on the table. It's like claiming that gyroscopes won't do weird things.
I got on a different train long ago. I am not a utilitarian; I am a contractualist who wants 'maximize utility' to be a larger part of the social contract (but certainly not the only one).
I haven't read enough on the topic yet, but my impression is that my train of belief would indeed be something somewhat like 'a contractualist who wants to maximize utility'.
Reductionist utilitarian models are like play-dough. They're fun and easy to work with, but useless for doing anything complicated and/or useful.
Perhaps in 100-200 years our understanding of neurobiology or psychometrics will be good enough for utilitarian modelling to become relevant to real life, but until then I don't see any point getting on the train.
The fact that intelligent, well-meaning individuals are wasting their time thinking about the St Petersburg paradox is ironically un-utilitarian; that time could be used to accomplish tasks which actually generate wellbeing.