
Summary: Even from an anti-realist stance on morality, there are various reasons we might expect moral convergence in practice.

[Largely written two years ago; cleaned up for draft amnesty week. The ideas benefited from comments and conversations with many people; errors remain my own.]

Consider:

Convergent morality thesis: for some non-tiny fraction of possible minds, their extrapolated volitions will (approximately) coincide — and also coincide with what we’d end up thinking was good (i.e. we ourselves are in this non-tiny fraction).

(This is something like an empirical analogue of moral realism. The idea is that the thing converged upon can be thought of as ~the good, although you don’t need to make any commitment to realism to follow this line of reasoning.)
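
(For readers who want it stated a bit more precisely, here is one rough way the thesis could be written down. The notation is mine and purely illustrative: a space of possible minds, a measure of how large a fraction of it a set occupies, an extrapolation operator, and a tolerance for “approximately coincide”.)

```latex
% Illustrative formalization only; none of this notation is from the post.
% \mathcal{M} = space of possible minds, \mu = measure of the "fraction" of minds,
% \mathrm{EV}(m) = extrapolated volition of mind m, d = distance between volitions.
\exists\, S \subseteq \mathcal{M}:\quad
\mu(S)\ \text{is non-tiny},\quad \text{we} \in S,\quad
\forall\, m, m' \in S:\ d\big(\mathrm{EV}(m), \mathrm{EV}(m')\big) < \varepsilon
```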

The central claim of this post is that the convergent morality thesis is quite plausible:

  • I think I’m around 75%, although that feels non-robust and I’m interested in arguments that might shift me
    • [my original draft from two years ago said “60%”, but that seems too low to me now]
  • NB if the convergent morality thesis were true, this would mean the complexity of value thesis was false (because one could give a short pointer to something in the relevant set of minds), although there might be a massive amount of computation required to arrive at an applicable conception of the good
    • e.g. perhaps sufficient information would be contained in the idea of “evolved social intelligence” and the skill of moral reflection
      • (It’s possible that one of these itself turns out to be complex, but my intuition is that they’re both pretty simple in Kolmogorov terms; see the rough sketch below this list.)
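
Here is the rough sketch referred to in the bullet above; the symbols are mine, purely as an illustration of how the description-length point could go:

```latex
% Illustrative only: V = an applicable conception of the good, S = the relevant set of minds,
% E = the extrapolation / moral-reflection procedure, K(\cdot) = Kolmogorov complexity.
K(V) \;\le\; K(\text{pointer to } S) \;+\; K(E) \;+\; O(1)
```

If something like “evolved social intelligence” suffices to pick out S and “the skill of moral reflection” specifies E, the right-hand side is short, so the description length of value is small, even though actually running E to reach an applicable conception of the good might require a massive amount of computation. Description length and compute cost come apart.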

The convergent morality thesis might hold in a straightforward way or a more convoluted way.

[Image: a small, solitary explorer entering a vast, open basin, illustrating the search for moral truth.]

 

The straightforward case

Perhaps it’s just the case that the process of moral reflection tends to cause convergence among minds from a range of starting points, via something like social logic plus shared evolutionary underpinnings.

The main intuition pump in favour of the straightforward case is that some limited version of this seems to apply after we restrict to humans:

  • It seems like we can predictably make moral progress by reflecting; i.e. coming to answers + arguments that would be persuasive to our former selves
  • I think I’m more likely to update towards the positions of smart people who’ve thought long and hard about a topic than the converse (and this is more true the smarter they are, the more they’re already aware of all the considerations I know, and the more they’ve thought about it)
  • If I imagine handing people the keys to the universe, I mostly want to know “will they put serious effort into working out what the right thing is and then doing it” rather than their current moral views

Note that all of these intuitions apply more strongly to moral reasoning than to e.g. aesthetic reasoning (though they feel less secure than with e.g. mathematical reasoning, where I better understand the underlying dynamics). And they apply more strongly to people from similar cultures than to people from very distant cultures (although I mostly still hold the intuitions for people from distant cultures, my evidence base there mostly comes from looking at philosophers in the past, who came from societies with significant differences from, but also significant overlap with, my own).

So it looks like there’s at least a region of mind-space where this kind of dynamic applies. This doesn’t tell us what should happen with alien minds, but I think it’s a pretty big update away from a prior of “maybe what people want is just pretty arbitrary/idiosyncratic”. If the hypothesis is false, it’s either because (1) somewhere on the spectrum between us and alien minds the reflective process breaks down, so that reflection no longer leads to convergence; or (2) morality is a mix of the derivable and underivable (and humanity has quite idiosyncratic choices for the underivable parts). My impression is that (1) is relatively implausible, but (2) is a real possibility (explored further below).

More generally, I suspect that intuitions in favour of moral realism are in many cases also intuitions in favour of the straightforward case for the convergent morality thesis, so there might well be some good discussion of this in the philosophical literature.[1] 

Is morality a mix of the derivable and the underivable?

A thought experiment suggests this at least sometimes happens. If we had a society of beings who deep in their bones valued paperclips, and another who deep in their bones valued staples, it does seem somewhat likely they’d both derive e.g. prohibitions against stealing, or utilitarian instincts towards resource-allocation, as these would help the societies to run more effectively towards their eventual goal of producing vast quantities of their preferred stationery.

Is this what’s going on for humans? I’m not sure that it is. It does seem to me that a lot of human morality has developed in service of the goal of making more humans (and hence making society prosper so that it can afford to feed more humans, win conflicts with other societies, etc.). But I don’t think that goal has stuck with us; if you look at the outputs of our moral reflection, I think we’ve gone deeper than things which are just in the service of making more humans.

If I’m right about that (that making more humans ultimately drove a lot of our moral intuitions but is a scaffolding we will eventually relinquish — and in many cases already have), there are three possibilities:

  A) Each human will have a significantly different extrapolated volition (/axiology); the apparent convergence-from-reflection is a local phenomenon which occurs at some levels of sophistication but will fall away before the end.
  B) Many humans (e.g. those trying to be moral and reflective) will have convergent extrapolated volitions (/axiologies). But these will be based on various idiosyncrasies of humanity, and we can’t reasonably expect convergence from non-human minds.
  C) Our extrapolated morality will converge to something more universal, such that we could expect convergence from some alien minds (perhaps mostly just those who similarly had their moral intuitions shaped by evolution in a social setting, or perhaps something broader).

When I hear discussion of the complexity of value, I normally imagine people are supposing something like B). But my current guess is that it is the least likely of these three possibilities — I’m quite unsure how to think about this, but if forced to put numbers on them now I might say 40% A), 15% B), 45% C). I think B) gets penalized relative to the other two because it postulates more complex behaviour — rather than one basic pattern that applies across minds, it says there’s something relatively special about the closeness of human minds to each other relative to their closeness to other minds.

Then I think for practical decision-making purposes we should apply a heavy discount to world A) — in that world, what everyone else would eventually want isn’t all that close to what I would eventually want. Moreover, what me-of-tomorrow would eventually want probably isn’t all that close to what me-of-today would eventually want. So it’s much, much less likely that the world we end up with, even if we save it, is close to the ideal one by my lights. Moreover, even though these worlds possibly differ significantly, I don’t feel like from my present position I have that much reason to be opinionated between them; it’s unclear that I’d greatly prefer imperfect worlds according to the extrapolated volition of some future-me to imperfect worlds according to the extrapolated volition of someone else I think is pretty reasonable.
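
As a toy illustration of that discount (only the 40% / 15% / 45% priors come from above; the conditional values below are made-up numbers purely for illustration), one could weight each world by its probability times how close to ideal-by-my-lights a “saved” world would plausibly be in that world:

```python
# Toy sketch of the "heavy discount" on world A.
# The 40/15/45 priors are from the post; the conditional values are hypothetical.

priors = {"A": 0.40, "B": 0.15, "C": 0.45}  # P(each convergence hypothesis is true)

# Rough guess at E[closeness-to-ideal-by-my-lights | world is "saved", hypothesis true].
# In world A extrapolated volitions diverge, so even a saved world is unlikely to be
# close to ideal by my lights; in worlds B and C convergence makes it much more likely.
value_if_saved = {"A": 0.1, "B": 0.8, "C": 0.8}  # hypothetical numbers

raw = {w: priors[w] * value_if_saved[w] for w in priors}
total = sum(raw.values())
weights = {w: raw[w] / total for w in raw}  # normalized decision-relevant weights

for w in sorted(weights):
    print(f"world {w}: prior {priors[w]:.0%} -> decision weight {weights[w]:.0%}")
# With these numbers, world A's ~40% prior shrinks to under 10% of the decision weight.
```

The exact numbers don’t matter; the point is just that a world where extrapolated volitions don’t converge contributes much less to what can be achieved by my lights, and so gets correspondingly less weight in practical decisions.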

The convoluted case

Perhaps many minds end up at a shared notion of what they’re aiming for, via acausal trade (getting to some grand bargain), or evidential cooperation in large worlds. I don’t understand the mechanisms here well enough to be confident, but it seems like a pretty realistic possibility.

The implications of convergence via such a mechanism could be a bit different from those of straightforward convergence — since it’s not just predictive of where agents might end up, but creates possible mechanisms for actors in our universe to have some (small) influence over the thing that everyone converges to. (Of course it’s also possible to expect significant straightforward convergence, and then further convergence from this mechanism on top.)

  1. ^

     Does moral realism imply the convergent morality thesis? Not strictly, although it’s suggestive. And even if you believe both, presumably there’s some causal mechanism behind convergent morality. Personally, though, I find many intuitions that used to make me sympathetic to realism now make me sympathetic to the convergent morality thesis.

Comments (4)



Then I think for practical decision-making purposes we should apply a heavy discount to world A) — in that world, what everyone else would eventually want isn’t all that close to what I would eventually want. Moreover, what me-of-tomorrow would eventually want probably isn’t all that close to what me-of-today would eventually want. So it’s much, much less likely that the world we end up with, even if we save it, is close to the ideal one by my lights. Moreover, even though these worlds possibly differ significantly, I don’t feel like from my present position I have that much reason to be opinionated between them; it’s unclear that I’d greatly prefer imperfect worlds according to the extrapolated volition of some future-me to imperfect worlds according to the extrapolated volition of someone else I think is pretty reasonable.

  1. You seem to be assuming that people's extrapolated views in world A will be completely uncorrelated with their current views/culture/background, which seems a strange assumption to make.
  2. People's extrapolated views could be (in part) selfish or partial, which is an additional reason that your extrapolated views at different times may be closer to each other than to those of strangers.
  3. People's extrapolated views not converging doesn't directly imply "it’s much much less likely that the world we end up with even if we save it is close to the ideal one by my lights" because everyone could still get close to what they want through trade/compromise, or you (and/or others with extrapolated views similar to yours) could end up controlling most of the future by winning the relevant competitions.
  4. It's not clear that applying a heavy discount to world A makes sense, regardless of the above, because we're dealing with "logical risk" which seems tricky in terms of decision theory.

4 is a great point, thanks.

On 1--3, I definitely agree that I may prudentially prefer some possibilities to others. I've been assuming that from a consequentialist moral perspective the distribution of future outcomes still looks like the one I give in this post, but I guess it should actually look quite different. (I think what's going on is that in some sense I don't really believe in world A, so haven't explored the ramifications properly.)

This comment I just made on Will Aldred's Long Reflection Reading List seems relevant for this topic. 

Overall, I'd say there's for sure going to be some degree of moral convergence, but it's often overstated, and whether the degree of convergence is strong enough to warrant going for the AI strategies you discuss in your subsequent posts (e.g., here) would IMO depend on a tricky weighting of risks and benefits (including the degree to which alternatives seem promising).

Does moral realism imply the convergent morality thesis? Not strictly, although it’s suggestive. And even if you believe both, presumably there’s some causal mechanism behind convergent morality. Personally, though, I find many intuitions that used to make me sympathetic to realism now make me sympathetic to the convergent morality thesis.

I agree with this endnote. 

For my anti-realism sequence, I've actually made the stylistic choice of defining (one version of) moral realism as implying moral convergence (at least under ideal reasoning circumstances). That's notably different from how philosophers typically define it. I went for my idiosyncratic definition because, when I tried to find out what the action-guiding versions of moral realism are (here), many of the ways in which philosophers have defined "moral realism" in the literature don't actually seem relevant for what we should do as effective altruists. I could only come up with two (very different!) types of moral realism that would have clear implications for effective altruism.

(1) Non-naturalist moral realism based on the (elusive?) concept of irreducible normativity.

(2) Naturalist moral realism where the true morality is what people who are interested in "doing the most moral/altruistic thing" would converge on under ideal reflection conditions.

(See this endnote where I further justify my choice of (2) against some possible objections.)

I think (1) just doesn't work as a concept, and (2) is almost certainly false, at least in its strongest form. But yeah, there are going to be degrees of convergence, and moral reflection (even at the individual level, without convergence) is also relevant from within a moral anti-realist reasoning framework.

Perhaps it’s just the case that the process of moral reflection tends to cause convergence among minds from a range of starting points, via something like social logic plus shared evolutionary underpinnings.

Yes. And there are many cases where evolution has indeed converged on solutions to other problems[1].

  1. ^

    Some examples:

    (Copy-pasted from Claude 3 Opus. They pass my eyeball fact-check.)

    1. Wings: Birds, bats, and insects have all independently evolved wings for flight, despite having very different ancestry.
    2. Eyes: Complex camera-like eyes have evolved independently in vertebrates (like humans) and cephalopods (like octopuses and squids).
    3. Echolocation: Both bats and toothed whales (like dolphins) have evolved the ability to use echolocation for navigation and hunting, despite being unrelated mammals.
    4. Defensive spines: Both porcupines (mammals) and hedgehogs (also mammals, but not closely related to porcupines) have evolved sharp, defensive spines.
    5. Fins: Sharks (cartilaginous fish) and dolphins (mammals) have independently evolved similar fin shapes and placement for efficient swimming.
    6. Succulence: Cacti (native to the Americas) and euphorbs (native to Africa) have independently evolved similar water-storing, fleshy stems to adapt to arid environments.
    7. Flippers: Penguins (birds), seals, and sea lions (mammals) have all evolved flipper-like limbs for swimming, despite having different ancestries.
    8. Ant-eating adaptations: Anteaters (mammals), pangolins (mammals), and numbats (marsupials) have independently evolved long snouts, sticky tongues, and strong claws for eating ants and termites.