Will Aldred

AI forecasting & strategy @ Metaculus; also an EA Forum moderator
3388 karma · Joined Apr 2021 · Working (6-15 years) · Oxford, UK


Just a bundle of subroutines sans free will, aka flawed post-monkey #~100,000,000,000. Peace and love xo

(Note: All my posts, except those where I state otherwise, reflect my views and not those of my employer(s).)

“[Man] thinks, ‘I am this body, I will soon die.’ This false sense of self is the cause of all his sorrow.” —Lao Tzu





Directionally, I agree with your points. On the last one, I'll note that counting person-years (or animal-years) falls naturally out of empty individualism as well as open individualism, and so the point goes through under the (substantively) weaker claim of “either open or empty individualism is true”.[1]

(You may be interested in David Pearce's take on closed, empty, and open individualism.)

  1. ^

    For the casual reader: The three candidate theories of personal identity are empty, open, and closed individualism. Closed is the common sense view, but most people who have thought seriously about personal identity (e.g., Parfit) have concluded that it must be false (tl;dr: because nothing, and in particular not memory, can “carry” identity in the way that's needed for closed individualism to make sense). Of the remaining two candidates, open appears to be the fringe view; its supporters include Kolak, Johnson, Vinding, and Gomez-Emilsson (although Kolak's response to Cornwall makes it unclear to what extent he is indeed a supporter). Proponents of (what we now call) empty individualism include Parfit, Nozick, Shoemaker, and Hume.

There was near-consensus that Open Phil should generously fund promising AI safety community/movement-building projects they come across

Would you be able to say a bit about to what extent members of this working group have engaged with the arguments around AI safety community/movement-building potentially doing more harm than good? For instance, points 6 through 11 of Oli Habryka's second message in the “Shutting Down the Lightcone Offices” post. If they have strong counterpoints to such arguments, then I imagine it would be valuable for these to be written up.

(Probably the strongest response I've seen to such arguments is the post “How MATS addresses ‘mass movement building’ concerns”. But this response is MATS-specific and doesn't cover concerns around other forms of movement building, for example, ML upskilling bootcamps or AI safety courses operating through broad outreach.)

I enjoyed this post, thanks for writing it.

Is there any crucial consideration I’m missing? For instance, are there reasons to think agents/civilizations that care about suffering might – in fact – be selected for and be among the grabbiest?

I think I buy your overall claim in your “Addressing obvious objections” section that there is little chance of agents/civilizations who disvalue suffering (hereafter: non-PUs) winning a colonization race against positive utilitarians (PUs). (At least, not without causing equivalent expected suffering.) However, my next thought is that non-PUs will generally work this out, as you have, and that some fraction of technologically advanced non-PUs—probably mainly those who disvalue suffering the most—might act to change the balance of realized upside- vs. downside-focused values by triggering false vacuum decay (or perhaps by doing something else with a similar switching-off-a-light-cone effect).

In this way, it seems possible to me that suffering-focused agents will beat out PUs. (Because there’s nothing a PU agent—or any agent, for that matter—can do to stop a vacuum decay bubble.) This would reverse the post’s conclusion. Suffering-focused agents may in fact be the grabbiest, albeit in a self-sacrificial way.

(It also seems possible to me that suffering-focused agents will mostly act cooperatively, only triggering vacuum decays at a frequency that matches the ratio of upside- vs. downside-focused values in the cosmos, according to their best guess for what the ratio might be.[1] This would neutralize my above paragraph as well as the post's conclusion.)

  1. ^

    My first pass at what this looks like in practice, from the point of view of a technologically advanced, suffering-focused (or maybe non-PU more broadly) agent/civilization: I consider what fraction of agents/civilizations like me should trigger vacuum decays in order to realize the cosmos-wide values ratio. Then, I use a random number generator to tell me whether I should switch off my light cone.

    Additionally, one wrinkle worth acknowledging is that some universes within the inflationary multiverse, if indeed it exists and allows different physics in different universes, are not metastable. PUs likely cannot be beaten out in these universes, because vacuum decays cannot be triggered. Nonetheless, this can be compensated for through suffering-focused/non-PU agents in metastable universes triggering vacuum decays at a correspondingly higher frequency.
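A minimal sketch of the randomized procedure described in this footnote, under stated assumptions. The function names, the `target_fraction` and `metastable_fraction` parameters, and the simple rescaling rule (divide the target by the share of like-minded agents who live in metastable universes, capped at 1) are all illustrative choices of mine, not something the original comment specifies.

```python
import random


def trigger_probability(target_fraction, metastable_fraction=1.0):
    """Per-agent probability of acting, for agents in metastable universes.

    target_fraction: the fraction of all like-minded agents that should act,
        per the agent's best guess at the cosmos-wide values ratio (assumed input).
    metastable_fraction: the assumed share of like-minded agents located in
        metastable universes, i.e. those able to act at all.
    """
    if not 0.0 <= target_fraction <= 1.0:
        raise ValueError("target_fraction must be in [0, 1]")
    if not 0.0 < metastable_fraction <= 1.0:
        raise ValueError("metastable_fraction must be in (0, 1]")
    # Agents who can act compensate for those who cannot by acting more often.
    # The cap at 1 means full compensation is impossible once the target
    # exceeds the share of agents able to act.
    return min(1.0, target_fraction / metastable_fraction)


def should_act(probability, rng=random):
    """Single random draw: True with the given probability."""
    return rng.random() < probability
```

For example, if the target fraction is 0.2 and only half of the relevant agents are in metastable universes, each of those agents would act with probability 0.4, so that the expected cosmos-wide fraction is still 0.2.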

I'm happy this post exists. One thing I notice, which I find a little surprising, is that the post doesn't seem to include what I'd consider the classic example of controlling the past: evidentially cooperating with beings/civilizations that existed in past cycles of the universe.[1]

  1. ^

    This example does rely on a cyclic (e.g., Big Bounce) model of cosmology,^ which has a couple of issues. Firstly, such a cosmological model is much less likely to be true, in my all-things-considered view, than eternal inflation. Secondly, within a cyclic model, there isn't a clearly meaningful notion of time across cycles. However, I don't think these issues undercut the example. Controlling faraway events through evidential cooperation is no less possible in an eternally inflating multiverse; it's just that space is doing more of the work now than time (which makes it a less classic example of controlling the past). Also, while the notion of time outside their cycle may not hold much meaning to an observer within a cycle, I think that from a God's eye view, there is a material sense in which the cycles occur sequentially, with some in the past of others.

    In addition, the example can be adapted, I believe, to fit the simulation hypothesis. Sequential universe cycles become sequential simulation runs,* and the God’s eye view is now the point of view of the beings in the level of reality one above ours, whether that be base reality or another simulation.  *(It seems likely to me that simulation runs would be massively, but not entirely, parallelized. Moreover, even if runs are entirely parallelized, it would be physically impossible—so long as the level-above reality has physical laws that remotely resemble ours—for two or more simulations to happen in the exact same spatial location. Therefore, there would be frames of reference in the base reality from which some simulation runs take place in the past of others.)

    ^ (One type of cyclic model, conformal cyclic cosmology, allows causal as well as evidential influence between universes, though in this model one universe can only causally influence the next one(s) in the sequence (i.e., causally controlling the past is not possible). For more on this, see "What happens after the universe ends?".)

there are important downsides to the "cause-first" approach, such as a possible lock-in of main causes

I think this is a legitimate concern, and I'm glad you point to it. An alternative framing is lock-out of potentially very impactful causes. Dynamics of lock-out, as I see it, include:

  • EA selecting for people interested in the already-established causes.
  • Social status gradients within EA pushing people toward the highest-regarded causes, like AI safety.[1]
  • EAs working in an already-established cause having personal and career-related incentives to ensure that EA keeps their cause as a top priority.

A recent shortform by Caleb Parikh, discussing the specific case of digital sentience work, feels related. In Caleb's words:

I think aspects of EA that make me more sad is that there seems to be a few extremely important issues on an impartial welfarist view that don’t seem to get much attention at all, despite having been identified at some point by some EAs.

  1. ^

    Personal anecdote: Part of the reason, if I'm to be honest with myself, for my move from nuclear weapons risk research to AI strategy/governance is that it became increasingly difficult, socially, to be an EA working on nuclear risk. (In my sphere, at least.) Many of my conversations with other EAs, even in non-work situations and even with me trying to avoid this conversation area, turned into me having to defend my not focusing on AI risk, on pain of being seen as "not getting it".

EA is still not yet "correct enough" about wild animal welfare - too little attention and resources relatively and absolutely.

I'm very sympathetic to the view that wild animal suffering is a huge deal, and that a mature and moral civilization would solve this problem. However, I also find “Why I No Longer Prioritize Wild Animal Welfare” convincing. The conclusion of that post:

After looking into these topics, I now tentatively think that WAW [wild animal welfare] is not a very promising EA cause because:

  • In the short-term (the next ten years), WAW interventions we could pursue to help wild animals now seem less cost-effective than farmed animal interventions.
  • In the medium-term (10-300 years), trying to influence governments to do WAW work seems similarly speculative to other longtermist work but far less important. 
  • In the long-term, WAW seems important but not nearly as important as preventing x-risks and perhaps some other work.

I’m enjoying this sequence, thanks for writing it.

I imagine you’re well aware of what I write below – I write it to maybe help some readers place this post within some wider context.

My model of space-faring civilizations' values, which I’m sure isn’t original to me, goes something like the following:

  • Start with a uniform prior over all possible values, and with the reasonable assumption that any agent or civilization in the universe, whether biological or artificial, originated from biological life arising on some planet.
  • Natural selection. All biological life probably goes through a Darwinian selection process. This process predictably favors values that are correlated with genetic fitness.
  • Cultural evolution, including moral progress. Most sufficiently intelligent life (e.g., humans) probably organizes itself into a civilization, with culture and cultural evolution. It seems harder to predict which values cultural evolution might tend to favor, though.
  • Great filters.[1] Notably,
    • Self-destruction. Values that increase the likelihood of self-destruction (e.g., via nuclear brinkmanship-gone-wrong) are disfavored.
    • Desire to colonize space, aka grabbiness. As this post discusses, values that correlate with grabbiness are favored.
    • (For more, see Oesterheld (n.d.).)
  • A potentially important curveball: the transition from biological to artificial intelligence.
    • AI alignment appears to be difficult. This probably undoes some of the value selection effects I describe above, because some fraction of space-faring agents/civilizations is presumably AI with values not aligned with those of their biological creators, and I expect the distribution of misaligned AI values, relative to the distribution of values that survive the above selections, to be closer to uniform over all values (i.e., the prior we started with).
    • Exactly how hard alignment is (i.e., what fraction of biological civilizations that build superintelligent AI are disempowered?), as well as some other considerations (e.g., are alignment failures generally near misses or big misses?; if alignment is effectively impossible, then what fraction of civilizations are cognizant enough to not build superintelligence?), likely factor into how this curveball plays out.

  1. ^

    Technically, I mean late-stage steps within the great filter hypothesis (Wikipedia, n.d.; LessWrong, n.d.).

This house believes that if digital minds are built, they will:

  1. be conscious
  2. experience valence (i.e., pleasure and/or pain)

I think this is an important debate to have because, as has been pointed out here and here, EA seems to largely ignore prioritization considerations around digital sentience and suffering risk.[1]

To argue against the motion, I suggest David Pearce: see his view explained here. To argue for the motion, maybe—aiming high—David Chalmers: see his position outlined here.

  1. ^

    See the linked posts’ bullet points titled “I think EA ignores digital sentience too much,” and “Suffering-focused longtermism stuff seems weirdly sidelined,” respectively.

This post really resonates with me. Over winter 2021/22 I went on a retreat run by folks in the CFAR, MIRI, Lightcone cluster, and came away with some pretty crippling uncertainty about the sign of EA community building.[1] In retrospect, the appropriate response would have been one of the following:

  1. stop to investigate
  2. commit to community building (but maybe stop to investigate upon encountering new information or after some predetermined period of time)
  3. switch jobs

Instead, I continued on in my community building role, but with less energy, and with a constant cloud of uncertainty hanging over me. This was not good for my work outputs, my mental health, or my interactions and relationships with community building colleagues. Accordingly, “the correct response to uncertainty is *not* half-speed” is perhaps the number one piece of advice I’d give to my start-of-2022 self. I’m happy to see this advice so well elucidated here.

  1. ^

    To be clear, I don’t believe the retreat was “bad” in any particular way, or that it was designed to propagate any particular views regarding community building, and I have a lot of respect for these Rationalist folks.

Attendees should focus on getting as much ea-related value as possible out of EA events, and we as organizers should focus on generating as much value as possible. Thinking about which hot community builder you can get with later distracts from that. And, thinking about which hot participant you can get with later on can lead to decisions way more costly than just lost opportunities to provide more value.

Strongly agree. Moreover, I think it's worth us all keeping in mind that the only real purpose of the EA community is to do the most good. An EA community in which members view, for example, EA Globals as facilitating afterparties at which to find hookups is an EA community that is likely to spend more {time, money, attention} on EAGs and other events than would do the most good.

If the current resource level going toward EA community events does the most good,
I desire to believe that the current resource level going toward EA community events does the most good;
If less {time, money, attention} spent on EA community events does the most good,
I desire to believe that less {time, money, attention} spent on EA community events does the most good;
Let me not become attached to beliefs I may not want.
