William_MacAskill

Thank you so much for your work with EV over the last year, Howie! It was enormously helpful to have someone so well-trusted, with such excellent judgment, in this position. I’m sure you’ll have an enormous positive impact at Open Phil.

And welcome, Rob - I think it’s fantastic news that you’ve taken the role!

I mentioned a few months ago that I was planning to resign from the board of EV UK: I’ve now officially done so.

Since last November, I’ve been recused from the board on all matters associated with FTX and related topics, which has ended up being a large proportion of board business. (This is because the recusal affected not just decisions that were directly related to the collapse of FTX, but also many other decisions for which the way EV UK has been affected by the collapse of FTX was important context.) I know I initially said that I’d wait for there to be more capacity, but trustee recruitment has moved more slowly than I’d anticipated, and with the ongoing recusal I didn’t expect to add much capacity for the foreseeable future, so it felt like a natural time to step down.  

It’s been quite a ride over the last eleven years. Effective Ventures has grown to a size far beyond what I expected, and I’ve felt privileged to help it on that journey. I deeply respect the rest of the board, and the leadership teams at EV, and I’m glad they’re at the helm.

Some people have asked me what I’m currently working on, and what my plans are. This year my time has been spread across a number of different things, including fundraising, helping out other EA-adjacent public figures, supporting GPI, CEA and 80,000 Hours, writing additions to What We Owe The Future, and helping with the print textbook version of utilitarianism.net that’s coming out next year. It’s also personally been the toughest year of my life; my mental health has been at its worst in over a decade, and I’ve been trying to deal with that, too.

At the moment, I’m doing three main things:

- Some public engagement, in particular around the WWOTF paperback and foreign language book launches and at EAGxBerlin. This has been and will be lower-key than the media around WWOTF last year, and more focused on in-person events; I’m also more focused on fundraising than I was before. 

- Research into “trajectory changes”: that is, ways of increasing the wellbeing of future generations other than ‘standard’ existential risk mitigation strategies, especially on issues that arise even if we solve AI alignment, like digital sentience and the long reflection. I’m also doing some learning to try to get to grips with how to update properly on the latest developments in AI, in particular with respect to the probability of an intelligence explosion in the next decade, and to how hard we should expect AI alignment to be.

- Gathering information for what I should focus on next. In the medium term, I still plan to be a public proponent of EA-as-an-idea, which I think plays to my comparative advantage, and because I’m worried about people neglecting “EA qua EA”. If anything, all the crises faced by EA and by the world in the last year have reminded me of just how deeply I believe in EA as a project, and how the message of taking a thoughtful, humble, and scientific approach to doing good is more important than ever. The precise options I’m considering are still quite wide-ranging, including: a podcast and/or YouTube show and/or substack; a book on effective giving; a book on evidence-based living; or deeper research into the ethics and governance questions that arise even if we solve AI alignment. I hope to decide on that by the end of the year.

(My personal views only, and like Nick I've been recused from a lot of board work since November.)

Thank you, Nick, for all your work on the Boards over the last eleven years. You helped steward the organisations into existence, and were central to helping them flourish and grow. I’ve always been impressed by your work ethic, your willingness to listen and learn, and your ability to provide feedback that was incisive, helpful, and kind.

Because you’ve been less in the limelight than me or Toby, I think many people don’t know just how crucial a role you played in EA’s early days. Though you joined shortly after launch, given all your work on it I think you were essentially a third cofounder of Giving What We Can; you led its research for many years, and helped build vital bridges with GiveWell and later Open Philanthropy. I remember that when you launched Giving What We Can: Rutgers, you organised a talk with I think over 500 people. It must still be one of the most well-attended talks that we’ve ever had within EA, and helped the idea of local groups get off the ground.

The EA movement wouldn’t have been the same without your service. It’s been an honour to have worked with you.

Hey,

I’m really sorry to hear about this experience. I’ve also experienced what feels like social pressure to have particular beliefs (e.g. around non-causal decision theory, high AI x-risk estimates, other general pictures of the world), and it’s something I also don’t like about the movement. My biggest worry about my own beliefs is that I’d hold very different views if I’d found myself in a different social environment. It’s simply very hard to have a group of people who are trying both to figure out what’s correct and to change the world: from the perspective of someone who thinks the end of the world is imminent, someone who doesn’t agree is at best useless and at worst harmful (because they are promoting misinformation).

In local groups in particular, I can see how this issue can get aggravated: people want their local group to be successful, and it’s much easier to track success with a metric like “number of new AI safety researchers” than “number of people who have thought really deeply about the most pressing issues and have come to their own well-considered conclusions”. 

One thing I’ll say is that core researchers are often (but not always) much more uncertain and pluralist than it seems from “the vibe”. The second half of Holden Karnofsky’s recent 80k blog post is indicative. Open Phil splits their funding across quite a number of cause areas, and I expect that to continue. Most of the researchers at GPI are pretty sceptical of AI x-risk. Even among people who are really worried about TAI in the next decade, there’s normally significant support (whether driven by worldview diversification or just normal human psychology) for neartermist or other non-AI causes. That’s certainly true of me. I think longtermism is highly non-obvious, and focusing on near-term AI risk even more so; beyond that, I think a healthy EA movement should be highly intellectually diverse and exploratory. 

What should be done? I have a few thoughts, but my best guess is that, now that AI safety is big enough and getting so much attention, it should have its own movement, separate from EA. Currently, AI has an odd relationship to EA. Global health and development and farm animal welfare, and to some extent pandemic preparedness, had movements working on them independently of EA. In contrast, AI safety work currently overlaps much more heavily with the EA/rationalist community, because it’s more homegrown.

If AI had its own movement infrastructure, that would give EA more space to be its own thing. It could more easily be about the question “how can we do the most good?” and a portfolio of possible answers to that question, rather than one increasingly common answer — “AI”.

At the moment, I’m pretty worried that, on the current trajectory, AI safety will end up eating EA. Though I’m very worried about what the next 5-10 years will look like in AI, and though I think we should put significantly more resources into AI safety even than we have done, I still think that AI safety eating EA would be a major loss. EA qua EA, which can live and breathe on its own terms, still has huge amounts of value: if AI progress slows; if it gets so much attention that it’s no longer neglected; if it turns out the case for AI safety was wrong in important ways; and because there are other ways of adding value to the world, too. I think most people in EA, even people like Holden who are currently obsessed with near-term AI risk, would agree. 

This isn't answering the question you ask (sorry), but one possible response to this line of criticism is for some people within EA / longtermism to more clearly state what vision of the future they are aiming towards. Because this tends not to happen, critics can attribute to people particular visions that they don't actually hold. In particular, critics of WWOTF often thought that I was trying to push for some particular narrow vision of the future, whereas really the primary goal, in my mind at least, is to keep our options open as much as possible, and to make moral progress in order to figure out what sort of future we should try to create.

Here are a couple of suggestions for positive visions. These are what I'd answer if asked: "What vision of the future are you aiming towards?":

"Procedural visions"
(Name options: Viatopia - representing the idea of a waypoint, and of keeping multiple paths open - though it mixes Latin and Greek roots. Optiotopia, though it's a mouthful and also mixes Latin and Greek roots. Related ideas: existential security, the long reflection.)

These don't specify a vision of what we ultimately want to achieve. Instead they propose a waypoint that we'd want to reach, as a step on the path to a good future. That waypoint would involve: (i) ending all obvious grievous contemporary harms, like war, violence and unnecessary suffering; (ii) reducing existential risk down to a very low level; (iii) securing a deliberative process for humanity as a whole, so that we make sufficient moral progress before embarking on potentially-irreversible actions like space settlement.

The hope could be that almost everyone could agree on this as a desirable waypoint.

"Utopia for everyone"
(Name options: multitopia or pluritopia, but these mix Latin and Greek roots; polytopia, but this is the name of a computer game. Related idea: Paretopia.)

This vision is where a great diversity of different visions of the good are allowed to happen, and people have choice about what sort of society they want to live in. Environmentalists could preserve Earth's ecosystems; others can build off-world societies. Liberals and libertarians can create a society where everyone is empowered to act autonomously, pursuing their own goals; lovers of knowledge can build societies devoted to figuring out the deepest truths of the universe; philosophical hedonists can create societies devoted to joy, and so on.

The key insight, here, is that there's just a lot of available stuff in the future, and that scientific, social and moral progress will potentially enable us to produce great wealth with that stuff (if we don't destroy the world first, or suffer value lock-in). Plausibly, if we as a global society get our act together, the large majority of moral perspectives can get most of what they want. 

Like the procedural visions, spelling this vision out more could have great benefits today, via greater collaboration: if we could agree that this is what we'll aim for, at least in part, then we could reduce the chance of some person or group with some narrow view trying to grab power for themselves.

(I write a little bit about both of these ideas in a fictional short story, here.)

I'd welcome name ideas for these, especially the former. My best guesses so far are "viatopia" and "multitopia", but I'm not wedded to them and I haven't spent lots of time on naming. I don't think that the -topia suffix is strictly necessary.

This is a good point, and it's worth pointing out that increasing the average value of the future, $\bar{v}$, is always good, whereas increasing its duration, $\tau$, is only good if the future is of positive value. So risk aversion reduces the value of increasing $\tau$ relative to increasing $\bar{v}$, provided we put some probability on a bad future.

Agree this is worth pointing out! I've a draft paper that goes into some of this stuff in more detail, and I make this argument. 

Another potential argument for trying to improve $\bar{v}$ is that, plausibly at least, the value lost as a result of the gap between expected-$\bar{v}$ and best-possible-$\bar{v}$ is greater than the value lost as a result of the gap between expected-$\tau$ and best-possible-$\tau$. So in that sense the problem that expected-$\bar{v}$ is not as high as it could be is more "important" (in the ITN sense) than the problem that expected-$\tau$ is not as high as it could be.
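To make the risk-aversion point vivid, here's a toy calculation, with made-up numbers and an arbitrary concave utility function, just to illustrate the structure of the argument (a sketch only, not anything from the draft paper):

```latex
\documentclass{article}
\usepackage{amsmath,amssymb}
\begin{document}

% Toy numbers (purely illustrative): per unit of duration, the future is worth
% +1 with probability 0.9 and -1 with probability 0.1.
Suppose $V = \bar v \,\tau$, with $\bar v = +1$ with probability $0.9$ and
$\bar v = -1$ with probability $0.1$. Then
\[
  \mathbb{E}[V] \;=\; \tau\,\bigl(0.9 \cdot 1 + 0.1 \cdot (-1)\bigr) \;=\; 0.8\,\tau .
\]
Raising $\bar v$ by some $\delta > 0$ raises $V$ in every state of the world, so it
is good whatever our attitude to risk. Raising $\tau$ scales up both the good and
the bad outcomes: for a concave (risk-averse) utility function $u$ over $V$,
\[
  \mathbb{E}[u(V)] \;=\; 0.9\,u(\tau) + 0.1\,u(-\tau),
  \qquad
  \frac{d}{d\tau}\,\mathbb{E}[u(V)] \;=\; 0.9\,u'(\tau) - 0.1\,u'(-\tau),
\]
and concavity ($u'(-\tau) \ge u'(\tau)$) amplifies the negative term, so the more
risk-averse we are, the less valuable extra duration is relative to extra average
value.

\end{document}
```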
 

Existential risk, and an alternative framework


One common issue with “existential risk” is that it’s so easy to conflate it with “extinction risk”. It seems that even you end up falling into this use of language. You say: “if there were 20 percentage points of near-term existential risk (so an 80 percent chance of survival)”. But human extinction is not necessary for something to be an existential risk, so 20 percentage points of near-term existential risk doesn’t entail an 80 percent chance of survival. (Human extinction may not be sufficient for existential catastrophe either, depending on how one defines “humanity”.)

Relatedly, “existential risk” blurs together two quite different ways of affecting the future. In your model: $V = \bar{v} \times \tau$. (That is: the value of humanity’s future, $V$, is the average value of humanity’s future over time, $\bar{v}$, multiplied by the duration of humanity’s future, $\tau$.)

This naturally lends itself to the idea that there are two main ways of improving the future: increasing $\bar{v}$ and increasing $\tau$.

In What We Owe The Future I refer to the latter as “ensuring civilisational survival” and the former as “effecting a positive trajectory change”. (We’ll need to do a bit of syncing up on terminology.)

I think it’s important to keep these separate, because there are plausible views on which affecting one of these is much more important than affecting the other.

Some views on which increasing $\bar{v}$ is more important:

  • If the future is of zero or net-negative value
  • If large drops in future population size are not of enormous importance (e.g. the average view, variable-value views)
  • If nonhuman-originating civilisation would use the resources that we would use, and is similarly good

Some views on which increasing $\tau$ is more important:

  • If there’s a “low” upper bound on value, which we expect almost all future civilisations to meet
  • If you think that moral convergence, conditional on survival, is very likely

What’s more, changes to $\tau$ are plausibly binary, but changes to $\bar{v}$ are not. Plausibly, most probability mass is on $\tau$ being small (we go extinct in the next thousand years) or very large (we survive for billions of years or more). But, assuming for simplicity that there’s a “best possible” and “worst possible” future, $\bar{v}$ could take any value between 100% and -100%. So focusing only on “drastic” changes, as the language of “existential risk” does, makes sense for changes to $\tau$, but not for changes to $\bar{v}$.

Flow vs fixed resources

In footnote 14 you say: “It has also been suggested (Sandberg et al 2016, Ord 2021) that the ultimate physical limits may be set by a civilisation that expands to secure resources but doesn’t use them to create value until much later on, when the energy can be used more efficiently. If so, one could tweak the framework to model this not as a flow of intrinsic value over time, but a flow of new resources which can eventually be used to create value.”

This feels to me like it would really be changing the framework considerably, rather than just a “tweak”.

For example, consider a “speed up” with an endogenous end time. On the original model, this decreases total value (assuming the future is overall good). But if we’re talking about gaining a pot of fixed resources, speeding up progress forever doesn’t change total value.
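To spell out why, here's a minimal sketch, under one simple way of formalising a speed-up with an endogenous end time (this is my own gloss, not the paper's definition):

```latex
\documentclass{article}
\usepackage{amsmath}
\begin{document}

% Flow model: total value is the integral of the value flow $v(t)$ up to the end
% time $T$. Model a uniform speed-up by a factor $k > 1$ as reaching progress
% stage $kt$ at time $t$, with the end time moving endogenously from $T$ to $T/k$.
\[
  V_{\text{original}} = \int_0^{T} v(t)\,dt,
  \qquad
  V_{\text{sped-up}} = \int_0^{T/k} v(kt)\,dt
  = \frac{1}{k}\int_0^{T} v(s)\,ds
  = \frac{V_{\text{original}}}{k}.
\]
% So if the future is overall good ($V_{\text{original}} > 0$), the speed-up strictly
% decreases total value. By contrast, if value is ultimately produced from a fixed
% stock of secured resources, so that total value is roughly a function of that
% stock, then a permanent speed-up leaves total value unchanged: it changes when
% the resources are secured and used, not how many there are.

\end{document}
```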

Humanity

Like the other commenter says, I feel worried that v(.) refers to the value of “humanity”. For similar reasons, I feel worried that existential risk is defined in terms of humanity’s potential.

One issue is that it’s vague what counts as “humanity”. Homo sapiens count, but what about:

  • A species that Homo sapiens evolves into 
  • “Uploaded” humans
  • “Aligned” AI systems
  • Non-aligned AI systems that nonetheless produce morally valuable or disvaluable outcomes

I’m not sure where you draw the line, or if there is a principled place to draw the line.

A second issue is that “humanity” doesn’t include the value of:

  • Earth-originating but nonhuman civilisations, for example if Homo sapiens go extinct, but some other species later evolves that has technological capability.
  • Non-Earth-originating alien civilisation. 

And, depending on how “humanity” is defined, it may not include non-aligned AI systems that nonetheless produce morally valuable or disvaluable outcomes.

I tried to think about how to incorporate this into your model, but ultimately I think it’s hard without it becoming quite unintuitive.

And I think these adjustments are potentially non-trivial. I think one could reasonably hold, for example, that the probability of a technologically-capable species evolving, if Homo sapiens goes extinct, is 90%; that the probability of non-Earth-originating alien civilisations settling the solar systems that we would ultimately settle is also 90%; and that such civilisations would have similar value to human-originating civilisation.

(They also change how you should think about longterm impact. If alien civilisations will settle the Milky Way (etc) anyway, then preventing human extinction is actually about changing how interstellar resources are used, not whether they are used at all.)
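To give a sense of how big that adjustment could be, here's some back-of-the-envelope arithmetic using the numbers above (and assuming, just for the sake of the illustration, that the two events are independent and that the replacement civilisations would be exactly as valuable as a human-originating one):

```latex
\documentclass{article}
\usepackage{amsmath}
\begin{document}

% Assumed (illustrative) probabilities, conditional on human extinction:
%   P(another technologically-capable species evolves on Earth) = 0.9
%   P(non-Earth-originating aliens settle the resources we would have settled) = 0.9
% Assuming independence, the chance that the resources go entirely unused is
\[
  (1 - 0.9)\,(1 - 0.9) \;=\; 0.01 .
\]
% If those replacement civilisations would realise roughly the same value as a
% human-originating one, the expected value lost to human extinction is only about
% 1\% of what a no-replacement model would imply: the main effect of preventing
% extinction is on which civilisation uses the resources and how, rather than on
% whether they get used at all.

\end{document}
```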

And I think it means we miss out on some potentially important ways of improving the future. For example, consider scenarios where we fail on alignment. There is no “humanity”, but we can still make the future better or worse. A misaligned AI system that promotes suffering (or promotes something that involves a lot of suffering) is a lot worse than an AI system that promotes something valueless.  
