The idea of the long reflection is that of a long period—perhaps tens of thousands of years—during which human civilisation, perhaps with the aid of improved cognitive ability, dedicates itself to working out what is ultimately of value (INFORMAL: MacAskill 2018; Lewis 2018). It may be argued that such a period would be warranted before deciding whether to undertake an irreversible decision of immense importance, such as whether to attempt spreading to the stars. Do we find ourselves, or are we likely to find ourselves, in a situation where a ‘long reflection’ would in fact be warranted? If so, how should it be implemented?
As best I can quickly tell, there seem to have been very few publicly accessible sources on the long reflection (at least before Ord’s book). So I thought I’d make a quite unambitious post that just collects all the relevant quotes I’ve found after looking through all the Google hits for “the "long reflection" macaskill” and through all the posts on the EA Forum and LessWrong that came up when I searched for "long reflection". At the end, I also list some other related work and concepts.
Please comment to let me know if you’re aware of any other sources which I haven’t mentioned here.
80,000 Hours interview with MacAskill
Quote from 80,000 Hours’ summary
Throughout history we’ve consistently believed, as common sense, truly horrifying things by today’s standards. According to University of Oxford Professor Will MacAskill, it’s extremely likely that we’re in the same boat today. If we accept that we’re probably making major moral errors, how should we proceed?
If our morality is tied to common sense intuitions, we’re probably just preserving these biases and moral errors. Instead we need to develop a moral view that criticises common sense intuitions, and gives us a chance to move beyond them. And if humanity is going to spread to the stars it could be worth dedicating hundreds or thousands of years to moral reflection, lest we spread our errors far and wide.
Will is an Associate Professor in Philosophy at Oxford University, author of Doing Good Better, and one of the co-founders of the effective altruism community. In this interview we discuss a wide range of topics:
- How would we go about a ‘long reflection’ to fix our moral errors?
Quotes from the interview itself
Will MacAskill: If you really appreciate moral uncertainty, and especially if you look back through the history of human progress, we have just believed so many morally abominable things and been, in fact, very confident in them. [...]
Even for people who really dedicated their lives to trying to work out the moral truths. Aristotle, for example, was incredibly morally committed, incredibly smart, way ahead of his time on many issues, but just thought that slavery was a pre-condition for some people having good things in life. Therefore, it was justified on those grounds. A view that we’d now think of as completely abominable.
That makes us think that, wow, we probably have mistakes similar to that. Really deep mistakes that future generations will look back and think, “This is just a moral travesty that people believed it.” That means, I think, we should place a lot of weight on moral option value and gaining moral information. That means just doing further work in terms of figuring out what’s morally the case. Doing research in moral philosophy, and so on. Studying it for yourself.
Secondly, into the future, ensuring that we keep our options open. I think this provides one additional argument for ensuring that the human race doesn’t go extinct for the next few centuries. It also provides an argument for the sort of instrumental state that we should be trying to get to as a society, which I call the long reflection. We can talk about that more.
Robert Wiblin: Humanity should thrive and grow, and then just turn over entire planets to academic philosophy. Is that the view? I think I’m charitable there.
Will MacAskill: Yeah, obviously the conclusion of a moral philosopher saying, “Moral philosophy is incredibly important” might seem very self-serving, but I think it is straightforwardly the implication you get if you at least endorse the premises of taking moral uncertainty very seriously, and so on. If you think we can at least make some progress on moral philosophy. If you reject that view you have to kind of reject one of the underlying premises.
Robert Wiblin: Before, you mentioned that if humanity doesn’t go extinct in the future, there might be a lot of time and a lot of people and very educated people who might be able to do a lot more research on this topic and figure out what’s valuable. That was a long reflection. What do you think that would actually look like in practice, ideally?
Will MacAskill: Yeah. The key idea is just, different people have different sets of values. They might have very different views for what does an optimal future look like. What we really want ideally is a convergent goal between different sorts of values so that we can all say, “Look, this is the thing that we’re all getting behind that we’re trying to ensure that humanity…” Kind of like this is the purpose of civilization. The issue, if you think about purpose of civilization, is just so much disagreement. Maybe there’s something we can aim for that all sorts of different value systems will agree is good. Then, that means we can really get coordination in aiming for that.
I think there is an answer. I call it the long reflection, which is you get to a state where existential risks or extinction risks have been reduced to basically zero. It’s also a position of far greater technological power than we have now, such that we have basically vast intelligence compared to what we have now, amazing empirical understanding of the world, and secondly tens of thousands of years to not really do anything with respect to moving to the stars or really trying to actually build civilization in one particular way, but instead just to engage in this research project of what actually is of value. What actually is the meaning of life? And have, maybe it’s 10 billion people, debating and working on these issues for 10,000 years because the importance is just so great. Humanity, or post-humanity, may be around for billions of years. In which case spending a mere 10,000 is actually absolutely nothing.
In just the same way as if you think as an individual, how much time should you reflect in your own values before choosing your career and committing to one particular path.
Robert Wiblin: Probably at least a few minutes. At least .1% of the whole time.
Will MacAskill: At least a few minutes. Exactly. When you’re thinking about the vastness of the potential future of civilization, the equivalent of just a few minutes is tens of thousands of years.
Then, there’s questions about how exactly do you structure that. I think it would be great if there was more work done really fleshing that out. Perhaps that’s something you’ll have time to do in the near future. One thing you want to do is have as little locked in as possible. So, you want to be very open both on… You don’t want to commit to one particular moral methodology. You just want to commit to things that seem extremely good for basically whatever moral view you might think ends up as correct or what moral epistemology might be correct.
Just people having a higher IQ but everything else being equal, that just seems strictly good. People having greater empirical understanding just seems strictly good. People having a better ability to empathize. That all seems extremely good. People having more time. Have cooperation seems extremely good. Then I think, yeah, like you say, many different people can get behind this one vision for what we want humanity to actually do. That’s potentially exciting because we can coordinate.
It might be that one of the conclusions we come to takes moral uncertainty into account. We might say, actually, there’s some fundamental things that we just can’t ultimately resolve and so we want to do a compromise between them. Maybe that means that for civilization, part of civilization’s devoted to common sense, thick values of pursuit of art, and flourishing, and so on, whereas large parts of the rest of civilization are devoted to other values like pure bliss, blissful state. You can imagine compromise scenarios there. It’s just large amounts of civilization… The universe is a big place.
Quotes from an AI Alignment Podcast interview with MacAskill
Will MacAskill: In terms of answering this alignment problem, the deep one of just where ought societies to be going [working out what’s actually right and what’s actually wrong and what ought we to be doing], I think the key thing is to punt it. The key thing is to get us to a position where we can think about and reflect on this question, and really for a very long time, so I call this the long reflection. Perhaps it’s a period of a million years or something. We’ve got a lot of time on our hands. There’s really not the kind of scarce commodity, so there are various stages to get into that state.
The first is to reduce extinction risks down basically to zero, put us in a position of kind of existential security. The second then is to start developing a society where we can reflect as much as possible and keep as many options open as possible.
Something that wouldn’t be keeping a lot of options open would be, say we’ve solved what I call the control problem, we’ve got these kind of lapdog AIs that are running the economy for us, and we just say, “Well, these are so smart, what we’re gonna do is just tell it, ‘Figure out what’s right and then do that.'” That would really not be keeping our options open. Even though I’m sympathetic to moral realism and so on, I think that would be quite a reckless thing to do.
Instead, what we want to have is something kind of … We’ve gotten to this position of real security. Maybe also along the way, we’ve fixed the various particularly bad problems of the present, poverty and so on, and now what we want to do is just keep our options open as much as possible and then kind of gradually work on improving our moral understanding where if that’s supplemented by AI system …
I think there’s tons of work that I’d love to see developing how this would actually work, but I think the best approach would be to get the artificially intelligent agents to be just doing moral philosophy, giving us arguments, perhaps creating new moral experiences that it thinks can be informative and so on, but letting the actual decision making or judgments about what is right and wrong be left up to us. Or at least have some kind of gradiated thing where we gradually transition the decision making more and more from human agents to artificial agents, and maybe that’s over a very long time period.
What I kind of think of as the control problem in that second level alignment problem, those are issues you face when you’re just addressing the question of, “Okay. Well, we’re now gonna have an AI run economy,” but you’re not yet needing to address the question of what’s actually right or wrong. And then my main thing there is just we should get ourselves into a position where we can take as long as we need to answer that question and have as many options open as possible.
Lucas: I guess here given moral uncertainty and other issues, we would also want to factor in issues with astronomical waste into how long we should wait?
Will: Yeah. That’s definitely informing my view, where it’s at least plausible that morality has an aggregative component, and if so, then the sheer vastness of the future may, because we’ve got half a billion to a billion years left on Earth, a hundred trillion years before the stars burn out, and then … I always forget these numbers, but I think like a hundred billion stars in the Milky Way, ten trillion galaxies.
With just vast resources at our disposal, the future could be astronomically good. It could also be astronomically bad. What we want to ensure is that we get to the good outcome, and given the time scales involved, even what seems like an incredibly long delay, like a million years, is actually just very little time indeed.
Lucas: In half a second I want to jump into whether or not this is actually likely to happen given race dynamics and that human beings are kind of crazy. The sort of timeline here is that we’re solving the technical control problem up into and on our way to sort of AGI and what might be superintelligence, and then we are also sort of idealizing everyone’s values and lives in a way such that they have more information and they can think more and have more free time and become idealized versions of themselves, given constraints within issues of values canceling each other out and things that we might end up just deeming to be impermissible.
After that is where this period of long reflection takes place, and sort of the dynamics and mechanics of that are seeming open questions. It seems that first comes computer science and global governance and coordination and strategy issues, and then comes long time of philosophy.
Will: Yeah, then comes the million years of philosophy, so I guess not very surprising a philosopher would suggest this. Then the dynamics of the setup is an interesting question, and a super important one.
One thing you could do is just say, “Well, we’ve got ten billion people alive today, let’s say. We’re gonna divide the universe into ten billionths, so maybe that’s a thousand galaxies each or something.” And then you can trade after that point. I think that would get a pretty good outcome. There’s questions of whether you can enforce it or not into the future. There’s some arguments that you can. But maybe that’s not the optimal process, because especially if you think that “Wow! Maybe there’s actually some answer, something that is correct,” well, maybe a lot of people miss that.
I actually think if we did that and if there is some correct moral view, then I would hope that incredibly well informed people who have this vast amount of time, and perhaps intellectually augmented people and so on who have this vast amount of time to reflect would converge on that answer, and if they didn’t, then that would make me more suspicious of the idea that maybe there is a real fact of the matter. But it’s still the early days we’d really want to think a lot about what goes into the setup of that kind of long reflection.
[The discussion from that point to "If it’s the case that there is a right answer." is also very relevant.]
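As a rough sanity check on the figures MacAskill mentions in the quotes above, here is a minimal Python sketch. The constants are his spoken, order-of-magnitude estimates (he notes that he always forgets these numbers), and the function names are my own and purely illustrative, so treat the results as illustrations rather than precise claims.

```python
# Rough, spoken estimates taken from the interview quotes above.
# All names here are hypothetical and chosen only for this illustration.
GALAXIES = 10 * 10**12            # roughly ten trillion galaxies
PEOPLE = 10 * 10**9               # roughly ten billion people alive today
REFLECTION_YEARS = 10**6          # a long reflection of about a million years
STELLAR_ERA_YEARS = 100 * 10**12  # about a hundred trillion years until the stars burn out


def galaxies_per_person(galaxies: int, people: int) -> int:
    """Equal split of galaxies across people, using integer division."""
    return galaxies // people


def reflection_fraction(reflection_years: int, horizon_years: int) -> float:
    """Share of the available time horizon spent on reflection."""
    return reflection_years / horizon_years


if __name__ == "__main__":
    print(galaxies_per_person(GALAXIES, PEOPLE))
    print(reflection_fraction(REFLECTION_YEARS, STELLAR_ERA_YEARS))
```

On those figures, an equal split comes to about a thousand galaxies per person, which matches the quote, and a million-year reflection would take up roughly one hundred-millionth of the time before the stars burn out.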
Cause prioritization for downside-focused value systems by Lukas Gloor
Quote from the article
I’m using the term downside-focused to refer to value systems that in practice (given what we know about the world) primarily recommend working on interventions that make bad things less likely. [...]
By contrast, other moral views place great importance on the potential upsides of very good futures [...] I will call these views upside-focused.
Some people have argued that even (very) small credences in upside-focused views [which roughly means moral views which place great importance on the potential upsides of very good futures], such as 1-20% for instance, would in itself already speak in favor of making extinction risk reduction a top priority because making sure there will still be decision-makers in the future provides high option value. I think this gives by far too much weight to the argument from option value. Option value does play a role, but not nearly as strong a role as it is sometimes made out to be. To elaborate, let’s look at the argument in more detail: The naive argument from option value says, roughly, that our descendants will be in a much better position to decide than we are, and if suffering-focused ethics or some other downside-focused view is indeed the outcome of their moral deliberations, they can then decide to not colonize space, or only do so in an extremely careful and controlled way. If this picture is correct, there is almost nothing to lose and a lot to gain from making sure that our descendants get to decide how to proceed.
I think this argument to a large extent misses the point, but seeing that even some well-informed effective altruists seem to believe that it is very strong led me to realize that I should write a post explaining the landscape of cause prioritization for downside-focused value systems. The problem with the naive argument from option value is that the decision algorithm that is implicitly being recommended in the argument, namely focusing on extinction risk reduction and leaving moral philosophy (and s-risk reduction in case the outcome is a downside-focused morality) to future generations, makes sure that people follow the implications of downside-focused morality in precisely the one instance where it is least needed, and never otherwise. If the future is going to be controlled by philosophically sophisticated altruists who are also modest and willing to change course given new insights, then most bad futures will already have been averted in that scenario. An outcome where we get long and careful reflection without downsides is far from the only possible outcome. In fact, it does not even seem to me to be the most likely outcome (although others may disagree). No one is most worried about a scenario where epistemically careful thinkers with their heart in the right place control the future; the discussion is instead about whether the probability that things will accidentally go off the rails warrants extra-careful attention. (And it is not as though it looks like we are particularly on the rails currently either.) Reducing non-AI extinction risk does not preserve much option value for downside-focused value systems because most of the expected future suffering probably comes not from scenarios where people deliberately implement a solution they think is best after years of careful reflection, but instead from cases where things unexpectedly pass a point of no return and compassionate forces do not get to have control over the future.
Downside risks by action likely loom larger than downside risks by omission, and we are plausibly in a better position to reduce the most pressing downside risks now than later. (In part because “later” may be too late.)
This suggests that if one is uncertain between upside- and downside-focused views, as opposed to being uncertain between all kinds of things except downside-focused views, the argument from option value is much weaker than it is often made out to be. Having said that, non-naively, option value still does upshift the importance of reducing extinction risks quite a bit – just not by an overwhelming degree. In particular, arguments for the importance of option value that do carry force are for instance:
- There is still some downside risk to reduce after long reflection
- Our descendants will know more about the world, and crucial considerations in e.g. infinite ethics or anthropics could change the way we think about downside risks (in that we might for instance realize that downside risks by omission loom larger than we thought)
- One’s adoption of (e.g.) upside-focused views after long reflection may correlate favorably with the expected amount of value or disvalue in the future (meaning: conditional on many people eventually adopting upside-focused views, the future is more valuable according to upside-focused views than it appears during an earlier state of uncertainty)
The discussion about the benefits from option value is interesting and important, and a lot more could be said on both sides. I think it is safe to say that the non-naive case for option value is not strong enough to make extinction risk reduction a top priority given only small credences in upside-focused views, but it does start to become a highly relevant consideration once the credences become reasonably large. Having said that, one can also make a case that improving the quality of the future (more happiness/value and less suffering/disvalue) conditional on humanity not going extinct is probably going to be at least as important for upside-focused views and is more robust under population ethical uncertainty – which speaks particularly in favor of highly prioritizing existential risk reduction through AI policy and AI alignment.
Much of the rest of that article is also somewhat relevant to the concept of the long reflection.
From memory, I think somewhat similar points are made in the interesting post The expected value of extinction risk reduction is positive, though that post doesn’t use the term “long reflection”.
Other places where the term was used in a relevant way
These are sources that explicitly refer to the concept of the long reflection, but which essentially just repeat parts of what the above quotes already say:
- The Forethought Foundation's list of research areas
  - This list includes a modified version of the Global Priorities Institute paragraph quoted at the start of this post. Forethought is affiliated with the Global Priorities Institute, and Will MacAskill is its Director.
- The Argument from Philosophical Difficulty
- Does climate change deserve more attention within EA?
- Winter Workshop 2019 Transcript: Whole Brain Emulation & AI Safety
These are sources which may say something new about the concept, but which I haven’t read properly, so I don’t want to risk misleadingly pulling quotes from them out of context:
- Artificial Intelligence, Values, and Alignment
- Deliberation as a method to find the "actual preferences" of humans
- Earning to give or waiting to give
Some other somewhat relevant concepts
- Bostrom’s concept of technological maturity: “the attainment of capabilities affording a level of economic productivity and control over nature close to the maximum that could feasibly be achieved.”
- “Stably good futures”: “those where society has achieved enough wisdom and coordination to guarantee the future against existential risks and other dystopian outcomes, perhaps with the aid of Friendly AI (FAI).”
  - The post contrasts this against “Stably bad futures (‘bad outcomes’)[, which] are those where existential catastrophe has occurred.”
- Option value
  - The relevance of the idea of option value for the topic of existential risk reduction, and arguably for the long reflection, is discussed in the article by Lukas Gloor quoted from above, and in The expected value of extinction risk reduction is positive (though that post doesn’t use the term “long reflection”)
I hope you’ve found this post useful. Hopefully Toby Ord’s book and/or Will MacAskill’s book will provide a more comprehensive, detailed discussion of the concept, in which case this post can serve just as a record of how the concept was discussed in its early days. I’d also be interested to see EA Forum users writing up their own fleshed out versions of, critiques of, or thoughts on the long reflection, either as comments here or as their own posts.
And as I said earlier, please comment to let me know if you’re aware of any other relevant sources which I haven’t mentioned here.