Summary
In this post I try to outline some intuitions about the variance of value in “good” futures. Discussions of longtermism consider various futures that could be large in value and that justify focusing on long-term effects and on reducing existential risk. It seemed intuitive to me that the difference in value between some of these futures (many orders of magnitude) should in some way affect our actions, perhaps by focusing us on making a very good future likely, as opposed to just making a good future more likely. Below I try to flesh out this argument and my intuitions. I lay out two chains of reasoning and try to justify each step. I then consider some comments and possible objections to this reasoning. I conclude that this reasoning may be flawed, and may not change which actions we should take, but could change the reason we should pursue those actions.
Acknowledgements
Thanks to Simon Marshall and Aimee Watts for proof-reading, and to Charlotte Siegmann and Toby Tremlett for reviewing an earlier draft. All mistakes are my own.
Two arguments
(A) Good v. Optimal Futures
The reasoning I’m considering is of the following structure:
- There is a lot of variety in the value of futures we consider “good”. Some futures, which I’ll call “optimal”, are several orders of magnitude better than other seemingly good futures.
- It is plausible that the action that has the highest expected value is what increases the chance of an optimal future, not what increases the chance of a good future.
- It is possible that the action that best increases the chance of an optimal future is different from the one that best increases the chance of a “good future”.
- Therefore we should choose the action that best increases the chance of an optimal future, even if it is not the action that best increases the chance of a “good future”.
Below I try to justify each of the above claims.
A1 - In The Case for Strong Longtermism, Greaves and MacAskill consider how humanity could continue to exist on Earth for a billion years at a population of 10 billion people per century, giving a possible future of 10^17, or 100 quadrillion, people. In Astronomical Waste, Nick Bostrom considers how the Virgo Supercluster could support around 10^23 biological humans per century. If this were to last for a billion years, there could be around 10^30 humans in the future. Bostrom also considers harnessing the power of all the stars in the supercluster to simulate human minds, possibly simulating 10^38 human lives a century. If this were to last for a billion years, there could be around 10^45 human minds in the future. These futures are all huge, but the differences between them are enormous. Under expected utility theory, even a 1 in a trillion chance of a future full of simulated minds is better than a certainty of a future full of biological humans across the galaxy.
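To make that last comparison concrete, here is a minimal Python sketch using the rough orders of magnitude above (the variable names are mine and the figures are illustrative, not precise estimates):

```python
# Rough orders of magnitude from A1 (illustrative only).
earth_future = 1e17   # 10 billion people per century on Earth for a billion years
bio_space    = 1e30   # biological humans across the supercluster for a billion years
sim_space    = 1e45   # simulated minds for a billion years

# Even a 1-in-a-trillion chance of the simulated-minds future has higher
# expected value than a certainty of the biological space-settlement future.
p_tiny = 1e-12
print(p_tiny * sim_space)                    # 1e+33
print(1.0 * bio_space)                       # 1e+30
print(p_tiny * sim_space > 1.0 * bio_space)  # True
```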
A2 - We can consider a very basic model of the future. Suppose there is some limited period of existential risk, after which humanity reaches existential security and pursues a “good future”. Let the probability of humanity reaching existential security be p, and suppose there are three possible future trajectories after existential security, with values V1, V2, V3 and probabilities p1, p2, p3 of happening respectively, conditional on humanity reaching existential security. Suppose V1 is the value of humanity continuing to live on Earth for the next 1 billion years with 10 billion people. Suppose V2 is the value of the galaxy filling with biological humans and remaining so for 1 billion years. Let V3 be the value of the galaxy filling with simulated minds and remaining so for 1 billion years.
Then using our numbers from above, V1 ≈ 10^17, V2 ≈ 10^30 and V3 ≈ 10^45. Suppose the time before reaching existential security is relatively short (e.g. a few centuries), so the expected value of the future comes mainly from the time after reaching existential security. Then the expected value of the future is EV = p(p1V1 + p2V2 + p3V3), so the marginal value of increasing p is p1V1 + p2V2 + p3V3, and the marginal value of increasing p3 (at the expense of the lower-value trajectories) is roughly p·V3. Putting in our numerical values, and given the size of V3 (and assuming p3 is not very small), we essentially have EV ≈ p·p3·V3, with marginal values of about p3·V3 for increasing p and p·V3 for increasing p3. So if you can increase p or p3 by some small amount, you should choose to increase the smaller of the two, since the gain from increasing one is proportional to the other. The question then is which is smaller, and how easy is it to affect p or p3? Suppose we can expend some amount of effort to work on reducing 1−p or 1−p3: reducing 1−p by Δp gains roughly Δp·p3·V3, and reducing 1−p3 by Δp3 gains roughly Δp3·p·V3. So what matters is whether Δp·p3 or Δp3·p is larger. Suppose humanity is reasonably likely to reach existential security, say p = 0.5, and that it is unlikely to go down the simulated minds route, say p3 = 0.01. Then increasing p has to be 50 times easier than increasing p3 for it to be worth focusing on p.
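Here is a minimal numerical sketch of this toy model (the split between p1 and p2 and the size of the probability shift are my own assumptions, chosen only to reproduce the p = 0.5, p3 = 0.01 example above):

```python
# Toy model from A2: EV = p * (p1*V1 + p2*V2 + p3*V3)
V1, V2, V3 = 1e17, 1e30, 1e45   # values of the three post-security trajectories
p = 0.5                         # probability of reaching existential security
p1, p2, p3 = 0.69, 0.30, 0.01   # trajectory probabilities given existential security

def expected_value(p, p1, p2, p3):
    return p * (p1 * V1 + p2 * V2 + p3 * V3)

baseline = expected_value(p, p1, p2, p3)
delta = 0.001  # a small shift in probability that some unit of effort could buy

# Gain from raising p versus raising p3 (moving probability mass from the
# low-value trajectory p1 into p3).
gain_p  = expected_value(p + delta, p1, p2, p3) - baseline
gain_p3 = expected_value(p, p1 - delta, p2, p3 + delta) - baseline

print(gain_p)            # ~ delta * p3 * V3 = 1e40
print(gain_p3)           # ~ delta * p  * V3 = 5e41
print(gain_p3 / gain_p)  # ~ 50, the "50 times easier" threshold from above
```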
In general, we find that if you think we’re likely to converge to the optimal future if we reach existential security, then you should try to reduce existential risk, and if you think existential risk is low but we’re unlikely to converge on the optimal future, you should try to make that convergence more likely.
A3 - Methods for increasing p include reducing AI risk, reducing biorisk and improving international cooperation. While these approaches may improve the overall chance of an optimal future, they do so by increasing p, not p3. p3 might be increased by promoting the valuation of simulated minds or by spreading certain values such as altruism or impartiality (it is possible that spreading such values could reduce existential risk as well). p3 could also be increased by increasing the chance of a “singleton” forming, such as a dominant AGI or world government, which would then directly pursue the optimal future. Moreover, one could try to create institutions that will persist into the future and be able to alter humanity’s path once it reaches existential security. Overall though, it does seem that increasing p3 is much harder than increasing p.
A4 - This point then follows from the previous ones. Note that even if it is better to increase p, the motivation for this is that it is the best way to make the optimal future most likely.
(B) Sensitivity to values
I believe a second point is how sensitive these optimal futures are to our values.
- Which good future you consider optimal is very sensitive to your values. People with similar but slightly different values will consider different futures optimal.
- Given A4, people with very similar but slightly different values will wish to pursue very different actions.
Below I try to justify each point.
B1 - The clearest difference is whether you value simulated minds or not. If you do, then a future of simulated minds is vastly better than anything else; if you don’t, then a future of simulated minds is worthless and a great loss of potential. You could also disagree on issues like the definition of pleasure or of a valuable life: if you think pleasure has a very large maximal limit, then the universes in which two slightly different definitions of pleasure get maximised might diverge considerably.
B2 - Given that, according to (A), the most impactful thing we can do is to increase the likelihood of the optimal future, and that this might best be done by making it more likely once humanity reaches existential security, what action you should take varies highly with your values. Moreover, increasing the chance of your optimal future decreases the chance of someone else's optimal future, so you could end up actively working against each other.
Objections and Comments
Value Convergence
The most obvious response seems to be that humanity will naturally converge towards the correct value system in the future, and so we don’t have to worry too much about shepherding humanity once it reaches existential security. Then in the calculation above, p3 would be very high, and therefore working on reducing existential risk would probably be more impactful. This seems possible to me, though I’m quite uncertain how to think about moral progress, and whether it is too naively optimistic to think humanity will naturally converge to the best possible future.
Moral Uncertainty
We should probably be quite uncertain about our values, and reluctant to commit to prioritising one specific future that is optimal for our best guess of values, but is likely not the optimal future. Then trying to reduce existential risk seems pretty robustly good as it increases the chance of an optimal future, even if you don’t know which future that is. Moreover, a community of people with similar but slightly different values who are working to reduce existential risk might almost be a case of moral trade, where instead of actively working against each other on trajectories after existential security, they work on the mutually beneficial existential risk reduction.
S-risks
A consideration I did not include earlier is s-risks, the risk of a very bad future containing large amounts of suffering. If you thought this was possible and that these bad futures could be very large in scale, then reducing x-risk seems like a worse approach, and you might want to focus on making the optimal future more likely after reaching existential security. If you thought the suffering of an s-risk future could dwarf any possible optimal future, then your considerations could be dominated by reducing the chance of this s-risk, which might best be done by reducing its likelihood once humanity reaches existential security.
Non-extinction x-risks
I’ve struggled with how to fit non-extinction x-risks into the above model. It seems that a non-optimal future might just be considered a dystopia or a flawed realisation, in which case this model is really just comparing different x-risk interventions, and I may not be applying the term “existential security” correctly. I am hopeful though that the options I’m comparing are still quite different and worth comparing, even if both fall under a broader definition of “x-risk”.
Conclusion
Overall I’m quite uncertain about how these thoughts and considerations fit together, and it seems quite possible they don’t change our opinions of what we should be doing. But it seems possible that they could, and we should be aware of that possibility. If we do work on reducing existential risk, it seems important to realise that we’re doing it because it’s currently the best way to increase the chance of an optimal future, not because we want to increase the chance of a good future, since reducing existential risk could stop being the best way to increase the chance of an optimal future.
I’d really appreciate any comments and feedback.
I agree with the general point that:
E[~optimal future] - E[good future] >> E[good future] - E[meh/no future]
It's not totally clear to me how much can be done to optimize the chances of a ~optimal future (as opposed to decreasing X-risk, where there's probably a lot more that can be done), but I do have an intuition that some good work on the issue can probably be done. This does seem like an under-explored area, and I would personally like to see more research in it.
I'd also like to signal-boost this relevant paper by Bostrom and Shulman:
https://nickbostrom.com/papers/digital-minds.pdf
which proposes that an ~optimal compromise (along multiple axes) between human interests and totalist moral stances could be achieved by, for instance, filling 99.99% of the universe with hedonium and leaving the rest to (post-)humans.
Thanks for sharing this paper, I had not heard of it before and it sounds really interesting.
Thanks for sharing Rob! Here's a summary of my comments from our conversation:
Replace "optimal" with "great"
I think the terminology good vs "optimal" is a little confusing in that the probability of obtaining a future which is mathematically optimal seems to me to be zero. I'd suggest "great" instead.
Good, great, really great, really really great ...
(Using great rather than "optimal") I'd imagine that some great futures are also several orders of magnitude better than other seemingly great futures. I think here we'd really like to say something about the rates of decay.
Distribution over future value
Let X be the value of the future, which we suppose has some distribution pX(x). It's my belief that X would be an essentially continuous variable, but in this post you choose to distinguish between two types of future ("good" and "optimal") rather than consider a continuous distribution.
This choice could be justified if you believe a priori that pX is bimodal, or perhaps multi-modal. If this reflects your view, I think it would be good to make your reasoning on this more explicit.
One way to model the value of the future is
X = Resources × Values × Efficiency,
where Resources refers to the physical, spatial and temporal resources that humanity has access to, Values are our ethical beliefs about how best to use those resources, and the final Efficiency term reflects our ability to use those resources in promoting our values. In A2 your basic model of the future is suggestive of a multi-modal distribution over future Resources. It does seem reasonable to me that this would be the case. I'm quite uncertain about the distributions of the other two terms, which appear less physically constrained.
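As a rough illustration, here is a minimal Monte Carlo sketch of this decomposition; the particular distributions are placeholder assumptions, chosen only to show how a bimodal Resources term could induce a multi-modal distribution over X:

```python
import random

def sample_future_value():
    # Bimodal Resources: Earth-bound scale vs space-settlement scale (as in A2)
    resources = random.choice([1e17, 1e30])
    # Placeholder distributions for Values and Efficiency, as fractions of the
    # attainable maximum; these are far less physically constrained.
    values = random.uniform(0.0, 1.0)
    efficiency = random.uniform(0.1, 1.0)
    return resources * values * efficiency

samples = [sample_future_value() for _ in range(10_000)]
# The samples fall into two widely separated clusters, one per Resources mode.
print(sum(x > 1e20 for x in samples) / len(samples))  # roughly 0.5
```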
Thanks for your comment athowes. I appreciate your point that I could have done more in the post to justify this "binary" of good and optimal.
Though the simulated minds scenario I described seems at first to be pretty much optimal, it could be much larger if you thought it would last for many more years. Given large enough uncertainty about future technology, maybe seeking to identify the optimal future is impossible.
I think your resources, values and efficiency model is really interesting. My intuition is that values is the limiting factor. I can believe there are pretty strong forces that mean humanity will eventually end up optimising resources and efficiency, but I'm less confident that values will converge to the best ones over time. This probably depends on whether you think a singleton will form at some point, in which case it feels like the limit is how good the values of the singleton are.
I think he's saying "optimal future = best possible future", which necessarily has a non-zero probability.
Events which are possible may still have zero probability, see "Almost never" on this Wikipedia page. That being said I think I still might object even if it was ϵ-optimal (within some small number ϵ>0 of achieving the mathematically optimal future) unless this could be meaningfully justified somehow.
In terms of cardinal utility? I think drawing any line in the sand has problems when things are continuous because it falls right into a slippery slope (if ϵ doesn't make a real difference, what about drawing the line at 2ϵ, and then what about 3ϵ?).
But I think of our actions as discrete. Even if we design a system with some continuous parameter, the actual implementation of that system is going to be in discrete human actions. So I don't think we can get arbitrarily small differences in utility. Then maximalism (i.e. going for only ideal outcomes) makes sense when it comes to designing long-lasting institutions, since the small (but non-infinitesimal) differences add up across many people and over a long time.
Thanks for writing this Robert, I found it interesting! I just have one thought about a potential conclusion that may follow from your model.
I think you correctly note that friendly super-intelligent AI may be amongst the best ways to increase p3 (probability of optimal future). I have a feeling that friendly super-intelligent AI may also be one of the best ways to increase p (probability of achieving existential security). I guess I just don't trust us humans to achieve existential security ourselves.
If that's true I guess your model provides robustness to the claim that we should ensure the development of AI goes well, and not just because if it doesn't it might pose an x-risk. In essence there are three reasons why a focus on AI may be important:
- Misaligned AI could itself pose an existential risk (reducing p).
- Friendly super-intelligent AI may be one of the best ways to achieve existential security from other risks (increasing p).
- Friendly super-intelligent AI may be one of the best ways to guide humanity towards an optimal future once existential security is reached (increasing p3).
I'd be interested to hear others thoughts on potential conclusions that follow from Robert's model.
Thanks for your comment Jack, that's a really great point. I suppose that we would seek to influence AI slightly differently for each reason:
e.g. you could reduce the chance of AI risk by stopping all AI development, but you would then lose the other two benefits; or you could create a practically useful AI but not one that would guide humanity towards an optimal future. That being said, I reckon in practice a lot of work to improve the development of AI would hit all three. Though maybe if you view one reason as much more important than the others, you might focus on a specific type of AI work.
That makes sense. I'm no expert in AI but I would think:
Would be interesting to know if anyone thinks I'm wrong on any one of these points.