If I want to prove that technological progress generally correlates with methods that involve more suffering, yes! Agreed.
But while the post suggests that this is a possibility, its main point is that suffering itself is not inefficient, such that there is no reason to expect progress and methods that involve less suffering to correlate by default (a much weaker claim).
This makes me realize that the crux is perhaps the part below, more than the claim we discuss above.
...
While I tentatively think the “the most efficient solutions to problems don...
I do not mean to argue that the future will be net negative. (I even make this disclaimer twice in the post, haha.) :)
I simply argue that the "convergence between efficiency and methods that involve less suffering" argument in favor of assuming it'll be positive is unsupported.
There are many other arguments/considerations to take into account to assess the sign of the future.
Thanks!
Are you thinking about this primarily in terms of actions that autonomous advanced AI systems will take for the sake of optimisation?
Hmm... not sure. I feel like my claims are very weak and true even in future worlds without autonomous advanced AIs.
"One large driver of humanity's moral circle expansion/moral improvement has been technological progress which has reduced resource competition and allowed groups to expand concern for others' suffering without undermining themselves".
Agreed, but this is more similar to argument (A) fleshed out in this foo...
Thanks Vasco! Perhaps a nitpick, but suffering still doesn't seem to be the limiting factor per se here. If farmed animals were philosophical zombies (i.e., were not sentient but still had the exact same needs), that wouldn't change the fact that one needs to keep them in conditions that are good enough to make a profit from them. The limiting factor is their physical needs, not their suffering itself. Do you agree?
I think the distinction is important because it suggests that suffering itself appears as a limiting factor only insofar as it is strong evidence of physical needs that are not met. And while both strongly correlate in the present, I argue that we should expect this to change.
Interesting, thanks Ben! I definitely agree that this is the crux.
I'm sympathetic to the claim that "this algorithm would be less efficient than quicksort" and that this claim is generalizable.[1] However, if true, I think it only implies that suffering is -- by default -- inefficient as a motivation for an algorithm.
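To make the claim I'm granting concrete, here is a minimal sketch (purely illustrative; the "aversive signal" framing and all names are mine, not from the thread) of why an algorithm that acts only to quench a penalty signal is far less efficient than quicksort:

```python
import random

# Toy illustration (hypothetical): a sort driven only by an "aversive
# signal" emitted by misordered pairs, versus quicksort-class sorting.

def aversive_signal(xs):
    # Number of adjacent out-of-order pairs (the "penalty" to minimize).
    return sum(a > b for a, b in zip(xs, xs[1:]))

def penalty_driven_sort(xs):
    # Blindly reshuffle until the signal vanishes: expected O(n * n!) work,
    # versus O(n log n) comparisons for quicksort.
    while aversive_signal(xs) > 0:
        random.shuffle(xs)
    return xs

print(penalty_driven_sort([3, 1, 2]))
print(sorted([3, 1, 2]))  # built-in sort, quicksort-class efficiency
```

The toy contrast only tracks computational cost of the mechanism; nothing in it bears on suffering arising as a by-product, which is the distinction drawn above.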
Right after making my crux claim, I reference some of Tobias Baumann's (2022a, 2022b) work which gives some examples of how significant amounts of suffering may be instrumentally useful/required in cases such as scientific exp...
Thanks, Maxime! This is indeed a relevant consideration I thought a tiny bit about, and Michael St. Jules also brought that up in a comment on my draft.
First of all, it is important to note that UCC affects the neglectedness -- and potentially also the probability -- of "late s-risks" only (i.e., those that happen far enough in the future for the UCC selection to actually have time to occur). So let's consider only these late s-risks.
We might want to differentiate between three different cases:
1. Extreme UCC (where suffering is not just ignored b...
Thanks for the comment!
Right now, in rich countries, we seem to live in an unusual period Robin Hanson (2009) calls "the Dream Time". You can survive valuing pretty much whatever you want, which is why there isn't much selection pressure on values. This likely won't go on forever, especially if Humanity starts colonizing space.
(Re religion. This is anecdotal, but since you brought up this example: in the past, I think religious people would have been much less successful at spreading their values if they were more concerned about the suffering of the peop...
Thanks Will! :)
I think I haven't really thought about this possibility.
I know nothing about how things like false vacuum decay work (thankfully, I guess), about how tractable triggering it is, or about how the minds of the agents trying to trigger it would operate. And my immediate impression is that these things matter a lot to whether my responses to the first two "obvious objections" sort of apply here as well, and to whether "decay-conducive values" might be competitive.
However, I think we can at least confidently say that -- at least in the intra-civ s...
Thanks for giving arguments pointing the other way! I'm not sure #1 is relevant to our context here, but #2 is definitely worth considering. In the second post of the present sequence, I argue that something like #2 probably doesn't pan out, and we discuss an interesting counter-argument in this comment thread.
Thanks Miranda! :)
I personally think the strongest argument for reducing malevolence is its relevance for s-risks (see section Robustness: Highly beneficial even if we fail at alignment), since I believe s-risks are much more neglected than they should be.
And the strongest counter-considerations for me would be...
Right, so assuming no early value lock-in and that the values of the AGI are (at least somewhat) controlled/influenced by its creators, I imagine these creators having values that are grabby to varying extents, with these values competing against one another in the big tournament that is cultural evolution.
For simplicity, say there are only two types of creators: the pure grabbers (who value grabbing (quasi-)intrinsically) and the safe grabbers (who are in favor of grabbing only if it is done in a "safe" way, whatever that means).
Since we're assuming there...
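As a toy illustration of the selection dynamic I have in mind (the fitness numbers are made up; this is a sketch, not a model of the actual competition):

```python
# Replicator-style sketch (hypothetical growth rates): if pure grabbers
# convert resources into influence slightly faster than safe grabbers,
# their share of the value "tournament" grows over generations.
pure, safe = 0.5, 0.5                  # initial population shares
GROWTH_PURE, GROWTH_SAFE = 1.05, 1.00  # assumed per-generation fitness

for _ in range(200):
    pure, safe = pure * GROWTH_PURE, safe * GROWTH_SAFE
    total = pure + safe                # renormalize to shares
    pure, safe = pure / total, safe / total

print(f"after 200 generations: pure={pure:.3f}, safe={safe:.3f}")
```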
Thanks a lot for this comment! I linked to it in a footnote. I really like this breakdown of different types of relevant evolutionary dynamics. :)
Thanks for the comment! :) You're assuming that the AGI's values will be pretty much locked-in forever once it is deployed such that the evolution of values will stop, right? Assuming this, I agree. But I can also imagine worlds where the AGI is made very corrigible (such that the overseers stay in control of the AGI's values) and where intra-civ value evolution continues/accelerates. I'd be curious if you see reasons to think these worlds are unlikely.
If you had to remake this 3D sim of the expansion of grabby aliens based on your beliefs, what would look different, exactly? (Sorry, I know you already answer this indirectly throughout the post, at least partially.)
Do you have any reading to suggest on that topic? I'd be curious to understand that position more :)
Insightful! Thanks for taking the time to write these.
> failing to act in perfect accord with the moral truth does not mean you're not influenced by it at all. Humans fail your conditions 4-7 and yet are occasionally influenced by moral facts in ways that matter.
Agreed, and I didn't mean to argue against that, so thanks for clarifying! Note, however, that the more you expect the moral truth to be fragile/complex, the further from it you should expect agents' actions to be.
> ...you expect intense selection within civilizations, such that their members behave so as to...
Very interesting, Wei! Thanks a lot for the comment and the links.
TL;DR of my response: Your argument assumes that the first two conditions I list are met by default, which I think is a strong assumption (Part 1). Assuming that is the case, however, your point suggests there might be a selection effect favoring agents that act in accordance with the moral truth, which might be stronger than the selection effect I depict for values that are more expansion-conducive than the moral truth. This is something I haven't seriously considered and this made me...
> Unfortunately we are unable to sponsor visas, so applicants must be eligible to work in the US.
Isn't it possible to simply contract (rather than employ) those who have or can get an ESTA, such that there's no need for a visa?
As far as I know, there are no estimates (at least not public ones). But as Stan pointed out, Tobias Baumann has raised some very relevant considerations in different posts/podcasts.
Fwiw, researchers at the Center on Long-Term Risk think AGI conflict is the most concerning s-risk (see Clifton 2019), although it may be hard to comprehend all the details of why they think that if you just read their posts and don't talk to them.
Thanks Oscar!
> predicting future (hopefully wiser and better-informed) values for moral antirealists
Any reason to believe moral realists would be less interested in this empirical work? You seem to assume the goal is to update our values based on those of future people. While this can be a motivation (it is one of Danaher's (2021)), we might also worry -- independently from whether we are moral realists or antirealists -- that the expected future evolution of values doesn't point towards something wiser and better-informed (since that's not what evolut...
Yeah, so Danaher (2021) coined the term axiological futurism, but research on this topic existed long before that. For instance, I find these two pieces particularly insightful:
They explore how compassionate values might be selected against because of evolutionary pressures, and be replaced by values more competitive for, e.g., space colonization races. In The Age of Em, Robin Hanson forecasts wh...
Very interesting post, thanks for writing this!
...1. Simulations are not the most efficient way for A and B to reach their agreement. Rather, writing out arguments or formal proofs about each other is much more computationally efficient, because nested arguments naturally avoid stack overflows in a way that nested simulations do not. In short, each of A and B can write out an argument about each other that self-validates without an infinite recursion. There are several ways to do this, such as using Löb's Theorem-like constructions (as in this...
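For intuition on the stack-overflow point, here is a toy sketch (all names hypothetical; the "argument check" is a crude stand-in for the Löb-style constructions mentioned above):

```python
# Naive mutual simulation: A decides by simulating B, which simulates A, ...
def simulate_a():
    return simulate_b()

def simulate_b():
    return simulate_a()

try:
    simulate_a()
except RecursionError:
    print("nested simulation: unbounded recursion (stack overflow)")

# Argument exchange: each agent verifies one finite, self-validating
# argument instead of simulating the other, so no recursion occurs.
def accepts(argument: str) -> bool:
    return argument == "cooperate-if-you-verify-this-argument"

shared = "cooperate-if-you-verify-this-argument"
print("argument exchange:", accepts(shared) and accepts(shared))
```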
I really like the section S-risk reduction is separate from alignment work! I've been surprised by the extent to which people dismiss s-risks on the pretext that "alignment work will solve them anyway" (which is both insufficient and untrue as you pointed out).
I guess some of the technical work to reduce s-risks (e.g., preventing the "accidental" emergence of conflict-seeking preferences) can be considered a very specific kind of AI intent alignment (that only a few cooperative AI people are working on afaik) where we want to avoid worst-case scenarios...
(emphasis is mine)
> For something to constitute an “s-risk” under this definition, the suffering involved not only has to be astronomical in scope (e.g., “more suffering than has existed on Earth so far”),[5] but also significant compared to other sources of expected future suffering. This last bit ensures that “s-risks,” assuming sufficient tractability, are always a top priority for suffering-focused longtermists.
Nitpick, but you also need to assume sufficient likelihood, right? One might very well be a suffering-focused longtermist and thin...
Interesting! Thanks for writing this. Seems like a helpful summary of ideas related to s-risks from AI.
Another important normative reason for dedicating some attention to s-risks is that the future (conditional on humanity's survival) is underappreciatedly likely to be negative -- or at least not very positive -- from whatever plausible moral perspective, e.g., classical utilitarianism (see DiGiovanni 2021; Anthis 2022).
While this does not speak in favor of prioritizing s-risks per se, it obviously speaks against prioritizing X-risks which seem to be...
"[U]nderappreciatedly likely to be negative [...] from whatever plausible moral perspective" could mean many things. I maybe agree with the spirit behind this claim, but I want to flag that, personally, I think it's <10% likely that, if the wisest minds of the EA community researched and discussed this question for a full year, they'd conclude that the future is net negative in expectation for symmetric or nearly-symmetric classical utilitarianism. At the same time, I expect the median future to not be great (partly because I already think the current w...
Insightful! Thanks for writing this.
> Perhaps it will be possible to design AGI systems with goals that are cleanly separated from the rest of their cognition (e.g. as an explicit utility function), such that learning new facts and heuristics doesn’t change the systems’ values.
In that case, value lock-in is the default (unless corrigibility/uncertainty is somehow part of what the AGI values), such that there's no need for the "stable institution" you keep mentioning, right?
> But the one example of general intelligence we have — humans — instead ...
Oh interesting! Ok so I guess there are two possibilities.
1) Either by “superrationalists”, you mean something stronger than “agents taking acausal dependencies into account in PD-like situations”, which I thought was roughly Caspar's definition in his paper. And then, I'd be even more confused.
2) Or you really think that taking acausal dependencies into account is, by itself, sufficient to create a significant correlation between two decision-algorithms. In that case, how do you explain that I would defect against you and exploit you in a one-shot PD (very sorry,...
Thanks for the reply! :)
By "copies", I meant "agents which action-correlate with you" (i.e., those which will cooperate if you cooperate), not "agents sharing your values". Sorry for the confusion.
Do you think all agents thinking superrationally action-correlate? This seems like a very strong claim to me. My impression is that the agents with a decision-algorithm similar enough to mine to (significantly) action-correlate with me are a very small subset of all superrationalists. As your post suggests, even your past self doesn't fully action-corr...
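Here is a toy sketch of the distinction I'm pointing at (all names hypothetical): exact copies of a decision algorithm are fully action-correlated, whereas another superrational reasoner with a different algorithm need not be:

```python
# Hypothetical decision algorithms mapping the same situation to an action.
def my_algorithm(situation: str) -> str:
    # Cooperate when I read the situation as facing a copy of myself.
    return "C" if "copy" in situation else "D"

exact_copy = my_algorithm                 # fully action-correlated with me

def other_superrationalist(situation: str) -> str:
    # Also takes acausal dependencies seriously, but reasons differently,
    # so our actions can come apart.
    return "C" if "superrational" in situation else "D"

s = "one-shot PD against a copy"
print(my_algorithm(s), exact_copy(s))              # always identical: C C
print(my_algorithm(s), other_superrationalist(s))  # can diverge: C D
```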
Caspar Oesterheld’s work on Evidential Cooperation in Large Worlds (ECL) shows that some fairly weak assumptions about the shape of the universe are enough to arrive at the conclusion that there is one optimal system of ethics: the compromise between all the preferences of all agents who cooperate with each other acausally. That would solve ethics for all practical purposes. It would therefore have enormous effects on a wide variety of fields because of how foundational ethics is.
ECL recommends that agents maximize a compromise utility function averaging t...
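A rough formalization of that compromise (the weights $w_i$ are a modeling choice; the text above doesn't specify them):

```latex
U_{\text{compromise}}(a) \;=\; \sum_{i \in C} w_i \, u_i(a),
\qquad w_i \ge 0, \quad \sum_{i \in C} w_i = 1
```

where $C$ is the set of acausally cooperating agents, $u_i$ is agent $i$'s utility function, and each cooperator picks actions maximizing this same weighted average.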
For EA group retreats, is it better to apply for the CEA event support you introduced, or for CEA's support group funding?
I haven't received anything on my side. I think a confirmation by email would be nice, yes. Otherwise, I'll send the application a second time just in case.
Thanks for writing this Jamie!
Concerning the "SHOULD WE FOCUS ON MORAL CIRCLE EXPANSION?" question, I think something like the following sub-question is also relevant: Will MCE lead to a "near miss" of the values we want to spread?
Magnus Vinding (2018) argues that someone who cares about a given sentient being is absolutely not guaranteed to want what we think is best for that being. While he argues from a suffering-focused perspective, the problem is still the same from any ethical framework.
For instance, future people who ...
I completely agree with 3 and it's indeed worth clarifying. Even ignoring this, the possibility of humans being more compassionate than pro-life grabby aliens might actually be an argument against human-driven space colonization, since compassion -- especially when combined with scope sensitivity -- might increase agential s-risks related to potential catastrophic cooperation failure between AIs (see e.g., Baumann and Harris 2021, 46:24), which are the most worrying s-risks according to Jesse Clifton's preface of CLR's agenda. A space filled with lif...
Interesting! Thank you for writing this up. :)
> It does seem plausible that, by evolutionary forces, biological nonhumans would care about the proliferation of sentient life about as much as humans do, with all the risks of great suffering that entails.
What about the grabby aliens, more specifically? Do they not, in expectation, care about proliferation (even) more than humans do?
All else being equal, it seems -- at least to me -- that civilizations with very strong pro-life values (i.e., that think that perpetuating life is good and necessary, ...
Thank you for writing this.
> According to a survey of quantitative predictions, disappointing futures appear roughly as likely as existential catastrophes. [More]
It looks like Bostrom and Ord included risks of disappointing futures in their estimates of x-risks, which might make this conclusion a bit skewed, don't you think?
Michael's definition of risks of disappointing futures doesn't include s-risks though, right?
> a disappointing future is when humans do not go extinct and civilization does not collapse or fall into a dystopia, but civilization[1] nonetheless never realizes its potential.
I guess we get something like "risks of a negative (or nearly negative) future" by adding up the two types.
Great piece, thanks!
Since you devoted a subsection to moral circle expansion as a way of reducing s-risks, I guess you consider that its beneficial effects outweigh the backfire risks you mention (at least if MCE is done "in the right way"). CRS' 2020 End-of-Year Fundraiser post also conveys optimism regarding the impact of increasing moral consideration for artificial minds (the only remaining doubts seem to be about when and how to do it).
I wonder how confident we should be about this (the positive effect of MCE on s-risk reduction) at this point? Have yo...
Thanks for writing this! :)
Another potential outcome that comes to mind regarding such projects is a self-fulfilling prophecy effect (provided the predictions are not secret). I have no idea how much of a (positive/negative) impact it would have, though.
Interesting, thanks for sharing your thoughts on the process and stuff! (And happy to see the post published!) :)