Thanks for asking — you can read more about these two sources of s-risk in Section 3.2 of our new intro to s-risks article. (We also discuss "near miss" scenarios there, but our current best guess is that such scenarios are significantly less likely than other s-risks of comparable scale.)
I agree with your reasoning here—while I think working on s-risks from AI conflict is a top priority, I wouldn't give Dawn's argument for it. This post gives the main arguments for why some "rational" AIs wouldn't avoid conflicts by default, and some high-level ways we could steer AIs into the subset that would.
Given that you can just keep doing better and better essentially indefinitely, and that GPT is not anywhere near the upper limit, talking about the difficulty of the task isn't super meaningful.
I don't understand this claim. Why would the difficulty of the task not be super meaningful when we're training to a level of performance that isn't near the upper limit?
As an analogy: consider a variant of rock paper scissors where you get to see your opponent's move in advance, but it's encrypted with RSA. In some sense this game is much harder than proving Fermat's Last Theorem, since playing optimally requires breaking the encryption scheme. But if you train a policy and find that it wins 33% of the time at encrypted rock paper scissors, it's not super meaningful or interesting to say that the task is super hard, and in the relevant intuitive sense it's an easier task than proving Fermat's Last Theorem.
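To make that concrete, here's a toy simulation (my own sketch, not from the original exchange; a keyed hash stands in for RSA, since any scheme the policy can't invert makes the same point). A policy with no way to decrypt the "advance look" wins the baseline third of rounds:

```python
import hashlib
import os
import random

MOVES = ["rock", "paper", "scissors"]
BEATS = {"rock": "scissors", "paper": "rock", "scissors": "paper"}

def encrypt(move: str, key: bytes) -> str:
    # Stand-in for RSA: a keyed hash the policy has no realistic hope of inverting.
    return hashlib.sha256(key + move.encode()).hexdigest()

def random_policy(ciphertext: str) -> str:
    # A policy that can't break the encryption can do no better than chance.
    return random.choice(MOVES)

def play_round(policy) -> bool:
    key = os.urandom(16)                 # fresh key each round
    opponent = random.choice(MOVES)
    ciphertext = encrypt(opponent, key)  # the "advance look" at the move
    return BEATS[policy(ciphertext)] == opponent  # True only on an outright win

n = 100_000
wins = sum(play_round(random_policy) for _ in range(n))
print(f"win rate: {wins / n:.3f}")  # ~0.333, i.e. the advance look buys nothing
```

Counting only outright wins (ties and losses both score zero), random play converges to 1/3, which is all the 33% figure reflects.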
In "Against neutrality...," he notes that he's not arguing for a moral duty to create happy people, and it's just good "others things equal." But, given that the moral question under opportunity costs is what practically matters, what are his thoughts on this view?: "Even if creating happy lives is good in some (say) aesthetic sense, relieving suffering has moral priority when you have to choose between these." E.g., does he have any sympathy for the intuition that, if you could either press a button that treats someone's migraine for a day or one that cre...
I am (clearly) not Tobias, but I'd expect many people familiar with EA and LW would get something new out of Ch 2, 4, 5, and 7-11. Of these, it seems like the latter half of 5, 9, and 11 would be especially novel if you're already familiar with the basics of s-risks along the lines of the intro resources that CRS and CLR have published. I think the content of 7 and 10 is sufficiently crucial that it's probably worth reading even if you've checked out those older resources, despite some overlap.
Anecdote: My grad school personal statement mentioned "Concrete Problems in AI Safety" and Superintelligence, though at a fairly vague level about the risks of distributional shift or the like. I got into some pretty respectable programs. I wouldn't take this as strong evidence, of course.
I'm fine with other phrasings, and I'm also concerned about value lock-in and s-risks, though I think these can be thought of as a class of x-risks.
I'm not keen on classifying s-risks as x-risks because, for better or worse, most people really just seem to mean "extinction or permanent human disempowerment" when they talk about "x-risks." I worry that a motte-and-bailey can happen here, where (1) people include s-risks within x-risks when trying to get people on board with focusing on x-risks, but then (2) their further discussion of x-risks basically equates them with non-s-x-risks. The fact that the "dictionary definition" of x-risks would include s-risks doesn't solve this problem.
I think this is a valid concern. Separately, it's not clear that all s-risks are x-risks, depending on how "astronomical suffering" and "human potential" are understood.
What do you think about the concept of a hellish existential catastrophe? It highlights both that (some) s-risks fall under the category of existential risk and that they have an additional important property absent from typical x-risks. The concept isolates a risk the reduction of which should arguably be prioritized by EAs with different moral perspectives.
...e.g. 2 minds with equally passionate complete enthusiasm (with no contrary psychological processes or internal currencies to provide reference points) respectively for and against their own experience, or gratitude and anger for their birth (past or future). They can respectively consider a world with and without their existences completely unbearable and beyond compensation. But if we're in the business of helping others for their own sakes rather than ours, I don't see the case for excluding either one's concern from our moral circle.
...
...Having said that, I do think the "deeper intuition that the existing Ann must in some way come before need-not-ever-exist-at-all Ben" plausibly boils down to some kind of antifrustrationist or tranquilist intuition. Ann comes first because she has actual preferences (/experiences of desire) that get violated when she's deprived of happiness. Not creating Ben doesn't violate any preferences of Ben's.
certainly don't reflect the kinds of concerns expressed by Setiya that I was responding to in the OP
I agree. And I share your view that the attempts to accommodate the procreation asymmetry without lexically disvaluing suffering don't hold up to scrutiny. Setiya's critique missed the mark pretty hard; e.g., this part just completely ignores that this view violates transitivity:
...But the argument is flawed. Neutrality says that having a child with a good enough life is on a par with staying childless, not that the outcome in which you have a child is equa
appeal to some form of partiality or personal prerogative seems much more appropriate to me than denying the value of the beneficiaries
I don't think this solves the problem, at least if one has the intuition (as I do) that it's not the current existence of the people who are extremely harmed to produce happy lives that makes this tradeoff "very repugnant." It doesn't seem any more palatable to allow arbitrarily many people in the long-term future (rather than the present) to suffer for the sake of sufficiently many more added happy lives. Even if those liv...
I think such views have major problems, but I don’t talk about those problems in the book. (Briefly: If you think that any X outweighs any Y, then you seem forced to believe that any probability of X, no matter how tiny, outweighs any Y. So: you can either prevent a one in a trillion trillion trillion chance of someone with a suffering life coming into existence, or guarantee a trillion lives of bliss. The lexical view says you should do the former. This seems wrong, and I think doesn’t hold up under moral uncertainty, either. There are ways of avo...
The Asymmetry endorses neutrality about bringing into existence lives that have positive wellbeing, and I argue against this view for much of the population ethics chapter, in the sections “The Intuition of Neutrality”, “Clumsy Gods: The Fragility of Identity”, and “Why the Intuition of Neutrality is Wrong”.
You seem to be using a different definition of the Asymmetry than Magnus is, and I'm not sure it's a much more common one. On Magnus's definition (which is also used by e.g. Chappell; Nils Holtug (2004), "Person-affecting Moralities"; and ...
Are you saying that from your and Teo's POVs, there's a way to 'improve a mental state' that doesn't amount to decreasing suffering (/preventing it)?
No, that's precisely what I'm denying. So, the reason I mentioned that "arbitrary" view was that I thought Jack might be conflating my/Teo's view with one that (1) agrees that happiness intrinsically improves a mental state, but (2) denies that improving a mental state in this particular way is good (while improving a mental state via suffering-reduction is good).
Such an understanding seems plausible in a s...
Some things I liked about What We Owe the Future, despite my disagreements with the treatment of value asymmetries:
I think one crux here is that Teo and I would say that calling an increase in the intensity of a happy experience "improving one's mental state" is a substantive philosophical claim. The kind of view we're defending does not say something like, "Improvements of one's mental state are only good if they relieve suffering." I would agree that that sounds kind of arbitrary.
The more defensible alternative is that replacing contentment (or absence of any experience) with increasingly intense happiness / meaning / love is not itself an improvement in mental state. And this follows from intuitions like "If a mind doesn't experience a need for change (and won't do so in the future), what is there to improve?"
Is it thought experiments such as the ones Magnus has put forward? I think these argue that alleviating suffering is more pressing than creating happiness, but I don't think these argue that creating happiness isn't good.
I think they do argue that creating happiness isn't intrinsically good, because you can always construct a version of the Very Repugnant Conclusion that applies to a view that says suffering is weighed some finite X times more than happiness, and I find those versions almost as repugnant. E.g. suppose that on classical utilitarianism we...
This only applies to flavors of the Asymmetry that treat happiness as intrinsically valuable, such that you would pay to add happiness to a "neutral" life (without relieving any suffering by doing so). If the reason you don't consider it good to create new lives with more happiness than suffering is that you don't think happiness is intrinsically valuable, at least not at the price of increasing suffering, then you can't get Dutch booked this way. See this comment.
I didn't directly respond to the other one because the principle is exactly the same. I'm puzzled that you think otherwise.
Removing their sadness at separation while leaving their desire to be together intact isn't a clear Pareto improvement unless one already accepts that pain is what is bad.
I mean, in thought experiments like this all one can hope for is to probe intuitions that you either do or don't have. It's not question-begging on my part because my point is: Imagine that you can remove the cow's suffering but leave everything else practically the...
Here's another way of saying my objection to your original comment: What makes "happiness is intrinsically good" more of an axiom than "sufficiently intense suffering is morally serious in a sense that happiness (of the sort that doesn't relieve any suffering) isn't, so the latter can't compensate for the former"? I don't see what answer you can give that doesn't appeal to intuitions about cases.
For all practical purposes suffering is dispreferred by beings who experience it, as you know, so I don't find this to be a counterexample. When you say you don't want someone to make you less sad about the problems in the world, it seems like a Pareto improvement would be to relieve your sadness without changing your motivation to solve those problems—if you agree, it seems you should agree the sadness itself is intrinsically bad.
No, I know of no thought experiments or any arguments generally that make me doubt that suffering is bad. Do you?
On a really basic level my philosophical argument would be that suffering is bad, and pleasure is good (the most basic of ethical axioms that we have to accept to get consequentialist ethics off the ground).
It seems like you're just relying on your intuition that pleasure is intrinsically good, and calling that an axiom we have to accept. I don't think we have to accept that at all — rejecting it does have some counterintuitive consequences, I won't deny that, but so does accepting it. It's not at all obvious (and Magnus's post points to some reasons we might favor rejecting this "axiom").
This is how Parfit formulated the Repugnant Conclusion, but as it's usually invoked in population ethics discussions about the (de)merits of total symmetric utilitarianism, it need not be the case that the muzak-and-potatoes lives never suffer.
The real RC that some kinds of total views face is that world A, with lives containing much more happiness than suffering, is worse than world Z, with sufficiently many more lives of just barely more happiness than suffering. How repugnant this is, for some people like myself, depends on how much happiness or suffering is in those lives...
which goes against the belief in a net-positive future upon which longtermism is predicated
Longtermism per se isn't predicated on that belief at all—if the future is net-negative, it's still (overwhelmingly) important to make future lives less bad.
But I want to be clear that this normative disagreement isn't evidence of any philosophical defect on our part.
Oh, I absolutely agree with this. My objections to that quote have no bearing on how legitimate your view is, and I never claimed as much. What I find objectionable is that the quote uses dismissive, not merely critical, language about the view you disagree with, which harms population ethics discourse. Ideally readers will form their views on this topic based on their merits and intuitions, not based on claims that views are "too...
It seems like you're conflating the following two views:
I would claim #2, not #1, and presumably so would Michael. The quote about nihilism etc. is objectionable because it's not just unsympathetic to such views, it's condescending. Clearly many people who have reflected carefully about ethics thi...
One is that views of the "making people happy" variety basically always wind up facing structural weirdness when you formalize them. It was my impression that all of these views imply intransitive preferences (i.e., something like A > B > C > A), until a discussion with Michael St. Jules in which he pointed out more recent work that instead denies the independence of irrelevant alternatives. (A toy illustration of the intransitivity worry is sketched below.)
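As a concrete illustration of that structural weirdness, here's a toy formalization (my own construction, with made-up names and welfare numbers, not the specific views from that discussion) of one naive person-affecting rule: rank two worlds by the total welfare of just the people who exist in both, so that merely adding extra people counts as neutral.

```python
def compare(w1: dict, w2: dict) -> str:
    """Toy person-affecting rule (hypothetical, for illustration):
    compare worlds by the total welfare of people who exist in both,
    so merely adding extra people is neutral."""
    shared = w1.keys() & w2.keys()
    t1 = sum(w1[p] for p in shared)
    t2 = sum(w2[p] for p in shared)
    return ">" if t1 > t2 else "<" if t1 < t2 else "~"

A      = {"Ann": 10}             # Ann alone, very well off
A_plus = {"Ann": 10, "Bob": 2}   # mere addition of Bob
B      = {"Ann": 7,  "Bob": 7}   # same people as A+, welfare redistributed

print("A  vs A+:", compare(A, A_plus))  # ~  (Ann equally well off; Bob ignored)
print("A+ vs B :", compare(A_plus, B))  # <  (shared totals: 12 vs 14)
print("A  vs B :", compare(A, B))       # >  (only Ann is shared: 10 vs 7)
# So A ~ A+ and A+ < B, yet A > B: no transitive ranking can say all three.
```

The more recent work mentioned above instead gives up the independence of irrelevant alternatives rather than accepting cycles like this.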
It depends if by valuing "making people happy" one means 1) intrinsically valuing adding happiness to existing people's lives, or 2) valuing "m...
Second, I might be mistaken about what this agent’s choice would be. For instance, perhaps the lake is so cold that the pain of jumping in is of greater moral importance than any happiness I obtain.
Yeah, I think this is pretty plausible, at least for sufficiently horrible forms of suffering (and probably all forms, upon reflection on how bad the alternative moral views are, IMO). I doubt my common-sense intuitions about bundles of happiness and suffering can properly empathize, from my current state of comfort, with the suffering-moments.
But given you said t...
There is a defense of ideas related to your position here
For the record, I also don't find that post compelling, and I'm not sure how related it is to my point. I think you can coherently hold that the moral truth is consistent (and that ethics is likely not to be consistent if there is no moral truth) while being uncertain about it. Analogously, I'm pretty uncertain what the correct decision theory is, and I think that whatever that decision theory is, it would have to be self-consistent.
I also would be interested in seeing someone compare the tradeoffs of non-person-affecting vs. person-affecting views. E.g., person-affecting views might entail X weirdness, but maybe X weirdness is better to accept than the repugnant conclusion, etc.
Agreed—while I expect people's intuitions on which is "better" to differ, a comprehensive accounting of which bullets different views have to bite would be a really handy resource. By "comprehensive" I don't mean literally every possible thought experiment, of course, but something that gives a sense of the significant consi...
Also, moral realism seems more predictive of ethics being consistent, not less. (Not consistent with our unreflected intuitions, though.)
I'm confused — welfare economics seems premised on the view that interpersonal comparisons of utility are possible. In any case, ethics =/= economics; comparisons of charity effectiveness aren't assessing interpersonal "utility" in the sense of VNM preferences, they're concerned with "utility" in the sense of e.g. hedonic states, life satisfaction, so-called objective lists, and so on.
No, longtermism is not redundant
I’m not keen on the recent trend of arguments that persuading people of longtermism is unnecessary, or even counterproductive, for encouraging them to work on certain cause areas (e.g., here, here). This is for a few reasons:
I think this is just an equivocation of "utility." Utility in the ethical sense is not identical to the "utility" of von Neumann Morgenstern utility functions.
It's notable that a pilot study (N = 172, compared to N = 474 for the results given in Fig. 1) discussed in the supplementary materials of this paper suggests a stronger suffering/happiness asymmetry in people's intuitions about creating populations. E.g., in response to the question, "Suppose you could push a button that created a new world with X people who are generally happy and 10 people who generally suffer. How high would X have to be for you to push the button?", the median response was X = 1000, i.e., a tradeoff ratio of 100 generally happy people per suffering person.
For a mundane example, imagine I'm ambivalent about mini-golfing. But you know me, and you suspect I'll love it, so you take me mini-golfing. Afterwards, I enthusiastically agree that you were right, and I loved mini-golfing.
It seems you can accommodate this just as well, if not better, within a hedonistic view—you didn't prefer to go mini-golfing, but mini-golfing made you happier once you tried it, so that's why you endorse people encouraging you to try new things. (Although I'm inclined to say, it really depends on what you would've otherwise done with your time instead of mini-golfing, and if someone is fine not wanting something, it's reasonable to err on the side of not making them want it.)
In Defense of Aiming for the Minimum
I’m not really sympathetic to the following common sentiment: “EAs should not try to do as much good as feasible at the expense of their own well-being / the good of their close associates.”
It’s tautologically true that if trying to hyper-optimize comes at too much of a cost to the energy you can devote to your most important altruistic work, then trying to hyper-optimize is altruistically counterproductive. I acknowledge that this is the principle behind the sentiment above, and evidently some people’s effectiveness has...
For what it's worth, my experience hasn't matched this. I started becoming concerned about the prevalence of net-negative lives during a particularly happy period of my own life, and have noticed very little correlation between the strength of this concern and the quality of my life over time. There are definitely some acute periods where, if I'm especially happy or especially struggling, I have more or less of a system-1 endorsement of this view. But it's pretty hard to say how much of that is a biased extrapolation, versus just a change in the size of my empathy gap from others' suffering.
But only some s-risks are very concerning to utilitarians -- for example, utilitarians don't worry much about the s-risk of 10^30 suffering people in a universe with 10^40 flourishing people.
Utilitarianism =/= classical utilitarianism. I'm a utilitarian who would think that outcome is extremely awful. It depends on the axiology.
Longtermism, as a worldview, does not want present day people to suffer; instead, it wants to work towards a future with as little suffering as possible, for everyone.
This is a bit misleading. Some longtermists, myself included, prioritize minimizing suffering in the future. But this is definitely not a consensus among longtermists, and many popular longtermist interventions will probably increase future suffering (by increasing future sentient life, including mostly-happy lives, in general).
I think the strength of these considerations depends on what sort of longtermist intervention you're comparing to, depending on your ethics. I do find the abject suffering of so many animals a compelling counter to prioritizing creating an intergalactic utopia (if the counterfactual is just that fewer sentient beings exist in the future). But some longtermist interventions are about reducing far greater scales of suffering, by beings who don't matter any less than today's animals. So when comparing to those interventions, while of course I feel really horr...
Longtermism is probably not really worth it if the far future contains much more suffering than happiness
Longtermism isn't synonymous with making sure more sentient beings exist in the far future. That's one subset, which is popular in EA, but an important alternative is that you could work to reduce the suffering of beings in the far future.
Thanks for the kind feedback. :) I appreciated your post as well—I worry that many longtermists are too complacent about the inevitability of the end of animal farming (or its analogues for digital minds).
Ambitious value learning and CEV are not a particularly large share of what AGI safety researchers are working on day-to-day, AFAICT. And insofar as researchers are thinking about those things, a lot of that work is trying to figure out whether those things are good ideas in the first place, e.g. whether they would lead to religious hell.
Sure, but people are still researching narrow alignment/corrigibility as a prerequisite for ambitious value learning/CEV. If you buy the argument that safety with respect to s-risks is non-monotonic in proximity ...
I'm pretty happy to bite that bullet, especially since I'm not an egoist. I should still leave my house because others are going to suffer far worse (in expectation) if I don't do something to help, at some risk to myself. It does seem strange to say that if I didn't have any altruistic obligations then I shouldn't take very small risks of horrible experiences. But I have the stronger intuition that those horrible experiences are horrible in a way that the nonexistence of nice experiences isn't. And that "I" don't get to override the preference to avoid such experiences, when the counterfactual is that the preferences for the nice experiences just don't exist in the first place.
My understanding is that: