Future-proof ethics

AppliedDivinityStudies

Do you have a stronger argument for why we should want to future-proof ethics? From the perspective of a conservative Christian born hundreds of years ago, maybe today's society is very sinful. What would compel them to adopt an attitude such that it isn't?

Similarly, say in the future we have moral norms that tolerate behavior we currently see as reprehensible. Why would we want to adopt those norms? Should we assume that morality will make monotonic progress, just because we're repulsed by some past moral norms? That doesn't seem to follow. In fact, it seems plausible that morality has simply shifted. From the outside view, there's nothing to differentiate "my morality is better than past morality" from "my morality is different than past morality, but not in any way that makes it obviously superior".

You can imagine, for example, a future with sexual norms we would today consider reprehensible. Is there any reason I should want to adopt them?

[anonymous]

From the perspective of a conservative Christian born hundreds of years ago, maybe today's society is very sinful.

We don't have to argue about Christians born hundreds of years ago; I know that conservative Christians today also think today's society is very sinful.

This example isn't compelling to me, because as an inherently theistic religion, conservative Christianity seems fundamentally flawed to me in its understanding of empirical facts. But we could easily replace conservative Christianity in this example with more secular ancient philosophies, such as Confucianism, Jainism, or Buddhism, abstracting away the components that involve belief in the supernatural. It seems to me that these people would still perceive our society's moral beliefs as in a state of severe moral decline.

We see moral progress over time simply because over time, morals have shifted closer to our own. But conversely, people in the past would see morals declining over time. I think we should expect future evolutions in morality to likewise be viewed by present-day people as moral decline. This undercuts much of the intuitive appeal of future-proof ethics, though I believe it is still worthwhile to aspire to.

Holden Karnofsky

I don't think we should assume future ethics are better than ours, and that's not the intent of the term. I discuss what I was trying to do more here.

splinter

These are good questions, and I think the answer generally is yes, we should be disposed to treating the future's ethics as superior to our own, although we shouldn't be unquestioning about this.

The place to start is simply to note the obvious fact that moral standards do shift all the time, often in quite radical ways. So at the very least we ought to assume a stance of skepticism toward any particular moral posture, as we have reason to believe that ethics in general are highly contingent, culture-bound, etc.

Then the question becomes whether we have reasons to favor some period's moral stances over any others. There are a variety of reasons we might do so:

Knowledge has been increasing monotonically, and in recent years extremely rapidly. Much of this knowledge is scientific, technological, or involves other kinds of expertise, and such knowledge does have a moral valence. E.g., we do not believe in witches anymore.
Some of our increasing knowledge is historical and philosophical. The Catholic church did a lot of things in the middle ages that to me seem very bad but seemed to the church at the time morally justified. But I also have access to a lot of historical information about the middle ages, and I can situate the church's actions in a broader story about politics, empire, religious conflict, etc., that undercuts the church's moral claims. Other things being equal, we probably are wise to privilege later time periods over earlier time periods because later time periods saw how things turned out. Nazism seemed like a moral imperative to Nazis, but here in 2022, I know how WWII played out. (Spoiler alert: not well!)
The moral changes that have occurred over time are not random, and we can apply meta-ethics to them to try to understand how things have changed. We used to condone slavery and now we abhor it. Is that just happenstance, such that in some alternate history we used to abhor slavery (perhaps for religious reasons) and now embrace it (perhaps because of the logic of capitalism)? Probably not, because across the board the ethical trend has been an extension of rights, franchise, and dignity to widening circles of humans. So we can ask whether we think that is a good ethical trend and draw conclusions about the relative merits of different moral frameworks.
Wealth has also been increasing more or less monotonically, and insofar as moral behavior might be considered a luxury good, we should suppose that it may be more abundant these days than in the past. (This claim deserves a ton of scrutiny. I think it probably is true in some spheres -- e.g., gender equality -- and maybe less so in others.)

I want to stress that I don't think these arguments are absolute proof of anything; they are simply reasons we should be disposed to privilege the broad moral leanings of the future over those of the past. Certainly I think over short time spans, many moral shifts are highly contingent and culture-bound. I also think that broad trends might mask a lot of smaller trends that could bounce around much more randomly. And it is absolutely possible that some long-term trends will be morally degrading. For example, I am also not at all sure that long-term technological trends are well-aligned with human flourishing.

It is very easy to imagine that future generations will hold moral positions that we find repugnant. Imagine, for example, that in the far future pregnancy is obsolete. The vast majority of human babies are gestated artificially, which people of the future find safer and more convenient than biological pregnancy. Imagine as a consequence of this that viable fetuses become much more abundant, and people of the future think nothing of raising multiple babies until they are, say, three months old, selecting the "best" one based on its personality, sleeping habits, etc., and then painlessly euthanizing the others. Is this a plausible future scenario, or do meta-ethical trends suggest we shouldn't be concerned about it? If we look into our crystal ball and discover that this is in fact what our descendants get up to, should we conclude that in the future technological progress will degrade the value of human life in a way that is morally perverse? Or should we conclude instead that technological progress will undermine some of our present-day moral beliefs that aren't as well-grounded as we think they are? I don't have a definitive answer, but I would at least suggest that we should strongly consider the latter.

AppliedDivinityStudies

across the board the ethical trend has been an extension of rights, franchise, and dignity to widening circles of humans

I have two objections here.
1) If this is the historical backing for wanting to future-proof ethics, shouldn't we just do the extrapolation from there directly instead of thinking about systematizing ethics? In other words, just extend rights to all humans now and be done with it.
2) The idea that the ethical trend has been a monotonic widening is a bit self-fulfilling, since we no longer consider some agents to be morally important. I.e. the moral circle has narrowed to exclude ancestors, ghosts, animal worship, etc. See Gwern's argument here:
https://www.gwern.net/The-Narrowing-Circle

splinter

I'm not totally sure what #1 means. But it doesn't seem like an argument against privileging future ethics over today's ethics.

I view #2 as very much an argument in favor of privileging future ethics. We don't give moral weight to ghosts and ancestors anymore because we have improved our understanding of the world and no longer view these entities as having consciousness or agency. Insofar as we live in a world that requires tradeoffs, it would be actively immoral to give weight to a ghost's wellbeing when making a moral decision.

MattBall

>In theory, any harm can be outweighed by something that benefits a large enough number of persons, even if it benefits them in a minor way.

Holden, do you know of any discussion that doesn't rest on that assumption? It is where I get off the train:

https://www.mattball.org/2021/09/why-i-am-not-utilitarian-repost-from.html

Thanks

Kenny Easwaran

The keywords in the academic discussion of this issue are the "Archimedean principle" (I forget if Archimedes was applying it to weight or distance or something else, but it's the general term for the assumption that for any two quantities you're interested in, a finite number of one is sufficient to exceed the other - there are also various non-Archimedean number systems, non-Archimedean measurement systems, and non-Archimedean value theories) and "lexicographic" preference (the idea is that when you are alphabetizing things like in a dictionary/lexicon, any word that begins with an M comes before any word that begins with an N, no matter how many Y's and Z's the M word has later and how many A's and B's the N word has later - similarly, some people argue that when you are comparing two states of affairs, any state of affairs where there are 1,000,001 living people is better than any state of affairs where there are 1,000,000 living people, no matter how impoverished the people in the first situation are and how wealthy the people in the second situation are). I'm very interested in non-Archimedean measurement systems formally, though I'm skeptical that they are relevant for value theory, and of the arguments for any lexicographic preference for one value over another, but if you're interested in these questions, those are the terms you should search for. (And you might check out PhilPapers.org for these searches - it indexes all of the philosophy journals that I'm aware of, and many publications that aren't primarily philosophy.)
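To make the contrast concrete, here is a minimal sketch of a lexicographic comparison next to an additive (Archimedean) one, using the population example above. The `World` type, the `avg_wealth` score, and the `weight` parameter are hypothetical names chosen for illustration; they are not drawn from the literature.

```python
from typing import NamedTuple


class World(NamedTuple):
    population: int     # number of living people
    avg_wealth: float   # hypothetical per-person wealth score


def lexicographic_better(a: World, b: World) -> bool:
    """True if a beats b when population has strict priority over wealth."""
    # Python compares tuples position by position, which is lexicographic order:
    # wealth only matters when the populations are exactly equal.
    return (a.population, a.avg_wealth) > (b.population, b.avg_wealth)


def additive_better(a: World, b: World, weight: float = 1.0) -> bool:
    """Archimedean comparison: enough wealth can offset a smaller population."""
    return a.population + weight * a.avg_wealth > b.population + weight * b.avg_wealth


poor_but_larger = World(population=1_000_001, avg_wealth=0.1)
rich_but_smaller = World(population=1_000_000, avg_wealth=100.0)

print(lexicographic_better(poor_but_larger, rich_but_smaller))
print(additive_better(poor_but_larger, rich_but_smaller))
```

Under the lexicographic rule the larger world wins regardless of the wealth figures, while under the additive rule a large enough wealth gap reverses the ranking. That reversal is what the Archimedean assumption permits and a lexicographic preference rules out.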

MattBall

Thanks Kenny!
I think it is the main bias in EAs -- we so easily add up things in our minds (e.g., summing happiness across individuals) that we don't stop to realize that there is no "cosmic" place where all that happiness is occurring. There are just individual minds.

Holden Karnofsky

I appreciate Kenny's comments pointing toward potentially relevant literature, and agree that you could be a utilitarian without fully biting this bullet ... but as far as I can tell, attempts to do so have enough weird consequences of their own that I'd rather just bite the bullet. This dialogue gives some of the intuition for being skeptical of some things being infinitely more valuable than others.

Eric

How does the potential pain/suffering that a fetus may experience factor into this system of ethics? Many liberals advocate for expanding the circle of compassion to animals, but I rarely hear anyone make the consistent leap to fetuses (at least those past a certain stage of development).

[anonymous]

A quick point on the track record of utilitarianism. One counter-example is James Fitzjames Stephen, a judge and utilitarian philosopher who wrote a trenchant critique of J.S. Mill's arguments in On Liberty and his defence of the rights of women. This is in Stephen's Liberty, Equality and Fraternity.

It does seem that the most famous utilitarians were ahead of the curve, but I do wonder whether they are famous in part because their ideas won the day. There may have been other utilitarians arguing for different opinions.

1mkl32j201091

The only meta-ethical justification we should care about is our ethical theory being true. We should only care about an ethical theory being aesthetically pleasing, "fit for the modern age", easily explainable, future-proofed, or having other qualities to the extent that it correlates with truthfulness. I see the future-proof goal as misguided. To me, it feels as though you may have selected this meta-ethical principle with the idea of justifying your ethical theory rather than having this meta-ethical theory and using it to find an ethical theory which coheres with it.

I could be a Christian and use the meta-ethical justification "I want an ethical theory uncorrupted by 21st century societal norms!" But like the utilitarian, this would seem selected in a biased way to reach my conclusion. I could have a number of variables like aesthetically pleasing, easily communicable, looked upon favorably by future humans and so forth, but the only variable I'm maximizing on is truth.

Your goal is to select an ethical theory that will be looked upon favorably by future humans. You want this because you believe in moral progress. You believe in moral progress because you look down on past humans as less moral than more recent humans. You look down on past humans as less moral because they don't fit your ethical theory. This is circular; your method for selecting an ethical theory uses an ethical theory to determine it is a good method.

That is: simply going with our intuitions and societal norms has, in the past, meant endorsing all kinds of insanity.

The irony is that this can be presented as insanity and horrible without justification. There is no need to say why lynching and burning humans at picnics is bad. Karnofsky does not even try to apply a utility analysis to dissuading crimes via lynch mobs or discuss the effectiveness of waterboarding or the consequences of the female vote. He doesn't need to do this because these things are intuitively immoral. Ironically, it goes without saying because of intuition.

Once again, we can flip the argument. I could take someone from 1400 and tell him that homosexuality is legalized and openly practiced. In some places, teenage boys are encouraged to openly express their homosexuality by wearing flag pins. A great deal of homosexuals actually have sex with many men. Every adult, and unfortunately many minors, has access to a massive video library of sexual acts which elicit feelings of disgust in even the most open-minded. If this man from 1400 saw the future as a bleak and immoral place which we should avoid becoming, how would you convince him he was wrong? Why are your intuitions right and his intuitions wrong? What objective measure are you using? If he formulated a meta-ethical principle that "We should not become like the future", what would be wrong with that?

My take is that intuitions are imperfect, but they are what we have. I think that the people who hanged homosexuals probably had an intuitive sense that it was immoral, but religious fervor was overwhelming. There are evil and wicked people that existed in the past, but there were also people who saw these things as immoral. I'm sure many saw burning and lynching humans as repugnant. Intuitions are the only tool we have for determining right from wrong. The fact that people were wrong in the past is not a good reason to say that we can't use intuition whatsoever.

Very intelligent people of a past era used the scientific method, deduction and inductive inference to reach conclusions that were terribly wrong. These people were often motivated by their ideological desires or influenced by their peers and culture. People thought the earth was at the center of the solar system and they had elaborate theories. I don't think Karnofsky is arguing we should throw out intuitions entirely, but for those who don't believe in intuitions: we can't throw out intuitions like we can't throw out the scientific method, deduction and induction because people of a past era were wrong.

The most credible candidate for a future-proof ethical system, to my knowledge, rests on three basic pillars:
Systemization: seeking an ethical system based on consistently applying fundamental principles, rather than handling each decision with case-specific intuitions.
Thin utilitarianism: prioritizing the "greatest good for the greatest number," while not necessarily buying into all the views traditionally associated with utilitarianism.
Sentientism: counting anyone or anything with the capacity for pleasure and suffering - whether an animal, a reinforcement learner (a type of AI), etc. - as a "person" for ethical purposes.

How do we know the people of the future won't be non-systematizing, non-utilitarian and not care about AI or animals quite as much? I think in order to think they will, we must believe in moral progress. In order to believe moral progress results in these beliefs, we must believe that our moral theory is the actually correct one.

I just think that you can flip these things around so easily and apply them to stuff that isn't utilitarianism and sentientism. I think that Roman Catholicism would be a good example of a future proofed ethical system. They laid out a system of rules and took it where it goes. Even if it seems unintuitive to modern Catholics to oppose homosexuality or if in the past it felt okay to commit infanticide or abortion, we should just follow the deep truths of the doctrine. I don't think we can just say "well Catholicism is wrong." I think the Catholic ethical code is wrong, but I think it meets your systematizing heuristic.

Let’s start with a basic, appealing-seeming principle for ethics: that it should be other-centered. That is, my ethical system should be based as much as possible on the needs and wants of others, rather than on my personal preferences and personal goals.

Once again, I'll just flip this and say that ethics should be God-centered. It should be based as much as possible on the needs and wants of God. Why is the God-centered principle false and your principle true? Intuition? How do we know the future will be other-centered ethics?

In general, I'm committed to some non-utilitarian personal codes of ethics, such as (to simplify) "deceiving people is bad" and "keeping my word is good." I'm only interested in applying utilitarianism within particular domains (such as "where should I donate?") where it doesn't challenge these codes.

I'm confused. How are you getting these principles? Why are you not following precisely the system you just argued for?

Charles He

The only meta-ethical justification we should care about is our ethical theory being true.

How would you find Truth?

1mkl32j201091

I think there are two methods that people use. You could deduce ethical rules from some truths or you could believe it is most probable given the evidence. I think that intuitions are the only form of evidence possible. Something seeming true is a prima facie justification for that ethical truth. We accept intuition in the form of perception, memory knowledge, mathematical knowledge, etc. I don't find it as much of a leap to accept it in the case of moral truths. Torturing an infant seems wrong and that is evidence it is wrong. I think I remember my name on here is Parrhesia and so that is at least some reason to think my name on here is Parrhesia.

Michael St Jules 🔸

I think welfare-based beneficence, impartiality and at least limited aggregation do the most important work of thin utilitarianism, and I don't think you need additivity or that any harm can be outweighed by a large enough sum of tiny benefits, so that we should allow someone to be electrocuted in an accident to avoid interrupting a show a very large crowd is enjoying.

MattBall

Michael, this is kinda what I'm looking for. What does "limited aggregation" mean / do in your case?

Michael St Jules 🔸

Sorry I didn't see this until now.

"Limited aggregation" allows you to say that two people suffering is worse than one and make some tradeoffs between numbers and severity without very small changes in welfare aggregating across separate individuals to outweigh large changes. "Limited aggregation" is a term in the literature, and I think it usually requires giving up the independence of irrelevant alternatives.

Almost all social welfare functions that satisfy the independence of irrelevant alternatives allow small changes to outweigh large changes. That includes non-additive but aggregative social welfare functions. See Spears and Budolfson:

It's obvious that utilitarianism does this. Consider also maximin. Maximin requires you to focus entirely on the worst off individual (or individuals, if there are ties). This might seem good because it means preventing the worst states, but it also means even preferring to prevent a tiny harm to the worst off (or a worse off) over bringing someone down to their level of welfare. E.g., one extra pin prick to someone being tortured anyway outweighs the (only very slightly less bad) torture of someone who wouldn't have otherwise been tortured. More continuous versions of maximin, like moderate tradeoff view/rank-discounted utilitarianism, have the same implications in some cases, which will depend on the numbers involved.
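The pin-prick case can be checked with a small sketch. The welfare numbers below are made up for illustration (arbitrary units, more negative is worse), and the function names are hypothetical; only the two rules themselves, sum and minimum, come from the discussion above.

```python
def utilitarian(welfares):
    """Additive social welfare: total welfare across individuals."""
    return sum(welfares)


def maximin(welfares):
    """Maximin social welfare: only the worst-off individual's welfare counts."""
    return min(welfares)


def preferred(swf, options):
    """Return the name of the option that the given social welfare function ranks highest."""
    return max(options, key=lambda name: swf(options[name]))


# Person 1 is being tortured either way (baseline -1000); person 2 starts at 0.
options = {
    # One extra pin prick for the person already being tortured.
    "pin_prick": [-1001, 0],
    # Person 2 is tortured too, only very slightly less badly than person 1.
    "extra_torture": [-1000, -999],
}

print(preferred(utilitarian, options))
print(preferred(maximin, options))
```

With these numbers the additive rule chooses the pin prick (total -1001 versus -1999), while maximin chooses the second torture, because the worst-off person is at -1000 rather than -1001. The tiny harm to the worst off outweighs a very large harm to someone else.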

Limited aggregation allows you to make some intuitive tradeoffs without extreme prioritization like maximin or allowing tiny harms to aggregate to outweigh large harms.

On the other hand, there are views that reject the independence of irrelevant alternatives but don't allow any aggregation at all, and require you to minimize the greatest individual loss in welfare (not maximize the worst off state or maximize the welfare of the worst off individual, like maximin). This doesn't allow enough tradeoffs either, in my view. Scanlon the contractualist and Tom Regan the deontological animal rights theorist endorsed such a principle, as "the greater burden principle" and "the harm principle", respectively. Maybe also the animal advocate Richard Ryder, with his "painism", unless that is just a form of maximin.

MattBall

Thanks, Michael. This is what I've been looking for. I'll check out your links.
I tend to agree with Ryder, although I don't know how thorough his framework is.
Thanks again.
PS: Hey Michael, those links were interesting. Do you have a good link to go into more about "limited aggregation"?
Thanks,
-Matthew Michael

Holden Karnofsky

I think you lose a lot when you give up additivity, as discussed here and here.

MattBall

I understand that you lose a lot (and I appreciate your blog posts). But that is not an argument that additivity is correct. As I've written for my upcoming book:

Imagine a universe that has only two worlds, World R and World FL. In World R, Ricky the Rooster is the only sentient being, and is suffering in an absolutely miserable life.

This is bad. But where is it bad? In Ricky’s consciousness. And nowhere else.

On World FL, Rooster Foghorn is living in one forest and Rooster Leghorn is living in a separate forest. They are the World FL’s only sentient beings, and don’t know each other. Their lives are as bad as Ricky’s.

Our natural response is to think that World FL is twice as bad as World R. But where could it possibly be twice as bad? Foghorn’s life is bad in his consciousness and nowhere else. Leghorn’s life is bad in his consciousness and nowhere else.

Where is their world twice as bad as Ricky’s?

Nowhere.

Okay, yes, I admit it is twice as bad in your mind and my mind. But we are not part of that universe. Imagine that these worlds are unknown to any other sentient being. Then there is simply nowhere that World FL is worse than World R.

In this universe, there are three worlds and only three worlds: one in each of their minds.

Tell me where I am factually wrong. Please, I’m asking you. My life would be much easier and happier if you would.

Don’t say that the implications of this insight leads to absurd conclusions that offend our intuitions. I already know that! Just tell me where am I factually wrong.

I know (oh, yes, I know) that this seems like it can’t possibly be right. This is because we can’t help but be utilitarian in this regard, just like we can’t help but feel like we are in control of our consciousness and our decisions and our choices.

But I can see no way around this simple fact: morally-relevant “badness” exists only in individual consciousnesses.

JamieWoodhouse

Thanks Holden - great article.

The Sentientism web site (and the Sentientism podcast/YouTube series of conversations) proposes Sentientism as an explicitly naturalistic, sentiocentric worldview. I summarise it as "evidence, reason and compassion for all sentient beings". Feedback very welcome.

Methodological naturalism is so obvious to many that it's often left unstated. However, given most people on the planet have their ethics shaped (warped?) by unfounded and / or supernatural beliefs it seems important to specify this epistemological stance alongside an ethical one re: our scope of moral patiency.

Arguably every human-caused problem is rooted in a failure of compassion, unfounded credence/belief or a combination of the two.

MichaelPlant

I'm puzzled by the aspiration that our ethical system should be 'future-proofed'. Why, exactly, should we care about what future people will think of us? How far in the future should we care about, anyway? Conversely, shouldn't we also care that past people would have judged us? Should we care if current people do judge us? How are we to weigh these considerations? If we knew that the world was about to be taken over by some immortal totalitarian regime, we would future proof our views by just adopting those beliefs now. Does knowing that this would happen give us any reason to change our views?

Presumably, the underlying thought is that future people will have superior ethical views - that's what matters, not the fact in itself that future people have them (Cf Plato's Euthyphro dilemma: do the gods love things because they are good or are they good because the gods love them?). And the reason we believe that is because we think there's been 'moral progress', that is, we have superior views to our forebears. But to say our views are superior because and only because they are (say) more utilitarian, sentientist, etc. is just to assert that one thinks those beliefs are true; it's not an argument for those views. Someone who held other views might think we are experiencing moral decay.

Given all this, I prefer the task of engaging with the object-level ethical arguments, doing our best to work out what the right principles are, then taking action. It feels disempowering and 'spooky' to say "future people are going to be much better at ethics for reasons we would not or cannot understand; so let's try to figure out what they would do and do that, even if it makes no sense to us".

TylerMaule

I didn't read the goal here as literally to score points with future people, though I agree that the post is phrased such that it is implied that future ethical views will be superior.

Rather, I think the aim is to construct a framework that can be applied consistently across time—avoiding the pitfalls of common-sense morality both past and future.

In other words, this could alternatively be framed as 'backtesting ethics' or something, but 'future-proofing' speaks to (a) concern about repeating past mistakes (b) personal regret in future.

Holden Karnofsky

I think I agree with Tyler. Also see this follow-up piece - "future-proof" is supposed to mean "would still look good if we made progress, whatever that is." This is largely supposed to be a somewhat moral-realism-agnostic operationalization of what it means for object-level arguments to be right.

Kenny Easwaran

A few comments:

Although doing something because it is the intuitive, traditional, habitual, or whatever way of doing things doesn't necessarily have a great record of getting good results, many philosophers (particularly those in the virtue ethics tradition, but also "virtue consequentialists" and the like) argue that cultivating good intuitions, traditions, habits, and so on is probably more effective at actually having good consequences on the world rather than evaluating each act individually. This is partly probably due to quirks of human psychology, but partly due to the general limitations of finite beings of any sort - we need to operate under heuristics rather than unboundedly complex rules or calculations. (You're probably getting at something like this point towards the end.)

On the Harsanyi results - I think there's a bit more flexibility than your discussion suggests. I don't think there's any solid argument that rules out non-Archimedean value scales, where some things count infinitely more than others. I'm not convinced that there are such things, but I don't think they cause all the problems for utilitarianism and related views that they are sometimes said to. Also, I don't think the arguments for expected-value reasoning and equal-weight consideration for all individuals are quite as knock-down as is sometimes suggested - Lara Buchak's work on risk aversion is very interesting to me, and it is formally analogous (through the same Harsanyi/Rawls veil of ignorance thought experiment) to one standard form of inequality aversion (I always forget whether it's "prioritarianism" or "egalitarianism" - one says that value counts for more at lower points on the value scale and is formally like "diminishing marginal utility of utility" if that wasn't a contradiction; the other says that improvements for people who are relatively low off in the social ordering count more than improvements for people who are relatively high off, and this one is analogous to Buchak's risk aversion, where improvements in the worst outcomes matter more than improvement in the best outcomes, regardless of the absolute level those improvements occur at).

You endorse sentientism, based on "the key question is the extent to which they’re sentient: capable of experiencing pleasure and suffering." It seems like it might be a friendly amendment to this to define "sentient" as "capable of preferring some states to others" - that seems to get away from some of the deeper metaphysical questions of consciousness, and allow us to consider pleasure and pain as preference-like states, but not the only ones.

Holden Karnofsky

That seems reasonable re: sentientism. I agree that there's no knockdown argument against lexicographic preferences, though I find them unappealing for reasons gestured at in this dialogue.

MattBall

Thanks for this, Kenny. I've always thought Rawls' Veil of Ignorance can do a lot of heavy lifting.
https://www.mattball.org/2017/03/a-theory-of-ethics.html

Holden Karnofsky

Comments on Defending One-Dimensional Ethics will go here.

Kenny Easwaran

I don't think your argument against risk aversion fully addresses the issue. You give one argument for diversification that is based on diminishing marginal utilities, and then show that this plausibly doesn't apply in global charities. However, there's a separate argument for diversification that is actually about risk itself, and not diminishing marginal utility. You should look at Lara Buchak's book, "Risk and Rationality", which argues that there is a distinct form of rational risk-aversion (or risk-seeking-ness). On a risk neutral approach, each outcome counts in exact proportion to its probability, regardless of whether it's the best outcome, the worst, or in between. On a risk averse approach, the relative weight of the top ten percentiles of outcomes is less than the relative weight of the bottom ten percentiles of outcomes, and vice versa for risk seeking approaches.

This turns out to precisely correspond to ways to make sense of some kinds of inequality aversion - making things better for a worse off person improves the world more than making things equally much better for a better off person.

None of the arguments you give tell against this approach rather than the risk-neutral one.
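To make the contrast with the risk-neutral approach concrete, here is a minimal sketch of a rank-dependent (risk-weighted) valuation in the spirit of Buchak's proposal. The function names, the example gambles, and the choice of r(p) = p^2 as the risk-averse weighting are illustrative assumptions, not anything specified in the comment above.

```python
from typing import Callable, List, Tuple

# A gamble is a list of (utility, probability) pairs whose probabilities sum to 1.
Gamble = List[Tuple[float, float]]


def risk_weighted_value(gamble: Gamble, r: Callable[[float], float]) -> float:
    """Start from the worst outcome, then add each step up to a better
    outcome, weighted by r(probability of doing at least that well)."""
    ranked = sorted(gamble)  # worst outcome first
    value = ranked[0][0]
    p_at_least = 1.0  # chance of doing at least as well as the current rank
    for (prev_u, prev_p), (next_u, _) in zip(ranked, ranked[1:]):
        p_at_least -= prev_p
        value += r(max(p_at_least, 0.0)) * (next_u - prev_u)
    return value


def risk_neutral(p: float) -> float:
    # Each outcome counts in exact proportion to its probability.
    return p


def risk_averse(p: float) -> float:
    # Hypothetical weighting: chances of reaching better outcomes are discounted.
    return p ** 2


# Hypothetical gambles: a 50/50 shot at 0 or 100, versus a certain 40.
coin_flip: Gamble = [(0.0, 0.5), (100.0, 0.5)]
sure_thing: Gamble = [(40.0, 1.0)]

for name, r in [("risk neutral", risk_neutral), ("risk averse", risk_averse)]:
    print(name, risk_weighted_value(coin_flip, r), risk_weighted_value(sure_thing, r))
```

With these assumed numbers, the risk-neutral weighting prefers the coin flip while the risk-averse weighting prefers the sure thing, even though utility itself is treated linearly in both cases. That is the sense in which this form of risk aversion is distinct from diminishing marginal utility.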

One important challenge to the risk-sensitive approach is that, if you make large numbers of uncorrelated decisions, then the law of large numbers kicks in and it ends up behaving just like risk neutral decision theory. But these cases of making a single large global-scale intervention are precisely the ones in which you aren't making a large number of uncorrelated decisions, and so considerations of risk sensitivity can become relevant.
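The law-of-large-numbers point can be checked with a small exact calculation. This is a sketch under stated assumptions: the bundle of decisions is n independent fair coin flips paying 0 or 100 each, the whole bundle is evaluated at once, and the risk-averse weighting is the hypothetical r(p) = p^2. The function name `risk_weighted_value_per_flip` is made up for illustration.

```python
from math import comb


def risk_weighted_value_per_flip(n: int) -> float:
    """Risk-weighted value of n independent fair coin flips (0 or 100 each),
    evaluated as one bundle with r(p) = p**2, divided by n."""
    total_ways = 2 ** n
    ways_at_least = total_ways  # flip sequences with at least k wins, starting at k = 0
    value = 0.0
    for k in range(1, n + 1):
        ways_at_least -= comb(n, k - 1)  # drop sequences with exactly k - 1 wins
        p_at_least_k = ways_at_least / total_ways
        value += (p_at_least_k ** 2) * 100.0  # the k-th win adds 100 to the total
    return value / n


# The risk-neutral value per flip is 50 regardless of n.
for n in (1, 10, 100, 1000):
    print(n, round(risk_weighted_value_per_flip(n), 2))
```

Under these assumptions the per-flip value starts at 25 for a single flip and climbs toward the risk-neutral 50 as n grows, which is why the risk-sensitive and risk-neutral approaches come apart mainly for one-off, large-scale choices.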

Holden Karnofsky

You're right that I haven't comprehensively addressed risk aversion in this piece. I've just tried to give an intuition for why the pro-risk-aversion intuition might be misleading.

gbhn

A big difference between button 1 (small benefit for someone) and 1A (small chance of a small benefit for a large number of people) is the kind of system required for these outcomes.

Button 1 requires basically a day's worth of investment by someone making a choice to give it to another. Button 1A requires... perhaps a million times as much effort? We're talking about the equivalent of passing a national holiday act. This ends up requiring an enormous amount of coordination and investment. And the results do not scale linearly at all. That is, a person investing a day's worth of effort to try to pass a national holiday act doesn't have a 10E-8 chance of succeeding. They have a much, much smaller chance: many, many orders of magnitude less.

In other words, the worlds posited by a realistic interpretation of what these buttons mean are completely different, and the world where button 1A process succeeds is to be preferred by at least six orders of magnitude. In other words, the colloquial understanding of the "big" impact is closer to right than the multiplication suggests.

I'm not sure exactly how that impacts the overall conclusions, but I think this same dynamic applies to several odd conclusions -- the flaw is that the button is doing much much much more work in some situations than in others described as identical, and that descriptive flaw is pumping our intuitions to ignore those differences rather than address them.

Anthony DiGiovanni 🔸

I started writing a comment, then it got too long, so I put it in my shortform here. :)

Holden Karnofsky

It's interesting that you have that intuition! I don't share it, and I think the intuition somewhat implies some of the "You shouldn't leave your house" type things alluded to in the dialogue.

Anthony DiGiovanni 🔸

I'm pretty happy to bite that bullet, especially since I'm not an egoist. I should still leave my house because others are going to suffer far worse (in expectation) if I don't do something to help, at some risk to myself. It does seem strange to say that if I didn't have any altruistic obligations then I shouldn't take very small risks of horrible experiences. But I have the stronger intuition that those horrible experiences are horrible in a way that the nonexistence of nice experiences isn't. And that "I" don't get to override the preference to avoid such experiences, when the counterfactual is that the preferences for the nice experiences just don't exist in the first place.

MattBall

I don't necessarily disagree with your conclusion, but I don't know how you can feel sure about weighing a chicken's suffering vs a person.

But I definitely disagree with the initial conclusion, and I think it is because you don't fear extreme suffering enough. If everyone behind the veil of ignorance knew what the worst suffering was, they would fear it more than they would value time at the beach.

Re: longtermism, I find the argument in Pinker's latest book to be pretty compelling:

The optimal rate at which to discount the future is a problem that we face not just as individuals but as societies, as we decide how much public wealth we should spend to benefit our older selves and future generations. Discount it we must. It's not only that a current sacrifice would be in vain if an asteroid sends us the way of the dinosaurs. It's also that our ignorance of what the future will bring, including advances in technology, grows exponentially the farther out we plan. It would have made little sense for our ancestors a century ago to have scrimped for our benefit - say, diverting money from schools and roads to a stockpile of iron lungs to prepare for a polio epidemic - given that we're six times richer and have solved some of their problems while facing new ones they could not have dreamed of.

Holden Karnofsky

I agree with this argument for discount rates, but I think it is a practical rather than philosophical argument. That is, I don't think it undermines the idea that if we were to avert extinction, all of the future lives thereby enabled should be given "full weight."

Elliott Thornley

Nice post! I share your meta-ethical stance, but I don't think you should call it 'moral quasi-realism'. 'Quasi-realism' already names a position in meta-ethics, and it's different to the position you describe.

Very roughly, quasi-realism agrees with anti-realism in stating:

(1) Nothing is objectively right or wrong.
(2) Moral judgments don't express beliefs.

But, in contrast to anti-realism, quasi-realism also states:

(3) It's nevertheless legitimate to describe certain moral judgments as true.

The conjunction of (1)-(3) defines quasi-realism.

What you call 'quasi-realism' might be compatible with (2) and (3), but its defining features seem to be (1) plus something like:

(4) Our aim is to abide by the principles that we'd embrace if we were more thoughtful, informed, etc.

(1) plus (4) could point you towards two different positions in meta-ethics. It depends whether you think it's appropriate to describe the principles we'd embrace if we were more thoughtful, etc., as true.

If you think it is appropriate to describe these principles as true, then that counts as an ideal observer theory.

If you think it isn't appropriate to describe these principles as true, then your position is just anti-realism plus the claim that you do in fact try to abide by the principles that you'd embrace if you were more thoughtful, etc.

Holden Karnofsky

Thanks, this is helpful! I wasn't aware of that usage of "moral quasi-realism."

Personally, I find the question of whether principles can be described as "true" unimportant, and don't have much of a take on it. My default take is that it's convenient to sometimes use "true" in this way, so I sometimes do, while being happy to taboo it anytime someone wants me to or I otherwise think it would be helpful to.

AppliedDivinityStudies

One candidate you don't mention is:

- Extrapolate from past moral progress to make educated guesses about where moral norms will be in the future.

On a somewhat generous interpretation, this is the strategy social justice advocates have been using. You look historically, see that we were wrong about treating women, minorities, etc. as less worthy of moral consideration, and try to guess which currently subjugated groups will in the future be seen as worthy of equal treatment. This gets you to feeling more concern for trans people, people with different sexual preferences (including ones that are currently still taboo), for poor people, disabled people, etc., and eventually maybe animals too.

Another way of phrasing that is:
- Identify which groups will be raised in moral status in the future, and work proactively to raise their status today.

Will MacAskill has an 80k podcast titled "Our descendants will probably see us as moral monsters". One way to interpret the modern social justice movement is that it advocates for adopting a speculative future ethics, such that we see each other as moral monsters today. This has led to mixed results.

splinter

I think this is well-taken, but we should be cautious about the conclusions we draw from it.

It helps to look at a historical analogy. Most people today (I think) consider the 1960s-era civil rights movement to be on the right side of history. We see the racial apartheid system of Jim Crow America as morally repugnant. We see segregated schools and restaurants and buses as morally repugnant. We see flagrant voter suppression as morally repugnant (google "white primaries" if you want to see what flagrant means). And so we see the people who were at the forefront of the civil rights movement as courageous and noble people who took great personal risks to advance a morally righteous cause. Because many of them were.

If you dig deeply into the history of the civil rights movement, though, you will also find a lot of normal human stuff. Infighting. Ideological excess. Extremism. Personal rivalry. Some civil rights organizations of the time were organizationally paralyzed by a very 1960s streak of countercultural anti-authoritarianism that has not aged well. They were often heavily inflected with Marxist revolutionary politics that has not aged well. Many in the movement regarded now revered icons like MLK Jr. as overly cautious establishmentarian sellouts more concerned with their place in history than with social change.

My point is not that the civil rights movement was actually terrible. Nor is it that because the movement was right about school integration, it was also right about the virtues of Maoism. My point is that if you look closely enough, history is always a total goddamned mess. And yet, I still feel pretty comfortable saying that we have made progress on slavery.

So yes, I absolutely agree that many contemporary arguments about moral progress and politics will age terribly, and I doubt it will even take very long. Probably in ten years' time, many of the debates of today will look quaint and misguided. But this doesn't mean we should lapse into a total relativism. It means we need to look at the right scale and also that we should increase our ethical and epistemic humility in direct proportion to the specificity of the moral question we are asking.

Holden Karnofsky

Comments on Debating myself on whether “extra lives lived” are as good as “deaths prevented” will go here.

Patrick Wilson

Dear Holden and all Karnofskyites,

Thanks for this great post and discussion - I really enjoyed the audio too.

I began to compose a comment here but then it rambled on and on, and dived into various weird rabbit holes, and then I realised I needed to do more reading.

I ended up writing a full-length essay over Easter and have just posted it on my new blog 'Path findings'. I launched this a few weeks ago inspired by reading your post 'Learning by Writing' - and yay it seems that really works!

Anyway, here's the post, fresh off the slab:

Rabbits, robots and resurrection

Riffing with Karnofsky on the value of present and future lives, to celebrate the 50th anniversaries of 'Watership Down', 'Limits to Growth' and the Alcor foundation...

I'd be thrilled if you could take a few moments to read or at least skim it, and would welcome any and all feedback, however brutal!

Up front I confess not all the arguments are consistent, and the puns are consistently terrible, but I hope it makes some kind of sense. It will appeal particularly to people who like philosophy, ecology and rabbits, and features a lovely illustration by Lyndsey Green.

As a taster, here are some of the section headers (and most of the terrible puns):

Warren peace: a brief history of British rabbits
Too many bunnies? Malthus bites back
Abundant lives: valuing people now and in future
Staying alive: trolling the trolley problems
Of bunnies and bugs: who qualifies as people?
Back to life, back to reality… being human

You have been warned!

Best regards,

Patrick

Erich_Grunewald 🔸

Really like this post!

I think one important crux here is differing theories of value.

My preferred theory is the (in my view, commonsensical) view that for something to be good or bad, it has to be good or bad for someone. (This is essentially Christine Korsgaard's argument; she calls it "tethered value".) That is, value is conditional on some valuer. So where a utilitarian might say that happiness/well-being/whatever is the good and that we therefore ought to maximise it, I say that the good is always dependent on some creature who values things. If all the creatures in the world valued totally different things than what they do in our dimension, then that would be the good instead.

(I should mention that, though I'm not very confident about moral philosophy, to me the most plausible view is a version of Kantianism. Maybe I give 70% weight to that, 20% to some form of utilitarianism and the rest to Schopenhauerian ethics/norms/intuitions. I can recommend being a Kantian effective altruist: it keeps you on your toes. Anyway, I'm closer to non-utilitarian Holden in the post, but with some differences.)

This view has two important implications:

It no longer makes sense to aggregate value. As Korsgaard puts it, "If Jack would get more pleasure from owning Jill's convertible than Jill does, the utilitarian thinks you should take the car away from Jill and give it to Jack. I don't think that makes things better for everyone. I think it makes it better for Jack and worse for Jill, and that's all. It doesn't make it better on the whole."
It no longer makes sense to talk about the value of potential people. Their non-existence is neither good nor bad because there is no one for it to be good or bad for. (Exception: They can still be valued by people who are alive. But let's ignore that.)

I haven't spent tons of time thinking about how this shakes out in longtermism, so quite a lot of uncertainty here. But here's roughly how I think this view would apply to your thought experiments:

Challenge 1A -- climate change. If we decide to ignore climate change, then we wrong future people (because climate change is bad for them). If we don't ignore it, then we don't wrong those people (because they won't exist); we also don't wrong the future people who will exist, because we did our best to mitigate the problem. In a sense, we have a duty to future generations, whoever they may be.
Challenge 1B -- world A/B/C. It doesn't make sense to compare different worlds in this way, because that would necessarily involve aggregation. Instead, we have to evaluate every action based on whether it wrongs (or not, or benefits) people in the world it produces.
Challenge 2 -- asymmetry. This objection I think doesn't apply now. The relevant question is still: does our action wrong the person that does come into existence? If we have good reason to believe that a new life will be full of suffering, and we choose to bring it into existence, plausibly we do wrong that person. If we have good reason to believe that the life will be great, and we choose to bring it into existence, obviously we don't wrong the person. (If we do not bring it into existence, we don't wrong anyone, because there's no one to wrong.)

Additional thoughts:

I want to mention a harder problem than the "should we have as many children as possible?" one you mention. It is that it seems ok to abort a fetus that would have a happy life, but it seems really wrong not to abort a fetus we know would have a terrible life full of pain and suffering. (This is apparently called the asymmetry problem in philosophy.) These intuitions make perfect sense if we take the view that value is tethered. But they don't really make sense in total utilitarianism.
Extinction would still be very bad, but it would be bad for the people who are alive when it happens, and for all the people in history whose work to improve things in the far future is being thwarted.

(I recognise that my view gets weirder when we bring probability into the picture (as we have to). That's something I want to think more about. I also totally recognise that my view is pretty complicated, and simplicity is one of the things I admire in utilitarianism.)

I think one important difference between me and non-utilitarian Holden is that I am not a consequentialist, but I kind of suspect that he is? Otherwise I would say that he is ceding too much ground to his evil twin. ;)

Holden Karnofsky

I share a number of your intuitions as a starting point, but this dialogue (and previous ones) is intended to pose challenges to those intuitions. To follow up on those:

On Challenge 1A (and as a more general point) - if we take action against climate change, that presumably means making some sort of sacrifice today for the sake of future generations. Does your position imply that this is "simply better for some and worse for others, and not better or worse on the whole?" Does that imply that it is not particularly good or bad to take action on climate change, such that we may as well do what's best for our own generation?

Also on Challenge 1A - under your model, who specifically are the people it is "better for" to take action on climate change, if we presume that the set of people that exists conditional on taking action is completely distinct from the set of people that exists conditional on not taking action (due to chaotic effects as discussed in the dialogue)?

On Challenge 1B, are you saying there is no answer to how to ethically choose between those two worlds, if one is simply presented with a choice?

On Challenge 2, does your position imply that it is wrong to bring someone into existence, because there is a risk that they will suffer greatly (which will mean they've been wronged), and no way to "offset" this potential wrong?

Non-utilitarian Holden has a lot of consequentialist intuitions that he ideally would like to accommodate, but is not all-in on consequentialism.

Erich_Grunewald 🔸

As you noticed, I limited the scope of the original comment to axiology (partly because moral theory is messier and more confusing to me), hence the handwaviness. Generally speaking, I trust my intuitions about axiology more than my intuitions about moral theory, because I feel like my intuition is more likely to "overfit" on more complicated and specific moral dilemmas than on more basic questions of value, or something in that vein.

Anyway, I'll just preface the rest of this comment with this: I'm not very confident about all this and at any rate not sure whether deontology is the most plausible view. (I know that there are consequentialists who take person-affecting views too, but I haven't really read much about it. It seems weird to me because the view of value as tethered seems to resist aggregation, and it seems like you need to aggregate to evaluate and compare different consequences?)

On Challenge 1A (and as a more general point) - if we take action against climate change, that presumably means making some sort of sacrifice today for the sake of future generations. Does your position imply that this is "simply better for some and worse for others, and not better or worse on the whole?" Does that imply that it is not particularly good or bad to take action on climate change, such that we may as well do what's best for our own generation?

Since in deontology we can't compare two consequences and say which one is better, the answer depends on the action used to get there. I guess what matters is whether the action that brings about world X involves us doing or neglecting (or neither) the duties we have towards people in world X (and people alive now). Whether world X is good/bad for the population of world X (or for people alive today) only matters to the extent that it tells us something about our duties to those people.

Example: Say we can do something about climate change either (1) by becoming benevolent dictators and implementing a carbon tax that way, or (2) by inventing a new travel simulation device, which reduces carbon emissions from flights but is also really addictive. (Assume the consequences of these two scenarios have equivalent expected utility, though I know the example is unfair since "dictatorship" sounds really bad -- I just couldn't think of a better one off the top of my head.) Here, I think the Kantian should reject (1) and permit or even recommend (2), roughly speaking because (2) respects people's autonomy (though the "addictive" part may complicate this a bit) in a way that (1) does not.

Also on Challenge 1A - under your model, who specifically are the people it is "better for" to take action on climate change, if we presume that the set of people that exists conditional on taking action is completely distinct from the set of people that exists conditional on not taking action (due to chaotic effects as discussed in the dialogue)?

I don't mean to say that a certain action is better or worse for the people that will exist if we take it. I mean more that what is good or bad for those people matters when deciding what duties we have to them, and this matters when deciding whether the action we take wrongs them. But of course the action can't be said to be "better" for them as they wouldn't have existed otherwise.

On Challenge 1B, are you saying there is no answer to how to ethically choose between those two worlds, if one is simply presented with a choice?

I am imagining this scenario as a choice between two actions, one involving waving a magic wand that brings world X into existence, and the other waving it to bring world Y into existence.

I guess deontology has less to say about this thought experiment than consequentialism does, given that the latter is concerned with the values of states of affairs and the former more with the values of actions. What this thought experiment does is almost eliminate the action, reducing it to a choice of value. (Of course choosing is still an action, but it seems qualitatively different to me in a way that I can't really explain.) Most actions we're faced with in practice probably aren't like that, so it seems like ambivalence in the face of pure value choices isn't too problematic?

I realise that I'm kind of dodging the question here, but in my defense you are, in a way, asking me to make a decision about consequences, and not actions. :)

On Challenge 2, does your position imply that it is wrong to bring someone into existence, because there is a risk that they will suffer greatly (which will mean they've been wronged), and no way to "offset" this potential wrong?

One of the weaknesses in deontology is its awkwardness with uncertainty. I think one ok approach is to put values on outcomes (by "outcome" I mean e.g. "violating duty X" or "carrying out duty Y", not a state of affairs as in consequentialism) and multiply by probability. So I could put a value on "wronging someone by bringing them into a life of terrible suffering" and on "carrying out my duty to bring a flourishing person into the world" (if we have such a duty) and calculate expected value that way. Then whether or not the action is wrong would depend on the level of risk. But that is very tentative ...
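As a rough illustration of how "the level of risk" could settle the question, here is a small sketch of that calculation. Everything numeric in it is an assumption made up for the example: wronging someone is valued at -10, fulfilling the (possible) duty at +1, and not creating anyone is treated as a baseline of 0 because no one is wronged.

```python
def expected_duty_value(p_terrible_life: float,
                        value_wronging: float,
                        value_duty_fulfilled: float) -> float:
    """Probability-weighted value of the two duty-related outcomes."""
    return (p_terrible_life * value_wronging
            + (1.0 - p_terrible_life) * value_duty_fulfilled)


def risk_threshold(value_wronging: float, value_duty_fulfilled: float) -> float:
    """Risk level at which the expected value equals the baseline of 0.
    Assumes value_wronging < 0 < value_duty_fulfilled."""
    return value_duty_fulfilled / (value_duty_fulfilled - value_wronging)


# Hypothetical weights, chosen only to show the shape of the reasoning.
VALUE_WRONGING = -10.0
VALUE_DUTY_FULFILLED = 1.0

threshold = risk_threshold(VALUE_WRONGING, VALUE_DUTY_FULFILLED)
for p in (0.05, 0.2):
    ev = expected_duty_value(p, VALUE_WRONGING, VALUE_DUTY_FULFILLED)
    verdict = "not wrong" if ev >= 0 else "wrong"
    print(f"risk={p}: expected value={ev:.2f} ({verdict}); threshold={threshold:.3f}")
```

On these made-up weights the action counts as wrong once the risk of a terrible life exceeds roughly 9%. If there is no duty to bring flourishing people into existence (a duty value of 0), the threshold drops to zero and any risk at all makes the action wrong, which is the worry raised in the question.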

Richard Y Chappell🔸

Great dialogue! As an additional 'further reading' suggestion, I just want to plug the 'Population Ethics' chapter at utilitarianism.net. It summarizes some less well-known possibilities (such as "value blur" in the context of a critical range view) that might avoid some of the problems of the (blur-free) total view.

Luís Campos

FYI, the audio on the recording is slightly weird. :)

Jacob Valero

Thanks for this post! I found the inner dialogue very relatable and it was helpful in thinking about my own uncertainties.

Jeremy

The link to Chapter 2 of On the Overwhelming Importance of Shaping the Far Future at the end links to a non-public Google Drive file.

Holden Karnofsky

The link works for me in incognito mode (it is a Google Drive file).

Jeremy

Huh, maybe someone else wants to weigh in? When I view it in an incognito window, it prompts me to log in. When I view it logged in, it says "You need access. Ask for access, or switch to an account with access." I'm not sure if you are the owner, but if so, you likely just need to click on "Share", then "Restricted" in the Get Link dialog (it doesn't really look like you can click there, but you can), then change the setting to "Anyone with the link".

Holden Karnofsky

Hm. I contacted Nick and replaced it with another link - does that work?

Jeremy

Yup, works for me now.

Lukas Finnveden

I think the title of this post doesn't quite match the dialogue. Most of the dialogue is about whether additional good lives are at least somewhat good. But that's different from whether each additional good life is morally equivalent to a prevented death. The former seems more plausible than the latter, to me.

Separating the two will lead to some situations where a life is bad to create but also good to save, once started. That seems more like a feature than a bug. If you ask people in surveys, my impression is that some small fraction of people say that they'd prefer to not have been born and that some larger fraction of people say that they'd not want to relive their life again — without this necessarily implying that they currently want to die.

Holden Karnofsky

I think that's a fair point. These positions just pretty much end up in the same place when it comes to valuing existential risk.

dominicroser

[Pre-remark: I have only lightly skimmed the post]

Just wanted to add a pointer to Tim Mulgan's book Ethics for a Broken World -- given the similarity in framing: "Imagine living in the future in a world already damaged by humankind...Then imagine looking back into the past, back to our own time and assessing the ethics of the early twenty-first century. ....This book is presented as a series of history of philosophy lectures given in the future, studying the classic texts from a past age of affluence, our own time."

I don't have a cite for these being the key properties of a good scientific theory, but I think these properties tend to be consistently sought out across a wide variety of scientific domains. The simplicity criterion is often called "Occam's razor," and the other criterion is hopefully somewhat self-explanatory. You could also see these properties as essentially a plain-language description of Solomonoff induction. ↩
It's possible to combine sentientism with a non-hedonist theory of well-being. For example, one might believe that only beings with the capacity for pleasure and suffering matter, but also that once we've determined that someone matters, we should care about what they want, not just about their pleasure and suffering. ↩
At first [the] insider/outsider distinction applied even between the citizens of neighboring Greek city-states; thus there is a tombstone of the mid-fifth century B.C. which reads:

This memorial is set over the body of a very good man. Pythion, from Megara, slew seven men and broke off seven spear points in their bodies … This man, who saved three Athenian regiments … having brought sorrow to no one among all men who dwell on earth, went down to the underworld felicitated in the eyes of all.

This is quite consistent with the comic way in which Aristophanes treats the starvation of the Greek enemies of the Athenians, starvation which resulted from the devastation the Athenians had themselves inflicted. Plato, however, suggested an advance on this morality: he argued that Greeks should not, in war, enslave other Greeks, lay waste their lands or raze their houses; they should do these things only to non-Greeks. These examples could be multiplied almost indefinitely. The ancient Assyrian kings boastfully recorded in stone how they had tortured their non-Assyrian enemies and covered the valleys and mountains with their corpses. Romans looked on barbarians as beings who could be captured like animals for use as slaves or made to entertain the crowds by killing each other in the Colosseum. In modern times Europeans have stopped treating each other in this way, but less than two hundred years ago some still regarded Africans as outside the bounds of ethics, and therefore a resource which should be harvested and put to useful work. Similarly Australian aborigines were, to many early settlers from England, a kind of pest, to be hunted and killed whenever they proved troublesome.
↩
E.g., https://www.openphilanthropy.org/2017-report-consciousness-and-moral-patienthood#ProposedCriteria ↩
Wikipedia ↩
I mean, I agree with the critic that the "track record" point is far from a slam dunk, and that "utilitarians were ahead of the curve" doesn't necessarily mean "utilitarianism was ahead of the curve." But I don't think the "track record" argument is intended to be a philosophically tight point; I think it's intended to be interesting and suggestive, and I think it succeeds at that. At a minimum, it may imply something like "The kind of person who is drawn to utilitarianism+sentientism is also the kind of person who makes ahead-of-the-curve moral judgments," and I'd consider that an argument for putting serious weight on the moral judgments of people who are drawn to utilitarianism+sentientism today. ↩
