This is a special post for quick takes by MichaelStJules. Only they can create top-level comments. Comments here also appear on the Quick Takes page and All Posts page.

I feel increasingly unsympathetic to hedonism (and maybe experientialism generally?). Yes, emotions matter, and the strength of emotions could be taken to mean how much something matters, but if you separate a cow and her calf and they're distressed by this, the appropriate response for their sake is not to drug or fool them until they feel better; it's to reunite them. What they want is each other, not to feel better. Sometimes I think about something bad in the world that makes me sad; I don't think you'd do me any favour by just taking away my sadness. I don't want to stop feeling sad; what I want is for the bad in the world to be addressed.

Rather than affect being what matters in itself, maybe affect is a signal for what matters, and its intensity tells us how much it matters. Hedonism as normally understood would therefore be subject to something like Goodhart's law: it ignores the objects of our emotions. This distinction can also be made between different versions of preference utilitarianism/consequentialism, as "satisfaction versions" and "object versions". See Krister Bykvist's PhD thesis and Wlodek Rabinowicz and Jan Österberg, "Value Based on Preferences: On Two Interpretations of Preference Utilitarianism" (unfortunately both unavailable online, at least to me).

Of course, often we do just want to feel better, and that matters, too. If someone wants to not suffer, then of course they should not suffer.

Related: wireheading, the experience machine, complexity of value.

The procreation asymmetry can be formulated this way (due to Jeff McMahan):

I think it's a consequence of a specific way of interpreting the claim "an outcome can only be worse than another if it's worse for someone", where the work goes into defining "worse for A". Using "better" instead of "worse" would give you a different asymmetry.

This is a summary of the argument for the procreation asymmetry here and in the comments, especially this comment, which also looks further at the case of bringing someone into existence with a good life. I think this is an actualist argument, similar to Krister Bykvist's argument in 2.1 (which cites Dan Brock from this book), Derek Parfit's argument on p. 150 of Reasons and Persons, and Johann Frick's argument (although his is not actualist, and he explicitly rejects actualism). The starting claim is that your ethical reasons are in some sense conditional on the existence of individuals, and the asymmetry between existence and nonexistence can lead to the procreation asymmetry.

1. From an outcome in which an individual doesn't/won't exist, they don't have any interests that would give you a reason to believe that another outcome is better on their account (they have no account!). So, ignoring other reasons, this outcome is not dominated by any other, and the welfare of an individual we could bring into existence is not in itself a reason to bring them into existence. This is reflected by the absence of arrows starting from the Nonexistence block in the image above.

2. An existing individual (or an individual who will exist) has interests. In an outcome in which they have a bad life, an outcome in which they didn't exist would have been better for them, from the point of view of the outcome in which they do exist with a bad life, so an outcome with a bad life is dominated by one in which they don't exist, ignoring other reasons. Choosing an outcome which is dominated this way is worse than choosing an outcome that dominates it. So, that an individual would have negative welfare is a reason to prevent them from coming into existence. This is reflected by the arrow from Negative existence to Nonexistence in the image above.

3. If the individual would have had a good life, we could say that this would be better than their nonexistence and dominates it (ignoring other reasons), but this only applies from outcomes in which they exist and have a good life. If they never existed, then because of 1, their nonexistence would not be dominated from that outcome (ignoring other reasons).

Together, 1 and 2 are the procreation asymmetry (reversing the order of the two claims from McMahan's formulation).

Considering formalizations of actualism: Jack Spencer (2021), "The procreative asymmetry and the impossibility of elusive permission" (pdf here), discusses problems with actualism (especially "strong actualism", but also "weak actualism") and proposes "stable actualism" as a solution.

I think my argument builds off the following from "The value of existence" by Gustaf Arrhenius and Wlodek Rabinowicz (2016):

The footnote that expands on this:

You could equally apply this argument to individual experiences, for an asymmetry between suffering and pleasure, as long as whenever an individual suffers, they have an interest in not suffering, and it's not the case that each individual, at every moment, has an interest in more pleasure, even if they don't know it or want it.

Something only matters if it matters (or will matter) to someone, and an absence of pleasure doesn't necessarily matter to someone who isn't experiencing pleasure* and certainly doesn't matter to someone who does not and will not exist, so we have no inherent reason to promote pleasure. On the other hand, there's no suffering unless someone is experiencing it, and according to some definitions of suffering, it necessarily matters to the sufferer.

*For example, when concentrating in a flow state, while asleep, or when content.

See also tranquilism and this post I wrote.

And we can turn this into a wide person-affecting view to solve the nonidentity problem by claiming that identity doesn't matter. To make the above argument fit better with this, we can rephrase it slightly to refer to "extra individuals" or "no extra individuals" rather than any specific individuals who will or won't exist. Frick makes a separate general claim that if exactly one of two normative standards (e.g. people, with interests) will exist, and they are standards of the same kind (e.g. the extent to which people's interests are satisfied can be compared), then it's better for the one which will be better satisfied to apply (e.g. the better off person should come to exist).

On the other hand, a narrow view might still allow us to say that it's worse to bring a worse off individual into existence with a bad life than a better off one, if our reasons against bringing an individual into existence with a bad life are stronger the worse off they would be, a claim I'd expect to be widely accepted. If we apply the view to individual experiences or person-moments, the result seems to be a negative axiology, in which only the negative matters, and with hedonism, only suffering would matter. Whether or not this follows can depend on how the procreation asymmetry is captured, and there are systems in which it would not follow, e.g. the narrow asymmetric views here, although these reject the independence of irrelevant alternatives.

Under standard order assumptions which include the independence of irrelevant alternatives and completeness, the procreation asymmetry does imply a negative axiology.

Utility functions (preferential or ethical, e.g. social welfare functions) can have lexicality, so that a difference in category A can be larger than the maximum difference in category B, but we can still make probabilistic tradeoffs between them. This can be done, for example, by having separate utility functions, fA:X→R and fB:X→R for A and B, respectively, such that

Then we can define our utility function as the sum f = fA + fB, so

This ensures that all outcomes with P(x) are at least as good as all outcomes with Q(x), without being Pascalian/fanatical to maximize fA regardless of what happens to fB. Note, however, that fB may be increasingly difficult to change as the number of moral patients increases, so we may approximate Pascalian fanaticism in this limit, anyway.

For example, fA(x)≤−1 if there is any suffering in x that meets a certain threshold of intensity, Q(x), and fA(x)=0 if there is no suffering at all in x, P(x). f can still be continuous this way.

If the probability that this threshold is met is p, 0 ≤ p < 1, and the expected value of fA conditional on this is bounded below by −L, L > 0, regardless of p for the choices available to you, then increasing fB by at least pL, which can be small, is better than trying to reduce p.
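As a minimal sketch of the two-function construction above (all names and outcome fields here are illustrative assumptions of mine; fB is squashed through a logistic function so its range fits in an interval of length less than 1):

```python
import math

def squash(z):
    # Increasing, bounded, range (0, 0.99): an interval of length at most 1.
    return 0.99 / (1.0 + math.exp(-z))

def f_A(x):
    # Category-A score: 0 if outcome x satisfies P (no intense suffering),
    # and at most -1 if it satisfies Q (some suffering above the threshold).
    return 0.0 if x["no_intense_suffering"] else -1.0 - x["suffering_amount"]

def f_B(x):
    # Unbounded category-B score (e.g. total mild welfare), squashed so
    # differences in B can never span the gap of 1 between P- and Q-outcomes.
    return squash(x["mild_welfare"])

def f(x):
    return f_A(x) + f_B(x)

# Any P-outcome beats any Q-outcome, however large the difference in B:
p_outcome = {"no_intense_suffering": True, "suffering_amount": 0.0, "mild_welfare": -50.0}
q_outcome = {"no_intense_suffering": False, "suffering_amount": 0.0, "mild_welfare": 50.0}
assert f(p_outcome) > f(q_outcome)
```

The assertion holds because fB's contribution is confined to (0, 0.99), while moving from P to Q costs at least 1 in fA.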

As another example, an AI could be incentivized to ensure it gets monitored by law enforcement. Its reward function could look like

where IMi(x) is 1 if the AI is monitored by law enforcement and passes some test (or did nothing?) in period i, and 0 otherwise. You could put an upper bound on the number of periods or use discounting to ensure the right term can't evaluate to infinity since that would allow fB to be ignored (maybe the AI will predict its expected lifetime to be infinite), but this would eventually allow fB to overcome the IMi, unless you also discount the future in fB.

This should also allow us to modify the utility function fB, if preventing the modification would cause a test to be failed.

Furthermore, satisfying the IMi(x) strongly lexically dominates increasing fB(x), but we can still make expected tradeoffs between them.

The problem then reduces to designing the AI in such a way that it can't cheat on the test, which might be something we can hard-code into it (e.g. its internal states and outputs are automatically sent to law enforcement), and so could be easier than getting fB right.

This overall approach can be repeated for any finite number of functions, f1,f2,…,fn. Recursively, you could define

for σ:R→R increasing and bounded with range in an interval of length at most 1, e.g. some sigmoid function. In this way, each fk dominates the previous ones, as above.
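One plausible reading of this recursive construction, sketched under the assumption that each fk is integer-valued (so a unit difference in a later function always exceeds the squashed contribution of all earlier ones); σ here is a scaled logistic function:

```python
import math

def squash(z):
    # Increasing, bounded, range inside an interval of length < 1.
    # Clamping avoids overflow in exp for extreme inputs (flat beyond ±700).
    z = max(min(z, 700.0), -700.0)
    return 0.99 / (1.0 + math.exp(-z))

def nest(fs):
    """Combine integer-valued utility functions f1, ..., fn so each later
    function lexically dominates all earlier ones (a hypothetical setup)."""
    def F(x):
        total = fs[0](x)
        for fk in fs[1:]:
            total = fk(x) + squash(total)
        return total
    return F

# Example: f2 dominates f1, both integer-valued.
f1 = lambda x: x[0]
f2 = lambda x: x[1]
F = nest([f1, f2])
assert F((10**6, 0)) < F((-10**6, 1))  # one unit of f2 outweighs any gap in f1
```

Within ties of the dominating function, the squashed inner total still breaks ties in the right direction, since squash is increasing (away from the clamp).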

To adapt to a more deontological approach (not rule violation minimization, but according to which you should not break a rule in order to avoid violating a rule later), you could use geometric discounting, and your (moral) utility function could look like:

where

1. x is the act and its consequences without uncertainty and you maximize the expected value of f over uncertainty in x,

2. x is broken into infinitely many disjoint intervals xi, with xi coming just before xi+1 temporally (and these intervals are chosen to have the same time endpoints for each possible x),

3. I(xi)=1 if a rule is broken in xi, and 0 otherwise, and

4. r is a constant, 0<r≤0.5.

So, the idea is that f(x)>f(y) if and only if the earliest rule violation in x happens later than the earliest one in y (at the level of precision determined by how the intervals are broken up). The value of r≤0.5 ensures this. (Well, there are some rare exceptions if r=0.5). You essentially count rule violations and minimize the number of them, but you use geometric discounting based on when the rule violation happens in such a way to ensure that it's always worse to break a rule earlier than to break any number of rules later.
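The verbal rule can be checked numerically. Here f(x) = −Σi I(xi)·r^i is my hedged reconstruction of the intended form, consistent with the description above (an earlier violation outweighs any number of later ones because r^i > Σ_{j>i} r^j when r < 0.5):

```python
def f(violations, r=0.4):
    """Discounted rule-violation score: violations[i-1] is True if a rule
    is broken in time interval i. Requires 0 < r <= 0.5."""
    return -sum(r**i for i, broken in enumerate(violations, start=1) if broken)

# One violation in the first interval is worse than violations in ALL
# later intervals, since r > r**2 / (1 - r) when r < 0.5.
early_once = [True] + [False] * 99
late_always = [False] + [True] * 99
assert f(early_once) < f(late_always)
```

With r = 0.5 the two sides approach equality in the limit, which is the "rare exceptions" caveat mentioned above.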

However, breaking x up into intervals this way probably sucks for a lot of reasons, and I doubt it would lead to prescriptions people with deontological views endorse when they maximize expected values.

This approach basically took for granted that a rule is broken not when I act, but when a particular consequence occurs.

If, on the other hand, a rule is broken at the time I act, maybe I need to use some functions Ii(x) instead of the I(xi), because whether or not I act now (in time interval i) and break a rule depends on what happens in the future. This way, however, Ii(x) could basically always be 1, so I don't think this approach works.

This nesting approach with σ above also allows us to "fix" maximin/leximin under conditions of uncertainty to avoid Pascalian fanaticism, given a finite discretization of welfare levels or finite number of lexical thresholds. Let the welfare levels be t0>t1>⋯>tn, and define:

i.e. fk(x) is the number of individuals with welfare level at most tk, where ui is the welfare of individual i, and I(ui≤tk) is 1 if ui≤tk and 0 otherwise. Alternatively, we could use I(tk+1<ui≤tk).

In situations without uncertainty, this requires us to first choose among options that minimize the number of individuals with welfare at most tn, because fn takes priority over fk, for all k<n, and then, having done that, choose among those that minimize the number of individuals with welfare at most tn−1, since fn−1 takes priority over fk, for all k<n−1, and then choose among those that minimize the number of individuals with welfare at most tn−2, and so on, until t0.
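The no-uncertainty procedure above amounts to comparing count-tuples lexicographically, worst threshold first (thresholds and welfare numbers below are illustrative):

```python
def threshold_counts(welfares, thresholds):
    """thresholds are t0 > t1 > ... > tn; fk = number of individuals with
    welfare at most tk. Returned worst-threshold-first (tn, ..., t0) for
    lexicographic comparison, where lower counts are better."""
    return tuple(sum(1 for u in welfares if u <= t) for t in sorted(thresholds))

def better(x, y, thresholds):
    # x is strictly better than y iff its count-tuple is lexicographically smaller.
    return threshold_counts(x, thresholds) < threshold_counts(y, thresholds)

thresholds = [10, 0, -10]  # t0 = 10 > t1 = 0 > t2 = -10
# Sparing one person a welfare below t2 takes priority over any number of
# people sitting just below t1:
assert better([-5, -5, -5, -5], [-15, 5, 5, 5], thresholds)
```

Note this only uses ordinal comparisons between welfare levels and thresholds, matching the point below that welfare need not be cardinal.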

This particular social welfare function assigns negative value to new existences when there are no impacts on others, which leximin/maximin need not do in general, although it typically does in practice, anyway.

This approach does not require welfare to be cardinal, i.e. adding and dividing welfare levels need not be defined. It also dodges representation theorems like this one (or the stronger one in Lemma 1 here, see the discussion here), because continuity is not satisfied (and welfare need not have any topological structure at all, let alone be real-valued). Yet, it still satisfies anonymity/symmetry/impartiality, monotonicity/Pareto, and separability/independence. Separability means that whether one outcome is better or worse than another does not depend on individuals unaffected by the choice between the two.

Here's a way to capture lexical threshold utilitarianism with a separable theory and while avoiding Pascalian fanaticism, with a negative threshold t−<0 and a positive threshold t+ > 0:

Either the second or the third term can be omitted.

We could require t−≤ui≤t+ for all i, although this isn't necessary.

More thresholds could be used, as in this comment: we would apply σ to the whole expression above, and then add new terms like the second and/or the third, with thresholds t++>t+ and t−−<t−, and repeat as necessary.

I think EA hasn't sufficiently explored the use of different types of empirical studies from which we can rigorously estimate causal effects, other than randomized controlled trials (or other experiments). This leaves us either relying heavily on subjective estimates of the magnitudes of causal effects based on weak evidence, anecdotes, expert opinion or basically guesses, or being skeptical of interventions whose cost-effectiveness estimates don't come from RCTs. I'd say I'm pretty skeptical, but not so skeptical that I think we need RCTs to conclude anything about the magnitudes of causal effects. There are methods to do causal inference from observational data.

I think this has led us to:

1. Underexploring the global health and development space. See John Halstead's and Hauke Hillebrandt's "Growth and the case against randomista development". I think GiveWell is starting to look beyond RCTs. There's probably already a lot of research out there they can look to.

2. Relying too much on guesses and poor studies in the effective animal advocacy space (especially in the past), for example overestimating the value of leafletting. I think things have improved a lot since then, and I thought the evidence presented in the work of Rethink Priorities, Charity Entrepreneurship and Founders Pledge on corporate campaigns was good enough to meet the bar for me to donate to support corporate campaigns specifically. Humane League Labs and some academics have done and are doing research to estimate causal effects from observational data that can inform EAA.

Fehige defends the asymmetry between preference satisfaction and frustration on rationality grounds. I start from a "preference-affecting view" in this comment, and in replies, describe how to get to antifrustrationism and argue against a symmetric view.

Let's consider a given preference from the point of view of a given outcome after choosing it, in which the preference either exists or does not, by cases:

1. The preference exists:

a. If there's an outcome in which the preference exists and is more satisfied, and all else is equal, it would have been irrational to have chosen this one (over it, and at all).

b. If there's an outcome in which the preference exists and is less satisfied, and all else is equal, it would have been irrational to have chosen the other outcome (over this one, and at all).

c. If there's an outcome in which the preference does not exist, and all else is equal, the preference itself does not tell us if either would have been irrational to have chosen.

2. The preference doesn't exist:

a. If there's an outcome in which the preference exists, regardless of its degree of satisfaction, and all else equal, the preference itself does not tell us if either would have been irrational to have chosen.

So, all else equal besides the existence or degree of satisfaction of the given preference, it's always rational to choose an outcome in which the preference does not exist, but it's irrational to choose an outcome in which the preference exists but is less satisfied than in another outcome.

(I made a similar argument in the thread starting here.)

I also think that antifrustrationism in some sense overrides interests less than symmetric views do (not to exclude "preference-affecting" views or mixtures as options, though). Rather than satisfying your existing preferences, according to symmetric views, it can be better to create new preferences in you and satisfy them, against your wishes. This undermines the appeal of autonomy and subjectivity that preference consequentialism had in the first place. If, on the other hand, new preferences don't add positive value, then they can't compensate for the violation of preferences, including the violation of preferences to not have your preferences manipulated in certain ways.

Consider the following two options for interests within one individual:

A. Interest 1 exists and is fully satisfied

B. Interest 1 exists and is not fully satisfied, and interest 2 exists and is (fully) satisfied.

A symmetric view would sometimes choose B, so that the creation of interests can take priority over interests that would exist regardless. In particular, the proposed benefit comes from satisfying an interest that would not have existed in the alternative, so it seems like we're overriding the interests the individual would have in A with a new interest, interest 2. For example, we make someone want something and satisfy that want, at the expense of their other interests.

On the other hand, consider:

A. Interest 1 exists and is partially unsatisfied

B. Interest 1 exists and is fully satisfied, and interest 2 exists and is partially unsatisfied.

In this case, antifrustrationism would sometimes choose A, so that the removal or avoidance of an otherwise unsatisfied interest can take priority over (further) satisfying an interest that would exist anyway. But in this case, if we choose A because of concerns for interest 2, at least interest 2 would exist in the alternative to A, so the benefit comes from the avoidance of an interest that would have otherwise existed. In A, compared to B, I wouldn't say we're overriding interests; we're dealing with an interest, interest 2, that would have existed otherwise.

Smith and Black's "The morality of creating and eliminating duties" deals with duties rather than preferences, and argues that assigning positive value to duties and their satisfaction leads to perverse conclusions like the above with preferences; they have a formal proof for this under certain conditions.

Some related writings, although not making the same point I am here:

I also think this argument isn't specific to preferences, but could be extended to any interests, values or normative standards that are necessarily held by individuals (or other objects), including basically everything people value (see here for a non-exhaustive list). See Johann Frick's paper and thesis, which defend the procreation asymmetry, and my other post here.

Then, if you extend these comparisons to satisfy the independence of irrelevant alternatives by stating that, in comparisons of multiple choices in an option set, all permissible options are strictly better than all impermissible options regardless of option set, extending these rankings beyond the option set, the result is antifrustrationism. To show this, you can use the set of the following three options, which are identical except in the ways specified:

and since B is impermissible because of the presence of A, this means C>B, and so it's always better for a preference to not exist than for it to exist and not be fully satisfied, all else equal.

This is an argument against hedonic utility being cardinal, and against widespread commensurability between hedonic experiences of different kinds. It seems that our tradeoffs, however we arrive at them, don't track the moral value of hedonic experiences.

Let X be some method or system by which we think we can establish the cardinality and/or commensurability of our hedonic experiences, and rough tradeoff rates. For example, X=reinforcement learning system in our brains, our actual choices, or our judgements of value (including intensity).

If X is not identical to our hedonic experiences, then it may be the case that X is itself what's forcing the observed cardinality and/or commensurability onto our hedonic experiences. But if it's X that's doing this, and it's the hedonic experiences themselves that are of moral value, then that cardinality and/or commensurability are properties of X, not our hedonic experiences themselves. So the observed cardinality and/or commensurability is a moral illusion.

Here's a more specific illustration of this argument:

Do our reinforcement systems have access to our whole experiences (or the whole hedonic component), or only some subsets of the neurons that are firing that are responsible for them? And what if they're more strongly connected to parts of the brain for certain kinds of experiences than others? It seems like there's a continuum of ways our reinforcement systems could be off, or even badly off, so it would be more surprising to me if they tracked true moral tradeoffs perfectly. Change (or add or remove) one connection between a neuron in the hedonic system and one in the reinforcement system, and now the tradeoffs made will be different, without affecting the moral value of the hedonic states. If the link between hedonic intensity and reinforcement strength is so fragile, what are the chances the reinforcement system has got it exactly right in the first place? They should be 0 (assuming my model is right).

At least for similar hedonic experiences of different intensities, if they're actually cardinal, we might expect the reinforcement system to capture some continuous monotonic transformation and not a linear transformation. But then it could be applying different monotonic transformations to different kinds of hedonic experiences. So why should we trust the tradeoffs between these different kinds of hedonic experiences?

The "cardinal hedonist" might object that X (e.g. introspective judgement of intensity) could be identical to our hedonic experiences, or tracks their cardinality closely enough.

I think, as a matter of fact, X will necessarily involve extra (neural) machinery that can distort our judgements, as I illustrate with the reinforcement learning case. It could be that our judgements are still approximately correct despite this, though.

Most importantly, the accuracy of our judgements depends on there being something fundamental that they're tracking in the first place, so I think hedonists who use cardinal judgements of intensity owe us a good explanation for where this supposed cardinality comes from, which I expect is not possible with our current understanding of neuroscience, and I'm skeptical that it will ever be possible. I think there's a great deal of unavoidable arbitrariness in our understanding of consciousness.

Here's an illustration with math. Let's consider two kinds of hedonic experiences, A and B, with at least three different (signed) intensities each, a1<a2<a3 and b1<b2<b3, respectively, with IA={a1,a2,a3},IB={b1,b2,b3}. These intensities are at least ordered, but not necessarily cardinal like real numbers or integers and we can't necessarily compare A and B. For example, A and B might be pleasure and suffering generally (with suffering negatively signed), or more specific experiences of these.

Then, what X does is map these intensities to numbers through some function f : IA∪IB → R satisfying f(a1)<f(a2)<f(a3) and f(b1)<f(b2)<f(b3). We might even let IA and IB be ordered continuous intervals, isomorphic to a real-valued interval, and have f be continuous and increasing on each of IA and IB, but again, it's f that's introducing the cardinalization and commensurability (or a different cardinalization and commensurability from the real one, if any); these aren't inherent to A and B.
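To make this concrete, here's a toy example (values entirely illustrative): two candidate maps f and g agree on every within-kind ordinal comparison, yet disagree about the cross-kind tradeoff, so the tradeoff is a property of the map, not of the ordinal intensities:

```python
# Two candidate "cardinalizations" of the same ordinal intensities a1<a2<a3
# (pleasure) and b1<b2<b3 (suffering, negatively signed).
f = {"a1": 1.0, "a2": 2.0, "a3": 3.0, "b1": -5.0,  "b2": -1.0,  "b3": -0.5}
g = {"a1": 1.0, "a2": 2.0, "a3": 3.0, "b1": -50.0, "b2": -10.0, "b3": -5.0}

# Both respect every ordinal comparison within each kind:
assert f["a1"] < f["a2"] < f["a3"] and g["a1"] < g["a2"] < g["a3"]
assert f["b1"] < f["b2"] < f["b3"] and g["b1"] < g["b2"] < g["b3"]

# ...yet they disagree on whether a2 of pleasure outweighs b2 of suffering:
assert f["a2"] + f["b2"] > 0  # under f, the pleasure outweighs the suffering
assert g["a2"] + g["b2"] < 0  # under g, it does not
```

Nothing in the ordinal data {a1<a2<a3, b1<b2<b3} picks out f over g, which is the sense in which the observed commensurability could be a moral illusion.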

If you're a consequentialist and you think

1. each individual can sometimes sacrifice some A for more B for themself,

2. we should be impartial, and

3. transitivity and the independence of irrelevant alternatives hold,

then it’s sometimes ethical to sacrifice A from one individual for more B for another. This isn't too surprising, but let's look at the argument, which is pretty simple, and discuss some examples.

Proof. Consider the following three options, with two individuals, x and y, and a+>a amounts of A, b+>b amounts of B:

i. x:(a:A,b+:B), y:(a:A,b:B) , read as x has amount a of A and amount b+ of B, while y has amount a of A and amount b of B.

ii. x:(a+:A,b:B), y:(a:A,b:B)

iii. x:(a:A,b:B), y:(a+:A,b:B)

Here we have i > ii by 1, for some a, a+, b and b+, and ii = iii by impartiality, so together i > iii by 3, and we sacrifice some A from y for some B for x. QED

Remark: I did choose the amounts of A and B pretty specifically in this argument to match in certain ways. With continuous personal tradeoffs between A and B, and continuous tradeoffs between amounts of A between different individuals at all base levels of A, I think this should force continuous tradeoffs between one individual's amount of A and another's amount of B. We can omit the impartiality assumption in this case.

Possible examples:

A = absence or negative of intense suffering, B = absence or negative of mild suffering.

In particular, if you’d be willing to endure torture for some other good, you should be willing to allow others to be tortured for you to get more of that good.

I imagine people will take this either way, e.g. some will accept that it's actually okay to let some be tortured for some other kind of benefit to different people, and others will accept that nothing can compensate them for torture. I fall into the latter camp.

Others might also reject the independence of irrelevant alternatives or transitivity, or their "spirit", e.g. by individuating options to option sets. I'm pretty undecided about independence these days.

I've been thinking more lately about how I should be thinking about causal effects for cost-effectiveness estimates, in order to clarify my own skepticism of more speculative causes, especially longtermist ones, and better understand how skeptical I ought to be. Maybe I'm far too skeptical. Maybe I just haven't come across a full model for causal effects that's convincing since I haven't been specifically looking. I've been referred to this in the past, and plan to get through it, since it might provide some missing pieces for the value of research. This also came up here.

Suppose I have two random variables, X and Y, and I want to know the causal effect of manipulating X on Y, if any.

1. If I'm confident there's no causal relationship between the two, say due to spatial separation, I assume there is no causal effect, and Y conditional on the manipulation of X to take value A (possibly random), Y|do(X=A), is identical to Y, i.e. Y|do(X=A)=Y. (The do notation is Pearl's do-calculus notation.)

2. If X could affect Y, but I know nothing else,

a. I might assume, based on symmetry (and chaos?) for Y, that Y|do(X=A) and Y are identical in distribution, but not necessarily literally equal as random variables. They might be slightly "shuffled" or permuted versions of each other (see symmetric decreasing rearrangements for specific examples of such a permutation). The difference in expected values is still 0. This is how I think about the effects of my everyday decisions, like going to the store, breathing at particular times, etc., on future populations. I might assume the same for variables that depend on Y.

b. Or, I might think that manipulating X just injects noise into Y, possibly while preserving some of its statistics, e.g. the mean or median. A simple case is just adding random symmetric noise with mean and median 0 to Y. However, whether or not a statistic is preserved with the extra noise might be sensitive to the scale on which Y is measured. For example, if Y is real-valued and f:R→R is strictly increasing, then for the median, med(f(Y))=f(med(Y)), but the same is not necessarily true for the expected value of Y, or for other variables that depend on Y.

c. Or, I might think that manipulating X makes Y closer to a "default" distribution over the possible values of Y, often but not always uninformed or uniform. This can shift the mean, median, etc., of Y. For example, Y could be the face of the coin I see on my desk, and X could be whether I flip the coin or not, with not flipping as the default. So, if I do flip the coin and hence manipulate X, this randomizes the value of Y, making my probability distribution for its value uniformly random instead of a known, deterministic value. You might think that some systems are the result of optimization and therefore fragile, so random interventions might return them to prior "defaults", e.g. naive systemic change or changes to ecosystems. This could be (like) regression to the mean.

I'm not sure how to balance these three possibilities generally. If I do think the effects are symmetric, I might go with a or b or some combination of them. In particular asymmetric cases, I might also combine c.
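The scale-sensitivity point in case 2b can be checked numerically: the median commutes with any strictly increasing rescaling of Y, but the mean does not (the rescaling function here is an arbitrary increasing function chosen for illustration):

```python
import math
import statistics

# Values of Y after symmetric noise with mean and median 0 (case 2b).
y_values = [-2.0, -1.0, 0.0, 1.0, 2.0]

def rescale(v):
    # A strictly increasing change of the scale on which Y is measured.
    return math.exp(v)

# The median commutes with the increasing rescaling: med(f(Y)) == f(med(Y))...
assert statistics.median(map(rescale, y_values)) == rescale(statistics.median(y_values))
# ...but the mean does not: "mean-preserving noise" is scale-dependent.
assert statistics.mean(map(rescale, y_values)) != rescale(statistics.mean(y_values))
```

So whether the noise "preserves" Y's central tendency depends on which statistic, and which scale, you care about.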

3. Suppose I have a plausible argument for how X could affect Y in a particular way, but no observations that can be used as suitable proxies, even very indirect, for counterfactuals with which to estimate the size of the effect. I lean towards dealing with this case as in 2, rather than just making assumptions about effect sizes without observations.

For example, someone might propose a causal path through which X affects Y, with a missing estimate of effect size at at least one step along the path, but an argument that this should increase the value of Y. It is not enough to consider only one such path, since there may be many paths from X to Y, e.g. different considerations for how X could affect Y, and these would need to be combined. Some could have opposite effects. By 2, those other paths, when combined with the proposed causal path, reduce the effects of X on Y through the proposed path. The longer the proposed path, the more unknown alternate paths.

I think this is where I am now with speculative longtermist causes. Part of this may be my ignorance of the proposed causal paths and estimates of effect sizes, since I haven't looked too deeply at the justifications for these causes, but the dampening from unknown paths also applies when the effect sizes along a path are known, which is the next case.

4. Suppose I have a causal path through some other variable Z, X→Z→Y, so that X causes Z and Z causes Y, and I model the effects of both X→Z and Z→Y, based on observations. Should I just combine the two for the effect of X on Y? In general, not in the straightforward way. As in 3, there could be another causal path, X→Z′→Y (and it could be longer, with more than a single intermediate variable).

As in case 3, you can think of X→Z′→Y as dampening the effect of X→Z→Y, and with long proposed causal paths, we might expect the net effect to be small, consistent with the intuition that the predictable impacts on the far future decrease over time due to ignorance/noise and chaos, even though the actual impacts may compound due to chaos.

Maybe I'll write this up as a full post after I've thought more about it. I imagine there's been writing related to this, including in the EA and rationality communities.

I think cluster thinking and the use of sensitivity analysis are approaches for decision making under deep uncertainty, when it's difficult to commit to a particular joint probability distribution or to weight considerations. Robust decision making is another. The maximality rule is another: given some set of plausible (empirical or ethical) worldviews/models for which we can't commit to quantifying our uncertainty, if A is worse in expectation than B under some plausible worldview/model, and not better than B in expectation under any of them, we say A < B, and we should rule out A.
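A minimal sketch of the maximality rule, assuming each worldview/model is represented as a function mapping an option to its expected value (this formalization and all names are my own, illustrative choices):

```python
# Sketch of the maximality rule: option `a` is ruled out by `b` if `a` is
# worse in expectation under some worldview and not better under any.

def ruled_out(a, b, worldviews):
    """True if option `a` is ruled out by option `b`."""
    evs = [(w(a), w(b)) for w in worldviews]
    worse_somewhere = any(ev_a < ev_b for ev_a, ev_b in evs)
    better_nowhere = all(ev_a <= ev_b for ev_a, ev_b in evs)
    return worse_somewhere and better_nowhere

def maximal_options(options, worldviews):
    """The permissible options: those not ruled out by any alternative."""
    return [a for a in options
            if not any(ruled_out(a, b, worldviews) for b in options if b is not a)]
```

Note how permissive this is: any two options whose expected values disagree across worldviews are incomparable, so both survive, which is roughly the criticism discussed below.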

It seems like EAs should be more familiar with the field of decision making under deep uncertainty. (Thanks to this post by weeatquince for pointing this out.)

See also:

EDIT: I think this approach isn't very promising.

The above mentioned papers by Mogensen and Thorstad are critical of the maximality rule for being too permissive, but here's a half-baked attempt to improve it:

Suppose you have a social welfare function U, and want to compare two options, A and B. Suppose further that you have two sets of probability distributions, each of size n, for the outcome X of each of A and B: P_A and P_B. Then A ≿ B (A is at least as good as B) if (and only if) there is a bijection f : P_A → P_B such that

E_{X∼P}[U(X)] ≥ E_{X∼f(P)}[U(X)], for all P ∈ P_A,  (1)

and furthermore, A ≻ B (A is strictly better than B) if the above inequality is strict for some P ∈ P_A.
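One way to sketch this criterion is to represent each distribution only by its expected value of U. A bijection satisfying inequality (1) then exists exactly when the sorted expected values dominate pointwise (a standard matching argument); this sketch and its names are my own, not from the post:

```python
def weakly_better(evs_a, evs_b):
    """Given the expected utilities of the distributions in P_A and P_B,
    return (A is at least as good as B, A is strictly better than B).

    A bijection f with E_P[U] >= E_{f(P)}[U] for all P in P_A exists iff
    the sorted expected values of P_A dominate those of P_B pointwise.
    """
    assert len(evs_a) == len(evs_b)  # both sets have size n
    a, b = sorted(evs_a), sorted(evs_b)
    at_least_as_good = all(x >= y for x, y in zip(a, b))
    strictly_better = at_least_as_good and any(x > y for x, y in zip(a, b))
    return at_least_as_good, strictly_better
```

Sorting both sets by expected value is itself one candidate pairing, which connects to the sorting idea discussed further down.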

This means pairing asymmetric/complex cluelessness arguments. Suppose you think helping an elderly person cross the street might have some important effect on the far future (you have some P ∈ P_A), but you think not doing so could also have a similar far-future effect (according to P′ ∈ P_B), while the short-term consequences are worse. Under some pairing of distributions/arguments f : P_A → P_B, helping the elderly person always looks at least as good, and under one pair (P, f(P)) looks better, so you should do it. Pairing distributions like this in some sense forces us to give equal weight to P and f(P), and maybe this goes too far and assumes away too much of our cluelessness or deep uncertainty?

The maximality rule as described in Maximal Cluelessness effectively assumes a pairing is already given to you, by instead using a single set of distributions P that can each be conditioned on taking action A or B. We'd omit f, and the expression replacing (1) above would be

E_{X∼P|A}[U(X)] ≥ E_{X∼P|B}[U(X)], for all P ∈ P.

I'm not sure what to do for different numbers of distributions for each option, or for infinitely many distributions. Maybe the function f should be assumed given, as a preferred mapping between distributions, and we could relax the surjectivity, totality, injectivity and even the requirement that it's a function, e.g. we compare over pairs (P, P′) ∈ R, for some relation (subset) R ⊆ P_A × P_B. But assuming we already have such a function or relation seems to assume away too much of our deep uncertainty.

One plausibly useful first step is to sort P_A and P_B according to the expected values of U(A) and U(B) under their corresponding probability distributions, respectively. Should the mapping or relation preserve the min and max? How should we deal with everything else? I suspect any proposal will seem arbitrary.

Perhaps we can assume slightly more structure on the sets P_A for each option A by assuming multiple probability distributions on P_A, and go up a level (and we could repeat). Basically, I want to give probability ranges to the expected value of the action A, and then compare the possible expected values of these expected values. However, if we just multiply our higher-order probability distributions by the lower-order ones, this comes back to the original scenario.

If we think

1. it's always better to improve the welfare of an existing person (or someone who would exist anyway) than to bring others into existence, all else equal, and

2. two outcomes are (comparable and) equivalent if they have the same distribution of welfare levels (but possibly different identities; this is often called Anonymity),

then not only would we reject Mere Addition (the claim that adding good lives, even those which are barely worth living but still worth living, is never bad), but the following would be true:

Given any two nonempty populations A and B, if some individual in B is worse off than some individual in A, then A∪B is worse than A. In other words, we shouldn't add to a population any individual who isn't at least as well off as the best off in the population, all else equal.

Intuitively, adding someone with worse welfare than someone who would exist anyway is equivalent to reducing the existing individual's welfare and adding someone with better welfare than them; you just swap their welfares.

More formally, suppose a, a member of the original population A with welfare u, is better off than b, a member of the added population B with welfare v, so u>v. Then consider

A′ which is A, but has b instead of a, with welfare u.

B′ which is B, but has a instead of b, with welfare v.

Then, A is better than A′∪B′ , by the first hypothesis, because the latter has all the same individuals from A (and extras from B) with exactly the same welfare levels, except for a (from A and B′) who is worse off with welfare v (from B′) instead of u (from A). So A≻A′∪B′.

And A′∪B′ is equivalent to A∪B , by the second hypothesis, because the only difference is that we've swapped the welfare levels of a and b. So A′∪B′≃A∪B.

So, by transitivity (and the independence of irrelevant alternatives), A ≻ A′∪B′ ≃ A∪B, and hence A ≻ A∪B.

If welfare is real-valued (specifically from an interval I⊆R), then Maximin (maximize the welfare of the worst off individual) and theories which assign negative value to the addition of individuals with non-maximal welfare satisfy the properties above.

Furthermore, if along with welfare from a real interval and property 1 in the previous comment (2. Anonymity is not necessary), the following two properties also hold:

3. Extended Continuity, a modest definition of continuity for a theory comparing populations with real-valued welfares which must be satisfied by any order representable by a real-valued function that is continuous with respect to the welfares of the individuals in each population, and

4. Strong Pareto (according to one equivalent definition, under transitivity and the independence of irrelevant alternatives): if two outcomes with the same individuals in their populations differ only by the welfare of one individual, then the outcome in which that individual is better off is strictly better than the other,

then the theory must assign negative value to the addition of individuals with non-maximal welfare (and no positive value to the addition of individuals with maximal welfare) as long as any individual in the initial population has non-maximal welfare. In other words, the theory must be antinatalist in principle, although not necessarily in practice, since all else is rarely equal.

Proof: Suppose A is any population with an individual a with some non-maximal welfare u, and consider adding an individual b who would also have some non-maximal welfare v. Denote, for all ϵ>0 small enough (0<ϵ<ϵ_0):

A+ϵχ_a: the population A, but where individual a has welfare u+ϵ (which exists for all sufficiently small ϵ>0, since u is non-maximal, and welfare comes from an interval).

Also denote

B: the population containing only b, with non-maximal welfare v, and

C: the population containing only b, but with some welfare w>v (v is non-maximal, so there must be some greater welfare level).

Then

A+ϵχ_a ≻ A∪C ≻ A∪B,

where the first inequality follows from the hypothesis that it's better to improve the welfare of an existing individual than to add any others, and the second inequality follows from Strong Pareto, because the only difference is b's welfare.

Then, by Extended Continuity and the first inequality holding for all (sufficiently small) ϵ>0, we can take the limit (infimum) of A+ϵχ_a as ϵ→0 to get

A ≿ A∪C,

so it's no better to add b even if they would have maximal welfare, and by transitivity (and the independence of irrelevant alternatives), together with A∪C ≻ A∪B from above, it's strictly worse to add b with non-maximal welfare. This completes the proof.

My current best guess on what constitutes welfare/wellbeing/value (setting aside issues of aggregation):

1. Suffering is bad in itself.

2. Pleasure doesn't matter in itself.

3. Conscious disapproval might be bad in itself. If bad, this could capture the badness of suffering, since I see suffering as affective conscious disapproval (an externalist account).

4. Conscious approval doesn't matter in itself in an absolute sense (it may matter in a relative sense, as covered by 5). Pleasure is affective conscious approval.

5. Other kinds of preferences might matter, but only comparatively (in a wide/non-identity way) when they exist in both outcomes, i.e. between a preference that's more satisfied and the same or a different preference (of the same kind?) that's less satisfied, an outcome with the more satisfied one is better than an outcome with the less satisfied one, ignoring other reasons. This is a kind of preference-affecting principle.

Also, I lean towards experientialism on top of this, so I think the degree of satisfaction/frustration of the preference has to be experienced for it to matter.

To expand on 5, the fact that you have an unsatisfied preference doesn't mean you disapprove of the outcome, it only means another outcome in which it is satisfied is preferable, all else equal. For example, that someone would like to go to the moon doesn't necessarily make them worse off than if they didn't have that desire, all else equal. That someone with a certain kind of disability would like to live without that disability and might even trade away part of their life to do so doesn't necessarily make them worse off, all else equal. This is incompatible with the way QALYs are estimated and used.

I think this probably can't be reconciled with the independence of irrelevant alternatives in a way that I would find satisfactory, since it would either give us antifrustrationism (which 5 explicitly rejects) or allow that sometimes having a preference is better than not, all else equal.

More here, here and here on my shortform, and in this post.