Solution to the two envelopes problem for moral weights

Michael St Jules 🔸

Comments 26

Re. the scenario with the intelligent aliens, you argue that we just have access to different facts, so it's unobjectionable that we reach different conclusions.

But the classic two-envelope problem is a problem because you get ~exploited. Offered a choice of two envelopes you pick one. And then when you open it you will predictably pay money to switch to the other envelope. Of course now you have extra facts — but that doesn't change that it looks like a mistake to predictably have this behaviour.

Similarly in this case we could set up an (admittedly contrived) situation where you start by doing a bunch of reasoning about what's best, under a veil of ignorance about whether you're human or alien. Then it's revealed which you are, you remember all your experiences and can reason about how big a deal they are, and then you will predictably pay some utility in order to benefit the other species more. It similarly looks like it's a mistake to predictably have this behaviour (in the sense that, if humans and aliens are equally likely to be put in this kind of contrived situation, then the world would be predictably better off if nobody had this behaviour), and I don't really feel like you've addressed this.

In the case of the classic two-envelope paradox the standard resolution is that you need to pay attention to your priors about how much money might be in envelopes. After you open your envelope and find $100, your probabilities of $50 vs $200 are no longer quite 50% — and for some values you could find, you should prefer not to switch.^[1]
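To make the role of the prior concrete, here is a minimal sketch. The prior over the smaller amount is purely hypothetical (it is not from the comment), and `switch_analysis` is an illustrative helper name. It computes the posterior probability that your envelope is the smaller one and the expected gain from switching.

```python
from fractions import Fraction

def switch_analysis(prior, observed):
    """prior maps the smaller amount X to its probability; envelopes hold X and 2X.
    You pick one of the two at random and find `observed` inside."""
    half = Fraction(1, 2)
    # Hypothesis "mine is the smaller": X == observed, the other holds 2 * observed.
    w_small = prior.get(observed, Fraction(0)) * half
    # Hypothesis "mine is the larger": X == observed / 2, the other holds observed / 2.
    w_large = prior.get(Fraction(observed) / 2, Fraction(0)) * half
    total = w_small + w_large
    if total == 0:
        raise ValueError("observed amount is impossible under this prior")
    p_small = w_small / total
    # Switching gains `observed` if mine is smaller, loses observed / 2 otherwise.
    gain = p_small * observed - (1 - p_small) * Fraction(observed, 2)
    return p_small, gain

# Hypothetical prior over the smaller amount (an assumption for illustration).
prior = {50: Fraction(6, 10), 100: Fraction(3, 10), 200: Fraction(1, 10)}

p_100, gain_100 = switch_analysis(prior, 100)  # indifferent under this prior
p_200, gain_200 = switch_analysis(prior, 200)  # better not to switch
p_50, gain_50 = switch_analysis(prior, 50)     # better to switch
```

Under this particular prior, finding $100 leaves you with probability 1/3 (not 1/2) that the other envelope holds $200, and finding $200 makes switching a loss in expectation.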

So in the case with the aliens, shouldn't we similarly be discussing priors? Shouldn't we be considering how much, on some kind of ur-prior, we should expect to experience, and then comparing what our actual experience is to that? And if we're doing this in the case of aliens, shouldn't we also do it in the case of chickens?

^[1] At least with tame priors. If your prior over the amount of money in the envelope has an infinite expectation, it's possible for it to be always correct to switch. But in that case I imagine your complaint will be that you shouldn't start with a prior with infinite expectation.

Michael St Jules 🔸

(Replying back at the initial comment to reduce thread depth and in case this is a more important response for people to see.)

I understand that you're explaining why you don't really think it's well modelled as a two-envelope problem, but I'm not sure whether you're biting the bullet that you're predictably paying some utility in unnecessary ways (in this admittedly convoluted hypothetical), or if you don't think there's a bullet there to bite, or something else?

Sorry, yes, I realized I missed this bit (EDIT: and which was the main bit...). I guess then I would say your options are:

1. Bite the bullet (and do moral trade).
2. Entertain both the human-relative stance and the alien-relative stance even after finding out which you are,^[1] say due to epistemic modesty. I assume these stances won't be comparable on a common scale, at least not without very arbitrary assumptions, so you'd use some other approach to moral uncertainty.
3. Make some very arbitrary assumptions to make the problem go away.

I think 1 and 2 are both decent and defensible positions. I don't think the bullet to bite in 1 is really much of a bullet at all.

From your top-level comment:

Then it's revealed which you are, you remember all your experiences and can reason about how big a deal they are, and then you will predictably pay some utility in order to benefit the other species more. It similarly looks like it's a mistake to predictably have this behaviour (in the sense that, if humans and aliens are equally likely to be put in this kind of contrived situation, then the world would be predictably better off if nobody had this behaviour), and I don't really feel like you've addressed this.

The aliens and humans just disagree about what's best, and could coordinate (moral trade) to avoid both incurring unnecessary costs from relatively prioritizing each other. They have different epistemic states and/or preferences, including moral preferences/intuitions. Your thought experiment decides what evidence different individuals will gather (at least on my bullet-biting interpretation). You end up with similar problems generally if you decide behind a veil of ignorance what evidence different individuals are going to gather (e.g. fix some facts about the world and decide ahead of time who will discover which ones) and epistemic states they'd end up in. Even if they start from the same prior.

Maybe one individual comes to believe bednets are the best for helping humans, while someone else comes to believe deworming is. If the bednetter somehow ends up with deworming pills, they'll want to sell them to buy bednets. If the dewormer ends up with bednets, they'll want to sell them to buy deworming pills. They could both do this at deadweight loss in terms of pills delivered, bednets delivered, cash and/or total utility. Instead, they could just directly trade with each other, or coordinate and agree to just deliver what they have directly or to the appropriate third party.

EDIT: Now, you might say they can just share evidence and then converge in beliefs. That seems fair for the dewormer and bednetter, but it's not currently possible for me to fully explain the human experience of suffering to an alien, or to give an alien access to that experience. If and when that does become possible, we'd be able to agree much more.

Another illustration: suppose you don't know whether you'll prefer apples or oranges. You try both. From then on, you're going to predictably pay more for one than the other. Some other people will do the opposite. Whenever an apple-preferrer ends up with an orange for whatever reason, they would be inclined to trade it away to get an apple. Symmetrically for the orange-preferrer. They might both do so together at deadweight loss and benefit from directly trading with each other.

This doesn't seem like much of a bullet to bite.

^[1] Or your best approximations of each, given you'll only have direct access to one.

Owen Cotton-Barratt

I don't think that the apples and oranges case is analogous, since then it's really about different preferences. In this case I'm assuming that all the parties have the same ultimate preferences (to make more morally relevant good experiences and fewer bad ones), but different pieces of evidence.

I do think the deworming and bednets case is analogous. Suppose the two of us are in a room before we go out to gather evidence. We agree that there is a 50% chance that bednets are twice as good as deworming, and a 50% chance that deworming is twice as good. We neither of us have a great idea how good either of them is.

One of us goes off to study bednets. After that they have a reasonable sense of how good bednets are, and predictably prefer deworming (for 2-envelope reasons). The other goes to study deworming, and afterwards predictably prefers bednets. At this point we each have an expertise which makes our work 10% more effective on the thing we're expert in, but we each choose to eschew our expertise as the benefit from switching envelopes is higher.
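As a rough sketch of the "2-envelope reasons" here: suppose the researcher treats the value of the intervention they studied as now fixed, and keeps 50/50 credences on which is twice as good. The symbols $V_{\text{own}}$ and $V_{\text{other}}$ are labels introduced only for this illustration. Then the naive expected ratio is

$$E\left[\frac{V_{\text{other}}}{V_{\text{own}}}\right] = 0.5 \times 2 + 0.5 \times 0.5 = 1.25 > 1.10,$$

so the apparent 25% gain from switching exceeds the 10% gain from expertise, whichever intervention was studied.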

We'd like to morally trade so that we each stay working in our domain of expertise. But suppose that later we'll be causally disconnected and unable to engage in moral trade. We'd still like to commit at the start to a trade where neither party switches.

Now suppose that there's only you, and you're about to flip a coin to decide if you'll go to study bednets or deworming. You'd prefer to commit to not then switching to the other thing.

But suppose you forgot to make that commitment, and are only thinking about this after having flipped the coin and discovered you're about to study bednets. Your epistemic position hasn't yet changed, only your expectation of future evidence. Surely(?) you'd still want to make the commitment at this point?

Now if you only think about it later, having studied bednets, I'm imagining that you think "well I would have wanted to commit earlier, but now that I know about how good bednets are I think deworming is better in expectation, so I'm glad I didn't commit". Is that right? (I prefer to act as though I'd made the commitment I predictably would have wanted to make.)

Michael St Jules 🔸

Now suppose that there's only you, and you're about to flip a coin to decide if you'll go to study bednets or deworming. You'd prefer to commit to not then switching to the other thing.

Maybe? I'm not sure I'd want to constrain my future self this way, if it won't seem best/rational later. I don't very strongly object to commitments in principle, and it seems like the right thing to do in some cases, like Parfit's hitchhiker. However, those assume the same preferences/scale after, and in the two envelopes problem, we may not be able to assume that. It could look more like preference change.

In this case, it looks like you're committing to something you will predictably later regret either way it goes (because you'll want to switch), which seems kind of irrational. It looks like violating the sure-thing principle. Plus, either way it goes, it looks like you'll fail to follow your own preferences later, and it will seem irrational then. Russell and Isaacs (2021) and Gustafsson (2022) also argue similarly against resolute choice strategies.

I'm more sympathetic to acausal trade with other beings that could simultaneously exist with you (even if you don't know ahead of time whether you'll find bednets or deworming better in expectation), if and because you'll expect the world to be better off for it at every step: ahead of time, just before you follow through and after you follow through. There's no expected regret. In an infinite multiverse (or a non-negligible chance of one), we should expect such counterparts to exist, though, so plausibly should do the acausal trade.

Also, I think you'd want to commit ahead of time to a more flexible policy for switching that depends on the specific evidence you'll gather.^[1]

Now if you only think about it later, having studied bednets, I'm imagining that you think "well I would have wanted to commit earlier, but now that I know about how good bednets are I think deworming is better in expectation, so I'm glad I didn't commit". Is that right? (I prefer to act as though I'd made the commitment I predictably would have wanted to make.)

Ya, that seems mostly right on first intuition.

However, acausal trade with counterparts in a multiverse still seems kind of compelling.

Also, I see some other appeal in favour of committing ahead of time to stick with whatever you study (and generally making the commitment earlier, too, contra what I say above in this comment): you know there's evidence you could have gathered that would tell you not to switch, because you know you would have changed your mind if you did, even if you won't gather it anymore. Your knowledge of the existence of this evidence is evidence that supports not switching, even if you don't know the specifics. It seems like you shouldn't ignore that. Maybe it doesn't go all the way to support committing to sticking with your current expertise, because you can favour the more specific evidence you actually have, but maybe you should update hard enough on it?

This seems like it could avoid both the ex ante and ex post regret so far. But, still you either:

can't be an EU maximizer, and so you'll be vulnerable to money pump arguments anyway or abandon completeness and often be silent on what to do (e.g. multi-utility representations), or
have to unjustifiably fix a single scale and prior over it ahead of time.

The same could apply to humans vs aliens. Even if we're not behind the veil of ignorance now and never were, there's information that we'd be ignoring: what real or hypothetical aliens would believe and the real or hypothetical existence of evidence that supports their stance.

But, it's also really weird to consider the stances of hypothetical aliens. It's also weird in a different way if you imagine finding out what it's like to be a chicken and suffer like a chicken.

^[1] Suppose you're justifiably sure that each intervention is at least not net negative (whether or not you have a single scale and prior). But then you find out bednets have no (or tiny) impact. I think it would be reasonable to switch to deworming at some cost. Deworming could be less effective than you thought ahead of time, but no impact is as bad as it gets given your credences ahead of time.

Michael St Jules 🔸

Similarly in this case we could set up an (admittedly contrived) situation where you start by doing a bunch of reasoning about what's best, under a veil of ignorance about whether you're human or alien. Then it's revealed which you are, you remember all your experiences and can reason about how big a deal they are, and then you will predictably pay some utility in order to benefit the other species more.

In this case, assuming you have no first-person experience with suffering to value directly (or memory of it), you would develop your concept of suffering third-personally (based on observations of and hypotheses about humans, aliens, chickens and others, say) and could base your ethics on that concept. This is not how humans or the aliens would typically understand and value suffering, which is largely first-personally. The human has their own vague revisable placeholder concept of suffering on which they ground value, and the alien has their own (and the chicken might have their own). Each also differs from the hypothetical third-personal concept.

Technically, we could say the humans and aliens have developed different ethical theories from each other, even if everyone's a classical utilitarian, say, because they're picking out different concepts of suffering on which to ground value.^[1] And your third-personal account would give a different ethical theory from each, too. All three (human, alien, third-personal) ethical theories could converge under full information, though, if the concepts of suffering would converge under full information (and if everything else would converge).^[2]

With the third-personal concept, I doubt there'd be a good solution to this two envelopes problem that actually gives you exactly one common moral scale and corresponding prior when you have enough uncertainty about the nature of suffering. You could come up with such a scale and prior, but you'd have to fix something pretty arbitrarily to do so. Instead, I think the thing to do is to assign credences across multiple scales (and corresponding priors) and use an approach to moral uncertainty that doesn't depend on comparisons between them. (EDIT: And these could be the alien stance and human stance which relatively prioritize the other and result in a two envelopes problem.) But what I'll say below applies even if you use a single common scale and prior.

When you have first-person experience with suffering, you can narrow down the common moral scales under consideration to ones based on your own experience. This would also have implications for your credences compared to the hypothetical third-person perspective.

If you started from no experience of suffering and then became a human, alien or chicken and experienced suffering as one of them, you could then rule out a bunch of scales (and corresponding priors). This would also result in big updates from your prior(s). You'd end up in a human-relative, alien-relative or chicken-relative account (or multiple such accounts, but for one species only).

^[1] A typical chicken very probably couldn't be a classical utilitarian.
^[2] A typical chicken's concept of suffering wouldn't converge, but we could capture/explain it. Their apparent normative stances wouldn't converge either, unless you imagine radically different beings.

Owen Cotton-Barratt

I understand that you're explaining why you don't really think it's well modelled as a two-envelope problem, but I'm not sure whether you're biting the bullet that you're predictably paying some utility in unnecessary ways (in this admittedly convoluted hypothetical), or if you don't think there's a bullet there to bite, or something else?

Michael St Jules 🔸

Alternatively, you might assume you actually already are a human, alien or chicken, have (and remember) experience with suffering as one of them, but are uncertain about which you in fact are. For illustration, let's suppose human or alien. Because you're uncertain about whether you're an alien or human, your concept of suffering points to one that will turn out to be human suffering with some probability, p, and alien suffering with the rest of the probability, 1-p. You ground value relative to your own concept of suffering, which could turn out to be (or be revised to) the human concept or the alien concept with respective probabilities.

Let H_H be the moral weight of human suffering according to a human concept of suffering, directly valued, and A_H be the moral weight of alien suffering according to a human concept of suffering, indirectly valued. Similarly, let A_A and H_A be the moral weights of alien suffering and human suffering according to the alien concept of suffering. A human would fix H_H, build a probability distribution for A_H relative to H_H and evaluate A_H in terms of it. An alien would fix A_A, build a probability distribution for H_A relative to A_A and evaluate H_A in terms of it.

You're uncertain about whether you're an alien or human. Still, you directly value your direct experiences. Assume A_A and H_H specifically represent the moral value of an experience of suffering you've actually had,^[1] e.g. the moral value of a toe stub, and you're doing ethics relative to your toe stubs as the reference point. You therefore set A_A = H_H. You can think of this as a unit conversion, e.g. 1 unit of alien toe stub-relative suffering = 10 units of human toe stub-relative suffering.

This solves the two envelopes problem. You can either use A_A or H_H to set your common scale, and the answer will be the same either way, because you've fixed the ratio between them. The moral value of a human toe stub, H, will be H_H with probability p, and H_A with probability 1-p. The moral weight of an alien toe stub, A, will be A_H with probability p and A_A with probability 1-p. You can just take expected values in either the alien or human units and compare.
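A minimal numerical sketch of this, with made-up distributions: `expected_weights`, the probabilities, and the ratios below are all hypothetical illustrations, not estimates. The point is only that once A_A = H_H is fixed, rescaling the shared unit leaves the comparison unchanged.

```python
import math

def expected_weights(p, a_h_dist, h_a_dist, unit=1.0):
    """p: probability you are human.
    a_h_dist: list of (prob, A_H / H_H) pairs, the human's view of alien suffering.
    h_a_dist: list of (prob, H_A / A_A) pairs, the alien's view of human suffering.
    unit: value assigned to your own toe stub, so A_A = H_H = unit."""
    e_a_h = unit * sum(q * r for q, r in a_h_dist)  # E[A_H]
    e_h_a = unit * sum(q * r for q, r in h_a_dist)  # E[H_A]
    e_h = p * unit + (1 - p) * e_h_a   # H is H_H w.p. p, H_A w.p. 1 - p
    e_a = p * e_a_h + (1 - p) * unit   # A is A_H w.p. p, A_A w.p. 1 - p
    return e_h, e_a

# Hypothetical inputs: each side thinks the other's toe stub is half or double its own.
dist = [(0.5, 0.5), (0.5, 2.0)]
e_h_1, e_a_1 = expected_weights(0.25, dist, dist, unit=1.0)
e_h_10, e_a_10 = expected_weights(0.25, dist, dist, unit=10.0)

ratio_1 = e_a_1 / e_h_1
ratio_10 = e_a_10 / e_h_10
```

With these inputs the expected alien-to-human ratio is the same in either choice of unit, which is what fixing the ratio between A_A and H_H buys you.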

We could also allow you to have some probability of being a chicken under this thought experiment. Then you could set A_A = H_H = C_C, with C_C representing the value of a chicken toe stub to a chicken, and C_A, C_H, A_C and H_C defined like above.

But if you're actually a chicken, then you're valuing human and alien welfare as a chicken, which is presumably not much, since chickens are very partial (unless you idealize). Also, if you're a human, it's hard to imagine being uncertain about whether you're a chicken. There's way too much information you need to screen off from consideration, like your capacities for reasoning and language and everything that follows from these. And if you're a chicken, you couldn't imagine yourself as a human or being impartial at all.

So, maybe this doesn't make sense, or we have to imagine some hypothetically cognitively enhanced chicken or an intelligent being who suffers like a chicken. You could also idealize chickens to be impartial and actually care about humans, but then you're definitely forcing them into a different normative stance than the ones chickens actually take (if any).

^{^}
It would have to be something "common" to the beings under consideration, or you'd have to screen off information about who does and doesn't have access to it or use of that information, because otherwise you'd be able to rule out some possibilities for what kind of being you are. This will look less reasonable with more types of beings under consideration, in case there's nothing "common" to all of them. For example, not all moral patients have toes to stub.

Will Howard🔹

We should fix and normalize relative to the moral value of human welfare, because our understanding of the value of welfare is based on our own experiences of welfare

I used to think this for exactly the same reason, but I now no longer do. The basic reason I changed my mind is the idea that uncertainty in the amount of welfare humans (or chickens) experience is naturally scale invariant. This scale invariance means that observing any particular absolute amount of welfare (by experiencing it directly) shouldn't update you as to the relative amount of welfare under different theories.

The following is a fairly "heuristic" version of the argument. I spent some time trying to formalise it better but got stuck on the maths, so I'm giving the version that was in my head before I tried that. I'm quite convinced it's basically true, though.

The argument

Consider only theories that allow the most aggregation-friendly version of hedonistic utilitarianism^[1]. Under this constraint, the total amount of utility experienced by one or more moral patients is some real quantity that can be expressed in objective units ("hedons"), and this quantity is comparable across the theories that we are allowing. You might imagine that you could consult God as to the utility of various world states and He could say truthfully "ah, stubbing your toe is -1 hedon". In your post you also suppose that you can measure this amount yourself through direct experience, which I find reasonable.

From the perspective of someone who is unable to experience utility themselves, there is a natural scale invariance to this quantity. This is clearest when considering the "ought" side of the theory: the recommendations of utilitarianism are unchanged if you scale utility up and down by any amount as it doesn't affect the rank ordering of world states.

Another way to get this intuition is to imagine an unfeeling robot that derives the concept of utility from some combination of interviewing moral patients and constructing a first principles theory^[2]. It could even get the correct theory, and derive that e.g. breaking your arm is 10 times as bad as stubbing your toe. It would still be in the dark about how bad these things are in absolute terms though. If God told it that stubbing your toe was –1 hedons that wouldn't mean anything to the robot. God could play a prank on the robot and tell it stubbing your toe was instead –1 millihedons, or even temporarily imbue the robot with the ability to feel pain and expose it to –1 millihedons and say "that's what stubbing your toe feels like". This should be equally unsurprising to the robot as being told/experiencing –1 hedon.

My claim is that the epistemic position of all the different theories of welfare is effectively that of this robot. And as a result of this, observing any absolute amount of welfare (utility) under theory A shouldn't update you as to what the amount would be under theory B, because both theories were consistent with any absolute amount of welfare to begin with. In fact they were "maximally uncertain" about the absolute amount, no amount should be any more or less of a surprise under either theory.

If you had a prior reason to think theory B gives say 5 times the welfare to humans as theory A (importantly in relative terms), then you should still think this after observing the absolute amount yourself, and this is what generates the thorny version of the two envelopes problem. I think there are sensible prior reasons to think there is such a relative difference for various pairs of theories.

For instance, suppose both A and B are essentially "neuron count" theories and agree on some threshold brain complexity for sentience, but then A says "amount of sentience" scales linearly with neuron count whereas B says it scales quadratically. It's reasonable to think that the amount of welfare in humans is much higher under B, maybe times higher.

Other examples where arguments like this can be made are:

A and B are the same except B has multiple conscious subsystems
A and B are predicting chicken welfare rather than human, and A says they are sentient whereas B says they are not. Clearly B predicts 0 times the welfare of A (equivalently A predicts infinity times the welfare of B)

Putting this in two envelopes terms

If we say we have two theories, 1 and 2, which you might imagine are a human-centric ($C_{1} / H_{1} = 10^{-10}$, $p_{1} = 0.9$)^[4] and an animal-inclusive ($C_{2} / H_{2} = 0.01$, $p_{2} = 0.1$) view, then we have:

$$E\left(\frac{C}{H}\right) = p_{1} \frac{C_{1}}{H_{1}} + p_{2} \frac{C_{2}}{H_{2}} \approx 0.001$$

And

$$E\left(\frac{H}{C}\right) = p_{1} \frac{H_{1}}{C_{1}} + p_{2} \frac{H_{2}}{C_{2}} \approx 9 \times 10^{9} \neq 1 / E\left(\frac{C}{H}\right)$$

As we are used to seeing.

But as you point out in your post, the quantities $H_{1}$ and $H_{2}$ are not necessarily the same (though you argue they should be treated as such) which makes this a nonsensical average of dimensionless numbers. E.g. $H_{1}$ could be 0.00001 hedons and $H_{2}$ could be 10 hedons, which would mean we are massively overcounting theory 1. The quantities we actually care about are $E (C)$ and $E (H)$ (dimension-ed numbers in units of hedons), or their ratio $E (C) / E (H)$ . We can write these as:

$$E(C) = p_{1} H_{1} \left(\frac{C_{1}}{H_{1}}\right) + p_{2} H_{2} \left(\frac{C_{2}}{H_{2}}\right) \tag{1}$$

$$E(H) = p_{1} H_{1} + p_{2} H_{2} = \bar{H} \tag{2}$$

This may seem like a roundabout way of writing these down, but remember that what we have from our welfare range estimates are values for $C_{i} / H_{i}$ , so these can't be cancelled further and the $H_{i}$ s are the minimum number of parameters we can add to pin down the equations. The ratio $E (C) / E (H)$ is then:

$$E(C) / E(H) = p_{1} \frac{H_{1}}{\bar{H}} \left(\frac{C_{1}}{H_{1}}\right) + p_{2} \frac{H_{2}}{\bar{H}} \left(\frac{C_{2}}{H_{2}}\right) \tag{3}$$

I find this easier to think about if the ratios are in terms of a specific theory, e.g. $H_{1}$, so you are always comparing what the relative amount of welfare is in theory X vs some definite reference theory. We can rearrange (3) to support this by dividing all the fractions through by $H_{1}$:

$$E(C) / E(H) = \frac{1}{\hat{H}} \left(p_{1} \frac{H_{1}}{H_{1}} \left(\frac{C_{1}}{H_{1}}\right) + p_{2} \frac{H_{2}}{H_{1}} \left(\frac{C_{2}}{H_{2}}\right)\right) \tag{4}$$

Where

$$\hat{H} = \bar{H} / H_{1} = p_{1} \frac{H_{1}}{H_{1}} + p_{2} \frac{H_{2}}{H_{1}}$$

Again, maybe this seems incredibly roundabout, but in this form it is more clear that we now only need the ratios $H_{i} / H_{1}$ not their absolute values. This is good according to the previous claims I have made:

Because of scale invariance, it's not possible to say anything about the absolute value of $H_{i}$
It is possible to reason about the relative welfare values between theories, represented by $H_{i} / H_{1}$

So under this framing the "solution to the two envelopes problem for moral weights" is that you need to estimate the inter-theoretic welfare ratios for humans (or any reference moral patient), as well as the intra-theoretic ratios between moral patients. I.e. you have to estimate $H_{i} / H_{1}$ as well as $C_{i} / H_{i}$ and $p_{i}$ for each theory.

I think this is still quite a big problem because of the potential for arguing that some theories have combinatorially higher welfare than others, thus causing them to dominate even if you put a very low probability on them. The neuron count example above is like this; you could make it even worse by supposing a theory where welfare is exponential in neuron count.

Returning to the human-centric vs animal inclusive toy example

If we say we have two theories, 1 and 2, which you might imagine are a human-centric ($C_{1} / H_{1} = 10^{-10}$, $p_{1} = 0.9$)^[4] and an animal-inclusive ($C_{2} / H_{2} = 0.01$, $p_{2} = 0.1$) view

Adding these $H_{i} / H_{1}$ numbers into this example we now have:

$$E(C) / E(H) = \frac{1}{\hat{H}} \left(0.9 \left(10^{-10}\right) + 0.1 \frac{H_{2}}{H_{1}} (0.01)\right) \tag{4}$$

What should the value of $H_{2} / H_{1}$ be? Well in this case I think it's reasonable to suppose $H_{1}$ and $H_{2}$ are in fact equal, as we don't have any principled reason not to, so this still comes out to ~0.001. As in the original version we can flip this around to see if we get a wildly different answer if we make the inter-theoretic comparison be between chickens:

$$E(H) / E(C) = \frac{1}{\hat{C}} \left(0.9 \left(10^{10}\right) + 0.1 \frac{C_{2}}{C_{1}} (100)\right) \tag{4}$$

Now what should $C_{2} / C_{1}$ be, recalling that theory 1 says chickens are worth very little compared to humans? I think it's reasonable to say that $C_{1}$ is also very little compared to $C_{2}$, since the point of theory 1 is basically to suppose chickens aren't (or are barely) sentient, and not to say anything about humans. Supposing that none of the difference is explained by humans, we get $C_{2} / C_{1} = 10^{8}$, this also gives $\hat{C} \approx 10^{7}$, so $E (H) / E (C)$ comes out to ~1000. This is the inverse of $E (C) / E (H)$ as we expect.

Clearly this is just rearranging the same numbers to get the same result, but hopefully it illustrates how explicitly including these $H_{i} / H_{1}$ ratios makes the two envelope problem that you get by naively inverting the ratios less spooky, because by doing so you are effectively wildly changing the estimates of $H_{i} / H_{1}$ .
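The toy example can be checked numerically with a short sketch of equation (4). The function name `ratio_of_expectations` is mine, and the assumption $H_{1} = H_{2}$ is the one made in this comment.

```python
import math

def ratio_of_expectations(probs, intra, inter):
    """E(X) / E(Y) following equation (4).
    intra[i] = X_i / Y_i (within theory i); inter[i] = Y_i / Y_1 (across theories)."""
    y_hat = sum(p * r for p, r in zip(probs, inter))
    return sum(p * r * x for p, r, x in zip(probs, inter, intra)) / y_hat

probs = [0.9, 0.1]
c_over_h = [1e-10, 0.01]

# Human-referenced version, assuming H_2 / H_1 = 1.
e_c_over_e_h = ratio_of_expectations(probs, c_over_h, [1.0, 1.0])

# Chicken-referenced version: with H_1 = H_2, C_2 / C_1 = (C_2/H_2) / (C_1/H_1) = 10^8.
h_over_c = [1 / r for r in c_over_h]
c2_over_c1 = c_over_h[1] / c_over_h[0]
e_h_over_e_c = ratio_of_expectations(probs, h_over_c, [1.0, c2_over_c1])
```

Here `e_c_over_e_h` is about 0.001 and `e_h_over_e_c` is about 1000, so the two framings agree once the inter-theoretic ratios are made explicit and kept consistent.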

I agree with you that there are many cases where for the specific theories under consideration it is right to assume that $H_{1}$ and $H_{2}$ are equal (because we have no principled reason not to), but that this is not because we are able to observe welfare directly (even if we suppose that this is possible). And for many pairs of theories we might think $H_{1}$ and $H_{2}$ are very different.

(Apologies for switching back and forth between "welfare" and "utility", I'm basically treating them both like "utility")

^[1] I think it's right to start with this case, because it should be the easiest. So if something breaks in this case it is likely to also break once we start trying to include things like non-welfare moral reasons.
^[2] "I've met a few of those"
^[3] We can label the "true" theory as A, because we only get the chance to experience the true theory (we just don't know which one it is).
^[4] You could make this actually zero, but I think adding infinity in makes the argument more confusing.

Michael St Jules 🔸

There's a lot here, so I'll respond to what seems to be most cruxy to me.

Another way to get this intuition is to imagine an unfeeling robot that derives the concept of utility from some combination of interviewing moral patients and constructing a first principles theory^[2]. It could even get the correct theory, and derive that e.g. breaking your arm is 10 times as bad as stubbing your toe. It would still be in the dark about how bad these things are in absolute terms though.

I agree with this, but I don't think this is our epistemic position, because we can understand all value relative to our own experiences. (See also a thread about an unfeeling moral agent here.)

My claim is that the epistemic position of all the different theories of welfare is effectively that of this robot. And as a result of this, observing any absolute amount of welfare (utility) under theory A shouldn't update you as to what the amount would be under theory B, because both theories were consistent with any absolute amount of welfare to begin with. In fact they were "maximally uncertain" about the absolute amount, no amount should be any more or less of a surprise under either theory.

I agree that directly observing the value of a toe stub, say, under hedonism might not tell you much or anything about its absolute value under non-hedonistic theories of welfare.^[1]

However, I think we can say more under variants of closer precise theories. I think you can fix the badness of a specific toe stub across many precise theories. But then also separately fix the badness of a papercut and many other things under the same theories. This is because some theories are meant to explain the same things, and it's those things to which we're assigning value, not directly to the theories themselves. See this section of my post. And those things in practice are human welfare (or yours specifically), and so we can just take the (accessed) human-relative stances.

You illustrate with neuron count theories, and I would in fact say we should fix human welfare across those theories (under hedonism, say, and perhaps separately for different reference point welfare states), so evidence about absolute value under one hedonistic neuron count theory would be evidence about absolute value under other hedonistic theories.

I suspect conscious subsystems don't necessarily generate a two envelopes problem; you just need to calculate the expected number of subsystems and their expected aggregate welfare relative to accessed human welfare. But it might depend on which versions of conscious subsystems we're considering.

For predictions of chicken sentience, I'd say to take expectations relative to human welfare (separately with different reference point welfare states).

^{^}
I'd add a caveat that evidence about relative value under one theory can be evidence under another. If you find out that a toe stub is less bad than expected relative to other things under hedonism, then the same evidence would typically support that it's less bad for desires and belief-like preferences than you expected relative to the same other things, too.

Will Howard🔹

I'm still trying to work through the maths on this so I won't respond in much detail until I've got further with that, I may end up writing a separate post. I did start off at your position so there's some chance I will end up there, I find this very confusing to think about.

Some brief comments on a couple of things:

I agree with this, but I don't think this is our epistemic position, because we can understand all value relative to our own experiences.

I think relative is the operative word here. That is, you experience that a toe stub is 10 times worse than a papercut, and this motivates the development of moral theories that are consistent with this, and rules out ones that are not (e.g. ones that say they are equally bad). But there is an additional bit of parameter fixing that has to happen to get from the theory predicting this relative difference to predicting the absolute amount.

My claim is that at least generally speaking, and I think actually always, theories that are under consideration only predict these relative differences and not the absolute amounts. E.g. if a theory supposes that a certain pain receptor causes suffering when activated, then it might suppose that 10 receptors being activated causes 10 times as much suffering, but it doesn't say anything about the absolute amount. This is also true of more fundamental theories (e.g. more information processing => more sentience). I have some ideas about why this is^[1], but mainly I can't think of any examples where this is not the case. If you can think of any then please tell me as that would at least partially invalidate this scale invariance thing (which would be good).
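A minimal sketch of this scale invariance claim, in Python. The receptor counts and scale factors are hypothetical and chosen only for illustration: a theory on which suffering is proportional to the number of activated receptors fixes the ratios between states, while any positive overall scale fits those ratios equally well.

```python
# Hypothetical theory: suffering is proportional to activated pain receptors.
# The receptor counts and scale factors below are illustrative assumptions.
receptors = {"papercut": 1, "toe stub": 10, "broken arm": 100}

def suffering(scale):
    """Absolute suffering for each state under a given overall scale."""
    return {state: scale * n for state, n in receptors.items()}

def relative_to(values, reference):
    """Express every value as a multiple of the reference state."""
    return {state: v / values[reference] for state, v in values.items()}

small = suffering(scale=0.5)
large = suffering(scale=1000.0)

# The absolute amounts differ by a factor of 2000, yet the ratios the theory
# predicts (and the ranking of states) are identical.
ratios_small = relative_to(small, "papercut")
ratios_large = relative_to(large, "papercut")
print(ratios_small)
print(ratios_large)
```

Nothing in the predicted ratios distinguishes the two scales, which is the sense in which such a theory says nothing about absolute amounts.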

I think you would also say that theories don't need to predict this overall scale parameter because we can always fix it based on our observations of absolute utility... this is the bit of maths that I'm not clear on yet, but I do currently think this is not true (i.e. the scale parameter does matter still, especially when you have a prior reason to think there would be a difference between the theories).

I agree that directly observing the value of a toe stub, say, under hedonism might not tell you much or anything about its absolute value under non-hedonistic theories of welfare.... However, I think we can say more under variants of closer precise theories.

I was intending to restrict to only theories that fall under hedonism, because I think this is the case where this kind of cross theory aggregation should work the best. And given that I think this scale invariance problem arises there then it would be even worse when considering more dissimilar theories.

So I was considering only theories where the welfare relevant states are things that feel pretty close to pleasure and pain, and you can be uncertain about how good or bad different states are for common sense reasons^[2], but you're able to tell at least roughly how good/bad at least some states are.

^{^}
Mentioned in the previous comment. One is that the prescriptions of utilitarianism have this scale invariance (only distinguish between better/worse), as do the behaviours associated with pleasure/pain (e.g. you can only communicate that something is more/less painful, or [for animals] show an aversion to a more painful thing in favour of a less painful thing).
^{^}
E.g. you might not remember them, you might struggle to factor in duration, the states might come along with some non-welfare-relevant experience which biases your recollection (e.g. a painfully bright red light vs a painfully bright green light)

Michael St Jules 🔸

My claim is that at least generally speaking, and I think actually always, theories that are under consideration only predict these relative differences and not the absolute amounts.
(...)
I have some ideas about why this is^[1], but mainly I can't think of any examples where this is not the case. If you can think of any then please tell me as that would at least partially invalidate this scale invariance thing (which would be good).

I think what matters here is less whether they predict absolute amounts than which ones can be put on common scales. If everything could be put on the same common scale, then we would predict values relative to that common scale, and could treat the common scale like an absolute one. But scale invariance would still depend on you using that scale in a scale-invariant way with your moral theory.

I do doubt all theories can be put on one common scale together this way, but I suspect we can find common scales across some subsets of theories at a time. I think there usually is no foundational common scale between any pair of theories, but I'm open to the possibility in some cases, e.g. across approaches for counting conscious subsystems, causal vs evidential decision theory (MacAskill et al., 2019), in some pairs of person-affecting vs total utilitarian views (Riedener, 2019, also discussed in my section here). This is because the theories seem to recognize the same central and foundational reasons, but just find that they apply differently or in different numbers. You can still value those reasons identically across theories. So, it seems like they're using the same scale (all else equal), but differently.

I'm not sure, though. And maybe there are multiple plausible common scales for a given set of theories, but this could mean a two envelopes problem between those common scales, not between the specific theories themselves.

And I agree that there probably isn't a shared foundational common scale across all theories of consciousness, welfare and moral weights (as I discuss here).

I think you would also say that theories don't need to predict this overall scale parameter because we can always fix it based on our observations of absolute utility

Ya, that's roughly my position, and more precisely that we can construct common scales based on our first-person observations of utility, although with the caveat that in fact these observations don't uniquely determine the scale, so we still end up with multiple first-person observation-based common scales.

this is the bit of maths that I'm not clear on yet, but I do currently think this is not true (i.e. the scale parameter does matter still, especially when you have a prior reason to think there would be a difference between the theories).

Do you think we generally have the same problem for other phenomena, like how much water there is across theories of the nature of water or the strength of gravity as we moved from the Newtonian picture to general relativity? So, we shouldn't treat theories of water as using a common scale, or theories of gravity as using a common scale? Again, maybe you end up with multiple common scales for water, and multiple for gravity, but the point is that we still can make some intertheoretic comparisons, even if vague/underdetermined, based on the observations the theories are meant to explain, rather than say nothing about how they relate.

In these cases, including consciousness, water and gravity, it seems like we first care about the observations, and then we theorize about them, or else we wouldn't bother theorizing about them at all. So we do some (fairly) theory-neutral valuing.

Owen Cotton-Barratt

Since the heart of your case is "well we know what human experience is like so we can treat that as a fixed point", I'm just going to point out various ways in which we don't necessarily know what human experience is like, and some of the implications if we more narrowly try to anchor on what we know and otherwise adopt what I take to be your stance on the two-envelope problem:

We each only experience our own consciousness
- It seems decently likely that humans vary in some dimension like degree- or intensity-of-consciousness
  - Generically, we won't know if we're above- or below-average on this
  - So in expectation, others' experiences all matter more than our own
    - But in aggregate, a society of fully altruistic people would make errors if they each act on the assumption that their own experience matters less in expectation than other people's
In the moment writing this, I don't know what intense pain or intense pleasure feel like
- I can only base my judgement of these things on memory
  - But memory, as we know in many contexts, could be faulty
- Because there is more at stake in worlds where my memory is minimizing rather than exaggerating my past experiences, I should act on the assumption that my memory is systematically skewed in this way
It's not unusual for people to lie to themselves about their own experiences
- e.g. telling themselves things are fine while at some level experiencing significant psychological suffering
- So we should assume that our top-level consciousness doesn't always have full access to our morally relevant experience even in the moment of experiencing it
- Our uncertainty should presumably include some worlds where a large majority of our morally relevant experience is opaque to us; so in expectation the moral weight we assign ourselves should be rather higher than the one which is experienced and hence "known"
We're unable to tell how many times our experience is being instantiated
- On accounts where that's morally relevant, this could have a big impact on the expectation of the moral worth of our experiences

To be clear, I don't endorse the conclusions here — but in each case my instinct is that I'm getting off the train by saying "seems like there's some two-envelope type phenomenon going on here, so I'm not happy straightforwardly taking expectations".
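The structure of the first point can be sketched numerically. This is only an illustration under assumed numbers, not part of the argument above: suppose each person takes another person's intensity of experience to be equally likely to be double or half their own.

```python
# Assumed for illustration: relative to my own intensity (normalised to 1),
# another person's intensity is 2x or 0.5x with equal probability.
possible_ratios = [2.0, 0.5]
probabilities = [0.5, 0.5]

# Expected intensity of the other person, measured in units of my own.
expected_other_over_self = sum(
    p * r for p, r in zip(probabilities, possible_ratios)
)

# The other person runs the same calculation from their side: the ratios are
# inverted, but the distribution (and hence the expectation) is the same.
expected_self_over_other = sum(
    p / r for p, r in zip(probabilities, possible_ratios)
)

print(expected_other_over_self)  # 1.25
print(expected_self_over_other)  # 1.25
```

Both expectations exceed 1, so each person expects the other's experience to matter more, which cannot be right for both at once.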

Michael St Jules 🔸


I basically agree with all of this, and make some similar points in my sections Multiple possible reference points and Conscious subsystems. I think there are still two envelopes problems between what we actually access, and we don't have a nice way of uniquely fixing comparisons. But, I think it's defensible to do everything human-relative or relative to your own experiences (which are human, so this is still human-relative), i.e. relative to what's accessed. You'll need to use multiple reference points.

Owen Cotton-Barratt

Thanks for the exploration of this.

I'm concerned that this approach is structurally very vulnerable to fanaticism / muggings. This matters for insect experience, and for possible moral relevance of single-cell organisms (ok, before getting to this case we'd likely want to revisit your section on subsystems of the brain and consider the possibility of individual neurons having morally relevant experience that our consciousness doesn't get proper access to). It could matter especially for how much we chase after the possibility of artificial minds with far far greater capacity for morally relevant experience than humans.

I guess I see this as the central issue with normalizing this way, and was sort of hoping you'd say more about it. It gets discussed a little when you talk about the possibility of overlapping conscious subsystems of the brain, but I'm unclear what your stance is towards it in general, or what you would say to someone who objected to this approach because it seemed to give a fanatical weight to chickens in the human/chicken comparison? (perhaps having somewhat different probabilities than you on the likelihood of different levels of chicken moral relevance)

Michael St Jules 🔸

I agree that this approach, if you're something like a (risk neutral) expectational utilitarian, is very vulnerable to fanaticism / muggings, but that to me is a problem for expectational utilitarianism. To you and "to someone who objected to this approach because it seemed to give a fanatical weight to chickens in the human/chicken comparison", I'd say to put more weight on normative stances that are less fanatical than expectational utilitarianism.

I personally reserve substantial skepticism of expected value maximization in general (both within moral stances and for handling moral uncertainty between them), expected value maximization with unbounded value specifically, aggregation in general and aggregation by summation. I'd probably end up with "worldview buckets" based on different attitudes towards risk/uncertainty, aggregation and grounds for moral value (types of welfare, non-welfarist values, as in the problem of multiple (human) reference points). RP's CURVE sequence goes over attitudes to risk and their implications for intervention and cause prioritization. Then, I doubt these stances would be intertheoretically comparable. For uncertainty between them, I'd use an approach to moral uncertainty that didn't depend on intertheoretic comparisons, like a moral parliament, a bargain-theoretic approach, variance voting or just sizing worldview buckets proportionally to credences.

In practice, within a neartermist focus (and ignoring artificial consciousness), this could conceivably roughly end up looking like a set of resource buckets: a human-centric bucket, a bucket for mammals and birds, a bucket for all vertebrates, a bucket for all vertebrates + sufficiently sophisticated invertebrates, a bucket for all animals, and a ~panpsychist bucket.^[1] However, the boundaries between these buckets would be soft (and softer), because the actual buckets don't specifically track a human-centric view, a vertebrate view, etc. My approach would also inform how to size the buckets and limit risky interventions within them.

For example, fix some normative stance, and suppose within it:

you thought a typical chicken had a 1% chance of having roughly the same moral weight (per year) as a typical human (according to specific moral grounds), and didn't matter at all otherwise.
you aggregate via summation.
you thought helping chickens (much) at all would be too fanatical.

Then that view would also recommend against human-helping interventions with at most a 1% probability of success.^[2] Or, you could include some chicken interventions alongside many more risky but roughly statistically independent human-helping interventions, because many independent risky (positive expected value) bets together don't look as risky. Still, this stance shouldn't bet everything on an intervention helping humans with only a 1% chance of success, because otherwise it could just bet everything on chickens with a similar payoff distribution. This stance would limit risky bets. Every stance could limit risky bets, but the ones that end up human-centric in practice would tend to do so more than others.
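To illustrate why many independent risky bets together don't look as risky, here is a rough sketch in Python. The 1% success probability is the one from the example above; the numbers of bets are arbitrary, and exact independence between bets is an assumption.

```python
def prob_at_least_one_success(p_success, n_bets):
    """Chance that at least one of n independent bets pays off."""
    return 1 - (1 - p_success) ** n_bets

# 1% is the success probability used in the example above; the bet counts are
# arbitrary illustrative choices, and independence is an assumption.
for n_bets in (1, 10, 100, 500):
    print(n_bets, round(prob_at_least_one_success(0.01, n_bets), 3))
```

A single 1% bet almost always achieves nothing, whereas a portfolio of a hundred such bets makes at least some difference more often than not.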

^{^}
Or, maybe some of the later buckets are just replaced with longtermist buckets, if and because longtermist bets could have similar probabilities of making a difference, but better payoffs when they succeed.
^{^}
Depending on the nature of your attitudes to risk. This could follow from difference-making risk aversion or probability difference discounting of some kind. On the other hand, if you maximized the expected utility of the arctan of total welfare, a bounded function, then you'd prioritize marginal local improvements to worlds with small populations and switching between big and small populations, while ignoring marginal local improvements to worlds with large populations. This could also mean ignoring chickens but not marginal local improvements for humans, because if chickens don't count and we go extinct soon (or future people don't count), then the population is much smaller.
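A small sketch of the arctan point, in Python. The welfare totals are hypothetical and in arbitrary units. The marginal value of one more unit of welfare is the derivative of arctan, 1 / (1 + W^2), which shrinks quickly as total welfare W grows.

```python
import math

def marginal_value(total_welfare):
    """Derivative of arctan(W): value of one extra unit of welfare at W."""
    return 1 / (1 + total_welfare ** 2)

# Hypothetical totals, in arbitrary welfare units.
small_world = 10.0
large_world = 1e6

print(marginal_value(small_world))
print(marginal_value(large_world))
print(math.atan(large_world), math.pi / 2)  # utility is already near its bound
```

Under these assumed totals, the same local improvement is worth many orders of magnitude more in the small world than in the large one.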

Owen Cotton-Barratt

Is the two-envelope problem, as you understand it, a problem for anything except expectational utilitarianism?

I'm asking because it feels to me like you're saying roughly "yes yes although I proposed a solution to the two-envelope problem I agree it's very much still a problem, so you also need an entirely different type of solution to address it". I think this is a bit of a caricature of what you're saying, and I suspect that it's an unfair one, but I can't immediately see how it's unfair, so I'm asking this way to try to get quickly to the heart of what's going on.

Michael St Jules 🔸

Is the two-envelope problem, as you understand it, a problem for anything except expectational utilitarianism?

I think it is or would have been a problem for basically any normative stance (moral theory + attitudes towards risk, etc.) that is ~~at all~~ sensitive to risk/uncertainty and stakes roughly according to expected value.^[1]

I think I've given a general solution here to the two envelopes problem for moral weights (between moral patients) when you fix your normative stance but have remaining empirical/descriptive uncertainty about the moral weights of beings conditional on that stance. It can be adapted to different normative stances, but I illustrated it with versions of expectational utilitarianism. (EDIT: And I'm arguing that a lot of the relevant uncertainty actually is just empirical, not normative, more than some have assumed.)

For two envelopes problems between normative stances, I'm usually skeptical of intertheoretic comparisons, so would mostly recommend approaches that don't depend on them.

^{^}
(Footnote added in an edit of this comment.)
For example, I think there's no two envelopes problem for someone who maximizes the median value, because the reciprocal of the median is the median of the reciprocal.
But I'd take it to be a problem for anyone who roughly maximizes an expected value or counts higher expected value in favour of an act, e.g. does so with constraints, or after discounting small probabilities. They don't have to be utilitarian or aggregate welfare at all, either.
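A quick check of the median claim, as a sketch in Python. The ratios are hypothetical, and the list has an odd number of values so that the median is one of the listed values.

```python
from statistics import mean, median

# Hypothetical moral weight ratios (chicken relative to human), one per model.
# An odd number of values keeps the median a single observed value.
chicken_over_human = [0.001, 0.01, 0.332, 1.0, 3.0]
human_over_chicken = [1 / r for r in chicken_over_human]

# Median: taking reciprocals reverses the order, so the middle value stays
# the middle value and the two reference points agree.
median_product = median(chicken_over_human) * median(human_over_chicken)

# Mean: the two reference points disagree (the product exceeds 1).
mean_product = mean(chicken_over_human) * mean(human_over_chicken)

print(median_product, mean_product)
```

With an even number of values, `statistics.median` averages the two middle values and the identity can fail, so the claim is cleanest for distributions with a unique median.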

Owen Cotton-Barratt

OK thanks. I'm going to attempt a summary of where I think things are:

In trying to assess moral weights, you can get two-envelope problems for both empirical uncertainty and normative uncertainty
Re. empirical uncertainty, you argue that there isn't a two-envelope problem, and you can just treat it like any other empirical uncertainty
- In my other comment thread I argue that just like the classic money-based two-envelope problem, there's still a problem to be addressed, and it probably needs to involve priors
Re. normative uncertainty, you would tend to advise approaches which help to dodge facing two-envelope problems in the first place, alongside dodging facing a bunch of other issues
- I'm sympathetic to this, although I don't think it's uncontroversial
You argue that a lot of the uncertainty should be understood to be empirical rather than normative — but you also think quite a bit of it is normative (insofar as you recommend people allocating resources into buckets associated with different worldviews)
- I kind of get where you're coming from here, although I feel that the lines between what's empirical and what's normative uncertainty are often confusing, and so I kind of want action-guiding advice to be available for actors who haven't yet worked out how to disentangle them. (I'm also not certain that the "different buckets for different worldviews" is the best approach to normative uncertainty, although as a pragmatic matter I certainly don't hate it, and it has some theoretical appeal.)

Does that seem wrong anywhere to you?

Michael St Jules 🔸

This all seems right to me.

(I wouldn't pick out the worldview bucket approach as the solution everyone should necessarily find most satisfying, given their own intuitions/preferences, but it is one I tend to prefer now.)

Owen Cotton-Barratt

Ok great. In that case one view I have is that it would be clearer to summarize your position (e.g. in the post title) as "there isn't a two envelope problem for moral weights", rather than as presenting a solution.

Vasco Grilo🔸

Hi Michael,

I would be curious to know your thoughts on the approach I outlined here:

Let me try to restate your point [the 2 envelopes problem], and suggest why one may disagree. If one puts weight w on the welfare range (WR) of humans relative to that of chickens being N, and 1 - w on it being n, the expected welfare range of:
Humans relative to that of chickens is E("WR of humans"/"WR of chickens") = w*N + (1 - w)*n.
Chickens relative to that of humans is E("WR of chickens"/"WR of humans") = w/N + (1 - w)/n.
You are arguing that N can plausibly be much larger than n. For the sake of illustration, we can say N = 389 (ratio between the 86 billion neurons of a human and 221 M of a chicken), n = 3.01 (reciprocal of RP's median welfare range of chickens relative to humans of 0.332), and w = 1/12 (since the neuron count model was one of the 12 RP considered, and all of them were weighted equally). Having the welfare range of:
Chickens as the reference, E("WR of humans"/"WR of chickens") = 35.2. So 1/E("WR of humans"/"WR of chickens") = 0.0284.
Humans as the reference (as RP did), E("WR of chickens"/"WR of humans") = 0.305.
So, as you said, determining welfare ranges relative to humans results in animals being weighted more heavily. However, I think the difference is much smaller than suggested above. Since N and n are quite different, I guess we should combine them using a weighted geometric mean, not the weighted mean as I did above. If so, both approaches output exactly the same result:
E("WR of humans"/"WR of chickens") = N^w*n^(1 - w) = 4.51. So 1/E("WR of humans"/"WR of chickens") = (N^w*n^(1 - w))^-1 = 0.222.
E("WR of chickens"/"WR of humans") = (1/N)^w*(1/n)^(1 - w) = 0.222.
The reciprocal of the expected value is not the expected value of the reciprocal, so using the mean leads to different results. However, I think we should be using the geometric mean, and the reciprocal of the geometric mean is the geometric mean of the reciprocal. So the 2 approaches (using humans or chickens as the reference) will output the same ratios regardless of N, n and w as long as we aggregate N and n with the geometric mean. If N and n are similar, it no longer makes sense to use the geometric mean, but then both approaches will output similar results anyway, so RP's approach looks fine to me as a 1st pass. Does this make any sense?
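The arithmetic in this quoted comment can be reproduced with a short Python sketch. The values of N, n and w are the ones stated above; the variable names are mine.

```python
import math

N = 389        # human/chicken ratio under the neuron count model
n = 1 / 0.332  # reciprocal of RP's median welfare range of chickens (about 3.01)
w = 1 / 12     # weight on the neuron count model

# Weighted arithmetic means under each choice of reference.
humans_per_chicken = w * N + (1 - w) * n
chickens_per_human = w / N + (1 - w) / n

# Weighted geometric means under each choice of reference.
geo_humans_per_chicken = N ** w * n ** (1 - w)
geo_chickens_per_human = (1 / N) ** w * (1 / n) ** (1 - w)

print(f"arithmetic: {1 / humans_per_chicken:.4f} vs {chickens_per_human:.4f}")
print(f"geometric:  {1 / geo_humans_per_chicken:.4f} vs {geo_chickens_per_human:.4f}")
```

The two arithmetic figures differ by roughly a factor of ten, while the two geometric figures coincide whatever values are chosen for N, n and w.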

Michael St Jules 🔸

I think a weighted geometric mean is unprincipled and won't reflect expected value maximization (if w is meant to be a probability). It's equivalent to weighting by the following, where $X$ is the ratio of moral weights (or maybe conditional on being positive):

$$\operatorname{geomean}(X) = e^{E[\log(X)]}$$

The expectation is in the exponent, but taking expectations is supposed to be the last thing we do, after aggregation, if we're maximizing an expected value.

It's not clear if it would be a good approximation of more principled approaches, but it seems like a compromise between the human-relative and animal-relative approaches and should (always?) give intermediate moral weights.
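On the "(always?)": for a strictly positive ratio $X$ with finite $E[X]$ and $E[1/X]$, Jensen's inequality suggests the answer is yes. A short derivation under those assumptions (it does not cover cases with probability mass at zero or infinity):

```latex
\begin{align*}
E[\log X] &\le \log E[X]
  && \text{(Jensen, since $\log$ is concave)} \\
E[\log X] = -E[\log(1/X)] &\ge -\log E[1/X]
  && \text{(the same inequality applied to $1/X$)} \\
\Rightarrow\quad \frac{1}{E[1/X]} &\le e^{E[\log X]} \le E[X].
\end{align*}
```

So the geometric mean of $X$ lies between the expectation of $X$ taken directly and the reciprocal of the expectation of $1/X$, i.e. between the figures the two choices of reference would give.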

It and both the unmodified human-relative and animal-relative solutions also hide the differences between types of uncertainty. For example, I think conscious subsystems should be treated separately like the number of moral patients.

Also, you shouldn't be taking the square root in the weighted geometric mean. You need the exponents to sum to 1, not 0.5.

EDIT: And you need to condition on both humans and the other animal having nonzero moral weight before taking the weighted geometric mean, or else you'll get 0, infinite or undefined weighted geometric means. If you take the expected value of the conditional weighted geomean, you would have something like

$$g(X) = E\left[e^{E[\log(X) \mid X > 0, X < \infty]}\right]$$

but then $g (X) * g (X^{- 1}) > 1$ (and probably at least one of the two should be infinite, anyway), so you have a two envelopes problem again.

Vasco Grilo🔸

Thanks for the reply!

I think a weighted geometric mean is unprincipled and won't reflect expected value maximization (if w is meant to be a probability).

I agree it is unprincipled, and I strongly endorse expected value maximisation in principle, but maybe using the geometric mean is still a good method in practice?

The mean ignores information from extremely low predictions, and overweights outliers.
The weighted/unweighted geometric mean performed better than the weighted/unweighted mean on Metaculus' questions.
Samotsvety aggregated predictions differing a lot between them from 7 forecasters^[1] using the geometric mean after removing the lowest and highest values.

Also, you shouldn't be taking the square root in the weighted geometric mean. You need the exponents to sum to 1, not 0.5.

Thanks! Corrected.

you need to condition on both humans and the other animal having nonzero moral weight

I think the welfare range outputted by any given model should always be positive.

^{^}
For the question "What is the unconditional probability of London being hit with a nuclear weapon in October?", the 7 forecasts were 0.01, 0.00056, 0.001251, 10^-8, 0.000144, 0.0012, and 0.001. The largest of these is 1 M (= 0.01/10^-8) times the smallest.
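For comparison, the aggregation methods mentioned in this thread can be applied to these 7 forecasts. This is a sketch in Python with helper names of my own, not a reconstruction of Samotsvety's exact procedure.

```python
import math

# The 7 forecasts quoted in the footnote above.
forecasts = [0.01, 0.00056, 0.001251, 1e-8, 0.000144, 0.0012, 0.001]

def arithmetic_mean(ps):
    return sum(ps) / len(ps)

def geometric_mean(ps):
    return math.exp(sum(math.log(p) for p in ps) / len(ps))

def geometric_mean_of_odds(ps):
    """Pool odds p / (1 - p) geometrically, then convert back to a probability."""
    pooled_odds = geometric_mean([p / (1 - p) for p in ps])
    return pooled_odds / (1 + pooled_odds)

# Drop the lowest and highest forecast before pooling.
trimmed = sorted(forecasts)[1:-1]

print("mean:                  ", arithmetic_mean(forecasts))
print("geometric mean:        ", geometric_mean(forecasts))
print("geometric mean of odds:", geometric_mean_of_odds(forecasts))
print("trimmed geometric mean:", geometric_mean(trimmed))
```

The mean is dominated by the single 0.01 forecast, while the untrimmed geometric mean is pulled down sharply by the 10^-8 forecast; trimming the extremes lands in between.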

Michael St Jules 🔸

I agree it is unprincipled, and I strongly endorse expected value maximisation in principle, but maybe using the geometric mean is still a good method in practice?

I would want to know more about what our actual targets should plausibly be before making any such claim. I'm not sure we can infer much from your examples. Maybe an analogy is that we're aggregating predictions of different perspectives, though?

I think the welfare range outputted by any given model should always be positive.

Other animals could fail to be conscious, and so have welfare ranges of 0.

Vasco Grilo🔸

I would want to know more about what our actual targets should plausibly be before making any such claim. I'm not sure we can infer much from your examples.

I agree it would be good to know which aggregation methods perform better under different conditions, and performance targets. The geometric mean is better than the mean, in the sense of achieving a lower Brier and log score, for all Metaculus' questions. However, it might be this would not hold for a set of questions whose predictions are distributed more like the welfare ranges of the 12 models considered by Rethink Priorities. I would even be open to using different aggregation methods depending on the species, since the distribution of the 12 mean welfare ranges of each model varies across species.

Maybe an analogy is that we're aggregating predictions of different perspectives, though?

If the forecasts come from "all-considered views of experts", which I think is what you are calling "different perspectives", Jaime Sevilla suggests using the geometric mean of odds if poorly calibrated outliers can be removed, or the median otherwise. For the case of welfare ranges, I do not think one can say there are poorly calibrated outliers. So, if one interpreted each of the 12 models as one forecaster^[1], I guess Jaime would suggest determining the cumulative distribution function (CDF) of the welfare range from the geometric mean of the odds of the CDFs of the welfare ranges of the 12 models, as Epoch did for judgment-based AI timelines. I think using the geometric mean is also fine, as it performed marginally better than the geometric mean of odds in Metaculus' questions.

Jaime agrees with using the mean if the forecasts come from "models with mutually exclusive assumptions":

If you are not aggregating all-considered views of experts, but rather aggregating models with mutually exclusive assumptions, use the mean of probabilities.

However:

Models can have more or less mutually exclusive assumptions. The less they do, the more it makes sense to rely on the median, geometric mean, or geometric mean of odds instead of the mean.
There is not a strong distinction between all-considered views and the outputs of quantitative models, as the judgements of people are models themselves. Moreover, one should presumably prefer the all-considered views of the modellers over the models, as the former account for more information.
- Somewhat relatedly, Rethink recommends using the median (not mean) welfare ranges.

Other animals could fail to be conscious, and so have welfare ranges of 0.

Sorry for not being clear. I agree with the above if lack of consciousness is defined as having a null welfare range. However:

In practice, consciousness has to be operationalised as satisfying certain properties to a desired extent.
I do not think one can say that, conditional on such properties not being satisfied to the desired extent, the welfare range is 0.

So I would say one should put no probability mass on a null welfare range, and that the CDF of the welfare range should be continuous^[2]. In general, I assume zeros and infinities do not exist in the real world, even though they are useful in maths and physics to think about limiting processes.

^{^}
This sounds like a moral parliament in some way?
Side note. I sometimes link to concepts I know you are aware of, but readers may not be.
^{^}
In addition, I think the CDF of the welfare range should be smooth such that the probability density function (PDF) of the welfare range is continuous.

Michael St Jules 🔸

When ice seemed like it could have turned out to be something other than the solid phase of water, we would be comparing the options based on the common facts — the evidence or data — the different possibilities were supposed to explain. And then by finding out that ice is water, you learn that there is much more water in the world, because you would then also have to count all the ice on top of all the liquid water.^[13] If your moral theory took water to be intrinsically good and more of it to be better, this would be good news (all else equal).

Suppose we measure amounts by mass. The gram was in fact originally defined as the mass of one cubic centimetre of pure water at 0 °C.^[1] We could imagine having defined the gram as the mass of one cubic centimetre of liquid water, but using water that isn't necessarily pure, not fixing the temperature or using a not fully fixed measure for the centimetre. This introduces uncertainty about the measure of mass itself, and we'd later revise the definition as we understood more, but we could still use it in the meantime. We'd also aim to roughly match the original definition: the revised mass of one cubic centimetre of water shouldn't be too different from 1 gram under the new definition.

This is similar to what I say we'd do with consciousness: we define it first relative to human first-person experiences and measure relative to them, but revise the concept and measure with further understanding. We should also aim to make conservative revisions and roughly preserve the value in our references, human first-person experiences.

^{^}
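In French:
Gramme, le poids absolu d'un volume d'eau pure égal au cube de la centième partie du mètre, et à la température de la glace fondante.
In English: the gram is the absolute weight of a volume of pure water equal to the cube of the hundredth part of the metre, at the temperature of melting ice.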
https://web.archive.org/web/20160817122340/http://www.metrodiff.org/cmsms/index.php?page=18_germinal_an_3

Comments

The argument

Other examples where arguments like this can be made are:

A and B are the same except B has multiple conscious subsystems
A and B are predicting chicken welfare rather than human, and A says they are sentient whereas B says they are not. Clearly B predicts 0 times the welfare of A (equivalently A predicts infinity times the welfare of B)

Putting this in two envelopes terms

$E \left( \frac{C}{H} \right) = p_{1} \frac{C_{1}}{H_{1}} + p_{2} \frac{C_{2}}{H_{2}} \approx 0.001$

And

$E \left( \frac{H}{C} \right) = p_{1} \frac{H_{1}}{C_{1}} + p_{2} \frac{H_{2}}{C_{2}} \approx 9 \times 10^{9} \neq 1 / E \left( \frac{C}{H} \right)$

As we are used to seeing.

$E (C) = p_{1} H_{1} \left( \frac{C_{1}}{H_{1}} \right) + p_{2} H_{2} \left( \frac{C_{2}}{H_{2}} \right) \quad (1)$

$E (H) = p_{1} H_{1} + p_{2} H_{2} = \bar{H} \quad (2)$

$E (C) / E (H) = p_{1} \frac{H_{1}}{\bar{H}} \left( \frac{C_{1}}{H_{1}} \right) + p_{2} \frac{H_{2}}{\bar{H}} \left( \frac{C_{2}}{H_{2}} \right) \quad (3)$

$E (C) / E (H) = \frac{1}{\hat{H}} \left( p_{1} \frac{H_{1}}{H_{1}} \left( \frac{C_{1}}{H_{1}} \right) + p_{2} \frac{H_{2}}{H_{1}} \left( \frac{C_{2}}{H_{2}} \right) \right) \quad (4)$

Where

$\hat{H} = \bar{H} / H_{1} = p_{1} \frac{H_{1}}{H_{1}} + p_{2} \frac{H_{2}}{H_{1}}$

Because of scale invariance, it's not possible to say anything about the absolute value of $H_{i}$
It is possible to reason about the relative welfare values between theories, represented by $H_{i} / H_{1}$

Returning to the human-centric vs animal inclusive toy example

Suppose we have two theories, 1 and 2, which you might imagine as a human-centric view ( $C_{1} / H_{1} = 10^{- 10}$ , $p_{1} = 0.9$ )^[4] and an animal-inclusive view ( $C_{2} / H_{2} = 0.01$ , $p_{2} = 0.1$ ).

Adding these $H_{i} / H_{1}$ numbers into this example we now have:

$E (C) / E (H) = \frac{1}{\hat{H}} \left( 0.9 \left( 10^{- 10} \right) + 0.1 \frac{H_{2}}{H_{1}} (0.01) \right) \quad (4)$

$E (H) / E (C) = \frac{1}{\hat{C}} \left( 0.9 \left( 10^{10} \right) + 0.1 \frac{C_{2}}{C_{1}} (100) \right) \quad (4)$
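A minimal numerical sketch of this toy example, in Python. The credences and within-theory ratios are the ones given in the text. The intertheoretic ratio `h2_over_h1` (standing for $H_{2} / H_{1}$) is not fixed anywhere above, so the values tried below are purely illustrative assumptions, as are the function names.

```python
# Toy example from the text: theory 1 is human-centric, theory 2 animal-inclusive.
p = [0.9, 0.1]              # credences p_1, p_2
c_over_h = [1e-10, 0.01]    # C_i / H_i under each theory


def expected_ratio_c_over_h():
    """E(C/H): expectation of the chicken-to-human ratio."""
    return sum(pi * r for pi, r in zip(p, c_over_h))


def expected_ratio_h_over_c():
    """E(H/C): expectation of the human-to-chicken ratio."""
    return sum(pi / r for pi, r in zip(p, c_over_h))


def ratio_of_expectations(h2_over_h1):
    """E(C)/E(H) as in equation (4), for an assumed value of H_2 / H_1."""
    h_rel = [1.0, h2_over_h1]                        # H_i / H_1
    normaliser = sum(pi * h for pi, h in zip(p, h_rel))
    weighted = sum(pi * h * r for pi, h, r in zip(p, h_rel, c_over_h))
    return weighted / normaliser


print(expected_ratio_c_over_h())
print(expected_ratio_h_over_c())
for r in (1e-8, 1.0, 1e8):   # hypothetical values of H_2 / H_1
    print(r, ratio_of_expectations(r))
```

With $H_{2} / H_{1} = 1$ the ratio of expectations coincides with $E (C / H)$. As the assumed ratio grows, the result tends to $C_{2} / H_{2} = 0.01$, and as it shrinks, the result tends to $C_{1} / H_{1} = 10^{- 10}$. So the answer depends on the relative welfare values between theories, not on either ratio-based expectation alone.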

(Apologies for switching back and forth between "welfare" and "utility", I'm basically treating them both like "utility")

^{^}
I think it's right to start with this case, because it should be the easiest. So if something breaks in this case it is likely to also break once we start trying to include things like non-welfare moral reasons
^{^}
"I've met a few of those"
^{^}
We can label the "true" theory as A, because we only get the chance to experience the true theory (we just don't know which one it is)

^{^}

You could make this actually zero, but I think adding infinity in makes the argument more confusing

^{^}

I'd add a caveat that evidence about relative value under one theory can be evidence under another. If you find out that a toe stub is less bad than expected relative to other things under hedonism, then the same evidence would typically support that it's less bad for desires and belief-like preferences than you expected relative to the same other things, too.

^{^}

Mentioned in the previous comment. One is that the prescriptions of utilitarianism have this scale invariance (only distinguish between better/worse), as do the behaviours associated with pleasure/pain (e.g. you can only communicate that something is more/less painful, or [for animals] show an aversion to a more painful thing in favour of a less painful thing).

^{^}

E.g. you might not remember them, you might struggle to factor in duration, the states might come along with some non-welfare-relevant experience which biases your recollection (e.g. a painfully bright red light vs a painfully bright green light)

^{^}

Or, maybe some of the later buckets are just replaced with longtermist buckets, if and because longtermist bets could have similar probabilities of making a difference, but better payoffs when they succeed.

^{^}

Depending on the nature of your attitudes to risk. This could follow from difference-making risk aversion or probability difference discounting of some kind. On the other hand, if you maximized the expected utility of the arctan of total welfare, a bounded function, then you'd prioritize marginal local improvements to worlds with small populations and switching between big and small populations, while ignoring marginal local improvements to worlds with large populations. This could also mean ignoring chickens but not marginal local improvements for humans, because if chickens don't count and we go extinct soon (or future people don't count), then the population is much smaller.

^{^}

(Footnote added in an edit of this comment.)

For example, I think there's no two envelopes problem for someone who maximizes the median value, because the reciprocal of the median is the median of the reciprocal.

But I'd take it to be a problem for anyone who roughly maximizes an expected value or counts higher expected value in favour of an act, e.g. does so with constraints, or after discounting small probabilities. They don't have to be utilitarian or aggregate welfare at all, either.
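As a quick check of the median claim in this footnote, here is a small Python sketch. The sample values are made up for illustration. The exact equality relies on the values being strictly positive and the sample size being odd; with an even sample size, the usual midpoint convention for the median gives only approximate agreement.

```python
from statistics import mean, median

# Hypothetical positive ratios C/H across seven equally weighted views.
ratios = [1e-10, 1e-4, 0.001, 0.01, 0.05, 0.3, 2.0]
reciprocals = [1 / r for r in ratios]   # the corresponding H/C values

# x -> 1/x is decreasing on positive numbers, so it reverses the ordering
# and maps the middle element to the middle element.
print(median(reciprocals), 1 / median(ratios))

# The mean has no such property: the mean of 1/X differs from 1 / mean of X.
print(mean(reciprocals), 1 / mean(ratios))
```

The first line prints two equal numbers, so switching between chicken-relative and human-relative units does not change what a median-maximizer prefers. The second line prints two very different numbers, which is the two envelopes problem for expected values.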

^{^}

For the question "What is the unconditional probability of London being hit with a nuclear weapon in October?", the 7 forecasts were 0.01, 0.00056, 0.001251, 10^-8, 0.000144, 0.0012, and 0.001. The largest of these is 1 M (= 0.01/10^-8) times the smallest.

^{^}

Karnofsky (2018) wrote:

In this case, a >10% probability on the human-inclusive view would be effectively similar to a 100% probability on the human-centric view.

I assume he meant “human-centric view” instead of “human-inclusive view”, so I correct the quote with square brackets here.

^{^}

$E [1 / X] = 1 / E [X]$ if and only if $X$ is equal to a constant with probability 1, and $E [1 / X] > 1 / E [X]$ if $X$ is nonnegative and not equal to a constant with probability 1. This follows from Jensen's inequality, because $f$ defined by $f (x) = 1 / x$ is convex.
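A minimal worked check of this inequality in Python, using exact rational arithmetic. The two-point distribution and the helper `expectation` are arbitrary illustrative choices, not taken from the post.

```python
from fractions import Fraction


def expectation(dist, f=lambda x: x):
    """Expected value of f(X) for a finite distribution {value: probability}."""
    return sum(prob * f(x) for x, prob in dist.items())


# Illustrative assumption: X is 1 or 4 with equal probability.
x_dist = {Fraction(1): Fraction(1, 2), Fraction(4): Fraction(1, 2)}
e_x = expectation(x_dist)                      # E[X] = 5/2
e_inv = expectation(x_dist, lambda x: 1 / x)   # E[1/X] = 5/8

print(e_inv, 1 / e_x)   # E[1/X] versus 1/E[X]

# A constant X gives equality.
const = {Fraction(3): Fraction(1)}
print(expectation(const, lambda x: 1 / x), 1 / expectation(const))
```

Here $E [1 / X] = 5 / 8$ exceeds $1 / E [X] = 2 / 5$, while the constant case gives $1 / 3$ on both sides.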

^{^}

I would either use per unit averages for chickens and humans, respectively, or assume here that the value scales in proportion (or at least linearly) with each unit of measured welfare for each of humans and chickens, separately.

^{^}

However, some may believe objective moral value is threatened by illusionism about phenomenal consciousness, which denies that phenomenal consciousness exists. These positions do still recognize that consciousness exists, but they deny that it is phenomenal. We could just substitute an illusionist account of consciousness wherever phenomenal consciousness was used in our ethical theories, although some further revisions may be necessary to accommodate differences. For further discussion, see Kammerer, 2019, Kammerer, 2022 or a later section in this piece. The difference here is because some ethical theories directly value phenomenal consciousness specifically, and not (or less) consciousness in general.

Other examples could be free will, libertarian free will specifically or god(s) which may turn out not to exist, and so moral theories that tied some reasons specifically to them would lose those reasons.

If a moral theory only places value on things that actually exist in some form, while being more agnostic about their nature, then the value can follow the vague and revisable concepts of those things.

^{^}

Except possibly for indirect and instrumental reasons. It’s useful to know water is H₂O.

^{^}

This could be cashed out in terms of acquaintance, as in knowledge by acquaintance (Hasan, 2019, Duncan, 2021, Knowles & Raleigh, 2019), or appearance, as in phenomenal conservatism (Huemer, 2013). Adam Shriver made a similar point in conversation.

^{^}

This may be more illustrative than literal for me. Personally, it’s more that other people’s suffering seems directly and importantly bad to me, or indirectly and importantly bad through my emotional responses to their suffering.

^{^}

However, which kind of “seeming” or appearance should be used can depend on the theory of wellbeing, i.e. unpleasantness under hedonism, cognitive desires or motivational salience under desire theories and preferences under preference theories. I concede later that we may need to separate by these very broad accounts of welfare (and perhaps more finely) rather than treat them all as generating the same moral reasons.

^{^}

From conversation with multiple people, something like this seems to be the standard view.

^{^}

Our sympathetic responses to the suffering of another individual, whether chicken, human or otherwise, don’t necessarily reliably track how bad it is for them from their own perspective, but are probably closer for other humans, because of greater similarity between humans (neurological, functional, cognitive, psychological, behavioural).

^{^}

$E [X / Y] = \infty$ (or undefined) if $X > 0$ , $Y \geq 0$ , and $Y = 0$ with nonzero probability, because we get $X / 0$ with nonzero probability. $E [X / Y]$ is undefined if $X, Y \geq 0$ , and $X = Y = 0$ with nonzero probability, because we get $0 / 0$ with nonzero probability.

However, in principle, humans in general or each proposed type of wellbeing could not matter with nonzero probability, so we could get a similar problem normalizing by human welfare or moral weights.

^{^}

There may be some ways to address the issue.

You could treat the 0 moral weight like an infinitesimal and do arithmetic with it, but I think this entirely denies the possibility that chickens don’t matter at all. This seems ad hoc and to have little or no independent justification.

You could first take conditional expected values in the denominator (and numerator) that give a nonzero value, assuming Cromwell’s rule, before taking the ratio and the expected value of the ratio. In other words, you take the expected value of a ratio of conditional expected values of moral weights. Then, in effect, you’re treating the conditional expected value of chicken moral weight as equal across some views. Most naturally, you would take the conditional expected values over descriptive uncertainty, conditional on each fixed normative stance (so that the resulting prescriptions would agree with each normative stance), and then take the expected value of the ratio across these normative stances/theories (over normative uncertainty).

^{^}

If you had already measured all the liquid water directly and precisely, you wouldn’t expect any more or less liquid water from finding out ice is also water.

^{^}

I even doubt that there is any precise fact of the matter for the ratio of their intensities or moral disvalue.

^{^}

Approaches include Open Philanthropy’s worldview diversification approach (Karnofsky, 2018), variance voting (MacAskill et al., 2020, Ch4), moral parliaments (Newberry & Ord, 2021), a bargain-theoretic approach (Greaves & Cotton-Barratt, 2019), or the Property Rights Approach (Lloyd, 2022). For an overview of moral uncertainty, see MacAskill et al., 2020.

^{^}

With multiple values for a given $w_{R}$ , e.g. a distribution of values, we could get a distribution or set of expected moral weights for chickens and humans. To these, we could apply an approach to moral uncertainty that doesn’t depend on intertheoretic reason comparisons.

^{^}

Let $Q_{H} (q)$ and $Q_{C} (q)$ be the quantile functions of $H_{R}$ and $C_{R}$ , respectively. Then, for $p$ between 0 and 1,

$Q_{C} (1 - 0.5 p) = \inf \{ y \in \mathbb{R} \mid 1 - 0.5 p \leq 1 - P [C_{R} \geq y] \}$

$= \inf \{ x / 100 \mid x \in \mathbb{R}, 0.5 p \geq P [C_{R} \geq x / 100] \}$

$= \frac{1}{100} \inf \{ x \in \mathbb{R} \mid 0.5 p \geq P [C_{R} \geq x / 100] \}$

$\geq \frac{1}{100} \inf \{ x \in \mathbb{R} \mid p \geq P [H_{R} \geq x] \}$

$= \frac{1}{100} \inf \{ x \in \mathbb{R} \mid 1 - p \leq 1 - P [H_{R} \geq x] \}$

$\geq \frac{1}{100} Q_{H} (1 - p)$

Then,

$E [C_{R}] = \int_{0}^{1} Q_{C} (q) \, dq$

$= \int_{0}^{2} Q_{C} (1 - 0.5 p) \, 0.5 \, dp$

$\geq \int_{0}^{1} Q_{C} (1 - 0.5 p) \, 0.5 \, dp$

$\geq \int_{0}^{1} \frac{1}{100} Q_{H} (1 - p) \, 0.5 \, dp$

$= 0.005 \int_{0}^{1} Q_{H} (1 - p) \, dp$

$= 0.005 E [H_{R}]$

^{^}

$P [C_{R} > 0] \leq a$ and $P [C_{R} \geq b * x] \leq a * P [H_{R} \geq x]$ gives $E [C_{R}] \leq a b E [H_{R}]$ .
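This bound can be sanity-checked on a small discrete example. The sketch below, in Python with exact fractions, uses made-up distributions for $H_{R}$ and $C_{R}$ and made-up values of $a$ and $b$. It also assumes the tail condition is only required for $x > 0$, which the footnote does not state explicitly, and it checks that condition only on a finite grid that covers every jump of the two step functions.

```python
from fractions import Fraction as F

a, b = F(1, 10), F(1, 100)   # hypothetical constants

# Hypothetical distributions as {value: probability}.
h_dist = {F(1, 2): F(2, 10), F(1): F(5, 10), F(2): F(3, 10)}
# C_R is 0 with probability 1 - a; otherwise it is b times a draw that is
# stochastically no larger than H_R.
c_dist = {
    F(0): 1 - a,
    b * F(1, 2): a * F(4, 10),
    b * F(1): a * F(5, 10),
    b * F(2): a * F(1, 10),
}


def tail(dist, x):
    """P[X >= x] for a finite distribution."""
    return sum(p for v, p in dist.items() if v >= x)


def mean(dist):
    """E[X] for a finite distribution."""
    return sum(v * p for v, p in dist.items())


# Premises: P[C_R > 0] <= a, and P[C_R >= b x] <= a P[H_R >= x] for x > 0.
p_positive = sum(p for v, p in c_dist.items() if v > 0)
grid = [F(k, 8) for k in range(1, 25)]   # x = 1/8, 2/8, ..., 3
premises_hold = p_positive <= a and all(
    tail(c_dist, b * x) <= a * tail(h_dist, x) for x in grid
)

# Conclusion: E[C_R] <= a b E[H_R].
print(premises_hold, mean(c_dist), a * b * mean(h_dist))
```

In this example the premises hold, and $E [C_{R}] = 9 / 10000$ is below $a b E [H_{R}] = 3 / 2500$.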

^{^}

However, some major moral theories don’t weigh reasons by summation, aggregate at all or take expected values. The expected moral weights of chickens and humans may not be very relevant in those cases.

^{^}

Carey and Fry (1995) showed that pigs generalize the discrimination between non-anxiety states and drug-induced anxiety to non-anxiety and anxiety in general, in this case by pressing one lever repeatedly with anxiety, and alternating between two levers without anxiety (the levers gave food rewards, but only if they pressed them according to the condition). Many more such experiments were performed on rats, as discussed in Sánchez-Suárez, 2016, summarized in Table 2 on pages 63 and 64 and discussed further across chapter 3. Rats could discriminate between the injection of the anxiety-inducing drug PTZ and saline injection, including at subconvulsive doses. Various experiments with rats and PTZ have effectively ruled out convulsions as the discriminant, further supporting that it’s the anxiety itself that they’re discriminating, because they could discriminate PTZ from control without generalizing between PTZ and non-anxiogenic drugs, and with the discrimination blocked by anxiolytics and not nonanxiolytic anticonvulsants. Rats further generalized between various pairs of anxiety(-like) states, like those induced by PTZ, drug withdrawal, predator exposure, ethanol hangover, “jet lag”, defeat by a rival male, high doses of stimulants like bemegride and cocaine, and movement restraint.

However, Mason and Lavery (2022) caution:

But could such results merely reflect a “blindsight-like” guessing: a mere discrimination response that need not reflect underlying awareness? After all, as we have seen for S.P.U.D. subjects, decerebrated pigeons can use colored lights as DSs (128), and humans can use subliminal visual stimuli as DSs [e.g., (121)]. We think several refinements could reduce this risk.

^{^}

There are exponential (non-tight) upper bounds for the number of connected subgraphs of a graph, and hence connected neural subsystems of a brain (Pandey & Patra, 2021, Filmus, 2018). However, not every such connected subsystem would be conscious. Also, with bounded degree, i.e. a bounded number of connections/synapses per neuron in your set of brains under consideration, the number of connected subgraphs can be bounded above by a polynomial function of the number of neurons (Eppstein, 2013).

^{^}

For a defense of epistemic modesty, see Lewis, 2017.

Aumann's agreement theorem, which supports convergence in beliefs between ideally rational Bayesians with common priors about events of common knowledge, may not be enough for convergence here. This is because our conscious experiences are largely private and not common knowledge. Even if they aren’t inherently private, without significant advances in theory or technology that would resolve remaining factual disagreements or far more introspection and far more detailed introspective reports than are practical, they’ll remain largely private in practice.

Or, our priors could differ, based on our distinct conscious experiences, which we use as references to understand moral patienthood and often moral value in general.

^{^}

I’d only be inclined to weigh the actual or idealized intrinsic/terminal values of actual moral patients, not any possible or conceivable moral patients or perspectives. The latter also seems particularly ill-defined. How would we weigh possible or conceivable perspectives?

^{^}

The term ‘illusionism’ seems prone to cause misunderstanding, and multiple illusionists have taken issue with the term, including Graziano (2016, ungated), Humphrey (2016) and Veit and Browning (2023, preprint).

^{^}

See my previous piece discussing how desires and hedonic states may be understood as beliefs or appearances of normative reasons. Others have defended desire-as-belief, desire-as-perception and generally desire-as-guise or desire-as-appearance of normative reasons, the good or what one ought to do. See Schroeder, 2015, 1.3 for a short overview of different accounts of desire-as-guise of good, and Part I of Deonna (ed.) & Lauria (ed.), 2017 for more recent work on and discussion of such accounts and alternatives. See also Archer, 2016, Archer, 2020 for some critiques, and Milona & Schroeder, 2019 for support for desire-as-guise (or desire-as-appearance) of reasons. A literal interpretation of Roelofs (2022, ungated)’s “subjective reasons, reasons as they appear from its perspective” would be as desire-as-appearance of reasons.

^{^}

Riedener, 2019 writes, where IRCs is short for intertheoretic reason-comparisons:

So I’ll propose a version of this approach, on which ought-facts are grounded in epistemic norms. In other words, I’ll propose a form of constructivism about IRCs. If I’m right, IRCs are not facts out there that hold independently of facts about morally uncertain agents. They hold in virtue of being the result of an ideally reasonable deliberation, in terms of certain epistemic norms, about what you ought to do in light of your uncertainty.

So very roughly, these norms suggest that without any explanation, you shouldn’t assume that you’ve always systematically and radically misjudged the strength of your everyday paradigm reasons. And they imply that you should more readily assume you may have misjudged some reasons if you have an explanation for why and how you may have done so, or if these reasons are less mundane and pervasive. This seems intuitively plausible. But Simplicity, Conservatism and Coherence might be false, or not quite correct as I’ve stated them, or there might be other and more important norms besides them.²⁷ My aim is not to argue for these precise norms. I’m happy if it’s plausible that some such epistemic norms hold, and that they can constrain the IRCs or ought-judgements you can reasonably make. If that’s so, we can invoke a form of constructivism to ground IRCs. We can understand truth about IRCs as the outcome of ideally reasonable deliberation – in terms of principles like the above – about what you ought to do in light of your uncertainty. By comparison, consider the view that truth in first-order moral theory is simply the result of an ideal process of systematizing our pre-theoretical moral beliefs.²⁸ On this view, it’s not that there’s some independent Platonic realm of moral facts, and that norms like simplicity and coherence are best at guiding us towards it. Rather, the principles are first, and ‘truth’ is simply the outcome of the principles. We can invoke a similar kind of constructivism about IRCs. On this view, principles like Simplicity, Conservatism and Coherence are not justified in virtue of their guiding us towards an independent realm of ought-facts or IRCs. Rather, they help constitute this realm.
So this provides an answer to why some ought-facts or IRCs hold. It’s not because of mind-independent metaphysical facts about how theories compare, or how strong certain reasons would be if we had them. It’s simply because of facts about how to respond reasonably to moral evidence or have reasonable moral beliefs. Ultimately, we might say, it’s because of facts about us – about why we might have been wrong about morality, and by how much and in what way, and so on.

^{^}

Riedener (2019) writes:

According to TU, you have all the reasons that you have according to PAD – reasons to benefit existing others – but also some additional reasons beyond them. So on this interpretation, the least radical change in your credences and the most simple ultimate credence distribution will be such that your reasons to benefit existing people are the same on both theories. Unless you have some additional beliefs that could render other beliefs more coherent, this IRC will be most reasonable in light of the above principles.

^{^}

Rabinowicz and Österberg (1996) describe similar accounts as object versions of preference views, contrasting them with satisfaction versions, which are instead concerned with preference satisfaction per se. Also similar are actualist preference-affecting views (Bykvist, 2007) and conditional reasons (Frick, 2020).

^{^}

Or in the case of illusionism vs realism about phenomenal consciousness on one interpretation of illusionism, the comparisons are grounded based on such measures or consequences for both, i.e. the (real or hypothetical) dispositions for phenomenality/qualia beliefs, but what matters are the quasi-phenomenal properties that lead to these beliefs, which are either actually phenomenal under realism or not under illusionism. On another interpretation of illusionism, it’s the beliefs themselves that matter, not quasi-phenomenal properties in general. For more on the distinction, see Frankish, 2021.
