Just as a side note, Harsanyi's result is not directly applicable to a formal setup involving subjective uncertainty, such as Savage's or the Jeffrey-Bolker framework underlying evidential and causal decision theory. There are results for the Savage setup too, though (e.g., https://www.jstor.org/stable/10.1086/421173), and Caspar Oesterheld and I are working on a similar result for the Jeffrey-Bolker framework. In this setup, to get useful results, the indifference axiom can only be applied to a restricted class of propositions on which everyone's beliefs agree.
I don't think Romeo even has to deny any of the assumptions. Harsanyi's result, derived from the three assumptions, is not enough to determine how to do intersubjective utility comparisons. It merely states that social welfare will be some linear combination of individual utilities. While this already greatly restricts the way in which utilities are aggregated, it does not specify which weights to use for this sum.
Moreover, arguing that weights should be equal based on the veil of ignorance, as I believe Harsanyi does, is not sufficient, since utility functions are only determined up to positive affine transformations, which include rescalings. (This point has been made in the literature as a criticism of preference utilitarianism, I believe.) So there seems to be no way to determine what equal weights should look like without settling on a way to normalize utility functions, e.g., by range normalization or variance normalization. I think the debate about intersubjective utility comparisons comes in at the point where you ask how to normalize utility functions.
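To illustrate the rescaling point, here is a minimal sketch with made-up numbers. The people, outcomes, and utility values are all hypothetical, and `argmax_sum` and `range_normalize` are just illustrative helper names. It shows that an "equal weights" sum can pick a different outcome after a positive affine transformation of one person's utility function, and that range normalization (one possible convention among several) removes that dependence.

```python
from fractions import Fraction

def argmax_sum(utilities):
    """Return the outcome maximizing the unweighted sum of utilities."""
    outcomes = next(iter(utilities.values())).keys()
    return max(outcomes, key=lambda o: sum(u[o] for u in utilities.values()))

def range_normalize(u):
    """Rescale a utility function so its minimum is 0 and its maximum is 1.

    Assumes the utility function is not constant (max > min).
    """
    lo, hi = min(u.values()), max(u.values())
    return {o: Fraction(v - lo, hi - lo) for o, v in u.items()}

# Hypothetical utilities over three outcomes.
alice = {"A": 10, "B": 0, "C": 7}
bob = {"A": 0, "B": 10, "C": 6}

# Same preferences for Bob, represented on a different scale: 3*u + 5.
bob_rescaled = {o: 3 * v + 5 for o, v in bob.items()}

choice_original = argmax_sum({"alice": alice, "bob": bob})
choice_rescaled = argmax_sum({"alice": alice, "bob": bob_rescaled})

# After range normalization, the representation no longer matters.
choice_norm_original = argmax_sum(
    {"alice": range_normalize(alice), "bob": range_normalize(bob)}
)
choice_norm_rescaled = argmax_sum(
    {"alice": range_normalize(alice), "bob": range_normalize(bob_rescaled)}
)

print(choice_original, choice_rescaled)            # differ
print(choice_norm_original, choice_norm_rescaled)  # agree
```

Variance normalization would be a different convention and can give a different answer; the sketch only shows that some convention has to be chosen before "equal weights" means anything.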
Of course, if you are not using a kind of preference utilitarianism but instead just aggregate some quantities you believe to have an absolute scale, such as happiness and suffering, then you could argue that utility functions should just correspond to this one absolute scale, with the same scaling for everyone. Though I think this is also not a trivial argument: there are potentially different ways to get from this absolute scale or axiology to behavior towards risky gambles, which in turn determines the utility functions.
And it turns out that the utilitarian approach of adding up utilities is *not* a bargaining solution, because it violates Pareto-optimality in some cases. Does that "disprove" total utilitarianism?
I'm not sure this is right. As soon as you maximize a weighted sum with non-negative coefficients (not all zero), your solution will be weakly Pareto optimal. As soon as all coefficients are strictly positive, it will be strongly Pareto optimal. The axioms mentioned above don't imply non-negative coefficients, so theoretically they are also satisfied by "anti-utilitarianism", which counts everyone's utility negatively. But one can add stronger Pareto axioms to force all coefficients to be strictly positive.
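As a sanity check on the Pareto claims, here is a small brute-force sketch over a finite set of outcomes. The outcomes and utility vectors are made up, and the function names are only illustrative. It checks the three cases: strictly positive weights, non-negative weights with a zero, and the "anti-utilitarian" negative weights.

```python
def maximize(outcomes, weights):
    """Return the outcome maximizing the weighted sum of utilities.

    Python's max returns the first maximal element, so ties go to the
    outcome listed first.
    """
    return max(
        outcomes,
        key=lambda o: sum(w * u for w, u in zip(weights, outcomes[o])),
    )

def weakly_pareto_optimal(outcomes, o):
    """No other outcome is strictly better for everyone."""
    return not any(
        all(a > b for a, b in zip(v, outcomes[o])) for v in outcomes.values()
    )

def strongly_pareto_optimal(outcomes, o):
    """No other outcome is at least as good for everyone and better for someone."""
    return not any(
        all(a >= b for a, b in zip(v, outcomes[o]))
        and any(a > b for a, b in zip(v, outcomes[o]))
        for v in outcomes.values()
    )

# Hypothetical utility vectors (person 1, person 2) for three outcomes.
outcomes = {"X": (1, 0), "Y": (1, 1), "Z": (0, 0)}

positive = maximize(outcomes, (1, 1))      # strictly positive weights
nonnegative = maximize(outcomes, (1, 0))   # one weight is zero
negative = maximize(outcomes, (-1, -1))    # "anti-utilitarianism"

print(positive, strongly_pareto_optimal(outcomes, positive))
print(nonnegative, weakly_pareto_optimal(outcomes, nonnegative),
      strongly_pareto_optimal(outcomes, nonnegative))
print(negative, weakly_pareto_optimal(outcomes, negative))
```

With a zero weight, the maximizer can land on an outcome that is only weakly Pareto optimal, because the ignored person's utility plays no role in breaking the tie.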
The problem with the utilitarian bargaining solution is that it is not independent of affine transformations of utility functions. Just summing up utility functions is underspecified; one also needs to choose a scaling for the utility functions. A second criterion that might not be satisfied by the utilitarian solution (depending on the scaling chosen) is individual rationality, which means that everyone will be better off under the bargaining solution than under some disagreement outcome.
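Here is a toy sketch of the individual rationality point. The two outcomes, the disagreement point, and the rescaling factor are all hypothetical. I also use the weak version of individual rationality (everyone at least as well off as under disagreement), which is an assumption on my part. The same bargaining problem passes or fails the check depending on how one agent's utility function is scaled.

```python
def utilitarian_solution(outcomes):
    """Pick the outcome with the largest unweighted sum of utilities."""
    return max(outcomes, key=lambda o: sum(outcomes[o]))

def individually_rational(outcomes, choice, disagreement):
    """Everyone is at least as well off as at the disagreement point."""
    return all(u >= d for u, d in zip(outcomes[choice], disagreement))

def rescale_agent(outcomes, disagreement, agent, factor):
    """Multiply one agent's utilities (and disagreement payoff) by factor > 0."""
    scaled = {
        o: tuple(u * factor if i == agent else u for i, u in enumerate(v))
        for o, v in outcomes.items()
    }
    scaled_d = tuple(
        d * factor if i == agent else d for i, d in enumerate(disagreement)
    )
    return scaled, scaled_d

# Hypothetical bargaining problem with two agents.
outcomes = {"P": (10, 1), "Q": (4, 4)}
disagreement = (3, 3)

choice = utilitarian_solution(outcomes)
ok = individually_rational(outcomes, choice, disagreement)

# Same preferences for agent 1, but utilities multiplied by 3.
scaled, scaled_d = rescale_agent(outcomes, disagreement, agent=1, factor=3)
choice_scaled = utilitarian_solution(scaled)
ok_scaled = individually_rational(scaled, choice_scaled, scaled_d)

print(choice, ok)                # second agent ends up below disagreement
print(choice_scaled, ok_scaled)  # after rescaling, both are above it
```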
Your argument seems to combine SSA-style anthropic reasoning with CDT. I believe this is a questionable combination, as it gives different answers from an ex-ante rational policy or from updateless decision theory (see e.g. https://www.umsu.de/papers/driver-2011.pdf). The combination is probably also Dutch-bookable.
Consider the different hingeynesses of times as the different possible worlds and your different real or simulated versions as your possible locations in that world. Say both worlds are equally likely a priori and there is one real version of you in both worlds, but the hingiest one also has 1000 subjectively indistinguishable simulations (which don't have an impact). Then SSA tells you that you are much less likely to be a real person in the hingiest time than to be a real person in the 20th hingiest time. Using these probabilities to calculate your CDT-EV, you conclude that the effects of your actions on the 20th hingiest time dominate.
Alternatively, you could combine CDT with SIA. Under SIA, being a real person in either time is equally likely. Or you could combine the SSA probabilities with EDT. EDT would recommend acting as if you were controlling all simulations and the real person at once, no matter whether you are in the simulation or not. In either case, you would conclude that you should do what is best for the hingiest time (given that they are equally likely a priori).
Unlike the SSA+CDT approach, either of these latter approaches would (in this case) yield the actions recommended by someone coordinating everyone's actions ex ante.
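To make the numbers explicit, here is a minimal sketch of the two-world example (equal priors, one real person in each world, 1000 impactless simulations in the hingiest one). The two actions and their impact values are made up for illustration, as are the function names. The SSA+EDT case is not coded separately: under the assumption that your choice is perfectly correlated with all your copies, its calculation coincides with the ex-ante one.

```python
from fractions import Fraction

PRIOR = {"hingiest": Fraction(1, 2), "20th": Fraction(1, 2)}
REAL = {"hingiest": 1, "20th": 1}     # real copies of you per world
SIMS = {"hingiest": 1000, "20th": 0}  # simulated copies, no impact

# Hypothetical impact of each action if taken by the real person in a world.
IMPACT = {
    "focus_hingiest": {"hingiest": 100, "20th": 0},
    "focus_20th": {"hingiest": 0, "20th": 10},
}

def p_real_ssa(world):
    """SSA: keep the prior over worlds, then pick a random copy within the world."""
    return PRIOR[world] * Fraction(REAL[world], REAL[world] + SIMS[world])

def p_real_sia(world):
    """SIA: weight each world by its number of copies, then pick a random copy."""
    total = sum(PRIOR[w] * (REAL[w] + SIMS[w]) for w in PRIOR)
    return PRIOR[world] * REAL[world] / total

def cdt_ev(action, p_real):
    """CDT: only count the impact you cause yourself, i.e. if you are real."""
    return sum(p_real(w) * IMPACT[action][w] for w in PRIOR)

def ex_ante_ev(action):
    """Value of the policy 'every copy takes this action', judged ex ante."""
    return sum(PRIOR[w] * IMPACT[action][w] for w in PRIOR)

best_ssa_cdt = max(IMPACT, key=lambda a: cdt_ev(a, p_real_ssa))
best_sia_cdt = max(IMPACT, key=lambda a: cdt_ev(a, p_real_sia))
best_ex_ante = max(IMPACT, key=ex_ante_ev)

print(p_real_ssa("hingiest"), p_real_ssa("20th"))  # 1/2002 vs 1/2
print(p_real_sia("hingiest"), p_real_sia("20th"))  # 1/1002 each
print(best_ssa_cdt, best_sia_cdt, best_ex_ante)
```

With these numbers, SSA+CDT favors the 20th hingiest time even though the stakes there are ten times smaller, while SIA+CDT and the ex-ante policy both favor the hingiest time.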
EV stands for Expected Value. (I think I actually meant Expected Utility, to be more precise.)
Thanks a lot for this article!
I just wanted to link to Lukas Gloor's new paper on Fail-Safe AI, which discusses the reduction of "quality future-risks" in the context of AI safety. It turns out that there might be interventions that are less directed at achieving a perfect outcome, but instead try to avoid the worst outcomes. And those interventions might be more tractable (because they don't aim at such a tiny spot in value-space) and more neglected than other work on the control problem.
(Edit: I no longer endorse negative utilitarianism or suffering-focused ethics.)
Thank you! Cross-posting my reply as well:
If we adopt more of a preference-utilitarian view, we end up producing contradictory conclusions in the same scenarios that I discussed in my original essay—you can't claim that AMF saves 35 DALYs without knowing AMF's population effects.
Shouldn't this be fixed by negative preference utilitarianism? There could be value in not violating the "preference-equivalent" of dying one year earlier, but no value in creating additional "life-year" preferences. A YLL would be equivalent to a violated life-preference, then. You could avert YLLs by not having children, of course, which seems plausible to me (if no one is born, whose preference is violated by dying from malaria?). Being born and dying from malaria would be worse than non-existence, so referring to your "Bigger Problem"-scenarios, A < B < C and C = D.
Regarding EV: I agree, there has to be one ranking mapping world-states onto real numbers (or R^n if you drop the continuity axiom). So you're right in the sense that the supposed GiveWell ranking of world-states that you assume doesn't work out. I still think that there might be a way to make a creative mapping in the real world so that the GiveWell focus on DALYs without regard to population size can be somehow translated into a utility function. Anyway, I would kind of agree that AMF turns out to be less effective than previously thought, both from an SFE and a classical view :)
(Edit: I no longer endorse suffering-focused ethics.)
Regardless of your stance on population ethics, I think in general it makes sense to take DALYs as a heuristic for how much good you can do with your money. Clearly all population ethical views consider improving existing lives in quality (decreasing YLDs, years lived with disability) a good thing. Preventing deaths expressed through reducing YLLs (Years of Life Lost) is probably overall good as well, although different views will assign more or less value to it. I agree with Michael Dickens that if the value of longer lives comes from adding life-years (reducing YLL) alone, this would indeed amount to something like total utilitarianism.
I think a steelman of GiveWell's view would be that in fact the YLL component of DALYs can be motivated by some other things, like preference dissatisfaction or decreasing the suffering of the parents of children as well. I believe that for reasons of cooperation between agents it always makes sense to consider the preferences of other beings at least to some degree. Fulfilling already existing preferences seems like something most people would agree to, whether they would also like to bring additional fulfilled preferences into existence or not. Therefore, death is intrinsically bad according to most reasonable views, since it severely violates the preferences of existing beings. In that sense, decreasing YLLs should always be good, even for non-classical utilitarians.
Unlike Michael, I personally would be less reluctant to accept a ranking of world states that can’t be boiled down to an easy mathematical function of the aggregated wellbeing, i.e. I’d be less turned off by more “complex” moral views. And I’d be less willing to bite bullets like the repugnant conclusion, or the “very repugnant conclusion,” where a world with fewer but very happy individuals can be worse than a world containing any finite amount of extreme torture that is outweighed by an even greater number of beings that live lives just barely worth living. Accepting this conclusion is quite a controversial stance in my eyes. Given anti-realism, it is absolutely unclear to me why GiveWell would have to adhere to a total utilitarian view. They could very well accept all the inconsistencies Michael mentions and still just maximize EV according to their own (complex) values. I agree that they should probably specify their view more explicitly and it remains unclear what they are really optimizing for (see also http://blog.givewell.org/2008/08/22/dalys-and-disagreement/).
A candidate I am favouring that could possibly match a lot of people's intuitions would be something like negative idealized preference utilitarianism or more generally any form of suffering-focused ethics (e.g. trying to reduce extreme involuntary suffering without doing anything crazy or anything that would be considered really bad by other agents).
(cross-posted here: https://www.facebook.com/groups/effective.altruists/permalink/1071588459564177/)