I was a bit worried about a possible methodological issue with the GiveWell measures of life satisfaction. I looked into the data, and while the issue doesn't completely undermine the result, having looked closely I am now moderately less convinced that the observed negative spillover effect is a real problem.
Some measures of "life evaluation" use a technique like the Cantril Ladder; this is often used as a measure of happiness or subjective well-being (e.g., in the World Happiness Report 2021). In the words of that report, the Cantril ladder question "asks respondents to think of a ladder, with the best possible life for them being a 10, and the worst possible life being a 0. They are then asked to rate their own current lives on that 0 to 10 scale".
When measuring the "negative spillover" effects discussed above in Section 6, the Cantril Ladder would be an inappropriate measure to use. That's because, when respondents think of the ladder and imagine the "best possible life" and the "worst possible life", many are likely to anchor onto exemplars that are salient and close at hand, such as the distribution of people in their local community.
Imagine two people in a community, Andrew and Bob. You give Andrew $100 and Bob $0, and then ask Bob to rate his own life on a 0 to 10 scale. Bob might think of Andrew, who just got $100 for nothing, and think of that as particularly good. His idea of how good life can get just got a little bit higher. (If you think this is silly, rather than imagining, I don't know, Elon Musk or some other fantastically privileged person, keep in mind that respondents probably have a few seconds to answer this questionnaire, and what is salient matters a great deal. For more discussion on the importance of salience to questionnaire respondents, consult the work of psychologist Norbert Schwarz at USC.) So Bob would place himself relatively lower on the ladder and give himself a lower life evaluation score. But it isn't clear this really reflects anything about Bob's day-to-day subjective experience of happiness, other than at the moments when he's asked to rate himself on Cantril ladders. That would depend on him actually making those social comparisons and feeling bad about them on a regular basis.
Fortunately, the GiveWell study did not use a Cantril Ladder. As explained by Haushofer, Reisinger, and Shapiro (2015), they used four well-being measures:
the “happiness” and “life satisfaction” questions from the World Values Survey; the total score on the Center for Epidemiologic Studies Depression scale (CESD) (Radloff 1977); and the total score on the Perceived Stress Scale (PSS) (Cohen, Kamarck, and Mermelstein 1983).
The happiness and life satisfaction questions from the World Values Survey can be viewed here and are:
(1) "Taking all things together, would you say you are very happy, rather happy, not very happy, or not at all happy?"

(2) "All things considered, how satisfied are you with your life as a whole these days? Using this card on which 1 means you are “completely dissatisfied” and 10 means you are “completely satisfied” where would you put your satisfaction with your life as a whole?"
Of the four well-being measures, the negative spillover result relates to the Life Satisfaction question specifically. It is possible that when answering (2) a respondent might use the same kind of social comparison processes that Bob used in the example above. But importantly, these comparisons weren't suggested to the respondent by the surveyor. If the respondent used social comparison, they did so unprompted, and it's not hard to imagine they do so often in their life in a way that genuinely affects their emotional state.
Nevertheless, I'd be even more convinced if an effect on the happiness question had also been observed. The fact that the study found a negative spillover on Life Satisfaction but not Happiness does suggest that social comparison may be occurring for Life Satisfaction in a way that doesn't affect the Happiness measure. If you think that Life Satisfaction tells us something beyond the Happiness measure about basic intrinsic hedonic utils, you should probably still be concerned about the negative spillover. If you think that Life Satisfaction matters only to the extent that it affects self-reported happiness, you should be cautious about interpreting the negative spillover result.
What the survey didn't do, because it's very expensive and hard, and requires respondents to at least be able to text on a cellphone, is measure basic momentary positive and negative affect through a method like Ecological Momentary Assessment. I'd be interested in a future study looking at that, to see (1) whether there is any effect of the GiveWell intervention on momentary positive and negative affect, and (2) whether there are negative spillover effects there.
I'm the same. I'm a "member" and even a "community leader" in the "EA movement", and happy to identify as such. But calling yourself an "Effective Altruist" is to call yourself an "altruist", at least in the ears of someone who isn't familiar with the movement. I think it will sound morally pretentious or self-aggrandizing. Generally, the label of "altruist" should be given to an individual by others, not claimed, if it should ever be applied to a specific individual at all, which actually seems a bit odd regardless of who bestows it.
The job in question, for those curious:
PostDoc Position on The Science of Well-Being at Yale
Dr. Laurie Santos in the Department of Psychology at Yale University is seeking a Postdoctoral Research Associate to start by June 1, 2021. The ideal candidate will have a PhD in Psychology, Cognitive Science, Behavioral Science, or a related field; research interests in positive psychology; a strong background in statistics and data science; and experience working with adolescents and adults in school settings. This is a one-year appointment with possibility of renewal for additional years based on mutual agreement and University approval.
The position is part of a broad grant-funded initiative launched in 2020 to develop and test instructional programming on the science of well-being for a number of different populations: high school students (especially those from rural and low-income schools), teachers, and parents. The position will involve developing research studies to evaluate the impact of these instructional resources, as well as the option to develop other research projects on the science of well-being more broadly. The successful candidate is expected to (1) lead the evaluation and research components of this initiative, scientifically assessing whether these resources improve participant mental health and overall flourishing; (2) consult with the course development team as an in-house subject matter expert; and (3) work closely with our partner institutions, including high schools, nonprofits, universities, and professional organizations.
Desired Skills and Qualifications:
Applicants should send an email to Laurie Santos at email@example.com with the subject line “Postgraduate Research Associate Application” and include the following items.
Yale University is an affirmative action/equal opportunity employer. We especially encourage women, members of minority groups, persons with disabilities, and covered veterans to apply.
Which distribution would you use? Why the particular weights you've chosen and not slightly different ones?
I think you just have to make your distribution uninformative enough that reasonable differences in the weights don't change your overall conclusion. If they do, then I would concede that your specific question really is subject to cluelessness. Otherwise, you can probably find an answer.
come up with a probability distribution for the fraction of heads over 1,000,000 flips.
Rather than thinking directly of an appropriate distribution for the 1,000,000 flips, I'd think of a distribution to model p itself. Then you can run simulations based on the distribution of p to calculate the distribution of the fraction of heads over the 1,000,000 flips. We have p ∈ (0.5, 1.0], and we need to select a distribution for p over that range.
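To make this concrete, here's a minimal sketch of what I mean (my own illustration, not from the original question; I start with a uniform prior on p over (0.5, 1.0], which is just one possible choice):

```python
# Simulate the fraction of heads in 1,000,000 flips, where the coin's
# bias p is itself drawn from a prior distribution over (0.5, 1.0].
import numpy as np

rng = np.random.default_rng(0)
n_flips = 1_000_000
n_sims = 10_000

# A uniform prior over (0.5, 1.0] as a starting point
p_draws = rng.uniform(0.5, 1.0, size=n_sims)

# For each draw of p, the number of heads is Binomial(n_flips, p)
heads = rng.binomial(n_flips, p_draws)
fractions = heads / n_flips  # distribution of the fraction of heads
```

With a uniform prior, the resulting distribution of the fraction of heads is centered around 0.75; swapping in a different prior for `p_draws` changes the answer, which is exactly the sensitivity at issue here.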
There is no one correct probability distribution for p because any probability is just an expression of our belief, so you may use whatever probability distribution genuinely reflects your prior belief. A uniform distribution is a reasonable start. Perhaps you really are clueless about p, in which case, yes, there's a certain amount of subjectivity about your choice. But prior beliefs are always inherently subjective, because they simply describe your belief about the state of the world as you know it now. The fact you might have to select a distribution, or set of distributions with some weighted average, is merely an expression of your uncertainty. This in itself, I think, doesn't stop you from trying to estimate the result.
I think this expresses within Bayesian terms the philosophical idea that we can only make moral choices based on information available at the time; one can't be held morally responsible for mistakes made on the basis of the information we didn't have.
Perhaps you disagree with me that a uniform distribution is the best choice. You reason thus: "we have some idea about the properties of coins in general. It's difficult to make a coin that is 100% biased towards heads, so that seems unlikely." So we could pick a distribution that better reflects your prior belief. A suitable choice might be Beta(2,2) truncated at 0.5, which gives the greatest likelihood to values of p just above 0.5, declining towards 1.0.
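That truncated prior is easy to sample from by rejection (again my own sketch; the parameters are illustrative):

```python
# Sample from Beta(2,2) truncated to (0.5, 1.0]: draw from Beta(2,2)
# and keep only the draws above 0.5.
import numpy as np

rng = np.random.default_rng(0)

def sample_truncated_beta(n, a=2.0, b=2.0, lower=0.5):
    out = np.empty(0)
    while out.size < n:
        draws = rng.beta(a, b, size=n)
        out = np.concatenate([out, draws[draws > lower]])
    return out[:n]

p_draws = sample_truncated_beta(10_000)

# The density is highest just above 0.5 and falls towards 1.0:
hist, _ = np.histogram(p_draws, bins=[0.5, 0.6, 0.7, 0.8, 0.9, 1.0])
```

The histogram counts confirm the shape described above: most mass sits just above 0.5, tailing off towards 1.0.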
Maybe you and I just can't agree after all: there is no consistent and reasonable prior choice we can both accept, and no compromise. And let's say we both run simulations using our own priors, find entirely different results, and can't agree on any suitable weighting between them. In that case, yes, I can see you have cluelessness. But I don't think it follows that, if we went through the same process for estimating the longtermist moral worth of malaria bednet distribution, we must have intractable complex cluelessness about that specific problem. I can admit that perhaps, right now, in our current belief state, we are genuinely clueless, but it seems there is work that can be done that might eliminate the cluelessness.
A good point.
There are things you can do to correct for this sort of thing: for instance, go one level more meta and estimate the probability of unforeseen consequences in general, or within the class of problems that your specific problem fits into.
We couldn't have predicted the Fukushima disaster, but perhaps we can predict related things with some degree of certainty: the average cost and death toll of earthquakes worldwide, for instance. In fact, this is a fairly well-explored space, since insurers have to understand the risk of earthquakes.
The ongoing pandemic is a harder example: the rarer the black swan, the more difficult it is to predict. But even then, prior to the 2020 pandemic, the WHO had estimated the amortized cost of pandemics at on the order of 1% of global GDP annually (averaged over years with and without pandemics), which seems like a reasonable approximation.
I don't know how much of a realistic solution that would be in practice.
Thanks! That was helpful, and my initial gut reaction is that I entirely agree :-)

Have you had an opportunity to see how Hilary Greaves might react to this line of thinking? If I had to hazard a guess, I imagine she'd be fairly sympathetic to the view you expressed.
There is an argument from intuition by Schoenfield (2012), which carries some force, that we can't use a probability function:
(1) It is permissible to be insensitive to mild evidential sweetening.
(2) If we are insensitive to mild evidential sweetening, our attitudes cannot be represented by a probability function.
(3) It is permissible to have attitudes that are not representable by a probability function. (1, 2)
You are a confused detective trying to figure out whether Smith or Jones committed the crime. You have an enormous body of evidence to evaluate. Here is some of it: You know that 68 out of the 103 eyewitnesses claim that Smith did it but Jones' footprints were found at the crime scene. Smith has an alibi, and Jones doesn't. But Jones has a clear record while Smith has committed crimes in the past. The gun that killed the victim belonged to Smith. But the lie detector, which is accurate 71% of the time, suggests that Jones did it. After you have gotten all of this evidence, you have no idea who committed the crime. You are no more confident that Jones committed the crime than that Smith committed the crime, nor are you more confident that Smith committed the crime than that Jones committed the crime.
Now imagine that, after considering all of this evidence, you learn a new fact: it turns out that there were actually 69 eyewitnesses (rather than 68) testifying that Smith did it. Does this make it the case that you should now be more confident in S than J? That, if you had to choose right now who to send to jail, it should be Smith? I think not.
In our case, you are insensitive to evidential sweetening with respect to S since you are no more confident in S than ~S (i.e. J), and no more confident in ~S (i.e. J) than S. The extra eyewitness supports S more than it supports ~S, and yet despite learning about the extra eyewitness, you are no more confident in S than you are in ~S (i.e. J).
Intuitively, this sounds right. And if you approached this problem trying to solve the crime on intuition alone, you might really have no idea. Reading the passage, it sounds mind-boggling.
On the other hand, if you applied some reasoning and study, you might be able to come up with some probability estimates. You could estimate P(Smith did it | an eyewitness says Smith did it), including a probability distribution over that probability itself, if you like. You could work out how to combine evidence from multiple witnesses, conditioning on eyewitness 1's report, then eyewitness 2's, and so on up to 68 and 69. You could estimate how independent the eyewitnesses are, and from that work out how to properly combine their evidence.
And it might turn out that you don't update as a result of the extra eyewitness, under some circumstances. Perhaps you know the eyewitnesses aren't independent; they're all card-carrying members of the "We hate Smith" club. In that case it simply turns out that the extra eyewitness is irrelevant to the problem; it doesn't qualify as evidence, so it doesn't mean you're insensitive to "mild evidential sweetening".
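Here's a toy version of the calculation, with made-up numbers (a per-witness reliability of 0.55 and a 50/50 prior are my assumptions, purely for illustration). Independent witness reports multiply the odds by a likelihood ratio; perfectly correlated witnesses collapse into a single report:

```python
# Combine eyewitness reports via likelihood ratios (log-odds form).
# Assumes each witness independently reports the true culprit with
# probability `reliability`.
import math

def posterior_smith(n_smith, n_jones, prior=0.5, reliability=0.55):
    # Each report for Smith adds log(r/(1-r)) to the log-odds that
    # Smith did it; each report for Jones subtracts it.
    lr = math.log(reliability / (1 - reliability))
    log_odds = math.log(prior / (1 - prior)) + (n_smith - n_jones) * lr
    return 1 / (1 + math.exp(-log_odds))

p_68 = posterior_smith(68, 35)  # 68 of 103 witnesses say Smith
p_69 = posterior_smith(69, 34)  # one more witness for Smith

# If the witnesses are perfectly correlated (the "We hate Smith" club,
# one shared source of testimony), treat all reports as a single one:
p_correlated = posterior_smith(1, 0)  # the 69th witness adds nothing
```

Under the independence assumption the 69th witness does move the posterior, so the "no update" intuition only survives if you believe the reports are correlated, which is exactly the point above.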
I think a lot of the problem here is that these authors are discussing what one could do when one sits down for the first time and tries to grapple with a problem. In those cases there's so many undefined features of the problem that it really does seem impossible and you really are clueless.
But that's not the same as saying that, with sufficient time, you can't put probability distributions to everything that's relevant and try to work out the joint probability.
Schoenfield, M. Chilling out on epistemic rationality. Philos Stud 158, 197–219 (2012).
> Hope this helps.
It does, thanks--at least, we're clarifying where the disagreements are.
If you think that choosing a set of probability functions was arbitrary, then having a meta-probability distribution over your probability distributions seems even more arbitrary, unless I'm missing something. It doesn't seem to me like the kind of situation where going meta helps: intuitively, if someone is very unsure about what prior to use in the first place, they should also probably be unsure about coming up with a second-order probability distribution over their set of priors.
All you need to do to come up with that meta-probability distribution is to have some information about the relative value of each item in your set of probability functions. If our conclusion for a particular dilemma turns on a disagreement between virtue ethics, utilitarian ethics, and deontological ethics, this is a difficult problem that people will disagree strongly on. But can you at least agree that each of these is, say, between 1% and 99% likely to be the correct moral theory? If so, you have a slightly informative prior, and there is a possibility you can make progress. If we really have completely no idea, then I agree, the situation really is entirely clueless. But I think with extended consideration, many reasonable people might be able to come to an agreement.
Upon immediately encountering the above problem, my brain is like the mug: just another object that does not have an expected value for the act of giving to Malaria Consortium. Nor is there any reason to think that an expected value must “really be there”, deep down, lurking in my subconscious.
I agree with this. If the question is, "can anyone, at any moment in time, give a sensible probability distribution for any question", then I agree the answer is "no".
But with some time, I think you can assign a sensible probability distribution to many difficult-to-estimate things that are not completely arbitrary nor completely uninformative. So, specifically, while I can't tell you right now about the expected long-run value for giving to Malaria Consortium, I think I might be able to spend a year or so understanding the relationship between giving to Malaria Consortium and long-run aggregate sentient happiness, and that might help me to come up with a reasonable estimate of the distribution of values.
We'd still be left with a case where, very counterintuitively, the actual act of saving lives is mostly only incidental to the real value of giving to Malaria Consortium, but it seems to me we can probably find a value estimate.
About this, Greaves (2016) says,
averting child deaths has longer-run effects on population size: both because the children in question will (statistically) themselves go on to have children, and because a reduction in the child mortality rate has systematic, although difficult to estimate, effects on the near-future fertility rate. Assuming for the sake of argument that the net effect of averting child deaths is to increase population size, the arguments concerning whether this is a positive, neutral or a negative thing are complex.
And I wholeheartedly agree, but it doesn't follow from the fact that you can't immediately form an opinion that you can't, with much research, make an informed estimate that is better than an entirely indeterminate or undefined value.
EDIT: I haven't heard Greaves' most recent podcast on the topic, so I'll check that out and see if I can make any progress there.
EDIT 2: I read the transcript to the podcast that you suggested, and I don't think it really changes my confidence that estimating a Bayesian joint probability distribution could get you past cluelessness.
So you can easily imagine that getting just a little bit of extra information would massively change your credences. And there, it might be that here’s why we feel so uncomfortable with making what feels like a high-stakes decision on the basis of really non-robust credences, is because what we really want to do is some third thing that wasn’t given to us on the menu of options. We want to do more thinking or more research first, and then decide the first-order question afterwards.
Hilary Greaves: So that’s a line of thought that was investigated by Amanda Askell in a piece that she wrote on cluelessness. I think that’s a pretty plausible hypothesis too. I do feel like it doesn’t really… It’s not really going to make the problem go away because it feels like for some of the subject matters we’re talking about, even given all the evidence gathering I could do in my lifetime, it’s patently obvious that the situation is not going to be resolved.
My reaction to that (beyond that I should read Askell's piece) is that I disagree with Greaves that a lifetime of research couldn't resolve the subject matter for something like giving to Malaria Consortium. I think it's quite possible one could make enough progress to arrive at an informative probability distribution. And perhaps it only says "across the probability distribution, there's a 52% likelihood that giving to x charity is good and a 48% probability that it's bad", but actually, if the expected value is pretty high, that's still a strong impetus to give to x charity.
I still reach the problem where we've arrived at a framework where our choices for short-term interventions are probably going to be dominated by their long-run effects, and that's extremely counterintuitive, but at least I have some indication.
Her choice to use multiple, independent probability functions itself seems arbitrary to me, although I've done more reading since posting the above and have started to understand why there is a predicament.
Instead of multiple independent probability functions, you could start with a set of probability distributions for each of the items you are uncertain about, and then calculate the joint probability distribution by combining all of those distributions. That'll give you a single probability density function on which you can base your decision.
If you start with a set of several probability functions, with each representing a set of beliefs, then calculating their joint probability would require sampling randomly from each function according to some distribution specifying how likely each function is. It can be done, with the proviso that you must have a probability distribution specifying the relative likelihood of the functions in your set.
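A sketch of what that sampling looks like, reusing the coin example from earlier in the thread (the two candidate priors and the 0.6/0.4 weights are my own illustrative choices):

```python
# Sample from a mixture of candidate priors over p: first pick which
# prior to use according to your credence in each, then draw p from it.
import numpy as np

rng = np.random.default_rng(0)

candidate_priors = [
    lambda n: rng.uniform(0.5, 1.0, size=n),           # flat over (0.5, 1.0)
    lambda n: 0.5 + 0.5 * rng.beta(2.0, 5.0, size=n),  # mass near 0.5
]
weights = np.array([0.6, 0.4])  # credence in each candidate prior

n_sims = 10_000
which = rng.choice(len(candidate_priors), size=n_sims, p=weights)
p_draws = np.empty(n_sims)
for i, prior in enumerate(candidate_priors):
    mask = which == i
    p_draws[mask] = prior(mask.sum())

# p_draws now samples the single mixture distribution you can base
# decisions on.
```

The proviso in the text shows up as the `weights` array: without some credence over the candidate priors, there's no way to collapse the set into one distribution.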
However, I do worry that the same problem arises in this approach in a different form. If you really do have no information about the probability of some event, then in Bayesian terms, your prior probability distribution is one that is completely uninformative. You might need to use an improper prior, and improper priors can be difficult to update on in some circumstances. I think these are a Bayesian, mathematical representation of what Greaves calls an "imprecise credence".
But I think the good news is that many times, your priors are not so imprecise that you can't assign some probability distribution, even if it is incredibly vague. So there may end up not being too many problems where we can't calculate expected long-term consequences for actions.
I do remain worried, with Greaves, that GiveWell's approach of assessing direct impact for each of its potential causes is woefully insufficient. Instead, we would need to calculate the very long-term impact of each cause, and because of the value of the long-term future, anything that affects the probability of existential risk, even by an infinitesimal amount, will dominate the expected value of our intervention.
And I worry that this sort of approach could end up being extremely counterintuitive. It might lead us to the conclusion that promoting fertility by any means necessary is positive, or equally likely, to the conclusion that controlling and reducing fertility by any means necessary is positive. These things could lead us to want to implement extremely coercive measures, like banning abortion or mandating abortion depending on what we want the population size to be. Individual autonomy seems to fade away because it just doesn't have comparable value. Individual autonomy could only be saved if we think it would lead to a safer and more stable society in the long run, and that's extremely unclear.
And I think I reach the same conclusion that I think Greaves has, that one of the most valuable things you can do right now is to estimate some of the various contingencies, in order to lower the uncertainty and imprecision on various probability estimates. That'll raise the expected value of your choice because it is much less likely to be the wrong one.
Thanks for your remarks. I'm looking forward to her full article being published, because I agree that, as it stands, she's been pretty vague. The full article might clear up some of the gaps here.
From what you and others have said, the most important gap seems to be "why we should not be consequentialists", which is much bigger than just EA! I think there is something compelling here; I might reconstruct her argument something like:
Probably consequentialists will reply that (3) is wrong; actually if you improve justice and equality but this doesn't improve long-term well-being, it's not actually good. I suppose I believe that, but I'm unsure about it.