
I’m annoyed by the way some Effective Altruists use the expression “value drift”. I will avoid giving examples so as not to "point fingers"; it's not personal or about anyone specific. :)

Note that I do overall appreciate the research that's been done on member dropout! My goals here are mostly to a) critique the way "value drift" has been used, and at the same time b) suggest other avenues to study value drift. I hope this doesn't come off in a bad way; I know it can be easy to critique and make research suggestions without doing it myself.

Here’s the way I’ve seen EAs use the term "value drift", and why it bothers me.

1. A focus on one value category

They mean something much narrower, which is “someone becoming less of an effective altruist”. They never mention drifts from some altruistic values to other altruistic values, or from some non-altruistic values to other non-altruistic values (e.g., going from a positive hedonist to a negative hedonist). And they especially never frame becoming more altruistic as a value drift, even though I have the impression a bunch of Effective Altruists approach outreach like that (but not everyone; e.g., Against moral advocacy). I feel framing 'value' in such a narrow, idiosyncratic way misses perspective.

2. A focus on the community, not the values

I think they often intrinsically care about “quitting the community of Effective Altruism” as opposed to becoming less of an effective altruist. Maybe this is fine, but then don't call it a value drift.

a) Checking when it's desirable

I would be interested in seeing when quitting the community is desirable, even without a value drift. Some examples might include:

  • if you're focusing on a specific intervention with its own community,
  • if you're not interested in the community (which doesn't mean you can't do effective altruism) and just want to donate to some prizes,
  • if you're not altruistic and were just in the community to hang out with cool people,
  • if you have personal issues that need to be prioritized.

b) Checking when it's not actually value drift

A lot of the data are naturally just proxies or correlational, and they're likely good enough to be used the way they are. But I would be curious to know how much value-drift-like behavior is not actually caused by a value drift. For example:

  • someone can stop donating because they took a pay cut to do direct work,
  • someone can stop donating because they had planned to give more early in their career, since they thought there would be more opportunities early on.

Someone can also better introspect on their values and change their behavior accordingly, even though their fundamental values haven't changed.

Toon Alfrink argues that most member dropouts aren't caused by value drift, but rather incentive drift; see zir argument in Against value drift. That seems overall plausible to me, except that I still put higher credence on intrinsic altruism in humans.

3. A focus on involvement in the community, not on other interventions

The main intervention they care about to reduce drifting towards non-altruism is increasing involvement with the Effective Altruism community. I’ve not seen any justification for focusing (exclusively) on that. Other areas that seem to me worth exploring include:

  1. Age-related changes, such as early brain development, puberty, senescence, and menopause
  2. Change in marital status, level of physical intimacy, pregnancy, and raising children
  3. Change in abundance / poverty / slack
  4. Change in level of political / social power
  5. Change in general societal trust, faith in humanity, altruistic opportunities
  6. Exposure to contaminants
  7. Consumption of recreational drugs, medicinal drugs, or hormones
  8. Change in nutrition
  9. Brain injuries

I haven’t really done any research on this. But 1, 2, 3, 4, and 5 seem to have plausible evolutionary mechanisms to support them. I think I’ve seen some research providing evidence on 1, 2, 3, 4, 5, and 7 (and some evidence against 7 as well). 9 seems like a straightforwardly plausible mechanism. And 6 and 8 seem to often have an impact on that sort of thing.

I do think community involvement also has a plausible evolutionary mechanism, as well as a psychological one. And there are reasons why some of the areas I proposed are less interesting to study: some have a Chesterton's fence around them, some seem to already be taken care of, some seem hard to influence, etc.

Additional motivation to care about value drift

I'm also interested in this from a survival perspective. I identify with my values, so having them changed would correspond to microdeaths. To the extent you disvalue death (yours and others'), you might also care about value drift for that reason. But that's a different angle: seeing value drift as an inherent moral harm instead of an instrumental one.

Summary of the proposal

If you're interested in studying member retention / member dropout, that's fine, but I would rather you call it just that.

Consider also checking when a member dropout is fine, and when a behavior that looks like a dropout isn't.

It also seems to me that it would be valuable to research the drifting of other values, as well as other types of interventions to reduce value drift, although part of that interest comes from disvaluing value drift intrinsically.

Notification of further research

I'm aware of two people who have written on other interventions. If/once they publish it, I will likely post it in the comment section. I might also post other related pieces in the comment section as I see them. So if you're interested, you can subscribe to the comment section (below the title: "..." > "Subscribe to comments").

Related work

Search the forum for "value drift" for more.

Comments

I think that what is causing some confusion here is that "value drift" is (probably?) a loanword from AI-alignment which (I assume?) originally referred to very fundamental changes in goals that would unintentionally occur within iterative versions of self improving intelligences, which...isn't really something that humans do. The EA community borrowed this sort of scary alien term and is using it to describe a normal human thing that most people would ordinarily just call "changing priorities".

A common sense way to say this is that you might start out with great intentions, your priorities end up changing, and then your best intentions never come to life. It's not that different from when you meant to go to the gym every morning...but then a phone call came, and then you had to go to work, and now you are tired and sitting on the couch watching television instead.

Logistically, it might make sense to do the phone call now and the gym later. The question is: "Will you actually go to the gym later?" If your plan involves going later, are you actually going to go? And if not, maybe you should reschedule this call and just go to the gym now. I don't see it as a micro death that you were hoping to go to the gym but did not; it's that over the day other priorities took precedence and then you became too tired. You're still the same person who wanted to go... you just ...didn't go. Being the person who goes to the gym requires building a habit and reinforcing the commitment, so if you want to go then you should keep track of which behaviors cause you to actually go and which behaviors break the habit and lead to not going.

Similarly you should track "did you actually help others? And if your plan involves waiting for a decade ...are you actually going to do it then? Or is life going to have other plans?" That's why the research on this does (and ought to) focus on things like "are donations happening", "is direct work getting done" and so on. Because that's what is practically important if your goal is to help others. You might argue for yourself "it's really ok, I really will help others later in life" or you might argue "what if I care about some stuff more than helping others" and so on, but I think someone who is in the position of attempting to effectively help others in part through the work of other people (whether through donations or career or otherwise) over the course of decades should to some degree consider what usually happens to people's priorities in aggregate when modeling courses of action.

I think another term would better fit your description. Maybe "executive failure".

I don't see it as a micro death

Me neither. Nor do I see it as a value drift though.

I think another term would better fit your description. Maybe "executive failure".

It seems to me that there's a bunch of terms/concepts which all seem like they fit/are relevant to some extent here, including (off the top of my head):

  • value drift
  • changing priorities
  • executive failure (I've never heard this, and don't think there's a need to introduce the term if it doesn't exist yet, but I can see why it'd work)
  • failure of willpower
  • failure to follow through
  • akrasia
  • procrastination
  • hyperbolic discounting
    • How relevant this is depends on the specifics

Which term is most appropriate in a given case will probably depend on the specifics and on the author/speaker's aims. And it would probably be good for people to more often think a little more carefully about which term they use, and to more often briefly explain what they mean by the term.

The main intervention they care about to reduce drifting towards non-altruism is increasing involvement with the Effective Altruism community. I’ve not seen any justification for focusing (exclusively) on that.

The first point I'd make here is that I'd guess people writing on these topics aren't necessarily primarily focused on a single dimension from more to less altruism. Instead, I'd guess they see the package of values and ideas associated with EA as being particularly useful and particularly likely to increase how much positive impact someone has. To illustrate, I'd rather have someone donating 5% of their income to any of the EA Funds than donating 10% of their income to a guide dog charity, even if the latter person may be "more altruistic". 

I don't think EA is the only package of values and ideas that can cause someone to be quite impactful, but it does seem an unusually good and reliable package. 

Relatedly, I care a lot less about drifts away from altruism among people who aren't likely to be very effectively altruistic anyway than among people who are. 

So if I'm more concerned about drift away from an EA-ish package of values than just away from altruism, and if I'm more concerned about drift away from altruism among relatively effectiveness-minded people than less effectiveness-minded people, I think it makes sense to pay a lot of attention to levels of involvement in the EA community. (See also.) Of course, someone can keep the values and behaviours without being involved in the community, as you note, but being in the community likely helps a lot of people retain and effectively act on those values.

And then, as discussed in Todd's post, we do have evidence that levels of engagement with the EA community reduce drop-out rates. (Which is not identical to value drift, and that's worth noting, but it still seems important and correlated.)

Finally, I think most of the other 9 areas you mention seem like they already receive substantial non-EA attention and would be hard to cost-effectively influence. That said, this doesn't mean EAs shouldn't think about such things at all. 

The post Reducing long-term risks from malevolent actors is arguably one example of EAs considering efforts that would have that sort of scope and difficulty and that would potentially, in effect, increase altruism (though that's not the primary focus/framing). And I'm currently doing some related research myself. But it does seem like things in this area will be less tractable and neglected than many things EAs think about.

Thanks!

I agree with your clarifications.

levels of engagement with the EA community reduce drop-out rates

"drop-out" meaning 0 engagement, right? so the claim has the form of "the more you do X, the less likely you are of stopping doing X completely". it's not clear to me to which extent it's causal, but yeah, still seems useful info!

I think most of the other 9 areas you mention seem like they already receive substantial non-EA attention

oh, that's plausible!

The post Reducing long-term risks from malevolent actors is arguably one example of EAs considering efforts that would have that sort of scope and difficulty and that would potentially, in effect, increase altruism

Good point! In my post, I was mostly thinking at the individual level. Looking at a population level and on a longer term horizon, I should probably add other possible interventions such as:

  • Incentives to have children (political, economical, social)
  • Immigration policies
  • Economic system
  • Genetic engineering
  • Dating dynamics
  • Cultural evolution

"drop-out" meaning 0 engagement, right? so the claim has the form of "the more you do X, the less likely you are of stopping doing X completely". it's not clear to me to which extent it's causal, but yeah, still seems useful info!

I can see why "we do have evidence that levels of engagement with the EA community reduce drop-out rates" might sound like a somewhat empty/tautological sentence. (Then there's also the question of causality, which I'll get to at the end.) But I think it's meaningful when you consider Todd's definitions (which I perhaps should've quoted before). 

He defines the drop out rate as "the rate at which people both (i) stop engaging with the effective altruism community and (ii) stop working on paths commonly considered high-impact within the community" (emphasis added). 

And I don't think he precisely defines engagement, but he writes: 

My guess is that the most significant factor [in drop-out rates] is someone’s degree of social integration—i.e. I expect that people with friends or colleagues who are into EA are less likely to drop out of the community.

Relatedly, I think the degree to which someone identifies with EA will be important. For instance, someone who has been featured in the media as being into EA seems much less likely to drop out. We could think of both of these as aspects of ‘engagement’.

So I think the claim is something like "more social integration into EA and identification as an EA at time 1 predicts a higher chance of staying engaged with EA and still pursuing paths commonly considered high-impact at time 2". 

(I'd encourage people to read Todd's post for more details; these are just my quick comments.)

Then there is of course still the question of causality: Is this because engagement reduces drop out, or because some other factor (e.g., being the sort of person who EA really fits with) both increases engagement and reduces drop out? My guess is that both are true to a significant extent, but I'm not sure if we have any data on that.

I see, thanks!

I think you make some good points. This post, plus Ben Todd's recent post, has updated me towards thinking that:

  • we should often use the term "drop out rate" instead
  • when using the term "value drift", we should ensure we've made it at least decently clear what we mean
    • Perhaps by saying things like "value drift away from EA", or "negative value drift", or just having a paragraph/footnote outlining what we are vs aren't counting

That said, I do think most writing I've seen on value drift has explicitly stated that one could separately ask how to precisely define this phenomenon and how good vs bad it is, but that the post at hand will instead just provide empirical info on value drift. (The main posts that come to mind which I think did that are Todd's post and this post.) And that seems reasonable to me.

I'll provide some other minor pushbacks in separate comments. Most don't detract much from the key thrust of what you're saying.

Thanks.

I think "negative value drift" is still too idiosyncratic; it doesn't say negative for whom. For the value holder, any value drift generally has negative consequences.

I (also) think it's a step in the right direction to explicitly state that a post isn't trying to define value drift, but just provide empirical info. Hopefully my post will have provided that definition, and people will now be able to build on this.

I think "negative value drift" is still too idiosyncratic; it doesn't say negative for whom. 

I feel comfortable saying things like "positive impact", "negative impact", "improve the world", or simply "good" and "bad" without specifying which value system that's in relation to. (And people often speak in that way outside of EA as well.) I think that, when one doesn't specify that, it's safe to assume one means something like "from my moral perspective", or "from the perspective of what I'd guess my morals would be if I knew more and reflected further", or "from the perspective of my best guess at what's morally true".

One could argue that we should always specify what perspective we're defining positive/negative/good/bad in relation to. But I think that would slow us down unnecessarily compared to just sometimes specifying that. And I don't see a strong reason to do that for value drift if we don't do it for other things.

It's true that value drift will tend to be negative from the naive perspective of the values that had previously been held, and positive from the naive perspective of the values that are now held. (I say "naive perspective" because the value drift could be in a direction that the prior value system would've endorsed on reflection, or something like that.) But I don't think that prevents us from having our own views on whether the value drift was good or bad. Analogously, I think I can have views on whether a change in what someone donates to is good or bad, regardless of what that person's aims for their donations are.

This seems reasonable to me. I do use the shortcut myself in various contexts. But I think using it on someone when you know it's because they have different values is rude.

I use value drift to refer to fundamental values. If your surface-level values change because you introspected more, I wouldn't call it a drift. Drift has a connotation of not being in control. Maybe I would rather call it value enlightenment.

Drift has a connotation of not being in control.

I agree with this (though I think the connotations might be pretty weak). I think "value drift" also has weakly negative connotations even aside from that.

I think that, if we want to introduce a new term to avoid negative connotations or connotations of not being in control, "value shift" or "value change" would be better than "value enlightenment". "Value enlightenment" would have strong positive connotations and connotations of having found the truth, and it's also less immediately obvious what it refers to. It seems to me that it's obvious what "value shift" or "value change" mean, and that those terms are consistent with the change being positive, negative, neutral, or of unknown value.

yeah, 'shift' or 'change' work better as neutral terms. other suggestion: 'change in revealed preferences'

Someone can also better introspect on their values and change their behavior accordingly, even though their fundamental values haven't changed.

I'd say that this would indeed not be fundamental value drift, but it seems ok to call it value drift. This was a change in what people think their values are, and in what values effectively influence their behaviours, and that's often what I care about in practice.

Other thoughts:

  • It seems epistemically dangerous to discourage such value enlightenment, as it might prevent us from becoming more enlightened ourselves.
  • It seems pretty adversarial to manipulate people into not becoming more value enlightened, and allowing this at a norm level seems net negative from most people's point of view.
  • But maybe people want to act more altruistically and trustingly in a society where others also espouse those values. In that case, surface-level values could change in a good way for almost everyone without any fundamental value drift. That's also a useful phenomenon to study, so it's probably fine to also call this 'value drift'.

I'm a bit confused by why you made your first two points in response to my comment. Did you perceive me to be endorsing discouraging reflection on values, or endorsing "manipulating people" into not reflecting on their values and shifting their surface-level values in light of that? 

I didn't aim to endorse those things with my comment; merely to point out that it seems reasonable to me to call something a shift in values even if it's not a shift in "fundamental values". 

(I also don't think there's a sharp distinction between fundamental values and surface-level values. But I think a fuzzy distinction like that can be useful, as can a distinction between moral advocacy focusing on encouraging reflection vs pushing people in one direction, and a distinction between central and peripheral routes for persuasion.

That said, I also think the word "manipulating" is probably not useful here; it's very charged with connotations, so I'd prefer to talk about the actual behaviours in question, which may or may not actually be objectionable.)

Ok yeah, my explanations didn't make the connection clear. I'll elaborate.

I have the impression "drift" has the connotation of uncontrolled, and therefore undesirable, change. It has a negative connotation. People don't want to value drift. If you call a rational surface-value update "value drift", it could confuse people and make them less prone to making those updates.

If you use 'value drift' only to refer to EA-value drift, it also sneaks in the implication that other value changes are not "drifts". Language shapes our thoughts, so this usage could modify one's model of the world in such a way that they become more EA than they actually value.

I should have been more careful about attributing certain intentions to you in my previous comment, though. But I think some EAs do have this intention. And I think using the word that way has this consequence whether or not that's the intent.

They mean something much narrower, which is “someone becoming less of an effective altruist”. They never mention drifts from some altruistic values to other altruistic values, or from some non-altruistic values to other non-altruistic values (e.g., going from a positive hedonist to a negative hedonist). And they especially never frame becoming more altruistic as a value drift, even though I have the impression a bunch of Effective Altruists approach outreach like that (but not everyone; e.g., Against moral advocacy). I feel framing 'value' in such a narrow, idiosyncratic way misses perspective.

I think these are interesting points, with some merit. But I also think a key point to note is that these conversations/writings are usually about value drift among EAs. Given that context, it seems to me understandable that the focus is on shifts from more to less altruism, rather than the inverse or shifts of the non-altruistic values. (That said, it is of course true that "even EAs" will have non-altruistic values that could shift.)

I also think "value drift" as typically used does include shifting from some altruistic values to other altruistic values. For example, I think if someone had been considering high-impact, EA-aligned careers but then switched to allocating most of their energies to things like working at a homeless shelter, this would typically be seen as value drift; the person is still altruistic, but no longer as effective. (Two caveats are required here: First, it's conceivable that for some people the highest impact option would be working at a homeless shelter. Second, a person can of course still "count as EA" even if they spend some of their energies on "common-sense" do-gooding.)

If they have the same values, but just became worse at fulfilling them, then it's more something like "epistemic drift"; although I would probably discourage using that term.

On the other end, if they started caring more about homeless people intrinsically for some reason, then it would be a value drift. But they wouldn't be "less effective", they would, presumably, be as effective, but just at a different goal.

I agree that someone can become less effective at acting on a set of values without changing what their values are. The inverse can also occur; e.g. training, practice, or reading can help one be more effective at achieving their values. (See also.)

(My comment wasn't in tension with that idea; I just didn't bring that up as I didn't see that as part of the point you were making in the paragraph I quoted.)

But I think "value drift" away from EA is probably relatively rarely that someone has the exact same values (including "surface-level values") but "forgets" how to act on them, or develops mistaken beliefs about how to act on them. I think it's more often either that their "fundamental" values shift in a substantial way, or something like they just stop caring as much about EA cause areas or approaches. The latter doesn't require that they no longer think what EAs are doing is valuable; they might just feel less engaged or motivated by it. That still seems to me like something we can call "value drift".

We could also perhaps call it "motivation drift", "revealed preference drift", or something else like that. But "value drift" seems adequate to me, for those cases.

On the other end, if they started caring more about homeless people intrinsically for some reason, then it would be a value drift. But they wouldn't be "less effective", they would, presumably, be as effective, but just at a different goal.

I agree that they can be described as effectively pursuing another goal (if indeed they're doing that effectively). But I don't think that prevents us from saying they're "less effective", as a shorthand for "they're less effectively achieving good in the world". And this in turn can be from our own perspective. As I mentioned in another comment, I think people speak in that sort of way very often (e.g., when saying "positive impact"), and that it's easy enough to understand that that's what people mean.

I've spent a lot of time thinking about this, and I largely agree with you. I also think studying "pure" value drift (as opposed to "symptoms" of value drift, which is what a lot of the research in this area focuses on, including, to some extent, my own) comes with a few challenges. (Epistemic status: Pretty uncertain and writing this in haste. Feel free to tell me why I'm wrong.)

  • EA isn't (supposed to be) dogmatic, and hence doesn't have clearly defined values. We're "effective" and we're "altruistic," and those are more or less the only requirements for being EA. But what is altruism? Is it altruistic to invest in yourself so you can have more of an impact later on in life? Effectiveness, on the surface, seems more objective, since it mostly means relying on high-quality evidence and reasoning. But evidence and reason can get messy and biased, which can make defining effectiveness more difficult. For example, even if valuing effectiveness leads you to choose to believe in interventions that have the most robust evidence, it's possible that that robust evidence might come from p-hacking, publication bias, or studies with an over-representation of middle-class people from high-income countries. At some point effectiveness (from the EA point of view) also hinges on valuing certainty vs. risk-taking, and probably a number of other sub-values as well.
  • Measuring raw values relies primarily on self-reporting, which is a notoriously unreliable social science method. People often say they value one thing and then act in a contradictory manner. Sometimes it's a signaling thing, but sometimes we just don't really understand ourselves that well. Classic example: a young college student says they don't care much about financial stability, until they actually enter the workforce, get a not-super-well-paid job, and realize that maybe they actually do care. I think this is a big reason why people have chosen to focus on behavior and community involvement. It's the closest thing to objective data we can get.

This isn't an argument against what you've written. I still think a lot of people err by assigning the label "value drift" to things like leaving the EA community, which could be caused by a number of different scenarios in which doing that thing actually perfectly reflects your values. I guess I don't know what the solution is here, but I do think it's worth digging into further.

EA isn't (supposed to be) dogmatic, and hence doesn't have clearly defined values.

I agree.

I think this is a big reason why people have chosen to focus on behavior and community involvement.

Community involvement is just instrumental to the goals of EA movement building. I think the outcomes we want to measure are things like careers and donations. We also want to measure things that are instrumental to this, but I think we should keep those separate.

Related: my comment on "How have you become more (or less) engaged with EA in the last year?"
