

This might get us off track, but it’s easy to underestimate the nature and importance of interindividual variability in this area. Most effectiveness studies can only show you how effective a therapy (or drug) is for the average individual suffering from a given syndrome. (More sophisticated studies and meta-analyses include moderator analyses; they might look at personality variables such as high self-criticism, and so on. More on this below.)

This is all good and informative. If you suffer from a mental health problem, say, depression, you should just try the therapy or antidepressant that works the best for the average person. You should just go with the prior that you are like the average person (unless you have evidence to the contrary). 

However, once you have tried out the first-line treatment (be it a drug or a therapy) and it didn’t work for you, you can either i) give up or ii) try other treatments or drugs. I generally recommend option ii). This post was written, as mentioned in the summary, primarily for people who are interested in therapy (and inner multiplicity) but have had disappointing experiences with IFS and/or CBT. What is your advice? That they try the same approach repeatedly even if it hasn’t worked for them in the past? 

I’d like to explain in more detail how it is possible that i) most studies find that CFT, ST and CBT are comparably effective and ii) some people might see bigger therapeutic improvements from seeing a CFT or ST therapist than from a CBT therapist, and vice versa.

For the sake of illustration, let’s use a completely hypothetical and unrealistic scenario: Assume that all clients suffer from depression but only differ in two aspects: Some are high on self-criticism, some are low; some are good at mental imagery, some are poor. 50% of clients are low on self-criticism and poor at mental imagery, 25% high on self-criticism and good at mental imagery, and 25% fall in the other two categories. Researchers run a perfectly designed and administered RCT that tests the effectiveness of two different therapies on this sample: therapy A and therapy B. The results are as follows.

Average reduction in depression score

| | Low self-criticism & poor mental imagery (50%) | High self-criticism & good mental imagery (25%) | Remaining population (25%) | Overall |
|---|---|---|---|---|
| Therapy A | -4 | 0 | 0 | -2 |
| Therapy B | 0 | -8 | 0 | -2 |

If you average across all people, therapy A and B are equally effective. They both reduce depression scores by 2 points, on average. 

A person who doesn’t know whether she is high or low on self-criticism or mental imagery should be indifferent between the two therapies or go with the therapy that is cheaper, easier, or more widely used. BUT, if you have tried therapy A and it didn’t work for you, you should try therapy B (and vice versa). Likewise, if you know that you are high on self-criticism and good at mental imagery, you should try out therapy B; if you know that you are low on self-criticism and poor at mental imagery, you should try out therapy A.
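To make the arithmetic behind this toy scenario explicit, here is a minimal Python sketch. It only re-uses the hypothetical numbers from the table above; the names `subgroup_shares`, `effects`, `average_effect` and `best_therapy_for` are made up for illustration and don't come from any study.

```python
# Hypothetical numbers from the illustrative scenario (not real data).
# Shares of each subgroup in the sample; they sum to 1.
subgroup_shares = {
    "low self-criticism, poor imagery": 0.50,
    "high self-criticism, good imagery": 0.25,
    "remaining population": 0.25,
}

# Average change in depression score per subgroup (negative = improvement).
effects = {
    "Therapy A": {
        "low self-criticism, poor imagery": -4,
        "high self-criticism, good imagery": 0,
        "remaining population": 0,
    },
    "Therapy B": {
        "low self-criticism, poor imagery": 0,
        "high self-criticism, good imagery": -8,
        "remaining population": 0,
    },
}


def average_effect(therapy):
    """Population-wide average: subgroup effects weighted by subgroup share."""
    return sum(
        share * effects[therapy][group]
        for group, share in subgroup_shares.items()
    )


def best_therapy_for(group):
    """Therapy with the largest score reduction for a given subgroup."""
    return min(effects, key=lambda therapy: effects[therapy][group])


for therapy in effects:
    print(therapy, "overall:", average_effect(therapy))

print(best_therapy_for("high self-criticism, good imagery"))
print(best_therapy_for("low self-criticism, poor imagery"))
```

Both therapies come out identical on the overall average, while the subgroup lookup picks a different therapy depending on which group you belong to. That is the whole point of the example: the average hides the subgroup differences.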

Coming back to the real world: more and more studies are starting to include such moderator analyses (e.g., the study by Leaviss & Uttley (2015) mentioned above). However, in reality, people differ in hundreds of potentially therapy-relevant aspects, not all of which can be studied. For example, a therapist might work well for most people but not for committed atheists who are high on conscientiousness, low on neuroticism and extraversion, high on self-criticism, low on mental imagery, have a family history of schizophrenia and bipolar disorder, like Bayesianism, and were bullied early in their lives. 

I think there are actually good reasons to expect such relevant interindividual differences. One can already observe them in the case of antidepressants, for example. Generally, there is enormous variability between humans in all sorts of psychological and physiological traits. A few individuals do perfectly fine with 5 hours of sleep, some need 9 or more hours (I think Bostrom and Brian Tomasik are among those). For most people, whole grain bread is probably fine and even healthy. But some have coeliac disease. Some people swear that they benefit from meditation, others claim to have suffered enormous negative consequences. The list goes on.

Of course, I’m not saying one should now totally give up on evidence-based medicine. If you start out and haven’t done a lot of self-experimentation, go with the first-line treatment and recommendations that work for the majority of people (e.g., CBT, SSRIs, 8 hours of sleep, etc.). But if a therapy, food, lifestyle choice or antidepressant isn’t working for you, take an empirical approach, and try out alternatives that might work better for you.

I acknowledge that there is a failure mode here of trusting your own experience and experiments too much and, e.g., starting to swear by the effectiveness of energy crystals, tarot cards, and so on. I’d say that trying out CFT and ST (and IFS) is still far removed from this failure mode, for the reasons listed in the post and my above comment. 

First of all, I’d say that schema therapy (ST) and CFT (to some extent) are part of the mainstream. (It’s fair to say that IFS is not part of mainstream medicine. This is also a major reason why we wrote this post: to introduce readers to more “mainstream” therapy approaches that work with inner multiplicity.) 

  • You can find dozens of articles about CFT and ST published in mainstream psychology and psychotherapy journals, often with hundreds of citations.
  • CFT and ST are taught in many CBT schools. For example, Ewelina learned about CFT and ST in her 4-year CBT training (accredited by The European Association for Behavioural and Cognitive Therapies, probably the largest and most mainstream CBT organization in Europe). 
  • As we write in the post, schema therapy was developed by a CBT therapist (Young) in close collaboration with Aaron Beck, the founder of CBT (see also Beck’s review of this standard ST book). Paul Gilbert, the founder of CFT, also collaborated with Aaron Beck, who was supportive of his work.

Generally, there isn’t a “clash” between ST, CBT, CFT, and the many other schools often called “third-wave” CBT (e.g., ACT and DBT). Virtually no expert in this area thinks that CBT is the only therapy school that is “evidence-based” and that there is no reason to explore or work with other therapy modalities.

My sense is that most CBT scholars respect CFT, ST, and other CBT-affiliated schools and view them as complementary (more on this below). It’s very common for major CBT figures to recommend CFT and ST and to incorporate techniques from these (and other!) schools. Just as an example, Judith Beck, the daughter of Aaron Beck and perhaps one of the most respected and well-known living CBT therapists, had this to say about the standard CFT book for clinicians: 

This excellent and comprehensive volume contains a biopsychosocial model, informed by evolutionary theory. It is an important resource, systematically exploring compassion from many vantage points. It has certainly motivated me to put an increased focus on incorporating principles of compassion-focused therapy into my cognitive behavior therapy practice. I highly recommend this book!

Second, as we mention in the post, there is (tentative) evidence that CFT and ST work as effectively as CBT and, for certain problems, more effectively. For example, ST seems to work better than standard CBT for personality disorders (Bamelis et al., 2014), particularly borderline personality disorder (Jacob & Arntz, 2013, pp. 175-178). [1] This is not too surprising because ST was developed particularly for clients with long-standing emotional problems and personality disorders, for whom CBT doesn’t seem to work so well. 

Likewise, there is tentative evidence that CFT works better for patients high on self-criticism and shame (Leaviss & Uttley, 2015). Again, this makes sense because Gilbert developed CFT when he realized that standard CBT therapy doesn’t work that well for such clients. 

Of course, the evidence base isn’t as strong as one might like (it never is). Kaj discusses some of the reasons for this.

  1. ^

    It should be mentioned that for borderline in particular, the evidence base for DBT is probably even stronger (Lynch et al., 2007); we mention DBT briefly in the last section.

(I have now written a comment elaborating on some of these inconsistencies here.)

Thanks Magnus for your more comprehensive summary of our population ethics study.

You mention this already, but I want to emphasize how much different framings actually matter. This surprised me the most when working on this paper. I’d thus caution anyone against making strong inferences from just one such study.

For example, we conducted the following pilot study (n = 101) where participants were randomly assigned to two different conditions: i) create a new happy person, and ii) create a new unhappy person. See the vignette below:

Imagine there was a magical machine. This machine can create a new adult person. This new person’s life, however, would definitely [not] be worth living. They would be very unhappy [happy] and live a life full of suffering and misery [bliss and joy].

You can push a button that would create this new person.

Morally speaking, how good or bad would it be to push that button?

The response scale ranged from 1 = Extremely bad to 7 = Extremely good. 

Creating a happy person was rated as only marginally better than neutral (mean = 4.4), whereas creating an unhappy person was rated as extremely bad (mean = 1.4). So this would lead one to believe that there is strong popular support for the asymmetry. [1]

However, those results were most likely due to the magical machine framing and/or the “push-a-button” framing. Even though these framings clearly “shouldn’t” make such a huge difference.

All in all, we tested many different framings, too many to discuss here. Occasionally, there were significant differences between framings that shouldn't matter (though we also observed many regularities). For example, we had one pilot with the “multiplier framing”: 

Suppose the world contains 1,000 people in total. How many times bigger would the number of extremely happy people have to be than the number of extremely unhappy people for you to think that this world is overall positive rather than negative (i.e., so that it would be better for the world to exist rather than not exist)?

Here, the median trade ratio was 8.5 compared to the median trade ratio of 3-4 that we find in our default framing. It’s clear that the multiplier framing shouldn’t make any difference from a philosophical perspective. 

So seemingly irrelevant or unimportant changes in framings (unimportant at least from a consequentialist perspective) sometimes could lead to substantial changes in median trade ratios. 

However, changes in the intensity of the experienced happiness and suffering—which is arguably the most important aspect of the whole thought experiment—affected the trade ratios considerably less than the above mentioned multiplier framing.

To see this, it’s worth looking closely at the results of study 1b. Participants were first presented with the following scale: 

Let's assume a happiness scale ranging from -100 (extreme unhappiness) to 0 (neutral) to +100 (extreme happiness). Someone on level 0 is in a neutral state that feels neither good nor bad. Someone on level -1 experiences a very mild form of unhappiness, only slightly worse than being in a neutral state. Someone on level +1 experiences a very mild form of happiness, only slightly better than being in a neutral state. Someone on level -100 experiences the absolute worst form of suffering imaginable. Someone on level +100 experiences the absolute best form of bliss imaginable.

[Editor’s note: From now on, the text is becoming more, um, expressive.]

Note that “worst form of suffering imaginable” is pretty darn bad. Being brutally tortured while kept alive by nano bots is more like -90 on this scale. Likewise, “absolute best form of bliss imaginable” is pretty far out there. Feeling, all your life, like you just created friendly AGI and found your soulmate, while being high on ecstasy would still not be +100. 

(Note that we also conducted a pilot study where we used more concrete and explicit descriptions such as “torture”, “falling in love”, “mild headaches”, and “good meal” to describe the feelings of mild or extreme [un]happiness. The results were similar.)

Afterwards, participants were asked:

Given this information, what percentage of extremely [mildly] happy people vs. extremely [mildly] unhappy people would there have to be for you to think that this world is overall positive rather than negative (i.e., so that it would be better for the world to exist rather than not exist)? 
In my view, the percentage [...] would need to be as follows: X% extremely [mildly] happy people; Y% extremely [mildly] unhappy people. 

So how do the MTurkers approach these awe-inspiring intensities? 

First, extreme happiness vs. extreme unhappiness. MTurkers think that there need to exist at least 72% people experiencing the absolute best form of bliss imaginable in order to outweigh the suffering of 28% of people experiencing the worst form of suffering imaginable. 

Toby Ord and the classical utilitarians rejoice, that’s not bad! That’s like a 3:1 trade ratio, pretty close to a 1:1 trade ratio! “And don’t forget that people’s imagination is likely biased towards negativity for evolutionary reasons!”, Carl Shulman says. “In humans, the pleasure of orgasm may be less than the pain of deadly injury, since death is a much larger loss of reproductive success than a single sex act is a gain.” Everyone nods in agreement with the Shulmaster. 

How about extreme happiness vs. mild unhappiness? MTurkers say that there need to exist at least 62% of people experiencing the absolute best form of bliss imaginable in order to outweigh the extremely mild suffering of unhappy people (e.g., people who are stubbing their toes a bit too often for their liking). Brian Tomasik and the suffering-focused crowd rejoice, a 1.5 : 1 trade ratio for practically hedonium to mild suffering?! There is no way the expected value of the future is that good. Reducing s-risks is common sense after all!

How about mild happiness vs. extreme unhappiness? The MTurkers have spoken: A world in which 82% of people experience extremely mild happiness—i.e., eating particularly bland potatoes and listening to muzak without one’s hearing aids on—and 18% of people are brutally tortured while being kept alive by nano bots, is… net positive. 

“Wait, that’s a trade ratio of 4.5:1 !” Toby says. “How on Earth is this compatible with a trade ratio of 3:1 for practically hedonium vs. highly optimized suffering, let alone a trade ratio of 1.5:1 for practically hedonium vs. stubbing your toes occasionally!” Carl screams. He looks at Brian but Brian has already fainted.

Toby, Carl and Brian meet the next day, still looking very pale. They shake hands and agree to not do so much descriptive ethics anymore. 

Years later, all three still cannot stop wincing with pain when “the Long Reflection” is mentioned. 
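For anyone who wants to check the ratios the three are arguing about, here is a minimal Python sketch. It assumes that a "trade ratio" simply means the required share of happy people divided by the remaining share of unhappy people; `trade_ratio` is a hypothetical helper name, and the percentages are the ones reported above for study 1b.

```python
def trade_ratio(percent_happy):
    """Happy-to-unhappy ratio implied by a required percentage of happy people.

    Assumes the two groups together make up 100% of the population.
    """
    percent_unhappy = 100 - percent_happy
    return percent_happy / percent_unhappy


# Required percentage of happy people per condition, as reported above.
conditions = {
    "extreme happiness vs. extreme unhappiness": 72,
    "extreme happiness vs. mild unhappiness": 62,
    "mild happiness vs. extreme unhappiness": 82,
}

for name, percent_happy in conditions.items():
    print(f"{name}: {trade_ratio(percent_happy):.1f} : 1")
```

The ratios come out at roughly 2.6, 1.6 and 4.6 to 1, so the "3:1", "1.5:1" and "4.5:1" in the text are rounded. The ordering is what matters: mild happiness vs. extreme unhappiness demands the highest ratio, while extreme happiness vs. mild unhappiness still demands more than 1:1.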

  1. ^

    We also had two conditions about preventing the creation of a happy [unhappy] person. Preventing a happy person from being created (mean = 3.1) was rated as somewhat bad. Preventing an unhappy person (mean = 5.5) from being created was rated as fairly good.

I'd probably be interested in reading this! 

I think you could post it on your EA forum shortform, in the Effective Altruism Peer Support Facebook group, or in some subreddits (e.g., the Depression subreddit). (Just to be clear, you could post in all of those places, you don't have to pick only one.)

I wouldn’t be totally surprised if it was less predictive than say “openness to new ideas” or something.

That seems possible, yeah. (Generally, it would be interesting to see if other personality traits are also predictive.)

I wonder if you could learn more by interviewing people who are just starting to get interested in EA and seeing how their responses change over say a year? Interviewing people who have just started an intro to EA fellowship/virtual program could work well for this.

Good idea, that would definitely be informative!


Is it possible that being E and A correlates with EAs who have been involved and absorbed EA ideas but wouldn’t correlate with EAs if you were able to survey them before they got involved in EA?

That there is no correlation at all seems unlikely to me. (I could expand on that.)

However, I do agree that there is plausibly an effect where being involved in EA, interacting with fellow EAs and hearing EA arguments makes you score even more highly on expansive altruism and effectiveness-focus scales than when you first encountered EA.

I could also imagine someone who is very open to reasonable arguments but isn’t particularly E or A but comes to agree with the statements over time.

That seems plausible to me as well, particularly for effectiveness-focus.

I agree that this is not an especially cost-effective intervention. I was hoping to convey something else with my comment.

If that new FTX Future Fund invested all $1B into Ukraine it will be a minority percentage of all total funds

Sure, but the fact that an area has already received tens of billions in funding is in itself not a knock-down argument. For example, hundreds of billions of dollars are spent on climate change every year and hundreds of billions were spent on COVID vaccine development alone. But posts about interventions in these areas would receive much less pushback (or usually no pushback). 

Overall, I think that interventions in this space are plausibly more cost-effective than the average climate change intervention discussed by EAs. (That being said, there are additional strategic and PR reasons to praise climate change as a cause area since this is one of the ideological cornerstones of EA’s main political ally.)

The main reason I wrote my comment was not to suggest that this is the most cost-effective intervention (which I agree it is not). I wanted to respond to the large number of downvotes and, if I am to be frank, my impression of the somewhat hostile tone of Dony’s comment, which made me think that many EAs think that OP’s post is clearly net negative.

In addition, I felt that arguments in favor of concessions/giving in to Putin’s threats (e.g., this post) were overrepresented on this Forum (and among EAs I know in private). I was responding more to these sentiments (and also to Dony’s claim that there is no debate). Lastly, there are also game-theoretic reasons not to advertise one's willingness to give in to coercion.
