huw

Co-Founder & CTO @ Kaya Guides
2182 karma · Joined · Working (6–15 years) · Sydney NSW, Australia
huw.cool

Bio

I live for a high disagree-to-upvote ratio

Comments
298

huw

If I can put this succinctly:

  • Basically all of the charities you reviewed seem to be operating in good faith
  • Your reviews were regularly incendiary and accusatory in spite of this, with little (and in many cases no) evidence that they were operating in bad faith
  • In every case I saw, they took on your feedback and responded reasonably
  • You asked for feedback on your approach and were given it, then didn’t change anything

Ironically, in spite of yourself, you probably did have a positive impact, because many of the charities were engaging with you in good faith and took on your feedback. You should be happy, mate!

ChatGPT’s usage terms now forbid it from giving legal and medical advice:

So you cannot use our services for: provision of tailored advice that requires a license, such as legal or medical advice, without appropriate involvement by a licensed professional (https://openai.com/en-GB/policies/usage-policies/)

Some users are reporting that ChatGPT refuses to give certain kinds of medical advice; I can’t figure out whether this also applies to API usage. If OpenAI are sincere about enforcing this in their ToS, or there’s further regulatory pressure, the models will presumably get better at preventing this, which I think is important.

It sounds like the regulatory threats and negative press may be working. It’ll be interesting to see whether other model providers follow suit, and whether jurisdictions formally regulate this (I can see the EU doing so, but not the U.S.).

In my opinion, the upshot of this is probably that OpenAI are ceding this market to specialised providers who can afford the higher marginal costs of moderation, safety, and regulatory compliance (or to black-market-style providers who refuse to add these safeguards and don’t bow to regulatory pressure). This is probably a good thing—the legal, medical, and financial industries have clearer, industry-specific regulatory frameworks that can more adequately monitor for and prevent harm.

huw

Yep—Beast Philanthropy actually did an AMA here in the past! My takeaway was that the video comes first: your chances of a partnership greatly increase if you can make it entertaining. This is somewhat in contrast with a lot of EA charities, which are quite boring, but I suspect on the margins you could find something good.

What IMHO worked for GiveDirectly in that video, and for Shrimp Welfare in their public outreach, was the counterintuitiveness of some of these interventions. Wild animals, cultured meat, and shrimp are more likely to fit in this bucket than corporate campaigns for chickens, I reckon.

PurpleAir also contribute all of their sensors to IQAir! So you can quickly get a very comprehensive sense of air quality and compare private and public sources.

I suspect the synthesis here is that unguided self-help is very effective when adhered to, but the main challenge is adherence. The reasons to believe this are that there is usually a strong dosage effect in psychotherapy studies, and that the Furukawa study I posted in the first comment found that the only value humans provided was for adherence, not effect size.

Unfortunately, this would then cause big problems, because there is likely a trial bias affecting adherence, potentially inflating estimates by 4× relative to real-world data. I’m surprised that this isn’t covered in the literature, and my surprise is probably good evidence that I have something wrong here. This is one of the reasons I’m keen to study our intervention’s real-world data in a comparative RCT.

You make a strong point about the for-profit space and relative incentives, which is partly why, when I had to decide between founding a for-profit unguided app and joining Kaya Guides, I chose the guided option. As you note, the way the incentives seem to work is that large for-profits can serve LMICs only when profit margins are competitive with expanding further in HICs. This is the case for unguided apps, because translation and adaptation is a cheap fixed cost. But as soon as you have marginal costs, like hiring humans (or buying books, or possibly paying for AI compute), it stops making sense. This is why BetterHelp have only now begun to expand beyond the U.S. to other rich countries.

But I think you implicitly note the flipside: if one intervention has zero marginal cost, then surely it’s going to be more cost-effective, and therefore more attractive to funders? One model I’ve wondered about for an unguided for-profit is essentially licensing its core technology and brand to a non-profit at cost, which would then receive donations, do translations, and distribute in other markets.

Tong et al. (2024)

The devil’s in the details here. The meta-analysis you cite includes an overall estimate for unguided self-help that aggregates over different control condition types (waitlist, care as usual, others). Breaking down by control condition, and adding Karyotaki et al. (2021), which compares guided vs unguided internet-based interventions:

  • Waitlist control
    • Cuijpers et al. (2019): 0.52 (Guided: 0.87)
    • Karyotaki et al. (2021): 0.6 (Guided: 0.8)
    • Tong et al. (2024): 0.71
  • Care as usual control
    • Cuijpers et al. (2019): 0.13 (Guided: 0.47)
    • Karyotaki et al. (2021): 0.2 (Guided: 0.4)
    • Tong et al. (2024): 0.35

Now, Tong et al. (2024) does seem to find higher effects in general, but 32% of the interventions studied included regular human encouragement. Those conditions had an effect size of 0.62, compared to 0.47 for no support (I wish these were disaggregated by control condition, but alas).

Tong also includes significantly more self-guided studies than Cuijpers. When both limit to just low risk-of-bias studies, Tong reports an effect size of 0.43 (not disaggregated across controls, unfortunately), while Cuijpers reports 0.44 for waitlist controls and 0.13 for care as usual. So Tong has included more high risk-of-bias studies, which is quite dramatically lifting their effect size.

Now, as Cuijpers and Karyotaki are both co-authors on the Tong analysis, I’m sure there’s value in including those extra studies, and Tong probably makes me update slightly upwards on the effectiveness of unguided self-help. But I would be wary of concluding that Tong is ‘normal’ and Cuijpers is ‘unusually pessimistic’; probably the inverse is true.

(All up, though, I think it’s quite likely that unguided interventions could end up being more cost-effective than guided ones, and I’m excited to see how this space develops. I’d definitely encourage people to pursue this more! I don’t think the case for unguided interventions rests on their relative effectiveness, but much more on their relative costs.)

Apps vs books

I don’t have a strong sense here. The Jesus-Romero study is good and persuasive, but to be convinced, I’d want to see a study of revealed preferences rather than stated ones. To illustrate: one reason we think apps might be better is that our digital ads reach users while they’re on media apps like Instagram, a behaviour people are probably more likely to engage in when depressed. I think there’s a much lower activity threshold to click on an ad and book a call with us than there is to remember you ordered (or were sent) a book and do an exercise from it.

Regardless, it’s likely that digital interventions are much cheaper (again, probably about $1 per participant in engineering vs. ~$5 (??) for book printing and delivery, assuming both interventions spend $1 on social media) and can scale much faster (printing and delivery require a lot of extra coordination and volume prediction). There’s a good reason many for-profit industries have digitised; it’s just much cheaper and more flexible.

huw

G’day, welcome to the Forum!

I help lead a highly cost-effective EA bibliotherapy charity in India. I agree with most of your points; in fact, Charity Entrepreneurship’s original report into cost-effective psychotherapy interventions recommended buying physical books and distributing them in much the same way you suggest. My charity, Kaya Guides, was incubated from this idea by CE in 2022. We have learned a lot since then that might add some colour to your post:

  1. Apps are much cheaper than physical books: We deliver our intervention over WhatsApp, which, at scale, will cost us a maximum of, like, 20c per participant. The only reason not to do this is in internet-poor countries, but India has a high and rapidly accelerating rate of internet adoption among its under-resourced, depressed population.

  2. You don’t need to distribute randomly, just use digital ads: We recruit new participants for ~$0.60 via Instagram ads that bounce users directly to WhatsApp. Meta are a trillion-dollar company because they are extremely good at identifying people who might want to use your product; ~70% of users who make it to our depression questionnaire screen positive for depression (see the sketch after this list for what these figures imply per participant).

  3. Adding weekly calls likely doubles to quadruples your adherence: The bibliotherapy you’re referring to, as far as I can infer from your cost model, is also known as ‘unguided self-help’. I should note that the meta-analyses you link are mostly of ‘guided self-help’, which is when the books are paired with regular contact with a therapist. Guided self-help has effect sizes indistinguishable from those of individual and group therapy. You may be interested in this 2019 meta-analysis of 155 RCTs of different delivery formats for CBT, which finds that, relative to care as usual, guided self-help has double the effect size of unguided self-help. The reasons why aren’t perfectly understood, but the general belief is that the therapists don’t provide much in the way of extra help, just a sense of accountability that helps participants make their way through the entire curriculum. See this study, which found that ‘human encouragement’ had significant effects on retention, but that no unique-to-human element had a significant effect on effect sizes directly.
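
A minimal back-of-the-envelope sketch of what points 1 and 2 imply per participant (my own arithmetic, not Kaya Guides’ actual cost model; it assumes every recruited user reaches the depression questionnaire):

```python
# Back-of-the-envelope cost per depressed participant reached, using the
# figures quoted above. Assumption: every recruit takes the questionnaire.

AD_COST_PER_RECRUIT = 0.60   # USD per user bounced from Instagram to WhatsApp
SCREEN_POSITIVE_RATE = 0.70  # ~70% of questionnaire-takers screen positive
DELIVERY_COST = 0.20         # USD per participant over WhatsApp, at scale

# Ad spend needed per participant who actually screens positive:
recruitment = AD_COST_PER_RECRUIT / SCREEN_POSITIVE_RATE

print(f"~${recruitment:.2f} recruitment + ~${DELIVERY_COST:.2f} delivery "
      f"= ~${recruitment + DELIVERY_COST:.2f} per depressed participant, "
      f"before counsellor time (point 3)")
```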

FWIW, I wrote about the difference between unguided and guided self-help previously on the Forum; the main things that have changed in my thinking since are that (a) you can get the cost of counsellors down further than previously thought, including by augmenting them with AI, and (b) the adherence of unguided interventions seems lower than I thought. Anyway, we’ll hopefully be doing an RCT on this soon to put the issue to bed a bit more thoroughly :)

I also have updated estimates of the cost-effectiveness relative to cash transfers. Using the Happier Lives Institute’s methodology, we probably create around 1.28 wellbeing-years of value per participant (including spillovers to other members of their household), which is a bit less than top group therapy charities such as StrongMinds and Friendship Bench. But we currently treat people for around $50 per person, should reach ~$20 per person next year, and ~$5–7 per person at scale (I intend to write a post on this soon). Cash transfers, meanwhile, create about 9.22 wellbeing-years when administered by GiveDirectly, and cost about $1,200.
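
To make the comparison concrete, here’s a minimal sketch using only the figures above (my arithmetic, not the Happier Lives Institute’s methodology; the $6 ‘at scale’ cost is my assumed midpoint of the $5–7 range):

```python
# Wellbeing-years (WELLBYs) created per $1,000, using the figures quoted above.

def wellbys_per_1000(wellbys_per_person: float, cost_per_person: float) -> float:
    """WELLBYs created per $1,000 spent, given per-person value and cost."""
    return wellbys_per_person / cost_per_person * 1_000

# Kaya Guides: ~1.28 WELLBYs per participant, incl. household spillovers
for label, cost in [("today", 50), ("next year", 20), ("at scale", 6)]:
    print(f"Kaya Guides, {label}: {wellbys_per_1000(1.28, cost):.1f} WELLBYs/$1,000")

# GiveDirectly: ~9.22 WELLBYs per ~$1,200 cash transfer
print(f"GiveDirectly cash: {wellbys_per_1000(9.22, 1200):.1f} WELLBYs/$1,000")
```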

Costly signals like hunger strikes are only likely to shift public opinion if the public actually hears about them, which is only going to happen if the issue already has some level of public salience. (Whereas protests are better for building public salience, because they’re better suited to mass turnout.)

I then also argued that a costly signal like this is unlikely to persuade people at Anthropic, who are already unusually familiar with the debates around AI safety and pausing AI, and who don’t hold a duty of care over the protestor. It’s at this point that the harms (including inspiring other hunger strikers) overwhelm the benefits.

Well, at least now I have a very salient example of a hunger strike inspiring other people to hunger strike.
