# bruce

541 karma · Joined Oct 2021

# Bio

Doctor from NZ, now doing Global Health & Development Research @ Rethink Priorities, but interested and curious about most EA topics.

Outside of RP work, I spend some time doing independent "grand futures"/ GPR research (Anders Sandberg/BERI) and very sporadic grantmaking (EAIF). Also looking to re-engage with UN processes for OCA/Summit of the Future.

Feel free to reach out if you think there's anything I can do to help you or your work, or if you have any Qs about Rethink Priorities! If you're a medical student / junior doctor reconsidering your clinical future, or if you're quite new to EA / feel uncertain about how you fit in the EA space, have an especially low bar for reaching out.

Outside of EA, I do a bit of end of life care research and climate change advocacy, and outside of work I enjoy some casual basketball, board games and good indie films. (Very) washed up classical violinist and oly-lifter.

All comments in personal capacity unless otherwise stated.

# Posts 1


If this comment is more about "how could this have been foreseen", then this comment thread may be relevant. I should note that hindsight bias means it's much easier to look back and assess problems as obvious and predictable ex post, when powerful investment firms and individuals who had skin in the game also missed this.

TL;DR:
1) There were entries that were relevant (this one also touches on it briefly)
2) They were specifically mentioned
3) There were comments relevant to this. (notably one of these was apparently deleted because it received a lot of downvotes when initially posted)
4) There have been at least two other posts on the forum prior to the contest that engaged with this specifically

My tentative take is that these issues were in fact identified by various members of the community, but there isn't a good way of turning identified issues into constructive actions - the status quo is that we just have to trust that organisations have good systems in place for this, and that EA leaders are sufficiently careful and willing to make changes or consider them seriously, such that all the community needs to do is "raise the issue". And I think looking at the systems within the relevant EA orgs or leadership is what investigations or accountability questions going forward should focus on - all individuals are fallible, and we should be looking at how we can build systems such that the community doesn't have to just trust that the people who have power and who are steering the EA movement will get it right, and such that there are ways for the community to hold them accountable to their ideals or stated goals if these appear not to be playing out in practice, or risk not doing so.

i.e. if there are good processes and systems in place, and documentation of these processes and decisions, it's more acceptable (because other organisations that probably have a very good due diligence process also missed it). But if there weren't good processes, or if these decisions weren't careful and intentional, then that's much more concerning, especially in the context of specific criticisms that have been raised,[1] or previous precedent. For example, I'd be especially curious about the events surrounding Ben Delo,[2] and the processes that were implemented in response. I'd be curious about whether there are people in EA orgs involved in steering who keep track of potential risks and early warning signs to the EA movement, in the same way the EA community advocates for in the case of pandemics, AI, or even general ways of finding opportunities for impact. For example, SBF, who is listed as an EtG success story on 80,000 Hours, has publicly stated he's willing to go 5x over the Kelly bet, and described yield farming in a way that Matt Levine interpreted as a Ponzi. Again, I'm personally less interested in the object-level decision (e.g. whether or not we take SBF's Kelly bet comments as serious, or whether Levine's interpretation was appropriate), but more in what the process was, how this was considered at the time with the information they had, etc. I'd also be curious about the documentation of any SBF-related concerns that were raised by the community, if any, and how these concerns were managed and considered (as opposed to critiquing the final outcome).

Outside of due diligence and ways to facilitate whistleblowers, decision-making processes around the steering of the EA movement are crucial as well. When decisions are made with benefits that clearly affect one part of the EA community while bringing risks which are pertinent to all,[3] we need to look at how these decisions were made and what was considered at the time of the decision, and going forward, how to either diversify those risks or make decision-making more inclusive of a wider range of stakeholders, keeping in mind the best interests of the EA movement as a whole.

(this is something I'm working on in a personal capacity along with the OP of this post, as well as some others - details to come, but feel free to DM me if you have any thoughts on this. It appears that CEA is also already considering this)

If this comment is about "are these red-teaming contests in fact valuable for the money and time put into it, if it misses problems like this"

I think my view here (speaking only for the red-teaming contest) is that even if this specific contest was framed in a way that it missed these classes of issues, the value of the very top submissions[4] may still have made the efforts worthwhile. The potential value of a different framing was mentioned by another panelist. If it's the case that red-teaming contests are systematically missing this class of issues regardless of framing, then I agree that would be pretty useful to know, but I don't have a good sense of how we would try to investigate this.

1. ^

This tweet seems to have aged particularly well. Despite supportive comments from high-profile EAs on the original forum post, the author seemed disappointed that nothing came of it in that direction. Again, without getting into the object-level discussion of the claims of the original paper, it's still worth asking questions around the processes. If there were actions planned, what did these look like? If not, was that because of a disagreement over the suggested changes, or over the extent to which it was an issue at all? How were these decisions made, and what was considered?

2. ^

Apparently a previous EA-aligned billionaire ?donor who got rich by starting a crypto trading firm, who pleaded guilty to violating the bank secrecy act

3. ^

Even before this, I had heard from a primary source in a major mainstream global health organisation that there were staff who wanted to distance themselves from EA because of misunderstandings around longtermism.

4. ^

As requested, here are some submissions that I think are worth highlighting, or that I considered awarding but that ultimately did not make the final cut. (This list is non-exhaustive, and should be taken more lightly than the Honorable mentions, because by definition these posts are less strongly endorsed by those who judged them. Also commenting in personal capacity, not on behalf of other panelists, etc):

Bad Omens in Current Community Building
I think this was a good-faith description of some potential / existing issues that are important for community builders and the EA community, written by someone who "did not become an EA" but chose to go to the effort of providing feedback with the intention of benefitting the EA community. While these problems are difficult to quantify, they seem important if true, and pretty plausible based on my personal priors/limited experience. At the very least, this starts important conversations about how to approach community building that I hope will lead to positive changes, and a community that continues to strongly value truth-seeking and epistemic humility, which is personally one of the benefits I've valued most from engaging in the EA community.

Seven Questions for Existential Risk Studies
It's possible that the length and academic tone of this piece detracts from the reach it could have, and it (perhaps aptly) leaves me with more questions than answers, but I think the questions are important to reckon with, and this piece covers a lot of (important) ground. To quote a fellow (more eloquent) panelist, whose views I endorse: "Clearly written in good faith, and consistently even-handed and fair - almost to a fault. Very good analysis of epistemic dynamics in EA." On the other hand, this is likely less useful to those who are already very familiar with the ERS space.

Most problems fall within a 100x tractability range (under certain assumptions)
I was skeptical when I read this headline, and while I'm not yet convinced that a 100x tractability range should be used as a general heuristic when thinking about tractability, I certainly updated in this direction, and I think this is a valuable post that may help guide cause prioritisation efforts.

The Effective Altruism movement is not above conflicts of interest
I was unsure about including this post, but I think this post highlights an important risk of the EA community receiving a significant share of its funding from a few sources, both for internal community epistemics/culture considerations as well as for external-facing and movement-building considerations. I don't agree with all of the object-level claims, but I think these issues are important to highlight and plausibly relevant outside of the specific case of SBF / crypto. That it wasn't already on the forum (afaict) also contributed to its inclusion here.

I'll also highlight one post that was awarded a prize, but I thought was particularly valuable:

Red Teaming CEA’s Community Building Work
I think this is particularly valuable because of the unique and difficult-to-replace position that CEA holds in the EA community, and as Max acknowledges, it benefits the EA community for important public organisations to be held accountable (and to a standard that is appropriate for their role and potential influence). Thus, even if the listed problems aren't all fully on the mark, or are less relevant today than when the mistakes happened, a thorough analysis of these mistakes and an attempt at providing reasonable suggestions at least provides a baseline to which CEA can be held accountable for similar future mistakes, or helps with assessing trends and patterns over time. I would personally be happy to see something like this on at least a semi-regular basis (though am unsure about exactly what time-frame would be most appropriate). On the other hand, it's important to acknowledge that this analysis is possible in large part because of CEA's commitment to transparency.

I'm pretty sympathetic to patient philanthropy for longtermist causes that aren't to do with near-term x-risks, because my view is that as long as we preserve option value for the future, future actors will likely be better placed to use the resources than we are, so we should just save the pool for them to use as they see fit.

The example I usually give when explaining my position is thinking about polio and the iron lung. Say someone in the 1910s wanted to invest in significant iron lung production facilities to make sure polio would never be a problem in the future. 20 years later, the polio vaccine is created and all this investment is obsolete. If that money was saved it could perhaps be used to speed up the distribution of polio vaccines and help eradicate polio etc.

One uncertainty I have about this though, is that I don't know how to implement this in practice (what % to give later vs give now? How do I know when I should use this pool?). Curious on any takes!
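To make that tradeoff a bit more concrete, here's the kind of toy calculation I have in mind (all the numbers - the investment return, the horizon, and the "future effectiveness multiplier" - are made up purely for illustration):

```python
# Toy comparison (made-up numbers): donate $1,000 today, vs invest it
# and donate later, when opportunities may be better or worse.

def value_of_giving_later(amount: float, annual_return: float,
                          years: int, future_multiplier: float) -> float:
    """Impact of investing `amount` for `years` and then donating.

    future_multiplier: how much more (>1) or less (<1) effective a
    marginal dollar is expected to be in `years` time, relative to now.
    """
    return amount * (1 + annual_return) ** years * future_multiplier

give_now = 1_000  # normalise: $1 donated today = 1 unit of impact

# Even if future opportunities are only half as effective as today's,
# 20 years of 5% returns roughly compensates (comes out to ~1327 units):
give_later = value_of_giving_later(1_000, 0.05, 20, 0.5)
```

Of course, this just pushes the uncertainty into `future_multiplier`, which is exactly the parameter I don't know how to estimate - hence the question above.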

Hey team!
I'd love it if someone could give me a TL;DR on donation matching - it's something I always get a bit confused about in terms of "how much more should I donate because of this?". And someone asked in a slack I was in about counterfactuals, which I realised I didn't know about either - how else is the money usually used?

Also, does anyone know the optimal % split between donating to a matching pool vs donating directly to the charity (am I basically trading off between how much a matching pool actually increases the pie vs the money not being donated at all?), and how does this change if the org is fully EA funded vs partially vs not at all, etc?
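For what it's worth, here's the toy model I've been using to think about the first question (the `counterfactual_validity` parameter and all the numbers are my own made-up framing, not from any official source):

```python
# Toy model (made-up numbers) for how much a matching pool changes the
# marginal value of my donation, depending on how "counterfactual" the
# match actually is.

def effective_donation(my_gift: float, match_ratio: float,
                       counterfactual_validity: float) -> float:
    """Total extra money the charity receives because I donated.

    counterfactual_validity: fraction of the matched funds that would
    NOT have gone to the charity anyway (1.0 = fully counterfactual
    match; 0.0 = the matcher would have donated regardless, so the
    match adds nothing on the margin).
    """
    return my_gift + my_gift * match_ratio * counterfactual_validity

# Fully counterfactual 1:1 match: my $100 moves $200 in total.
assert effective_donation(100, 1.0, 1.0) == 200.0

# Matcher would have given anyway: my $100 only moves $100 on the margin.
assert effective_donation(100, 1.0, 0.0) == 100.0
```

But this is exactly where I get stuck - I have no idea how to estimate `counterfactual_validity` for a typical matching campaign, which is really what I'm asking about.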

Many thanks for doing this AMA!

I'm personally excited about more work in the EA space on topics around subjective well-being, and was initially excited to see StrongMinds (SM) come so strongly recommended. I do have a few Qs about the incredible success the pilots have shown so far:[1]

1. I couldn't find number needed to treat (NNT)[2] figures anywhere (please let me know if I've missed this!), so I've had a rough go based on the published results, and came to an NNT of around 1.35.[3] Limitations of the research aside, this suggests StrongMinds is among the most effective interventions in all of medicine in terms of achieving its stated goals.
1. If later RCTs and replications showed much higher NNT figures, what do you think would be the most likely reason for this? For comparison:
1. This meta-analysis suggests an NNT of 3 when comparing IPT to a control condition;
2. This systematic review suggests an NNT of 4 for interpersonal therapy (IPT) compared to treatment as usual[4];
3. This meta-analysis suggests a response rate of 41% and an NNT of 4 when comparing therapy to 'waitlist' conditions (and lower when only considering IPT in subgroup analyses); or
4. this meta-analysis which suggests an NNT of 7 when comparing psychotherapies to placebo pill.
2. Admittedly, there are many caveats here - the various linked studies aren't a perfect comparison to SM's work, NNT clearly shouldn't be used as the sole basis for comparison between interventions, and I haven't done enough work here to feel super confident about the quality of SM's research. But my initial reaction upon skimming and seeing response to treatment in the range of 94-99%, or 100+ people with PHQ-9 scores of over 15 basically all dropping down to 1-4[5] after 12-16 weeks of group IPT by lay counsellors, was that this seemed far "too good to be true", and fairly incongruent with ~everything I've learnt or anecdotally seen in clinical practice about the effectiveness of mental health treatments (though clearly I could be wrong!). This is especially surprising given SM dropped the group of participants with minimal or mild depression from the analysis.[6]
3. Were these concerns ever raised by the researchers when writing up the reports? Do you have any reason to believe that the Ugandan context or something about the SM methodology makes your intervention many times more effective than basically any other intervention for depression?
2. How did you come to the 10% figure when adjusting for social desirability bias?
3. Was there a reason an RCT couldn't have been done as a pilot? Just noting that "informal control populations were established for both Phase 1 and Phase 2 patients, consisting of women who screened for depression but did not participate", and the control groups in both pilots were only 36 people, compared to the 244 and 270 in the treatment arms for phase 1 and phase 2 respectively. As a result, 11 / 24 of the villages where the interventions took place did not have a control arm at all. (pg 9)
4. Are you happy to go into a bit more detail about the background of the lay counsellors? E.g. what they knew prior to the SM pilots, how much training (in number of hours) they received, and who ran it (what relevant qualifications / background? How did the trainers get their IPT-G certification - e.g. is this a postgrad psychology qualification, or a one-off training course?). I briefly skimmed the text (appendix E) but also got a bit confused over the difference between "lay counsellor", "mental health facilitator", "mental health supervisor" and "senior technical advisor", and how they're relevant for the intervention.
5. Can you give us a cost breakdown of the 170 / person figure for delivering the programme?
6. In the most recent publication (pg 5), published 2017, the report says: "Looking forward, StrongMinds will continue to strengthen our evaluation efforts and will continue to follow up with patients at 6 or 12 month intervals. We also remain committed to implementing a much more rigorous study, in the form of an externally-led, longitudinal randomized control trial, in the coming years."
   1. Have either the follow-up or the externally-led longitudinal RCT happened yet? If so, are the results shareable with the public? (I note that there has been a qualitative study done on a teletherapy version published in 2021, but no RCT.)

Thanks again! (Commenting in personal capacity etc)

1. ^

Apologies in advance if I've missed anything - I've only briefly skimmed your website's publications, and I haven't engaged with this literature for quite a while now!

2. ^

Quick primer on NNT for other readers. Lower = better, where NNT = 1 means your treatment gets the desired effect 100% of the time.

3. ^

SM's results of 95% depression-free (85% after the 10% adjustment for social desirability bias) give an EER of 0.15 after adjustment. By a more conservative estimate, based on this quote (pg 3): "A separate control group, which consisted of depressed women who received no treatment, experienced a reduction of depressive symptoms in only 11% of members over the same 12-week intervention period", and assuming all of those are clinically significant reductions in depressive symptoms, the CER is 0.89, which gives an NNT of 1 / (0.89 - 0.15) = 1.35. The EER can be adjusted upwards because not all who started in the treatment group were depressed, but this is only 2% and 6% for phase 1 and 2 respectively - so in any case the NNT is unlikely to go much higher than 1.5 even by the most conservative estimate.

4. ^

They also concluded: "We did not find convincing evidence supporting or refuting the effect of interpersonal psychotherapy or psychodynamic therapy compared with 'treatment as usual' for patients with major depressive disorder. The potential beneficial effect seems small and effects on major outcomes are unknown. Randomized trials with low risk of systematic errors and low risk of random errors are needed."

5. ^

6. ^

As pointed out in the report (pg 9): "A total of 56 participants with Minimal or Mild Depression (anyone with total raw scores between 1-9) at baseline in both the treatment intervention (46 participants) and control (10 participants) groups were dropped from the GEE analysis of determining the depression reduction impact. In typical practice around the world, individuals with Minimal/Mild Depression are not considered for inclusion in group therapy because their depressive symptoms are relatively insignificant. StrongMinds consciously included these Minimal/Mild cases in Phase Two because these patients indicated suicidal thoughts in their PHQ-9 evaluation. However, their removal from the GEE analysis serves to ensure that the Impact Evaluation is not artificially inflated, since reducing the depressive symptoms of Minimal/Mild Depressive cases is generally easier to do."

[hastily written, apologies in advance if I don't respond further]

Hi! Thanks for sharing. Unfortunately I don't have the bandwidth to engage with the attention this comment deserves. Part of the reason I didn't default to including my reasoning initially is because I didn't want it to turn into a "neartermism vs longtermism" discussion thread. But it's my fault for not clarifying this, and I appreciate you sharing your thoughts, which will hopefully be helpful for other readers who do have time to engage!

I think in short it sounds like I have larger uncertainties than you around both empirical and philosophical considerations, and because of these uncertainties, it's not clear to me that funding things like AI alignment youtube content or longtermist researchers or forecasting infrastructure is clearly better than preventing malaria, getting more kids vaccinated, or cash transfers to the very poor.

If you have a compelling case that the cost-effectiveness of GiveWell's top charities is much lower than they think it is, I'd encourage you to also reach out to GiveWell! They recently had a "Change Our Mind" contest, and while the contest has closed, I'm confident they'll be open to a forum post that strongly justified why they are too optimistic about their top charities.

Lastly, as mentioned earlier, I prefer to try and contribute to longtermist causes with my work time,[1] and donate to neartermist causes.

1. ^

Things I'm involved with / have done that could plausibly be categorised as longtermist in nature: writing a grand futures paper with Anders Sandberg @ FHI, taking part in a forecasting tourney organised by Tetlock, getting involved with advocating for mitigating existential risks for the UN OCA process, moderating an AI safety panel at the Internet Governance Forum, and contributing to someone's script submission for longtermist youtube content. I didn't go into this previously because I think it is irrelevant, but I'm just noticing a feeling of somehow needing to defend my choice of only donating to neartermist causes. But to be clear, I currently think neartermist donation choices are justifiable even without any contributions from one's work life to existential risk or the far future of humanity.

Sure - admittedly I don't put as much thought into these as I probably should, and have deferred a reasonable amount to GiveWell.
At a high level I generally donate to GHD causes because I'm more risk averse with my donations than my personal time (i.e. I'm very happy to think about and contribute to things that are potentially high payoff but more speculative in my work time, but I prefer my donations to be making tangible, measurable impact with short feedback loops, even if that means "only" getting ~$100/DALY).

More specifically, I chose Malaria Consortium (MC) over AMF because MC is marginally more cost effective at the moment, and because there was a recent paper about the benefits of combined chemoprevention alongside the recently released RTS,S malaria vaccine.

I chose New Incentives because it's also super cost-effective, and I have a soft spot for improving vaccine coverage.

I chose GiveDirectly because that seems "best-in-class" in terms of optimising for preferences of recipients / beneficiaries. I don't have fully formed views about the extent to which we should defer to recipients VS prioritise other measures, but until I do, I opt to donate a bit to GiveDirectly for this reason despite it being less cost effective by GiveWell metrics.

And then I chose my local women's refuge because I think sexual assault / intimate partner violence (IPV) is terrible (I get pretty riled up by abuse of power / trust generally), and I care a lot about it for other reasons that I won't get into here. I'm mindful that this probably means I'm not maximising DALYs averted in expectation, but I'm okay with this because doing so appeals to the less maximising / less utilitarian parts of me. I donate to the local refuge instead of some LMIC version that might have higher EV in part because it's more convenient to, in part because NZ have among the worst domestic violence / IPV rates in the developed world, in part because I care about my local community, and in part because I'm fully aware I'm not donating to this because I think it's the most cost-effective by EA standards.[1]

1. ^

That being said, I do think the harm/suffering from intimate partner violence, both to the victim as well as family members is really difficult to capture. I wouldn't be surprised if something like IPV was near the top in terms of cause areas that might be incorrectly underfunded, conditional on the DALY-maximising approach being wrong. (Something like better access to palliative care / relieving terrible end of life care suffering is probably also something that fits in this kind of category). It's not primarily intended as a hedge against a DALY-maximising approach, but I guess it functionally acts as one.


Malaria Consortium, New Incentives, GiveDirectly, and my local women's refuge (in descending order of donation size).

No worries! I'll DM ya some additional thoughts  :)

Thanks again!

Just chiming in with an extra anecdotal data point that (on my laptop at least) I think the design looks great, from colour scheme to font choice - it's clear that a lot of effort has been put into this. I also really like the save highlight function, which I hadn't seen before, and thought it was a neat design choice to use an asterisk there too (as well as the blurbs that come up when you hover over titles). I've only skimmed 1 article so far so can't comment on the content, but definitely would not hesitate to recommend this to people based on its current design, and I'd probably also anti-recommend adding DALL-E images (at least the 4 that have come up).

Thanks to Clara and the team who have put this together!