To be 100% clear about what I see as the main issue (by what I think are 80k's lights), it's not that the podcast is less interesting for an EA Forum audience, but rather that it's less interesting in general. It's a niche podcast for people who already think AI is very important.
I'm sort of confused by how this interacts with the goals laid out in the Google Doc. I think it's great to target elite decision-makers — but I would have assumed the greatest impact is on the extensive margin, among people who (1) have decision-making power but aren't AI special...
Hey Matt, obviously there's a tonne one could say here, but just to offer some quick thoughts:
No, I really don't. Sometimes you see things in the same territory on Dwarkesh (which is very AI-focused) or EconTalk (which is shorter and less and less interesting to me lately). Rationally Speaking was wonderful but appears to be done. Hear This Idea is intermittent and often more narrowly focused. You get similar guests on podcasts like Jolly Swagman, but the discussion is often at too low a level, with worse questions asked. I have little hope of finding episodes like those with Hannah Ritchie, Christopher Brown, Andy Weber, or Glen Weyl anywhere el...
It's great to see the podcast expanding. I think the ship has already sailed on this, but it feels important for me to flag two experiences I've had since the podcast's "shift to AI."
Thanks Matt and others commenting here. I have independently started worrying about the show being too narrow and repetitive this year, and will be factoring in the issues people have raised here in planning for next year!
(Unfortunately I can't say we'll probably get back to being as interesting for an EA Forum audience as we once were, as we're working with a different theory of change now and I think for better or worse the times we're living in call for shifting strategy.)
This also applies to the 80k brand as a whole. I used to recommend it to people interested in having an impact with their career, but ever since 80k pivoted to an AI career funnel I recommend it to fewer people, and always with the caveat of "They focus only on AI now, but there is some useful content hidden beneath."
I had a similar experience. I recommended the podcast to dozens of people over the years, because it was one of the best for fascinating interviews with great guests on a very wide range of topics. However, since it switched to AI as the main topic, I have recommended it to zero people, and I don't expect this to change if the focus stays this way.
"Though I think AI is critically important, it is not something I get a real kick out of thinking and hearing about."
-> Personally, I find a whole lot of non-technical AI content to be highly repetitive. It seems like a lot of the same questions are being discussed again and again with fairly little progress.
For 80k, I think I'd really encourage the team to focus a lot on figuring out new subtopics that are interesting and important. I'm sure there are many great stories out there, but I think it's very easy to get trapped into talking about the routine updates or controversies of the week, with little big-picture understanding.
While this is true, I think this comment somewhat misunderstands the point of the post, or at least doesn't engage with the most promising interpretation of it.
I work at Founders Pledge, and I do think it is true that the correct function of an org like FP is (speaking roughly) to move money that is not effectiveness-focused from "bad stuff" to "good stuff." Over the years, FP has had many conversations about the extent to which we want to encourage "better" giving as opposed to "best" giving. I think we've basically cohered on the view that focusing on "b...
FP Research Director here.
I think Aidan and the GWWC team did a very thorough job on their evaluation, and in some respects I think the report serves a valuable function in pushing us towards various kinds of process improvements.
I also understand why GWWC came to the decision they did: to not recommend GHDF as competitive with GiveWell. But I'm also skeptical that any organization other than GiveWell could pass this bar in GHD, since it seems that in the context of the evaluation GiveWell constitutes not just a benchmark for point-estimate CEAs but also a...
[highly speculative]
It seems plausible to me that the existence of higher degrees of random error could inflate a more error-tolerant evaluator's CEAs for funded grants as a class. Someone could probably quantify that intuition a whole lot better, but here's one thought experiment:
Suppose ResourceHeavy and QuickMover [which are not intended to be GiveWell and FP!] are evaluating a pool of 100 grant opportunities and have room to fund 16 of them. Each has a policy of selecting the grants that score highest on cost-effectiveness. ResourceHeavy spends a ton o...
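As a minimal sketch of that selection-under-noise intuition (all numbers here are made up, not estimates of any real evaluator): draw a pool of grants with true cost-effectiveness, add estimation error of two different sizes, fund the top scorers, and compare estimated vs. true cost-effectiveness of the funded set.

```python
import numpy as np

rng = np.random.default_rng(0)
n_grants, n_funded, n_sims = 100, 16, 10_000

def inflation_among_funded(noise_sd):
    """Average (estimated - true) cost-effectiveness among the top-scoring grants."""
    gaps = []
    for _ in range(n_sims):
        true_ce = rng.lognormal(mean=0.0, sigma=0.5, size=n_grants)    # hypothetical true CE
        estimate = true_ce + rng.normal(0.0, noise_sd, size=n_grants)   # noisy but unbiased CEA
        funded = np.argsort(estimate)[-n_funded:]                       # fund the 16 highest scorers
        gaps.append(estimate[funded].mean() - true_ce[funded].mean())
    return float(np.mean(gaps))

print("low-error evaluator: ", round(inflation_among_funded(0.1), 3))
print("high-error evaluator:", round(inflation_among_funded(1.0), 3))
```

On these made-up numbers, both evaluators' funded portfolios look better on paper than they really are, but the noisier one by a much wider margin, which is the "inflated CEAs for funded grants as a class" effect described above.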
Hey Darren, thanks for doing this AMA — and thanks for doing your part to steer money to such a valuable and critically important cause.
Can you describe a bit about the decision-making process at Beast Philanthropy? More to the point, what would an optimal decision-making process look like, in your view? E.g., how would you use research, how would you balance giving locally vs. globally, how would you think about doing the most good possible (or constraining that in some way), etc.?
I listened to the whole episode — if I understood correctly, they are mostly skeptical that there are effects at very low blood lead levels. At the end of the podcast, Stuart or Tom (can't remember which) explicitly says that they're not skeptical that lead affects IQ, and they spend most of the episode addressing the claimed relationship at low BLLs (rather than the high ones addressed by LEEP, CGD, other interventions).
I'd be interested in exploring funding this and the broader question of ensuring funding stability and security robustness for critical OS infrastructure. @Peter Wildeford is this something you guys are considering looking at?
I'm also strongly interested in this research topic — note that although the problem is worst in the U.S., the availability and affordability of fentanyl (which appears to be driving OD deaths) suggests that this could easily spread to LMICs in the medium-term, suggesting that preventive measures such as vaccines could even be cost-effective by traditional metrics.
Easily reconciled — most of our money moved is via advising our members. These grants are in large part not public, and members also grant to many organizations that they choose irrespective of our recommendations. We provide the infrastructure to enable this.
The Funds are a relatively recent development, and indeed some of the grants listed on the current Fund pages were actually advised by the fund managers, not granted directly from money contributed to the Fund (this is noted on the website if it's the case for each grant). Ideally, we'd be able to gro...
We (Founders Pledge) do have a significant presence in SF, and are actively trying to grow much faster in the U.S. in 2024.
A couple weakly held takes here, based on my experience:
My point is precisely that you should not assume any view. My position is that the uncertainties here are significant enough to warrant some attention to nuclear war as a potential extinction risk, rather than to simply bat away these concerns on first principles and questionable empirics.
Where extinction risk is concerned, it is potentially very costly to conclude on little evidence that something is not an extinction risk. We do need to prioritize, so I would not for instance propose treating bad zoning laws as an X-risk simply because we can't demonstra...
If you leave 1,000 - 10,000 humans alive, the longterm future is probably fine
This is a very common claim that I think needs to be defended somewhat more robustly instead of simply assumed. If we have one strength as a community, it is in not simply assuming things.
My read is that the evidence here is quite limited, the outside view suggests that losing 99.9999% of a species / having a very small population is a significant extinction risk, and that the uncertainty around the long-term viability of collapse scenarios is enough reason to want to avoid near-extinction events.
I disagree with the valence of the comment, but think it reflects legitimate concerns.
I am not worried that "HLI's institutional agenda corrupts its ability to conduct fair-minded and even-handed assessment." I agree that there are some ways that HLI's pro-SWB-measurement stance can bleed into overly optimistic analytic choices, but we are not simply taking analyses by our research partners on faith and I hope no one else is either. Indeed, the very reason HLI's mistakes are obvious is that they have been transparent and responsive to criticism.
We disagree...
I agree that there are some ways that HLI's pro-SWB-measurement stance can bleed into overly optimistic analytic choices, but we are not simply taking analyses by our research partners on faith and I hope no one else is either.
Individual donors are, however, more likely to take a charity recommender's analysis largely on faith -- because they do not have the time or the specialized knowledge and skills necessary to kick the tires. For those donors, the main point of consulting a charity recommender is to delegate the tire-kicking duties to someone who has the time, knowledge, and skills to do that.
I guess I would very slightly adjust my sense of HLI, but I wouldn't really think of this as an "error." I don't significantly adjust my view of GiveWell when they delist a charity based on new information.
I think if the RCT downgrades StrongMinds' work by a big factor, that won't really introduce new information about HLI's methodology/expertise. If you think there are methodological weaknesses that would cause them to overstate StrongMinds' impact, those weaknesses should be visible now, irrespective of the RCT results.
I can also vouch for HLI. Per John Salter's comment, I may also have been a little sus early on (sorry Michael), but HLI's work has been extremely valuable for our own methodology improvements at Founders Pledge. The whole team is great, and I will second John's comment to the effect that Joel's expertise is really rare and that HLI seems to be the right home for it.
Just a note here as the author of that lobbying post you cite: the CEA including the 2.5% change in chance of success is intended to be illustrative — well, conservative, but it's based on nothing more than a rough sense of effect magnitude from having read all those studies for the lit review. The specific figures included in the CEA are very rough. As Stephen Clare pointed out in the comments, it's also probably not realistic to have modeled that as normal with a [0, 5] 95% CI.
Hey Vasco, you make lots of good points here that are worth considering at length. These are topics we've discussed on and off in a fairly unstructured way on the research team at FP, and I'm afraid I'm not sure what's next when it comes to tackling them. We don't currently have a researcher dedicated to animal welfare, and our recommendations in that space have historically come from partner orgs.
Just as context, the reason for this is that FP has historically separated our recommendations into three "worldviews" (longtermism, current generations, and ani...
Hey Matthew, thanks for sharing this. Can you provide some more information (or link to your thoughts elsewhere) on why fervor around UV-C is misplaced? As you know, ASHRAE Standards 185.1 and 185.2 concern testing of UV devices for germicidal irradiation, so I'd be particularly interested to know if this was an area that ASHRAE itself had concluded was unpromising.
I thought of some other down-the-line feature requests:
Ah, great! I think it would be nice to offer different aggregation options, though if you only offer one, I agree that the geometric mean of odds is the best default. But I can imagine people wanting to use medians or averages, or even specifying their own aggregation functions. Especially if you are trying to encourage uptake by less technical organizations, it seems important to offer at least one option that is more legible to less numerate people.
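For illustration, here is a small sketch (with made-up forecast values) of how those aggregation options differ on the same set of probabilities:

```python
import numpy as np

def aggregate(probs, method):
    """Aggregate individual probability forecasts into one number."""
    probs = np.asarray(probs, dtype=float)
    if method == "geo_mean_odds":
        odds = probs / (1.0 - probs)
        agg_odds = np.exp(np.mean(np.log(odds)))   # geometric mean of odds
        return agg_odds / (1.0 + agg_odds)          # convert back to a probability
    if method == "median":
        return float(np.median(probs))
    if method == "mean":
        return float(np.mean(probs))
    raise ValueError(f"unknown method: {method}")

forecasts = [0.1, 0.25, 0.4, 0.7]  # hypothetical individual forecasts
for m in ("geo_mean_odds", "median", "mean"):
    print(m, round(aggregate(forecasts, m), 3))
```

The median and mean versions are the sort of thing that is easier to explain to less numerate users, which is the trade-off I'm pointing at.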
Honestly, what surprises me most here is how similar all four organizations' numbers are across most of the items involved
This was also gratifying for us to see, but it's probably important to note that our approach incorporates weights from both GiveWell and HLI at different points, so the estimates are not completely independent.
I haven't thought extensively about what kind of effect size I'd expect, but I think I'm roughly 65-70% confident that the RCT will return evidence of a detectable effect.
But my uncertainty is more in terms of rating upon re-evaluating the whole thing. Since I reviewed SM last year, we've started to be a lot more punctilious about incorporating various discounts and forecasts into CEAs. So on the one hand I'd naturally expect us to apply more of those discounts on reviewing this case, but on the other hand my original reason for not discounting HLI's...
As promised, I am returning here with some more detail. I will break this (very long) comment into sections for the sake of clarity.
My overview of this discussion
It seems clear to me that what is going on here is that there are conflicting interpretations of the evidence on StrongMinds' effectiveness. In particular, the key question here is what our estimate of the effect size of SM's programs should be. There are other uncertainties and disagreements, but in my view, this is the essential crux of the conversation. I will give my own (personal) interpretat...
During the re-evaluation, it would be great if FP could also check the partnership programme by StrongMinds, e.g. whether this is an additional source of revenue for them, and what the operational costs of the partners who help treat additional patients for them are. At the moment these costs are not incorporated into HLI's CEA, but partners were responsible for ~50% and ~80% of the clients treated in 2021 and 2022 respectively. For example, if we crudely assume costs of treatment per client are constant regardless of whether they're treated by StrongMinds or...
Hey Simon, I remain slightly confused about this element of the conversation. I take you to mean that, since we base our assessment mostly on HLI's work, and since we draw different conclusions from HLI's work than you think are reasonable, we should reassess StrongMinds on that basis. Is that right?
If so, I do look forward to your thoughts on the HLI analysis, but in the meantime I'd be curious to get a sense of your personal levels of confidence here — what does a distribution of your beliefs over cost-effectiveness for StrongMinds look like?
Fair enough. I think one important thing to highlight here is that though the details of our analysis have changed since 2019, the broad strokes haven’t — that is to say, the evidence is largely the same and the transformation used (DALY vs WELLBY), for instance, is not super consequential for the rating.
The situation is one, as you say, of GIGO (though we think the input is not garbage) and the main material question is about the estimated effect size. We rely on HLI’s estimate, the methodology for which is public.
I think your (2) is not totally fair to S...
“I think my main takeaway is my first one here. GWWC shouldn't be using your recommendations to label things top charities. Would you disagree with that?”
Yes, I think so. I’m not sure why this should be the case. Different evaluators have different standards of evidence, and GWWC is using ours for this particular recommendation. They reviewed our reasoning and (I gather) were satisfied. As someone else said in the comments, the right reference class here is probably deworming: “big if true.”
The message on the report says that some details have changed, bu...
Yes, I think so. I’m not sure why this should be the case. Different evaluators have different standards of evidence, and GWWC is using ours for this particular recommendation. They reviewed our reasoning and (I gather) were satisfied. As someone else said in the comments, the right reference class here is probably deworming: “big if true.”
I'm afraid that doesn't make me super impressed with GWWC, and it's not easy for non-public reasoning to be debunked. Hopefully you'll publish it and we can see where we disagree.
I think there's a big difference between ...
Hi Simon, thanks for writing this! I’m research director at FP, and have a few bulleted comments in response, but overall just want to indicate that this post is very valuable. I’m also commenting on my phone and don’t have access to my computer at the moment, but can participate in this conversation more energetically (and provide more detail) when I’m back at work next week.
I basically agree with what I take to be your topline finding here, which is that more data is needed before we can arrive at GiveWell-tier levels of confidence about Strong
The 2019 report you link (and the associated CEA) is deprecated (FP hasn’t been resourced to update public-facing materials, a situation that is now changing), but the proviso at the top of the page is accurate: we stand by our recommendation.
The page doesn't say deprecated and GWWC are still linking to it and recommending it as a top charity. I do think your statements here should be enough for GWWC to remove them as a top charity.
This is what triggered the whole thing in the first place - I have had doubts about StrongMinds for a long time (I private...
Thanks for this! Useful to get some insight into the FP thought process here.
The effect sizes observed are very large, but it’s important to place in the context of StrongMinds’ work with severely traumatized populations. Incoming PHQ-9 scores are very, very high, so I think ... 2) I’m not sure that our general priors about the low effectiveness of therapeutic interventions are likely to be well-calibrated here.
(emphasis added)
Minor nitpick (I haven't personally read FP's analysis / work on this):
Appendix C (pg 31) details the recruitment...
Hey Nick, thanks for this very valuable experience-informed comment. I'm curious what you make of the original 2002 RCT that first tested IPT-G in Uganda. When we (at Founders Pledge) looked at StrongMinds (which we currently recommend, in large part on the back of HLI's research), I was surprised to see that the results from the original RCT lined up closely with the pre/post scores reported by recent program participants.
Would your take on this result be that participants in the treated group were still basically giving what they saw as...
I agreed with your comment (I found it convincing) but downvoted it because, if I were a first-time poster here, I would be much less likely to post again after having my first post characterized as foolish.
As one of many “naive functionalists”, I found the OP very valuable as a challenge to my thinking, and so I want to come down strongly against discouraging such posts in any way.
I agree: the EA community claims to be "open to criticism", but having someone call a first-time poster's well-articulated and well-argued post foolish is, quite frankly, really disappointing.
In addition, the poster is a professional and has valuable knowledge regardless of how you feel about the merits of their argument.
I'm a university student and run an EA group at my university. I really wish the community would be more open to professionals like this poster who aren't affiliated with an EA organization, but can contribute different perspectives that aren't as common within the community.
These seem like broadly reasonable heuristics, but they kick the can on who is an expert, which is where most of the challenge in deference lies.
The canonical (recent) example of this is COVID, when doctors and epidemiologists, who were perceived by the general public as the relevant experts, weighed in on questions of public policy, in many cases giving the impression of consensus in their communities. I think there is a good argument to be made that public policy “experts” were in fact better placed to give recommendations on many of these issues. Regard...
(I am research director at FP)
Thanks for all of your work on this analysis, Vasco. We appreciate your thoroughness and your willingness to engage with us beforehand. The work is obviously methodologically sound and, as Johannes indicated, we generally agree that climate is not among the top bets for reducing existential risk.
I think that "mitigating existential risk as cost-effectively as possible" is entailed by the goal of doing as much good as possible in the world, which is why FP exists. To be absolutely clear, FP's goal is to do the maximum possible ...
Do you have any plans for interoperability with other PPLs or languages for statistical computing? It would be pretty useful to be able to, e.g. write a model in Squiggle and port it easily to R or to PyMC3, particularly if Bayesian updating is not currently supported in Squiggle. I can easily imagine a workflow where we use Squiggle to develop a prior, which we'd then want to update using microdata in, say, Stan (via R).
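To make the imagined workflow concrete, here is a rough sketch of the manual version of it today, using PyMC3 rather than Stan; the prior parameters are assumed to have been read off a Squiggle model, and all numbers are hypothetical:

```python
import numpy as np
import pymc3 as pm

# Hypothetical: a Squiggle model summarizes our prior on an effect size as roughly normal(0.5, 0.2).
# Re-specify that prior by hand and update it on (made-up) microdata.
microdata = np.array([0.8, 0.3, 0.6, 0.9, 0.4, 0.7])

with pm.Model():
    effect = pm.Normal("effect", mu=0.5, sigma=0.2)      # prior carried over from the Squiggle model
    noise = pm.HalfNormal("noise", sigma=0.5)             # observation noise
    pm.Normal("obs", mu=effect, sigma=noise, observed=microdata)
    trace = pm.sample(2000, tune=1000, return_inferencedata=True)

print(float(trace.posterior["effect"].mean()))
```

The annoying step is the manual re-specification of the prior; an export path from Squiggle would remove it.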
Founders Pledge is hiring an Applied Researcher to work with our climate lead evaluating funding opportunities, finding new areas to research within climate, evaluating different theories of change, and granting from FP's Climate Fund.
We're open to multiple levels of seniority, from junior researchers all the way up to experienced climate grantmakers. Experience in climate and familiarity with energy systems are a big plus, but not 100% necessary.
Our job listing is here. Please note that the first round consists of a resume screen and a preliminary task. ...
Something I've considered making myself is a Slackbot for group decision-making: forecasting, quadratic voting, etc. This seems like it would be very useful for lots of organizations and quite a low lift. It's not the kind of thing that seems easily monetizable at first, but it seems reasonable to expect that if it proves valuable, it could be the kind of thing that people would eventually have to buy "seats" for in larger organizations.
I appreciate your taking the time to write out this idea and the careful thought that went into your post. I liked that it was kind of in the form of a pitch, in keeping with your journalistic theme. I agree that EAs should be thinking more seriously about journalism (in the broadest possible sense) and I think that this is as good a place as any to start. I want to (a) nitpick a few things in your post with an eye to facilitating this broader conversation and (b) point out what I see as an important potential failure mode for an effort like this.
You chara...
While I’m skeptical about the idea that particular causes you’ve mentioned could truly end up being cost effective paths to reducing suffering, I’m sympathetic to the idea that improving the effectiveness of activity in putatively non-effective causes is potentially itself effective. What interventions do you have in mind to improve effectiveness within these domains?
To clarify, I'm not sure this is likely to be the best use of any individual EA's time, but I think it can still be true that it's potentially a good use of community resources, if intelligently directed.
I agree that perhaps "constitutionally" is too strong - what I mean is that EAs tend (generally) to have an interest in / awareness of these broadly meta-scientific topics.
In general, the argument I would make is for greater attention to the possibility that mainstream causes deserve it, and for more meta-level arguments to that effect (like your post).
Thanks for this! It seems like much of the work that went into your CEA could be repurposed for explorations of other potentially growth- or governance-enhancing interventions. Since finding such an intervention would be quite high-value, and since the parameters in your CEA are quite uncertain, it seems like the value of information with respect to clarifying these parameters (and therefore the final ROI distribution) is probably very high.
Do you have a sense of what kind of research or data would help you narrow the uncertainty in the parameter inputs of your cost-effectiveness model?
On the face of it, it seems like researching and writing about "mainstream" topics is net positive value for EAs for the reasons you describe, although not obviously an optimal use of time relative to other competing opportunities for EAs. I've tried to work out in broad strokes how effective it might be to move money within putatively less-effective causes, and it seems to me like (for instance) the right research, done by the right person or group, really could make a meaningful difference in one of these areas.
Items 2.2 and 2.3 (in your summary) are, to...
I think about this all the time. It seems like a really high-value thing to do not just for the sake of other communities but even from a strictly EA perspective— discourse norms seem to have a real impact on the outcome of decision-relevant conversations, and I have an (as-yet unjustified) sense that EA-style norms lead to better normative outcomes. I haven't tried it, but I do have a few isolated, perhaps obvious observations.
I guess a more useful way to think about this for prospective funders is to move things about again. If you can exert c/x leverage over funds within Cause Y, then you're justified in spending c to do so provided you can find some Cause Y such that the distribution of DALYs per dollar meets the condition...
...which makes for a potentially nice rule of thumb. When assessing some Cause Y, you need only ("only") identify a plausibly best or close-to-best opportunity, as well as the median one, and work from there.
Obviously this condition...
Under what circumstances is it potentially cost-effective to move money within low-impact causes?
This is preliminary and most likely somehow wrong. I'd love for someone to have a look at my math and tell me if (how?) I'm on the absolute wrong track here.
Start from the assumption that there is some amount of charitable funding that is resolutely non-cause-neutral. It is dedicated to some cause area Y and cannot be budged. I'll assume for these purposes that DALYs saved per dollar is distributed log-normally within Cause Y:
I want t...
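The specific condition is cut off above, so purely as an illustrative sketch of the intuition (all parameters made up): under a log-normal distribution of DALYs per dollar within Cause Y, the gap between a near-best opportunity and the median one is what the leverage spend has to be cheap relative to, compared against a benchmark cause.

```python
import numpy as np

# Made-up within-cause distribution of DALYs per dollar for Cause Y (log-normal).
median_dalys_per_dollar = 1e-4   # hypothetical median opportunity in Cause Y
sigma = 1.2                      # hypothetical spread of log(DALYs per dollar)

# For a log-normal, the p-th percentile equals median * exp(sigma * z_p).
z_95 = 1.645                                        # ~95th-percentile z-score
near_best = median_dalys_per_dollar * np.exp(sigma * z_95)

x = 1_000_000                    # non-cause-neutral dollars you could redirect within Cause Y
gain = x * (near_best - median_dalys_per_dollar)    # extra DALYs from moving median -> near-best

benchmark = 1e-2                 # hypothetical DALYs per dollar at a top cause
max_worthwhile_spend = gain / benchmark             # spending c is justified only if gain > c * benchmark
print(f"Redirecting ${x:,} within Cause Y is worth spending up to ~${max_worthwhile_spend:,.0f}")
```

On these made-up numbers, the rule-of-thumb flavour comes through: a rough handle on the median opportunity, a plausibly near-best one, and an external benchmark is enough to back out how much the leverage is worth.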
Thanks! This is very helpful/informative — particularly the thing about YouTube!