Hey everyone, this is an old draft from 2022 for a post I’ve been hand-wringing about publishing in any form. But it was the most popular draft amnesty idea from last year that I never followed through on, and I’ve finally reached the point where I think something basic like this post is good enough to get my contribution to this discussion out there:

Well before I regularly visited the EA Forum, I remember that one of the first debates I ran into there was the 2021 criticism of ACE’s then-recent social justice activities. The anonymous user “Hypatia” published a set of criticisms of some of these activities, including accusations that ACE promoted unrigorous, vague anti-racist statements and canceled a speaker who was critical of BLM. Some people involved with ACE contested how these events were characterized, and as of today much of the debate has cooled down without anyone changing their minds very much.

One of the criticisms Hypatia brought up was that ACE was evaluating organizations, in part, based on how good it thought their values were, not just their effectiveness at helping animals. Hypatia pointed out that it seemed valuable to have an organization that evaluated pure animal welfare effectiveness, without assuming other values on the part of donors; donors could then decide for themselves what else mattered to them. At the time, this seemed mostly reasonable to me, and I couldn’t think of much wrong with the reasoning. One sort of pushback I’ve seen from ACE, for instance from Leah Edgerton in her interview with Spencer Greenberg, is to deny the premise: looking at aspects of an organization’s internal culture helps predict harder-to-measure aspects of how healthy the organization is, and so how much good it can ultimately do for animals. Another piece of pushback, which I have not seen but suspect would be commonsense to many people, is to bite the bullet: yes, as an organization we are willing to look at things other than pure animal welfare. If an animal rights organization were going around setting factory farmers’ houses on fire, a good evaluation organization would be perfectly reasonable in deciding not to rank it at all, even if it did effectively help animals.

Both tactics seem respectable in theory, but in practice they carry a burden of proof ACE may not meet. Is failure to meet the particular norms ACE prioritizes really so predictive of eventual failure? Hypatia is certainly skeptical in some cases, and I find it hard to believe that these social justice norms are precisely the ones you would look for if your sole goal was predicting organizational stability. And what about the arson example? It is a proof of concept that conflicts of values can sometimes be strong enough to justify disqualification, but these organizations don’t seem to be doing anything a fraction as serious. After listening to the Edgerton interview, I thought about this debate for the first time in a while, and found that I actually had something to add that, as far as I know, hasn’t been mentioned yet. It seems to me like ACE’s strongest defense, and more generally a consideration EA cause evaluators should weigh more.

One of my favorite recent (as of my initial drafting) posts on EA culture from the forum was “Unsurprising things about the EA movement that surprised me” by Ada-Maaria Hyvarinen. In it, she raises the very relatable point that:

“In particular, there is no secret EA database of estimates of the effectiveness of every possible action (sadly). When you tell people effective altruism is about finding effective, research-based ways of doing good, it is a natural reaction to ask: ‘so, what are some good ways of reducing pollution in the Baltic Sea/getting more girls into competitive programming/helping people affected by [current crisis that is on the news]’ or ‘so, what does EA think of the effectiveness of [my favorite charity]’. Here, the honest answer is often ‘nobody in EA knows’, and it is easy to sound dismissive by adding ‘and we are not going to find out anytime soon, since it is obvious that the thing you wanted to know about is not going to be the most effective thing anyway’.”

Maybe it is obvious to some people, but Hypatia’s reaction to ACE looking at values other than animal effectiveness makes the most sense in a world where this point is not true: a world in which all EA organizations are sort of like Charity Navigator and seek, as a primary part of their mission, to evaluate as comprehensive a list of charities as possible. Saying that ACE should stick to animal effectiveness is sort of like asking for an organizational model in which every cause evaluator is engaged in a different effective altruism. Insofar as some measure of how good an organization is doesn’t fit the specific version of EA they are dedicated to, it is simply none of their business. ACE doesn’t look at the impact of its organizations on the global poor, GiveWell doesn’t look at the impact of its organizations on animals, and neither asks how good or bad their organizations are for squishier, harder-to-measure values like promoting a more tolerant, welcoming movement culture. In the real world, where EA evaluators aren’t like Charity Navigator, no one is checking the impact of GiveDirectly on chickens, because it isn’t a candidate for the best charity in the world according to chicken-interested effective altruism.

I cannot think of any reason to like this world, no justification other than “it is not my problem.” The alternative solution, the one that seems to be the default expectation of many EA organizations right now, is a “buyer beware” mentality, in which people looking to donate to a recommended cause must personally decide against it based on their own research into how well the organization fits their values.

It seems to me that a world in which individual donors must determine for themselves whether organizations fit their values, possibly just by checking how an organization talks about itself or obvious things about its approach (whether it sets houses on fire), is strictly worse than a world in which EA evaluators are expected to, in some sense, finish the job: to consider the many values that might go into deciding where to donate, and to carefully investigate all of them for their most promising choices. Admittedly, this point has a complicated relationship to what ACE’s initiatives actually look like. ACE is, after all, moving money through grants, not just publishing evaluations. I think this is a more complicated issue, though similar considerations to the ones I bring up here will, I think, vindicate some approach that looks at values other than just animal welfare (regardless of whether this looks like the stuff ACE is currently looking at, or should look different in some way).

ACE could also be more upfront about separating out the results of these evaluations so that donors can more easily weight them for themselves. It could also look into a wider range of values, or different ones if you dislike ACE’s value picks for independent reasons. There is a ton of room for improvement, but this is roughly how my ranking of EA charity evaluator approaches currently lines up, from best to worst:

  1. There are charity evaluators for every single important value: global health, animal welfare, existential risk, social justice, and everything else donors have a reason to care about. They all do their research as broadly as Charity Navigator, and as deeply as effective altruist charity evaluators. A separate, independent evaluator aggregates these results and can give you rankings based on the weights you specify for each value (see the sketch after this list).
  2. There are separate charity evaluators for different types of values. They all look at the top picks of one another, and give thorough feedback about how they interact with the values the reviewing organization is most interested in.
  3. An independent charity evaluator is dedicated to doing research into the top causes of different evaluators, and providing reports of how these organizations interact with values other than the ones each evaluator prioritizes.
  4. Each charity evaluator organization evaluates only the top organizations in its own field, but investigates how each of these organizations interact with values other than the ones the evaluator is meant to most prioritize, and reports on this.
  5. Each charity evaluator looks into the top organizations in its own field, looks at how these organizations relate to other values it thinks are important, and publishes recommendations that incorporate both (being transparent about how it weighs different values).
  6. Each charity evaluator only looks at one value, like animal welfare. It finds top charities in this field, ranks them, and just shows this to the public.
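To make option 1 concrete, here is a minimal sketch of the arithmetic the independent aggregator might run, written in Python. Everything in it is hypothetical: the charity names, the per-value scores, and the donor weights are invented for illustration, since no such database of scores actually exists.

```python
# Minimal sketch of the "independent aggregator" from option 1.
# All names, scores, and weights are hypothetical placeholders; in
# reality each per-value score would come from a specialist evaluator.

from typing import Dict, List

# Per-charity scores on each value, normalized by the specialist
# evaluators to a shared 0-10 scale so they can be compared.
charity_scores: Dict[str, Dict[str, float]] = {
    "Charity A": {"animal_welfare": 9.1, "global_health": 2.0, "movement_culture": 6.5},
    "Charity B": {"animal_welfare": 4.0, "global_health": 8.7, "movement_culture": 7.2},
    "Charity C": {"animal_welfare": 7.5, "global_health": 3.1, "movement_culture": 3.0},
}

def rank_charities(scores: Dict[str, Dict[str, float]],
                   weights: Dict[str, float]) -> List[str]:
    """Rank charities by the donor's weighted sum across values.

    A missing score is treated as 0 ("no known impact on this value");
    a real aggregator would need a more careful policy than that.
    """
    def weighted_total(per_value: Dict[str, float]) -> float:
        return sum(w * per_value.get(value, 0.0) for value, w in weights.items())

    return sorted(scores, key=lambda name: weighted_total(scores[name]), reverse=True)

# A donor who cares mostly, but not exclusively, about animals:
donor_weights = {"animal_welfare": 0.7, "global_health": 0.2, "movement_culture": 0.1}
print(rank_charities(charity_scores, donor_weights))
# -> ['Charity A', 'Charity C', 'Charity B']
```

The hard part of option 1 is obviously producing the scores, not combining them; the point of the sketch is just that once specialist evaluators publish comparable per-value scores, letting each donor supply their own weights is trivial.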

ACE seemed to be in an uncomfortable middle ground between the 4th and 5th best options. Arguably it is worse than that: it only looks at a handful of values other than animal welfare, and if Edgerton’s testimony is representative, it does not even concern itself with these other values except as instruments to animal welfare. I find this latter claim hard to believe, and it is not necessary to justify ACE’s practices (or some idealized version of them), but if I do believe this claim about how ACE considers other values, I think ACE has reason to go even further.

Even if ACE’s practices are subpar, I think there is reason to expect them to be strictly better than 6, which I see as the default for charity evaluators. Meanwhile, 1 is pretty much impossible. 2 and 3 both seem possible, but 2 is just not done and would require a broad culture shift within EA; 3 is easier, but does require a new organization. Something like 3 happens in a less focused way through scattered forum posts and some global priorities research, but not in the most thorough way it could. All of these seem pretty good to me in the grand scheme of things, especially compared to 6, and thinking about them gives me the impression that there is a great deal of room for impactful entrepreneurship in this field.

I’m not sure why this point doesn’t seem much discussed. One possibility is that it just isn’t a good idea. For instance, maybe how charities fare on values other than the ones they target is simply harder to evaluate than how they fare on their target values, or maybe when this indirect impact is easy to measure, it is so obvious that reports from a charity evaluator aren’t necessary. Taken together, this could mean the resources that would go into such a project wouldn’t be worth it.

That said, the fact that ACE did come to different conclusions about these charities when it looked at values not directly related to animal welfare makes me think this is not so obvious. Another possibility is that, although I haven’t really heard discussion of it, it is something people talk about, or that these organizations take part in, and I just don’t notice. I can certainly think of organizations that do things like this, but mostly they are grant-giving organizations like Open Philanthropy, which I don’t think quite fit what I’m talking about here. Regardless, I thought these thoughts were worth bringing up in case they really do touch on a neglected factor, not just in the ACE debate, but in charity evaluation more broadly.

Comments



Thanks for this interesting perspective on how to balance different values within the work of evaluations, Devin. Considering you drafted this in 2022, we do want to note that a lot has changed at ACE in the last three years, not least of which has been a shift to new leadership. Since early 2022, ACE has transitioned to a new Executive Director, Programs Director, Charity Evaluations Manager, Movement Grants Manager, Operations Director, and Communications Director. 

That said, ACE continues to assess organizational health as part of our charity evaluations—we assess whether any aspects of an organization’s governance or work environment pose a risk to its effectiveness or stability, thereby reducing its potential to help animals. Furthermore, bad actors and toxic practices could negatively affect the reputation of the broader animal advocacy movement, which is highly relevant for a growing social movement, as well as advocates’ wellbeing and their willingness to remain in the movement. You can read more about our reasoning here and about our current evaluation criteria here.

Thanks for your thought-provoking piece. We are continually refining our evaluation methods, so we will consider your points further about the kinds of instrumental information we might want to gather and how we could do so in a pragmatic way.

Thanks, Elisabeth

Thanks for the response, I appreciate it!

Executive summary: Charity evaluators like ACE should aim to assess not just their primary focus (e.g., animal welfare) but also other relevant values, to provide a fuller picture for donors and improve decision-making in effective altruism.

Key points:

  1. The debate over ACE’s evaluation methods highlights a tension between prioritizing pure animal welfare and considering broader values like social justice or organizational culture.
  2. Some argue that ACE should focus solely on effectiveness in animal welfare, while others defend its broader approach as useful for predicting an organization’s overall impact.
  3. Many charity evaluators operate in silos, ignoring cross-cutting impacts; a better system would involve coordination between evaluators to assess organizations from multiple value perspectives.
  4. An ideal system would either feature comprehensive evaluators assessing all major values or independent aggregators synthesizing findings from specialized evaluators.
  5. While ACE’s current approach is imperfect, it is still preferable to narrowly focused evaluations that ignore externalities and broader ethical considerations.
  6. The EA community may benefit from new initiatives or organizations dedicated to filling these evaluation gaps, though practical challenges in assessing indirect impacts remain a key obstacle.

This comment was auto-generated by the EA Forum Team. Feel free to point out issues with this summary by replying to the comment, and contact us if you have feedback.

“Looking at aspects of the internal culture of these organizations is useful to predict harder to measure aspects of how healthy an organization is, and how much good it can ultimately do”

This also could've helped with other orgs over the years, where the "culture" stuff turned out to have important signal. E.g. FTX, Leverage Research.
