Idk if there are best practices for reporting study results in a forum post, but I’ve decided to keep most of the nerdy stuff out. My write-up, along with my methods and data, is available here: https://osf.io/bkgvr/. I spent the first semester of my psychology undergrad thesis trying to better understand how to introduce people to EA.

Background/Summary: The literature is mixed on whether giving people information about cost-effectiveness differences between charities increases their donations to effective charities. Recent work by Lucius Caviola and others, using the expert-estimated cost-effectiveness difference of 100x, seems promising. I wanted to see whether such information would increase people’s interest in effective altruism. It didn’t, and, contrary to expectations, I also found no significant effect of the information on Effective Donations (though read below for more details). Possible takeaway: I’ve sometimes used cost-effectiveness variance information in my EA pitches – this pilot study suggests that such information is not effective at increasing interest in EA. I also found evidence of a selection effect linking interest in EA to political beliefs. More to explore: people were highly interested in EA but not very enthusiastic about getting involved – how do we bridge this gap? A stronger manipulation, a larger sample, etc.

Hypotheses: H1: Correcting misconceptions about charity effectiveness will increase people’s interest in the EA movement. H2: Correcting misconceptions about charity effectiveness will increase donations to effective charities. I also tested various exploratory hypotheses.

Participants: 44 undergrad students at a small liberal arts college in the US, majority female, majority white, majority politically left/liberal, mostly intro psych students. Participants had not learned about EA for more than 5 minutes.

Methods: Recruited by email, participants completed an online Qualtrics survey lasting about 6.5 minutes. I used a 2x2 factorial design. Participants either did an exercise in which they were asked to estimate the difference in cost-effectiveness between average and highly effective charities – or they did a control task. Then, participants were either told that the difference in cost-effectiveness is 100x – or they were given irrelevant control information. All participants received an explanation of why charities can vary in cost-effectiveness (different problems, different methods, or different efficiencies/overhead). Participants then split a donation (real money, but not theirs) between an average cost-effectiveness charity and a highly effective charity.

The rest of the survey was the same for all participants. All read a short description of EA: “Effective Altruism is a community and a social movement dedicated to answering the question ‘how can we do the most good with the resources we have?’ Effective Altruism advises that before jumping into trying to improve the world, we should carefully consider how to do so most effectively, both so that we don’t mistakenly make things worse, and so that we can help as many people as possible. By using evidence and reasoning we can figure out which methods of doing good are the most impactful, and then we can focus our resources – be that our time, money, or careers – on these methods of improving the world.”

Then, participants answered 6 questions about how interesting EA sounds and their general favorability toward it. Next, participants answered 5 questions about whether they wanted to do certain things to get involved (while this was not numeric, I made a numeric scale out of it for analysis purposes). They then answered a comprehension check question to ensure they had read the description of EA (almost all participants answered correctly, and those who did not were excluded). Participants answered demographic questions, including the following (response 1-7): “Here is a 7-point scale on which the political views that people might hold are arranged from extremely liberal (left) to extremely conservative (right). Where would you place yourself on this scale?”

All participants were then informed about the nature of the study and the fact that not everybody had been given complete information. All then saw that the most cost-effective charities save 100x more lives than average charities for the same amount of money. Participants then made a Final Effective Donation in which they did the same donation task as before – but with full information for everybody.

Measures: % donated to highly effective charity (Effective Donation). % donated to highly effective charity at end of survey (Final Effective Donation). Final Effective Donation – Effective Donation (Donation Change). Score based on 6 questions asking about interest/favorability toward EA, 2 reverse scored (Interest Score). Score based on 5 questions asking about getting involved in EA (Involvement Score).
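For readers who like a bit of the nerdy stuff after all, here is a minimal sketch of how the derived measures above can be computed. The column names, file name, and which two items are reverse scored are placeholders for illustration, not the actual variables in my Qualtrics export (see OSF for the real materials).

```python
# Illustrative scoring sketch only -- column/file names and the choice of
# reverse-scored items are placeholders, not the real Qualtrics variables.
import pandas as pd

df = pd.read_csv("survey_data.csv")  # hypothetical file name

# Donation measures: % given to the highly effective charity
df["donation_change"] = df["final_effective_donation"] - df["effective_donation"]

# Interest Score: six 1-7 items summed, two of them reverse scored
interest_items = [f"interest_q{i}" for i in range(1, 7)]
for item in ["interest_q3", "interest_q5"]:   # which two is assumed here
    df[item] = 8 - df[item]                   # reverse a 1-7 item
df["interest_score"] = df[interest_items].sum(axis=1)

# Involvement Score: five items coded "No thanks"=0, "Maybe"=1, "Yes!"=2
involvement_items = [f"involve_q{i}" for i in range(1, 6)]
df["involvement_score"] = (
    df[involvement_items]
    .replace({"No thanks": 0, "Maybe": 1, "Yes!": 2})
    .mean(axis=1)
)
```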

Primary Hypothesis Results: H1: No main effects and no interaction – neither cost-effectiveness information nor doing the exercise about cost-effectiveness increased interest in EA. H2: No main effects and no interaction – neither cost-effectiveness information nor doing the exercise about cost-effectiveness increased Effective Donations.
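For anyone curious what “no main effects and no interaction” refers to concretely, the test is a two-way between-subjects ANOVA along these lines. This is a sketch with illustrative variable names, continuing from the scoring sketch above, not my exact analysis script (that’s on OSF).

```python
# Two-way (2x2) between-subjects ANOVA sketch for H1 and H2.
# Assumes df has "exercise" and "info" condition columns plus the
# outcome columns built in the scoring sketch above.
import statsmodels.api as sm
import statsmodels.formula.api as smf

# H1: Exercise condition, Info condition, and their interaction on Interest Score
h1 = smf.ols("interest_score ~ C(exercise) * C(info)", data=df).fit()
print(sm.stats.anova_lm(h1, typ=2))

# H2: the same model with the initial Effective Donation as the outcome
h2 = smf.ols("effective_donation ~ C(exercise) * C(info)", data=df).fit()
print(sm.stats.anova_lm(h2, typ=2))
```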

Interesting other results: Political identification was associated with interest in EA: being more left/liberal was associated with greater interest (there was no effect on Effective Donations or Involvement Score). I interpret this as evidence of a selection effect behind the heavily left-leaning demographics of the EA community; that is, people on the left are more interested in EA to begin with, rather than the community having a treatment effect that makes people more left-leaning (though I didn’t study the latter). Caveats: participants were generally left-leaning, averaging 1.5 on the 0-6 scale (converted from the 1-7 question above); the sample size was small; this was an exploratory analysis; and the political-beliefs question was unidimensional, whereas actual beliefs are complex.

Donation Change was associated with Info Condition: participants who did not receive the cost-effectiveness info before their first donation, but had it by their second, showed larger increases in effective donations – which is what we would expect in line with H2.

I’ve heard, but can’t track down the source, that fewer than 10% of Oxford students hear about EA before graduating (or that this used to be the case). This was not the case in my study. I screened participants for previous experience with EA: 28 (52%) had never heard of EA, 21 (39%) had heard of EA but learned about it for less than 5 minutes, and 4 (7%) had learned about EA for more than 5 minutes and so were excluded from participating. Note: these counts include people who did not complete the study and/or failed the comprehension check, which is why n > 44. Caveat: some people who had learned about EA for more than 5 minutes may not have clicked on the survey at all, given the exclusion criteria in the recruiting materials. Big caveat: people who know me are probably more likely to participate in my study than random students, and most of them have heard me talk about EA, making this (likely) a biased sample. These results should be taken with caution, but they indicate that the % of students at my school who have heard of EA is above 10% (though probably not as high as 46%), while the % who have actually learned about EA is still quite low (anecdotal).

Less interesting other results: Final Effective Donation was higher than initial Effective Donation for almost all participants. Exercise responses were in line with other studies. Overall, interest/favorability toward EA was high (~5.5 on a 1-7 scale). Participants were not as inclined to get involved in EA: the average answer on most questions was 0.64, roughly two-thirds of the way between “No thanks” (0) and “Maybe” (1), where the other option was “Yes!” (2).

Sorry I didn’t include graphics in this post – LaTeX is confusing, and they don’t seem necessary. Some are in the Presentation on OSF. Feel free to ask questions. For more about everything, check out all my materials, my full write-up, and data (stripped of most demographic information to preserve anonymity): https://osf.io/bkgvr/

Comments

[anonymous]:
Thanks for sharing this, Aaron! Really interesting pilot work.

One quick thought-- I wouldn't rely too heavily on statistical significance tests, particularly with small sample sizes. P-values are largely a function of sample size, and it's nearly impossible to get statistical significance with 44 participants (unless your effect size is huge!). 

Speaking of effect sizes, it seems like you powered the study to detect an effect of d=0.7. For a messaging study with rather subtle manipulations, an effect of d=0.7 seems huge! I would be pretty impressed if giving people CE info resulted in an effect size of d=0.2 or d=0.3, for instance. I'm guessing you were constrained by the # of participants you could recruit (which is quite reasonable-- lots of pilot studies are underpowered). But given the low power, I'd be reluctant to draw strong conclusions.
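(For anyone who wants to sanity check those numbers, here's a rough back-of-the-envelope calculation with statsmodels; the per-group sample sizes are my assumptions, not figures taken from the paper.)

```python
# Rough power check, not the study's actual power analysis.
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()

# If the power analysis targeted ~30 per group (60 total), 80% power at
# alpha = .05 corresponds to a detectable effect of roughly d = 0.74
print(analysis.solve_power(nobs1=30, alpha=0.05, power=0.80))

# With ~22 per group (44 total), the power to detect a more realistic
# messaging effect of d = 0.3 is only around 0.16
print(analysis.solve_power(effect_size=0.3, nobs1=22, alpha=0.05))
```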

I also appreciate that you reported the mean scores in the results section of your paper, which allowed me to skim to see if there's anything interesting. I think there might be!

There was no significant difference in Effective Donation between the Info (M = 80.21, SD = 18.79) and No Info (M = 71.79, SD = 17.05) conditions, F(1, 34) = 1.85, p = .183, ηp² = .052.

If this effect is real, I think this is pretty impressive/interesting. On average, the Effective Donation scores are about 10% higher for the Info Group participants than the No Info group participants (and I didn't do a formal calculation for Cohen's d but it looks like it'd be about d=0.5). 
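(Rough check of that eyeball estimate, assuming the two groups are about the same size so a simple pooled SD is close enough:)

```python
# Back-of-the-envelope Cohen's d from the quoted means/SDs,
# assuming roughly equal group sizes.
pooled_sd = ((18.79**2 + 17.05**2) / 2) ** 0.5   # ~17.9
d = (80.21 - 71.79) / pooled_sd
print(round(d, 2))                               # ~0.47
```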

Of course, given the small sample size, it's hard to draw any definitive conclusions. But it seems quite plausible to me that the Info condition worked-- and at the very least, I don't think these findings provide evidence against  the idea that the info condition worked.

Would be curious to see if you have any thoughts on this. If you end up having an opportunity to test this with a larger sample size, that would be super interesting. Great work & excited to see what you do next!

Thanks for your thorough comment! Yeah, I was shooting for about 60 participants, but due to time constraints and this being a pilot study I only ended up with 44, so it was even more underpowered than planned.

Intuitively I would expect a larger effect size, given that I don't consider the manipulation to be particularly subtle; but yes, it was much subtler than it could have been. This is something I will definitely explore more if I continue this project; for example, adding visuals and a manipulation check might make the manipulation more salient. I would like to have a manipulation check like "What is the difference between average and highly cost-effective charities?" and then set it up so that participants who get it wrong have to try again.

The fact that Donation Change differed significantly between Info groups does support the second main hypothesis, suggesting that CE info affects effective donations. This result, however, is not novel. So yes, the effect you picked up on is probably real – but this study was underpowered to detect it at p < .05 (or even at marginal significance).

In terms of CE info being ineffective, I'm thinking mainly about interest in EA, where there really seems to be nothing going on: "There was no significant difference between the Info (M = 32.52, SD = 5.92) and No Info (M = 33.12, SD = 4.01) conditions, F(1, 40) = .118, p = .733, ηp² = .003." There isn't even a trend in the expected direction. This was most important to me because, as far as I know, there is no previous empirical evidence to suggest that CE info affects interest in EA. It's also more relevant to me as somebody running an EA group and trying to generate interest from people outside the group.

Thanks again for your comment! Edit: Here's the previous study suggesting CE info influences effective donations: http://journal.sjdm.org/20/200504/jdm200504.pdf

Another consideration here: participants knew they were in an experiment, and probably had a good sense of what you were aiming at.

The difference between treatment and control (the 2x2 here) was whether people

1. "did an exercise in which they were asked to estimate the difference in cost-effectiveness between average and highly effective charities – or they did a control task"

and whether they

2. "were either told that the difference in cost-effectiveness is 100x – or they were given irrelevant control information"

If either of these increased their stated interest in EA or their giving behavior, it would be informative, but we still might want to be careful in making inferences to the impact of these activities and this 'de-biasing' in real world contexts.

Either of these tasks might have heightened the 'desirability bias' or the extent to which people considered their choices in a particular analytical way that they might not have done had they not known they were in an experiment.

Thanks for sharing and for putting this on OSF. Some thoughts and suggestions, echoing those below.

Maybe consider rewriting/re-titling this? To say "did not increase" seems too strong and definitive.

You "failed to find a statistically significant effect" in standard tests that were basically underpowered. This is not strong evidence of a near-zero true effect. If anything, you found evidence suggesting a positive effect, at least on the donation 'action' (if I read Aaron's comment carefully).

You might consider a Bayesian approach, and then put some confidence bounds on the true effects, given a reasonably flat/uninformative prior. (You can do something similar with 'CIs' in a standard frequentist approach.)

Then you will be able to say something like 'with this prior, our posterior 80% credible interval over the true effect is between -X% and +X%' (perhaps stated in terms of Cohen's d or something relatable) ... if that interval rules out a 'substantial effect' then you could make a more meaningful statement. (With appropriate caveats about the nature of the sample, the context, etc., as you do.)
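A minimal sketch of the kind of interval this could produce, using the means/SDs quoted from the paper and an assumed (not reported here) cell size of about 19 per condition; with a flat prior, the 80% credible interval for the difference in means coincides with the ordinary frequentist 80% CI:

```python
# Illustrative flat-prior interval for the Info vs No Info difference.
# Group sizes are assumed for illustration; means/SDs are the quoted ones.
import numpy as np
from scipy import stats

m1, sd1, n1 = 80.21, 18.79, 19   # Info condition (n assumed)
m2, sd2, n2 = 71.79, 17.05, 19   # No Info condition (n assumed)

diff = m1 - m2                            # observed difference in % donated
se = np.sqrt(sd1**2 / n1 + sd2**2 / n2)   # standard error of the difference
dof = n1 + n2 - 2

lo, hi = stats.t.interval(0.80, dof, loc=diff, scale=se)
print(f"80% interval: {lo:.1f} to {hi:.1f} percentage points")
```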

(Also, if you rewrite, can you break this into shorter paragraphs -- the long paragraph chunks become overwhelming to read.)
