Many thanks to Lauren Mee, David Reinstein, Brenton Mayer, Aaron Gertler, Alex Holness-Tofts, Lynn Tan, Vaidehi Agarwalla, David Moss, and Renee Bell for providing feedback on drafts of this writeup, as well as all who provided feedback on the studies themselves.

Summary

Animal Advocacy Careers (AAC) ran two longitudinal studies aiming to compare and test the cost-effectiveness of our one-to-one advising calls and our online course. Various forms of these two types of careers advice service have been used by people seeking to build the effective altruism (EA) movement for years, and we expect the results to be informative to EA movement builders, as well as to AAC.

We interpret the results as tentative evidence of positive effects from both services, but the effects of each seem to be different. Which is more effective overall depends on your views about which sorts of effects are most important; our guess is that one-to-one calls are slightly more effective per participant, but not by much. One-to-one calls seem substantially more costly per participant, which makes the service harder to scale.

There therefore seems to be a tradeoff between costs and apparent effects per participant. We’d guess that the online course was (and will be, once scaled up) slightly more cost-effective, all things considered, but the services might just serve different purposes, especially since the applicants might be different for the different services.

Background

Animal Advocacy Careers (AAC) ran a longitudinal study testing the effects of our ~1 hour one-to-one careers advising calls, which operated in a similar style to calls given by 80,000 Hours and the organisers of local effective altruism (EA) groups across the world. Over roughly the same time period, we ran a second study using very similar methodology that tested the effects of our ~9 week online course, which taught some core content about effective animal advocacy, effective altruism, and impact-focused career strategy and culminated in support to develop a career plan, either via a group workshop or by redirecting to planning materials by 80,000 Hours.

Each study was designed as a randomised controlled trial,[1] and pre-registered on the Open Science Framework (here and here), although a few methodological difficulties mean that we shouldn’t interpret the results as giving very conclusive answers. Despite these difficulties, we think that the studies provide useful evidence both for AAC and others focusing on building the effective altruism movement (i.e. the community striving to help others as much as possible using the best evidence available) to help us prioritise our time and resources. We’ll be sharing more about the methodological lessons from the studies in a forthcoming post called “EA movement building: Should you run an experiment?”

The findings are also written up in the style of a formal academic paper, viewable here. That version provides more detail on the methodology (participants, procedure, and instruments) and contains extensive appendices (predictions, full results, anonymised raw data, R code, and more). In the rest of this post, we summarise some of the key results and takeaways.

Which service has larger effects?

The ideal evaluation of whether a career advice intervention genuinely increases a participant’s expected impact for altruistic causes would be very challenging and expensive.[2] So instead, we designed and collected data on four metrics that we expected to be useful indicators of whether people were making changes in promising directions:

  1. “Attitudes,” e.g. views on cause prioritisation, inclination towards effective altruism.
  2. “Career plans,” e.g. study plans, internship plans, job plans, and long-term plans.
  3. “Career-related behaviours,” e.g. secured a new role, applied to one or more positions or programmes, joined the effective animal advocacy directory.
  4. “Self-assessed expected impact for altruistic causes,” asked via a single survey question.

All of these questions were asked six months after the person applied to the service, and all except the “attitudes” questions (which were more static) explicitly asked about changes over the past six months, i.e. since they first applied.[3]

The main results are summarised in the table below. The “mean difference” refers to the difference between the average score of the applicants who were invited to participate in the service itself and the average score of the applicants who were randomly assigned to a control group (who didn’t receive any service). Positive numbers suggest positive overall effects of the service, though given some of the flaws of the study — most notably differential attrition — we can’t be very confident that differences are necessarily due to the services themselves.[4]

Some examples of where these numbers come from[5]:

  • The mean difference of 1.05 in career-related behaviours from the one-to-one calls was mostly caused by around one in ten additional people (relative to the proportions in the control group) reporting that they had: secured a new role that they saw as being able to facilitate their impact for animals; intentionally and substantially changed the amount of money that they donate or the time that they spend volunteering; had multiple in-depth conversations about their career plans; and joined the effective animal advocacy directory. Two in ten additional people also reported having changed which nonprofits they donate to or volunteer for.
  • The mean difference of 0.67 in career plans for the online course was mostly caused by 2.1 in 10 additional people reporting to have made small changes in (or 1 in 10 reporting to have made substantial changes in) their long-term career plans and 1.4 in 10 additional people reporting to have made small changes to (or 0.7 in 10 reporting to have made substantial changes to) the job that they were planning to apply for next.
  • The mean difference of 0.48 in self-assessed expected impact for altruistic causes from the one-to-one calls is roughly equivalent to each participant moving about one-quarter of a notch (or one quarter of the participants moving a full notch) up the scale for their expected impact due to recent career plan changes, e.g. from an answer of “No change or very similar impact” to an answer of “Somewhat higher impact.”

These differences seem pretty impressive to us, if indeed they are due to the interventions themselves rather than methodological issues.[6] We’re a little disappointed by the lack of effects on attitudes, but we’re not too worried by this, since attitudes do not necessarily need to change for expected impact to increase.[7]

The difference in career plans between the one-to-one calls group and the control group isn’t significant, but the mean difference is actually quite similar to the difference we see for the online course.[8] All the other metrics look more promising for the one-to-one calls, so this analysis suggests that, per participant, the effects may be stronger for one-to-one calls than an online course.

We also used LinkedIn to compare the roles that the participants appeared to be in at the time of their application to their roles in late July or early August 2021, i.e. 7.5 to 13 months after application. This methodology is subjective and is limited by the relatively short follow-up period — likely not long enough for most people to put plan changes into action — but provides an additional type of evidence.[9] Both of these sets of results seem promising to us:

We had more people who participated in the online course, due to much higher application numbers. Out of 321 applicants, 161 were invited to enroll in the online course, 120 completed at least 10% of the course, and 42 completed 100% of it.[10] This compares to 134 applicants to the one-to-one calls service, 68 invitations to a call, and 62 actual calls. Note that the analyses above include everyone invited to participate for whom we have valid data, not just those who completed the service.

Which service is more costly?

We estimate that we spent about 15 weeks’ full-time equivalent work setting up and running the one-to-one calls, compared to about 10 weeks for the online course. So it took us about 3.5 times as much time input to secure one applicant for the one-to-one calls as for the online course.[11]

The online course platform (Thinkific) also charged us $49 per month, whereas the one-to-one calls didn’t require any extra financial costs. But this is a small cost compared to our labour costs.

So the costs per applicant (from the perspective of AAC[12]) were far lower for the online course (roughly 30% of the per-applicant cost of the one-to-one calls). The online course has higher initial setup costs but is far more scalable; an additional participant incurs a negligible additional time cost, whereas we spent about 2 hours per additional one-to-one call advisee (including prep and follow-up).
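As a back-of-the-envelope check, the cost figures quoted above can be combined as follows. This sketch uses only the numbers given in the text and ignores the small platform fee and any costs we haven’t itemised:

```python
# Figures quoted in the text: full-time-equivalent weeks of labour
# and applicant counts for each service.
WEEKS_CALLS, APPLICANTS_CALLS = 15, 134     # one-to-one calls
WEEKS_COURSE, APPLICANTS_COURSE = 10, 321   # online course

cost_per_applicant_calls = WEEKS_CALLS / APPLICANTS_CALLS     # ~0.11 weeks
cost_per_applicant_course = WEEKS_COURSE / APPLICANTS_COURSE  # ~0.03 weeks

ratio = cost_per_applicant_calls / cost_per_applicant_course  # ~3.6, i.e. "about 3.5x"
share = cost_per_applicant_course / cost_per_applicant_calls  # ~0.28, i.e. "~30%"

print(f"calls cost {ratio:.1f}x as much per applicant; "
      f"the course costs {share:.0%} as much")
```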

So which is more cost-effective?

Our outcome metrics don’t give us a straightforward answer to what we really care about — positive impact for animals (or other altruistic causes) relative to costs. Nevertheless, the differences in costs between the services seem bigger than the differences in effects, so our guess is that the online course is a slightly more cost-effective service, all things considered.[13]

But directly comparing the costs and effects of the two services as we’ve done so far is a little misleading, because they attracted slightly different applicants. Our impression — supported by some of the survey results from the application forms — is that the applicants to the one-to-one calls tended to be a lot more aligned and familiar with the effective altruism community already, i.e. they were a bit further down the “funnel.” So the services might just be more or less useful for different people.

Recommendations for the effective altruism movement

Peter Singer and the Good Food Institute have offered online courses for some time, and 80,000 Hours and the Centre for Effective Altruism have recently introduced courses of their own. There also seems to have been an increase in local EA groups running fellowships, which seem comparable to online courses. The findings from our online course study weakly suggest that these are positive developments! We’re inclined to believe that the community could benefit from offering more such courses.

They seem like a good balance between scalability and maintaining high fidelity in spreading EA ideas.[14] There have been some concerns that growth in the number of people engaged in the EA community has begun to stagnate, or has simply been outstripped by the growth in funding; online courses offer one promising method for broad outreach that might help address this. We expect that a similar model to AAC’s course (mostly cause-specific, with some cross-cutting EA discussion and implications) could be used in almost any other promising cause area and may attract individuals who otherwise would not be interested in engaging with effective altruism.[15]

The lower setup costs and seemingly higher effects per participant of one-to-one advising calls suggest that they are probably a better focus for most local EA groups. However, fellowships or courses could be great if the setup costs (e.g. curriculum design) can be shared between groups.

Our guess is that the services could work well in combination. For example, someone could learn some of the more basic ideas through an online course, which might lead them to update their career plans; they could then potentially be supported to refine and implement these plans through one-to-ones.

Next steps for Animal Advocacy Careers

We’re pleased enough with these results that both services seem worth offering again. We’re also pretty pleased with the feedback we’ve had so far,[16] but haven’t had very large numbers of participants yet, partly due to the experimental design that we used. As a result, our priority is to re-launch the services (with slight modifications).

Our intuition (backed up by some internal analysis of data from the participants so far) is that the people whose expected impact for animals might increase the most (on average) after a one-to-one call are those who are not yet very familiar with the principles of EA, but are nevertheless already keen to help animals and already taking steps to do so. Another group that seems promising is people who have already built up professional expertise that could make them a good fit for roles that are difficult to fill in effective animal advocacy nonprofits.[17] We’d like to try proactively reaching out to people who fall into one (or both) of these categories and inviting them to a one-to-one call. We may also open up a public application form, where we select for people in these groups.

As well as relaunching the current version of the course, we expect to try creating edited versions that appeal more to certain high-priority groups. For example, we could develop a course that encourages and supports experienced fundraising professionals to transition into roles in animal advocacy nonprofits.[18] Or, we could collaborate with animal advocates in specific countries to tailor the content more to their needs and experience, then translate it into their language(s).[19]

Help us grow these services

If you think you could benefit from any of these services yourself, please sign up to our newsletter for updates. If you have a friend, colleague, or connection who you think could benefit, please share their email with us here. We'll use this to send them up to two emails next time we open up applications for the service; we won't add them to a mailing list without their permission.

Footnotes

[1] Half the applicants were sent an email apologising and telling them that we were unable to provide them the service they had applied for. The other half were invited to participate and sent a link either to book a time for a call or enroll in the course.

[2] E.g. very long follow-up periods and very high financial incentives to ensure high survey completion rates.

[3] The full questions are visible in the appendix of the full writeup.

[4] Additionally, Rethink Priorities reanalysed our data using different methods, and though the sizes of the effects seem similar, none of the differences were statistically significant. We did, however, run some supplementary regression analyses, the results of which reassure us somewhat that the differences are likely due to the services. We talk more about these difficulties and analyses in the full paper.

[5] Note, however, that few of the differences in these subcomponent questions were significant, and these results should not be interpreted literally. We think that the overall metrics like “career plans” are more informative, but included this list to make those metrics seem less abstract.

[6] See the footnote above. As noted in this spreadsheet, we actually initially predicted better results than we eventually identified. But we think that we didn’t initially take enough time to really think through what the scores would mean; the results we identified still seem impressive to us.

[7] An individual might already hold attitudes conducive to careers that are highly impactful for animals, just not have identified the best career pathways. For example, they might not have thought about certain promising options before.

[8] At this point we’re just making guesses, but it’s possible, for example, that the difference in career plans caused by the one-to-one calls was slightly smaller than the difference caused by the online course only because the applicants were more likely to already have pretty good ideas and plans in place. This seems plausible given that they tended to be more familiar and engaged with effective altruism (see the “So which is more cost-effective?” section).

[9] Two key benefits are that this method is not vulnerable to the differential attrition or social desirability bias problems that affect the main analysis.

[10] 54 completed at least 90% and got at least 20 out of 30 on the test.

[11] Very few people were offered a one-to-one call but didn’t take it, whereas only 44% of those invited to participate in the online course completed 70% or more. However, these differences are already accounted for in the results reported above.

[12] If you take the perspective of the applicant, then the time cost for a one-to-one call is much lower (~1 hour rather than ~10).

[13] We’ve done some rough cost-effectiveness modelling. The results suggested that a scaled-up version of the online course would be about twice as cost-effective as a scaled-up version of the one-to-one calls service. However, the confidence intervals are very wide (and very much overlapping!) and the final outcomes fluctuate wildly with small changes in the inputs, so we don’t place much weight on the exact numbers. Both services look very favourable compared to direct work at Animal Charity Evaluators’ Top Charities, but this comparison is far more methodologically difficult, so we place even less weight on that comparison.
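To illustrate why intervals in this kind of model come out so wide and overlapping, here is a toy Monte Carlo sketch of a cost-effectiveness comparison. All inputs (costs, effect sizes, the arbitrary “impact points” unit) are invented for illustration and are not AAC’s actual model parameters:

```python
import random

random.seed(0)

def simulate(cost_per_participant, mean_effect, effect_sd, n=10_000):
    """Draw cost-effectiveness samples (impact per unit cost) under
    uncertainty about the per-participant effect. All parameters are
    illustrative, not AAC's actual model inputs."""
    samples = sorted(
        random.gauss(mean_effect, effect_sd) / cost_per_participant
        for _ in range(n)
    )
    return samples

# Hypothetical inputs: the calls have a larger per-participant effect
# but cost ~3.5x as much per participant.
course = simulate(cost_per_participant=1.0, mean_effect=0.7, effect_sd=0.5)
calls = simulate(cost_per_participant=3.5, mean_effect=1.0, effect_sd=0.5)

for name, s in [("course", course), ("calls", calls)]:
    lo, mid, hi = s[len(s) // 20], s[len(s) // 2], s[-len(s) // 20]
    print(f"{name}: median {mid:.2f}, 90% interval [{lo:.2f}, {hi:.2f}]")
```

Even with these made-up numbers, the course comes out ahead at the median while the 90% intervals overlap heavily, mirroring the qualitative conclusion in this footnote.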

[14] Recall that, compared to the one-to-one calls, our online course (1) tended to attract applicants with lower awareness of and alignment with effective altruism and (2) seems cheaper per participant.

[15] As a simple demonstration of this, we can look at the gender distribution. Animal advocacy nonprofit staff are about 70% female, whereas the EA community is about 70% male. 73% of the applicants to our online course were female (compared to 56% of the one-to-one calls applicants). The course promotes both animal advocacy and effective altruism ideas and actions, yet seems to have attracted an audience much more similar to the current animal advocacy community than the current EA community.

[16] See “Appendix A” of our “2021 Plans and 2020 Review” post. We’ve had a little more feedback since we wrote that post, but there weren’t any major updates.

[17] See “Appendix B” of our “2021 Plans and 2020 Review” post.

[18] See “Appendix B” of our “2021 Plans and 2020 Review” post.

[19] For some relevant discussion, see our skills profile on “Growing the animal advocacy community in countries where it is small or new.”

Comments

Thanks for doing this. I really appreciate your running this as a controlled trial. I hope this fosters a range of additional experimental work and evidence-gathering. Also great that you are making the data and analysis public, and that you pre-registered your hypotheses. I think this was a success in terms of following a careful protocol, learning, and getting better at this stuff. It is putting us on a good path.


A few things I might have done or reported differently (we have discussed much of this, but I want to share it publicly, for others to consider and maybe weigh in on).

You did state your results tentatively, but I would have been even a bit more tentative. Given the differential attrition, I'm just not sure that we really can be confident that the interventions had an effect. And given this and the self-selection to each treatment group, we can't say much about the relative efficacy of each of the two interventions.

The (differential) attrition problem is really a substantial one. As we've discussed, a large (and different) share in both treatments, and in the control group, did not complete both rounds of the longitudinal study.

The chief concern here is that the treatments themselves may have had an impact on the composition of those who completed the second survey. To some extent we can see evidence that something other than 'random attrition' is going on in the data -- we see greater attrition in the control group than in the treatment groups. IIRC, the composition of the treated and control groups (ex post) also differed in terms of ex-ante observable traits, traits which could not have been affected by the treatments themselves. Thus, given randomly assigned treatment, these differences could only be due to differential attrition.

Note that there are some reasonable statistical 'Bounding' approaches (see e.g., Lee 2009) for dealing with differential attrition, although these tend to lead to very wide bounds when you have substantial (differential) levels of attrition.
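To illustrate the trimming idea behind Lee (2009) bounds, here's a toy sketch with made-up numbers. The real estimator involves more care (e.g. covariate-tightened bounds), but the core move is simple: trim the higher-response arm by the differential response rate, from the top for one bound and from the bottom for the other.

```python
def lee_bounds(treat_outcomes, control_outcomes,
               n_treat_assigned, n_control_assigned):
    """Toy Lee (2009) trimming bounds on a treatment effect under
    differential attrition. Assumes the treatment group responded at a
    HIGHER rate than control; outcome lists contain responders only."""
    p_t = len(treat_outcomes) / n_treat_assigned
    p_c = len(control_outcomes) / n_control_assigned
    assert p_t >= p_c, "this sketch assumes higher response under treatment"

    # Fraction of treated responders to trim away.
    q = (p_t - p_c) / p_t
    k = round(q * len(treat_outcomes))

    s = sorted(treat_outcomes)
    control_mean = sum(control_outcomes) / len(control_outcomes)

    # Trim the top k outcomes -> lower bound; the bottom k -> upper bound.
    lower = sum(s[:len(s) - k]) / (len(s) - k) - control_mean
    upper = sum(s[k:]) / (len(s) - k) - control_mean
    return lower, upper

# Made-up example: 100 assigned per arm; 80 treated responders vs
# 60 control responders, with fake 0-4 outcome scores.
treat = [i % 5 for i in range(80)]
control = [i % 4 for i in range(60)]
bounds = lee_bounds(treat, control, 100, 100)
print(bounds)
```

With 20 percentage points of differential response, the bounds here span more than a full point on a 0-4 scale -- exactly the "very wide bounds" problem mentioned above.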


I appreciate your use of the LinkedIn data for follow-up, I would pursue this further. To the extent that you can track down the future outcomes of a large set of respondents through LinkedIn, this will help recover very meaningful estimates, in my opinion. You note, correctly, that this is much less vulnerable to differential attrition bias, as well as less biased towards "pleasing those who you spoke to" (differential desirability bias). I would follow up on this for future outcomes, and do this more carefully, using a blind external rater (or an AI tool to rate these, maybe GPT3 as a classifier).


You noted the difference in self-selection into the two types of treatments and the resulting limitations to the comparability of these. However, a great deal of the post still compares the two services and discusses the implications for cost-effectiveness. To me, this seems to be digging too deeply into an area where we don't have strong evidence yet.


I'd love to see followups of this or similar experiments. Perhaps you can run more in the future, with larger sample sizes and plans that more carefully limit the possibility of differential attrition. Perhaps limiting the study to those with LinkedIn accounts would be one way of doing this. Another possibility (which you could even in principle pursue with the previously tested group) would be to find the funds to pay fairly large rewards to follow up again with a survey for everyone in each group. If the rewards were sufficient, I guess you could probably track down everyone or nearly everyone.


By the way, I made a recording where I read your post (with some comments mostly overlapping the comment here), which I will put up on my podcast (and link here) shortly.

AUDIO on my podcast HERE

Thanks David! And thanks again for all your help. I agree with lots of this, e.g. differential attrition being a substantial problem and follow-ups being very desirable. More on some of that in the next forum post that I'll share next week.

(Oh, and thanks for recording!)

Thanks for taking the time to do such a rigorous study, and also for writing it up and thinking through the implications for other EAs!

I read this on my podcast HERE with some comments

Thanks for this. Excellent work.  

Some quick thoughts.

I'd like to see a more rigorous study exploring how these interventions affect career choice.

Related to that,  I wonder if EA should do more research/work to understand how to encourage better career change and choice. These are key to the success of EA and I am not sure they are as well researched as they should be. I am sure that lots of our organisations have good insights but I don't know if there is much if any public experimental data. 

If it is worth doing experiments to test interventions to promote volunteering, charity, diet change, etc., then it seems even more valuable to understand how to promote 'prosocial career choice' or similar. However, I am not aware of any research on this (though I haven't looked for it specifically - I'd expect to have seen something by now given all the work I have done).

I suspect that part of the reason is that EA is the only group I know of that thinks about career choice as a key prosocial behaviour. Most researchers probably don't even consider it. I wonder if it is a candidate for field-building (i.e., starting a new research field/focus area)?

Thanks Peter!

I'd like to see a more rigorous study exploring how these interventions affect career choice.

I'd love to know more detail, if you're happy to share.

However, I am not aware of any research on this

Likewise. I did do some digging for this; see the intro of the full paper for the vaguely relevant research I did find.

I was interested in seeing a breakdown of the endpoints, before they'd been compressed into the scales AAC uses above. 

Jamie kindly pulled this spreadsheet together for me, which I'm sharing (with permission), as I thought it might be helpful to other readers too.