Some 2021 CEA Retention Statistics

Ben_West🔸

Summary

One of CEA’s goals is for people who are highly engaged with effective altruism to stay highly engaged.
In order for us to pursue this goal, we need some way of measuring the retention rate. In this document, I calculate retention rates using engagement with CEA's projects as a proxy.
I find that 50-70% of people who engaged with CEA's projects in 2020 also engaged with one of our projects so far in 2021, using a naïve method of matching people (mostly looking at email addresses).
I further manually classify all EAGxVirtual attendees who self-reported being employed by an EA organization. I find that 95.3% of them were retained by at least one of several proxies, which is almost identical to the 95.6% retention estimate given by Ben Todd in his analysis last year.
Note: I expect this post is only interesting to a small number of people who are highly engaged with EA, so I haven't spent a lot of time cleaning it up. Please feel free to comment or reach out to me with any questions you might have.

Data sources

I considered the following data sets:

Everyone who read a post on the EA Forum in 2020 or 2021. Note that this requires the user to have been logged in when they read the post.
Everyone who donated on EA Funds in 2020 or 2021. I did not filter by which organization they donated to.
Events: attendees of EAGxVirtual, EAG Reconnect, or the EA Picnic. I further filtered by:
1. Whether they self-reported as having taken "significant action" in their EAGxVirtual application (this includes having taken the Giving What We Can pledge, currently working at an EA organization, having previously worked at an EA organization, having spent 100 hours on an EA project, or having changed their career due to EA considerations). Note that this is self-reported information, and does not perfectly correlate with whether an expert judge might evaluate them as having taken significant action.
2. Whether they self-reported working for an EA organization in their EAGxVirtual application. The list of organizations I used can be found in an appendix. Note that this excludes people who work at non-EA organizations for EA reasons (and may include people who work at EA organizations for non-EA reasons).
EA Survey: everyone who responded to the EA Survey in 2020 and consented to sharing information with CEA.
Everyone who attended one of CEA’s virtual programs (VP).

Previous Work

Previous retention rate estimates have ranged from 85% to 99.6% annual retention. These have generally required manually evaluating whether or not individuals in some population have stayed engaged.
Peter Wildeford has done the largest non-manual retention analysis I know, which looked at the percentage of people who answered the EA survey using the same email in multiple years. He found retention rates of around 27%, but cautioned that this was inaccurate due to people using different email addresses each year.
Over the past six months, CEA has moved to unify our login systems. As of this writing, event applications, the EA Forum, and EA Funds/GWWC all use the same login system. This means that we are less likely to have issues with people using different emails.

Matching algorithm

All data sources provided (encrypted) email addresses, which I primarily used for matching.
I additionally used name and LinkedIn information to match events and survey data.
Note on privacy: set intersections are performed using encrypted information, where relevant. This lets us e.g. calculate the percentage of Forum users who donated on Funds, while not actually knowing the email addresses of any Funds users.
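A minimal sketch of how a set intersection over pseudonymized emails could work. The post does not say how the addresses were encrypted, so the keyed SHA-256 hash, the normalization step, the function names, and the example addresses below are all assumptions made for illustration.

```python
import hashlib
import hmac


def pseudonymize(email: str, key: bytes) -> str:
    """Return a keyed hash of a normalized email address (assumed scheme)."""
    normalized = email.strip().lower()
    return hmac.new(key, normalized.encode("utf-8"), hashlib.sha256).hexdigest()


def overlap_rate(population: set[str], other: set[str]) -> float:
    """Fraction of `population` that also appears in `other`."""
    if not population:
        return 0.0
    return len(population & other) / len(population)


# Hypothetical data: in practice each system would hash its own addresses
# with the shared key, so raw emails never need to be exchanged.
key = b"shared-secret-for-illustration-only"
forum_users = {
    pseudonymize(e, key)
    for e in ["a@example.org", "B@example.org ", "c@example.org", "d@example.org"]
}
funds_donors = {pseudonymize(e, key) for e in ["b@example.org", "z@example.org"]}

rate = overlap_rate(forum_users, funds_donors)
print(f"{rate:.0%} of Forum users also donated on Funds")
```

Because both sides apply the same normalization and key, equal addresses produce equal hashes, and the overlap can be counted without either side seeing the other's plain-text emails.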

Results

Population	Population Size	Attended event in 2021	Read a post on the Forum in 2021	Answered the 2020 EA Survey	Donated on EA Funds in 2021	Attended VP in 2021	Event or Forum	Event, Forum or Survey	Event, Forum, Survey or Funds	Event, Forum, Survey, Funds or VP
All EAGxVirtual Attendees	1091	40%	26%	22%	5%	5%	50%	54%	56%	56%
EAGxVirtual Attendees who took significant action	568	40%	27%	22%	5%	6%	52%	56%	57%	58%
EAGxVirtual Attendees who worked for an EA organization	129	57%	40%	29%	1%	1%	68%	70%	70%	70%
Read Forum post in 2020	2347	21%	63%	21%	6%	4%	66%	67%	68%	68%

The final column is the most relevant one. This indicates that, depending on the population, 50-70% of the individuals who engaged in 2020 also engaged in some way in 2021.
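The cumulative columns in the table ("Event or Forum", and so on) count a person as retained if they appear in at least one of the listed sources. A rough sketch of that calculation, using made-up integer IDs in place of matched identities (the function name and data are hypothetical, not the query used for the table):

```python
def cumulative_retention(population, sources):
    """Share of `population` found in at least one source, for each prefix of `sources`.

    `sources` is an ordered list of (name, members) pairs.
    """
    if not population:
        raise ValueError("population must not be empty")
    seen = set()
    names = []
    rates = {}
    for name, members in sources:
        # Only count people who were in the 2020 population to begin with.
        seen |= population & members
        names.append(name)
        rates[" or ".join(names)] = len(seen) / len(population)
    return rates


# Hypothetical 2020 population of ten people and their 2021 activity.
population_2020 = set(range(1, 11))
sources_2021 = [
    ("Event", {1, 2, 3, 4}),
    ("Forum", {3, 4, 5, 99}),  # 99 was not in the 2020 population, so is ignored
    ("Survey", {5, 6}),
]

for label, rate in cumulative_retention(population_2020, sources_2021).items():
    print(f"{label}: {rate:.0%}")
```

Each added source can only raise the cumulative rate, which is why the table's columns are non-decreasing from left to right.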

This is substantially higher than the 27% rate found by Peter using EA Survey data, but is still substantially lower than what I expect the true rate to be.

Manual classification

Since automated classification was unable to classify a large fraction of the population, I manually classified the remaining attendees who worked for an EA organization. For each of them, I used LinkedIn and the organization's website to see if they were still listed as staff. These were the results:

Classification	Number of people
Kept job listed in EAGx application	16
Personally known by me to still be involved	8
Seems to have genuinely left their employer and not started a new EA position	6
Got a new job judged by me to be EA	5
Weren't actually originally employed by EA organization (e.g. were just a volunteer)	3
Couldn't find any information	1

In summary, approximately six of the 129 EAGxVirtual attendees who worked for an EA organization (= 4.7%[1]) seem to have genuinely left working for an EA employer and did not otherwise engage with any of CEA's projects.

Ben Todd estimated a five-year dropout rate of 20% for people engaged at the level of working at an EA organization, which implies a 95.6% annual retention rate. This is almost identical to the 95.3% retention rate found here.
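The arithmetic behind both figures can be checked in a few lines. This sketch assumes the five-year dropout rate compounds at a constant annual rate, which is the assumption needed to get from 20% over five years to 95.6% per year.

```python
# Ben Todd's estimate: 20% of people drop out over five years.
five_year_dropout = 0.20
# Assuming a constant annual rate r, r**5 = 1 - 0.20, so r = 0.8 ** (1/5).
annual_retention = (1 - five_year_dropout) ** (1 / 5)

# Manual classification: 6 apparent dropouts among the 129 attendees
# who worked for an EA organization.
dropouts, population = 6, 129
manual_retention = 1 - dropouts / population

print(f"Implied annual retention: {annual_retention:.1%}")
print(f"Retention from manual classification: {manual_retention:.1%}")
```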

Power Analysis

It would be nice if we could regularly track retention rates and notice if things are changing. Based on these results, I believe it would require a fairly large data set and substantial manual effort to do this.

For example, to detect a change in retention rate from 95% to 90%, we need a sample of 185 individuals.[2] This would be a massive doubling of the dropout rate, but still requires a larger sample than I evaluated here.
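For reference, the 185 figure can be reproduced with the standard normal-approximation formula for a two-sided, one-sample test of a proportion. Whether the linked calculator uses exactly this formula is an assumption; the function name below is hypothetical.

```python
from math import ceil, sqrt
from statistics import NormalDist


def required_sample_size(p0: float, p1: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Sample size to detect a shift from proportion p0 to p1.

    Uses the normal approximation for a two-sided one-sample test:
    n = ((z_alpha * sqrt(p0 * (1 - p0)) + z_beta * sqrt(p1 * (1 - p1))) / (p1 - p0)) ** 2
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # type I error rate of 5%
    z_beta = NormalDist().inv_cdf(power)           # type II error rate of 20%
    numerator = z_alpha * sqrt(p0 * (1 - p0)) + z_beta * sqrt(p1 * (1 - p1))
    return ceil((numerator / abs(p1 - p0)) ** 2)


n = required_sample_size(0.95, 0.90)
print(n)  # 185
```

Smaller shifts need far larger samples under the same formula, since the required size grows with the inverse square of the difference between the two rates.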

Given this, CEA is evaluating alternative metrics. Our current top choice is to focus on people who use our products, instead of those who are "engaged" with EA in a more subjective sense. This allows us to analyze larger populations, improving the power of our tests.

Appendix – EA Organizations

This list was created by looking at the employers reported by EAGxVirtual attendees and filtering for ones which seemed EA-related in my subjective opinion. It is definitely the case that some employees of these organizations do not qualify as "highly engaged EAs", and that many highly engaged EAs work for none of these organizations.

The SQL query I used to classify people can be found here.

Footnotes

[1] Arguably the volunteers should be removed from the denominator, meaning the dropout rate is 6/126 = 4.8%.
[2] Using the standard type I error rate of 5% and type II error rate of 20%, and this calculator.

Peter Wildeford

"Peter Wildeford has done the largest non-manual retention analysis I know, which looked at the percentage of people who answered the EA survey using the same email in multiple years. He found retention rates of around 27%, but cautioned that this was inaccurate due to people using different email addresses each year."

Thanks for citing me, and I'm excited for the new data sources you are looking at.

One thing you might want to add is that I looked at two different approaches. You quote the first approach, but the second approach, which I think is more accurate and is based on comparing the year people say they joined EA with the survey take rate for that year, shows that roughly 60% of EAs still stay around after 4-5 years.
