EA sexual harassment: Where's the evidence we're worse than baseline?

Nathan Young

I used AI while writing this, though I have spent 5–15 hours on it and I take responsibility for every word and all numbers.

Every time there is a report of sexual harassment within EA there is a lot of introspection. I want to push back on this, not to dismiss the experiences described, which sound genuinely painful, but because I am unsure the evidence supports the belief that EA is uniquely bad here.

So what do we know? In 2022, CEA's Community Health team reported handling around 19 interpersonal harm cases the previous year, ranging from uncomfortable interactions to serious allegations, across a community of thousands. On my read, 3–6 of those 19 clearly involve sexual misconduct across a community of 4,700–10,000 EAs, putting a narrower rate at around 0.03–0.13%.
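The arithmetic behind that range can be sketched as follows. This is a minimal sketch, assuming the low end pairs the smallest case count with the largest community estimate and the high end pairs the largest count with the smallest estimate; the function name `rate_range_percent` is illustrative, and the all-19 range is derived here from the same figures rather than quoted from the source.

```python
def rate_range_percent(cases_low, cases_high, pop_low, pop_high):
    """Return (low, high) annual report rates as percentages.

    Low end: fewest cases over the largest community estimate.
    High end: most cases over the smallest community estimate.
    """
    low = 100 * cases_low / pop_high
    high = 100 * cases_high / pop_low
    return low, high


# Narrower subset: 3-6 cases read as clearly involving sexual misconduct,
# over a community of 4,700-10,000 people.
narrow = rate_range_percent(3, 6, 4_700, 10_000)

# All 19 interpersonal harm cases over the same community estimates
# (derived for illustration, not a figure stated in the post).
all_cases = rate_range_percent(19, 19, 4_700, 10_000)

print(f"Sexual misconduct subset: {narrow[0]:.2f}%-{narrow[1]:.2f}%")
print(f"All 19 cases: {all_cases[0]:.2f}%-{all_cases[1]:.2f}%")
```

The width of each range comes entirely from the choice of denominator and, for the subset, from how many of the 19 cases are counted.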

Alongside this there have been perhaps 3–5 high-profile public cases on the Forum in the last five years. The TIME piece in 2023 cited 7 women, though at the time it was disputed how clearly the cases related to EA. These data don't seem clearly outside of the above range.

The cleanest peer numbers come from universities' Title IX offices, which publish annual case counts. Harvard, Stanford and MIT report disclosure rates of roughly 0.6–2% per year, and formal-investigation rates of around 0.09–0.17%. EA's reported rates sit below university disclosure rates and at-or-below their formal-investigation rates.

A McKinsey workplace survey produces much larger percentages (~1–5% annually, derived from career-lifetime rates of 35% reported in LeanIn/McKinsey 2024). But surveys measure lived experience; CEA's 19 and universities' Title IX figures measure reports filed to a central body. The chart below gives all of these rates with lower and upper bounds.

Full sources and caveats^[1].

In contrast to this, many EA forum commenters seem to believe that EA is unusually bad in some general sense^[2]. Below are examples of comments that are widely upvoted and agreed with. I give them as evidence for a certain widely agreed belief, but this is not a systematic review:

Funnily enough, I think EA does worse than other communities / movements I'm involved with (grassroots animal advocacy & environmentalism).

Comment (karma 92, agree 54 / disagree 11)

Thank you for speaking up—it takes a lot, especially when women in EA often face such unproductive engagement for doing so

Comment (karma 118, agree 28 / disagree 1)

there is still such a pervasive culture of sexism.

Comment (karma 41, agree 11 / disagree 0)

The EA community has a significant undersupply of information from victims of abusive conduct, since the victims are often branded as 'triggered' or 'irrational'.

Comment (karma 20, agree 14 / disagree 2)

And here are another set implying that broad cultural change is required to fix this:

MY RECOMMENDATIONS
...create systems of checks and balances that do not allow for conflicts of interest to enable biased decisions...
...Create some kind of educational content around how to be a good ally to victims and how to identify bad situations so people can intervene...
...Identify when situations involve biased actors and correct by introducing unbiased actors...
...Hire an external arbiter...
...please work with [J_J]...who has been doing grassroots justice in the Bay Area for some time now.

Comment (karma 47, agree 21 / disagree 7)

I'm a little concerned by the lack of response from org leaders (unless I missed something), and I think there's a risk that CEA leaders and others might under-update from this

Comment on a report of sexual misconduct (karma 46, agree 22 / disagree 3)

Thank you for flagging the aspects of our ecosystem's culture that we need to be extra mindful of.

Comment on a report of sexual misconduct (karma 26, agree 11 / disagree 6)

While writing this piece and rereading many of the comments on such articles I can only recall one widely upvoted comment which pushes against this notion:

Women who identified as EA were less likely to report lifetime sexual harassment at work than other women, 18% vs. 20%. They were also less likely to report being sexually harassed outside of work, 57% vs. 61%.
... Overall I am not sure that anything can be concluded from these results either way.

Comment referencing an SSC survey (karma 143, agree 20, disagree 5)

So overall I claim there are lots of widely-upvoted comments that imply EA is unusually bad and few that dispute it.

When anecdotes and data disagree, there is often something wrong with the data^[3], but I am confused about what's happening here. By way of example, I went to university in Durham, which, after making sexual assault reporting easier, was ranked at or near the top of UK universities for reported sexual assault for the next few years. That is, its ranking got considerably worse after it took steps to improve things. Was Durham culture bad in 2011? In my view, yes. Was Durham notably worse than other UK universities? I don't know. How sparse reporting and anecdotes link to the overall picture is a generally difficult problem in sexual misconduct reporting.

As for EA, there are multiple stories that fit the data we see:

EA is a predominantly male/non-traditional community where women are treated badly and reporting is suppressed.
Women are treated as well as in other communities, but it should still be an area of focus.
EA is a transparent culture which is taking this seriously, and things are actually better than in other communities, even if occasionally there are awful occurrences.

So I would like more data. CEA's Community Health team conducted a Gender Experiences Project in 2023 with ~40 interviews and analysis of survey data. I am unable to find any published outputs. There are ways we could distinguish between these hypotheses and try and generate a story where the data and anecdotes are less in conflict.

But until we do, I am not convinced by the "EA is unusually bad" hypothesis, which a majority of those who engage with posts about sexual harassment seem to implicitly accept. If we imagine a community of thousands of people which was excellent in this area, I would still expect some awful accounts. If I saw this data and these accounts about another community, I suspect I'd conclude "they are probably better than most communities".

What's more, we as EAs generally want to communicate accurately about things. If the rate was previously at the lower end of normal bounds (and has probably fallen since), then this community is not abnormally bad, and we should discuss the issue accordingly. And suggested community changes should discuss tradeoffs.

I am not the ideal messenger—I don't consider myself particularly sensitive on these issues. Some will feel this isn't the right moment. I have no interest in litigating specific cases in the comments. But we of all people know the value of scale. Commenters regularly discuss this as if the community is unusually bad, requiring broad cultural change, and those arguments deserve a response.

^[1]
Chart notes:
General notes on the ranges. The bars' upper and lower bounds come from denominator choice, not numerator uncertainty. For the universities, the low end of the denominator counts students plus academic staff; the high end adds non-academic staff (security, dining, facilities, admin), a much larger denominator that drags the rate down. For the CEA Community Health bars, the range is based on community size estimates. For the two survey-derived bars (US workplace, open-source contributors), the range reflects the conversion from lifetime to annual rate; † on the chart flags these. The EEOC bar has no range because both numerator (charges filed) and denominator (US workforce) are fixed annual figures.
US workplace, women. LeanIn.org and McKinsey & Company, "Women in the Workplace 2024: The 10th Anniversary Report" (full PDF). Derived from lifetime survey responses then turned into an annual rate; not a directly measured annual rate.
Open-source contributors, women. GitHub Open Source Survey, 2017. Derived from lifetime survey responses then turned into an annual rate; not a directly measured annual rate.
Harvard Title IX disclosures and formal investigations (FY20, FY22). Harvard Title IX and Office for Dispute Resolution annual reports: csndr.harvard.edu/data-dashboard; FY22 coverage in The Crimson. Denominator (lower bound: students + academic staff; upper bound: students + academic staff + non-academic staff).
MIT Title IX disclosures, 2018-19. Title IX & Bias Response annual report, 2018-19 (PDF); current Institute Discrimination and Harassment Response Office; Title IX annual report archive. Denominator from MIT Facts; range follows the same students and academic staff vs students and all staff.
Stanford Title IX disclosures, 2023-24. Stanford SHARE / Title IX annual reports: 2023-24, 2022-23; Stanford Daily coverage of the 2023-24 report. Denominator from Stanford Facts; range as above.
CEA Community Health, all 2021. Julia Wise, "The community health team's work on interpersonal harm in the community", EA Forum, 18 August 2022 (CEA mirror). Range reflects active-vs-engaged community-size estimates (~4,700 to ~10,000), per Todd 2021.
CEA CH sexual harassment subset. Same source as above. The subset count is my own recategorisation of the 19 cases Wise enumerates in that post; there is no separate source.
EEOC sexual harassment charges (US). EEOC, "Sexual Harassment in Our Nation's Workplaces"; EEOC Select Task Force on the Study of Harassment in the Workplace, June 2016. Workforce denominator from the Bureau of Labor Statistics Current Population Survey.
^[2]
One might agree that EA is only bad at handling cases or has bad 'vibes', but when finding these comments (and there were many more) they often seem to say EA is bad, and rarely seem to give caveats. My read is that there is a general sense across the topic of sexual harassment that EA is unusually bad and that broad cultural change is needed. I'm not claiming everyone thinks this. If you can find several widely upvoted comments which push against these or provide caveats, let me know.
^[3]
"When the data and the anecdotes disagree, the anecdotes are usually right." Jeff Bezos

^[4]
I tried to choose comments and posts that were narrowly about the topic such that upvotes were related to this.


Comments (21)


titotal · May 21 · 11

It is extremely difficult to determine base rates for something like sexual harassment, because it's an offence that allows for ambiguity and plausible deniability, because there's room for retaliation, etc, and it will strongly depend on how much people trust the bodies they are reporting to.

What we can do is look at the responses to the incidents that do get raised, and the experiences of victims, and judge whether or not they live up to the standards we want to see in a group that takes sexual harassment seriously. I do not think the grades are very good on this front.

Only 3 months ago we had a writeup detailing a shockingly terrible response to sexual harassment by one of the most prominent EA orgs out there. The response is far worse than anything I've ever seen at any organisation I've ever been in. This indicates to me that the environment is nowhere near the high standards that should be aimed for.

Regardless of the actual base rates, the question that matters the most is whether there is room for improvement, and I think it's blindingly obvious that the answer is yes.

Nathan Young · May 21 · 2

I disagree. The community response to that case was pretty unified. Were there any comments saying that the organisation behaved well? So while it may tell us things about the specific organisation, I don't think it should cause us to negatively update on the community.

Do you think that most orgs are like the org in that case?

pete · May 21 · 12

Thanks for writing, Nathan! I think there are two separate arguments: one says that EA is doing worse than baseline, and the other is that EA is doing worse than our shared -- sometimes implicit -- community standards. It's fair to ask people to chart their expectations against baseline for almost anything, but it's also fair to establish community norms above baseline. Some of the comments you quoted (inc. one of mine) are critiques relying on the higher standard.

Nathan Young · May 21 · 8

So in your head you are aiming for a significantly higher bar than top universities and workplaces?

pete · May 21 · 1

I think so, yes. Part of why, for example, I was and am critical of CEA's leadership's response to the recent sexual harassment report is because I believe we as a community generally are often capable of and desirous of achieving a higher standard than top universities and workplaces, which I think you believe too, based on the data in this post! My guess is that we're fundamentally on the same page, but I don't want to assume this.

Nathan Young · May 21 · 2

I think I probably agree yes, though I am a bit uncertain about how hard the marginal gains are past some point.

Do you think most forum commenters think that EA is doing better than many universities and workplaces? Like do you think they share your view of a significantly higher standard? Do you think if someone commented "we are doing as well as many other communities, but we have more to do" they would upvote and agree with that?

pete · May 21 · 1

No idea -- I don't have nearly enough to go off of. Curious what you learn, though.

Nathan Young · May 21 · 4

While I intend to discuss this with Pete directly, my general objection is that commenters don't discuss this as if EA is seeking to have a much higher bar and failing. They mostly discuss it as if EA is unusually bad.

If there is general agreement that EA has a much higher bar I would expect to see widely upvoted comments saying how much better EA is than comparable spaces even as it still has work to do. I don't see that.

pete · May 21 · 1

I think the absence of particular harms is easy to overlook, in all sorts of areas, especially in a space focused on identifying tractable problems, so I wouldn't update the same way based on lack of positive discussion. I'll happily go on the record and say (as a woman) that I've had much, much better experiences in male-dominated EA spaces than male-dominated non-EA spaces.

Liv Gorton · May 21 · 7

I agree that often "EA is unusually bad" isn't grounded in data (often instead in first person accounts, although I'd note that if we assume reporting infrastructure won't capture things well, that's actually where most of the signal lives). I'm pretty skeptical that we should conclude from this data that EA isn't unusually bad.

Others have made similar points but just at a high level, my concerns when comparing to the other data (e.g. Title IX) would be that we are biasing both the numerator and denominator of the reporting rate in a way that biases us to think we are doing better than we are.

Numerator: There are lots of different reporting mechanisms and places and so we shouldn't expect CEA CH to capture all the cases as it is one of many parallel endpoints. Something like Title IX sits at the end of a mandatory reporting funnel. If I tell my RA that my professor harassed me, they are mandatory reporters of that. The same isn't true for if I tell someone at org X that someone at org Y harassed me (it is true within a single workplace but because EA is quite fractured, this isn't a mandatory reporting funnel to CEA CH).
Denominator: For a great deal of "EAs" what this means is just engaging online and donating to GiveWell. The Title IX denominator is people who are co-located on the same campus. The exposure-per-person is likely wildly different to other contexts.

I basically think errors here aren't random so your number here is a reasonable lower bound but I'm not sure it provides insight into the overall rate.

Nathan Young · May 21 · 2

I'm pretty skeptical that we should conclude from this data that EA isn't unusually bad

Can we move to a position where we are more uncertain on EA's badness? And where we discuss it in those more uncertain terms?

On the second point, what is a more accurate estimate of the EA rate, in your view?

John Salter · May 21 · 7

Thanks for questioning the common narrative even though it's awkward! I think it's great you delayed until things got less heated.

I think one big question is whether you'd also need to consider all the complaints made to all the HR departments in EA and not just CEA's CH team. I imagine you're only going to CH if it's someone from some workplace besides your own, and that you'd go to your manager / HR team if it was someone you actually work with. If that's right, you might also need to talk to the people who run groups / coworking spaces.

If that's the best estimation strategy, I'm guessing you'd need to bump up your estimate by ~100x or so. Another thing is EA has far fewer women than these other places, so you might need to raise it by another factor of ~3.

Bella · May 21 · 8

I think this is a great point directionally, but hard to make the maths work for the numbers you've suggested.

Like, for there to be 100x more incidents of interpersonal misconduct than are reported to CH... well, there'd need to be 1000s of incidents! And there are only a few thousand EAs. It could be that the rate is in the tens of percent, but I'd find that surprising.

Ditto your adjustment for women: to get a 3x adjustment given the amount of women in other spaces:

If the other spaces are 50% women, i.e. representative of gen. pop., then EA would need to be ~17% women for a 3x adjustment to make sense
But EA survey data suggests EA is 26% women
I guess you might think "professional EA orgs" or something are more male-dominated than EA survey respondents. That seems possible, depending on what you wanna count. One thing is 80k is currently majority-women and I think however you carve it up we're a big chunk of the space :)
- (Edit: that made me curious so I went and counted; assuming I'm right about everyone's gender then we're 60% women! I think it was a bit skewed the other way when I joined 4y ago.)
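As a rough check of the arithmetic in this comment, here is a minimal sketch. It assumes the adjustment is simply the ratio of the share of women in the comparison space to the share of women in EA, which is one reading of the argument and not a method stated by either commenter; the function name `adjustment_factor` is illustrative.

```python
def adjustment_factor(share_women_elsewhere, share_women_ea):
    """Ratio by which a per-person rate would be scaled if only women
    are counted in the denominator (an assumed reading of the argument)."""
    return share_women_elsewhere / share_women_ea


# Share of women EA would need for a 3x adjustment, if other spaces are 50% women.
implied_share_for_3x = 0.50 / 3

# Adjustment implied by the 26% figure from EA survey data cited above.
factor_from_survey = adjustment_factor(0.50, 0.26)

# Incidents implied by a 100x multiplier on the 19 cases reported to CH.
implied_incidents = 19 * 100

print(f"Share of women needed for 3x: {implied_share_for_3x:.0%}")
print(f"Adjustment at 26% women: {factor_from_survey:.2f}x")
print(f"Incidents implied by 100x: {implied_incidents}")
```

On these assumptions the gender adjustment comes out closer to 2x than 3x, and a 100x multiplier implies roughly 1,900 incidents per year.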

Nathan Young · May 21 · 2

Well put. And as I said in my response I also think the women in other spaces argument could equally cut in the other direction - if we assume harassment is done by men to women, there are more men per woman in EA than comparable spaces.

John Salter · May 21 · 2

Re 2 vs 3 - I didn't want to have to literally whip a calculator out and crunch some stats before making a comment so I just put a squiggly.

Re 100, I think you're absolutely right. I think I'm off by an order of magnitude or two. I wouldn't be surprised if there were around 1000 individual incidents, but then I'd only expect a few % to be reported and for them to be grouped together. Tbh, I kinda skimmed the post and didn't read it thoroughly enough before commenting :S

Nathan Young · May 21 · 2

So to clarify rather than the number being 300x too low, you think it might be somewhere between 2 - 20x too low?

And you're saying that you wouldn't be surprised if there were 1000 annual individual incidents among the 3000 or so EA women.

Nathan Young · May 21* · 2

Thank you. It was quite a lot of work to get the piece into a good shape.

I'm guessing you'd need to bump up your estimate by ~100x or so

This seems way too high. I can see a case for 5x. I would be shocked to find out that in total EA HR depts receive 100x as many complaints as CH.

The factor of 3 thing is less clear. I think I can see an argument in the other direction: if it's due to the number of badly behaved men, then in some ways EA might compare even more favourably to the university contexts. This feels like the sort of adjustment that one shouldn't make when one is estimating. Seems better just to present the numbers as they are.

If you want to try and make a case for why it's clearly a multiple in that direction, not the other direction, then I might think about it and add it to the chart.

You seem to be taking the premise on board that we can do this kind of math. If you became convinced that a reasonable estimate was on the low end of comparable communities, would you agree with the conclusions of the article?

Ben Millwood🔸 · May 21 · 2

I'm not really convinced by your evidence that there's a widespread belief that EA is worse than other communities. Your first quote clearly says this, I grant. The other quotes seem to me to all be saying something more like "this community has this problem", and "I wish this community was better / had hoped that it would be better", without saying "we are worse than relevant comparisons", much less "we are worse than relevant comparisons because of intrinsic aspects of our culture".

FWIW, my belief, and I'm sure I'm not the only one, is more like: EA handles this badly, but also handling this badly is widespread. I don't think I've ever been in a community that I was confident had a good answer for this. EA should address this within its own community, because that's where we understand what's happening best and have the best tools for addressing it. This is true regardless of whether EA is worse, the same, or better than its peers: it's true for as long as it seems feasible to do better than we're doing.

In the end, interventions should be decided on their merits, and I think we should tip the balance away from the abstract and towards the concrete questions of what specific things we should or shouldn't do.

Nathan Young · May 21 · 2

Hey,

I found lots and lots of comments, but to me it is so obvious that these people implicitly believe this that I struggle to argue the point. What about the comments on this article?

Thanks for questioning the common narrative even though it's awkward!

I'm pretty skeptical that we should conclude from this data that EA isn't unusually bad.

What we can do is look at the responses to the incidents that do get raised, and the experiences of victims, and judge whether or not they live up to the standards we want to see in a group that takes sexual harassment seriously. I do not think the grades are very good on this front.

Currently about half of the comments disagreeing here seem to espouse the view that the community is bad. I expect those comments to be upvoted and agreed with. Am I reading them wrong? Do you think they are unrepresentative of the forum community at large?

I might get on to your second point in a bit.

Ben Millwood🔸 · May 21 · 2

I think you're reading some of them right, and many of them wrong, because you seem to continue to be equating "EA's performance on this issue makes me sad" and "EA's performance on this issue is worse than peers". Some people believe both for sure! But you keep including people saying the first thing as if they're saying the second. "Currently about half of the comments disagreeing here seem to espouse the view that the community is bad." -- again not distinguishing between "bad" and "worse".

Honestly my guess would be that most people don't have a clear considered belief on the comparison. They see bad behaviour and they object. It's not obvious to me why they would feel the need for a belief on the comparison. Whatever it is, the bad behaviour is still objectionable.

I hear you as pushing in a direction of "maybe we can't do anything about it, because no-one around us has succeeded in doing better". And, well, maybe! But I think this is a weak heuristic as compared with thinking directly about whether we should have stronger codes of conduct at EA conferences, or whether we need to develop more training resources or other support for orgs that are too small to have a proper HR function, or what stopped CEA from acting on what seems like clear evidence of inappropriate behaviour.

Nathan Young · May 21 · 2

Can we stick on whether I am reading them right? Which of the three comments that I quoted do you think doesn't imply that this community is unusually bad here?