259 karmaJoined Aug 2019


Rachel Freedman, AIS researcher at CHAI. London/Berkeley. Cats are not model-based reinforcement learners.


I’m strongly in support of this initiative, and hope to help out as my schedule permits.

I agree with Larks that the linked studies have poor methodology and don’t provide sufficient support for their claims. I wish that there was better empirical research on this topic, but I think that’s unlikely to happen for various reasons (specifying useful outcome metrics is extremely difficult, political and researcher bias pushes hard toward a particular conclusion, human studies are expensive, etc.).

In lieu of reliable large-scale data, I’m basing my opinion on personal experiences and observations from my 5 years as a full time (cis female) AIS researcher, as well as several years of advising junior and aspiring researchers. I want to be explicit that I’d like better data than this, but am using it because it’s the best I have available.

I see two distinct ways that this initiative could be valuable for AIS research:

  1. It could help us to recruit and retain more promising researchers. As Lewis commented, we need all the help we can get. While this community tries hard to be meritocratic, and is much less overtly hostile to women than neighboring communities I’ve experienced, I have personally noticed and experienced unintentional-yet-systemic patterns of behavior that can make it particularly difficult to remain and advance in this field as a woman. I’d prefer not to get into an in-depth discussion of that on here, though I have written about a bit of it in a related comment.[1] I believe that a more gender-balanced environment, and particularly more accessible senior female researchers and mentors, would likely reduce this.

    I also suspect that more balanced gender representation would make more people feel comfortable entering the field. I am often the only woman at lab meetings, research workshops, and other AIS events, and very often the only woman who isn’t married to a man who is also in attendance. This doesn’t bother me, but I think that’s just a random quirk of my personality. I think it’s totally reasonable and not uncommon for people to be reluctant to join a group where their demographics make them stand out, and we could be losing female entrants this way. (Though I have noticed much more gender diversity in AIS researchers who’ve joined in the past <2 years than in those who joined >=5 years ago, so it’s possible this problem is already going away!)
  2. Women (or any member of an underrepresented group or background) could provide important perspective for some areas of AIS research. It's important to distinguish between different research areas here, so I’m gonna messily put AIS topics on a spectrum between “fundamental” and “applied”. By “fundamental”, I mean topics like interpretability, decision theory, science of deep learning, etc — work to understand, predict, and figure out how to control AI behavior at all. By “applied”, I mean topics like practical implications of RLHF when teachers have differing preferences, or constructing meaningful evaluations for foundation models — work to understand, predict, and dictate how AI interacts with the real world and groups of humans. 

    On the “fundamental” end of the spectrum, I don’t think that diversity in researcher background and life experience really matters either way. But in topics further toward the “applied” end of the spectrum, it can help a whole lot. There’s plausibly-important safety work happening all along this spectrum, especially now that surprisingly powerful AI systems are deployed in the real world, so there are areas where researchers with diverse backgrounds can be particularly valuable.


Overall, I think that this is an excellent thing to dedicate some resources to on the margin.

  1. ^

    A relevant excerpt: "most of these interactions were respectful, and grew to be a problem only because they happened so systematically -- for a while, it felt like every senior researcher I tried to get project mentorship from tried to date me instead, then avoided me after I turned them down, which has had serious career consequences."

Thanks for the update; I'm curious to hear what you think!

This video was just released on Nebula. I expect it will be out on YouTube in the next couple of days. I watched the entire thing and, overall, thought it was reasonably evenhanded. Some of the critiques seem valid and, though not necessarily novel, worth discussing more (e.g. Measurability Bias). Some of them seemed a bit more hand-wavey (e.g., paraphrased: "morality is about the interactions that we have with each other, not about our effects on future people, because future people don't even exist!") or shallow (e.g., paraphrased: "malicious AI won't spread uncontrolled through the internet; complex programs need special hardware, and we can just turn that off!"). There was also a healthy dose of "dismantle the System" and a complaint that EA legitimizes capitalism by making earning money compatible with morality.

Overall, it struck me as unusually truth-seeking for a piece of media produced primarily for entertainment. While Thorn seems to have some core ideological differences with EA (she's really into "Dismantle the System"), she also seems to have made a significant effort here, including reading both What We Owe the Future and The Precipice in addition to Torres and other critics. Hopefully her audience will come away with a nuanced view.

There's a lot of discussion here about why things don't get reported to the community health team, and what they're responsible for, so I wanted to add my own bit of anecdata.

I'm a woman who has been closely involved with a particularly gender-imbalanced portion of EA for 7 years, who has personally experienced and heard secondhand about many issues around gender dynamics, and who has never reported anything to the community health team (despite several friends suggesting that I do). Now I'm considering why.

Upon reflection, here are a few reasons:

  1. Early on, some of it was naiveté. I experienced occasional inappropriate comments or situations from senior male researchers when I was a teenager, but assumed that they could never be interested in me because of the age and experience gap. At the time I thought that I must be misinterpreting the situation, and only see it the way I do now with the benefit of experience and hindsight. (I never felt unsafe, and if I had, would have reported it or left.)

  2. Often, the behavior felt plausibly deniable. "Is this person asking me to meet at a coffeeshop to discuss research or to hit on me? How about meeting at a bar? Going for a walk on the beach?" I was unsure what crossed into inappropriate territory, and whether it was I who was problematically sexualizing everything. Most of this is only obvious in hindsight, and because I have enough experience to notice patterns in behavior across individuals. Moreover, most of these interactions were respectful, and grew to be a problem only because they happened so systematically -- for a while, it felt like every senior researcher I tried to get project mentorship from tried to date me instead, then avoided me after I turned them down, which has had serious career consequences. I didn't report this because it was unclear what to report -- no particular individual was clearly acting inappropriately, and (at least the first few times) I doubted myself.

  3. I moved to the Bay a few years ago for a PhD, and access to collaborative workspace, networking events, and supplemental funding (very necessary for me, with health problems on an academic stipend) are all gated by a couple of people here. They are all men (as far as I know), one or more of them have asked me out or shown romantic interest (respectfully), and there are few enough women in my field here that I didn't feel I had any hope of remaining anonymous. I thought making a big fuss about these things would tank my career, or at least lose me the trust I need to access these spaces and resources, and I wasn't willing to do that. I moved here and made a bunch of personal sacrifices to work on incredibly important problems, after all.

Over the past 7 years, my motivation has developed from mostly-1 to mostly-2 to mostly-3. Regardless, I honestly don't know of anything that the community health team could do to help with any of this. There were no extreme situations that warranted a specific individual being banned. The problematic dynamics were subtle, and I didn't see how any broad communication could help with them. I didn't want the team to take any action that might de-anonymize me, for career reasons. I don't see anything to blame the community health team for here.

Thanks for the nuanced response. FWIW, this seems reasonable to me as well:

I agree that it's important to separate out all of these factors, but I think it's totally reasonable for your assessment of some of these factors to update your assessment of others.

Separately, I think that people are sometimes overconfident in their assessment of some of these factors (e.g. intelligence), because they over-update on signals that seem particularly legible to them (e.g. math accolades), and that this can cause cascading issues with this line of reasoning. But that's a distinct concern from the one I quoted from the post.

In my experience, smart people have a pretty high rate of failing to do useful research (by researching in an IMO useless direction, or being unproductive), so I'd never be that confident in someone's research direction just based on them seeming really smart, even if they were famously smart.

I've personally observed this as well; I'm glad to hear that other people have also come to this conclusion.

I think the key distinction here is between necessity and sufficiency. Intelligence is (at least above a certain threshold) necessary to do good technical research, but it isn't sufficient. Impressive quantitative achievements, like competing in the International Math Olympiad, are sufficient to demonstrate intelligence (again, above a certain threshold), but not necessary (most smart people don't compete in IMO and, outside of specific prestigious academic institutions, haven't even heard of it). But mixing this up can lead to poor conclusions, like one I heard the other night: "Doing better technical research is easy; we just have to recruit the IMO winners!"

I strongly agree with this particular statement from the post, but have refrained from stating it publicly before out of concern that it would reduce my access to EA funding and spaces.

EAs should consciously separate:

  • An individual’s suitability for a particular project, job, or role
  • Their expertise and skill in the relevant area(s)
  • The degree to which they are perceived to be “highly intelligent”
  • Their perceived level of value-alignment with EA orthodoxy
  • Their seniority within the EA community
  • Their personal wealth and/or power

I've been surprised how many researchers, grant-makers, and community organizers around me do seem to interchange these things. For example, I recently was surprised to hear someone who controls relevant funding and community space access remark to a group "I rank [Researcher X] as an A-Tier researcher. I don't actually know what they work on, but they just seem really smart." I found this very epistemically concerning, but other people didn't seem to.

I'd like to understand this reasoning better. Is there anyone who disagrees with the statement (aka, disagrees that these factors should be consciously separated) who could help me to understand their position? 

This is a great idea! I don't currently have capacity for one-to-one calls, but I do hold monthly small group calls in an "office hours" format.

I'm a technical AI safety researcher at CHAI and PhD student at UC Berkeley, and I'm happy to talk about my research, others' research, graduate school, careers in AI safety, and other related topics. If you're interested, you can find out more about my research here, and sign up to join an upcoming call here.

Thank you for explaining more. In that case, I can understand why you'd want to spend more time thinking about AI safety.

I suspect that much of the reason that "understanding the argument is so hard" is that there isn't a definitive argument -- just a collection of fuzzy arguments and intuitions. The intuitions seem very, well, intuitive to many people, and so they become convinced. But if you don't share these intuitions, then hearing about them doesn't convince you. I also have an (academic) ML background, and I personally find some topics (like mesa-optimization) to be incredibly difficult to reason about.

I think that generating more concrete arguments and objections would be very useful for the field, and I encourage you to write up any thoughts that you have in that direction!

(Also, a minor disclaimer that I suppose I should have included earlier: I provided technical feedback on a draft of TAP, and much of the "AGI safety" section focuses on my team's work. I still think that it's a good concrete introduction to the field, because of how specific and well-cited it is, but I also am probably somewhat biased.)

Thank you for writing this! I particularly appreciated hearing your responses to Superintelligence and Human Compatible, and would be very interested to hear how you would respond to The Alignment Problem. TAP is more grounded in modern ML and current research than either of the other books, and I suspect that this might help you form more concrete objections (and/or convince you of some points). If you do read it, please consider sharing your responses.

That said, I don’t think that you have any obligation to read TAP, or to consider thinking about AI safety at all. It sounds like you aren’t drawn to a career in the field, and that’s fine. There are plenty of other ways to do good with an ML skill set. But if you don’t need to weigh working in AI safety against other career options, and you don’t find it interesting or enjoyable to consider, then why focus on forming personal views about AI safety at all?

Edited to add a disclaimer: I provided technical feedback on a draft of TAP, and much of the "AGI safety" section focuses on my team's work. I still think that it's a good concrete introduction to the field, because of how specific and well-cited it is, but I also am probably somewhat biased.

This closely matches my personal experience of EAG. I typically have back-to-back meetings throughout the entire conference, including throughout all talks. At the most recent EAG London, a more senior person in my field and I mutually wanted to meet, and exchanged many messages like the one in the screenshot above -- "I just had a spot open up in 15 minutes if you're free?", "Are you taking a lunch break tomorrow?", etc. (We ultimately were not able to find mutual availability, and met on Zoom a couple of weeks later.)

Like Charles, I don't necessarily think that this is a bad thing. However, if this is the primary intent of the conference, it could be improved somewhat to make small meetings easier (and possibly to include more events like the speaker reception, where people who spend the rest of the conference in prearranged 1:1s can casually chat).

I personally would be very excited about a conference app that allowed people to book small group (1:2 or 1:3) meetings. I find that many people I speak to ask the same questions, and that I am frustratingly unable to accommodate everyone who wants to have a 1:1. I sometimes hold group Zoom calls (1:3 or 1:5) afterward for people whom I wasn't able to meet during the conference, and this format seems to work well.
