AI Safety Researcher @ Independent Researcher / SERI MATS
Working (6-15 years of experience)
259 karma · Joined Aug 2017


I work primarily on AI Alignment. My main direction at the moment is to accelerate alignment work via language models and interpretability.


I think I agree with this. I’ve thought of starting one myself. Not sure if I will yet.

Here, I wrote about how applying AI to non-profits may be a neglected opportunity at the moment.

Near-Term AI capabilities probably bring low-hanging fruits for global poverty/health

I'm an alignment researcher, but I still think we should pay attention to how models like GPT-N could be used to make the world a better place. I like the work Ought is doing for academic research (and, hopefully, soon for alignment as well). However, my guess is that this new technology is creating low-hanging fruit that the non-profit sector has yet to pick.

This shortform is a call to action for any EA entrepreneur: you could potentially boost the efficiency of the non-profit sector with these tools. Of course, be careful, since GPT-3 will sometimes hallucinate. But embedding it in a larger system with checks and balances could 1) save non-profits time and money, and 2) turn previously inefficient or non-viable non-profits into top charities.

I could be wrong about this, but my expectation is that there will be a lag between the time people can use GPT effectively for the non-profit sector and when they actually do.

Thank you for writing this post. I'm currently a technical alignment researcher who spent 4 years in government doing various roles, and my impression has been the same as yours regarding the current "strategy" for tackling x-risks. I talk about similar things (foresight) in my recent post. I'm hoping technical people and governance/strategy people can work together on this to identify risks and find golden opportunities for reducing risks.

"DM me if you're interested in dating me"

Before EAGSF this year, I mentioned (on Twitter) putting this on your SwapCard profile as a way to prevent the scenarios above, where people ask others for meetings because they are romantically interested in them. Instead, interested people could make contact off-site, and EAGs would hopefully have more attendees focused purely on career reasons. My thought was that if you don't do something like this, people will just continue hiding their intentions (though I'm sure some would still do so regardless).

I was criticized for saying this. Some people said the suggestion left them with an uncomfortable feeling, because they now have it in their minds that you might be doing a 1-on-1 with them because you find them attractive. Fair enough! That holds even if you, say, link to a dating doc or contact info off-site that they can reach after the conference. My hope was that we could make explicit the fact that people in the community are obviously looking to date others in the community and are finding that very difficult. Instead, my guess is that we are left in a situation where people will set up 1-on-1s because they find someone attractive, even if they don't admit it. I do not condone this, and it's not something I've done (for all the reasons listed in this thread).

Personally, I do not plan to ask anyone out from the community at any point. Initially, I had hoped to find someone with similar values, but I just don't think there is any setting where it seems appropriate. Not even parties. It's not worth the effort to figure out how to ask out an EA lady in a way that's considered acceptable. This might sound extreme to some, but I don't find it worth the mental energy to navigate, and I just want to be in career mode (and, at most, friendship mode) when engaging with other EAs. More importantly, there's too much mixing of work and fun, and it leads to uncomfortable situations and posts like this.

I'm not making a judgement on what others should do, but hopefully whichever way the community goes, it becomes more welcoming for people who want to do good.

Here’s a comment I wrote on LessWrong in order to provide some clarification:


So, my difficulty is that my experience in government and my experience in EA-adjacent spaces have thoroughly confused my understanding of the jargon. I'll try to clarify:

  • In the context of my government experience, forecasting explicitly tries to predict what will happen based on past data. It does not fully account for fundamental assumptions that might break due to advances in a field, changes in geopolitics, etc. Forecasts are typically used to inform a single decision; they do not focus on being robust across potential futures or try to identify opportunities we can take to change the future.
  • In EA / AGI risk, people seem to use "forecasting" to mean something somewhat like foresight, but not quite. For example, on Metaculus, people make long-term forecasts in a superforecaster mindset, perhaps expecting their long-term forecasts to be as good as their short-term ones. I don't mean to sound harsh; what they are doing is useful and can still feed into a robust plan for different scenarios. However, I'd say what is mentioned in reports does sometimes lean more into (what I'd consider) foresight territory.
  • My hope: instead of only using "forecasts/foresight" to figure out when AGI will happen, we use it to identify risks for the community, potential yellow/red light signals, and golden opportunities where we can effectively implement policies/regulations. In my opinion, using a "strategic foresight" approach enables us to be a lot more prepared for different scenarios (and might even have identified a risk like SBF much sooner).

My understanding of forecasting is that you would optimally want to predict a distribution of outcomes, i.e. the cone but weighted with probabilities. This seems strictly better than predicting the cone without probabilities since probabilities allow you to prioritize between scenarios. 

Yes, in the end, we still need to prioritize based on the plausibility of a scenario.

I understand some of the problems you describe, e.g. that people might be missing parts of the distribution when they make predictions and should spread them wider, but I think you can describe these problems entirely within the forecasting language, and there is no need to introduce a new term.

Yeah, I care much less about the term/jargon than the approach. In other words, what I'm hoping to see more of is coming up with a set of scenarios and forecasting across the cone of plausibility (weighted by probability, impact, etc.) so that we can create a robust plan and identify opportunities that improve our odds of success.
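To make the "weighted by probability, impact, etc." idea concrete, here is a minimal sketch of scenario prioritization. The scenario names, probabilities, impact scores, and the tractability discount are all hypothetical placeholders for illustration, not anything proposed in the discussion above:

```python
# Hypothetical scenarios: (name, probability, impact if realized, tractability
# of preparing for it in advance). All numbers are made up for illustration.
scenarios = [
    ("slow takeoff, coordinated governance", 0.30, 2, 0.8),
    ("slow takeoff, fragmented governance", 0.40, 5, 0.5),
    ("fast takeoff", 0.20, 9, 0.2),
    ("capabilities plateau", 0.10, 1, 0.9),
]

def priority(prob: float, impact: float, tractability: float) -> float:
    """Crude priority score: expected impact, discounted by how tractable
    it is to prepare for the scenario ahead of time."""
    return prob * impact * tractability

# Rank scenarios so planning effort goes to the highest-priority ones first.
ranked = sorted(scenarios, key=lambda s: priority(*s[1:]), reverse=True)
for name, p, i, t in ranked:
    print(f"{name}: priority = {priority(p, i, t):.2f}")
```

The point of a score like this is only to force explicit trade-offs between scenarios; a real foresight exercise would also track early-warning indicators ("yellow/red lights") for each scenario rather than reducing everything to one number.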

I think the information you are sharing is useful (some parts less so, I agree with pseudonym), just don't deadname/misgender them. EA is better than that.

I feel like anyone reaching out to Elon could say "making it better for the world" because that's exactly what would resonate with Elon. It's probably what I'd say to get someone on my side and communicate I want to help them change the direction of Twitter and "make it better."

Honestly, I’m happy with this compromise. I want to hear more about what ‘leadership’ is thinking, but I also understand the constraints you all have.

This obviously doesn’t answer the questions people have, but at least communicating this instead of radio silence is very much appreciated. For me at least, it feels like it helps reduce feelings of disconnectedness and makes the situation a little less frustrating.

Personally, I’ve mostly seen people who are confused and trying to demonstrate willingness to re-evaluate what might have led to these bad outcomes. They may swing too far in one direction, but this only just happened and they are re-assessing their worldview in real time. Some are just asking questions about how decisions were made in the past so that we have more information and can improve things going forward (which might mean doing nothing differently in some instances). My impression is that a lot of the criticism of EA leadership is overblown and that most (if not all) were blindsided.

That said, I haven’t had the impression it’s as bad and widespread as this post makes it seem. Maybe I just haven’t read the same posts, comments, and tweets.

I do think that working together so we can land on our feet and continue to help those in need sounds right, and I hope you’ll still be around, since critical posts like this are obviously needed.

One worry one might have is the following reaction: “I don’t need mental health help, I need my money back! You con artists have ruined my life and now want to give me a pat on the back and tell me it’s going to be ok?”

Then again, I do want us to do something if it makes sense. :(
