What Should the Average EA Do About AI Alignment?

by Raemon, 25th Feb 2017



I'm trying to get a handle on what advice to give people who are convinced AI is a problem worthy of their time, *probably* the most important problem, but are not sure if they have the talent necessary to contribute.

A trending school of thought is "AI Alignment needs careful, clever, agenty thinkers. 'Having the correct opinion' is not that useful. There is nobody who can tell you what exactly to do, because nobody knows. We need people who can figure out what to do, in a very messy, challenging problem."

This sort of makes sense to me, but it seems like only a few sorts of people can realistically contribute in this fashion (even given growth mindset considerations). It also seems like, even if most people could contribute, it doesn't provide very good next-actions to people who have reached the "okay, this is important" stage, but who aren't (yet?) ready to change their career direction.

Here is the advice I currently give, followed by the background assumptions that prompted it. I'm looking for people to challenge me on any of these:

Options for the non-or-minimally-technical-ish:

1) Donate. (1%, or more if you can do so without sacrificing the ability to take valuable financial risks to further your career. MIRI, FHI, 80k and CFAR seem like the most credible ways to turn money into more AI Alignment career capital)

2) Arrange your life such that you can easily identify volunteer opportunities for gruntwork, operations, or other nontechnical skills for AI safety orgs, and dedicate enough time and attention to helping with that gruntwork that you are more of an asset than a burden (e.g. helping to run conferences and workshops). To help with AI-specific things, it seems necessary to be in the Bay, Boston, Oxford, Cambridge or London.

3a) Embark on projects or career paths that will cause you to gain deep skills, and in particular, train the habit/skill of noticing things that need doing, and proactively developing solutions to accomplish them. (These projects/careers can be pretty arbitrary. To eventually tie them back into AI, you need to get good enough that you'll either be able to help found a new org or provide rare skills to an existing org.)

3b) Ideally, choose projects that involve working together in groups, that require you to resolve differences in opinion on how to use scarce resources, and that require you to interact with other groups with subtly different goals. Practice coordination skills mindfully.

4) Provide a reading list of blogs and social-media feeds to stay up-to-date on the more accessible, less technically demanding thoughts relating to AI Safety. Practice thinking critically on your own about them. (this doesn't really come with an obvious "Part 2" that translates that into meaningful action on its own)

If technical-ish, and/or willing to learn a LOT:

5) Look at the MIRI and 80k AI Safety syllabi, and see how much of the material looks like something you'd be excited to learn. If applicable to you, consider diving into that so you can contribute to the cutting edge of knowledge.

6) If you're a talented programmer, learn a lot about ML/Deep Learning and then stay up to date on the latest actual AI research, so you can position yourself at the top AI companies and potentially have influence with them on which direction they go.

An important question I'd like to answer is "how can you tell if it makes sense to alter your career in pursuit of #5 and #6?" This is very non-obvious to me.

I talk to a lot of people that seem roooooughly analogous to myself, i.e. pretty smart but not extremely smart. In my case I think I have a credible claim on "community building" being my comparative advantage, but I notice a lot of people default to "be a community person or influencer", and I'm really wary of a decision tree that outputs a tower of meta-community-stuff for anyone who's not obviously expert at anything else. I'd like to have better, fleshed out, scalable suggestions for people fairly similar to me.

Background assumptions

Various things that fed into the above recommendations (sometimes directly, sometimes indirectly). This is a living document that I'll update as people persuade me otherwise. Again, appreciate getting challenged on any of these.

AI Timelines and Goals

AI timelines could be anywhere from 5 years (if DeepMind is more advanced than they're telling anyone), to 20 years (if it turns out general AI is only a couple of breakthroughs away from current Deep Learning trends, and we're (un)lucky on how soon those breakthroughs come), to much longer if general AI turns out to be harder. We should be prepared for each possibility.

Eventually, all of our efforts will need to translate into one of the following:

 - the ability to develop insights about AI Alignment
 - the ability to cause AI research to be safely aligned
 - the ability to stop or slow down AI research until it can be safely aligned


 - MIRI seems like the most shovel-ready instance of "actual AI Safety research". It's not obvious to me whether MIRI is doing the best work, but they seem to be at least doing good work, and they do seem underfunded, and funding them seems like the most straightforward way to turn money into more professional AI researchers.

 - FHI is a contender for second-best funding-target for X-risk reduction, including some thought about AI alignment.

 - 80k, CFAR and Leverage are the orgs I know of that seem to be concretely attempting to solve the "career capital gap", with different strategies. They each have elements that seem promising to me. I'm not sure what their respective funding constraints are. (Note: I recently became a bit more interested in Leverage than I had been, but examining Leverage is a blogpost unto itself and I'm not going to try doing so here.)

 - The Far Future Fund (recently announced, run by Nick Beckstead) may be a good way to outsource your donation decision. 

Career Capital, Agency and Self Improvement

 - An important limiting reagent is "people able to be agents." More than any single skillset, we need people who are able to look at organizations and worldstates, figure out what's not being done yet, figure out if they currently have the skills to do it, and backchain from that to being able to become the sort of people who have the skills to do that.

 - To self-improve the fastest, as a person and as an org, you need high quality feedback loops. 

 - In my experience, there is a critical threshold between an "agent" and a non-agent. People get activated as agents when they a) have a concrete project to work on that seems important to them that's above their current skill level, and b) have some high status mentor-figure who takes time out of their day to tell them in a serious voice "this project you are working on is important." (The latter step is not necessary but it seems to help a lot. Note: this is NOT a mentor figure who necessarily spends a lot of time training you. They are Gandalf, telling you your mission is important and they believe in you, and then mostly staying out of the way)

(Actual longterm mentorship is also super helpful but doesn't seem to be the limiting issue)

 - Beyond "be an agent", we do need highly skilled people at a variety of specific skills - both because AI Safety orgs need them, and because high skill allows you to get a job at an AGI research institution.

 - Although CFAR has been attempting this for several years, it's not obvious that it has developed the ability to produce agents. It has succeeded (at least slightly) at attracting existing agents, training them in some skills, and focusing them on the right problems.

Thinking Critically

 - We need people who can think critically, and who spend time/attention being able to think critically and deeply about the right things. 

 - Thinking usefully critically requires being up to speed on what other people are thinking, so you aren't duplicating work.

 - It is currently very hard to keep up with ALL the different developments across the AI/EA/Career-Capital-Building spaces, both because the updates come from all over the internet (and sometimes in person), and because people's writing is often verbose.

 - It is possible for the average EA to learn to think more critically, but it requires significant time investment.


 - Coordination problems are extraordinarily hard. Humanity essentially failed the "Nuclear Weapons test" (i.e. we survived the Cold War, but we easily might not have. Squeaking by with a C- is not acceptable).

 - Some people have argued the AI problem is much harder than Nukes, which isn't clear to me. (In the long term you do need to stop everyone ever from developing unsafe AI, but it seems like the critical period is the window wherein AGI is first possible, where it'll be something like 6-20 companies working on it at once.)

 - The Rationality and EA communities aren't obviously worse than the average community at coordination, but they are certainly not much better. And EAs are definitely not better than average at inducing coordination/cooperation among disparate groups with different goals that aren't aligned with us.

 - If your goal is to influence orgs or AGI researchers, you need to make sure you're actually following a path that leads to real influence. (i.e. "You can network your way into being Elon Musk's friend who he invites over for dinner, but that doesn't mean he'll listen to you about AI safety. The same goes for networking your way onto the Google Brain team or the Google AI Ethics board. Have a clear model of influence and how much of it you credibly have.")

 - Mainstream politics is even harder than coordinating corporations, and to a first approximation is useless for purposes of AI alignment.

Open Questions

This is mostly a recap.

0) Is anything in my framework grossly wrong?

1) My primary question is "how do we filter for people who should consider dropping everything and focusing on the technical aspects of AI Safety, or seriously pursue careers that will position them to influence AGI research institutions?" These seem like the most important things to actually output, and it seems most important for those people to cultivate particular types of critical thinking, technical skill and ability-to-influence.

For people who are not well suited, or not yet ready to do 1), how can we either:

2) Make it easier for them to translate marginal effort into meaningful contribution, or create a clearer path towards:

3) Leveling up to the point where they are able to take in the entire field, and generate useful things to do (without requiring much effort from other heavily involved people whose time is scarce).

Potential Further Reading

I have not read all of these, so cannot speak to which are most important, but I think it's useful to at least skim the contents of each of them so you have a rough idea of the ideas at play. I'm including them here mostly for easy reference.

(If someone wanted to generate a 1-3 sentence summary of each of these and indicate who the target audience is, I'd be happy to edit that in. I hopefully will eventually have time to do that myself but it may be a while)

MIRI's Research Guide

80,000 Hours AI Safety Syllabus

UC Berkeley Center for Human Compatible AI Bibliography

Case Study of CFAR's Effectiveness

AI Impacts Timelines and Strategies (examples of how to think strategically given different AI timelines)

Concrete Problems in AI Safety

OpenAI's Blog

AgentFoundations.org (this is sort of a stack-overflow / technical discussion forum for discussing concepts relevant to AI alignment)

Deliberate Grad School