This is a linkpost for https://docs.google.com/document/d/1-58zgC2lRMbMK-CXU44VR3ApGYbTI0aJKX-cKkxDeyo/edit?usp=sharing
EDIT 3/17/2023: I've reorganized the doc and added some governance projects.
I intend to maintain this list in the linked doc. I'll paste the current state of the doc (as of January 19th, 2023) below. I encourage people to comment with suggestions.
- Levelling Up in AI Safety Research Engineering [Public] (LW)
  - Highly recommended list of AI safety research engineering resources for people at various skill levels.
- AI Alignment Awards
- Alignment jams / hackathons from Apart Research
  - Past / upcoming hackathons: LLM, interpretability 1, AI test, interpretability 2
  - Projects on AI Safety Ideas: LLM, interpretability, AI test
  - Resources: black-box investigator of language models, interpretability playground (LW), AI test
  - Examples of past projects; interpretability winners
  - How to run one as an in-person event at your school
- Neel Nanda: 200 Concrete Open Problems in Mechanistic Interpretability (doc and previous version)
- Project page from AGI Safety Fundamentals and their Open List of Project ideas
- AI Safety Ideas by Apart Research; EAF post
- Most Important Century writing prize (Superlinear page)
- Center for AI Safety
  - Competitions like SafeBench
  - Student ML Safety Research Stipend Opportunity – provides stipends for doing ML research.
  - course.mlsafety.org projects – CAIS is looking for someone to add details about these projects on course.mlsafety.org
- Distilling / summarizing / synthesizing / reviewing / explaining
- Forming your own views on AI safety (without stress!) – also see Neel's presentation slides and "Inside Views Resources" doc
- Answer some of the application questions from the winter 2022 SERI-MATS, such as Vivek Hebbar's problems
- 10 exercises from Akash in “Resources that (I think) new alignment researchers should know about”
- [T] Deception Demo Brainstorm has some ideas (message Thomas Larsen if these seem interesting)
- Upcoming 2023 Open Philanthropy AI Worldviews Contest
- Alignment research at ALTER – interesting research problems, many have a theoretical math flavor
- Open Problems in AI X-Risk [PAIS #5]
- Amplify creative grants (old)
- Evan Hubinger: Concrete experiments in inner alignment, ideas someone should investigate further, sticky goals
- Richard Ngo: Some conceptual alignment research projects, alignment research exercises
- Buck Shlegeris: Some fun ML engineering projects that I would think are cool, The case for becoming a black box investigator of language models
- Implement a key paper in deep reinforcement learning (a minimal starting skeleton is sketched after this list)
- “Paper replication resources” section in “How to pursue a career in technical alignment”
- Daniel Filan idea
- Summarize a reading from Reading What We Can
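For the deep RL paper-replication idea above, here is a rough sketch of what a starting skeleton might look like: REINFORCE on CartPole, assuming PyTorch and Gymnasium are installed. The environment, network size, and hyperparameters are illustrative choices of mine, not something taken from the linked resources; a replication of a specific paper (e.g. DQN or PPO) would swap in that paper's update rule and architecture.

```python
# Minimal REINFORCE on CartPole – a starting skeleton for a paper-replication
# project. Assumes PyTorch and Gymnasium; hyperparameters are illustrative.
import gymnasium as gym
import torch
import torch.nn as nn
from torch.distributions import Categorical

env = gym.make("CartPole-v1")
policy = nn.Sequential(
    nn.Linear(env.observation_space.shape[0], 64),
    nn.Tanh(),
    nn.Linear(64, env.action_space.n),
)
optimizer = torch.optim.Adam(policy.parameters(), lr=1e-2)

for episode in range(500):
    obs, _ = env.reset()
    log_probs, rewards = [], []
    done = False
    while not done:
        # Sample an action from the current policy and remember its log-prob.
        dist = Categorical(logits=policy(torch.as_tensor(obs, dtype=torch.float32)))
        action = dist.sample()
        log_probs.append(dist.log_prob(action))
        obs, reward, terminated, truncated, _ = env.step(action.item())
        rewards.append(reward)
        done = terminated or truncated

    # Discounted returns, computed backwards through the episode, then normalized.
    returns, g = [], 0.0
    for r in reversed(rewards):
        g = r + 0.99 * g
        returns.insert(0, g)
    returns = torch.as_tensor(returns, dtype=torch.float32)
    returns = (returns - returns.mean()) / (returns.std() + 1e-8)

    # Policy-gradient update: increase log-prob of actions with high return.
    loss = -(torch.stack(log_probs) * returns).sum()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    if episode % 50 == 0:
        print(f"episode {episode}, return {sum(rewards):.0f}")
```

Much of the value of a replication project comes from extending a skeleton like this toward the paper's actual contribution (replay buffers, target networks, clipped objectives, the reported benchmarks), so treat the above as scaffolding rather than the project itself.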