AI interpretability
• Applied to Safety-First Agents/Architectures Are a Promising Path to Safe AGI (2mo ago)
• Applied to Concrete open problems in mechanistic interpretability: a technical overview (3mo ago)
• Applied to (Intro/1) - My Understandings of Mechanistic Interpretability Notebook (3mo ago)
• Applied to Announcing Apollo Research (4mo ago)
• Applied to Why and When Interpretability Work is Dangerous (4mo ago)
• Applied to Call for Pythia-style foundation model suite for alignment research (5mo ago)
• Applied to High-level hopes for AI alignment (5mo ago)
• Applied to PhD Position: AI Interpretability in Berlin, Germany (5mo ago)
• Applied to If interpretability research goes well, it may get dangerous (6mo ago)
• Applied to Join the AI governance and interpretability hackathons! (6mo ago)
• Applied to Sentience in Machines - How Do We Test for This Objectively? (6mo ago)
• Applied to Against LLM Reductionism (7mo ago)
• Applied to Introducing Leap Labs, an AI interpretability startup (7mo ago)
• Applied to [MLSN #8]: Mechanistic interpretability, using law to inform AI alignment, scaling laws for proxy gaming (7mo ago)
• Applied to Concrete Steps to Get Started in Transformer Mechanistic Interpretability (9mo ago)
• Applied to Safety of Self-Assembled Neuromorphic Hardware (9mo ago)
• Applied to Why mechanistic interpretability does not and cannot contribute to long-term AGI safety (from messages with a friend) (9mo ago)
• Applied to A Barebones Guide to Mechanistic Interpretability Prerequisites (10mo ago)
• Applied to The limited upside of interpretability (1y ago)
• Applied to Join the interpretability research hackathon (1y ago)