Effective Altruism Forum
Anthropic
• Applied to Takes on "Alignment Faking in Large Language Models" (3d ago)
• Applied to Alignment Faking in Large Language Models (3d ago)
• Applied to I read every major AI lab’s safety plan so you don’t have to (6d ago)
• Applied to Anthropic teams up with Palantir and AWS to sell AI to defense customers (1mo ago)
• Applied to The current state of RSPs (2mo ago)
• Applied to Dario Amodei — Machines of Loving Grace (2mo ago)
• Applied to Anthropic rewrote its RSP (2mo ago)
• Applied to #197 – On whether Anthropic's AI safety policy is up to the task (Nick Joseph on The 80,000 Hours Podcast) (4mo ago)
• Applied to OpenAI and Anthropic Donate Credits for AI Forecasting Benchmark Tournament (5mo ago)
• Applied to Claude 3.5 Sonnet (6mo ago)
• Applied to AI Safety Newsletter #37: US Launches Antitrust Investigations Plus, recent criticisms of OpenAI and Anthropic, and a summary of Situational Awareness (6mo ago)
• Applied to Jan Leike: "I'm excited to join @AnthropicAI to continue the superalignment mission!" (7mo ago)
• Applied to Introducing Senti - Animal Ethics AI Assistant (7mo ago)
• Applied to OMMC Announces RIP (9mo ago)
• Applied to Introducing Alignment Stress-Testing at Anthropic (1y ago)
• Applied to Would an Anthropic/OpenAI merger be good for AI safety? (1y ago)
• Applied to Scalable And Transferable Black-Box Jailbreaks For Language Models Via Persona Modulation (1y ago)
• Applied to Responsible Scaling Policies Are Risk Management Done Wrong (1y ago)
• Applied to Thoughts on responsible scaling policies and regulation (1y ago)
• Applied to Frontier Model Forum (1y ago)