Effective Altruism Forum
Anthropic
• Applied to Takes on "Alignment Faking in Large Language Models" (3d ago)
• Applied to Alignment Faking in Large Language Models (3d ago)
• Applied to I read every major AI lab’s safety plan so you don’t have to (6d ago)
• Applied to Anthropic teams up with Palantir and AWS to sell AI to defense customers (1mo ago)
• Applied to The current state of RSPs (2mo ago)
• Applied to Dario Amodei — Machines of Loving Grace (2mo ago)
• Applied to Anthropic rewrote its RSP (2mo ago)
• Applied to #197 – On whether Anthropic's AI safety policy is up to the task (Nick Joseph on The 80,000 Hours Podcast) (4mo ago)
• Applied to OpenAI and Anthropic Donate Credits for AI Forecasting Benchmark Tournament (5mo ago)
• Applied to Claude 3.5 Sonnet (6mo ago)
• Applied to AI Safety Newsletter #37: US Launches Antitrust Investigations Plus, recent criticisms of OpenAI and Anthropic, and a summary of Situational Awareness (6mo ago)
• Applied to Jan Leike: "I'm excited to join @AnthropicAI to continue the superalignment mission!" (7mo ago)
• Applied to Introducing Senti - Animal Ethics AI Assistant (7mo ago)
• Applied to OMMC Announces RIP (9mo ago)
• Applied to Introducing Alignment Stress-Testing at Anthropic (1y ago)
• Applied to Would an Anthropic/OpenAI merger be good for AI safety? (1y ago)
• Applied to Scalable And Transferable Black-Box Jailbreaks For Language Models Via Persona Modulation (1y ago)
• Applied to Responsible Scaling Policies Are Risk Management Done Wrong (1y ago)
• Applied to Thoughts on responsible scaling policies and regulation (1y ago)
• Applied to Frontier Model Forum (1y ago)