Benchmarks are tests that let us measure progress in AI capabilities and check for characteristics that might pose safety risks.
The Benchmark Lottery
BASALT: A Benchmark for Learning from Human Feedback - AI Alignment Forum
Misaligned Powerseeking — SERI ML Alignment Theory Scholars Program | Summer 2022
[2110.06674] Truthful AI: Developing and governing AI that does not lie
I'm not totally sure whether this should exist, and whether it should be called this.