This is a linkpost for https://www.alignmentforum.org/posts/yHFhWmu3DmvXZ5Fsm/clarifying-metr-s-auditing-role
METR's clarification of what auditing it has and hasn't done, and of its plans for the future, seems very important for understanding the AI evaluation landscape.
Excerpt:
METR has not intended to claim to have audited anything, or to be providing meaningful oversight or accountability, but there has been some confusion about whether METR is an auditor or planning to be one.
To clarify this point:
- METR’s top priority is to develop the science of evaluations, and we don’t need to be auditors in order to succeed at this.
- We aim to build evaluation protocols that can be used by evaluators/auditors regardless of whether that is the government, an internal lab team, another third party, or a team at METR.
- We should not be considered to have ‘audited’ GPT-4 or Claude.
  - Those were informal pilots of what an audit might involve, or research collaborations – not providing meaningful oversight. For example, it was all under NDA – we didn’t have any right or responsibility to disclose our findings to anyone outside the labs – and there wasn’t any formal expectation it would inform deployment decisions. We also didn’t have the access necessary to perform a proper evaluation.
Also relevant: "AI companies aren't really using external evaluators".