
I think this discussion happened on LW a few years ago (and has probably come up sporadically since then). Quickly, here are some considerations off the top of my head.

Pros:

  • Improves Forecasting
  • Necessary infrastructure for a variety of verification tech that will be needed for international treaties
  • Know when to sound alarm bells
  • Helps us know what type of defensive technologies we need to build

Cons:

  • Increases the speed of AI development
  • Unclear if US and China are even interested in coordinating
  • Unclear if a number from an eval will be enough to cause significant political pressure

Unsure:

  • Depending on the trajectory of benchmark results, builds or kills hype, and correspondingly increases or reduces investment.

2 Answers

Know when to sound alarm bells

In what situation would people actually coordinate to sound alarm bells over an AI benchmark? I basically don't think people pay attention to benchmarks in the way that matters. The usual pattern is that a benchmark comes out demonstrating some new potential danger, AI safety people raise concerns about it, and the people with the actual power continue to ignore them.

Thinking of some historical examples:

  • Anthropic's findings on alignment faking should have alerted people that AI is too dangerous to keep building, but they didn't.
  • Anthropic/OpenAI recently finding that their latest models may cross ASL-4 / CBRN thresholds should've been an alarm bell that they can't release those models, but they released them anyway and nobody could do anything to stop them.

A relevant question I'm not sure about: for people who talk to politicians about AI risk, how useful are benchmarks? I'm not involved in those conversations so I can't really say. My guess is that politicians are more interested in obvious capabilities (e.g. Claude can write good code now) than they are in benchmark performance.

Increases the speed of AI development

This is the biggest con by far. There are at least two mechanisms by which it could happen:

  • Benchmarks make AI capabilities growth more apparent, which incentivizes investment.
  • Benchmarks help AI companies learn how to build more powerful AI systems.