MichaelDickens

7205 karma · Joined
mdickens.me

Bio

I do independent research on EA topics. I write about whatever seems important, tractable, and interesting (to me).

I have a website: https://mdickens.me/. Much of the content on my website gets cross-posted to the EA Forum, but I also write about some non-EA stuff over there.

My favorite things that I've written: https://mdickens.me/favorite-posts/

I used to work as a software developer at Affirm.

Sequences
1

Quantitative Models for Cause Selection

Comments
957

A relevant question I'm not sure about: for people who talk to politicians about AI risk, how useful are benchmarks? I'm not involved in those conversations so I can't really say. My guess is that politicians are more interested in obvious capabilities (e.g. Claude can write good code now) than they are in benchmark performance.

Know when to sound alarm bells

What is the situation where people coordinate to sound alarm bells over an AI benchmark? I basically don't think people pay attention to benchmarks in the way that matters: a benchmark comes out demonstrating some new potential danger, AI safety people raise concerns about it, and the people with the actual power continue to ignore them.

Thinking of some historical examples:

  • Anthropic's findings on alignment faking should have alerted people that AI is too dangerous to keep building, but they didn't.
  • Anthropic/OpenAI recently finding that their latest models may cross ASL-4 / CBRN thresholds should've been an alarm bell that they can't release those models, but they released them anyway and nobody could do anything to stop them.

Increases the speed of AI development

This is the biggest con by far. There are at least two mechanisms by which it could happen:

  • Make AI capabilities growth more apparent, incentivizing investment.
  • Help AI companies learn how to build more powerful AI systems.

We focus on intentional deaths because they are most revealing of terminal preferences, which in turn are most predictive of future harm.

Why is an ideological desire to kill the outgroup more likely to influence the long-term future than (say) a desire to preserve wild-animal suffering for aesthetic reasons?

Considering that wild animal suffering is currently somewhere around 10^3 to 10^15 times worse than any ideologically-driven tragedy has ever been, by default I'm far more concerned about the former (wild animal suffering).

Ah yes, this supports my preconceived belief that (1) we cannot reliably ascertain whether a model has catastrophically dangerous capabilities, and therefore (2) we need to stop developing increasingly powerful models until we get a handle on things.

A donor wanted to spend their money this way; it would not be fair to the donor for Eliezer to turn around and give the money to someone else. There is a particular theory of change according to which this is the best marginal use of ~$1 million: it gives Eliezer a strong defense against accusations like:

If they suddenly said that the risk of human extinction from AGI or superintelligence is extremely low, in all likelihood that money would dry up and Yudkowsky and Soares would be out of a job.

I kinda don't think this was the best use of a million dollars, but I can see the argument for how it might be.

Copying from my other comment:

The reason Eliezer gets paid so much is because a donor specifically requested it. The express purpose of the donation was to make Eliezer rich enough that he could afford to say "actually AI risk isn't a big deal" and shut down MIRI without putting himself in a difficult financial situation.

(I don't know about Nate's salary, but $235K looks pretty reasonable to me? That's less than what a mid-level software engineer makes.)

Edit Feb 2: Apparently the donation I was thinking of is separate from Eliezer's salary, see his comment

MIRI pays Eliezer Yudkowsky $600,000 a year.

I believe this is because a donor specifically requested it. The express purpose of the donation was to make Eliezer rich enough that he could afford to say "actually AI risk isn't a big deal" and shut down MIRI without putting himself in a difficult financial situation.

Edit Feb 2: Apparently the donation I was thinking of is separate from Eliezer's salary, see his comment

I spent a few days thinking about this, but I struggled to come up with a bet structure that I was confident was good for both of us. The financial implications of this sort of bet are complicated. I didn't want to spend more time on it, so I'll punt on this for now, but I will keep it in the back of my mind in case I come up with anything.

How do I know that they're aligned? For example, I asked Claude to find some quotes from Mike Rounds, and he mentioned biorisk from AI, but that was about it. Rounds also said "America needs to lead in AI to make sure that our warriors have every advantage," which sounds anti-aligned.
