Ben_West🔸

Member of Technical Staff @ METR

Bio

Non-EA interests include chess and TikTok (@benthamite). We are probably hiring: https://metr.org/hiring 

How others can help me

Feedback always appreciated; feel free to email/DM me or use this link if you prefer to be anonymous.

Posts (87, sorted by new)

Sequences (3)

AI Pause Debate Week
EA Hiring
EA Retention

Comments (1082)

Topic contributions (6)

Fixed the link. I also tried your original prompt and it worked for me.

But interesting! The "Harder word, much vaguer clue" instruction seems to prompt it not to actually play hangman, but instead to antagonistically construct a word post hoc after each guess so that your guess comes out wrong. I asked "Did you come up with a word when you first told me the number of letters or are you changing it after each guess?" and it said "I picked the word up front when I told you it was 10 letters long, and I haven’t changed it since. You’re playing against that same secret word the whole time." (Despite me being able to see from its reasoning trace that this is not what it's doing.) When I said I gave up, it said "I’m sorry—I actually lost track of the word I’d originally picked and can’t accurately reveal it now." (Because it realized that there was no word consistent with its clues, as you noted.)

So I don't think it's correct to say that it doesn't know how to play hangman. (It knows, as you noted yourself.) It just wants so badly to make you lose that it lies about the word.

Huh interesting, I just tried that direction and it worked fine as well. This isn't super important but if you wanted to share the conversation I'd be interested to see the prompt you used.

By analogy, o4-mini's inability to play hangman is a sign that it's far from artificial general intelligence (AGI)

What is your source for this? I just tried and it played hangman just fine.

Given that some positions in EA leadership are already elected, I might suggest changing the wording to something like:

There should be an international body whose power is roughly comparable to CEA's and whose leadership is elected

I think I agree with your overall point but some counterexamples:

  1. EA Criticism and Red Teaming Contest winners. E.g. GiveWell said "We believe HLI’s feedback is likely to change some of our funding recommendations, at least marginally, and perhaps more importantly improve our decision-making across multiple interventions"
  2. GiveWell said of their Change Our Mind contest "To give a general sense of the magnitude of the changes we currently anticipate, our best guess is that Matthew Romer and Paul Romer Present's entry will change our estimate of the cost-effectiveness of Dispensers for Safe Water by very roughly 5 to 10% and that Noah Haber's entry may lead to an overall shift in how we account for uncertainty (but it's too early to say how it would impact any given intervention)."
  3. HLI discussed some meaningful ways they changed as a result of criticism here.

This is cool, I like BHAGs in general and this one in particular. Do you have a target for when you want to get to 1M pledgers?

If you manage to convince an investor that timelines are very short without simultaneously convincing them to care a lot about x-risk, I feel like their immediate response will be to rush to invest briefcases full of cash into the AI race, thus helping make timelines shorter and more dangerous. 

I'm the corresponding author for a paper that Holly is maybe subtweeting, and I was worried about this before publication, but I don't really feel like those fears were realized.

Firstly, I don't think there are actually very many people who sincerely think that timelines are short but aren't scared by that. I think what you are referring to is people who think "timelines are short" means something like "AI companies will 100x their revenue in the next five years", not "AI companies will be capable of instituting a global totalitarian state in the next five years." There are some people who believe the latter and aren't bothered by it but in my experience they are pretty rare.

Secondly, when VCs get the "AI companies will 100x their revenue in the next five years" version of short timelines, they seem to want to invest in LLM-wrapper startups, which makes sense because almost all VC firms lack the AUM to invest in the big labs.[1] I think there are plausible ways in which this makes timelines shorter and more dangerous, but it seems notably different from investing in the big labs.[2]

Overall, my experience has mostly been that getting people to take short timelines seriously is very close to synonymous with getting them to care about AI risk.

  1. ^

    Caveat that ~everyone has the AUM to invest in publicly traded stocks. I didn't notice any bounce in share price for e.g. NVDA when we published and would be kind of surprised if there was a meaningful effect, but hard to say.

  2. ^

    Of course, there's probably some selection bias in terms of who reaches out to me. Masayoshi Son probably feels like he has better info than what I could publish, but by that same token me publishing stuff doesn't cause much harm.

Do you think that distancing is ever not in the interest of both parties? If so, what is special about Anthropic/EA?

(I think it's plausible that the answer is that distancing is always good; the negative risks of tying your reputation to someone always exceed the positive. But I'm not sure.)

Thanks for doing this, Saulius! I have been wondering about modeling the cost-effectiveness of animal welfare advocacy under assumptions of relatively short AI timelines. It seems like one possible way of doing this is to change the "Yearly decrease in probability that commitment is relevant" numbers in your sheet (cells I28:30). Do you have any thoughts on that approach? A rough sketch of the arithmetic I have in mind is below.
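To make that concrete, here is a minimal sketch of the idea (the geometric-decay assumption and both rates are mine for illustration, not values taken from the sheet): raising the yearly decrease sharply compresses the expected number of years a commitment stays relevant, which is roughly what a short-timelines assumption would do.

```python
# Illustrative sketch, not the actual model: treat the "yearly decrease in
# probability that commitment is relevant" as a geometric discount on each
# future year of impact, then compare a baseline rate with a higher rate
# meant to encode short AI timelines. Both rates are made up.

def expected_relevant_years(yearly_decrease: float, horizon: int = 50) -> float:
    """Expected number of years the commitment is still relevant, assuming
    the probability of relevance decays geometrically each year."""
    return sum((1 - yearly_decrease) ** t for t in range(horizon))

baseline = expected_relevant_years(0.05)         # ~18.5 relevant years
short_timelines = expected_relevant_years(0.25)  # ~4.0 relevant years
print(baseline, short_timelines)
```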
