MikhailSamin

executive director @ AI Governance and Safety Institute

494 karma · Joined Apr 2019

contact.ms

Interests:

AI alignment, AI safety, AI governance

Bio

Are you interested in AI X-risk reduction and strategies? Do you have experience in comms or policy? Let’s chat!

aigsi.org develops educational materials and ads that most efficiently communicate core AI safety ideas to specific demographics, with a focus on producing a correct understanding of why smarter-than-human AI poses a risk of extinction. We plan to increase and leverage understanding of AI and existential risk from AI to impact the chance of institutions addressing x-risk.

Early results include ads that achieve a cost of $0.10 per click (to a website that explains the technical details of why AI experts are worried about extinction risk from AI) and $0.05 per engagement on ads that share simple ideas at the core of the problem.

Personally, I’m good at explaining existential risk from AI to people, including policymakers. I once changed the minds of 3/4 of the people I talked to at an e/acc event.

Previously, I got 250k people to read HPMOR and sent 1.3k copies to winners of math and computer science competitions (including dozens of IMO and IOI gold medalists); took the GWWC pledge; and created a small startup that donated >$100k to effective nonprofits.

I have a background in ML and strong intuitions about the AI alignment problem. I grew up running political campaigns and have a bit of a security mindset.

My website: contact.ms

You’re welcome to schedule a call with me before or after the conference: contact.ms/ea30

Posts
15

Samin's Quick takes

MikhailSamin

· 3y ago · 1m read

How to Give in to Threats (without incentivizing them)

MikhailSamin

· 3mo ago

Superintelligence's goals are likely to be random

MikhailSamin

· 4mo ago

No one has the ball on 1500 Russian olympiad winners who've received HPMOR

MikhailSamin

· 5mo ago

Claude 3 claims it's conscious, doesn't want to die or be modified

MikhailSamin

· 1y ago

FTX expects to return all customer money; clawbacks may go away

MikhailSamin

· 1y ago · 2m read

An EA used deceptive messaging to advance her project; we need mechanisms to avoid deontologically dubious plans

MikhailSamin

· 1y ago · 6m read

NYT is suing OpenAI&Microsoft for alleged copyright infringement; some quick thoughts

MikhailSamin

· 2y ago

Some quick thoughts on "AI is easy to control"

MikhailSamin

· 2y ago

-4

It's OK to eat shrimp: EAs Make Invalid Inferences About Fish Qualia and Moral Patienthood

MikhailSamin

· 2y ago · 9m read

A transcript of the TED talk by Eliezer Yudkowsky

MikhailSamin

· 2y ago

Comments
75

Samin's Quick takes

MikhailSamin · 2mo · 26

The original commitment was (IIRC!) about defining the thresholds, not about mitigations. I didn’t notice ASL-4 when I briefly checked the RSP table of contents earlier today, and I trusted the reporting on this from Obsolete. I apologized and retracted the take on LessWrong, but forgot I had posted it here as well. I want to apologize to everyone here, too: I was wrong.

Samin's Quick takes

MikhailSamin · 2mo · 1

Existential risk

In its RSP, Anthropic committed to defining ASL-4 by the time they reach ASL-3.

With Claude 4 released today, they have reached ASL-3. They haven’t yet defined ASL-4.

Turns out, they have quietly walked back the commitment. The change happened less than two months ago and, to my knowledge, was not announced on LW or in other visible places, unlike other important changes to the RSP. It’s also not in the changelog on their website; in the description of the relevant update, they say they added a new commitment but don’t mention removing this one.

Anthropic’s behavior is not at all the behavior of a responsible AI company. Trained a new model that reaches ASL-3 before you can define ASL-4? No problem, update the RSP so that you no longer have to, and basically don’t tell anyone. (Did anyone not working for Anthropic know the change happened?)

When their commitments go against their commercial interests, we can’t trust their commitments.

You should not work at Anthropic on AI capabilities.

[This comment is no longer endorsed by its author]

Contracting Opportunity: Be a shortform video editor for the new 80,000 Hours Video Program (even if you haven't edited before!)

MikhailSamin · 3mo · 3

This is awesome to see!

Samin's Quick takes

MikhailSamin · 3mo · 3

AI safety

I do not believe Anthropic as a company has a coherent and defensible view on policy. It is known that, while hiring people, they said things they did not stand by (they claim to have good internal reasons for changing their minds, but people went to work for them because of impressions that Anthropic created and then decided not to uphold). It is known in policy circles that Anthropic's lobbyists are similar to OpenAI's.

From Jack Clark, a billionaire co-founder of Anthropic and its chief of policy, today:

Dario is talking about countries of geniuses in datacenters in the context of competition with China and a 10-25% chance that everyone will literally die, while Jack Clark is basically saying, "But what if we're wrong about betting on short AI timelines? Security measures and pre-deployment testing will be very annoying, and we might regret them. We'll have slower technological progress!"

This is not invalid in isolation, but Anthropic is a company that was built on the idea of not fueling the race.

Do you know what would stop the race? Getting policymakers to clearly understand the threat models that many of Anthropic's employees share.

It's ridiculous and insane that, instead, Anthropic is arguing against regulation because it might slow down technological progress.

Discussion Thread: Existential Choices Debate Week

MikhailSamin · 3mo · 1

100% agree

Our lightcone is an enormous endowment. We get to have a lot of computation, in a universe with simple physics. What these resources are spent on matters a lot.

If we get AI right (create a CEV-aligned ASI), we get most of the utility out of these resources automatically (almost tautologically, see CEV: to the extent that, after considering all the arguments and reflecting, we think we ought to value something, that is what CEV points to as an optimization target). If it takes us a long time to get AI right, we lose a literal galaxy of resources every year, but this is approximately nothing in relative terms.

If we die because of AI, we get ~0% of the possible value/max CEV.

Increasing the chance AI goes well is what's important. Work to marginally shift the % around the maximum seems relatively unimportant compared to the chance AI goes well. Whether we die because of AI is the largest input.

(I find a negative % of CEV very implausible, both because it almost never makes sense to spend resources on penalizing another agent's utility if that agent is smart enough to make it not worth it, and for other, more speculative reasons.)

No one has the ball on 1500 Russian olympiad winners who've received HPMOR

MikhailSamin · 5mo · 4

21k copies/61k hardcover books, each book ~630 pages long, yep!

I agree that most of the impact is from a fun attraction to adjacent ideas, not from what the book itself communicates.

No connection to the grant, yep.

It was a crowdfunding campaign, and I committed to spend at least as much on books and shipping costs (including to libraries and for educational/science popularization purposes) as we received through the campaign. We then ran out of that money and had to spend our own (about 2.2m rubles so far) to send the books to olympiad winners and libraries, and also to buy a bunch of copies of Human Compatible and The Precipice (we were able to get discounted prices). On average, it costs us around $5 to deliver a copy to a door.

We've distributed around 15k copies in total so far, most to the crowdfunding participants.

Where I Am Donating in 2024

MikhailSamin · 5mo · 6

I wouldn't include OpenAI/Anthropic's lobbying efforts in the "EA's lobbying behind closed doors" category. What evidence do you have for movement in that direction among actual EA orgs?

No one has the ball on 1500 Russian olympiad winners who've received HPMOR

MikhailSamin · 5mo · 3

I'm confused about this discrepancy between LessWrong and EA Forum. (Feedback is welcome!)

No one has the ball on 1500 Russian olympiad winners who've received HPMOR

MikhailSamin · 5mo · 33

Anecdotally, approximately everyone with Russian origins who's now working on AI safety got into it because of HPMOR. Just a couple of days ago, an IOI gold medalist reached out to me; they've been going through ARENA.

HPMOR tends to make people with that kind of background act more on trying to save the world. It also gives some intuitive sense for some related stuff (up to "oh, like the mirror from HPMOR?"), but this is a lot less central than giving people the ~EA values and making them actually do stuff.

(Plus, at this point, the book is well-known enough in some circles that some % of future Russian ML researchers would be a lot easier to alignment-pill and persuade to not work on something that might kill everyone or prompt other countries to build something that kills everyone.

Like, the largest Russian broker decided to celebrate the New Year by advertising HPMOR and citing Yudkowsky.)

I'm not sure how universal this is (the kind of Russian kid who is into math/computer science is the kind of kid who would often be into the HPMOR aesthetics), but it seems to work.

I think many past IMO/IOI medalists are generally very capable and can help, and it's worth looking at the list of them and reaching out to people who've read HPMOR (and possibly The Precipice/Human Compatible) and getting them to work on AI safety.

No one has the ball on 1500 Russian olympiad winners who've received HPMOR

MikhailSamin · 5mo · 4

We also have 6k more copies (18k hardcover books) left. We have no idea what to do with them. Suggestions are welcome.

Here's a map of the Russian libraries that requested copies of HPMOR; we've sent them 2126 copies:

Sending HPMOR to random libraries is cool, but I hope someone comes up with better ways of using the books.
