MikhailSamin

Claude 3 claims it's conscious, doesn't want to die or be modified

· 6mo ago

FTX expects to return all customer money; clawbacks may go away

· 9mo ago

An EA used deceptive messaging to advance her project; we need mechanisms to avoid deontologically dubious plans

· 9mo ago · 2m read

NYT is suing OpenAI&Microsoft for alleged copyright infringement; some quick thoughts

· 9mo ago · 6m read

Some quick thoughts on "AI is easy to control"

· 11mo ago

It's OK to eat shrimp: EAs Make Invalid Inferences About Fish Qualia and Moral Patienthood

· 1y ago

A transcript of the TED talk by Eliezer Yudkowsky

· 1y ago · 9m read

Where I Am Donating in 2024

· 1y ago

Comments
62

MikhailSamin · 6d · 30

Note that we've only received a speculation grant from the SFF and haven’t received any s-process funding. This should be a downward update on the value of our work and an upward update on a marginal donation's value for our work.

I'm waiting for feedback from SFF before actively fundraising elsewhere, but I'd be excited about getting in touch with potential funders and volunteers. Please message me if you want to chat! My email is ms@contact.ms, and you can find me everywhere else or send a DM on EA Forum.

On other organizations, I think:

MIRI’s work is very valuable. I’m optimistic about what I know about their comms and policy work. As Malo noted, they work with policymakers, too. Since 2021, I’ve donated over $60k to MIRI. I think they should be the default choice for donations unless they say otherwise.
OpenPhil risks increasing polarization and making it impossible to pass meaningful legislation. But while they make what are, IMO, obviously bad decisions, not everything they/Dustin fund is bad. E.g., Horizon might place people who actually care about others in places where they could have a huge positive impact on the world. I’m not sure about this, and I would love to see Horizon fellows become more informed on AI x-risk than they currently are, but I’ve donated $2.5k to Horizon Institute for Public Service this year.
I’d be excited about the Center for AI Safety getting more funding. SB-1047 was the closest we got to a very good thing, AFAIK, and it was a coin toss on whether it would’ve been signed or not. They seem very competent. I think the occasional potential lack of rigor and other concerns don't outweigh their results. I’ve donated $1k to them this year.
By default, I'm excited about the Center for AI Policy. A mistake they plausibly made makes me somewhat uncertain about how experienced they are with DC and whether they are capable of avoiding downside risks, but I think the people who run it are smart and have very reasonable models. I'd be excited about them having as much money as they can spend and hiring more experienced and competent people.
PauseAI is likely to be net-negative, especially PauseAI US. I wouldn’t recommend donating to them. Some of what they're doing is exciting (and there are people who would be a good fit to join them and improve their overall impact), but they're incapable of avoiding actions that might, at some point, badly backfire.
I’ve helped them where I could, but they don’t have good epistemics, and they’re fine with using deception to achieve their goals.
E.g., at some point, their website represented the view that it’s more likely than not that bad actors would use AI to hack everything, shut down the internet, and cause a societal collapse (but not extinction). If you talk to people with some exposure to cybersecurity and say this sort of thing, they’ll dismiss everything else you say, and it’ll be much harder to make a case for AI x-risk in the future. PauseAI Global’s leadership updated when I had a conversation with them and edited the claims, but I'm not sure they have mechanisms to avoid making confident wrong claims. I haven't seen evidence that PauseAI is capable of presenting their case for AI x-risk competently (though it's been a while since I've looked).
I think PauseAI US is especially incapable of avoiding actions with downside risks, including deception^[1], and donations to them are net-negative. To Michael, I would recommend, at the very least, donating to PauseAI Global instead of PauseAI US; to everyone else, I'd recommend ideally donating somewhere else entirely.
Stop AI's views include the idea that a CEV-aligned AGI would be just as bad as an unaligned AGI that causes human extinction. I wouldn't be able to pass their ITT, but yep, people should not donate to Stop AI. The Stop AGI person participated in organizing the protest described in the footnote.

^[1]
In February this year, PauseAI US organized a protest against OpenAI "working with the Pentagon", while OpenAI had only collaborated with DARPA on open-source cybersecurity tools and was in talks with the Pentagon about veteran suicide prevention. Most participants wanted to protest OpenAI because of AI x-risk and not because of the Pentagon, but those I talked to said they felt the protest was deceptive once they discovered the nature of OpenAI's collaboration with the Pentagon. Also, Holly threatened me in an attempt to prevent the publication of a post about this and then publicly lied about our conversations, in a way that can be easily falsified by looking at the messages we've exchanged.

Samin's Quick takes

MikhailSamin · 8d · 1

Effective giving

(Haven’t thought about this really, might be very wrong, but have this thought and seems good to put out there.) I feel like putting 🔸 at the end of social media names might be bad. I’m curious what the strategy was.

The willingness to do this might be anti-correlated with status. It might be a less important part of the identity of more important people. (E.g., would you expect Sam Harris, who is a GWWC pledger, to do this?)
I’d guess that ideally, we want people to associate the GWWC pledge with role models (+ know that people similar to them take the pledge, too).
Anti-correlation with status might mean that people will associate the pledge with average, though altruistic, Twitter users, not with cool people they want to be more like.
You won’t see a lot of e/accs putting the 🔸 in their names. There might be downsides to a group of people being perceived as clearly outlined and having this as an almost political identity; it seems bad to have directionally-political properties that might do mind-killing things both to people with 🔸 and to people who might argue with them.

[Linkpost] An update from Good Ventures

MikhailSamin · 5mo · 2

Can you give an example of a non-PR risk that you had in mind?

It's OK to eat shrimp: EAs Make Invalid Inferences About Fish Qualia and Moral Patienthood

MikhailSamin · 6mo · 5

Uhm, for some reason I have four copies of this crosspost on my profile?

If trying to communicate about AI risks, make it vivid

If fish indeed don’t feel anything towards their children (which is not what at least some people who believe fish experience empathy think), then this experiment won’t prove them wrong. But if you know of a situation where fish do experience empathy, a similarly designed experiment can likely be conducted, which, if we make different predictions, would provide evidence one way or another. Are there situations where you think fish feel empathy?

It's OK to eat shrimp: EAs Make Invalid Inferences About Fish Qualia and Moral Patienthood

Great job!

Did you use causal mediation analysis, and can you share the data?

I want to note that the strawberry example wasn’t used to increase the concern, it was used to illustrate the difficulty of a technical problem deep into the conversation.

I encourage people to communicate in vivid ways while being technically valid and creating correct intuitions about the problem. The concern about risks might be a good proxy if you’re sure people understand something true about the world, but it’s not a good target without that constraint.

It's OK to eat shrimp: EAs Make Invalid Inferences About Fish Qualia and Moral Patienthood

Yep, I was able to find studies by the same people.

The experiment I suggested in the post isn’t “do fish have detectable feelings towards their children”, it’s “do fish have more of the feelings they have towards their own children when they see other fish parents with their children than when they see just other fish children”. Results one way or another would be evidence about fish experiencing empathy, and it would be strong enough for me to stop eating fish. If fish don’t feel differently in the presence of their children, the experiment wouldn’t provide evidence one way or another.