Isaac Dunn

553 karma


I think misaligned AI values should be expected to be worse than human values, because it's not clear that misaligned AI systems would care about, e.g., their own welfare.

Inasmuch as we expect misaligned AI systems to be conscious (or whatever we need to care about them) and also to be good at looking after their own interests, I agree that it's not clear from a total utilitarian perspective that the outcome would be bad.

But the "values" of a misaligned AI system could be pretty arbitrary, so I don't think we should expect that.

This is a true, counterfactual match, and we will only receive the equivalent amount to what we can raise.

What will happen to the money counterfactually? Presumably it will be donated to other things the match funder thinks are roughly as good as GWWC?

Is this a problem? Seems fine to me, because the meaning is often clear, as in two of your examples, and I think it adds value in those contexts. And if it's not clear, doesn't seem like a big loss compared to a counterfactual of having none of these types of vote available.

I think that trying to get safe concrete demonstrations of risk by doing research seems well worth pursuing (I don't think you were saying it's not).

Do you have any thoughts on how people should decide between working on groups at CEA and running a group on the ground themselves?

I imagine a lot of people considering applying could be asking themselves that question, and it doesn't seem obvious to me how to decide.

To be fair, I think I'm partly making wrong assumptions about what exactly you're arguing for here.

On a slightly closer read, you don't actually argue in this piece that it's as high as 90% - I assumed that because I think you've argued for that previously, and I think that's what "high" p(doom) normally means.

Relatedly, I also think that your arguments for "p(doom|AGI)" being high aren't convincing to people who don't share your intuitions, and it looks like you're relying on those (imo weak) arguments, when actually you don't need to.

I think you come across as over-confident, not alarmist, and I think it hurts how you come across quite a lot. (We've talked a bit about the object level before.) I'd agree with John's suggested approach.

Makes sense. To be clear, I think global health is very important, and I think it's a great thing to devote one's life to! I don't think it should be underestimated how big a difference you can make improving the world now, and I admire people who focus on making that happen. It just happens that I'm concerned the future might be an even higher-priority thing that many people could be in a good position to address.

On your last point, if you believe that the EV from an "effective neartermism -> effective longtermism" career change is greater than a "somewhat harmful career -> effective neartermism" career change, then the downside of using a "somewhat harmful career -> effective longtermism" example is that people might think the "stopped doing harm" part is more important than the "focused on longtermism" part.

More generally, I think your "arguments for the status quo" seem right to me! I think it's great that you're thinking clearly about the considerations on both sides, and my guess is that you and I would just weight these considerations differently.
