
RohanS

55 karma · Joined Apr 2021

Posts: 2


Comments: 4

I think both things happen in different contexts. (That is, being socially rewarded just for saying you care about AI Safety, and not being taken seriously because it seems like you haven’t thought it through carefully.)

Thanks so much for writing this post, Dave; I find it really helpful for pinning down some of the perceived and real issues with the EA community.

I think some people have two stable equilibria: one being ~“do normal things” and the other being “take ideas seriously” (obviously an oversimplification). Getting from the former to the latter often requires some pressure, but the latter can be inhabited without sacrificing good epistemics and can be much more impactful. Plus, people who make this transition often end up grateful for it and wish they’d made it earlier.

I think other people basically don’t have these two stable equilibria, but some of them have an unstable equilibrium for taking ideas seriously which is epistemically unsound. It becomes stable through social dynamics rather than through thinking the ideas through carefully, which is bad… but also potentially good for the world if they can do good work despite the unsound epistemic foundation.

This is messy and I don’t straightforwardly endorse it, but I also can’t honestly say it’s obvious to me that we should always prioritize pure epistemic health when it trades off against impact here. Reducing “the kind of outreach and social pressure that harms epistemic health” might also reduce the number of both kinds of people who take ideas seriously. Maybe there is no tradeoff; maybe this is ultimately bad from an impact perspective too, or maybe there’s a way to get the good without the bad. But that’s not clear to me, and I would love to hear anyone’s suggestions.

(The stable vs. unstable equilibrium concept isn’t described exactly right, but I think the point is clear.)

Answer by RohanS · Jul 20, 2023

Does he think all the properties of superintelligent systems that will be relevant for the success of alignment strategies already exist in current systems? That they will exist in systems within the next 4 years? (If not, aren't there extremely important limitations to our ability to empirically test the strategies and figure out if they are likely to work?)

Answer by RohanS · Apr 28, 2021

I think Asimov’s Foundation is a great example of this! It features the establishment of a space colony with the goal of drastically shortening an age of galactic barbarism, and the ways that select inhabitants very cleverly help the colony survive various threats.