Wooo super happy you managed to organise this!
Agreed. I don't know if you meant this too, but I also think that focusing on one particular person who has a lot of influence among the members of their local EA group or organisation, or more generally building a cult of personality around a few leading figures of the movement, can be dangerous in the long run. SBF is arguably an example of the unilateralist's curse.
Please red-team my comment, I may be talking nonsense but:
Actually, my guess is that the probability of the threat being executed is a decreasing function of the intelligence gap between the two AGIs.
The reasoning behind this claim is as follows:
However, if this claim is true, it leads to a kind of paradox:
If timelines to AGI are short and we're in a fast take-off scenario, then x-risks from conflict scenarios between AGIs are less likely, but at the same time it's more likely that AGI will not be aligned, which could itself lead to catastrophic outcomes, including x-risks. (I try to write this tension out in symbols at the end of this comment.)
The premises behind this statement are the following:
Red-team this comment please! I'm pretty sure I've misunderstood something or perhaps there is a logical flaw in my reasoning.
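To make the tension above more explicit, here is a rough way to write it in symbols (my own notation and assumptions, not a formal model): let Δ be the intelligence gap between the two AGIs and s the speed of take-off.

```latex
% Rough sketch of the claimed tension (my own notation, not a formal model)
\[
  \underbrace{\frac{\partial}{\partial \Delta} P(\text{threat executed} \mid \Delta) < 0}_{\text{my guess above}},
  \qquad
  \underbrace{\frac{\partial \Delta}{\partial s} > 0}_{\text{faster take-off widens the gap}},
  \qquad
  \underbrace{\frac{\partial}{\partial s} P(\text{AGI misaligned} \mid s) > 0}_{\text{less time to solve alignment}}.
\]
```

So a faster take-off would make x-risks from AGI-vs-AGI conflict less likely (via a larger gap) while making x-risks from misalignment more likely, which is the paradox I'm pointing at.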
According to the CLR, since resource acquisition is an instrumental goal regardless of the AGI's utility function, it could lead to a race in which each AGI threatens the others so that the target has an incentive to hand over resources or comply with the threatener's demands. Is such a conflict scenario (potentially leading to x-risks) between two AGIs possible if the two AGIs have different intelligence levels? If so, isn't there a level of intelligence gap at which x-risks become unlikely? How should we characterize this function (the probability of the threat being executed as a function of the intelligence gap between the two AGIs)? In other words, the question here is roughly: how does the distribution of agent intelligence affect the threat dynamics? Has any work already been done on this?
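In case it helps clarify the kind of function I mean, here is a purely illustrative toy parameterisation (the logistic shape and the numbers are my own assumptions, not anything from the CLR):

```python
import math

# Toy illustration only: model the probability that a threat is actually
# executed as a decreasing function of the intelligence gap between the
# threatener and the target. The logistic shape is an arbitrary choice.
def p_threat_executed(intelligence_gap: float, steepness: float = 1.0) -> float:
    """Equal intelligence -> 0.5; the probability decays towards 0 as the gap grows."""
    return 1.0 / (1.0 + math.exp(steepness * intelligence_gap))

for gap in [0.0, 0.5, 1.0, 2.0, 5.0]:
    print(f"gap={gap}: P(threat executed) ~ {p_threat_executed(gap):.3f}")
```

The real question is of course what shape (if any) this function actually has, which is what I'd like pointers on.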
If an AGI has several terminal goals, how does it prioritise between them? Some kind of linear combination?
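To illustrate what I mean by a linear combination, a minimal sketch (the world-state type and the example goals are made up):

```python
from typing import Callable, Dict, Sequence

State = Dict[str, float]  # hypothetical world-state representation

def combined_utility(
    goals: Sequence[Callable[[State], float]],  # one utility function per terminal goal
    weights: Sequence[float],                   # fixed trade-off weights
) -> Callable[[State], float]:
    """U(s) = sum_i w_i * u_i(s): one simple way to aggregate several terminal goals."""
    def utility(state: State) -> float:
        return sum(w * g(state) for w, g in zip(weights, goals))
    return utility

# Example with two made-up goals:
u = combined_utility(
    goals=[lambda s: s["paperclips"], lambda s: -s["energy_used"]],
    weights=[1.0, 0.1],
)
print(u({"paperclips": 100.0, "energy_used": 50.0}))  # 100 - 5 = 95
```

But it's not obvious to me that a fixed weighted sum is the right (or only) way to do it, hence the question.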
I have the feeling that there is a tendency in the AI safety community to think that if we solve the alignment problem, we're done and the future must necessarily be flourishing (I observe that some EAs say that either we go extinct or it's heaven on earth depending on the alignment problem, in a very binary way). However, it seems to me that post-aligned-AGI scenarios merit attention as well: game theory gives us sufficient reason to think that even rational agents (in this case two or more AGIs) can reach sub-optimal outcomes (including catastrophic ones) when faced with a social dilemma. Any thoughts on this, please?
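To spell out the game-theoretic point in the simplest two-agent case, here is the standard one-shot Prisoner's Dilemma (textbook payoffs, nothing AGI-specific):

```python
# One-shot Prisoner's Dilemma: mutual defection is the unique Nash equilibrium
# even though mutual cooperation is strictly better for both players.
payoffs = {  # (row_action, col_action) -> (row_payoff, col_payoff)
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}

def best_response(opponent_action: str) -> str:
    # The row player's best reply, given what the opponent does.
    return max(("C", "D"), key=lambda a: payoffs[(a, opponent_action)][0])

# Defecting is the best reply both to cooperation and to defection...
assert best_response("C") == "D" and best_response("D") == "D"
# ...so two perfectly rational agents end up at (D, D) with payoffs (1, 1),
# even though (C, C) would have given them (3, 3) each.
print("Equilibrium payoffs:", payoffs[("D", "D")], "vs cooperative payoffs:", payoffs[("C", "C")])
```

The same qualitative problem (individually rational play leading to collectively bad outcomes) can arise with more than two agents and with much higher stakes, which is why I think post-alignment multi-AGI scenarios deserve attention.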
What a great initiative! There is a lot to do in this area: from energy and climate change issues to AI safety, engineers have very relevant knowledge and skills for tackling some global catastrophic risks. I think this will definitely be of interest to people in the "EA maths&physics" Slack group!
Thanks for this helpful post!