Has there been any good, serious game-theoretic modeling of what 'AI alignment' would actually look like, given diverse & numerous AI systems interacting with billions of human individuals and millions of human groups that have diverse, complex, & heterogeneous values, preferences, and goals?

Are there any plausible models in which the AI systems, individuals, and groups can reach any kind of Pareto-efficient equilibrium? 

Or is there any impossibility proof showing that such a Pareto-efficient equilibrium (i.e. true 'alignment') cannot exist?
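
To make "Pareto-efficient equilibrium" concrete, here is a minimal toy sketch (purely illustrative; the two games and their payoffs are made up and obviously don't capture billions of heterogeneous agents): it finds the pure-strategy Nash equilibria of two small games and checks whether each equilibrium is Pareto-efficient.

```python
# Toy sketch (illustrative only): find pure-strategy Nash equilibria of a
# small normal-form game and check which of them are Pareto-efficient.
# In the Prisoner's Dilemma the unique equilibrium is Pareto-dominated;
# in the Stag Hunt one of the two equilibria is Pareto-efficient.

from itertools import product

def pure_nash_equilibria(payoffs, n_actions):
    """Return all pure-strategy profiles where no player gains by deviating."""
    equilibria = []
    for profile in product(*(range(a) for a in n_actions)):
        is_eq = True
        for player, n in enumerate(n_actions):
            current = payoffs[profile][player]
            for alt in range(n):
                deviation = list(profile)
                deviation[player] = alt
                if payoffs[tuple(deviation)][player] > current:
                    is_eq = False
                    break
            if not is_eq:
                break
        if is_eq:
            equilibria.append(profile)
    return equilibria

def pareto_efficient(payoffs, profile):
    """A profile is Pareto-efficient if no other profile makes every player
    at least as well off and at least one player strictly better off."""
    u = payoffs[profile]
    for v in payoffs.values():
        if all(v[i] >= u[i] for i in range(len(u))) and any(v[i] > u[i] for i in range(len(u))):
            return False
    return True

# Prisoner's Dilemma: action 0 = cooperate, 1 = defect (made-up payoffs).
pd = {
    (0, 0): (3, 3), (0, 1): (0, 5),
    (1, 0): (5, 0), (1, 1): (1, 1),
}

# Stag Hunt: action 0 = stag, 1 = hare (made-up payoffs).
stag_hunt = {
    (0, 0): (4, 4), (0, 1): (0, 3),
    (1, 0): (3, 0), (1, 1): (3, 3),
}

for name, game in [("Prisoner's Dilemma", pd), ("Stag Hunt", stag_hunt)]:
    for eq in pure_nash_equilibria(game, (2, 2)):
        print(f"{name}: equilibrium {eq}, Pareto-efficient: {pareto_efficient(game, eq)}")
```

The question, roughly, is whether anything like the Stag Hunt's efficient equilibrium (rather than the Prisoner's Dilemma's inefficient one) is achievable at the scale of many AI systems and many human agents with heterogeneous values.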

1 Answer

CLR (https://longtermrisk.org/) works on multipolar scenarios, multi-agent systems, and game theory, covering both technical problems and macrostrategy, and prioritizes reducing conflicts that increase s-risks. The associated foundation (https://www.cooperativeai.com/) supports work on similar problems.

For a technical paper on Pareto improvements, see https://link.springer.com/article/10.1007/s10458-022-09574-6
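The linked paper's setting is richer than this, but as a bare-bones illustration of what "Pareto improvement" means (with made-up agents and payoffs), here is a sketch that checks whether an alternative outcome improves on a status quo:

```python
# Toy sketch (illustrative only): an alternative is a Pareto improvement on
# the status quo if it leaves no agent worse off and makes at least one
# agent strictly better off. All names and payoffs below are made up.

status_quo = {"agent_A": 1, "agent_B": 1, "agent_C": 2}

alternatives = {
    "deal_1": {"agent_A": 3, "agent_B": 1, "agent_C": 2},  # helps A, harms no one
    "deal_2": {"agent_A": 4, "agent_B": 0, "agent_C": 5},  # harms B, so not an improvement
}

def is_pareto_improvement(new, old):
    no_one_worse = all(new[a] >= old[a] for a in old)
    someone_better = any(new[a] > old[a] for a in old)
    return no_one_worse and someone_better

for name, deal in alternatives.items():
    print(name, is_pareto_improvement(deal, status_quo))
```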

CLR and CRS (https://centerforreducingsuffering.org/) have also worked on risks from malevolent actors (https://forum.effectivealtruism.org/posts/LpkXtFXdsRd4rG8Kb/reducing-long-term-risks-from-malevolent-actors). I'm not sure whether s-risks from sadism or retributivism are being worked on, but they're discussed briefly here: https://centerforreducingsuffering.org/research/a-typology-of-s-risks/

I imagine this work often focuses on a small number of groups, but maybe it generalizes. I'm not aware of more concrete, realistic models (as opposed to toy models, or models that aren't aiming to capture likely preferences and values), but I wouldn't be surprised if they exist. This isn't my area of focus, so I'm not that well informed. I imagine AI safety/governance groups, and especially CLR and CRS, are thinking about this, but they may not have built explicit models.

Michael - thanks very much for these links. I'll check them out!