To understand, plan, and communicate about research on the superintelligence control problem, I’ve found it helpful to split the field into three areas, according to the kinds of questions that each piece of work is ultimately trying to answer and the ways that those answers will help to mitigate risk. Thinking in terms of these areas also helps me to keep my bearings and to see how specific ideas fit into the broader picture. For the moment, the three areas I find usefully distinct are technical foresight, strategy, and system design.
- Technical foresight: by understanding the potential properties of superintelligent machines as precisely and methodically as possible, this work helps us to understand and communicate about the risks we could face. This understanding can inform requirements and options in strategy and system design; can help to gain support for further research investment and for implementing strategies; and can attract people with high standards of rigor to the field, especially if the work uses good methodologies and standards.
- Strategy: it is not yet clear how we as a society can best mitigate the risk of losing control of superintelligent AI systems; more strategic work is needed on this question. This work can also give us perspective to help steer technical foresight and system design work, by clarifying what the most relevant questions in those areas are likely to be. It can also attract resources, as well as people who currently don’t see any way to mitigate these risks effectively.
- System design: the direct impact of this work comes from enabling us to implement strategies that require particular technical capabilities, such as constructing reliable superintelligent machines or containing potentially superintelligent ones. This object-level knowledge could be especially impactful if machine superintelligence arrives surprisingly early. System design and engineering research can also inform policy requirements by showing which kinds of options are more or less likely to be feasible, and it has been successful in attracting people and resources for further research.
I’d love to hear any comments from people on this forum on the content and format, and to answer any questions you have on the topic!