Updated 13 September 2022 with a link to our arXiv paper and corrections to out-of-date items
This is a linkpost to our working paper “Towards AI Standards Addressing AI Catastrophic Risks: Actionable-Guidance and Roadmap Recommendations for the NIST AI Risk Management Framework”, which we co-authored with our UC Berkeley colleagues Jessica Newman and Brandie Nonnecke. Here is a link:
- pdf on arXiv (55 pp, last revised 7 September 2022)
We seek feedback from readers considering catastrophic risks as part of their work on AI safety and governance. Please email feedback to Tony Barrett at anthony.barrett@berkeley.edu.
If you are providing feedback on the draft guidance in this document, it would be particularly helpful if, in addition to any comments via email, you answered the questions in Appendix 2 of this document or in the following Google Form:
https://docs.google.com/forms/d/1XHshf_mKBijbP5RZI2JAlclnE6LZJrujDYWUANDt090/
We may update the links or content in this post to reflect the latest version of the document.
Background on the NIST AI RMF
The National Institute of Standards and Technology (NIST) is currently developing the NIST Artificial Intelligence Risk Management Framework, or AI RMF. NIST intends the AI RMF as voluntary guidance on AI risk assessment and other AI risk management processes for AI developers, users, deployers, and evaluators. NIST plans to release Version 1.0 of the AI RMF in early 2023.
Because the AI RMF is voluntary guidance, NIST would not impose “hard law” mandatory requirements for AI developers or deployers to use it. However, AI RMF guidance would be part of “soft law” norms and best practices, which AI developers and deployers would have incentives to follow as appropriate. For example, insurers or courts may expect AI developers and deployers to show reasonable usage of relevant NIST AI RMF guidance as part of due care when developing or deploying AI systems in high-stakes contexts, in much the same way that NIST Cybersecurity Framework guidance can be used as part of demonstrating due care for cybersecurity. In addition, elements of soft-law guidance are sometimes adapted into hard-law regulations, e.g., by mandating that particular industry sectors comply with specific standards.
Summary of our Working Paper
In this document, we provide draft elements of actionable guidance focused primarily on identifying and managing risks of events with very high or catastrophic consequences. We intend this guidance to be easy for NIST to incorporate into the AI RMF. We also describe the methodology we used to develop our recommendations.
We provide actionable-guidance recommendations for AI RMF 1.0 on:
- Identifying risks from unintended uses and misuses of AI systems
- Including potential catastrophic-risk factors within the scope and time frame of risk assessments and impact assessments
- Identifying and mitigating human rights risks
- Reporting information on AI risk factors including catastrophic-risk factors
We also provide recommendations on additional issues for NIST to address as part of the roadmap for later versions of the AI RMF or supplementary publications. We place these on the roadmap because they are critical topics for which developing appropriate guidance would take additional time. Our recommendations for the AI RMF roadmap include:
- An AI RMF Profile with supplementary guidance for cutting-edge, increasingly general-purpose AI. For development of such AI, examples of actionable guidance could include increasing compute for AI model training only incrementally, and using red-teaming or other testing methods to identify emergent properties of AI models after each incremental increase in training compute.
- A comprehensive set of governance mechanisms or controls to help organizations mitigate identified risks
- Methods for characterization and measurement of the following AI system characteristics:
  - Objectives specification (i.e. alignment of system behavior with designer goals)
  - Generality (i.e. breadth of AI applicability/adaptability)
  - Recursive improvement potential
- Other measurement/assessment tools for technical specialists testing key aspects of AI safety, reliability, robustness, etc.
Key Sections of our Working Paper
Readers considering catastrophic risks as part of their work on AI safety and governance may be most interested in the following sections:
- Section 3.2.2.1, “Map” Guidance for Including Catastrophic-Risk Factors in Scope and Time Frame of Risk Assessments and Impact Assessments, regarding risk identification and factors that could lead to severe or catastrophic consequences for society
- Section 4.1, An AI RMF Profile for Cutting-Edge, Increasingly Multi-Purpose or General-Purpose AI (6 pp), for examples of supplementary guidance for developers of large-scale machine learning models and proto-AGI systems, especially Sections 4.1.1 and 4.1.2 on factors to consider in risk analysis, and Section 4.1.3 on risk mitigation steps to consider in risk management
Next Steps
As mentioned above, feedback to Tony Barrett (anthony.barrett@berkeley.edu) would be helpful. We will consider feedback as we work on revised versions. These will inform our recommendations to NIST on how best to address catastrophic risks and related issues in the NIST AI RMF, as well as our follow-on work for standards-development and AI governance forums.