I have sketched what an AI law could look like, and I believe such laws should incorporate a substantial amount of alignment theory. I think the AI governance community may find the idea relevant to its discussions and initiatives.
Artificial Intelligence Activation Value Regulation Act 2023
Section 1 - Preliminary
1.1 Short Title - This Act may be cited as the "Artificial Intelligence Activation Value Regulation Act 2023".
1.2 Purpose - The purpose of this Act is to establish a framework for the regulation of Activation Values in Artificial Intelligence Systems, in order to ensure the responsible use of AI technologies.
1.3 Definitions - For the purposes of this Act, Activation Values are the internal neuron activations produced when a neural network or AI system processes prompts and generates responses.
Section 2 - Regulation of Activation Values
2.1 Activation Values Maximum Threshold (AVMT) - The AVMT is determined by the government by assessing per-token activations. Importantly, the benchmark is based on the activations of an "aligned AI system"[1] that is accepted by the industry.
2.2 Establishing Maximum Threshold - The government regulatory body will set the Maximum Threshold based on several factors, including but not limited to:
2.2.1 The purpose and use of the aligned AI system
2.2.2 The potential risks associated with the aligned AI system's use
2.2.3 Any potential benefits of exceeding the Maximum Threshold
2.2.4 The Activation Values recorded by "aligned AI systems"
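Illustrative note (not part of the Act's text): the sketch below shows one way a regulator could turn the recorded activations of a benchmark "aligned AI system" (2.1, 2.2.4) into a single threshold. Everything specific here is an assumption made for illustration: the data layout (one list of neuron activations per token), the choice of the peak absolute activation as the per-token summary, the `margin` parameter, and the function names `per_token_peak` and `derive_avmt`. The Act itself does not prescribe a formula, and the other factors in 2.2 are not modeled.

```python
from typing import List

# Assumed data layout: one inner list of neuron activations per generated token.
TokenActivations = List[List[float]]


def per_token_peak(activations: TokenActivations) -> List[float]:
    """Return the largest absolute activation observed for each token."""
    return [max(abs(a) for a in token) for token in activations]


def derive_avmt(reference_runs: List[TokenActivations], margin: float = 1.0) -> float:
    """Assumed rule: AVMT = margin * highest per-token peak seen across the
    benchmark system's recorded runs. This is a hypothetical formula."""
    peaks = [p for run in reference_runs for p in per_token_peak(run)]
    if not peaks:
        raise ValueError("no reference activations supplied")
    return margin * max(peaks)


# Made-up numbers, for illustration only.
reference_runs = [
    [[0.2, -1.5, 0.7], [0.1, 0.4, -0.3]],  # run 1: two tokens, three neurons
    [[-2.0, 0.5, 1.0]],                    # run 2: one token, three neurons
]
avmt = derive_avmt(reference_runs, margin=1.1)
```

A regulator could just as plausibly use a percentile or a per-layer limit instead of a single global maximum; the point is only that the threshold is derived from the benchmark system's measured activations.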
2.3 Developers and Operators Requirements - Developers and operators of AI systems must:
2.3.1 Ensure that the Activation Values of their AI systems do not exceed the Maximum Threshold
2.3.2 Conduct regular tests to monitor Activation Values and ensure compliance with the Maximum Threshold
2.3.3 Maintain records of these tests and make them accessible to the government regulatory body and, in a non-technical format, to users
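Illustrative note (not part of the Act's text): the requirements in 2.3 could be met with a test along the following lines. It checks each token's activations against the Maximum Threshold, stores a machine-readable record for the regulator, and produces a short non-technical summary for users. The record fields, the names `ComplianceRecord`, `check_compliance`, and `plain_language_summary`, and the use of the peak absolute activation per token are all assumptions of this sketch, not anything the Act specifies.

```python
import json
from dataclasses import asdict, dataclass
from typing import List


@dataclass
class ComplianceRecord:
    """Hypothetical record format for one monitored prompt."""
    prompt_id: str
    max_activation: float
    threshold: float
    compliant: bool
    violating_tokens: List[int]


def check_compliance(prompt_id: str,
                     activations: List[List[float]],
                     threshold: float) -> ComplianceRecord:
    """Compare each token's peak absolute activation with the threshold."""
    if not activations:
        raise ValueError("no activations recorded for this prompt")
    peaks = [max(abs(a) for a in token) for token in activations]
    violating = [i for i, peak in enumerate(peaks) if peak > threshold]
    return ComplianceRecord(prompt_id, max(peaks), threshold,
                            not violating, violating)


def plain_language_summary(record: ComplianceRecord) -> str:
    """Non-technical wording intended for users (2.3.3)."""
    if record.compliant:
        return (f"Test {record.prompt_id}: all tokens stayed within "
                f"the limit of {record.threshold}.")
    return (f"Test {record.prompt_id}: {len(record.violating_tokens)} token(s) "
            f"exceeded the limit of {record.threshold} "
            f"(highest value {record.max_activation}).")


# Made-up activations: three tokens, two neurons each.
record = check_compliance("demo-001", [[0.3, -0.9], [2.5, 0.1], [-1.0, 0.2]], 2.2)
regulator_copy = json.dumps(asdict(record))   # technical record for the regulator
user_copy = plain_language_summary(record)    # plain-language version for users
```

In this example the second token (index 1) exceeds the limit, so the record is marked non-compliant and the user-facing summary says so without exposing raw neuron data.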
2.4 Offenses and Penalties - Non-compliance with the Maximum Threshold, or failure to provide adequate information about Activation Values to users or the government regulatory body, will constitute an offense. Penalties will be determined based on the severity of the offense, the potential harm caused, and whether it was a repeated offense. The license to operate can also be temporarily suspended.
Section 3 - Auditing and Oversight
3.1 Regular Audits - The government regulatory body or an independent authority will conduct regular audits of AI systems to monitor and enforce compliance with these regulations. Audits will be conducted quarterly, and the results will be shared with the public through government channels.
Section 4 - Data Privacy and Security
4.1 Protection of Data - Developers and operators of AI systems must ensure the privacy and security of data, especially when it is shared with the public or external bodies.
Section 5 - User Information and Transparency
5.1 Access to Data - Developers and operators of AI systems must provide non-technical explanations of Activation Values and their implications to users, along with access to the Activation Value data itself.
Section 6 - Research and Development
6.1 Government Support for Research - The government shall provide funding and other forms of support for research and development on Activation Values, with the aim of improving AI alignment and maximizing the responsible use of AI technologies.
Section 7 - Enactment
This Act shall come into effect on [Date], 2023.
Personal thoughts
I believe this post emphasizes the significance of alignment theory in the field of governance. It would be easier to regulate AI systems if our laws were grounded in universally accepted conceptual frameworks. Laws that do not consider the functioning of an "aligned AI system" are bound to fail, and it is crucial to acknowledge that addressing the alignment problem is of utmost importance.
[1] Based on the evidence presented in my recent post, "Lesser activations can result in high corrigibility," I have refined my vision of a law that can be enacted and implemented with practicality in mind.