Is anyone aware of work going into detail on how an international regulator for AI would function, how compliance might be monitored, etc.?


Not really, or it depends on what kinds of rules the IAIA would set.

For monitoring large training runs and verifying compliance, see Verifying Rules on Large-Scale NN Training via Compute Monitoring (Shavit 2023).

Model evaluation for extreme risks (DeepMind 2023) sketches auditing with model evals in somewhat more detail.

For completeness, here's what OpenAI says in its "Governance of superintelligence" post:

Second, we are likely to eventually need something like an IAEA for superintelligence efforts; any effort above a certain capability (or resources like compute) threshold will need to be subject to an international authority that can inspect systems, require audits, test for compliance with safety standards, place restrictions on degrees of deployment and levels of security, etc. Tracking compute and energy usage could go a long way, and give us some hope this idea could actually be implementable. As a first step, companies could voluntarily agree to begin implementing elements of what such an agency might one day require, and as a second, individual countries could implement it. It would be important that such an agency focus on reducing existential risk and not issues that should be left to individual countries, such as defining what an AI should be allowed to say.
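
As a purely illustrative aside, the "tracking compute" idea in the Shavit paper and the quote above reduces, at its core, to a simple check: estimate a run's total FLOP from attested chip-usage records and flag it once it crosses a reporting threshold. The sketch below is not Shavit's actual protocol (which is about hardware-backed verification); the threshold, the log format, and the cluster numbers are all made-up assumptions.

```python
# Toy sketch of a compute-threshold check. Not the protocol from
# Shavit (2023): the threshold, the log format, and the cluster
# numbers below are all illustrative assumptions.

from dataclasses import dataclass

# Hypothetical reporting threshold, in total training FLOP.
REPORTING_THRESHOLD_FLOP = 1e25


@dataclass
class ChipUsageLog:
    """Attested usage record for one homogeneous group of accelerators."""
    chip_count: int            # number of identical chips in the group
    peak_flop_per_sec: float   # rated per-chip throughput
    seconds_used: float        # attested wall-clock time in the run
    avg_utilization: float     # average fraction of peak actually achieved


def estimated_training_flop(logs: list[ChipUsageLog]) -> float:
    """Estimate total FLOP spent on the run across all logged chip groups."""
    return sum(
        log.chip_count * log.peak_flop_per_sec * log.seconds_used * log.avg_utilization
        for log in logs
    )


def requires_audit(logs: list[ChipUsageLog]) -> bool:
    """Flag the run for inspection once estimated compute crosses the threshold."""
    return estimated_training_flop(logs) >= REPORTING_THRESHOLD_FLOP


# Example: 10,000 chips at 1e15 FLOP/s peak, 40% utilization, 90 days.
run = [ChipUsageLog(10_000, 1e15, 90 * 24 * 3600, 0.4)]
print(f"estimated {estimated_training_flop(run):.2e} FLOP, "
      f"audit required: {requires_audit(run)}")
# -> estimated 3.11e+25 FLOP, audit required: True
```

The hard parts, of course, are attestation (why trust the logs?) and aggregation across actors, which is the kind of problem the Shavit paper actually digs into.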


It's interesting how OpenAI basically concedes, further down in the very same post, that stopping development would be a fruitless effort:

Because the upsides are so tremendous, the cost to build it decreases each year, the number of actors building it is rapidly increasing, and it’s inherently part of the technological path we are on, stopping it would require something like a global surveillance regime, and even that isn’t guaranteed to work.

It's not hard to imagine compute eventually becoming cheap and fast enough to train GPT-4+ models on high-end consumer computers. How does one limit homebrewed training runs without limiting capabilities that are also used for non-training purposes?
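
To put rough numbers on that intuition, here's a back-of-envelope sketch. Every figure in it is an assumption: ~2e25 FLOP is a commonly cited third-party estimate of GPT-4's training compute (OpenAI hasn't published the real number), ~2e14 FLOP/s is roughly a current high-end consumer card's dense low-precision tensor throughput, and the two-year doubling time is a guess rather than a forecast.

```python
# Back-of-envelope: how far is a single consumer GPU from a GPT-4-scale
# training run? All figures below are assumptions for illustration only.

import math

TRAIN_FLOP = 2e25                   # assumed GPT-4-scale training budget
CARD_FLOP_PER_SEC = 2e14            # assumed consumer-card throughput today
DOUBLING_YEARS = 2.0                # assumed hardware doubling time
TARGET_RUNTIME_S = 365 * 24 * 3600  # call it "feasible" if one year suffices

# Years the run would take on one card today, ignoring memory,
# interconnect, and utilization (all of which make it worse).
years_today = TRAIN_FLOP / CARD_FLOP_PER_SEC / (365 * 24 * 3600)

# Throughput needed to finish in one year, and doublings to get there.
needed = TRAIN_FLOP / TARGET_RUNTIME_S
doublings = math.log2(needed / CARD_FLOP_PER_SEC)

print(f"one card today: ~{years_today:,.0f} years per run")
print(f"needed speedup: ~{needed / CARD_FLOP_PER_SEC:,.0f}x "
      f"(~{doublings * DOUBLING_YEARS:.0f} years at a {DOUBLING_YEARS}-year doubling)")
# -> one card today: ~3,171 years per run
# -> needed speedup: ~3,171x (~23 years at a 2.0-year doubling)
```

Under these (very rough) assumptions, a lone consumer card is a couple of decades away from a one-year GPT-4-scale run, and that's before memory and interconnect constraints, though algorithmic efficiency gains pull in the other direction. The enforcement question stands either way, since the same hardware spends the rest of its time on gaming and rendering.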

This doesn't point to detailed work in the space, but in "Nearcast-based 'deployment problem' analysis", Karnofsky writes:

I’m also introducing another hypothetical actor, “IAIA[1]”: an organization, which could range from a private nonprofit to a treaty-backed international agency, that tracks[2] transformative AI projects and takes actions to censure or shut down dangerous ones, as well as doing other things where a central, neutral body (as opposed to an AI company) can be especially useful. (Some more details on the specifics of what IAIA can do, and what sort of “power” it might have, in a footnote.[3])

  1. A hypothetical International AI Agency (name inspired by IAEA). Pronunciation guide here.

  2. Monitoring would be with permission and assistance in the case where IAIA is a private nonprofit, i.e., in this case AI companies would be voluntarily agreeing to be monitored.

  3. And here's that footnote:

    There’s a wide variety of possible powers for IAIA. For most of this post, I tend to assume that it is an agency designed for flexibility and adaptiveness, not required or enabled to execute any particular formal scheme along the lines of “If verifiable event X happens, IAIA may/must take pre-specified action Y.”

    Instead, IAIA’s central tool is its informal legitimacy. It has attracted top talent and expertise, and when it issues recommendations, the recommendations are well-informed, well-argued, and commonly seen as something governments should follow by default.

    In the case where IAIA has official recognition from governments or international bodies, there may be various formal provisions that make it easier for governments to quickly take IAIA’s recommendations (e.g., Congressional pre-authorizations for the executive branch to act on formal IAIA recommendations).

I don't have a link to the report itself, but Jason Hausenloy started some work on this a few months ago: https://youtu.be/1QY1L61TKx0