
Summary and Introduction

People who want to improve the trajectory of AI sometimes think their options for object-level work are (i) technical work on AI alignment or (ii) non-technical work on AI governance. But there is a whole other category of options: technical work in AI governance. This is technical work that mainly boosts AI governance interventions, such as norms, regulations, laws, and international agreements that promote positive outcomes from AI. This piece provides a brief overview of some ways to do this work—what they are, why they might be valuable, and what you can do if you’re interested. I discuss:

  • Engineering technical levers to make AI coordination/regulation enforceable (through hardware engineering, software/ML engineering, or heat/electromagnetism-related engineering)
  • Information security: Developing and implementing systems and best practices for securing model weights and other AI technology
  • Forecasting AI development
  • Technical standards development
  • Grantmaking or management to get others to do the above well
  • Advising on the above
  • Other work

[Update] Additional categories which the original version of this piece (from 2022) under-emphasized or missed are:

  • AI control: Developing systems and best practices for overseeing and constraining AI systems that may not be trustworthy (example)
  • Model evaluations: Developing technical evaluations of the safety of AI systems (discussion, examples)
  • Forecasting hardware trends (examples)
  • Cooperative AI: Research in game theory, ML, and decision theory for designing AI systems in ways that avoid costly coordination failures (discussion, examples)

I expect one or more resources providing more comprehensive introductions to many of these topics to appear in early 2024. For now, see the above links to learn more about the topics added in the update, and see below for more discussion of the originally listed topics.

Acknowledgements

Thanks to Lennart Heim, Jamie Bernardi, Gabriel Mukobi, Girish Sastry, and others for their feedback on this post. Mistakes are my own.

Context

What I mean by “technical work in AI governance”

I’m talking about work that:

  1. Is technical (e.g. hardware/ML engineering) or draws heavily on technical expertise; and
  2. Contributes to AI’s trajectory mainly by improving the chances that AI governance interventions succeed[1] (as opposed to by making progress on technical safety problems or building up the communities concerned with these problems).

Neglectedness

As of writing, there are (by one involved expert’s estimate) ~8-15 full-time equivalents doing this work with a focus on especially large-scale AI risks.[2]

Personal fit

For strong personal fit with this type of work, technical skills (including but not necessarily in ML) are of course useful, and an interest in the intersection of technical work and governance interventions presumably makes this work more exciting.

Also, whatever it takes to make progress on mostly uncharted problems in a tiny sub-field[3] is probably pretty important for this work now, since that’s the current nature of these fields. That might change in a few years. (But that doesn’t necessarily mean you should wait; time’s ticking, someone has to do this early-stage thinking, and maybe it could be you.)

What I’m not saying

I’m of course not saying this is the only or main type of work that’s needed. (Still, it does seem particularly promising for technically skilled people, especially under the debatable assumption that governance interventions tend to be more high-leverage than direct work on technical safety problems.)

Types of technical work in AI governance

Engineering technical levers to make AI coordination/regulation enforceable

To help ensure AI goes well, we may need good coordination and/or regulation.[4] To bring about good coordination/regulation on AI, we need politically acceptable methods of enforcing them (i.e. catching and penalizing/stopping violators).[5] And to design politically acceptable methods of enforcement, we need various kinds of engineers, as discussed in the next several sections.[6]

Hardware engineering for enabling AI coordination/regulation

To help enforce AI coordination/regulation, it might be possible to create certain on-chip devices for AI-specialized chips or other devices at data centers. As a non-exhaustive list of speculative examples:

  • Devices on network switches that identify especially large training runs could be helpful.
    • They could help enforce regulations that apply only to training runs above a certain size (which, among other benefits, seem much easier politically than trying to regulate all uses of compute).
  • If there were on-chip devices tracking the number of computations done on chips, that could help an agency monitor how much compute various data centers and organizations are using (a toy sketch of this idea appears after the list of desired device properties below).
    • That could help enforce regulations whose application depends on the amount of compute being used by an AI developer or data center (which, among other benefits, seems much easier politically than trying to regulate everyone who uses compute).
  • Dead man’s switches on AI hardware (or other hardware-enabled authorization requirements) could peacefully keep rogue organizations from harmful AI development or deployment (e.g. by interfering early on in a training run).

Part of the engineering challenge here is that, ideally (e.g. for political acceptability), we may want such devices not only to work but also to be (potentially among other desired features):

  • Secure;
  • Privacy-preserving;
  • Cheap;
  • Tamper-indicating; and
  • Tamper-proof.[7]
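
To make the flavor of this concrete, below is a toy Python sketch (not a real proposal or an existing product) of how a compute-tracking device might report cumulative usage in a way that is authenticated and somewhat tamper-evident. The class, key handling, and report format are all invented for illustration; real designs would need hardware roots of trust, careful key management, and much more.

```python
# Toy sketch only: a hypothetical compute-usage counter that emits signed,
# chained reports, loosely analogous to what an on-chip monitoring device
# might do. All names, parameters, and the report format are invented.
import hashlib
import hmac
import json
import time


class ComputeUsageCounter:
    """Toy model of an on-chip counter that emits tamper-evident usage reports."""

    def __init__(self, device_id: str, signing_key: bytes):
        self.device_id = device_id
        self._key = signing_key  # on real hardware this would sit in a secure element
        self._total_flop = 0
        self._prev_digest = hashlib.sha256(b"genesis").hexdigest()  # chain reports

    def record(self, flop: int) -> None:
        """Accumulate FLOPs observed since the device was provisioned."""
        self._total_flop += flop

    def signed_report(self) -> dict:
        """Emit a report a regulator could verify without seeing workload details."""
        payload = {
            "device_id": self.device_id,
            "cumulative_flop": self._total_flop,
            "timestamp": time.time(),
            "prev_digest": self._prev_digest,  # makes dropped reports detectable
        }
        body = json.dumps(payload, sort_keys=True).encode()
        mac = hmac.new(self._key, body, hashlib.sha256).hexdigest()
        self._prev_digest = hashlib.sha256(body).hexdigest()
        return {"payload": payload, "mac": mac}


def verify(report: dict, signing_key: bytes) -> bool:
    """What a regulator holding the shared key would run to check authenticity."""
    body = json.dumps(report["payload"], sort_keys=True).encode()
    expected = hmac.new(signing_key, body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, report["mac"])


counter = ComputeUsageCounter("chip-0001", signing_key=b"demo-key")
counter.record(3_200_000_000_000)  # made-up FLOP count for one job
print(verify(counter.signed_report(), b"demo-key"))  # True
```

The point is only that desiderata like "secure" and "tamper-indicating" translate into concrete engineering choices (here, signed and chained reports), not that this particular design would be adequate.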

Software/ML engineering for enabling AI coordination/regulation

Software (especially ML) engineering could help enforce AI coordination/regulation in various ways[8], including the following:

  • Methods/software for auditing ML models could help determine when and how regulations should be applied (e.g. it could help determine that some model may not be deployed yet because it has capabilities that current safety methods do not address) (see here for an example of such work);
  • ML applications to satellite imagery (visual and infrared) could help identify secret data centers;
  • Software (maybe ML) for analyzing hardware devices or perhaps video data could help detect efforts to tamper with the hardware devices discussed in the previous section; and
  • ML applications to open-source data or other types of data could help identify violations.

Heat/electromagnetism-related engineering for enabling AI coordination/regulation

For enforcing AI coordination/regulation against particularly motivated violators, it could be helpful to be able to identify hidden chips or data centers by their heat and electromagnetic signatures. People who know a lot about heat and electromagnetism could presumably help design equipment or methods that do this (e.g. mobile equipment usable at data centers, equipment that could be installed at data centers, methods for analyzing satellite data, and methods for analyzing data collected about a facility from a nearby road).

Part of the challenge here is that these methods should be robust to efforts to conceal heat and electromagnetic signatures.
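
As a very rough illustration of the kind of signal analysis involved (a sketch, not a workable detection method), the snippet below flags unusually hot cells in a 2-D grid of thermal readings. The data, units, and threshold are all invented.

```python
# Toy sketch only: flag grid cells whose readings are anomalously high relative
# to a robust estimate of the scene background. Real detection would have to
# handle sensor noise, revisit rates, legitimate heat sources, and deliberate
# concealment.
import numpy as np


def flag_thermal_anomalies(thermal: np.ndarray, z_threshold: float = 6.0) -> np.ndarray:
    """Return a boolean mask of cells far above the scene's typical emission."""
    background = np.median(thermal)
    spread = np.median(np.abs(thermal - background)) + 1e-9  # robust scale estimate (MAD)
    z_scores = (thermal - background) / spread
    return z_scores > z_threshold


# Toy usage: a mostly-uniform scene with one persistent hotspot.
scene = np.random.normal(290.0, 1.0, size=(100, 100))  # made-up background readings
scene[40:44, 60:64] += 25.0                             # hypothetical facility waste heat
print("candidate cells:", int(flag_thermal_anomalies(scene).sum()))
```

Dealing with concealment and confounders is exactly where the relevant domain expertise would matter.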

Information security

Information security could matter for AI in various ways, including the following:

  • It would be bad if people stole unsafe ML models and deployed them. It would also be bad if AI developers rushed to deploy their own models (e.g. with little testing or use of safety methods) out of fear that, if they waited too long, someone else would steal their models and deploy them first. Sufficiently good information security at AI developers (including their external infrastructure) would mitigate these problems (a toy sketch of one small piece of this appears at the end of this section).
  • Information security in regulatory agencies might help enable coordination/regulations on AI to be enforced in a politically acceptable way; it could assure AI developers that their compliance will be verified without revealing sensitive information, while assuring a regulator that the data they are relying on is authentic.
    • This could include the use of cryptographic techniques in the hardware devices, model evaluation software, and other equipment discussed above.
  • Information security in hardware companies could help keep the semiconductor supply chain concentrated in a small number of allied countries, which might help enable governance of this supply chain.

See here, here, and here (Sections 3.3 and 4.1), and listen here [podcast] for more information. As these sources suggest, information security overlaps with—but extends beyond—the engineering work mentioned above.
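
As one narrow illustration of the flavor of this work (a sketch under invented assumptions, not drawn from the sources above), the snippet below checks model-weight files against a previously recorded manifest of digests to flag tampering, deletion, or unexpected copies. The paths and manifest format are hypothetical, and a real deployment would pair this with access controls, key management, and monitoring.

```python
# Toy sketch only: compare weight files on disk against a recorded manifest of
# SHA-256 digests ({"relative/path": "sha256hex", ...}) and report discrepancies.
import hashlib
import json
from pathlib import Path


def digest(path: Path) -> str:
    """Stream a file through SHA-256 so large weight shards don't fill memory."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def check_weights(weights_dir: Path, manifest_path: Path) -> dict:
    """Compare files on disk to the recorded manifest; return discrepancies."""
    manifest = json.loads(manifest_path.read_text())
    found = {str(p.relative_to(weights_dir)): digest(p)
             for p in weights_dir.rglob("*") if p.is_file()}
    return {
        "modified": [p for p, d in found.items() if p in manifest and manifest[p] != d],
        "unexpected": [p for p in found if p not in manifest],
        "missing": [p for p in manifest if p not in found],
    }
```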

Forecasting AI development

AI forecasters answer questions about what AI capabilities are likely to emerge when. This can be helpful in several ways, including:

  • Helping AI governance researchers account for ways in which near-term advances in AI will change the strategic landscape (e.g. through the introduction of new tools or new threats, or by increasing the attention various actors pay to AI);
  • Helping determine the urgency and acceptable timelines for various kinds of work; and
  • Helping set parameters for (coordinated) AI regulations (e.g. if some regulation would only apply to models trained with at least some amount of compute, precisely how many FLOPs should be treated as highly risky? What are the cost penalties of decentralized training, which might change what regulators need to look for at each data center?)

Typically, this work isn’t engineering or classic technical research; it often involves measuring and extrapolating AI trends, and sometimes it is more conceptual/theoretical. Still, familiarity with relevant software or hardware often seems helpful for knowing what trends to look for and how to find relevant data (e.g. “How much compute was used to train recent state-of-the-art models?”), as well as for being able to assess and make arguments on relevant conceptual questions (e.g. “How analogous is gradient descent to natural selection?”).
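
As a toy example of the "measuring and extrapolating" flavor of this work, the sketch below fits an exponential trend to a handful of made-up (year, training-FLOP) points and asks when the trend would cross a hypothetical regulatory threshold. None of the numbers are real estimates.

```python
# Toy sketch only: extrapolate a compute trend from invented data points.
import numpy as np

years = np.array([2018.0, 2020.0, 2022.0, 2023.0])
train_flop = np.array([1e22, 3e23, 1e24, 5e25])  # made-up values, not real estimates

# Fit log10(FLOP) as a linear function of year, i.e. assume exponential growth.
slope, intercept = np.polyfit(years, np.log10(train_flop), deg=1)
print(f"implied growth: ~{10**slope:.1f}x per year")

# When would the trend cross a hypothetical regulatory threshold of 1e26 FLOP?
threshold = 1e26
crossing_year = (np.log10(threshold) - intercept) / slope
print(f"trend crosses {threshold:.0e} FLOP around {crossing_year:.1f}")
```

Real forecasting work wrestles with which data points to trust, whether the trend will hold, and how wide the uncertainty is; the curve-fitting itself is the easy part.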

See here (Section I) and here[9] for some collections of relevant research questions; see [1], [2], [3], [4], and [5] for some examples of AI forecasting work; and listen here [podcast] for more discussion.

Technical standards development

One AI risk scenario is that good AI safety methods will be discovered, but they won’t be implemented widely enough to prevent bad outcomes.[10] Translating AI safety work into technical standards (which regulations can then reference, as is often done) could help address this. Relatedly, standard-setting could be a way for AI companies to set guardrails on their AI competition without violating antitrust laws.

Technical expertise (specifically, in AI safety) could help standards developers (i) identify safety methods that it would be valuable to standardize, and (ii) translate safety methods into safety standards (e.g. by precisely specifying them in widely applicable ways, or designing testing and evaluation suites for use by standards[11]).
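
To illustrate what a testing and evaluation suite might look like in code (a minimal sketch with invented requirements, thresholds, and model interface, not an actual standard), consider:

```python
# Toy sketch only: the skeleton of a standards-style conformance suite.
# The checks, thresholds, and the `model` interface are all hypothetical;
# a real standard would specify these precisely.
from dataclasses import dataclass
from typing import Callable


@dataclass
class Requirement:
    name: str
    check: Callable[[object], float]  # returns a score for the model under test
    minimum: float                    # pass/fail threshold fixed by the standard


def run_conformance_suite(model, requirements: list[Requirement]) -> dict:
    results = {}
    for req in requirements:
        score = req.check(model)
        results[req.name] = {"score": score, "passed": score >= req.minimum}
    return results


# Hypothetical usage with a placeholder check:
def refusal_rate_on_prohibited_prompts(model) -> float:
    return 0.97  # placeholder; a real check would run a specified prompt set


suite = [Requirement("refuses-prohibited-uses", refusal_rate_on_prohibited_prompts, 0.95)]
print(run_conformance_suite(model=None, requirements=suite))
```

Much of the hard work is in specifying the checks and thresholds precisely enough that different auditors get the same answer; the harness itself is simple.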

Additionally, strengthened cybersecurity standards for AI companies, AI hardware companies, and other companies who process their data could help address some of the information security issues mentioned above.

See here for more information.

Grantmaking or management to get others to do the above well

Instead of doing the above kinds of work yourself, you might be able to use your technical expertise to organize others to do it (as a grantmaker or manager). Some of the problems here appear to be standard, legible technical problems, so you may well be able to get contractors, grantees, employees, or prize-challenge participants to solve them, even if those people aren’t very familiar with or interested in the bigger picture.

Couldn’t non-experts do this well? Not necessarily; it might be much easier to judge project proposals, candidates, or execution if you have subject-matter expertise. Expertise might also be very helpful for formulating shovel-ready technical problems.

Advising on the above

Some AI governance researchers and policymakers may want to bet on certain assumptions about the feasibility of certain engineering or infosec projects, on AI forecasts, or on relevant industries. By advising them with your relevant expertise, you could help allies make good bets on technical questions. A lot of this work could be done in a part-time or “on call” capacity (e.g. while spending most of your work time on what the above sections discussed, working at a relevant hardware company, or doing other work).

Others?

I’ve probably missed some kinds of technical work that can contribute to AI governance, and across the kinds of technical work I identified, I’ve probably missed many examples of specific ways they can help.

Potential next steps if you’re interested

Contributing in any of these areas will often require you to have significant initiative; there aren’t yet very streamlined career pipelines for doing most of this work with a focus on large-scale risks. Still, there is plenty you can do; you can:

  • Learn more about these kinds of work, e.g. by following the links in the above sections (as well as this link, which overlaps with several hardware-related areas).

  • Test your fit for these areas, e.g. by taking an introductory course in engineering or information security, or by trying a small, relevant project (say, on the side or in a research internship).

  • Build relevant expertise, e.g. by extensively studying or working in a relevant area.

    • Grantmakers like the Long-Term Future Fund might be interested in supporting relevant self-education projects.
  • Learn about and pursue specific opportunities to contribute, especially if you have a serious interest in some of this work or relevant experience, e.g.:

    • Reach out to people who work in related areas (e.g. cold-email authors of relevant publications, or reach out at community conferences).
    • Apply for funding if you have a project idea.
      • Georgetown’s Center for Security and Emerging Technology (CSET) might be interested in funding relevant projects (though, speculating based on a public announcement from the relevant grantmaker, they might have limited capacity in this area for the next few months).
    • Keep an eye out for roles on relevant job boards.
  • Feel free to reach out to the following email address if you have questions or want to coordinate with some folks who are doing closely related work[12]:

    • technical-ai-governance [ät] googlegroups [döt] com


Notes


  1. This includes creating knowledge that enables decision-makers to develop and pursue more promising AI governance interventions (i.e. not just boosting interventions that have already been decided on). ↩︎

  2. Of course, there are significantly more people doing most of these kinds of work with other concerns, but such work might not be well-targeted at addressing the concerns of many on this forum. ↩︎

  3. courage? self-motivation? entrepreneurship? judgment? analytical skill? creativity? ↩︎

  4. To elaborate, a major (some would argue central) difficulty with AI is the potential need for coordination between countries or perhaps labs. In the absence of coordination, unilateral action and race-to-the-bottom dynamics could lead to highly capable AI systems being deployed in (sometimes unintentionally) harmful ways. By entering enforceable agreements to mutually refrain from unsafe training or deployments, relevant actors might be able to avoid these problems. Even if international agreements are infeasible, internal regulation could be a critical tool for addressing AI risks. One or a small group of like-minded countries might lead the world in AI, in which case internal regulation by these governments might be enough to ensure highly capable AI systems are developed safely and used well. ↩︎

  5. To elaborate, international agreements and internal regulation both must be enforceable in order to work. The regulators involved must be able to catch and penalize (or stop) violators—as quickly, consistently, and harshly as is needed to prevent serious violations. But agreements and regulations don’t “just” need to be enforceable; they need to be enforceable in ways that are acceptable to relevant decision-makers. For example, decision-makers would likely be much more open to AI agreements or regulations if their enforcement (a) would not expose many commercial, military, or personal secrets, and (b) would not be extremely expensive. ↩︎

  6. After all, we currently lack good enough enforcement methods, so some people (engineers) need to make them. (Do you know of currently existing and politically acceptable ways to tell whether AI developers are training unsafe AI systems in distant data centers? Me neither.) Of course, we also need others, e.g. diplomats and policy analysts, but that is outside the scope of this post. As a motivating (though limited) analogy, the International Atomic Energy Agency relies on a broad range of equipment to verify that countries follow the Treaty on the Non-Proliferation of Nuclear Weapons. ↩︎

  7. Literally “tamper-proof” might be infeasible, but “prohibitively expensive to tamper with at scale” or “self-destructs if tampered with” might be good enough. ↩︎

  8. This overlaps with cooperative AI. ↩︎

  9. Note that the author of this now considers it a bit outdated. ↩︎

  10. In contrast, some other interventions appear to be more motivated by the worry that there won’t be time to discover good safety methods before harmful deployments occur. ↩︎

  11. This work might be similar to the design of testing and evaluation suites for use by regulators, mentioned in the software/ML engineering section. ↩︎

  12. I’m not managing this email; a relevant researcher who kindly agreed to coordinate some of this work is. They have a plan that I consider credible for regularly checking what this email account receives. ↩︎

Comments (3)



+1 to this proposal and focus.

On 'technical levers to make AI coordination/regulation enforceable': there is a fair amount of work suggesting that new technological avenues for unilateral monitoring (or for cooperative but non-intrusive monitoring - e.g. sensors on missile factories under the US-USSR INF Treaty) have often been instrumental in enabling arms control agreements (see Coe and Vaynmann 2020).

That doesn't mean it's always an unalloyed good: there are cases where new monitoring capabilities introduce new security or escalation risks (e.g. Vaynmann 2021), and they can also perversely hold up negotiations. For example, Richard Burns (link, introduction) discusses a case where the involvement of engineers in designing a monitoring system for the Comprehensive Test Ban Treaty actually held up negotiations of the regime, basically because the engineers focused excessively on technical perfection of the monitoring system [beyond the level of assurance that would've been strictly politically required by the contracting parties], which enabled opponents of the treaty to paint it as not giving sufficiently good guarantees.

Still, beyond improving enforcement, there's interesting work on ways that AI technology could speed up and support the negotiation of treaty regimes (Deeks 2020, 2020b, Maas 2021), both for AI governance specifically and for international cooperation more broadly.

I am a software engineer who transitioned to tech/AI policy/governance. I strongly agree with the overall message (or at least title) of this article: that AI governance needs technical people/work, especially for the ability to enforce regulation.

However, in the 'types of technical work' you lay out, I see some gaping governance questions/gaps. You outline various tools that could be built to improve the capabilities of actors in the governance space, but there are many such actors, and tools by their nature are dual use - where is the piece on who these tools would be wielded by, and how they can be used responsibly? I would be more excited about new initiatives in this space that clearly set out which actors they would work with, for which kinds of policy issues, which not, and why. There is also a big hole around conflicts of interest, etc. There are lots of unavoidable legal issues that crop up when you need to actually use such tools in any context beyond a voluntary initiative of a company (which does not give as many guarantees as things that apply to all current and future companies, like regulations or, to some extent, standards). There is, and will increasingly be, a huge demand for companies with practical AI auditing expertise - this is a big opportunity to start trying to fill that gap.

I think the section on 'advising on the above' could be fleshed out a whole lot more. At least I've found that, because this area is very new, there is a lot of talking to do with lots of different people, and a lot of translation, before getting to actually do these things... it helps if you're the kind of technical person who is willing to learn how to communicate with a non-technical audience and to learn from people with other backgrounds about the constraints and complexities of the policymaking world, and who derives satisfaction from this. I think this is hugely worthwhile though - and if you're that kind of person and looking for work in the area, do get in touch as I have some opportunities (in the UK).

Finally, I'll now highlight more explicitly the risk of technical people being used for the aims of others (which may or may not lead to good outcomes) in this space. In my view, if you really want to work at this intersection, you should be asking the above questions about anything you build: who will use this thing and how, what are the risks, and can I reduce them? And when you advise powerful actors, bringing your technical knowledge and expertise, do not be afraid to also give decision-makers your opinions on what might lead to what kinds of real-world outcomes, to ask questions about the application aims, and to improve those aims.

Thanks for the comment! I agree these are important considerations and that there's plenty my post doesn't cover. (Part of that is because I assumed the target audience of this post, technical readers of this forum, would have limited interest in governance issues and would already be inclined to think about the impacts of their work. Though maybe I'm being too optimistic with the latter assumption.)

Were there any specific misuse risks involving the tools discussed in the post that stood out to you as being especially important to consider?
