List of projects that seem impactful for AI Governance

JaimeRV; Teun van der Weij


Comments 2


Yashvardhan

2y


"Create a series of in-depth essays or a blog series that breaks down and elucidates complex AI governance concepts for a general audience [1]"

Is there perhaps a list of these "AI Governance Concepts" that one could find? Just curious to learn more about these and potentially distill them, if necessary.

Teun van der Weij

2y


I don't know of a list like this, but you could go through AI safety fundamentals governance track and look for key concepts. Potentially look through relevant papers and posts as well.


Goals of this post

The main goal of this post is to clarify what projects can be done at the frontier of AI Governance. Indirect goals are:

Inspiring people to work on new and fruitful projects.
Providing possible projects for newcomers.
Staking out new projects for ENAIS.
Promoting discussion on projects’ merit through making this list public.

How we collected the list

We filtered for posts from the last 14 months (with one exception) with at least 15 karma (usually >100). This post is mostly a collection of ideas from other posts/articles, but any errors are our own. We do recommend reading the original posts for additional context. We did not filter the projects in these posts: we neither included nor excluded projects based on our views, nor did we remove potentially controversial projects. By simply stating the projects, we hope to invoke Cunningham’s law to improve the projects and provide context on their expected value.

Moreover, our collection method means that the projects vary widely in scope and difficulty. Many of them could also be split into several subprojects.

We (Jaime and Teun from ENAIS) spent less than 30 hours in total on this post. We might do deeper research depending on how the post is received.
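For readers who want to repeat the collection step, the filter described above (posts from the last 14 months with at least 15 karma) could be sketched roughly as follows. This is only an illustration: the `Post` type, the `select_posts` function, and the approximation of a month as 30 days are our assumptions, and the single hand-picked exception mentioned above is not modelled.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Post:
    # Hypothetical record for a forum post; field names are illustrative.
    title: str
    published: date
    karma: int


def select_posts(posts, today, max_age_days=14 * 30, min_karma=15):
    """Keep posts from roughly the last 14 months with at least min_karma karma."""
    cutoff = today - timedelta(days=max_age_days)
    return [p for p in posts if p.published >= cutoff and p.karma >= min_karma]


if __name__ == "__main__":
    candidates = [
        Post("recent and upvoted", date(2023, 12, 1), 120),
        Post("too old", date(2022, 1, 1), 300),
        Post("too little karma", date(2023, 12, 1), 5),
    ]
    for post in select_posts(candidates, today=date(2024, 3, 1)):
        print(post.title)
```

Note that this only selects the source posts. The projects inside each post were taken over without any further filtering.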

Before you start on a project

If you are interested in any of the projects, we recommend that you check what has been done before. We did not look into this, and the status of a project may have changed between the publication of the original post and the moment you start working on it. If you find that a project has already been addressed to some extent, you could still add value by building on top of it, reviewing what has been done and flagging errors, collecting different views in one report, etc.

List of projects

Writing

Here we suggest any kind of project where the outcome is something written: post, paper, report, … Variants of each of these suggestions with different scopes are obviously also possible.

The general intuition behind this section is to further AI Governance by clarifying relevant concepts, as explained here: “Figure out areas where people are confused, come up with takes that would make them less confused or find people with good takes in those areas, and write them up into clear blog posts.”

Compute Governance

Conduct a detailed study on the impact of compute governance in AI advancements and publish a series of articles explaining the key findings and implications for the industry [1].
Develop a comprehensive guide on the design of compute monitoring standards for AI governance, detailing best practices, methodologies, and case studies [2].
Create a whitepaper or toolkit that outlines methods for verifying the implementation of AI compute monitoring standards in various organizational settings [2].
Design a framework or set of guidelines for enforcing compliance with AI compute monitoring standards, including potential penalties and incentives [2].
Develop a white paper on "Tamper-evident logging in GPUs": Explore the design and implications of implementing tamper-evident logging systems within GPUs to enhance security and traceability [6].
Create a framework for "Global tracking of GPUs": Design a system to track the distribution and usage of GPUs worldwide, potentially to monitor and regulate compute power [6].
Research on "Proof-of-learning algorithms": Investigate and develop algorithms that can provide proof of learning, ensuring integrity and verifiability in machine learning processes [6].
Proposal for "On-site inspections of models": Develop a set of guidelines and protocols for conducting physical inspections of AI models and their training environments [6].
Development of "Detecting data centers": Create tools or methodologies for identifying and monitoring large-scale data centers, particularly those used for intensive computing tasks [6].
Build "A suite for verifiable inference": Design a set of tools or software that can verify the inferences made by AI models, ensuring they are accurate and trustworthy [6].
Conduct a study on "Measuring effective compute use": Explore ways to measure and control algorithmic progress to ensure efficient and ethical use of computational resources [6].
Write a report on "Regulating large-scale decentralized training": Investigate the challenges and propose frameworks for regulating decentralized AI training, especially if it becomes a competitive alternative to centralized methods [6].
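To make one of the items above more concrete: "tamper-evident logging" usually means that entries are linked so that any later modification can be detected. A minimal sketch of that general idea is a hash chain, shown below. This is our own illustration and not a design from the cited post; it is plain software rather than anything GPU-specific, and the names `append_entry`, `verify`, and the payload fields are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(prev_hash, payload):
    # Hash the previous entry's hash together with this entry's payload.
    body = json.dumps({"prev": prev_hash, "payload": payload}, sort_keys=True)
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append_entry(log, payload):
    """Append a payload (a JSON-serializable dict) and link it to the previous entry."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append({"prev": prev_hash, "payload": payload, "hash": _digest(prev_hash, payload)})


def verify(log):
    """Return True if every entry still matches its recorded hash and predecessor."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev"] != prev_hash:
            return False
        if entry["hash"] != _digest(prev_hash, entry["payload"]):
            return False
        prev_hash = entry["hash"]
    return True


if __name__ == "__main__":
    log = []
    append_entry(log, {"job": "train-run-1", "gpu_hours": 512})
    append_entry(log, {"job": "train-run-2", "gpu_hours": 2048})
    print(verify(log))                   # chain is intact
    log[0]["payload"]["gpu_hours"] = 1   # edit an old entry after the fact
    print(verify(log))                   # the edit is detected
```

A chain like this only shows that something was changed. Someone who can rewrite the whole log can also recompute every hash, so a real system would additionally need the latest hash to be stored or signed somewhere the operator cannot alter, which is part of what the project idea would have to work out.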

Evals/Audits

Write-up about what kind of evals ought to be built, what specific failure modes might show up in the upcoming years, etc. [2].
A small post suggesting a specific possible idea about how evals could be implemented across the industry, what kinds of agreements will be feasible, which stakeholders will be required to sign-on, and what kinds of needs/concerns those stakeholders are likely to have [2].
Suggest standardized protocols for third-party audits of models before deployment, focusing on safety and security [4].
Establish a framework for continuous evaluation of models for dangerous capabilities post-deployment [4].
Create a comprehensive guide for conducting risk assessments before training powerful models [4].
Develop detailed proposals for auditing AI systems, focusing on transparency and accountability [5].
Conduct an investigation into the signs and signals of tampering in AI training runs and how to detect them [6].
Write a report on the feasibility of autonomous AI replication across the internet and its implications [6].
Development of a white paper on the existing tools and methods the US government can utilize for auditing tech companies, with suggestions for improvement [6].

Distillation/Communication

Create a series of in-depth essays or a blog series that breaks down and elucidates complex AI governance concepts for a general audience [1].
Produce a series of high-quality podcasts or webinars aimed at educating diverse audiences about AI governance, focusing on clear and high-fidelity communication [2].
Organize workshops or training sessions for individuals with a technical background to learn how to effectively communicate AI-related ideas to policymakers [2].
Develop a guide or set of best practices for communicating about AI risks, addressing common misunderstandings, and focusing on what ideas are most important to spread [2].
A strategic plan or series of workshops aimed at raising awareness, motivating talented individuals, and enabling work in AI governance through funding and resources [3].
Compare and visualize the different public statements from AI labs leaders/employees about their views on the risks and benefits from AGI, including the level of risk they are willing to take in its development [4].
Start a series of blog posts aimed at clarifying confusing areas in AI, interviewing experts with insightful takes, and translating these into accessible, clear content for a broader audience [7].
Develop an outline for educational materials and workshops to inform Congress and the public about AI risks and policy considerations, leveraging the current open-mindedness in DC [9].

Information Security

Establish a set of best practices and standards for information security specifically tailored to AI labs and their suppliers. The project could include creating a certification process for security measures, outlining training programs for employees at different levels, and forming a rapid response team to address emerging threats quickly and efficiently [1].
Compare initiatives for AI labs to share threat intelligence and security incident information [4].
Write-up comparing the information security plans of AI labs with those of other organizations, such as intelligence agencies [4].
Develop a guide or framework for preventing neural network weight exfiltration, outlining potential risks and mitigation strategies [6].
Create a white paper exploring potential privilege escalation risks posed by misaligned coding assistants within secure systems [6].
Develop a system or set of protocols for effective datacenter monitoring to detect unauthorized model copies [6].
Research and propose methods for detecting unauthorized communication channels between different copies of an AI model [6].
Analyze the vulnerabilities of nuclear command and control systems to AI threats and propose safeguards [6].
Design a scalable behavior monitoring system that can aggregate and analyze monitoring logs from millions of AIs [6].

Other write-ups

Write a concrete proposal of what governments (or a specific government) should do to make AI go well [1].
Write a comprehensive proposal outlining specific actions for major AI labs (or a particular AI lab: OpenAI, Anthropic, …) to ensure ethical AI development and use [1].
Create a dynamic policy action plan for an AI lab, focusing on ethical, safe, and responsible AI development, including a roadmap for implementation [1].
Undertake a comparative analysis of AI governance models in China and the West, focusing on compute and lab governance, and write an essay with recommendations for harmonizing international AI policies [1].
Do research into the real-world application of AI policies in leading institutions versus their theoretical frameworks. Outcome could be a report that offers insights and recommendations for bridging gaps between theory and practice [1].
Develop and publish a comprehensive guide on AI threat modeling, including case studies and strategies for mitigation, aimed at policymakers and AI developers [1].
Develop a comprehensive framework for evaluating potential extreme risks posed by AI systems. The project will involve creating methodologies for assessing risks, including worst-case scenarios, and developing guidelines for implementing safeguards and monitoring mechanisms. It could include an outline for a series of workshops and collaborations with AI developers and risk analysts to validate and refine the evaluation tools [1].
Investigate technologies and methodologies for tamper-proof monitoring and verification of AI training runs. This project could involve researching hardware solutions, designing protocols for secure and verifiable training processes, and suggestions on how to implement these measures in real-world settings [1].
A legal analysis or white paper clarifying the current legal situation regarding the data that can be used for training advanced AI systems, with recommendations for AI companies to mitigate risks [3].
A proposal or pilot project for a system of bounties and whistleblower protection to incentivize responsible reporting of irresponsible decisions in AI labs and governments [3].
Suggest frameworks for AI labs to identify, analyze, and evaluate risks from powerful models before deployment [4].
Develop a set of guidelines or a framework for establishing appropriate safety restrictions for powerful models post-deployment [4].
Review of strategies and services in AI labs for commissioning external red teams before deploying powerful models [4].
Write a report on how the different models released from AI labs are used (proprietary vs OpenSource…) and their societal impact [4].
Compile a repository of state-of-the-art safety and alignment techniques for AI labs to implement [4].
Compare the AI labs across different metrics, e.g. their security incident response plans, emergency response plans, measures to tackle the risk of state-sponsored or industrial espionage, third-party audits of their governance structures, containment of models (e.g. via boxing or air-gapping), plans for the staged deployment of powerful models, etc. [4].
Compare different protocols for when and how AI labs should pause the development of models with dangerous capabilities [4].
Design a framework for implementing bug bounty programs in AI labs [4].
Suggest different guidelines for when and how AI labs should open-source powerful models [4].
Write a post suggesting a credentialing system for individuals in the AI governance field [5].
Explore the implications and feasibility of banning open-source large language models and write a report on it [5].
Research and proposal on the most effective regulatory apparatus within the US government for overseeing large AI training runs [6].
Analysis of gaps in US export controls to China concerning AI technology and strategies for enhancement [6].
Survey or study predicting AI applications or demonstrations that will evoke the strongest reactions from society [6].
Scenario analysis on the deployment of AI for sensitive tasks (e.g., advising world leaders), focusing on privacy and ethical implications [6].
Research on how political discourse around AI might polarize societies and strategies to mitigate this polarization [6].
Feasibility study and roadmap for automating crucial infrastructure like factories and weapons with AI technology [6].
Creating research agendas with concrete projects and proving their academic viability by publishing early stage work in those research agendas, which would significantly help with recruiting academics [7].
Create concrete research projects with heavy engineering slants and with clear explanations for why these projects are alignment relevant, which seems to be a significant bottleneck for recruiting engineers [7].
Formulate a promising research agenda for AIS and recruit junior researchers to contribute under the guidance of senior researchers [7].
Write an in-depth analysis post discussing Ryan Greenblatt's views on ambitious mechanistic interpretability, outlining why it's preferred over narrow or limited approaches, and elaborating on the broader implications for AI development [7].
Develop a speculative article exploring scenarios where an unaligned AI (referred to as an 'alien jupiter brain') takes control of a datacenter, discussing potential control failures and why such an outcome might still be manageable [7].
Craft a comprehensive comparison post of various AI optimization and finetuning methods (like RLHF, BoN, quantilization, DPO, etc.), analyzing their theoretical equivalences, practical differences, and the impact of optimization power [7].
Write a persuasive post highlighting the importance of including 'Related Work' sections in research, discussing how it benefits the field, fosters collaboration, and avoids redundant efforts [7].
Create a detailed post discussing the variety of crucial roles beyond technical AI research, offering advice on how the community can better encourage and support diverse career paths [7].
Draft an opinion piece arguing why the AI community should reduce the emphasis on status, discussing the negative impacts of status considerations and suggesting alternative focus areas for healthier discourse [7].
Research and propose strategies for how long-term planning can be more effectively incorporated into political processes, despite the disincentives for long-term thinking in current political structures[8].
Produce a series of policy briefs or a symposium discussing how governments can be leveraged for massive change, including historical examples and future prospects, particularly in the context of AI and technology policy[8].
Conduct a detailed analysis of the legislative process to identify strategic intervention points for AI safety bills and develop bipartisan support strategies[9].
Establish legal precedents in which AI companies are held liable for damages[10].
Investigate and publicly make the case for/against explosive growth being likely and risky[12].
Painting a picture of a great outcome[12].
Policy-analysis of issues that could come up with explosive technological growth[12].
Norms/proposals for how to navigate an intelligence explosion[12].
Analyze: What tech could change the landscape?[12].
Big list of questions that labs should have answers for

Support AI Governance people

Note: this sublist is entirely suggested by us from ENAIS. It is more abstract than most of the previous projects and we appreciate any feedback to concretize any of these project ideas.

Connect them to others working on similar topics
Create a database of up-to-date documents related to Governance in the EU
Help them to find new people
- Senior
- Junior
Facilitate the connection between US and EU and other countries
Mentorship programs for new people entering the field
Reproduce in the EU the upskilling programs that currently exist in the US
Help with grantmaking/fund-raising
Help them fill their technical knowledge gaps
- Connect them to people explaining the technical risks
- Help them find people with the relevant expertise
- Help them understand what technical knowledge is even needed.

Activism

Publicly demand that AI labs publish their strategies for ensuring systems are safe and aligned.
Demand that AI labs publish the results or summaries of internal risk assessments.
Demand that AI labs publish the results or summaries of external scrutiny efforts, unless this would unduly reveal proprietary information or itself produce significant risk.
Ask AI labs to make public statements about how they make high-stakes decisions regarding model development and deployment.
Encourage and facilitate public declaration of positions on AI risks and policies among community members to foster transparency and accountability.
Mail your representatives and ask them to read about AI risks and to discuss them in parliament / commissions [13].
Join or organize protests to increase the awareness of journalists, the public and decision makers and urge them to implement sensible regulations[13].
Prepare for the AI Safety Summit in Seoul[13]
- help those attending so they are properly informed before the summit.
- work on a draft Treaty that prevents dangerous AI from being created.
Help with engaging / recruitment of volunteers[13].

Liability

The judicial path: establish legal precedents in which AI companies are held liable for damages[10].
The regulatory path: get regulatory agencies which maintain liability-relevant rules to make rule clarifications or even new rules under which AI companies will be unambiguously liable for various damages[10].
The legislative path: get state and/or federal lawmakers to pass laws making AI companies unambiguously liable for various damages[10].
Find a broad class of people damaged in some way by hallucinations, and bring a class-action suit against the company which built the large language model[10].
Find some celebrity or politician who’s been the subject of a lot of deepfakes, and bring a suit against the company whose model made a bunch of them[10].
Find some companies/orgs which have been damaged a lot by employees/contractors using large language models to fake reports, write-ups, etc, and then sue the company whose model produced those reports/write-ups/etc[10].

Regulatory Survey and Small Interventions

Deeply understanding the proposed rules and the rulemakers’ own objectives[10].
“Solving for the equilibrium” of the incentives they create (in particular looking for loopholes which companies are likely to exploit)[10].
Suggesting implementation details which e.g. close loopholes or otherwise tweak incentives in ways which both advance X-risk reduction goals and fit the rulemakers’ own objectives[10].

Others

Create an online dashboard that aggregates, synthesizes, and visualizes predictions from various forecasting platforms like Metaculus, alongside other prediction markets relevant to the development of advanced AI. This project aims to democratize access to expert and community forecasts, making it easier for a wider audience, including policymakers, researchers, and the general public, to understand the potential futures of AI. It will involve designing an intuitive user interface and implementing interactive features that allow users to explore different scenarios. The project will also include outreach initiatives to promote the use of the dashboard as a resource for informed decision-making in AI governance and development [1].
Develop a specialized writing workshop or online course aimed at researchers and professionals in the AI field, focusing on improving writing skills for clearer communication and better policy and research documentation [1].
Develop a training program to enhance communication skills specifically for technical researchers [5].
Create a mentorship program that pairs junior researchers with experienced professionals to enhance their analytical and problem-solving skills, with a focus on breaking down and addressing complex AI-related questions [1].
A study or report identifying and categorizing the primary sources of existential risks associated with AI development [3].
An analysis project focused on understanding China's AI capabilities and assessing the likelihood of it becoming an AI superpower [3].
A post on the potential of significant military interest in AI technologies and the possibility of military AI megaprojects [3].
A comprehensive study or series of expert interviews aimed at projecting how the field of AI governance will evolve as more interest groups join the debate, considering the implications of transformative economic applications, job losses, disinformation, and automation of critical decisions [3].
Create a platform to report safety incidents. This can be used by state actors and other labs as well as the general public. Anyone can add incidents. Might serve as a database of safety incidents [4]. Note: there is already a similar platform (https://incidentdatabase.ai/)
Create a platform or series of events aimed at strengthening networks among AI governance professionals [5].
Create a program/survey/… that actively involves current policymakers in the discourse on AI safety and governance [5].
Create programs to identify and support promising ML PhD and graduate policy students through scholarships, internships, or mentorship programs. Replicate successful programs in other locations, like EU [5].
Initiate a project aimed at incorporating diverse perspectives into AI safety work, beyond those traditionally involved in the EA community [5].
Develop materials and strategies to include AI safety as a cause area in most EA outreach efforts [5].
Create outreach programs for AI safety that don't assume an EA framework, making them accessible to a broader audience [5].
(If you have academic credentials and are relatively senior) Connecting senior academics and research engineers (REs) with professors or other senior REs, who can help answer more questions and will likely be more persuasive than junior people without much legible credentials. Note that I don’t recommend doing this unless [7].
Create a structured mentorship program to guide young researchers in the AI Safety (AIS) field, focusing on providing concrete projects and research management guidance [7]. Note: There's already MATS. (Astra isn't really for junior people, empirically)
Establish a support system to encourage and assist promising researchers in pursuing PhDs, particularly in fields relevant to AIS [7].
Develop internship programs in collaboration with academia and AIS organizations, aimed at utilizing pre-existing mentorship capacities and providing a pathway to full-time employment [7].
Being a PhD student and influencing your professor/lab mates. The highest impact here is, probably, to do a PhD at a location with a small number of AIS-interested researchers, as opposed to going to a university without any AIS presence [7].
Design initiatives that make efficient use of existing mentorship capacities, possibly through more direct routes to full-time positions at AIS organizations, to address the lack of research mentorship and management capacity [7].
Outreach efforts that involve interactions between the AI safety community and (a) members of AI labs + (b) members of the ML community [11].
- More conferences that bring together researchers from different groups who are working on similar topics (e.g., Anthropic recently organized an interpretability retreat with members from various different AI labs and AI alignment organizations).
- More conferences that bring together strategy/governance thinkers from different groups (e.g., Olivia and AW recently ran a small 1-day strategy retreat with a few members from AI labs and members).
- Discussions like the MIRI 2021 conversations, except with a greater emphasis on engaging with researchers and decision-makers at major AI labs by directly touching on their cruxes.
- Collaborations on interventions that involve coordinating with AI labs (e.g., figuring out if there are ways to collaborate on research projects, efforts to implement publication policies and information-sharing agreements, efforts to monitor new actors that are developing AGI, etc.)
- More ML community outreach. Examples include projects by the Center for AI Safety (led by Dan Hendrycks) and AIS field-building hub (led by Vael Gates).
Decrease the power of bad actors[12].