Hide table of contents

Welcome to the AI Safety Newsletter by the Center for AI Safety. We discuss developments in AI and AI safety. No technical background required.

Subscribe here to receive future versions.

Listen to the AI Safety Newsletter for free on Spotify.


White House Executive Order on AI

While Congress has not voted on significant AI legislation this year, the White House has left their mark on AI policy. In June, they secured voluntary commitments on safety from leading AI companies. Now, the White House has released a new executive order on AI. It addresses a wide range of issues, and specifically targets catastrophic AI risks such as cyberattacks and biological weapons. 

Companies must disclose large training runs. Under the executive order, companies that intend to train “dual-use foundation models” using significantly more computing power than GPT-4 must take several precautions. First, they must notify the White House before training begins. Then, they’ll need to report on their cybersecurity measures taken to prevent theft of model weights. Finally, the results of any red teaming and risk evaluations of their trained AI system must be shared with the White House. 

This does not mean that companies will need to adopt sufficient or effective safety practices, but it does provide visibility for the White House on the processes of AI development and risk management. To improve the science of AI risk management, NIST has been tasked with developing further guidelines. 

Compute clusters must register and report on foreign actors. AIs are often trained on compute clusters, which are networks of interconnected computer chips that can be rented by third parties. The executive order requires large computing clusters to be reported to the Department of Commerce. Further, to provide transparency on AI development by foreign actors, any foreign customer of a US-based cloud compute service will need to verify their identity to the US government. Some have argued that these know-your-customer requirements should extend to domestic customers as well. 

A new poll shows that US voters largely support the new executive order. Source.

Requiring safety precautions at biology labs. One nightmare scenario for biosecurity researchers is that someone could submit an order to a biology lab for the synthesized DNA of a dangerous pathogen. Some labs screen incoming orders and refuse to synthesize dangerous pathogens, but other labs do not. 

To encourage adoption of this basic precaution, the executive order requires any research funded by the federal government to exclusively use labs that screen out dangerous compounds before synthesis. This may help combat the growing concern that AI could help rogue actors build biological weapons. The executive order also tasks several federal agencies with analyzing biosecurity risks from AI, including by producing a report that specifically focuses on the biorisks of open source AI systems. 

Building federal AI capacity. The executive order supports many efforts to help the US government use AI safely and effectively. Several agencies have been tasked with using AI to find and fix security vulnerabilities in government software. The National Science Foundation has been directed to create a pilot version of the National AI Research Resource, which would provide computing resources for AI researchers outside of academia. 

The full text of the executive order addresses many other issues, including privacy, watermarking of AI-generated content, AI-related patent and copyright questions , pathways to immigration for AI experts, and protections for civil rights. Right now, the White House is still in the stages of gathering information and developing best practices around AI. But this executive order will lead to meaningful progress on both of those fronts, and signals a clear commitment to address growing AI risks. 

Kicking Off The UK AI Safety Summit

Today marks the first day of the UK’s AI Safety Summit, where politicians, academics, and members of industry and civil society (including the Center for AI Safety’s Director Dan Hendrycks) will meet to discuss AI risks and how governments can help mitigate them. Before the summit began, the UK government announced several new initiatives, including the creation of an international expert panel to assess AI risks and a new research institute for AI safety. 

Rishi Sunak’s speech on AI extinction risk. UK Prime Minister Rishi Sunak delivered a speech on the opportunities and catastrophic risks posed by AI. Building on recent papers from the British government, he noted that “AI could make it easier to build chemical or biological weapons.” Then he directly quoted the CAIS expert statement on AI extinction risk, and said, “there is even the risk that humanity could lose control of AI completely.”

The speech also addressed doubts about AI risks. “There is a real debate about this,” Sunak said, and “some experts think it will never happen at all. But however uncertain and unlikely these risks are, if they did manifest themselves, the consequences would be incredibly serious.” Therefore, “leaders have a responsibility to take them seriously, and to act.”

UK Prime Minister Rishi Sunak delivered a speech ahead of the AI Safety Summit.

The UK will propose an international expert panel on AI. The UN Intergovernmental Panel on Climate Change (IPCC) summarizes scientific research on climate change to help inform policymaking efforts on the topic. Many have suggested that a similar body of scientific experts could help establish consensus on AI risks. Sunak announced in his speech that the UK will propose a “global expert panel nominated by the countries and organisations attending [the AI Safety Summit] to publish a State of AI Science report.”

New AI Safety Institute to evaluate AI risks. Sunak also announced “the world’s first AI Safety Institute” which will “carefully examine, evaluate, and test new types of AI so that we understand what each new model is capable of.” Few details have been provided so far, but it’s possible that this could serve as a “CERN for AI” allowing countries to work together on AI and AI safety research, thereby mitigating coordination challenges and enabling centralized oversight of AI development. 

Progress on Voluntary Evaluations of AI Risks

One common recommendation from those concerned about AI risks is that companies should commit to evaluating and mitigating risks before releasing new AI systems. This recommendation has recently received support from the United States, United Kingdom, and G7 alliance. 

The White House’s new executive order on AI requires any company developing a dual-use foundation model to “notify the federal government when training the model, and [they] must share the results of all red-team safety tests.” To help develop better AI risk management techniques, the executive order also directs NIST to develop rigorous standards for red-teaming that companies could adopt. 

At the request of the United Kingdom, six leading AI companies have published descriptions of their risk assessment and mitigation plans. There are important differences between the policies. For example, Meta argues that open sourcing their models will improve safety, while OpenAI, DeepMind, and others prefer to monitor use of their models to prevent misuse. But each company has provided their safety policy, and the UK has summarized the policies in a review of existing AI safety policies

Finally, the G7 has released a code of conduct that AI companies can voluntarily choose to follow. The policy would, among other things, require companies to evaluate catastrophic risks posed by their systems, invest in cybersecurity, and detect and prevent misuse during deployment. 

These voluntary commitments are no substitute for binding legal requirements to ensure safety in AI development. Moreover, a commitment to assess and mitigate risks does not ensure that the risks will be eliminated or reduced below a manageable threshold. Further work is needed to create binding commitments that prevent companies from releasing unsafe AI systems. 

A recent poll of UK voters suggests that most would support stronger action by the government to prevent the development of superhuman AI systems. Source

Finally, it is important to note that even the ideal safety evaluations would not eliminate AI risks. Militaries might deliberately design AI systems to be dangerous. Economic competition could lead companies to automate large swathes of human labor with AI, leading to increased inequality and concentration of power in the hands of private companies. Eventually, AI systems could be given control of many of the world’s most important decisions, undermining human autonomy on a global scale. 

See also: CAIS website, CAIS twitter, A technical safety research newsletter, An Overview of Catastrophic AI Risks, and our feedback form

Listen to the AI Safety Newsletter for free on Spotify.

Subscribe here to receive future versions.

Comments


No comments on this post yet.
Be the first to respond.
Curated and popular this week
Paul Present
 ·  · 28m read
 · 
Note: I am not a malaria expert. This is my best-faith attempt at answering a question that was bothering me, but this field is a large and complex field, and I’ve almost certainly misunderstood something somewhere along the way. Summary While the world made incredible progress in reducing malaria cases from 2000 to 2015, the past 10 years have seen malaria cases stop declining and start rising. I investigated potential reasons behind this increase through reading the existing literature and looking at publicly available data, and I identified three key factors explaining the rise: 1. Population Growth: Africa's population has increased by approximately 75% since 2000. This alone explains most of the increase in absolute case numbers, while cases per capita have remained relatively flat since 2015. 2. Stagnant Funding: After rapid growth starting in 2000, funding for malaria prevention plateaued around 2010. 3. Insecticide Resistance: Mosquitoes have become increasingly resistant to the insecticides used in bednets over the past 20 years. This has made older models of bednets less effective, although they still have some effect. Newer models of bednets developed in response to insecticide resistance are more effective but still not widely deployed.  I very crudely estimate that without any of these factors, there would be 55% fewer malaria cases in the world than what we see today. I think all three of these factors are roughly equally important in explaining the difference.  Alternative explanations like removal of PFAS, climate change, or invasive mosquito species don't appear to be major contributors.  Overall this investigation made me more convinced that bednets are an effective global health intervention.  Introduction In 2015, malaria rates were down, and EAs were celebrating. Giving What We Can posted this incredible gif showing the decrease in malaria cases across Africa since 2000: Giving What We Can said that > The reduction in malaria has be
LewisBollard
 ·  · 8m read
 · 
> How the dismal science can help us end the dismal treatment of farm animals By Martin Gould ---------------------------------------- Note: This post was crossposted from the Open Philanthropy Farm Animal Welfare Research Newsletter by the Forum team, with the author's permission. The author may not see or respond to comments on this post. ---------------------------------------- This year we’ll be sharing a few notes from my colleagues on their areas of expertise. The first is from Martin. I’ll be back next month. - Lewis In 2024, Denmark announced plans to introduce the world’s first carbon tax on cow, sheep, and pig farming. Climate advocates celebrated, but animal advocates should be much more cautious. When Denmark’s Aarhus municipality tested a similar tax in 2022, beef purchases dropped by 40% while demand for chicken and pork increased. Beef is the most emissions-intensive meat, so carbon taxes hit it hardest — and Denmark’s policies don’t even cover chicken or fish. When the price of beef rises, consumers mostly shift to other meats like chicken. And replacing beef with chicken means more animals suffer in worse conditions — about 190 chickens are needed to match the meat from one cow, and chickens are raised in much worse conditions. It may be possible to design carbon taxes which avoid this outcome; a recent paper argues that a broad carbon tax would reduce all meat production (although it omits impacts on egg or dairy production). But with cows ten times more emissions-intensive than chicken per kilogram of meat, other governments may follow Denmark’s lead — focusing taxes on the highest emitters while ignoring the welfare implications. Beef is easily the most emissions-intensive meat, but also requires the fewest animals for a given amount. The graph shows climate emissions per tonne of meat on the right-hand side, and the number of animals needed to produce a kilogram of meat on the left. The fish “lives lost” number varies significantly by
Neel Nanda
 ·  · 1m read
 · 
TL;DR Having a good research track record is some evidence of good big-picture takes, but it's weak evidence. Strategic thinking is hard, and requires different skills. But people often conflate these skills, leading to excessive deference to researchers in the field, without evidence that that person is good at strategic thinking specifically. I certainly try to have good strategic takes, but it's hard, and you shouldn't assume I succeed! Introduction I often find myself giving talks or Q&As about mechanistic interpretability research. But inevitably, I'll get questions about the big picture: "What's the theory of change for interpretability?", "Is this really going to help with alignment?", "Does any of this matter if we can’t ensure all labs take alignment seriously?". And I think people take my answers to these way too seriously. These are great questions, and I'm happy to try answering them. But I've noticed a bit of a pathology: people seem to assume that because I'm (hopefully!) good at the research, I'm automatically well-qualified to answer these broader strategic questions. I think this is a mistake, a form of undue deference that is both incorrect and unhelpful. I certainly try to have good strategic takes, and I think this makes me better at my job, but this is far from sufficient. Being good at research and being good at high level strategic thinking are just fairly different skillsets! But isn’t someone being good at research strong evidence they’re also good at strategic thinking? I personally think it’s moderate evidence, but far from sufficient. One key factor is that a very hard part of strategic thinking is the lack of feedback. Your reasoning about confusing long-term factors need to extrapolate from past trends and make analogies from things you do understand better, and it can be quite hard to tell if what you're saying is complete bullshit or not. In an empirical science like mechanistic interpretability, however, you can get a lot more fe