
Welcome to the AI Safety Newsletter by the Center for AI Safety. We discuss developments in AI and AI safety. No technical background required.

Subscribe here to receive future versions.

Listen to the AI Safety Newsletter for free on Spotify.


Defensive Accelerationism 

Vitalik Buterin, the creator of Ethereum, recently wrote an essay on the risks and opportunities of AI and other technologies. Responding to Marc Andreessen’s techno-optimist manifesto and the growth of the effective accelerationism (e/acc) movement, he offers a more nuanced perspective.

Technology is often great for humanity, the essay argues, but AI could be an exception to that rule. Rather than giving governments control of AI so they can protect us, Buterin argues that we should build defensive technologies that provide security against catastrophic risks in a decentralized society. Cybersecurity, biosecurity, resilient physical infrastructure, and a robust information ecosystem are some of the technologies Buterin believes we should build to protect ourselves from AI risks. 

Technology has risks, but regulation is no panacea. Longer lifespans, lower poverty rates, and expanded access to education and information are among the many successes Buterin credits to technology. But most would recognize that technology can also cause harms, such as global warming. Buterin specifically says that, unlike most technologies, AI presents an existential threat to humanity. 

To address this risk, some advocate strong government control of AI development. Buterin is uncomfortable with this solution, and he expects that many others will be too. Many of history’s worst catastrophes have been deliberately carried out by powerful political figures such as Stalin and Mao. AI could help brutal regimes surveil and control large populations, and Buterin is wary of accelerating that trend by pushing AI development from private labs to public ones. 

Between the extremes of unrestrained technological development and absolute government control, Buterin advocates for a new path forward. He calls his philosophy d/acc, where the “d” stands for defense, democracy, decentralization, or differential technological development.

Defensive technologies for a decentralized world. Buterin advocates the acceleration of technologies that would protect society from catastrophic risks. Specifically, he highlights:

  1. Biosecurity. Dense cities, frequent air travel, and modern biotechnology all raise the risk of pandemics, but we can improve our biosecurity by improving our air quality, hastening the development of vaccines and therapeutics, and monitoring for emerging pathogens.
  2. Cybersecurity. AIs which can code could be used in cyberattacks, but they can also be used by defenders to find and fix security flaws before they’re exploited. Buterin’s work on blockchains aims towards a future where some digital systems can be provably secure.
  3. Resilient Physical Infrastructure. Most of the expected deaths in a nuclear catastrophe would come not from the blast itself, but from disruptions to the supply chain of food, energy, and other essentials. Elon Musk has aspired to improve humanity’s physical infrastructure by making us less dependent on fossil fuels, providing internet connections via satellite, and ideally making humanity a multi-planetary species that could outlast a disaster on Earth.
  4. Robust Information Environment. To help people find truth in an age of AI persuasion, Buterin points to prediction markets and consensus-generating algorithms such as Community Notes.
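The core idea behind Community Notes-style consensus algorithms is “bridging”: a note is surfaced only if raters across the political spectrum find it helpful, not just one faction. Below is a toy sketch of that idea. The real Community Notes system uses a more elaborate matrix-factorization model; this simplified version just models each rating as a global mean plus a user bias, a note helpfulness term, and a user-leaning × note-polarity interaction that absorbs one-sided agreement.

```python
import numpy as np

def fit_bridging(ratings, n_users, n_notes, epochs=2000, lr=0.05, reg=0.1):
    """Fit a toy bridging model: rating ~= mu + user_bias + note_help + lean*polarity.

    A note's `note_help` score stays high only if its support is not
    explained away by the partisan interaction term.
    """
    rng = np.random.default_rng(0)
    mu = 0.0
    user_bias = np.zeros(n_users)
    note_help = np.zeros(n_notes)
    user_lean = rng.normal(0, 0.1, n_users)   # latent "side" of each rater
    note_pol = rng.normal(0, 0.1, n_notes)    # latent polarity of each note
    for _ in range(epochs):
        for u, n, r in ratings:
            pred = mu + user_bias[u] + note_help[n] + user_lean[u] * note_pol[n]
            err = r - pred
            # Plain SGD updates with L2 regularization on all learned terms.
            mu += lr * err
            user_bias[u] += lr * (err - reg * user_bias[u])
            note_help[n] += lr * (err - reg * note_help[n])
            user_lean[u] += lr * (err * note_pol[n] - reg * user_lean[u])
            note_pol[n] += lr * (err * user_lean[u] - reg * note_pol[n])
    return note_help

# Hypothetical data: note 0 is rated helpful by all four users;
# note 1 is rated helpful only by users 0 and 1 (one "side").
ratings = [(0, 0, 1), (1, 0, 1), (2, 0, 1), (3, 0, 1),
           (0, 1, 1), (1, 1, 1), (2, 1, 0), (3, 1, 0)]
help_scores = fit_bridging(ratings, n_users=4, n_notes=2)
```

Here the cross-partisan note ends up with a higher helpfulness score than the one-sided note, which is the property that makes bridging algorithms resistant to coordinated partisan voting.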

Scientists and CEOs might find themselves inspired by Buterin’s goal of building technology, rather than slowing it down. Yet for those who are concerned about AI and other catastrophic risks, Buterin offers a thoughtful view of the technologies that are most likely to keep our civilization safe. For those who are interested, there are many more thoughts in the full essay.

Retrospective on the OpenAI Board Saga

On November 17th, OpenAI announced that the board of directors had removed Sam Altman as CEO. After four days of corporate politics and negotiations, he returned as CEO. Here, we review the known facts about this series of events. 

OpenAI is intended to be controlled by a nonprofit board. OpenAI was founded in 2015 as a nonprofit. In 2019, OpenAI announced the creation of a for-profit company that would help finance its expensive plans for scaling large language models. The profit that investments in OpenAI can yield was originally “capped” at 100x — anything beyond that would be redirected to the nonprofit. But after a recent rule change, that cap will rise 20% per year beginning in 2025. 
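On one reading of the new rule, the cap compounds at 20% per year once the increases begin in 2025; the exact mechanics have not been publicly detailed, so the function below is an illustrative sketch under that assumption.

```python
def profit_cap(year: int) -> float:
    """Return the assumed profit-multiple cap for a given year.

    Assumption (not confirmed by OpenAI): the 100x cap compounds
    at 20% per year starting in 2025.
    """
    years_of_growth = max(0, year - 2025)
    return 100 * 1.2 ** years_of_growth

print(profit_cap(2025))  # 100x in the first year
print(profit_cap(2029))  # roughly 207x after four years of compounding
```

Compounding at 20% roughly doubles the cap every four years, which is why critics view the change as effectively uncapping profits over time.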

OpenAI’s corporate structure was designed so that the nonprofit could retain legal control over the for-profit. The nonprofit is led by a board of directors, which holds power over the for-profit through its ability to select and remove the CEO of the for-profit. The board of directors is responsible for upholding the mission of the nonprofit, which is to ensure that artificial general intelligence benefits all of humanity.

OpenAI’s legal structure.

The board removes Sam Altman as CEO. When it removed Sam Altman as CEO, the board of directors consisted of OpenAI Chief Scientist Ilya Sutskever, Quora CEO Adam D’Angelo, technology entrepreneur Tasha McCauley, and CSET’s Helen Toner. Greg Brockman — OpenAI’s co-founder and president — was removed from his position as chair of the board alongside Sam Altman.

According to the announcement, the board of directors fired Altman because he had not been “consistently candid in his communications with the board, hindering its ability to exercise its responsibilities.” While the board did not provide any specific examples of Altman's deception in the announcement, it was later reported that Altman had attempted to play board members against each other in an attempt to remove Helen Toner.

Altman had earlier confronted Toner about a paper she had co-authored. The paper in part criticizes OpenAI’s release of ChatGPT for accelerating the pace of AI development. It also praises one of OpenAI’s rivals, Anthropic, for delaying the release of their then-flagship model, Claude. 

OpenAI employees turn against the board. The fallout of the announcement was swift and dramatic. Within a few days, Greg Brockman and Mira Murati (the initial interim CEO) had resigned, and almost all OpenAI employees had signed a petition threatening to resign and join Microsoft unless Sam Altman was reinstated and the board members resigned. During negotiations, board member Helen Toner reportedly said that allowing OpenAI to be destroyed by the departure of its investors and employees would be “consistent with the mission.” Ilya Sutskever later flipped sides and joined the petition, tweeting “I deeply regret my participation in the board's actions.”

Microsoft tries to poach OpenAI employees. Microsoft — OpenAI’s largest minority stakeholder — had not been informed of the board's plans, and offered OpenAI employees positions in its own AI research team. It briefly seemed as if Microsoft had successfully hired Sam Altman, Greg Brockman, and other senior OpenAI employees.

Sam Altman returns as CEO. On November 21st, OpenAI announced an agreement under which Sam Altman would return as CEO and the board would be reorganized. The initial board consists of former Salesforce CEO Bret Taylor, former Secretary of the Treasury Larry Summers, and Adam D’Angelo. Among its first tasks is expanding the board, which will include a non-voting member from Microsoft. Upon his return, Sam Altman also faces an internal investigation of his conduct.

This series of events marks a time of significant change in OpenAI’s internal governance structure. 

Klobuchar and Thune’s “light-touch” Senate bill

Senators Amy Klobuchar and John Thune introduced a new AI bill. It would require companies building high-risk AI systems to self-certify that they follow recommended safety guidelines. Notably, the bill covers only AI systems built for high-risk domains such as hiring and healthcare, so its main provisions would not apply to many general-purpose foundation models, including GPT-4.

The bill regulates specific AI applications, not general purpose AI systems. This application-based approach is similar to that taken by initial drafts of the EU AI Act, which specified domains where AI systems may be used for sensitive purposes, making them high-risk. General-purpose models like ChatGPT were not within the scope of the Act, but public use of these models and the demonstration of their capabilities has sparked debate on how to approach regulating them. 

This suggests that the Senate bill’s current approach may not go far enough: by governing AI systems based on their applications, it leaves highly capable general-purpose systems unregulated.

Risk assessments are mandatory, but enforcement may be difficult. The bill requires developers and deployers of high-risk AI systems to perform an assessment every two years evaluating how potential risks from their systems are understood and managed. Additionally, the Department of Commerce will develop certification standards with the help of the to-be-established AI Certification Advisory Committee, which will include industry stakeholders.

Because companies are asked to self-certify their own compliance with these standards, it will be important for the Commerce Department to ensure companies are actually following the rules. But the bill offers few enforcement options. The agency is not provided any additional resources for enforcing the new law. Moreover, it can only prevent a model from being deployed if it determines that the bill’s requirements were violated intentionally. If an AI system accidentally violates the law, the agency will be able to fine the company that built it, but will not be able to prohibit its deployment.

Mandatory identification of AI-generated content. The bill would require digital platforms to notify users when they are presented with AI-generated content. To ensure that malicious actors cannot pass off AI-generated content as authentic, NIST would develop new technical standards for determining the provenance of digital content.

  • Google DeepMind released Gemini, a new AI system that’s similar to GPT-4 Vision and narrowly beats it on a variety of benchmarks.
  • Donald Trump says that as president he would immediately cancel Biden’s executive order on AI.
  • Secretary of Commerce Gina Raimondo spoke on AI, China, GPU export controls, and more.
  • The New York Times released a profile on the origins of today’s major AGI labs.
  • The Congressional Research Service released a new report about AI for biology.
  • Inflection released another LLM, with performance between that of GPT-3.5 and GPT-4.
  • A new open source LLM from Chinese developers claims to outperform Llama 2.
  • Here’s a new syllabus about legal and policy perspectives on AI regulation.
  • Two Swiss universities have started a new research initiative on AI and AI safety.
  • BARDA is accepting applications to fund AI applied to health security and CBRN threats.
  • The Future of Life Institute’s new affiliate will incubate new organizations addressing AI risks.
  • For those attending NeurIPS 2023, the UK’s AI Safety Institute will host an event, and there will also be an AI Safety Social.

See also: CAIS website, CAIS twitter, A technical safety research newsletter, An Overview of Catastrophic AI Risks, and our feedback form

Listen to the AI Safety Newsletter for free on Spotify.

Subscribe here to receive future versions.
