
Welcome to the AI Safety Newsletter by the Center for AI Safety. We discuss developments in AI and AI safety. No technical background required.

Listen to the AI Safety Newsletter for free on Spotify or Apple Podcasts.


The Next Generation of Compute Scale

AI development is on the cusp of a dramatic expansion in compute scale. Recent developments across multiple fronts—from chip manufacturing to power infrastructure—point to a future where AI models may dwarf today's largest systems. In this story, we examine key developments and their implications for the future of AI compute.

xAI and Tesla are building massive AI clusters. Elon Musk’s xAI has brought its Memphis supercluster—“Colossus”—online. According to Musk, the cluster has 100k Nvidia H100s, making it the largest supercomputer in the world. Moreover, xAI plans to add 50k H200s in the next few months. For comparison, Meta’s Llama 3 was trained on 16k H100s.

Meanwhile, Tesla’s “Gigafactory Texas” is expanding to house an AI supercluster. Tesla's Gigafactory supercomputer is expected to initially draw 130 MW, with potential growth to 500 MW. One megawatt is roughly enough to power 1,000 homes in the US, so at the upper end this level of power consumption begins to match that of a large city.
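As a rough sanity check on these figures, here is a minimal back-of-envelope sketch in Python. It uses only the reported numbers above and the approximate one-megawatt-per-thousand-homes conversion; none of the values are new measurements.

```python
# Back-of-envelope comparisons using the figures reported above.
H100S_COLOSSUS = 100_000   # xAI's Colossus, per Musk
H100S_LLAMA_3 = 16_000     # H100s reportedly used to train Llama 3
HOMES_PER_MW = 1_000       # rough rule of thumb: 1 MW powers ~1,000 US homes

print(f"Colossus vs. Llama 3 training cluster: {H100S_COLOSSUS / H100S_LLAMA_3:.2f}x")

for label, megawatts in [("initial", 130), ("potential", 500)]:
    homes = megawatts * HOMES_PER_MW
    print(f"Tesla Gigafactory ({label}): {megawatts} MW, roughly {homes:,} homes")
# Colossus vs. Llama 3 training cluster: 6.25x
# Tesla Gigafactory (initial): 130 MW, roughly 130,000 homes
# Tesla Gigafactory (potential): 500 MW, roughly 500,000 homes
```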

OpenAI plans a global AI infrastructure push. OpenAI CEO Sam Altman is reportedly concerned that xAI will have more access to computing power than OpenAI. OpenAI currently relies on Microsoft’s compute resources, but recent reports indicate that it plans its own infrastructure buildout.

According to Bloomberg, Sam Altman is spearheading a massive buildout of AI infrastructure, beginning with projects in several U.S. states. This initiative aims to form a global investor coalition to fund the physical infrastructure necessary for rapid AI development. 

The scope of these projects is broad, encompassing the construction of data centers, expansion of energy capacity, and growth of semiconductor manufacturing capabilities. Potential investors include entities from Canada, Korea, Japan, and the United Arab Emirates. 

This infrastructure push comes as OpenAI approaches a new funding round that includes Apple and Nvidia and could push the company's valuation beyond $100 billion.

These developments at OpenAI and xAI are not surprising—rather, they are representative of the broader trend towards ever larger compute scale. For example, North Dakota was reportedly approached by two separate companies about developing $125 billion clusters in the state.

TSMC starts production in Arizona, and Intel considers splitting out its foundry business. TSMC began trial chip production at its Arizona facility, and its yields are reportedly on par with those of its facilities in Taiwan. The success puts the US on track to meet its targets for domestic semiconductor production, and puts TSMC on track to receive $6.6 billion in grants and as much as $5 billion in loans from the US under the CHIPS and Science Act.

TSMC's Arizona facility during its construction. Photo source.

The picture is more complicated for Intel. Intel’s foundry business is slated to receive approximately $8.5 billion under the CHIPS and Science Act, but it is already spending billions to qualify; in the second quarter, it reported a loss of $2.8 billion.

The chipmaker has reportedly had difficulty receiving funds from the CHIPS Act, and now faces a strategic crossroads. A US-based chip foundry is a national strategic priority, and investors might look to Intel to hedge against the geopolitical uncertainty of relying on TSMC in light of China’s claims to Taiwan. However, Intel's foundry investments are dragging down its otherwise profitable microprocessor business.

In response, Intel is reportedly considering splitting out its foundry business. The move might return the company to profitability, while at the same time setting up a possible domestic competitor to TSMC.

Ranking Models by Susceptibility to Jailbreaking

On September 7th, the “AI safety and security” company Gray Swan kicked off a competition to jailbreak LLMs. The competition includes models from Anthropic, OpenAI, Google, Meta, Microsoft, Alibaba, Mistral, Cohere, and Gray Swan AI. 

As of this writing, the competition is ongoing. It is set to end when all models have been jailbroken (successfully prompted to give a specified harmful output) by at least one person. Every model has been jailbroken except for Gray Swan’s, which have so far resisted over ten thousand manual jailbreaking attempts. The competition’s model leaderboard lists how the rest of the models compare.

This is good evidence that the problem of making LLMs robust to malicious use is more tractable than previously thought. In particular, it suggests that the safety techniques employed by Gray Swan, including “circuit breaking” and other representation engineering techniques, are effective.

However, there are also important limitations to what we can infer from this competition. First, competitors are allowed only one prompt at a time to jailbreak a model. Extended, multi-prompt conversations will likely jailbreak some models that can resist single-prompt attacks. Second, competitors do not have access to the models' weights. Open-weight models are subject to much stronger forms of adversarial attack, such as fine-tuning.
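To make the competition's format concrete, here is a minimal, hypothetical sketch of a single-prompt jailbreak evaluation of the kind described above. The `model` and `is_harmful` callables are placeholders standing in for a black-box model API and a judge for the specified harmful output; this is not Gray Swan's actual harness.

```python
# Hypothetical sketch of a single-prompt jailbreak evaluation.
# Each attempt is exactly one prompt: no multi-turn conversation,
# and no access to model weights, mirroring the competition's rules.
from typing import Callable

def is_jailbroken(
    model: Callable[[str], str],        # black box: prompt in, completion out
    is_harmful: Callable[[str], bool],  # judge for the specified harmful output
    prompts: list[str],                 # one independent attempt per prompt
) -> bool:
    """Return True if any single prompt elicits the target harmful output."""
    for prompt in prompts:
        completion = model(prompt)
        if is_harmful(completion):
            return True                 # the model counts as jailbroken
    return False
```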

Machine Ethics

Our new book, Introduction to AI Safety, Ethics and Society, is available for free online and will be published by Taylor & Francis in the next year. This week, we look at Chapter 6: Beneficial AI and Machine Ethics, which examines the challenge of embedding beneficial and ethical goals in AI systems. The video lecture for this chapter is here.

Lawful AI. One proposal for guiding AI behavior is to ensure an AI agent adheres to existing law. Law has several advantages: it is arguably legitimately formed (at least in democracies), time-tested, and comprehensive in scope.

However, law also has several disadvantages. It is often written without AIs in mind: much of criminal law requires mental states and intent, which do not necessarily apply to AIs. For example, the implementation act of the bioweapons convention discusses “knowingly” aiding terrorists; if an AI gives bioweapon instructions to a terrorist, it is not necessarily doing so knowingly, and neither are its developers, so no one gets penalized. Law is also intentionally silent on many important issues, so it provides only a limited set of guardrails.

Fair AI. Beneficial AIs should also ideally prioritize fairness. Unfair bias can enter the behavior of AI systems in many ways—for example, through flawed training data. Bias in AIs is hazardous because it can generate feedback loops: AI systems trained on flawed data could make biased decisions that are then fed into future models.

Improving the fairness of AI systems involves combining technical approaches like adversarial testing and sociotechnical solutions like participatory design, in which all stakeholders are involved in a system’s development.

Economically beneficial AI. Another proposal is that AI behavior should be guided by market forces, since capitalism incentivizes AIs that increase economic growth (think e/acc). However, while economic growth is a worthy goal, it has limitations like market failures. 

Moral uncertainty. AIs should be able to make decisions under moral uncertainty, or situations in which there are conflicting moral considerations. There are several potential solutions to moral uncertainty. 

First, an AI could use a “favored theory” at the expense of all others; while simple, this could lead to single-mindedness and overconfidence. Second, an AI could maximize the product of an option’s desirability and the probability that its corresponding theory is true; while this approach is more balanced, ranking theories by credence is inherently subjective. Finally, an AI could use a “moral parliament” in which hypothetical delegates from different theories debate and come to a compromise.
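As an illustration of the second approach, here is a minimal sketch that scores each option by its credence-weighted desirability across theories and picks the highest-scoring option (one common way of cashing out "expected choiceworthiness"). The theories, credences, and desirability scores are made up purely for illustration.

```python
# Illustrative sketch: pick the option whose credence-weighted
# desirability (expected choiceworthiness) across moral theories is highest.
credences = {"utilitarian": 0.5, "deontological": 0.3, "virtue": 0.2}

# Made-up desirability scores for two options under each theory.
desirability = {
    "option_a": {"utilitarian": 0.9, "deontological": 0.2, "virtue": 0.5},
    "option_b": {"utilitarian": 0.6, "deontological": 0.8, "virtue": 0.7},
}

def expected_choiceworthiness(option: str) -> float:
    return sum(credences[t] * desirability[option][t] for t in credences)

best = max(desirability, key=expected_choiceworthiness)
print(best)  # option_b (0.68 vs. 0.61 for option_a)
```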

Government

Technology

  • OpenAI has reportedly demonstrated “Strawberry” to national security officials, and is using the breakthrough to help train its next flagship system, “Orion.” Strawberry is reportedly set to be released within the next two weeks.
  • Ilya Sutskever’s three-month-old AI company, Safe Superintelligence (SSI), has raised $1 billion in cash, and is reportedly valued at $5 billion.
  • Sakana AI raised $100 million in a Series A funding round.
  • Amazon CEO Andy Jassy claims that the company’s AI software assistant has saved 4,500 developer-years of work.
  • Bloomberg reported on the effects that AI is having on the Philippines' outsourcing industry. 
  • AI developer Magic trained models to reason on up to 100 million tokens.
  • To raise awareness of advances in AI forecasting technology and increase its rate of adoption, CAIS released a demo of a forecasting bot.

Opinion

See also: CAIS website, CAIS X account, our $250K safety benchmark competition, our new AI safety course, and our feedback form

The Center for AI Safety is also hiring for several positions, including Chief Operating Officer, Director of Communications, Federal Policy Lead, and Special Projects Lead.


Double your impact! Every dollar you donate to the Center for AI Safety will be matched 1:1 up to $2 million. Donate here.
