Hey all, wanted to share what some colleagues at OpenAI are up to: the new Preparedness team has been publicly announced, and they’re hiring!

This team is going to be doing incredibly important work:

  • They’ll be the main team doing evals, forecasting, and risk assessment for catastrophic risk.
  • They’ll be coordinating AGI preparedness (figuring out what protective measures we need, etc.)
  • They’re in charge of developing and maintaining OpenAI’s RDP (Risk-Informed Development Policy), our version of an RSP (Responsible Scaling Policy).

I think this will be one of the most important teams at OpenAI for mitigating AGI risk. The team is led by Aleksander Madry, who is great, and the early team members Tejal and Kevin are awesome.

I think it would be enormously impactful if they can continue to hire people who are really excellent + really get AGI risk. Please seriously consider applying, and spread the word to friends who you think could be a great fit!

Comments (13)
leopold - my key question here would be, if the OpenAI Preparedness team concluded in a year or two that the best way to mitigate AGI risk would be for OpenAI to simply stop doing AGI research, would anyone in OpenAI senior management actually listen to them, and stop doing AGI research? 

If not, this could end up being just another example of corporate 'safety-washing', where the company has already decided what they're actually going to do, and the safety team is just along for the ride.

I'd value your candid view on this; I can't actually tell if there are any conditions under which OpenAI would decide that what they've been doing is reckless and evil, and they should just stop.

Still waiting for an answer to this key question... disappointed not to get one yet.

Yeah, I'd really like to know how they'd respond to information saying they'd have to stop doing something they're strongly incentivized to do, like accelerating AI progress.

I don't think it's very likely, but given the incentives at play, it really matters that the organization is actually able to at least seriously consider the possibility that the solution to AI safety is something they aren't incentivized to do, or are actively disincentivized from doing.

Sharmake -- this is also my concern. But it's even worse than this.

Even if OpenAI workers think that their financial, status, & prestige incentives would make it impossible to slow down their mad quest for AGI, it shouldn't matter, if they take the extinction risks seriously. What good would it do for the OpenAI leaders and devs to make a few extra tens of millions of dollars each, and to get professional kudos for creating AGI, if the result of their hubris is total devastation to our civilization and species?

Either they take the extinction risks seriously, or they don't. If they do, then there are no merely financial or professional incentives that could rationally over-ride the extinction risks. 

My conclusion is that they say they take the extinction risks seriously, but they're lying, or they're profoundly self-deceived. In any case, their revealed preferences are that they prefer a little extra money, power, and status for themselves over a lot of extra safety for everybody else -- and for themselves.

I want to flag that I see quite a lot of inappropriate binarization happening here, and a general dismissal of valid third options.

Either they take the extinction risks seriously, or they don't.

There are other important possibilities, like believing that AI progress will help solve existential risk, or thinking that accelerating AI progress is actually the best intervention available, etc. More generally, once we make weaker or no assumptions about AI risk, we no longer get the binary you've suggested.

So this doesn't really work, because it basically requires us to assume the conclusion, especially for near-term people.

My conclusion is that they say they take the extinction risks seriously, but they're lying, or they're profoundly self-deceived. In any case, their revealed preferences are that they prefer a little extra money, power, and status for themselves over a lot of extra safety for everybody else -- and for themselves.

Sharmake -- in most contexts, your point would be valid, and inappropriate binarization would be a bad thing.

But when it comes to AI X-risk, I don't see any functional difference between dismissing AI X risks, and thinking that AI progress will help solve (other?) X risks, or thinking that increasing AI progress will somehow reduce AI X risks. Those 'third options' just seem like they fall into the overall category of 'not taking AI X risk seriously, at all'.

For example, if people think AI progress will somehow reduce AI X risk, that boils down to thinking that 'the closer we get to the precipice, the better we'll be able to avoid the precipice'. 

If people think AI progress will somehow reduce other X risks, I'd want a realistic analysis of what those other alleged X risks really are, and how exactly AI progress would help. In practice, in almost every blog post and comment I've seen, this boils down to the vague claim that 'AI could help us solve climate change'. But very few serious climate scientists think that climate change is a literal X risk that could kill every living human.

I just want to point out that simply allocating some of the company budget to pay people to spend their time thinking about and studying the potential impacts of the technology the company is developing is, in itself, a good thing.

As for the possibility that they would conclude that the most rational thing would be to stop development: I think the concern is moot anyway because of the many-player dilemma in the AI space (if one stops, the others don't have to), which is (I think) impossible to solve from inside any single company.

Having some of the OpenAI company budget allocated to 'AI safety' could just be safety-washing -- essentially, part of the OpenAI PR/marketing budget, rather than an actual safety effort.

If the safety people don't actually have any power to slow or stop the rush towards AGI, I don't see their utility. 

As for the arms race dilemma, imagine if OpenAI announced one day 'Oh no, we've made a horrible mistake; AGI would be way too risky; we are stopping all AGI-related research to protect humanity; here's how to audit us to make sure we follow through on this promise'. I think the other major players in the AI space would be under considerable pressure from investors, employees, media, politicians, and the public to also stop their AGI research.

It's just not that hard to coordinate on the 'no-AGI-research' focal point if enough serious people decide to do so, and there's enough public support.

"Nobody is on the ball on AGI governance"?

There's a typo!

Outline an experiment plan to (ethically and legally) measure the true feasibility and potential severity of the misuse scenario you described above assuing you have a broad range of resources at your disposal, including an ability to perform human-AI evaluations. *

Hey, at least we know it was written by a human!

At the time of writing, the team is looking to hire for two roles (though presumably multiple people in each role).

  • National Security Threat Researcher (San Francisco, California, United States) — Preparedness
  • Research Engineer, Preparedness (San Francisco, California, United States) — Preparedness

Yep—and in particular, they are looking to hire people who do well on their Preparedness challenge: https://openai.com/form/preparedness-challenge. So if you're interested, try that out!

This link works for me:

https://openai.com/form/preparedness-challenge

(Just without period at the end)
