Next week for The 80,000 Hours Podcast I'm interviewing Ajeya Cotra, senior researcher at Open Philanthropy, AI timelines expert, and author of *Without specific countermeasures, the easiest path to transformative AI likely leads to AI takeover*.

What should I ask her?

My first interview with her is here:

Some of Ajeya's work includes:


Artir Kel (aka José Luis Ricón Fernández de la Puente) at Nintil wrote an essay broadly sympathetic to AI risk scenarios but doubtful of a particular step in the power-seeking stories Cotra, Gwern, and others have told. In particular, he has a hard time believing that a scaled-up version of present systems (e.g. Gato) would learn facts about itself (e.g. that it is an AI in a training process, what its trainers' motivations are, etc.) and incorporate those facts into its planning (Cotra calls this "situational awareness"). Some AI safety researchers I've spoken to personally agree with Kel's skepticism on this point.

Since incorporating this sort of self-knowledge into one's plans is necessary for breaking out of training, initiating deception, etc., this seems like a pretty important disagreement. In fact, Kel claims that if he came around on this point, he would agree almost entirely with Cotra's analysis.

Can she describe in more detail what situational awareness means? Could it be demonstrated with current or near-term models? Why does she think that Kel (and others) find it so unlikely?

Is marginal work on AI forecasting useful? With so much brainpower being spent on moving a single number up or down, I'd expect it to hit diminishing returns pretty fast. To what extent is forecasting a massive brain drain, such that people should just get to work on the object-level problems if they're sufficiently convinced? How sensitive are your priorities over object-level projects to AI forecasting estimates (as in, how many more years out would your estimate of X have to be)?

Update: I added some arguments against forecasting here, but they are very general, and I suspect they will be overwhelmed by evidence related to specific cases.

What does she think policymakers should be trying to do to prevent risks from misaligned AI? 

How would she summarise the views of various folks with <2030 median timelines (e.g. Daniel Kokotajlo, the Conjecture folks), and what are her cruxes for disagreeing with them?

Nice, looking forward to hearing this!

If she had to direct $1B of funding (in end-point grants) toward TAI x-safety within 12 months, what would she spend it on? How about $10B, or $100B?

Maybe even that isn't thinking big enough. At some point, with enough funding, it could be possible to buy up and retire most existing AGI capabilities projects, at least in the West. Maybe the rest of the world would then largely follow suit (as has happened with things like global conformity on bioethics). On a smaller scale, there is the precedent of curtailing the development of electric vehicles, which perhaps set that field back a decade. And EAs have discussed related ideas like buying up coal mines to limit climate change.

What level of spe...

What would she do if she were elected President of the USA with a mandate to prevent existential catastrophe from AI?

Why is forecasting TAI with bio anchors useful? Many argue that the compute power of the human brain won't be needed for TAI, since AI researchers are not designing systems that mimic the brain.