8867 karma · Joined Aug 2014



Researching Causality and Safe AI at Oxford

Previously, founder (with help from Trike Apps) of the EA Forum.

Discussing research etc at https://twitter.com/ryancareyai.



Hmm, OK. Back when I met Ilya, around 2018, he was radiating excitement that his next idea would create AGI, and didn't seem sensitive to safety worries. I also thought it was "common knowledge" that his interest in safety increased substantially between 2018 and 2022, which is why I was unsurprised to see him in charge of superalignment.

Re Elon-Zillis, all I'm saying is that at the time the seat was created, it looked to Sam like it would belong to someone loyal to him.

You may well be right about D'Angelo and the others.

  1. The main thing that I doubt is that Sam knew at the time that he was gifting the board to doomers. Ilya was a loyalist and non-doomer when appointed. Elon was, I guess, some mix of doomer and loyalist at the start. Given how AIS worries generally increased in SV circles, more likely than not some of D'Angelo, Hoffman, and Hurd moved toward the "doomer" pole over time.


  1. I think Dario and others would've also been involved in setting up the corporate structure.
  2. Sam never gave the "doomer" faction a near majority. That only happened because 2-3 "non-doomers" left and Ilya flipped.

Causal Foundations is probably 4-8 full-timers, depending on how you count the small-to-medium slices of time from various PhD students. Several of our 2023 outputs seem comparably important to the deception paper: 

  • Towards Causal Foundations of Safe AGI, The Alignment Forum - the summary of everything we're doing.
  • Characterising Decision Theories with Mechanised Causal Graphs, arXiv - the most formal treatment yet of TDT and UDT, together with CDT and EDT in a shared framework.
  • Human Control: Definitions and Algorithms, UAI - a paper arguing that corrigibility is not exactly the right thing to be aiming for to assure good shutdown behaviour.
  • Discovering Agents, Artificial Intelligence Journal - an investigation of the "retargetability" notion of agency.

What if you just pushed it back one month - to late June?

2 - I'm thinking more of the "community of people concerned about AI safety" than EA.

1,3,4 - I agree there's uncertainty, disagreement, and nuance, but I think if the NYT's (summarised) or Nathan's version of events is correct (and they do seem to me to make more sense than other existing accounts), then the board looks somewhat like the "good guys", albeit ones that overplayed their hand, whereas Sam looks somewhat "bad", and I'd bet that over time, more reasonable people will come around to such a view.

It's a disappointing outcome - it currently seems that OpenAI is no more tied to its nonprofit goals than before. A wedge has been driven between the AI safety community and OpenAI staff, and, to an extent, Silicon Valley generally.

But in this fiasco, we at least were the good guys! The OpenAI CEO shouldn't control its nonprofit board, or compromise the independence of its members, who were doing broadly the right thing by trying to do research and perform oversight. We have much to learn.

Yeah, I think EA just neglects the downside of career whiplash a bit. Another instance is how EA orgs sometimes offer internships where only a tiny fraction of interns will get a job, or hire and then quickly fire staff. In a more ideal world, EA orgs would value rejected and fired applicants much more highly than non-EA orgs do, and so low-hit-rate internships and rapid firing would be much less common in EA than outside it.

It looks like, on net, people disagree with my take in the original post. 

I just disagreed with the OP because it's a false dichotomy; we could just agree with the true things that activists believe, and not the false ones, and not go based on vibes. We desire to believe that mech-interp is mere safety-washing iff it is, and so on.
