

This is an anonymous account.


Critiques of Prominent AI Safety Labs


While we're taking a short break from writing criticisms, I (the non-technical author) was wondering whether people would find it valuable for us to share (brief) thoughts on what we've learnt so far from writing these first two critiques - such as how to get feedback, balance considerations, handle anonymity concerns, and things we wish were different in the ecosystem to make it easier for people to provide criticisms.

  1. Especially keen to write for the audience of those who want to write critiques
  2. Keen to hear what specific things (if any) people would be curious to hear

We're always open to providing thoughts / feedback / inputs if you are trying to write a critique. I'd like to try and encourage more good-faith critiques that enable productive discourse.

Hi JWS, just wanted to let you know that we've posted our introduction to the series. We hope it adds some clarity to the points you've raised here for others.

We've updated the recommendation about working at Conjecture. 

We posted the Redwood post several weeks late on LW, which might explain the low karma on LW.

Hi Bruce, thanks for this thoughtful comment. We think Conjecture needs to address key concerns before we would recommend working there, although we could imagine Conjecture being the best option for a small fraction of people who are (a) excited by their current CoEm approach, (b) can operate independently in an environment with limited mentorship, (c) are confident they can withstand internal pressure (if there is a push to work on capabilities). As a result of these (and other) comments in this comment thread, we will be updating our recommendation to work at Conjecture. 

That being said, we expect it to be rare that an individual would have an offer from Conjecture but not have access to other opportunities that are better than independent research. In practice many organizations end up competing for the same, relatively small pool of the very top candidates. Our guess is that most individuals who could receive an offer from Conjecture could pursue one of the paths outlined above in our replies to Marius such as being a research assistant or PhD student in academia, or working in an ML engineering position in an applied team at a major tech company (if not from more promising places like the ones we discuss in the original post). We think these positions can absorb a fairly large amount of talent, although we note that most AI/ML fields are fairly competitive. 

Thanks for your offer to help Oli, we really appreciate it. We'll reach out via DM.

Thanks for sharing your experience, this kind of information is really helpful for us to know.

(personal, emotional reflection)

On a personal note, the past few days have been pretty tough for me. I noticed I took the negative feedback pretty hard.

I hope we have demonstrated that we are acting in good faith, willing to update and engage rigorously with feedback and criticism, but some of the comments made me feel like people thought we were trying to be deceptive or mislead people. It's pretty difficult to take that in when it's so far from our intentions.

We try not to let the fact that our posts are anonymous become an excuse for being less rigorous, but sometimes it feels like people don't realize that we are people too. I think comments might be phrased differently if we weren't anonymous.

I think it's especially hard when this post has taken many weekends to complete, and we've invested several hours this week in engaging with comments, which is a tough trade off against other projects.

Brief reflections on the Conjecture post and its reception

(Written by the non-technical primary author)

  • Reception was a lot more critical than I expected. As last time, many good points were raised that identified areas where we weren't clear.
  • We shared the post with reviewers (especially ones we expected to disagree with us), hoping to pre-empt these criticisms. They gave useful feedback.
  • However, what we didn't realize was that the people engaging with our post in the comments were quite different from our reviewers, and didn't share the background knowledge that our reviewers did.
  • We included our bottom-line views (based on previous feedback that we didn't do this enough), and I think it's those views that felt very strong to people.
  • It's really, really hard to share the right level of detail and provide adequate context. I think this post managed to be both too short and too long.
  • Too short: we didn't make as many explicit comparisons benchmarking the research as we could have.
  • Too long: we felt we needed to add context on several points that weren't obvious to low-context readers.
  • When editing a post it's pretty challenging to figure out what background you can assume and what your reader won't know, because readers span a broad range of knowledge. I think nested, collapsible sections could be helpful for keeping posts a reasonable length.
  • We initially didn't give as much detail in some areas because the other (technical) author is time-limited and didn't think it was critical. The post editing process is extremely long for a post of this size and gravity, so we had to make decisions on when to stop iterating. 
  • Overall, I think the post still generated some interesting and valuable discussion, and I hope it at the very least causes people to think more critically about where they end up working. 
  • I am sad that Conjecture didn't engage with the post as much as we would have liked. 
  • I think it's difficult to strike a balance of 'say what you believe to be true' and 'write something people aren't put off by'
  • I think some people expected their views to be reflected in our critique. I'm sympathetic to that to some extent, but I think you can err too far in that direction (and I've seen pushback the other way as well). It feels like with this post, people felt very strongly (many comments were pretty strongly stated), such that it wasn't just a disagreement: some people felt it was a hit piece.
  • I think I want to get better at communicating that, ultimately, these are the views of a very small group of people, these topics are very high uncertainty, and there will be disagreements, but that doesn't mean we have a hidden agenda or something we are trying to push. (I'll probably add this to our intro).  
  • I'd be thrilled to see others write their own evaluations of these or other orgs. 


We didn't do some super basic things which feel obvious in retrospect e.g. explain why we are writing this series. But context is important when people are primed to respond negatively to a post. 

Changes we plan to make:

  • Recruiting "typical" readers for our next review round 
  • Hiring a copyeditor so we can spend more time on substance
  • Figuring out other ways to save time. Ideally, getting other technical contributors on board would be great (it would improve the quality and hopefully provide a slightly different perspective). Unfortunately, it's hard to get people to do unpaid, anonymous work that might get a lot of pushback. 
  • Posting an intro post with basic context we can point people to 
  • The next post will be on Anthropic and will have (substantively) different critiques. I'd ideally like to spend some time figuring out how to murphyjitsu it so that we can meet people where they're at.
  • I want to ensure that we get more engagement from Anthropic (although I can imagine they might not engage much, for different reasons than Conjecture - e.g. NDAs and limits on what they're allowed to say publicly).

We appreciate you sharing your impression of the post. It’s definitely valuable for us to understand how the post was received, and we’ll be reflecting on it for future write-ups.

1) We agree it's worth taking into account aspects of an organization other than their output. Part of our skepticism towards Conjecture – and we should have made this more explicit in our original post (and will be updating it) – is the limited research track record of their staff, including their leadership. By contrast, even if we accept for the sake of argument that ARC has produced limited output, Paul Christiano has a clear track record of producing useful conceptual insights (e.g. Iterated Distillation and Amplification) as well as practical advances (e.g. Deep RL From Human Preferences) prior to starting work at ARC. We're not aware of any equally significant advances from Connor or other key staff members at Conjecture; we'd be interested to hear if you have examples of their pre-Conjecture output you find impressive.

We're not particularly impressed by Conjecture's process, although it's possible we'd change our mind if we knew more about it. Maintaining high velocity in research is certainly a useful component, but hardly sufficient. The Builder/Breaker method proposed by ARC feels closer to a complete methodology. But this doesn't feel like the crux for us: if Conjecture copied ARC's process entirely, we'd still be much more excited about ARC (per-capita). Research productivity is a product of a large number of factors, and explicit process is an important but far from decisive one.

In terms of the explicit comparison with ARC, we would like to note that ARC Theory's team size is an order of magnitude smaller than Conjecture's. Based on ARC's recent hiring post, our understanding is that the theory team consists of just three individuals: Paul Christiano, Mark Xu and Jacob Hilton. If ARC had a team ten times larger and had spent close to $10 million, then we would indeed be disappointed if there were not more concrete wins.

2) Thanks for the concrete examples, this really helps tease apart our disagreement.

We are overall glad that the Simulators post was written. Our view is that it could have been much stronger had it been clearer about which claims were empirically supported versus hypotheses. Continuing the comparison with ARC, we found ELK to be substantially clearer and a deeper insight. Admittedly, ELK is one of the outputs the TAIS community is most excited by, so this is a high bar.

The stuff on SVDs and sparse coding [...] was a valuable contribution. I'd still say it was less influential than e.g. toy models of superposition or causal scrubbing but neither of these were done by like 3 people in two weeks.

This sounds similar to our internal evaluation. We're a bit confused by why "3 people in two weeks" is the relevant reference class. We'd argue the costs of Conjecture's "misses" need to be accounted for, not just their "hits". Redwood's team size and budget are comparable to those of Conjecture, so if you think that causal scrubbing is more impressive than Conjecture's other outputs, then it sounds like you agree with us that Redwood was more impressive than Conjecture (unless you think the Simulators post is head and shoulders above Redwood's other output)?

Thanks for sharing the data point this influenced independent researchers. That's useful to know, and updates us positively. Are you excited by those independent researchers' new directions? Is there any output from those researchers you'd suggest we review?

3) We remain confident in our sources regarding Conjecture's discussions with VCs, although it's certainly conceivable that Conjecture was more open with some VCs than others. To clarify, we are not claiming that Connor or others at Conjecture did not mention anything about their alignment plans or interest in x-risk to VCs (indeed, this would be a barely tenable position for them given their public discussion of these plans), simply that their pitch gave the impression that Conjecture was primarily focused on developing products. It's reasonable for you to be skeptical of this if your sources at Conjecture disagree; we would be interested to know how close to the negotiations those staff were, although we understand this may not be something you can share.

4) We think your point is reasonable. We plan to reflect on this recommendation and will reply here once we have an update.

5) This certainly depends on what "general industry" refers to: a research engineer at Conjecture might well be better for ML skill-building than, say, being a software engineer at Walmart. But we would expect ML teams at top tech companies, or working with relevant professors, to be significantly better for skill-building. Generally we expect quality of mentorship to be one of the most important components of individuals developing as researchers and engineers. The Conjecture team is stretched thin as a result of rapid scaling, and had few experienced researchers or engineers on staff in the first place. By contrast, ML teams at top tech companies will typically have a much higher fraction of senior researchers and engineers, and professors at leading universities comprise some of the best researchers in the field. We'd be curious to hear your case for Conjecture as skill building; without that it's hard to identify where our main disagreement lies.
