
Neel Nanda

5530 karma · neelnanda.io

Bio

I lead the DeepMind mechanistic interpretability team

Comments (417)

I agree that people's takes in response to surveys are very sensitive to framing and hard to interpret. I was trying to gesture at the hypothesis that many people are skeptical of future technologies, afraid of job loss, don't trust tech, etc., even if they do sincerely value loved ones. But anyway, that's not a crux.

I think we basically agree here, overall? I agree that my arguments here are not sufficient to support a large pause for a small reduction in risk. I don't consider this a core point of EA, though I'm not confident in that, and I don't think you're too unreasonable for considering it one.

While I'm skeptical of the type of unilateral pause pushed for in EA, I am much more supportive of not actively pushing capabilities to go faster, since the arguments that pauses are distortionary and penalise safety-motivated actors don't apply there, and most acceleration will diffuse across the ecosystem. This makes me guess that Mechanize is net negative, so I imagine this is also a point of disagreement between us.

I broadly agree that the costs of long pauses look much more expensive if you're not a longtermist. (When I wrote this post, PauseAI and similar efforts were much less of a thing.)

I still stand by this post for a few reasons:

  • "This clearly matters under most reasonable moral views" - In my opinion, person affecting views are not that common a view (though I'm not confident here) and many people would consider human extinction to matter intrinsically, in that it affects their future children or grandchildren and legacy and future generations, quite a lot more than just the lives of everyone alive today, without being total utilitarians. Most people also aren't even utilitarians, and may think that death from old age is natural and totally fine. I just think if you told people "there's this new technology that could cause human extinction, or be a really big deal and save many lives and cause an age of wonders, should we be slow and cautious in how we develop it" most people would say yes? Under specifically a scope sensitive, person affecting view, I agree that pauses are unusually bad
  • I personally don't even expect pauses to work without way more evidence of imminent risk than we currently have (and probably not for more than 6-24 months even then). I think that most actions people in the community take here have far less of a tradeoff: do more safety research, evaluate and monitor things better, actually have any regulation whatsoever, communicate and coordinate with China, model the impact these things will have on the economy, avoid concentrations of power that enable unilateral power grabs, ensure companies can go at an appropriate pace rather than being caught in a mad commercial rush, etc. To be effective, a pause must also include things like a pause on hardware progress and cover all key actors, which seems really hard to achieve and very unrealistic without much stronger evidence of imminent risk; at that point the numbers become much more favourable towards pausing, as my risk conditional on no pause would be higher. I just really don't expect the world to pause on the basis of a precautionary principle.
    • For example, I do interpretability work. I think this is just straightforwardly good under most moral frameworks, and my argument here is sufficient to support much more investment in technical safety research, one of the major actions called for by the community. I care more about emphasising areas of common ground than justifying the most extreme and impractical positions.
  • Personally, my risk figures and timelines are notably beyond the baseline described in this post, so I'm more open to extreme actions like pausing, even on person-affecting grounds, but I agree this is a harder sell requiring stronger arguments than this post makes.

This is a very different life path from most people I know and I thoroughly enjoyed learning about it in so much detail. Thanks for writing this up!

Strong agree that people talk about AI timelines way too much. I think that the level of e.g. 2 vs 5 vs 20 vs way longer is genuinely decision relevant, but that distinctions much more fine-grained than that often aren't. And it's so uncertain, and the evidence is so weak, that I think it's difficult to do much more than putting decent probability on all of those categories and shifting your weightings a bit.

My argument is essentially that "similar income, non-impactful job" is as relevant a reference class to the "similar income, impactful job" person as it is to the "high income, non-impactful job" person. I also personally think reference classes are the wrong way to think about it. If taking a more impactful job also makes someone obliged to take on a lower post-donation salary (when they don't have to), I feel like something has gone wrong, and the incentives are not aligned with doing the most good.

This is reasonable. I think the key point that I want to defend is that it seems wrong to say that choosing a more impactful job should mean you ought to have a lower post-donation salary.

I personally think of it in terms of having some minimum obligation to do your part (which I set at 10% by default), plus encouragement (but not obligation) to do significantly more good if you want to.

Kudos for making the hard decision to cut so many things! This seems like the best thing for GWWC's impact.

Suppose they're triplets, and Charlotte, also initially identical, earns $1M/year just like Belinda, but can't/doesn't want to switch to safety. How much of Charlotte's income should she donate in your worldview? What is the best attitude for the EA community?

My point is that, even though there's a moral obligation, unless you think that high-earning people in finance should be donating a very large fraction of their salary (so that their post-donation pay is less than the pay in AI safety), their de facto moral obligation has increased because of their choice to do direct work, which seems unreasonable to me.

I would also guess that at least most people doing safety work at industry labs could get a very well-paying role at a top-tier finance firm? The talent bar is really high nowadays.

My point is that "other people in the income bracket AFTER taking a lower paying job" is the wrong reference class.

Let's say someone is earning $10M/year in finance. I totally think they should donate some large fraction of their income. But I'm pretty reluctant to argue that they should donate more than 99% of it. So it seems completely fine for them to have a post-donation income above $100K, likely far above.

If this person quits to take a job in AI Safety that pays $100K/year, because they think this is more impactful than their donations, I think it would be unreasonable to argue that they need to donate some of their reduced salary, because then their "maximum acceptable post-donation salary" has gone down, even though they're (hopefully) having more impact than if they had donated everything above $100K.

I'm picking fairly extreme numbers to illustrate the point, but the key point is that choosing to do direct work should not reduce your "maximum acceptable post-donation salary", and that, at least according to my values, that maximum post-donation salary is often above what people get paid in their new direct role.
