Neel Nanda

I'm a recent graduate, interested in finance and AI. I blog about rationality, motivation, social skills and life optimisation at . In November I will start work at Anthropic on language model interpretability research

Join EA Global: London as a virtual attendee

For what it's worth, I was still confused about this until I read this comment thread. The post felt like it was describing virtual attendees joining EAG; it now feels more like you're running a separate and simultaneous virtual event.

Lessons learned running the Survey on AI existential risk scenarios

Thanks a lot for the thorough post-mortem, that was super interesting.

Testers are good at identifying flaws, but bad at proposing improvements

I found this point particularly interesting as someone who gives feedback fairly often - I think I've previously been pretty overconfident about suggesting concrete fixes (and often feel like feedback is incomplete if I don't suggest a fix), and this was a useful update.

Robin Hanson on the Long Reflection

I feel on board with essentially everything in this article. I'm pretty confused by the popularity of the Long Reflection idea - it seems utterly impractical and unrealistic without major changes to how humans act, e.g. for the reasons outlined here. I feel like I must be misunderstanding or missing something?

How would you run the Petrov Day game?

To clarify my position, I PERSONALLY find not pressing the button extremely easy, because I am strongly incentivised to not do it. This means that I don't personally feel like I am demonstrating that I am worthy of trust. If other people feel the same way, the ritual is also ineffective on them. 

Entirely consistently with this, if some people think this is dumb, get tricked, want to troll, etc., it is easy for them to press the button. Ensuring none of the hundred people are like this is a hard problem, and I agree with Oliver that that is an achievement.

Honoring Petrov Day on the EA Forum: 2021

I'd fairly strongly disagree with that take. I think it's an extremely reasonable assumption that a somewhat cartoony red button at the top of a website does no harm when pressed. Someone deliberately chose to put it there, and most features on websites are optimised for user interaction. This only looks unreasonable within the strong frame of having cultural context about Petrov Day.

How would you run the Petrov Day game?

What would you think of making button pressers anonymous? Currently, I will definitely not press the button because I know that this could plausibly lead to negative social consequences for me, and be clearly tied to my identity. That is a purely self-interested thing, rather than me actually taking agency, choosing not to unilaterally destroy the world, and demonstrating myself to be worthy of trust. I imagine this is true for other people too? Which, to me, majorly undercuts the community ritual and trust-building angles.


Alternately, maybe the social consequences are how people are coordinating?

Clarifying the Petrov Day Exercise

I like that idea, and I think it would make me feel much more on board with the idea of the ritual!

Clarifying the Petrov Day Exercise

I think there's an important difference between '100 people opted in to our community ritual, and all successfully coordinated' and '200 people, of whom 100 paid attention and 100 totally ignored the email'. I don't feel any notion of trust or coordination from people ignoring an email, or just not being interested.

Clarifying the Petrov Day Exercise

Thanks for writing this! I personally am a pretty big fan of the idea of this Petrov Day celebration, and have gotten joy from previous years, but mostly within the frame of a game or a prompt for interesting discussions and explorations of norms. To me, this doesn't feel like a community ritual, though it clearly does to some. I could imagine getting value and connection from an alternate version of this that did feel like a real ritual, though.

But I strongly agree that not asking people for consent/to opt in is clearly bad here, and significantly undercuts a lot of the possible value. In particular, I just do not think a community ritual works without asking people to buy in to it. And I feel pretty uncomfortable that people are entered into a game that may have real social consequences for them, or give them the opportunity to upset other people, without really having context on this - in particular, what happened at LessWrong last year seems terrible.
