
Neel Nanda

3804 karma · Joined Nov 2019 · neelnanda.io

Bio

I lead the DeepMind mechanistic interpretability team

Comments (287)

Strong +1 to this! Also, entertainingly, I know many of the people in the first episode, and they seemed significantly funnier there than they do in real life - clearly I'm not hanging out with you all in the right settings!

Idk, I do just think that bad faith actors exist, especially in the public sphere. It's a mistake to assume that all critics are acting in bad faith, but it's equally naive to assume that none ever are.

The Wenar criticism in particular seems laughably bad, which makes bad faith hypotheses like this fairly convincing to me. I do agree that it's a seductive line of reasoning to follow in general, though, and that this can be dangerous.

I got the OpenPhil grant only after the other grant went through (and wasn't thinking much about OpenPhil when I applied for the other grant). I never thought to inform the other grantmaker after I got the OpenPhil grant, which, in hindsight, I maybe should have done out of courtesy?

This was covering some salary for a fixed period of research, partially retroactive, after an FTX grant fell through. So I guess I didn't have a use for more than X, in some sense (I'm always happy to be paid a higher salary! But I wouldn't have worked for a longer period of time, so I would have felt a bit weird about the situation).

Without any context on this situation, I can totally imagine worlds where this is reasonable behaviour, though perhaps poorly communicated, especially if SFF didn't know they had OpenPhil funding. I personally had a grant from OpenPhil approved for X, but in the meantime had another grantmaker give me a smaller grant for y < X, and OpenPhil agreed to instead fund me for X - y, which I thought was extremely reasonable.

In theory, you can imagine OpenPhil wanting to fund their "fair share" of a project, evenly split across all other interested grantmakers. But it seems harmful and inefficient to wait for other grantmakers to confirm or deny, so "I'll give you 100%, but lower that to 50% if another grantmaker is later willing to go in as well" seems a more efficient version.

I can also imagine that they eg think a project is good if funded up to $100K, but worse if funded up to $200K (eg that they'd try to scale too fast, as has happened with multiple AI Safety projects that I know of!). If OpenPhil funds $100K, and the counterfactual is $0, that's a good grant. But if SFF also provides $100K, that totally changes the terms, and now OpenPhil's grant is actively negative (from their perspective).

I don't know what the right social norms here are, and I can see various bad effects on the ecosystem from this behaviour in general - incentivising grantees to be dishonest about whether they have other funding, disincentivising other grantmakers from funding anything they think OpenPhil might fund, etc. I think Habryka's suggestion of funging, but not to 100%, seems reasonable and probably better to me.

Omg what, this is amazing (though nested bullets not working does seem to make this notably less useful). Does it work for images?

I liked this, and am happy for this to have been a post. Maybe putting [short poem] in the title could help calibrate people on what to expect?

I'd be curious to hear your or Emma's case for why it's notably higher impact for a forum reader to donate via the campaign rather than to New Incentives directly (if they're inclined to make the donation at all).
