All of Raymond D's Comments + Replies

Throwing in my 2c on this:

  1. I think EA often comes with a certain kind of ontology (consequentialism, utilitarianism, generally thinking in terms of individuals), which is kind of reflected in the top-level problems given here (from the first list: persuasion, human power concentration, AI character and welfare) - not just the focus but the framing of what the problem even is.
  2. I think there are nearby problems which are best understood from a slightly different ontology - how AI will affect cultural development, the shifting of power from individuals to emerge
... (read more)
-3
Frankle Fry
Your concern about EA's consequentialist lens warping these fields resonates with what I found when experimenting with multi-AI deliberation on ethics. I had Claude, ChatGPT, Grok, and Gemini each propose ethical frameworks independently, and each one reflected its training philosophy - Grok was absolutist about truth-seeking, Claude cautious about harm, ChatGPT moderate and consensus-seeking. The key insight: single perspectives hide their own assumptions. It's only when you compare multiple approaches that the blindspots become visible.

This makes your point about EA flooding these areas with one ontology particularly concerning. If we're trying to figure out "AI character" or "gradual disempowerment" through purely consequentialist framing, we might be encoding that bias into foundational work without realizing it.

Maybe the solution isn't avoiding EA involvement, but structuring the work to force engagement with different philosophical traditions from the start? Like explicitly pairing consequentialists with virtue ethicists, deontologists, care ethicists, etc. in research teams. Or requiring papers to address "what would critics from X tradition say about this framing?"

Your "gradual disempowerment" example is perfect - this seems like it requires understanding emergent structures and collective identity in ways that individual-focused utilitarian thinking might miss entirely.

Would you say the risk is:

* EA people not recognizing non-consequentialist framings as valid?
* EA organizational culture making it uncomfortable to disagree with consequentialist assumptions?
* Just sheer numbers overwhelming other perspectives in discourse?
5
Sharmake
My general take on gradual disempowerment, independent of any other issues raised here, is that it's a coherent scenario but ultimately very unlikely to arise in practice. It relies on an equilibrium where the sort of very imperfect alignment needed for long-run divergence between human and AI interests stays stable, even as the reasons the alignment problem in humans is so spotty/imperfect get knocked out.

In particular, I'm relatively bullish on automated AI alignment conditional on having human-level AI that is misaligned but doesn't power-seek or sandbag when we give it reward. So I generally think the situation quite rapidly resolves in one of two ways: either the AI is power-seeking and willing to sandbag/scheme on everything, leading to the classic AI takeover, or the AI is aligned to the principal in such a way that the principal-agent cost becomes essentially 0 over time.

Note I'm not claiming that most humans won't be dead/disempowered; I'm just saying that I don't think gradual disempowerment is worth spending much time/money on. Tom Davidson has a longer post on this here.

I'd quite like to help read some of these. I strongly agree that a table read of the MIRI conversations would be good: given their conversational nature I think a lot of people would find them easier to approach as a recording than as a text log.

Also, my impression is that the Fable of the Dragon Tyrant got a lot out of having a nice video version. If the recordings go well, it might be worth considering commissioning an accompanying video for at least the top prize winner.

Kind of. From a virtue ethicist standpoint, things that happen aren't really good or bad in and of themselves. It's not bad for a child to drown, and it's not good for a child to be saved, because those aren't the sorts of things that can be good or bad. 

It seems very unintuitive if you look at it from a consequentialist standpoint, but it is consistent and coherent, and people who are committed to it find it intuitive.

I guess an equivalent argument from the other side would be something like "Consequentialists think that virtues only matter in terms ... (read more)

1
Charles Dillon 🔸
It makes sense, but it feels like a very narrow conception of what morality ought to concern itself with. In your simulation example, I think it depends on whether we can be fully confident that simulated entities cannot suffer, which seems unlikely to me.

Ah, thank you! I'll keep an eye on it.