Researcher at the Center on Long-Term Risk. All opinions my own.
So then, the difference between (a) and (b) is purely empirical, and MNB does not allow me to compare (a) and (b), right? This is what I'd find a bit arbitrary, at first glance.
Gotcha, thanks! Yeah, I think it's fair to be somewhat suspicious of giving special status to "normative views". I'm still sympathetic to doing so for the reasons I mention in the post (here). But it would be great to dig into this more.
What would the justification standards in wild animal welfare say about uncertainty-laden decisions that involve neither AI nor animals: e.g. as a government, deciding which policies to enact, or as a US citizen, deciding who to vote for as President?
Yeah, I think this is a feeling that the folks working on bracketing are trying to capture: that in quotidian decision-making contexts, we generally use the factors we aren't clueless about (@Anthony DiGiovanni -- I think I recall a bracketing piece explicitly making a comparison to day-to-day decision making, but now can't find it... so correct me if I'm wrong!). So I'm interested to see how that progresses.
I think the vast majority of people making decisions about public policy or who to vote for either aren't ethically impartial, or they're "spotlighting", as you put it. I expect the kind of bracketing I'd endorse upon reflection to look pretty different from such decision-making.
That said, maybe you're thinking of this point I mentioned to you on a call: I think even if someone is purely self-interested (say), they plausibly should be clueless about their actions' impact on their expected lifetime welfare, because of strange post-AGI scenarios (or possible afterlives, simulation hypotheses, etc.).[1] See this paper. So it seems like the justification for basic prudential decision-making might have to rely on something like bracketing, as far as I can tell. Even if it's not the formal theory of bracketing given here. (I have a draft about this on the backburner, happy to share if interested.)
I used to be skeptical of this claim, for the reasons argued in this comment. I like the "impartial goodness is freaking weird" intuition pump for cluelessness given in the comment. But I've come around to thinking "time-impartial goodness, even for a single moral patient who might live into the singularity, is freaking weird".
Would you say that what dictates my view on (a) vs. (b) is my uncertainty between different epistemic principles?
It seems pretty implausible to me that there are distinct normative principles that, combined with the principle of non-arbitrariness I mention in the "Problem 1" section, imply (b). Instead I suspect Vasco is reasoning about the implications of epistemic principles (applied to our evidence) in a way I'd find uncompelling even if I endorsed precise Bayesianism. So I think I'd answer "no" to your question. But I don't understand Vasco's view well enough to be confident.
Can you explain more why answering "no" makes metanormatively bracketing, in consequentialist bracketing, (a bit) arbitrary? My thinking is: Let E be epistemic principles that, among other things, require non-arbitrariness. (So, normative views that involve E might provide strong reasons for choice, all else equal.) If it's sufficiently implausible that E would imply Vasco's view, then E will still leave us clueless, because of insensitivity to mild sweetening.
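To make "insensitivity to mild sweetening" concrete, here's a minimal sketch with made-up numbers (not anything from the post): if the admissible credences disagree about an option's sign relative to doing nothing, the two are incomparable under maximality, and adding a small sure benefit to the option doesn't change that.

```python
# Toy illustration (hypothetical numbers) of insensitivity to mild sweetening
# under maximality with imprecise credences.

def preferred(evs_a, ev_b):
    """A is choiceworthy over B only if it beats B under every admissible credence."""
    if all(ev > ev_b for ev in evs_a):
        return "A"
    if all(ev < ev_b for ev in evs_a):
        return "B"
    return "incomparable"

# Expected value of option A under three admissible credences; B is doing nothing.
evs_a = [-5.0, 2.0, 10.0]
print(preferred(evs_a, 0.0))                       # incomparable

# Mild sweetening: add a small sure benefit to A.
print(preferred([ev + 0.1 for ev in evs_a], 0.0))  # still incomparable
```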
But lots of the interventions in 2. seem to also be helpful for getting things to go better for current farmed and wild animals, e.g. because they are aimed at avoiding a takeover of society by forces which don't care at all about morals.
Presumably misaligned AIs are much less likely than humans to want to keep factory farming around, no? (I'd agree the case of wild animals is more complicated, if you're very uncertain or clueless whether their lives are good or bad.)
Thanks Jo! Yeah, the perspective I defend in that post, in a nutshell, is:
(Similarly, the decision theory of "bracketing" might also resolve incomparability within consequentialism, but see here for some challenges.)
Re: the first link, what do you think of Dynamic Strong Maximality, which avoids money pumps while allowing for incomparability?
this happens to break at least the craziest Pascalian wagers, assuming plausible imprecise credences (see DiGiovanni 2024).
FWIW, since writing that post, I've come to think it's still pretty dang intuitively strange if taking the Pascalian wager is permissible on consequentialist grounds, even if not obligatory. Which is what maximality implies. I think you need something like bracketing in particular to avoid that conclusion, if you don't go with (IMO really ad hoc) bounded value functions or small-probability discounting.
(This section of the bracketing post is apropos.)
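To make the contrast concrete, here's a toy numerical sketch (my own illustrative numbers, not from the post) of how an unbounded expected value calculation favors a Pascalian wager, while a bounded value function or small-probability discounting rejects it:

```python
import math

p = 1e-20         # hypothetical probability the wager pays off
payoff = 1e40     # hypothetical astronomical payoff if it does
sure_thing = 1.0  # modest guaranteed value of the alternative

# Unbounded expected value: the wager dominates.
ev_wager = p * payoff   # 1e20
ev_sure = sure_thing    # 1.0

# Bounded value function, e.g. v(x) = B * (1 - exp(-x / B)) with bound B.
B = 1e6
def v(x):
    return B * (1 - math.exp(-x / B))

ev_wager_bounded = p * v(payoff)   # ~1e-14: the wager now loses badly
ev_sure_bounded = v(sure_thing)    # ~1.0

# Small-probability discounting: probabilities below a threshold count as zero.
threshold = 1e-10
ev_wager_discounted = (p if p >= threshold else 0.0) * payoff   # 0.0

print(ev_wager > ev_sure)                  # True
print(ev_wager_bounded > ev_sure_bounded)  # False
print(ev_wager_discounted > ev_sure)       # False
```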
This particular claim isn't empirical, it's about what follows from compelling epistemic principles.
(As for empirical evidence that would change my mind about imprecision being so severe that we're clueless, see our earlier exchange. I guess we hit a crux there.)
Quotes: Recent discussions of backfire risks in AI safety
Some thinkers in AI safety have recently pointed out various backfire effects that attempts to reduce AI x-risk can have. I think pretty much all of these effects were known before,[1] but it's helpful to have them front of mind. In particular, I'm skeptical that we can weigh these effects against the upsides precisely enough to say an AI x-risk intervention is positive or negative in expectation, without making an arbitrary call. (Even if our favorite intervention doesn't have these specific downsides, we should ask if we're pricing in the downsides (and upsides) we haven't yet discovered.)
(Emphasis mine, in all the quotes below.)
Holden's Oct 2025 80K interview:
Helen Toner’s Nov 2025 80K interview:
Wei Dai, “Legible vs. Illegible AI Safety Problems”:
Among other sources, see my compilations of backfire effects here and here, and discussion of downside risks of capacity-building / aiming for "option value" or "wiser futures" here.