JackM

4248 karma · Joined

Bio

Feel free to message me on here.

Comments (753)

FWIW my impression of the EA community's position is that we need to build safe AI, not that we need to stop AI development altogether (although some may hold this view).

Stopping AI development altogether misses out on all the benefits from AI, which could genuinely be extensive and could include helping us with other very pressing problems (global health, animal welfare, etc.).

I do think one can do a tremendous amount of good at OpenAI, and a tremendous amount of harm. I am in favor of roles at AI companies being on the 80,000 Hours job board so that the former is more likely.

My view is that these roles are going to be filled regardless. Wouldn't you want someone who is safety-conscious in them?

It’s only intuitive to me not to eat cars because it isn’t good for wellbeing!

In a world in which cars are tasty and healthy to eat I imagine we wouldn’t find it so irrational to eat them. Unless of course you’d be losing a method of transportation by eating it and can get other options that are just as healthy and tasty for cheaper — in which case we’re just resorting to wellbeing arguments again.

This seems to be question-begging. You appear to be saying "axiological realism gives rise to normative realism because surely axiological realism gives rise to normative realism".

But this means that moral anti-realists must think that you can never have a reason to care about something independent of what you actually do care about. This is crazy as shown by the following cases:

  1. A person wants to eat a car. They know they’d get no enjoyment from it—the whole experience would be quite painful and unpleasant. On moral anti-realism, they’re not being irrational. They have no reason to take a different action.

I think the person wanting to eat a car is irrational because they will not be promoting their wellbeing by doing so, and their wellbeing is what they actually care about.

So the reason not to eat the car isn't stance-independent—it's based on their own underlying values.

What is the reason not to eat the car that isn’t grounded in concern for wellbeing or some other value the person already holds?

Thanks for highlighting the relative lack of attention paid to cause prioritization and cross-cause prioritization. 

I have also written about how important it is to enable EAs to become familiar with existing cause prioritization findings. It's not just about how much research is done but also that EAs can take it into account and act on it.

You’re basically saying that happier machines will be more productive, and so we are likely to make them happy?

Firstly, we don’t necessarily understand consciousness well enough to know whether we are making them happy, or even whether they are conscious.

Also, I’m not so sure that happier means more productive. More computing power, better algorithms, and more data will mean more productivity. I’m open to hearing arguments for why this would also mean the machine is more likely to be happy.

Maybe the causality goes the other way - more productive means more happy. If machines achieve their goals, they get more satisfaction. Then maybe happiness just depends on how easy the goals we give them are. If we set an AI on an intractable problem and it never solves it, maybe it will suffer. But if AIs are constantly achieving things, they will be happy.

I’m not saying you’re wrong just that it seems there’s a lot we still don’t know and the link between optimization and happiness isn’t straightforward to me.

Well, the closest analogue we have today is factory-farmed animals. We use them in a way that causes tremendous suffering. We don't really mean to cause the suffering, but it's a by-product of how we use them.

And another, perhaps even better, analogue is slavery. Maybe we'll end up essentially enslaving digital minds because it's useful to do so - if we were to give them too much freedom, they wouldn't do what we want them to do as effectively.

Creating digital minds just so that they can live good lives is a possibility, but I'd imagine that if you asked someone on the street whether we should do this, they'd look at you like you were crazy.

Again, I'm not sure how things will pan out, and I would welcome strong arguments that suffering is unlikely, but it's something that does worry me.

Do you agree that the experience of digital minds likely dominates far future calculations?

This leads me to want to prioritize making sure that if we do create digital minds, we do so well. This could entail raising the moral status of digital minds, improving our ability to understand sentience and consciousness, and making sure AI goes well and can help us with these things.

Extinction risk becomes less important to me. If we go extinct, we get zero value from digital minds, which seems bad, but it also means we avoid the futures where we create them and they suffer. It’s hard to say if we are on track to creating them to flourish or suffer - I think there are arguments on both sides. The futures where we create digital minds may be the ones where we wanted to “use” them, which could mean them suffering. Alternatively, we have seen our moral circle expand over time and this may continue, so there is a real possibility we could create them to flourish. I don’t have a clear view of which side wins here, so overall going extinct doesn’t seem obviously terrible to me.

36% disagree

This is a question I could easily change my mind on.

The experience of digital minds seems to dominate far future calculations. We can get a lot of value from this, a lot of disvalue, or anything in between.

If we go extinct, then we get zero value from digital minds. This seems bad, but we also avoid the futures where we create them and they suffer. It’s hard to say if we are on track to creating them to flourish or suffer - I think there are arguments on both sides. The futures where we create digital minds may be the ones where we wanted to “use” them, which could mean them suffering. Alternatively, we have seen our moral circle expand over time and this may continue, so there is a real possibility we could create them to flourish. I don’t have a clear view of which side wins here, so overall going extinct doesn’t seem obviously terrible to me.

We could instead focus on raising the moral status of digital minds, improving our ability to understand sentience and consciousness, improving societal values, and making sure AI goes well and helps us with these things. These efforts robustly increase the expected value of digital sentience in futures where we survive.

So because reducing extinction risk is close to zero in expected value to me, and increasing the value of futures where we survive is robustly positive in expected value, I lean towards increasing the value of futures where we survive.
