The Epistemic Challenge to Longtermism (Tarsney, 2020)

MichaelA

This is a linkpost for https://globalprioritiesinstitute.org/christian-tarsney-the-epistemic-challenge-to-longtermism/

Abstract from the paper

Longtermists claim that what we ought to do is mainly determined by how our actions might affect the very long-run future. A natural objection to longtermism is that these effects may be nearly impossible to predict— perhaps so close to impossible that, despite the astronomical importance of the far future, the expected value of our present options is mainly determined by short-term considerations. This paper aims to precisify and evaluate (a version of) this epistemic objection to longtermism. To that end, I develop two simple models for comparing “longtermist” and “short-termist” interventions, incorporating the idea that, as we look further into the future, the effects of any present intervention become progressively harder to predict. These models yield mixed conclusions: If we simply aim to maximize expected value, and don’t mind premising our choices on minuscule probabilities of astronomical payoffs, the case for longtermism looks robust. But on some prima facie plausible empirical worldviews, the expectational superiority of longtermist interventions depends heavily on these “Pascalian” probabilities. So the case for longtermism may depend either on plausible but non-obvious empirical claims or on a tolerance for Pascalian fanaticism.

Why I'm making this linkpost

I want to draw a bit more attention to this great paper
- I think this is one of the best sources for people interested in arguments for and against longtermism
  - For people who are interested in learning about longtermism and are open to reading (sometimes somewhat technical) philosophy papers, I think the main two things I'd recommend they read are The Case for Strong Longtermism and this paper
    - Other leading contenders are The Precipice, Existential Risk Prevention as Global Priority, and some of the posts tagged Longtermism
I want to make it possible to tag the post so that people find it later, via tags, when it's relevant to what they're looking for (e.g., I'd want people who check out the Longtermism tag to see a pointer to this paper come up prominently)
I want to make it easier for people to get a quick sense of whether it's worth their time to engage with this paper, given their goals (because people can check this post's karma, comments, and/or tags)
I want to give people a space to discuss the paper in a way that other people can see and build on
- I'll share a bunch of my own comments below
  - (I'll try to start each one with a tl;dr for that comment)

Comments (28)

JackM3y16

In case anyone is interested, Rob Wiblin will be interviewing Tarsney on the 80,000 Hours podcast next week. Rob is accepting question suggestions on Facebook (I think you can submit questions to Rob on Twitter or by email too).

MichaelA3y15

tl;dr: Tarsney's model updates me towards thinking reducing non-extinction existential risks should be a little less of a priority than I previously thought.

Here's a quote from Tarsney (which makes more sense after reading the rest of the paper):

The potentially enormous impact that the long-term rate of ENEs [events which nullify the intended effect of a longtermist intervention, e.g. a later extinction event] has on the expected value of longtermist interventions has implications for “intra-longtermist” prioritization: We have strong pro tanto reason to focus on bringing about states such that both they and their complements are highly stable, since it is these interventions whose effects are likely to persist for a very long time (and thus to affect our civilization when it is more widespread and resource-rich). This suggests, in particular, that interventions focused on reducing existential risk may have higher expected value than, say, interventions aimed at reforming institutions or changing social values: Intuitively, the intended effects of these interventions are relatively easy to undo, or to achieve at some later date even if we fail to achieve them now. So the long-term rate of ENEs (i.e., value of r) may be significantly higher for these interventions than for existential risk mitigation.

See also Greaves and MacAskill's concept of "attractor states".

This indeed seems like an interesting implication of Tarsney's model, and indeed updates me towards placing a bit less emphasis on reducing non-extinction existential risks - e.g., reducing the chance of lock-in of a bad governmental system or set of values.

(I already considered this a lower priority for longtermists as a whole than reducing extinction risks. But I also thought that longtermists should prioritise investigating this potential priority more than they currently do. I still think that, but now with a bit lower confidence.)

---

That said, I also think Tarsney's phrasing is a bit misleading. He compares "interventions focused on reducing existential risk" to "interventions aimed at reforming institutions or changing social values". But interventions may be aimed at doing the latter as a means to doing the former; one could try to change institutions or social values with the primary goal of ultimately reducing existential risk (or extinction risk specifically). And Tarsney's model doesn't seem to push against those interventions relative to other means of reducing existential risk.

I think Tarsney really wants to compare interventions aimed at reducing extinction risk to interventions ultimately aimed at changing aspects of the long-term future other than whether humanity goes extinct - e.g., again, reducing the chance of lock-in of a bad governmental system or set of values.

This highlights another way in which Tarsney's phrasing seems a bit misleading: existential risk itself already includes non-extinction existential risk. So I think Tarsney should use the term "extinction risk" here.

JackM3y4

This indeed seems like an interesting implication of Tarsney's model, and indeed updates me towards placing a bit less emphasis on reducing non-extinction existential risks - e.g., reducing the chance of lock-in of a bad governmental system or set of values.

Surely "lock-in" implies stability and persistence?

Greaves and MacAskill introduce the concept of the 'non-extinction attractor state' to capture interventions that can achieve the persistence Tarsney says is so important, but that don't rely on extinction to do so.

This includes institutional reform:

But once such institutions were created, they might persist indefinitely. Political institutions often change as a result of conflict or competition with other states. For strong world governments, this consideration would not apply (Caplan 2008). In the past, governments have also often changed as a result of civil war or internal revolution. However, advancing technology might make that far less likely for a future world government: modern and future surveillance technologies could prevent insurrection, and AI-controlled police and armies could be controlled by the leaders of the government, thereby removing the possibility of a military coup (Caplan 2008; Smith 2014).

MichaelA3y4

Surely "lock-in" implies stability and persistence?

Yeah, definitely. I see now that I didn't clearly explain what I meant. It's not that I changed my views on how important the difference is between lock-in of a bad governmental system or set of values and a future without such a lock-in.

It's more like I somewhat updated my views regarding:

how likely such a lock-in is
- and in particular how likely it is that a state that looks like it might be a lock-in would actually be a lock-in
  - and in particular how much the epistemic challenge to longtermism might undermine a focus on this type of potential lock-in in particular

And as a result, I somewhat updated my views regarding how much we should focus on preventing these outcomes. This is analogous to how I'd update my prioritisation of biorisk if I learned the relevant catastrophes were less likely than I thought, even if no less bad.

(I'm still not sure that explanation is 100% clear.)

And yeah, Greaves and MacAskill's "non-extinction attractor state" concept is relevant here, and I liked that section of their paper :)

JackM3y3

OK that's clearer, although I'm not immediately sure why the paper would have achieved the following:

I somewhat updated my views regarding:
how likely such a lock-in is
and in particular how likely it is that a state that looks like it might be a lock-in would actually be a lock-in
...

I think Tarsney implies that institutional reform is less likely to be a true lock-in, but he doesn't really back this up with much argument. He just implies that this point is somewhat obvious. Under this assumption, I can understand why his model would lead to the following update:

...
...
and in particular how much the epistemic challenge to longtermism might undermine a focus on this type of potential lock-in in particular

In other words, if Tarsney had engaged in a discussion about why institutional change isn't actually likely to be stable/persistent, providing object-level reasons for why (which may involve disagreeing with Greaves and MacAskill's points), I think I too would update away from thinking institutional change is that important, but I don't think he really engages in this discussion.

I should say that I haven't properly read through the whole paper (I have mainly relied on watching the video and skimming through the paper), so it's possible I'm missing some things.

MichaelA3y6

[Writing this comment quickly]

I think it makes sense to be a bit confused about what claim I'm making and why. I read the paper and made the initial version of these notes a few weeks ago, so my memory of what the paper said and how it changed my views is slightly hazy.

But I think the key point is essentially the arguably obvious point that the rate of ENEs can be really important, and that that rate seems likely to be much higher when the target state is something like "a very good system of government or set of values" or "a very bad system of government or set of values" (compared to when the target state is whether an intelligent civilization exists). It does seem much more obvious that extinction and non-extinction are each stronger attractor states than particularly good or particularly bad non-extinction outcomes are.

This is basically something I already knew, but I think Tarsney's models and analysis made the point a bit more salient, and also made it clearer how important it is (since the rate of ENEs seems like probably one of the most important factors influencing the case for longtermism).

But what I've said above kind-of implicitly accepts Tarsney's focus (for the sake of his working example) on simply whether there is an intelligent civilization around, rather than what it's doing. In reality, I think that what the civilization is doing is likely also very important.[1] So the above point about particularly good or particularly bad non-extinction outcomes maybe being only weak attractor states might also undermine the significance of keeping an intelligent civilization around.

But here's one way that might not be true: Maybe we think it's easier to have a lock-in of - or natural trends that maintain - a good non-extinction outcome than a bad non-extinction outcome. (I think Ord essentially implies this in The Precipice. I might soon post something related to this. It's also been discussed in some other places, e.g. here.) If so, then the point about the rate of ENEs suggests the case for avoiding unrecoverable dystopias and unrecoverable collapses might be weak, but it wouldn't as strongly suggest the case for avoiding extinction is weak.

...but this all seems rather complicated, and I'm still not sure my thinking is clear, and even less sure my explanation is clear!

[1] Tarsney does acknowledge roughly this point later in the paper:

Additionally, there are other potential sources of epistemic resistance to longtermism besides Weak Attractors that this paper has not addressed. In particular, these include:
Neutral Attractors To entertain small values of r [the rate of ENEs], we must assume that the state S targeted by a longtermist intervention, and its complement ¬S, are both at least to some extent “attractor” states: Once a system is in state S, or state ¬S, it is unlikely to leave that state any time soon. But to justify significant values of ve and vs, it must also be the case that the attractors we are able to target differ significantly in expected value. And it’s not clear that we can assume this. For instance, perhaps “large interstellar civilization exists in spatial region X” is an attractor state, but “large interstellar civilization exists in region X with healthy norms and institutions that generate a high level of value” is not. If civilizations tend to “wander” unpredictably between high-value and low-value states, it could be that despite their astronomical potential for value, the expected value of large interstellar civilizations is close to zero. In that case, we can have persistent effects on the far future, but not effects that matter (in expectation).

JackM3y2

OK thanks I think that is clearer now.

MichaelA3y11

tl;dr: Tarsney seems to me to understate the likelihood that accounting for non-human animals would substantially affect the case for longtermism.

Tarsney includes a helpful appendix listing the simplifications made in his model/paper, and the rationales for these simplifications. Here's a passage from that:

Simplification: The model ignores effects on the welfare of beings other than Homo sapiens and our “descendants”.
Rationale: (1) The sign and magnitude of the effects of paradigmatic longtermist interventions on the welfare of non-human animals (or their far-future counterparts) are very unclear. (2) Dropping this simplification seems unlikely to change our quantitative results by more than 1–2 orders of magnitude (though this is far from obvious), and so unlikely to affect our qualitative conclusions.

I appreciate Tarsney's caveat that "this is far from obvious", and, given that caveat, I don't strongly disagree with this sentence. But it seems quite plausible to me[1] that considering those effects would strengthen or weaken the case for paradigmatic longtermist interventions by more than 1-2 orders of magnitude, or even that it would flip the sign of the expected value of those interventions.

Relatedly, I also think that considering those effects should plausibly change which longtermist interventions we support (not just whether we support them vs non-longtermist interventions).

(I'm not sure how likely I see these things as, so maybe I actually agree with Tarsney that this "seems unlikely [but with that being far from obvious]".)

[1] We could operationalise "it seems quite plausible to me that X" as something like "there's at least a 20% chance that I would think X if I spent another 100 hours of thinking about the topic".

MichaelA3y7

tl;dr: Tarsney writes "resources committed at earlier time should have greater impact, all else being equal". I think that this is misleading and an oversimplification. See Crucial questions about optimal timing of work and donations and other posts tagged Timing of Philanthropy.

(But that claim was not necessary for any of Tarsney's arguments; he just gave it as one reason why the actual case for longtermism might be stronger than his deliberately conservative estimates suggest.)

Context and explanation:

A core part of Tarsney's model is - roughly speaking - the amount by which spending $1 million on mitigating existential risks changes the probability of being in the target state at a given time, relative to the probability that would occur if the short-termist intervention were used. This parameter is represented by p. The target state means something like "The accessible region of the Universe contains an intelligent civilization".

Tarsney makes:

a lower-bound estimate [of p] based on the details of our working example, that is almost certainly far too pessimistic, but nevertheless informative.
The estimate proceeds in two stages: First, how much could humanity as a whole change the probability of [the target state at a particular time] (i.e., roughly, the probability that we survive the next thousand years), relative to the status quo, if we committed all our collective time and resources solely to this objective for the next thousand years? “One percent” seems like a very safe lower bound here (remembering that we are dealing with epistemic probabilities rather than objective chances). Now, if we assume that each unit of time and resources makes the same marginal contribution to increasing the probability of [the target state at that time], we can calculate p simply by computing the fraction of humanity’s resources over the next thousand years that can be bought for 1 million, and multiplying it by 0.01. This yields p [roughly equal to 10 to the power of negative 14].
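The arithmetic in this quoted estimate can be run backwards as a rough check. The sketch below is only an illustration: the 1% shift, the $1 million spend, the thousand-year horizon and p of roughly 10^-14 come from the quote, while the total-resources figure is not stated in the quote and is simply whatever value makes those numbers consistent (all variable and function names are hypothetical).

```python
# Back-of-the-envelope reconstruction of the lower-bound estimate of p.
# Taken from the quoted passage: the 1% shift, the $1 million spend, the
# 1,000-year horizon and p ~ 1e-14. The resource total is NOT in the quote;
# it is derived here as the value implied by those figures.

MAX_PROB_SHIFT = 0.01    # "One percent" lower bound from the quote
SPEND = 1e6              # dollars spent on the longtermist intervention
HORIZON_YEARS = 1_000    # period over which resources are counted
P_QUOTED = 1e-14         # order of magnitude of p reported in the quote


def lower_bound_p(total_resources: float) -> float:
    """p if every unit of resources makes the same marginal contribution."""
    return MAX_PROB_SHIFT * SPEND / total_resources


# Total resources (in dollars) that would reproduce the quoted p.
implied_total = MAX_PROB_SHIFT * SPEND / P_QUOTED
implied_per_year = implied_total / HORIZON_YEARS

print(f"implied total resources: ${implied_total:.0e}")
print(f"implied per year:        ${implied_per_year:.0e}")
print(f"p at that total:         {lower_bound_p(implied_total):.0e}")
```

On these figures the implied total is on the order of $10^18 over the thousand years, i.e. about $10^15 per year. Because p scales linearly with the assumed total, any disagreement about that total translates one-for-one into p.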

Tarsney writes that "This is an extremely conservative lower bound", and that "I think it would be justifiable to adjust p upward from this lower-bound estimate by a several-order-of-magnitude “fudge factor”, if we were so inclined" (though he doesn't do this for his paper). He gives two reasons for this.

The first has to do with diminishing marginal returns and the fact that we'll by default devote far less than all our collective time and resources over the next 1000 years to reducing existential risk. Thus, spending an extra $1 million on the current margin will probably achieve far more than one would expect from "simply ... computing the fraction of humanity’s resources over the next thousand years that can be bought for 1 million". This argument makes sense to me, and I do think it suggests Tarsney's estimate for p is a very conservative one (as he intends).

But then he writes:

Second, resources committed at [an] earlier time should have greater impact, all else being equal. (If nothing else, this is true because resources that might be committed to existential risk mitigation, say, 500 years from now can do nothing to prevent any of the existential catastrophes that might occur in the next 500 years, while resources committed today are potentially impactful any time in the next thousand years.)

It's definitely true that there are many reasons why resources committed at an earlier time could have a greater impact. And the reason Tarsney raises is a valid one; we could describe this as discounting for the possibility that the later use of resources would be "too late". This is an extreme example of how we might miss "windows of opportunity" if we wait too long.

But there are also many reasons why resources committed at a later time could have a greater impact. This is especially true if we don't count resources as committed to a problem when they're used in an investment-like way in order to generate more resources that can be committed later, but it's even true if we do count resources as already committed to a problem when they're "merely invested".

In particular, it's possible that "leverage over the future" (or hingyness, pivotality, etc.) will increase in future. This could occur if:

We know more in future about what we should do
Longtermist priorities become more neglected in future
Windows of opportunity that aren't currently open become open in future
- E.g., there could be a future point at which important global governance institutions are being set up or policy frameworks for a currently unforeseen technology are being set

(For explanation and discussion of the above points, see here.)

Of course, the opposite effects could also occur. My point is merely that "resources committed at [an] earlier time should have greater impact, all else being equal" seems to be either false or misleading.

(I think it'd be reasonable for Tarsney to merely claim that his all-things-considered view is that resources committed at an earlier time will in practice probably have a greater impact. But this more uncertain stance would then weaken the case for a several-order-of-magnitude upwards adjustment of p.)

MichaelA3y6

A final quick thing that came to mind: I think that Ord's concept of existential security could be represented in Tarsney's models as the value of r asymptotically decreasing towards 0 over time. I'd be interested to hear people's thoughts on whether that seems accurate and, if so:

how accounting for various likelihoods of existential security could affect Tarsney's models' conclusions
how considering Tarsney's models could affect Ord's conclusions

(I haven't tried to think this through myself yet.)
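As a first pass at that question, here is a minimal sketch comparing a constant r with an r that decays towards 0. It assumes (my assumption, not something taken from the paper or from Ord) that ENEs arrive as a hazard process, so that the probability an intervention's effect persists to time t is exp of minus the cumulative hazard, and it uses an exponentially decaying r(t) = r0 * exp(-k * t) purely as one convenient example. The values of R0 and K are hypothetical.

```python
import math


def persistence_constant(r: float, t: float) -> float:
    """P(no ENE by time t) with a constant hazard rate r."""
    return math.exp(-r * t)


def persistence_decaying(r0: float, k: float, t: float) -> float:
    """P(no ENE by time t) when the hazard is r(t) = r0 * exp(-k * t).

    The cumulative hazard is the integral of r(s) from 0 to t,
    which equals (r0 / k) * (1 - exp(-k * t)).
    """
    cumulative_hazard = (r0 / k) * (1.0 - math.exp(-k * t))
    return math.exp(-cumulative_hazard)


R0 = 1e-3  # hypothetical initial ENE rate per year
K = 1e-3   # hypothetical speed at which r(t) decays towards 0

for years in (1e3, 1e4, 1e6):
    print(
        f"t={years:.0e}: constant={persistence_constant(R0, years):.3g}, "
        f"decaying={persistence_decaying(R0, K, years):.3g}"
    )

# With a decaying hazard the cumulative hazard is bounded by r0 / k, so the
# persistence probability never falls below this floor.
long_run_floor = math.exp(-R0 / K)
print(f"long-run floor with decaying r: {long_run_floor:.3g}")
```

In this toy setup the constant-r persistence probability goes to 0, while the decaying-r version levels off at exp(-r0/k). If that picture is roughly right, existential security would matter in the models mainly through that non-zero long-run floor, though whether r actually falls fast enough for the cumulative hazard to stay bounded is exactly the open empirical question.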

MichaelA3y6

I think it'd be interesting to run a sensitivity analysis on Tarsney's model(s), and to think about the value of information we'd get from further investigation of:

how likely the future is to resemble Tarsney's cubic growth model vs his steady model
whether there are other models that are substantially likely, and whether the model structures should be changed
what the most reasonable distribution for each parameter is

It seems like the value of information from that might be very high, at least if we think we don't want to accept fanaticism. This is because Tarsney's paper suggests reasonable empirical views could either support the case for longtermism without requiring fanaticism or only support the case for longtermism if we accept fanaticism. So further research on these models, alternative models, and these parameters could perhaps give us a much better sense of how robust the case for longtermism is.

To some extent, this comment can be boiled down to something that was obvious already: "The case for longtermism seems plausible but uncertain, and whether it's true seems very decision-relevant, so maybe investigating whether it's true would be really valuable." But I think Tarsney's paper highlights specific points to look into, and that it would allow for (rough) quantitative estimates of the value of information to be gained by investigating each point.

For a quick and non-quantitative example, it seems that the probability of interstellar settlement has a very large bearing on the results of the model, and it also seems like we should be quite uncertain about that probability.
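To make the idea of a sensitivity analysis slightly more concrete, here is a toy one-at-a-time sketch. It is not Tarsney's actual model: the two closed forms are just what you get from integrating a constant value flow, or a cubically growing one, against an exp(-r * t) survival factor over an unbounded horizon, and every baseline number and parameter name (p, r, q, v_e, v_s, s) is a hypothetical placeholder loosely echoing the paper's notation.

```python
import math


def toy_expected_value(p, r, q, v_e, v_s, s):
    """Toy expected value of a longtermist intervention (illustrative only).

    p   : probability the intervention changes the target state
    r   : long-run rate of ENEs (per year)
    q   : probability of interstellar settlement (weight on the cubic branch)
    v_e : value per year in the steady branch
    v_s : value per unit volume per year in the cubic branch
    s   : speed of expansion in the cubic branch
    """
    # Steady branch: integral of v_e * exp(-r t) dt from 0 to infinity.
    steady = v_e / r
    # Cubic branch: integral of (4/3) pi (s t)^3 v_s exp(-r t) dt = c * 6 / r^4.
    cubic = (4.0 / 3.0) * math.pi * s**3 * v_s * 6.0 / r**4
    return p * (q * cubic + (1.0 - q) * steady)


# Hypothetical baseline values, chosen only to make the example run.
BASELINE = dict(p=1e-14, r=1e-6, q=0.01, v_e=1e10, v_s=1.0, s=0.1)


def one_at_a_time_sensitivity(baseline, factor=10.0):
    """Orders of magnitude the result moves when one input goes from
    baseline / factor to baseline * factor, holding the others fixed."""
    swings = {}
    for name in baseline:
        high = dict(baseline)
        high[name] *= factor
        low = dict(baseline)
        low[name] /= factor
        ratio = toy_expected_value(**high) / toy_expected_value(**low)
        swings[name] = math.log10(ratio)
    return swings


swings = one_at_a_time_sensitivity(BASELINE)
for name, swing in sorted(swings.items(), key=lambda kv: -abs(kv[1])):
    print(f"{name:>3}: {swing:+.2f} orders of magnitude")
```

In this toy version r enters the cubic branch to the fourth power, so it dominates the ranking whenever the cubic branch carries most of the value. A fuller analysis would replace the factor-of-ten swings with explicit probability distributions and estimate the value of information for each parameter.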

Some caveats to that:

I'm not sure how tractable these investigations would be.
- But note that it could be useful just to become somewhat less uncertain than we currently are, even if we still remain quite uncertain.
Tarsney's models focus on a particular working example of a longtermist intervention/priority (increasing the chance that there's an intelligent civilization at any given time point). As discussed in other comments here, how good the success of that intervention would be depends on other things not modelled by Tarsney (essentially, what that civilization does with the accessible universe), and there are many other interventions/priorities we might focus on.
- So ideally we'd run the sensitivity analysis and value of information calculations on either a more general version of the models or on a set of models that collectively represent various major possible priorities.
Tarsney's models make various ethical and decision-theoretic assumptions that are conducive to longtermism. An ideal version of the sensitivity analysis and value of information calculations might also allow for investigation of what happens if these assumptions are relaxed.
- But that might become unwieldy.

MichaelStJules3y6

There's also a talk. https://globalprioritiesinstitute.org/christian-tarsney-the-epistemic-challenge-to-longtermism/

When I reference work by GPI, I usually link to the page with both the talk and the pdf.

MichaelA3y2

Good point, thanks! I'll edit this post to link to that page instead :)

MichaelA3y5

Just a nitpick

By the rules of the expected value game, the case for longtermism appears to survive the epistemic challenge with which we confronted it. But it has prevailed in a way that should make us slightly uneasy: by appealing to potentially-minuscule probabilities of astronomical quantities of value.

I think that this particular sentence is false or misleading. As Tarsney notes earlier and later, his model and parameter estimates[1] suggest that the case for longtermism survives given either acceptance of fanaticism or plausible but non-obvious empirical views. That is, on some plausible empirical views, longtermism doesn't require an appeal to minuscule probabilities of astronomical quantities of value.

(Tarsney's sentence may still be technically accurate, since he says potentially-minuscule. But it seems at least a bit misleading to me.)

[1] Along with certain ethical and decision-theoretic assumptions, e.g. total utilitarianism.

JackM3y2

I agree with you that Tarsney hasn't been clear, but I think you've got it the wrong way around (please tell me if you think I'm wrong though). The abstract to the paper says:

But on some prima facie plausible empirical worldviews, the expectational superiority of longtermist interventions depends heavily on these “Pascalian” probabilities. So the case for longtermism may depend either on plausible but non-obvious empirical claims or on a tolerance for Pascalian fanaticism.

These two sentences seem to say different things, as you have outlined. The first implies that you need fanaticism, whilst the second implies you need either fanaticism or non-obvious but plausible empirical views. Contrary to your reading, I think the former is actually correct.

Tarsney initially runs his model using point estimates for the parameters and concludes that the case for longtermism is "plausible-but-uncertain" if we assume that humanity will eventually spread to the stars, and "extremely demanding" if we don't make that assumption. Therefore longtermism doesn't really "survive the epistemic challenge" when using point estimates.

Tarsney says however that "The ideal Bayesian approach would be to treat all the model parameters as random variables rather than point estimates". So if we're Bayesians we can pretty much ignore the conclusions so far and everything is still to play for.

When Tarsney does incorporate uncertainty for all parameters, the expectational superiority of longtermism becomes clear because "the potential upside of longtermist interventions is so enormous". In other words the use of random variables allows for fanaticism to take over and demonstrates the superiority of longtermism.

So it seems to me that it really is fanaticism that is doing the work here. Would be interested to hear your thoughts.

EDIT: On a closer look at his paper, Tarsney does say that it isn't clear how Pascalian the superiority of longtermism is because of the "tremendous room for reasonable disagreement about the relevant probabilities". Perhaps this is what you're getting at, Michael?

MichaelA3y3

These two sentences seem to say different things, as you have outlined.

I actually think that those two sentences are consistent with each other. And I think that, as Tarsney says, his models and estimates do not show that fanaticism is necessarily required for the case for longtermism to hold.

Basically (from memory and re-skimming), Tarsney gives two model structures, some point estimates for most of the parameters, and then later some probability distributions for the parameters. He intends both models to represent plausible empirical views. He intends his point estimates and probability distributions to represent beliefs that are reasonable but at the pessimistic end for longtermism (so it's not crazy to think those things, but his all-things-considered beliefs about those parameters would probably be more favourable to longtermism). And he finds that the case for longtermism holds given the following assumptions:

You use one of the model structures (cubic growth model), his pessimistic parameter estimates, and a "prima facie plausible" value for the long-run rate of ENEs
You use the model structure that's less favourable to longtermism (steady state growth), his pessimistic parameter estimates, and an "extremely demanding" value for the long-run rate of ENEs
- So he thinks that the case is "extremely precarious" if we use that model and we use point estimates
We use distributions to represent our uncertainty both between those model structures and over some parameters, with the distributions based on setting lower bounds that Tarsney thinks are "quite conservative and hard to reasonably dispute"

(There are various complications, caveats, and additional points, but this stuff is key.)

So his reasoning is consistent with it being the case that the most reasonable empirical position would support longtermism without requiring any minuscule probabilities of extremely huge payoffs, or with that not being the case.

E.g., that could be the case if we should have a non-minuscule credence in the cubic growth model and that "prima facie plausible" value for the long-run rate of ENEs.

When Tarsney does incorporate uncertainty for all parameters, the expectational superiority of longtermism becomes clear because "the potential upside of longtermist interventions is so enormous". In other words the use of random variables allows for fanaticism to take over and demonstrates the superiority of longtermism.

Incorporating uncertainty, and finding that the potential upside of one option makes it the option we should go for, doesn't necessarily mean fanaticism is involved. E.g., because of the potential upside, I made many job applications that I expected would turn out not to have been worth the time they took, and I did so without having a clear point estimate for my odds of getting the job or how valuable that'd be (so I sort-of implicitly had a probability distribution over possible credences). This'd only be fanatical if the probabilities involved were minuscule and the payoffs huge enough to "make up for that", and Tarsney's analysis suggests that that may or may not be the case when it comes to longtermism.

MichaelA3y3

Here's a relevant section from the paper:

By this measure, the preceding analysis suggests that the choice between longtermist and short-termist interventions could be extremely Pascalian. We have found that longtermist interventions can have much greater expected value than their short-termist rivals even when the probability of having any impact at all on the far future is minuscule (2 x 10^-14, for a fairly large investment of resources) and when, conditional on having an impact, most of the expected value of the longtermist intervention is conditioned on further low-probability assumptions (e.g., the prediction of large-scale interstellar settlement, astronomical values of vs, large values of s, and—in particular—small values of r). It could turn out that the vast majority of the expected value of a typical longtermist intervention—and, more importantly, the component of its expected value that gives it the advantage over its short-termist competitors—depends on a conjunction of improbable assumptions with joint probability on the order of (say) 10^-18 or less. In this case, by the measure proposed above, the choice between L and B is extremely Pascalian (1 - (2 x 10^-18) or greater).
On the other hand, there is tremendous room for reasonable disagreement about the relevant probabilities. If you think that, in the working example, p is on the order of (say) 10^-7, and that the assumptions of eventual interstellar settlement, astronomical values of vs, large values of s, and very small values of r are each more likely than not, then the amount of tail probability we would have to ignore to prefer B might be much greater—say, 10^-8 or more.
These numbers should not be taken too literally—they are much less robust, I think, than the expected value estimates themselves, and at any rate, it’s not yet clear whether we should care that a choice situation is Pascalian in the sense defined above, or if so, at what threshold of Pascalian-ness we should begin to doubt the conclusions of expectational reasoning. So the remarks in this section are merely suggestive. But it seems to me there are reasonable grounds to worry that the case for longtermism is problematically dependent on a willingness to take expectational reasoning to a fanatical extreme.

I think maybe a useful framing to have in mind is that Tarsney's paper was not aimed at actually working out the likelihood of each model structure relative to the other, or working out what precise parameter estimates would be most appropriate. And those are things we should be very uncertain about.

So perhaps our 90% credible interval (or something like that) for what we'd believe after some years of further research should include both probability estimates/distributions in which the case for longtermism survives without fanaticism and probability estimates/distributions in which the case for longtermism would survive only if we accept fanaticism.

JackM3y2

Thanks yeah, I saw this section of the paper after I posted my original comment. I might be wrong but I don't think he really engages in this sort of discussion in the video, and I had only watched the video and skimmed through the paper.

So overall I think you may be right in your critique. It might be interesting to ask Tarsney about this (although it might be a fairly specific question to ask).

MichaelA3y2

Yeah, I plan to suggest some questions for Rob to ask Tarsney later today. Perhaps this'll be one of them :)

MichaelA3y5

tl;dr: The paper ignores 2 factors that could strengthen the case for longtermism - namely, possible increases in how efficiently resources are used and in what extremes of experiences can be reached.

Tarsney writes:

The case for longtermism starts from the observation that the far future is very big. A bit more precisely, the far future of human-originating civilization holds vastly greater potential for value and disvalue than the near future. This is true for two reasons. The first is duration. On any natural way of drawing the boundary between the near and far futures (e.g., 1000 or 1 million years from the present), it is possible that our civilization will persist for a period orders of magnitude longer than the near future. For instance, even on the extremely conservative assumption that our civilization must die out when the increasing energy output of the Sun makes Earth too hot for complex life as we know it, we could still survive some 500 million years. Second is spatial extent and resource utilization. If our descendants eventually undertake a program of interstellar settlement, even at a small fraction of the speed of light, they could eventually settle a region of the Universe and utilize a pool of resources vastly greater than we can access today. Both these factors suggest that the far future has enormous potential for value or disvalue.

I essentially agree with all those points. Furthermore, given my current moral and empirical views, I think those factors are probably the main factors driving the case for longtermism.

But I think there are at least two other factors that are relevant and that might substantially add to the case for longtermism. (Though it's possible that they add so little relative to the other factors that they won't really be decision-relevant.)

---

The first factor is possible increases in efficiency of resource usage. For a given quantity and type of matter or energy, future civilizations may be able to more efficiently convert that into moral value or disvalue than current civilization can. For example, if we can create simulated humans or animals (or artificial sentiences) that are morally relevant, these may be able to experience the same pleasures or pains we can with substantially less energy required.

Thus, the factor by which total quantity of moral (dis)value in the long-term future is expected to be larger than that in the present + near-term future may be even larger than one would think if one considered only the duration, spatial extent, and resources used in the future.

(Tarsney's term "resource utilization" might seem like it should capture this idea, but his description suggests that he has in mind only changes in how much resources we use, not changes in how efficiently we use them.)

---

The second factor is possible increases in the extremes of experience that can be reached. It seems plausible that future civilizations will be able to create experiences more extremely good or bad than experiences that we can create today or that are experienced in nature. If so, this might increase the importance of the long-term future, if either of the following is true:

- Those experiences can be created relatively efficiently (e.g., just slightly less efficiently than substantially less extreme experiences)
- There is some moral reason why the extreme experiences matter disproportionately more than other experiences (i.e., if the moral significance of an experience increases superlinearly with the extremity of the experience, at least at some points of that "function")

I'd guess that this factor is much less important than the efficiency factor, but it seems very hard to say.

The same basic point might also apply to non-experience things that might be morally good or bad. (E.g., if art has intrinsic moral value, perhaps future civilization could create art that is more extremely good than current art.)

---

I've seen roughly those ideas discussed in various places before, though I can't remember precisely where. The concept of hedonium can be seen as a special case of the efficiency factor.

Chapter 8 of The Precipice, on "Our Potential", is also relevant here. Ord splits that chapter into discussion of the future's potential duration, its potential scale, and its potential quality. I imagine that the points I raised above were covered in that chapter, but I can't remember for sure (I read the book a year ago, and foolishly enough I had not yet converted to using Anki as I read).

---

I think it'd be interesting for someone to think about how Tarsney's models or parameter estimates could be tweaked to account for these factors, and maybe to see how much difference this makes (after plugging in some reasonable-seeming distributions for the parameters).

MichaelStJules3y4

I think these would basically be just constant factors multiplying the whole impacts, assuming we remain near the peaks for far longer than we spend making significant moves towards the peaks.

The difference between intentionally optimizing for hedonistic welfare and a default with human-like minds could itself be on the scale of an existential catastrophe for a classical utilitarian, and more important than extinction, although it could also be far less tractable and not really an attractor state at all if it's not stable/persistent. This could also generalize to other theories of welfare, just with different targets.

MichaelStJules3y4

On his estimate of the difference in probability we can achieve by promoting one state over its complement, it's worth mentioning that this does not consider the possibility of doing more harm than good, e.g. AI safety work advancing AGI more than it aligns it. With the very low (but in his view, extremely conservative) probabilities that he uses in his argument, the possibility of backfire effects outweighing them becomes more plausible.

Furthermore, it does not argue that we can effectively predict that any particular state is better than its complement, e.g. is extinction good or bad? How should we deal with moral uncertainty, especially around population ethics?

For these reasons, it may be difficult to justifiably identify robustly positive expected value longtermist interventions ahead of time, which the case for longtermism depends on. I mean this even with subjective probabilities, since such probabilities supporting longtermist interventions tend to be particularly poorly-informed (largely for lack of good evidence) and so seem more prone to biases and whims, e.g. wishful thinking and the non-rational particulars of people's brains and priors. This is just deep uncertainty and moral cluelessness.

For what it's worth, I don't think it makes much sense for this paper to address such issues in detail given its current length already, although they seem worth mentioning.

(Also, I read the paper a while ago, so maybe it did discuss these issues and I missed it.)

MichaelA3y6

In line with your comment:

- I don't recall the paper discussing the possibility that longtermist interventions could backfire with respect to their intended effects
- The paper's main working example is just about any intelligent civilization existing, and doesn't get into what that civilization is doing or how valuable it is (so it also doesn't discuss whether that civilization is better or worse than extinction)

But Tarsney does acknowledge roughly that second point in one place:

Additionally, there are other potential sources of epistemic resistance to longtermism besides Weak Attractors that this paper has not addressed. In particular, these include:
Neutral Attractors: To entertain small values of $r$ [the rate of ENEs], we must assume that the state $S$ targeted by a longtermist intervention, and its complement $\neg S$, are both at least to some extent “attractor” states: Once a system is in state $S$, or state $\neg S$, it is unlikely to leave that state any time soon. But to justify significant values of $v_e$ and $v_s$, it must also be the case that the attractors we are able to target differ significantly in expected value. And it’s not clear that we can assume this. For instance, perhaps “large interstellar civilization exists in spatial region X” is an attractor state, but “large interstellar civilization exists in region X with healthy norms and institutions that generate a high level of value” is not. If civilizations tend to “wander” unpredictably between high-value and low-value states, it could be that despite their astronomical potential for value, the expected value of large interstellar civilizations is close to zero. In that case, we can have persistent effects on the far future, but not effects that matter (in expectation).

He says "low-value" rather than "negative value", but I assume he actually meant negative value, because random wandering between high and low positive values wouldn't produce an EV (for civilization existing rather than not existing) of close to 0.
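
The arithmetic behind that reading can be sketched as follows. The function name, the 50/50 time split, and the magnitudes are all hypothetical, picked only to show the shape of the argument.

```python
def wandering_expected_value(share_high, v_high, v_low):
    """Expected value when a civilization spends `share_high` of its time in a
    state worth v_high and the rest of its time in a state worth v_low."""
    return share_high * v_high + (1 - share_high) * v_low


# Hypothetical values in arbitrary units.
both_positive = wandering_expected_value(0.5, v_high=1e30, v_low=1.0)
high_and_negative = wandering_expected_value(0.5, v_high=1e30, v_low=-1e30)

print(f"high vs. low positive value: {both_positive:.3g}")
print(f"high vs. equally large negative value: {high_and_negative:.3g}")
```

With two non-negative values, the expected value stays at roughly half the astronomical upside. It only gets close to zero when the low state is about as bad as the high state is good.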

MichaelA3y3

tl;dr: Tarsney slightly misrepresents an existential risk estimate.

Tarsney writes:

To my knowledge, the most pessimistic estimate of near-term existential risk in the academic literature belongs to Rees (2003), who gives a 0.5 probability that humanity will not survive the next century.

But what Rees actually writes is:

I think the odds are no better than fifty-fifty that our present civilisation on Earth will survive to the end of the present century.

(Here's one online source quoting Rees. I've seen the same quote elsewhere too.)

Whether "our present civilisation on Earth" survives is very different from whether humanity survives. I haven't read Rees' book, so I don't know what he intended that quote to mean, but I'd guess he'd include things like a major population collapse that lasts a few decades as "our present civilisation on Earth not surviving". Arguably, his forecast could even be seen as capturing the chance that we just very substantially change our political, cultural, and economic systems, in the same way that Europe in the 1900s was arguably a "different civilisation" to Europe in the year 100 CE.

Also, Rees doesn't give a 0.5 probability; he gives a probability of survival no better than 0.5 (i.e., a probability of at least 0.5 that our present civilisation doesn't survive).

MichaelA3y3

Also, Tarsney writes:

For a collection of such estimates, see Tonn and Stiefel (2014, pp. 134–5).

I think it'd be better to direct people to the appendix of Beard et al. (2020), since that's more comprehensive and up-to-date. (I also really like the article itself.)

Perhaps unsurprisingly, I also think it'd be even better-er to direct people to my database, since that's even more comprehensive and up-to-date (and people can and do make suggestions to it, which I process, such that it should presumably remain the most comprehensive resource, rather than being frozen in time). But I can understand Tarsney preferring to refer readers to an academic source.

(Incidentally, if there's anyone who'd in theory like to cite my database, but can't do so unless it's hosted somewhere else - e.g., a preprint server - or needs it to look different, please let me know and I'll see what I can do.)

MichaelA3y3

tl;dr: I'm aware of 1-3 other things that might count as more pessimistic estimates of near-term existential risk in the academic literature.

Specifically:

- Frank Tipler wrote "Personally, I now think we humans will be wiped out this century"
- John Leslie estimated the risk of extinction over the next five centuries as at or above 30%.
- Nick Bostrom estimates the odds that "existential disaster will do us in" at some point as probably at or above 25%.

For further details and sources, see my Database of existential risk estimates (or similar) (see here for the accompanying post).

But:

- Tipler doesn't give a quantitative estimate, so maybe that shouldn't count.
- Leslie and Bostrom's estimates themselves are arguably less pessimistic than Rees', in the sense that their lower bounds for the risk level are lower
  - But I include their estimates here anyway since it's possible that their overall distribution would be more pessimistic, and because Rees' estimate is not necessarily about existential catastrophe (since it could include other ways in which "our present civilisation" doesn't "survive")
- Leslie and Bostrom's estimates aren't as near-term as Rees' estimate

So Tarsney's claim is reasonable on this front; I'm just adding some extra info.

There are also some more pessimistic estimates in sources that aren't academic but that seem about as worth paying attention to as Rees' estimate; see my database.

BrownHairedEevee3y1

Thanks for posting this! Your linkpost actually got me to watch the talk for the first time, even though I was aware of this paper for a while.

I think some variant of the cubic growth model could be useful for figuring out whether trying to reduce x-risk is better than trying to make durable changes to the long-term "trajectory" of the social welfare curve. I spent some time a few months ago trying to address this by modeling the trajectory of humanity, so I appreciate this paper for proposing an even simpler toy model.

I have rough thoughts about how the utility from economic growth could be incorporated: Assume that each star system has a growth rate that the residents of that star system can influence (e.g. through policy). The economy of each star system tends to grow exponentially, but GDP per capita has logarithmic utility, so the utility of the star system $u_s(t)$ grows roughly linearly.

If the economy of each star system starts at a steady state, then grows exponentially at $g$ starting at time $t_0$, the time at which humanity arrives at the star system, we get $u_s(t) = u_0 + \max(0, g(t - t_0))$. If the star system's GDP is capped at $\exp(u_{\max})$, then we get $u_s(t) = u_0 + \min(\max(0, g(t - t_0)), u_{\max})$.

To incorporate economic growth into the trajectory model used in the paper, we can replace $n(s \cdot (t - t_{\ell}))$ with the cross-correlation of $u_s(t)$ and $n(s \cdot (t - t_{\ell}))$ (this assumes that all star systems have the same growth rate). Since $u_s(t)$ is piecewise linear and $n(s \cdot (t - t_{\ell}))$ is cubic, the cross-correlation is piecewise quintic (it's the integral of a cubic function times a linear function). My gut tells me that having a piecewise quintic term in the trajectory function instead of a cubic term isn't going to change much about the implications of the model.

Note: I realize that by using GDP per capita, I'm leaving out the population of each star system. This would result in multiplying $u_s(t)$ by a function that models the population over time, starting at time $t_0$.
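
A minimal sketch of the per-system utility function described above, with the optional cap. The function names, the parameter values, and the simple sum over settled systems are assumptions for illustration; in particular, `total_utility` is just one naive way to aggregate, not the cross-correlation with the paper's cubic settlement term.

```python
def star_system_utility(t, t0, g, u0=0.0, u_max=None):
    """u_s(t) = u0 + min(max(0, g * (t - t0)), u_max).

    t0 is the arrival time, g the growth rate, u0 the baseline utility, and
    u_max an optional cap on the utility gained from growth.
    """
    growth = max(0.0, g * (t - t0))
    if u_max is not None:
        growth = min(growth, u_max)
    return u0 + growth


def total_utility(t, arrival_times, g, u0=0.0, u_max=None):
    """Sum of u_s(t) over systems settled by time t.

    Assumption: systems that have not been reached yet contribute nothing.
    """
    return sum(
        star_system_utility(t, t0, g, u0, u_max)
        for t0 in arrival_times
        if t0 <= t
    )


# Hypothetical example: three systems reached at years 10, 60, and 200.
print(total_utility(110, [10, 60, 200], g=0.02, u0=1.0))
print(total_utility(110, [10, 60, 200], g=0.02, u0=1.0, u_max=1.5))
```

Swapping the hand-written arrival times for arrival times implied by the paper's settlement model would be the natural next step for checking how much the growth term actually changes the trajectory.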
