Sometimes, as a way for me to get more strategic clarity about what intermediate goals I want my research to accomplish, I try to evaluate whether things that are locally good/bad are good for the long-term future. For example, technological growth, economic growth, forecasting, IIDM, democracy, poverty reduction, non-existential catastrophic risks, and so forth. 

My standard argument/frame of thinking goes like this:

“Well, you start with a prior of ~50-50 that any given macroscopic change is good for the long-term future, and then you update on the evidence that-”

And if this is done in conversation, my interlocutor often interrupts me with

“50-50 is a crazy prior because-”

And often it’s some argument that locally good things should be expected to be globally good. Sometimes people reference flow-through effects. There are different flavors of this, but the most elegant version I’ve heard is “duh, good things are good.”

And like, I sort of buy this somewhat. I think it’s intuitive that good things are good, and I’ve argued before that we should start with an intuition that first-order effects (for a specific target variable) are higher than second-order effects. While that argument is strongest about local variables, perhaps we should expect that generally there’s a correlation between a thing’s goodness on one metric and its goodness on other metrics (even if the tails come apart and things that are amazing for one metric aren’t the best for other metrics).

But when it comes to the long-term future, how much should we buy that things that are considered good by near-term proxies that don’t consider the long-term future are good for long-term stuff? 

Put another way, what’s our prior that “an arbitrarily chosen intervention that we believe to be highly likely to be net positive for the experience of sentient beings in the next 0-5 years increases P(utopia)”?

--

Related post (though I don't exactly agree with the formalisms)


Hmm, my mind doesn't have a single bucket for "good things within the next few years". When I think about it I immediately unpack it to things like:

  • increase the influence, competence and wisdom of altruistic groups and individuals – >>50% (off the cuff)
  • decrease the risk of global catastrophes in the near term – >>50%
  • improve the standard of living, wealth, health, ... – >50%
  • increase economic growth – >50%

For considering flow-through effects, I would want to distinguish between dimensions like the following:

  • Improves someone's life quality
  • Improves values
  • Increases economic growth
  • Differential technological progress
  • Improves institutions

I ordered these in ascending order of how likely I think they are to have a big impact. I think all of those look more likely positive than negative to me, except for economic growth, where I have no idea (though I note that more people lean in the direction of it being positive, and considering it negative is maybe a bit of a brittle argument).

Probably a lot of people would put "improves values" above economic growth in terms of relevance to the future, but my intuitions go in the direction of technological determinism and the view that if things go poorly, it's more because of hypocrisy and lack of competence rather than "bad values."

To get back to your question: I think I'd approach this by looking for correlations between what you call "good things" and the other factors on my list. I'd say there's a significant positive correlation. So I'd say "good things have good flow-through effects" because of that positive correlation. If you find an example where you improve someone's life quality but it doesn't have any effect on the other variables, I wouldn't expect positive flow-through effects from that.

I'm broadly in your camp, i.e. starting with a 50-50 prior.

I think a useful intuition pump is asking oneself whether some candidate good near-term effect/action X is net good for total insect welfare or total nematode welfare over the next 0-5 years (assuming these are a thing, i.e. that insects or nematodes are sentient). I actually think the correlation here is even smaller than for the short-term vs long-term impact variables we typically consider, but I think it can be a good intuition pump because it's so tangible.

I agree with "first-order effects are usually bigger than second-order effects", but my model here is roughly that we have heavy-tailed uncertainty over 'leverage', i.e. which variable matters how much for the long-term future (and that usually includes sign uncertainty). 

We can imagine some literal levers that are connected to each other through some absurdly complicated Rube Goldberg machine such that we can't trace the effect that pulling on one lever is going to have on other levers. Then I think our epistemic position is roughly like "from a bunch of experience with pulling levers we know that it's a reasonable prior that if we pull on one of these levers the force exerted on all the other levers is much smaller, though sometimes there are weird exceptions; unfortunately we don't really know what the levers are doing, i.e. for all we know even a minuscule force exerted – or failing to exert a minuscule force – on one of the levers destroys the world".
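The heavy-tailed-leverage picture above can be made concrete with a toy simulation (entirely my own construction, not something from the comment): draw each "lever's" leverage from a heavy-tailed distribution with an uncertain sign, and check how concentrated the total influence is in a small fraction of levers. The distribution and its parameter are arbitrary illustrative choices.

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible

def sample_leverage(alpha=1.5):
    """One lever's leverage: heavy-tailed magnitude, uncertain sign.
    Pareto with alpha=1.5 is a purely illustrative assumption."""
    magnitude = random.paretovariate(alpha)
    sign = random.choice([-1, 1])
    return sign * magnitude

leverages = [sample_leverage() for _ in range(100_000)]

# How much of the total (absolute) influence sits in the top 1% of levers?
total = sum(abs(x) for x in leverages)
top_1pct = sum(sorted((abs(x) for x in leverages), reverse=True)[:1000])

print(f"share of total |leverage| held by the top 1% of levers: {top_1pct / total:.2f}")
```

With a thin-tailed distribution the top 1% of levers would hold roughly 1% of the influence; with a heavy tail like this, a small minority of levers dominates, which is the "which lever you pull matters enormously, and you can't tell which" situation the comment describes.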

I would imagine a sensible model has goodness persisting in expectation but asymptotically approaching 0. That seems both reasonably intuitive and to lead to some helpful guidance once you have such a model.

The question of what the relative decay rate for different classes of action is then becomes paramount - if you can identify actions with low expected decay rates, you have a phenomenally important class of actions. Extinction events are perhaps the example in which we can have highest confidence in a low decay rate, but highest confidence in a low decay rate doesn't necessarily equal the lowest expected decay rate (this is maybe another way of thinking about trajectory change).
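The decay model sketched above can be written down in a minimal form. This is my own toy formalization, assuming simple exponential decay of expected goodness; the initial values, decay rates, and horizon are invented for illustration only.

```python
import math

def expected_goodness(initial_value, decay_rate, years):
    """Expected residual goodness of an action after `years`,
    under a simple exponential-decay assumption: value * e^(-r*t)."""
    return initial_value * math.exp(-decay_rate * years)

# Two hypothetical action classes with identical near-term value:
# a typical intervention whose effects wash out, versus a
# low-decay-rate one (e.g. averting an extinction event).
typical = expected_goodness(1.0, decay_rate=0.05, years=200)
low_decay = expected_goodness(1.0, decay_rate=0.001, years=200)

print(f"typical intervention after 200y:   {typical:.6f}")
print(f"low-decay intervention after 200y: {low_decay:.6f}")
```

The point of the sketch is just that, once goodness decays toward zero in expectation, long-horizon value is dominated by the decay rate rather than by the initial near-term value, which is why identifying low-decay action classes becomes paramount.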

I think it's >60% likely that if an intervention doesn't significantly accelerate the development of technology which could pose existential risks, and is highly likely to be net positive for the experience of sentient beings in the next 0-5 years, then it has a net positive effect on the long-term future. I have no idea whether a completely undefined "arbitrarily chosen" intervention that is net positive for the next 0–5 years is likely to contribute to risky technology, since I feel confused about what a reasonable sample of five such arbitrarily chosen interventions would even look like. I might have a 50–50 prior just because of that fact, but that doesn't seem like a good reason to have a 50–50 prior.

I agree with you that 'good now' gives us in general no reason to think it increases P(Utopia), and I'm hoping someone who disagrees with you replies.

As a possible example that may or may not have reduced P(Utopia): I have a pet theory, which may be totally wrong, that the Black Death, by making capital far more valuable in Europe for a century and a half, was an important part of triggering the shifts that put Europe clearly ahead of the rest of the world on the tech tree leading to industrialization by 1500 (the claim that Europe was clearly ahead by 1500 is itself disputed).

Assuming we think an earlier industrialization was a bigger good thing than the Black Death was a bad thing, then the Black Death was a good thing under this model.

Which line of thinking is how I learned to be highly skeptical of 'good now' = 'good in the distant future'.

I think the case for the Black Death is reasonable, but I don't think counterexamples are very strong evidence here, since I doubt anybody has a 99-1 prior. I imagine even the extremes of the debate are between 52-48 (or maybe 50-50) on one side and 95-5 (or 90-10) on the other.


My answer to your question depends on how you define "good for the long-term future". When I think about evaluating the chance an action is good including of long-run effects, specifying a few more dimensions matters to me. It feels like several combinations of these could be reasonable and would often lead to fairly different probabilities.

Expected value vs. realized value

Does "good for the long-term future" mean: good in expectation, or actually having good observed effects?

What is the ground truth evaluation?

Is the ground truth evaluation one that would be performed by:

  1. An oracle that has all knowledge of all events?
  2. The best evaluation that is in some sense realizable, e.g.:
    1. A large (1,000 people?), well-organized team of competent people evaluating the action for a long time (1,000 years?)?
    2. The best evaluation AI 100 years after AI surpasses human-level.

I think usually people mean (1), but in practice it often feels useful to me to think about some version of (2).

Foresight vs. hindsight

Does the ground truth evaluation occur before the action occurs, or after it occurs and all (or some) effects can be observed?

(Note: Using this as an excuse to ask this clarifying question that I've thought about some recently, that could apply to many posts. Haven't done a thorough lit review on this so apologies if this is already covered somewhere else)

I came here to say this--in particular that I think my prior probability for "good things are good for the long-term future" might be very different than my prior for "good things are good for the long-term future in expectation", so it matters a lot which is being asked.

I think the former is probably much closer to 50% than the latter. These aren't my actual estimates, but for illustrative purposes I think the numbers might be something like 55% and 90%.

I agree with Eli that my actual estimates would also depend on the other questions Eli raises.

Another factor that might affect my prior a lot is what the reference class of "good things" looks like. In particular, are we weighting good things based on how often these good things are done / how much money is spent on them, or weighting them once per unique thing as if someone were generating a list of good things? E.g. Does "donation to a GiveWell top charity" count a lot, or once? (Linch's wording at the end of the post makes it seem like he means the latter.)

Perhaps it would be helpful to Linch's question to generate a list of 10-20 "good things" and then actually think about each one carefully and estimate the probability that it is good for the future, and good for the future in expectation, and use these 10-20 data points to estimate what one's prior should be. (Any thoughts on whether this would be a worthwhile research activity, Linch or others reading this?)

You could of course ask this question the other way round. What is the probability that things that are good for the long run future (for P(utopia)) are good in the short run as well?

For this I would put a very high probability as:

  • Most of what I have read about how to affect the long-run future suggests you need feedback loops to show things are working, which implies short-run improvements (e.g. you want your AI interpretability work to help in real-world cases today)
  • Many, but not all, of the examples I know of people doing things that are good for the long run are also good in the short run (e.g. value spreading, pandemic preparedness)
  • Some good things won't last long enough to affect the future unless they are useful (for someone) today (e.g. improved institutions)

I agree that this seems an important question. I'll lazily add a potentially useful link rather than any novel thoughts/analysis: The wiki entry and tag indirect long-term effects.