Sometimes, as a way to get more strategic clarity about what intermediate goals I want my research to accomplish, I try to evaluate whether things that are locally good or bad are also good or bad for the long-term future: for example, technological growth, economic growth, forecasting, IIDM, democracy, poverty reduction, non-existential catastrophic risks, and so forth.
My standard argument/frame of thinking goes like this:
“Well, you start with a prior of ~50-50 that any given macroscopic change is good for the long-term future, and then you update on the evidence that-”
And if this is done in conversation, my interlocutor often interrupts me with
“50-50 is a crazy prior because-”
And often it’s some argument that locally good things should be expected to be globally good. Sometimes people reference flow-through effects. There are different flavors of this, but the most elegant version I’ve heard is “duh, good things are good.”
And like, I sort of buy this. I think it’s intuitive that good things are good, and I’ve argued before that we should start with an intuition that first-order effects (for a specific target variable) are larger than second-order effects. While that argument is strongest for local variables, perhaps we should expect that, in general, a thing’s goodness on one metric is correlated with its goodness on other metrics (even if the tails come apart and things that are amazing on one metric aren’t the best on others).
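To make the “tails come apart” caveat concrete, here’s a minimal simulation sketch (my own toy example, not part of the original argument; the correlation of 0.5 and all names are assumptions): with two positively correlated goodness metrics, the intervention that scores best on metric A is rarely the very best on metric B, yet it still tends to be well above average on B.

```python
import numpy as np

rng = np.random.default_rng(0)
n_trials, n_interventions, rho = 2_000, 1_000, 0.5  # rho: assumed correlation between metrics
cov = [[1.0, rho], [rho, 1.0]]

hits, b_scores = 0, []
for _ in range(n_trials):
    # Draw correlated "goodness" scores on metrics A and B for many interventions.
    a, b = rng.multivariate_normal([0.0, 0.0], cov, size=n_interventions).T
    best_on_a = int(np.argmax(a))
    hits += best_on_a == int(np.argmax(b))
    b_scores.append(b[best_on_a])

print(f"P(top intervention on A is also top on B): {hits / n_trials:.3f}")
print(f"Mean B-score of the A-maximizer: {np.mean(b_scores):.2f} (population mean is 0)")
```

On this toy setup the A-maximizer is almost never the B-maximizer, but it still scores roughly a standard deviation or more above average on B, which is the shape of the intuition above.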
But when it comes to the long-term future, how much should we buy that things judged good by near-term proxies (proxies that ignore the long-term future) are also good over the long term?
Put another way, what’s our prior that “an arbitrarily chosen intervention that we believe to be highly likely net positive for the experience of sentient beings over the next 0-5 years increases P(utopia)”?
--
Related post (though I don't exactly agree with the formalisms)
I would imagine a sensible model has goodness persisting in expectation but asymptotically approaching 0. That seems reasonably intuitive, and once you have such a model it leads to some helpful guidance.
The question of what the relative decay rates are for different classes of action then becomes paramount: if you can identify actions with low expected decay rates, you have a phenomenally important class of actions. Extinction events are perhaps the example where we can have the highest confidence in a low decay rate, but highest confidence in a low decay rate isn't necessarily the same as the lowest expected decay rate (this is maybe another way of thinking about trajectory change).
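Here's a minimal sketch of the kind of model I have in mind (the exponential decay form, the parameter names, and the numbers are my own assumptions, not anything established above): if an action's expected goodness decays exponentially, its total long-run contribution scales like 1/decay_rate, so persistence can matter far more than the size of the immediate benefit.

```python
import numpy as np

def total_expected_value(initial_goodness, decay_rate, horizon=5_000.0, dt=0.1):
    """Numerically approximate the integral of initial_goodness * exp(-decay_rate * t)."""
    t = np.arange(0.0, horizon, dt)
    return float(np.sum(initial_goodness * np.exp(-decay_rate * t)) * dt)

# Same near-term benefit (goodness 1.0 today), very different persistence:
for decay_rate in (0.5, 0.05, 0.005):
    v = total_expected_value(initial_goodness=1.0, decay_rate=decay_rate)
    print(f"decay rate {decay_rate:>5}: total expected value ≈ {v:7.1f} "
          f"(≈ 1/decay_rate = {1 / decay_rate:.0f})")
```

Under these toy assumptions, cutting the decay rate by a factor of ten multiplies long-run impact by roughly ten even when the near-term benefit is identical, which is why identifying low-decay-rate actions looks so important.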