dwebb

PhD Student in Economics at the Paris School of Economics

https://www.duncan-webb.com/

Comments

Formalising the "Washing Out Hypothesis"

I think this is a very good point, and it's helping shape my ideas on this topic, thank you!

I guess it's true that most/all candidates for longtermist interventions that I've seen are based on attractor states. At the same time, it's useful to think about whether we might be missing any potential longtermist interventions by focusing on these attractor state cases. One example that might plausibly fit into this category is an intervention that broadly improves institutional decision-making. Here, the intervention plausibly has a long-run positive impact on future value, but we are worried that this will be "washed out" by other factors. It's not clear that there's an obvious attractor state involved. (Note that I'm not very confident in this; I could easily be persuaded otherwise. Maybe people advocate for improving institutional decision-making on the basis that it reduces the risk of many different bad attractor states.)

Thinking about this type of intervention, the results of my model can be read either pessimistically or optimistically from the longtermist's perspective (depending on your beliefs about the nature of the parameters):

  • Optimistic: there are potentially cases where a longtermist intervention that's not based on an attractor state can have very large long-run benefits. If forecasting error increases sub-linearly or just relatively slowly, then an intervention can be good from a longtermist perspective even if there's no attractor state involved.
  • Pessimistic: for lots of plausible parameter values (e.g. high alpha, linearly increasing forecasting error), long-run benefits wash out. If this is true across a wide range of potential interventions, then attractor states are perhaps the only way out of this trap.
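
To illustrate how much the two readings above depend on the growth rate of forecasting error, here is a minimal sketch. It is not the model from the post: the functional form of the noise (`noise_scale * t ** exponent`), the parameter values, and the function names are all assumptions chosen for illustration, and alpha is left out entirely. The exponent is applied to the noise variance, which is itself an assumption about what "forecasting error" measures.

```python
def signal_weight(t, prior_var=1.0, noise_scale=1.0, exponent=1.0):
    """Weight placed on the signal at horizon t (normal prior, normal noise).

    Assumption: noise variance grows as noise_scale * t ** exponent.
    """
    noise_var = noise_scale * t ** exponent
    return prior_var / (prior_var + noise_var)


def cumulative_value(horizon, signal=1.0, exponent=1.0, steps_per_unit=10):
    """Midpoint Riemann sum of signal * weight over [0, horizon].

    Assumption: the signal predicts the same benefit at every horizon.
    """
    dt = 1.0 / steps_per_unit
    n_steps = int(horizon * steps_per_unit)
    total = 0.0
    for i in range(n_steps):
        t = (i + 0.5) * dt  # midpoint of the i-th interval
        total += signal * signal_weight(t, exponent=exponent) * dt
    return total


# Compare slow, linear and fast growth in noise variance at two horizons.
for exponent in (0.5, 1.0, 2.0):
    for horizon in (100, 1000):
        value = cumulative_value(horizon, exponent=exponent)
        print(f"exponent={exponent}, horizon={horizon}: {value:.2f}")
```

In this sketch the accumulated value keeps growing with the horizon when the exponent is below 1, grows only logarithmically when it equals 1, and levels off when it is above 1. That is a property of these assumed functional forms, so it should be read as an illustration of the sensitivity rather than a result about any particular intervention.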

Formalising the "Washing Out Hypothesis"

I hadn't seen this post before, but to me it sounds like Beckstead's arguments are very much in line with the idea of attractor states, rather than deviating from it. A path-dependent trajectory change is roughly the same as moving from one attractor state to another, if I've understood correctly.

The argument he is making is that extinction / existential risks are not the only form of attractor state, which I agree with.

Formalising the "Washing Out Hypothesis"

Yes indeed, kudos to Dave Bernard - he pointed out this distinction to me as well.

And good spot - sorry for the confusing error! I've now edited this in the text.

Formalising the "Washing Out Hypothesis"

Thanks so much for the very insightful commentary! My thoughts are still evolving on these topics so I will digest some of your remarks and reply on each one.

Formalising the "Washing Out Hypothesis"

Thanks for some thought-provoking questions!

The posterior estimate of value trends towards zero because we assumed that the prior distribution of u_t has a mean of 0. Intuitively, absent any other evidence, we believe a priori that our actions will have a net effect of 0 on the world (at any time in the future). (For example, I might think that my action of drinking some water will have 0 effect on the future unless I see evidence to the contrary. There's a bit of discussion of why you might have this kind of "sceptical" prior in Holden Karnofsky's blog posts.) Then, because our signal becomes very noisy as we predict further into the future, we weight the signal less and so discount our posterior towards 0. It would be perfectly possible to include non-zero-mean priors in the model, but it's hard for me to think (off the top of my head) of scenarios in which this would make sense. In your example of having a continuous stream of benefits in perpetuity, it seems more natural to model this as evidence you've received about the effects of an intervention, rather than as your prior belief about the effects of an intervention.
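
To make the shrinkage towards 0 concrete, here is a minimal sketch of the standard normal-normal update. The numbers, the function names, and the linear noise schedule in `noise_var_at` are hypothetical and only there for illustration; the one thing taken from the discussion above is a prior on u_t with mean 0 and a signal whose variance grows with the horizon.

```python
def posterior_mean(signal, prior_var, noise_var, prior_mean=0.0):
    """Posterior mean of u_t given one noisy signal (normal prior, normal noise)."""
    weight = prior_var / (prior_var + noise_var)  # weight placed on the signal
    return prior_mean + weight * (signal - prior_mean)


def noise_var_at(t, base=1.0, growth=1.0):
    """Hypothetical schedule: signal noise variance grows linearly with horizon t."""
    return base + growth * t


signal = 10.0    # the signal suggests a benefit of 10 at every horizon
prior_var = 4.0  # variance of the sceptical prior centred on 0

for t in (0, 10, 100, 1000):
    estimate = posterior_mean(signal, prior_var, noise_var_at(t))
    print(f"horizon {t}: posterior mean = {estimate:.3f}")
```

The same signal is discounted more heavily the further out it refers to, because the weight on the signal falls as its noise variance rises. Passing a non-zero `prior_mean` shows the other case: the estimate then shrinks towards that value instead of towards 0.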

For attractor states, I basically don't think that the assumption of a signal with increasing variance as the time horizon increases is a good way of modelling these. That's because in some sense predictability increases over time with attractor states (at the beginning, you don't know which state you'll be in, so predictability of future value is low; once you're in the attractor state, you persist in that state for a long time so predictability is high). As MichaelStJules mentioned, Christian Tarsney's paper is a better starting point for thinking about these attractor states.

Formalising the "Washing Out Hypothesis"

You're right that assuming constant predicted benefits for each intervention is an important simplification. However, as you mention, it would be relatively easy to change the integrand to allow for different shapes of signalled benefits. For example, a signal that suggests benefits increasing with the time horizon might increase the relative value of the longtermist intervention.

It quickly becomes an empirical question what the predicted-benefit function looks like, and so it will depend on the exact intervention we are looking at, along with various other empirical predictions. An important one is indeed whether we think the "size"/"scale" of the future will be much larger in value terms (e.g. if the number of individuals increases continuously in the future, the predicted benefits of L could plausibly increase over time).
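
As a rough sketch of what changing the integrand could look like, the function below takes the predicted-benefit shape as an argument. Everything here is an assumption for illustration: the shrinkage weight with linearly growing noise variance, the 2% linear growth in benefits, and the names `discounted_value`, `constant_benefit` and `growing_benefit` are not taken from the post.

```python
def discounted_value(predicted_benefit, horizon, noise_growth=1.0,
                     prior_var=1.0, steps_per_unit=10):
    """Midpoint Riemann sum of predicted_benefit(t) * weight(t) over [0, horizon].

    Assumption: weight(t) = prior_var / (prior_var + noise_growth * t),
    i.e. a zero-mean normal prior and noise variance that grows linearly in t.
    """
    dt = 1.0 / steps_per_unit
    n_steps = int(horizon * steps_per_unit)
    total = 0.0
    for i in range(n_steps):
        t = (i + 0.5) * dt  # midpoint of the i-th interval
        weight = prior_var / (prior_var + noise_growth * t)
        total += predicted_benefit(t) * weight * dt
    return total


def constant_benefit(t):
    """Signal predicts the same benefit at every horizon."""
    return 1.0


def growing_benefit(t):
    """Hypothetical: predicted benefit grows linearly, e.g. with population size."""
    return 1.0 + 0.02 * t


for name, shape in (("constant", constant_benefit), ("growing", growing_benefit)):
    print(name, round(discounted_value(shape, horizon=100), 2))
```

With the same noise schedule, the growing shape accumulates more value than the constant one, which is the sense in which a larger future could raise the relative value of L. How much more depends entirely on the assumed shapes.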

About attractor states, you say:

"attractor states mostly seem good by limiting the unpredictability rather than by increasing the impact"

I think that's basically true, although we need to be careful here about what we mean by "impact". Even if the "impact" at any one time of being in a good attractor state vs a bad attractor state may be relatively small, the overall "impact" of getting into that attractor state may be large because it persists for so long. 
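
A back-of-the-envelope sketch of that point: a small per-period gap between a good and a bad state can add up to a large total if the state is very persistent. The per-period exit probability and all the numbers below are hypothetical; this is not Tarsney's model, just an illustration of persistence.

```python
def expected_total_impact(per_period_gap, exit_prob, horizon):
    """Expected sum of per-period gaps while the state persists.

    Assumption: each period the world leaves the state with probability
    exit_prob, after which the gap is 0.
    """
    total = 0.0
    still_in_state = 1.0  # probability the state has persisted so far
    for _ in range(horizon):
        total += per_period_gap * still_in_state
        still_in_state *= 1.0 - exit_prob
    return total


# A large but short-lived effect versus a small but very persistent one.
short_lived = expected_total_impact(per_period_gap=100.0, exit_prob=0.5, horizon=100_000)
persistent = expected_total_impact(per_period_gap=1.0, exit_prob=0.001, horizon=100_000)
print(f"short-lived: {short_lived:.1f}, persistent: {persistent:.1f}")
```

Over a long enough horizon the total is close to `per_period_gap / exit_prob`, so in this example the persistent state ends up with the larger overall impact even though its per-period gap is 100 times smaller.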

Formalising the "Washing Out Hypothesis"

I agree! I think you're pointing towards a useful way of carving up this landscape. My framework is good for modelling "ordinary" actions that don't involve attractor states, where actions are more likely to wash out and longtermism becomes harder to defend (but may still win out over neartermist interventions under the right conditions). Then, Tarsney's framework is a useful way of thinking about attractor states, where the case for longtermism becomes stronger but is still not a given.

Introducing LEEP: Lead Exposure Elimination Project

Thanks for this Jack! Sounds like an interesting area to look into.

I am curious about the literature suggesting that lead paint causes negative health / psychological effects. After an admittedly cursory glance, many of the studies you cite seem to indicate an association between lead exposure and some negative outcome, but don't necessarily imply a causal link from lead exposure to these negative outcomes. This is important: if the correlation is actually due to some other factor (e.g. living in worse conditions more generally), then we may overestimate how bad lead exposure is, and end up misdirecting funds.

Can you point us towards the best causal studies on lead exposure? E.g. ones that evaluate an RCT reducing lead exposure, or some other kind of "natural experiment"? (Apologies if you've already referred to one and I just missed it.)