Zach Stein-Perlman

Research @ AI Impacts
4085 karma · Joined Nov 2020 · Working (0-5 years) · Berkeley, CA, USA



AI forecasting & strategy at AI Impacts. Blog: Not Optional.



An undignified way for everyone to die: an AI lab produces clear, decisive evidence of AI risk/scheming/uncontrollability, freaks out, and tells the world. A less cautious lab ends the world a year later.

A possible central goal of AI governance: ensure that when an AI lab produces decisive evidence of AI risk/scheming/uncontrollability, freaks out, and tells the world, this results in rules that stop all labs from ending the world.

I don't know how we can pursue that goal.

I don't want to try to explain now, sorry.

(This shortform was intended more as starting-a-personal-list than as a manifesto.)

What's the best thing to read on "Zvi's writing on EAs confusing the map for the territory"? Or at least something good?

Thanks for the engagement. Sorry for not really engaging back. Hopefully someday I'll elaborate on all this in a top-level post.

Briefly: by axiological utilitarianism, I mean classical (total, act) utilitarianism, as a theory of the good, not as a decision procedure for humans to implement.

Thanks. I agree that the benefits could outweigh the costs, certainly at least for some humans. There are sophisticated reasons to be veg(etari)an. I think those benefits aren't cruxy for many EA veg(etari)ans, or many veg(etari)ans I know.

Or me. I'm veg(etari)an for selfish reasons — eating animal corpses or feeling involved in the animal-farming-and-killing process makes me feel guilty and dirty.

I certainly haven't done the cost-benefit analysis on veg(etari)anism, on the straightforward animal-welfare consideration or the considerations you mention. For example, if I were veg(etari)an for the straightforward reason (for agent-neutral consequentialist reasons), I'd do the cost-benefit analysis, and do things like:

  • Eat meat that would otherwise go to waste (when that wouldn't increase anticipated demand for meat in the future)
  • Try to reduce others' meat consumption, and try to reduce the supply of meat or improve the lives of farmed animals, when that's more cost-effective than personal veg(etari)anism
  • Notice whether eating meat would substantially boost my health and productivity, and go back to eating meat if so

I think my veg(etari)an friends are mostly like me — veg(etari)an for selfish reasons. And they don't notice this.

Written quickly, maybe hard-to-parse and imprecise.

(I agree it is reasonable to have a bid-ask spread when betting against capable adversaries. I think the statements-I-object-to are asserting something else, and the analogy to financial markets is mostly irrelevant. I don't really want to get into this now.)

Thanks. I agree! (Except with your last sentence.) Sorry for failing to communicate clearly; we were thinking about different contexts.


Some people say things like "my doom-credence fluctuates between 10% and 25% day to day"; this is Dutch-book-able, and they'd make better predictions if they reported their average credence rather than how they feel today, except insofar as they have new information.
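To make the Dutch-book point concrete, here is a minimal sketch (all numbers are made up for illustration): a counterparty who trades a $1-if-event contract against someone whose stated credence predictably oscillates, with no new information, locks in a riskless profit.

```python
# Hypothetical sketch: a forecaster's stated credence in a fixed event
# predictably oscillates between 0.10 and 0.25 with no new information.
# A counterparty trading a $1-if-event contract at the forecaster's
# stated fair price each day locks in a riskless profit.
low_credence = 0.10   # made-up "low day" credence
high_credence = 0.25  # made-up "high day" credence

buy_price = low_credence    # buy the contract on a low day
sell_price = high_credence  # sell it back on a high day

profit = sell_price - buy_price  # earned whether or not the event occurs
print(f"Riskless profit per round trip: ${profit:.2f}")  # prints $0.15
```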

Common beliefs/attitudes/dispositions among [highly engaged EAs/rationalists + my friends] which seem super wrong to me:


  • Giving a range of probabilities when you should give a probability + giving confidence intervals over probabilities + failing to realize that probabilities of probabilities just reduce to simple probabilities
    • But thinking in terms of probabilities over probabilities is sometimes useful, e.g. you have a probability distribution over possible worlds/models and those worlds/models are probabilistic
  • Unstable beliefs about stuff like AI timelines, in the sense that I'd be pretty likely to say something pretty different if you asked tomorrow
    • Instability in the sense of being likely to change beliefs if you thought about it more is fine; fluctuating predictably (Dutch-book-ably) is not
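The reduction the first bullet points at is just the law of total probability: a credence distribution over probabilistic models still yields a single probability for any event. A minimal sketch with made-up numbers:

```python
# Made-up numbers: credences over two probabilistic models of the world,
# each assigning its own probability to some event. The "probability of
# a probability" collapses to one number via the law of total probability:
#   P(event) = sum_m P(model m) * P(event | model m)
models = [
    (0.6, 0.05),  # (credence in model, event probability under that model)
    (0.4, 0.40),
]
p_event = sum(weight * p for weight, p in models)
print(f"P(event) = {p_event:.2f}")  # 0.6*0.05 + 0.4*0.40 = 0.19
```

For any decision that depends only on whether the event happens, this single number is all that matters; the distribution over models matters again only when new evidence arrives and you update the model weights.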


  • Axiologies besides ~utilitarianism
    • Possibly I'm actually noticing sloppy reasoning about how to go from axiology to decision procedure, possibly including just not taking axiology seriously
  • Veg(etari)anism for terminal reasons; veg(etari)anism as ethical rather than as a costly indulgence
  • Thinking personal flourishing (or something else agent-relative) is a terminal goal worth comparable weight to the impartial-optimization project

Cause prioritization:

  • Cause prioritization that doesn't take seriously that the cosmic endowment is astronomical (likely worth >10^60 happy human lives) and that we can nontrivially reduce x-risk
  • Deciding in advance to boost a certain set of causes [what determines that set??], or a "portfolio approach" without justifying the portfolio-items
    • E.g. multiple CEA staff donate by choosing some cause areas and wanting to help in each of those areas
    • Related error: agent-relativity
    • Related error: considering difference from status quo rather than outcomes in a vacuum
    • Related error: risk-aversion in your personal contributions (much more egregious than risk-averse axiology)
    • Instead you should just argmax — find the marginal value of your resources in each cause (for your resources that can funge between causes), then use them in the best possible way
  • Intra-cause offsetting: if you do harm in area X [especially if it's avoidable/unnecessary/intentional], you should fix your harm in that area, even if you could do more good in another area
    • Maybe very few of my friends actually believe this


  • Not noticing big obvious problems with impact certificates/markets
  • Naively using calibration as a proxy for forecasting ability
  • Thinking you can (good-faith) bet on the end of the world by borrowing money
    • Many examples, e.g. How to place a bet on the end of the world
    • I think most of us understand the objection that you can do better by just borrowing money at market rates — I think many people miss that utility is about ∫consumption, not ∫bankroll (note the bettor typically isn't liquidity-constrained). The bet only makes sense if you spend all your money before you'd have to pay it back.
  • [Maybe something deep about donations; not sure]
  • [Maybe something about compensating altruists or compensating for roles often filled by altruists; not sure]
  • [Maybe something about status; not sure]

Possibly I'm wrong about which attitudes are common.

For now I'm just starting a list, not trying to be legible, much less change minds. I know I haven't explained my views.

Edit: I'm sharing controversial beliefs, without justification and with some framed provocatively. If one of these views makes you think worse of me to a nontrivial degree, please ask for elaboration; maybe there's miscommunication or it's more reasonable than it seems. Edit 2: there are so many comments; I may not respond to requests-for-elaboration but will at least notice them as a bid-for-elaboration-at-some-point.

I have high credence in basically zero x-risk after [the time of perils / achieving technological maturity and then stabilizing / 2050]. Even if that credence were pretty low, "pretty low" * 10^70 ≈ 10^70. Most value comes from the worlds with an extremely low long-term rate of x-risk, even if you think they're unlikely.
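The arithmetic behind "pretty low" * 10^70 ≈ 10^70 can be sketched with illustrative numbers (none of these figures are claims from the post):

```python
# Illustrative expected-value arithmetic (all numbers made up): even a
# small credence in worlds with ~zero long-term x-risk dominates the
# expected value, because the payoff there is astronomically larger.
p_low_risk = 0.01        # "pretty low" credence in near-zero long-term x-risk
value_low_risk = 1e70    # approximate value of the cosmic endowment
value_otherwise = 1e40   # made-up, much smaller value in other worlds

ev = p_low_risk * value_low_risk + (1 - p_low_risk) * value_otherwise
print(f"EV = {ev:.1e}")  # 1.0e+68: within two orders of magnitude of 1e70
```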

(I expect an effective population much larger than 10^10 humans, though I'm not sure "population size" will be a useful concept (e.g. maybe we'll decide to wait billions of years before converting resources to value); either way, that's not the crux here.)
