@ Rethink Priorities
19032 karma · Joined Dec 2015 · Working (6-15 years)


"To see the world as it is, rather than as I wish it to be."

I'm a Senior Researcher on the General Longtermism team at Rethink Priorities. Right now, my team and I are re-orienting around the question of what the best things to do are; see here.

I also volunteer as a funds manager for EA Funds' Long-term Future Fund.



Yeah, I don't have non-deference-based arguments for really basic and important things like:

  • whether stars exist
  • how the money system works
  • gravity

And it was only in the last few years that I considered inside view arguments for why the Earth isn't flat. 


The more important metric to me will be whether it is possible to do highly (preferably positively) impactful work while collaborating with universities. I've seen positive evidence for this from individual professors/labs, but not for larger groups. 


I think I have those sympathies because I'm an evolved being, and this is a contingent fact of at least a) me being evolved and b) being socially evolved. I think it's also possible that there are details very specific to being a primate/human/WEIRD human specifically that are relevant to utilitarianism, though I currently don't think this is the most likely hypothesis[1]. 

If you think there's some underlying logic to them (which I do, and I would venture a decent fraction of utilitarians do) then why wouldn't you expect intelligent aliens to uncover the same logic?

I think I understand this argument. The claim is that if moral realism is true, and utilitarianism is correct under moral realism, then aliens will independently converge to utilitarianism. 

If I understand the argument correctly, it's the type of argument that makes sense syllogistically, but quickly falls apart probabilistically. Even if you assign only a 20% probability that utilitarianism is contingently human, this is, all else equal, enough to favor a human future, or the future of our endorsed descendants. 
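
A minimal sketch of the probabilistic point above, not part of the original comment. It assumes, purely for illustration, that if utilitarianism is universal then humans and aliens both converge on it, and that if it is contingently human then only humans (or our endorsed descendants) adopt it. The function name and the adoption probabilities are hypothetical; only the 20% figure comes from the comment.

```python
def p_utilitarian_future(p_contingent, p_adopt_if_contingent, p_adopt_if_universal=1.0):
    """Probability that a civilization ends up broadly utilitarian.

    p_contingent: credence that utilitarianism is contingently human.
    p_adopt_if_contingent: chance this civilization adopts it in that case.
    p_adopt_if_universal: chance it converges if the logic is universal
        (assumed equal for humans and aliens, i.e. "all else equal").
    """
    return (p_contingent * p_adopt_if_contingent
            + (1 - p_contingent) * p_adopt_if_universal)

P_CONTINGENT = 0.20  # the 20% credence from the comment

# Illustrative assumption: in the contingent case humans adopt it, aliens do not.
human = p_utilitarian_future(P_CONTINGENT, p_adopt_if_contingent=1.0)
alien = p_utilitarian_future(P_CONTINGENT, p_adopt_if_contingent=0.0)
gap = human - alien

print(f"human: {human:.2f}, alien: {alien:.2f}, gap: {gap:.2f}")
# prints "human: 1.00, alien: 0.80, gap: 0.20"
```

Under these assumptions the gap equals the credence in contingency times the difference in adoption rates, so it stays positive unless aliens are at least as likely as humans to land on the preferred morality.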

Now "all-else-equal" may not be true. But to argue that, you'd probably need to advance a position that somehow aliens are more likely than humans to discover the moral truths of utilitarianism (assuming moral realism is true), or that aliens are more or equally likely than humans to contingently favor your preferred branch of consequentialist morality. 

[1] eg I'd think it's more likely than not that sufficiently smart rats or elephants will identify with something akin to utilitarianism. Obviously not something I could have any significant confidence in.

On Twitter and elsewhere, I've seen a bunch of people argue that AI company execs and academics are only talking about AI existential risk because they want to manufacture concern to increase investments and/or as a distraction away from near-term risks and/or regulatory capture. This is obviously false. 

However, there is a nearby argument that is likely true: incentives drive how people talk about AI risk, as well as which specific regulations or interventions they ask for. This is likely to happen both explicitly and unconsciously. It's important (as always) to have extremely solid epistemics, and understand that even apparent allies may have (large) degrees of self-interest and motivated reasoning. 

Safety-washing is a significant concern; similar things have happened a bunch in other fields, it likely has already happened a bunch in AI, and will likely happen again in the months and years to come, especially if/as policymakers and/or the general public become increasingly uneasy about AI.


I disagree; I think major risks should be defined in terms of their potential impact sans intervention, rather than taking tractability into account (negatively). 

Incidentally, there was some earlier speculation about what counterfactually might have happened if we had invented CFCs a century earlier, which you might find interesting.


Unless I'm misunderstanding something important (which is very possible!) I think Bengio's risk model is missing some key steps. 

In particular, if I understand the core argument correctly, it goes like this:

1. (Individually) Human-level AI is possible.

2. At the point where individual AIs reach human-level intelligence, they will collectively be superhuman in ability, due to various intrinsic advantages of being digital.

3. It's possible to build such AIs with autonomous goals that are catastrophically or existentially detrimental to humanity. (Bengio calls them "rogue AIs")

4. Some people may choose to actually build rogue AIs.

5. Thus, (some chance of) doom.

As stated, I think this argument is unconvincing, because for superhuman rogue AIs to be catastrophic for humanity, they need to be catastrophic not only for 2023_Humanity but also for humanity after we have the assistance of superhuman or near-superhuman AIs. 

If I was trying to argue for Bengio's position, I would probably go down one (or more) of the following paths: 

  1. Alignment being very hard/practically impossible: If alignment is very hard and nobody can reliably build a superhuman AI that's sufficiently aligned that we trust it to stop rogue AI, then the rogue AI can cause a catastrophe unimpeded.
    1. Note that this is not just an argument for the possibility of rogue AIs, but an argument against non-rogue AIs.
  2. Offense-defense imbalance: Perhaps it's easier in practice to create rogue AIs to destroy the world than to create non-rogue AIs to prevent the world's destruction.
    1. Vulnerable world: Perhaps it's much easier to destroy the world than prevent its destruction
      1. Toy example: Suppose AIs with a collective intelligence of 200 IQ are enough to destroy the world, but AIs with a collective intelligence of 300 IQ are needed to prevent the world's destruction. Then the "bad guys" will have a large head start on the "good guys."
    2. Asymmetric carefulness: Perhaps humanity will not want to create non-rogue AIs because most people are too careful about the risks. Eg maybe we have an agreement among the top AI labs to not develop AI beyond capabilities level X without alignment level Y, or something similar in law (and suppose in this world that normal companies mostly follow the law and at least one group building rogue AIs doesn't).
      1. In a sense, you can view this as a more general case of (1). In this story, we don't need AI alignment to be very hard, just for humanity to believe it is.

It's possible Bengio already believes 1) or 2), or something else similar, and just thought it was obvious enough to not be worth noting. But in my conversations with AI risk skeptics, among the ones who think AGI itself is possible/likely, the most common objection is to ask why rogue AIs would be able to overpower not just humans but other AIs as well.


Thanks for the reply! I appreciate it and will think further.

This seems to be the best paper on the topic, concluding that climate change drove most of the extinctions: https://www.sciencedirect.com/science/article/pii/S2590332220304760

To confirm, you find the climate change extinction hypotheses very credible here? I know very little about the topic except I vaguely recall that some scholars also advanced climate change as the hypothesis for the megafauna extinctions but these days it's generally considered substantially less credible than human origin.


since this is the one major risk where we are doing a good job

What about ozone layer depletion?


At least in the US, I'd worry that comparisons to climate change will get you attacked by ideologues from both of the main political sides (vitriol from the left because they'll see it as evidence that you don't care enough about climate change, vitriol from the right because they'll see it as evidence that AI risk is as fake/political as climate change).


I agree there are probably a few longtermist and/or EA-affiliated people at Schmidt Futures, just as there are probably such people at Google, Meta, the World Bank, etc. This is a different claim than whether Schmidt Futures institutionally is longtermist, which is again a different claim from whether Eric Schmidt himself is.
