In the most recent episode of the 80,000 Hours podcast, Rob Wiblin and Ajeya Cotra from Open Phil discuss "the challenge Open Phil faces striking a balance between taking big ideas seriously, and not going all in on philosophical arguments that may turn out to be barking up the wrong tree entirely."

They also discuss:

  • Which worldviews Open Phil finds most plausible, and how it balances them
  • Which worldviews Ajeya doesn’t embrace but almost does
  • How hard it is to get to other solar systems
  • The famous ‘simulation argument’
  • When transformative AI might actually arrive
  • The biggest challenges involved in working on big research reports
  • What it’s like working at Open Phil
  • And much more

I'm creating this thread so that anyone who wants to share their thoughts on any of the topics covered in this episode can do so. This is in the spirit of MichaelA's suggestion of posting all EA-relevant content here.


Thanks for making this linkpost, Evelyn! I did have some thoughts on this episode, which I'll split into separate comments so it's easier to keep discussion organised. (A basic point is that the episode was really interesting, and I'd recommend others listen as well.)

A bundle of connected quibbles: 

  • Ajeya seems to use the term "existential risk" to mean just "extinction risk"
  • She seems to imply totalism is necessary for longtermism
  • She seems to imply longtermism is only/necessarily focused on existential risk reduction
  • (And I disagree with those things.)

An illustrative quote from Ajeya:

I think I would characterise the longtermist camp as the camp that wants to go all the way with buying into the total view — which says that creating new people is good — and then take that to its logical conclusion, which says that bigger worlds are better, bigger worlds full of people living happy lives are better — and then take that to its logical conclusion, which basically says that because the potential for really huge populations is so much greater in the future — particularly with the opportunity for space colonisation — we should focus almost all of our energies on preserving the option of having that large future. So, we should be focusing on reducing existential risks.

But "existential risks" includes not just extinction risk but also risks of unrecoverable collapse, unrecoverable dystopia, and some (but not all) s-risks/suffering catastrophes. (See here.)

And my understanding is that, if we condition on rejecting totalism: 

  • Risk of extinction does become way less important
  • Risk of unrecoverable collapse probably becomes way less important (though this is a bit less clear)
  • Risks of unrecoverable dystopia and s-risks still retain much of their importance

(See here for some discussion relevant to those points.)

So one can reasonably be a non-totalist yet still prioritise reducing existential risk - especially risk of unrecoverable dystopias. 

Relatedly, a fair number of longtermists are suffering-focused and/or prioritise s-risk reduction, sometimes precisely because they reject the idea that making more happy beings is good but do think making more suffering beings is bad.

Finally, one can be a longtermist without prioritising either extinction risk reduction or the reduction of other existential risks. In particular, one could prioritise work on what I'm inclined to call "non-existential trajectory changes". From a prior post of mine:

But what if some of humanity’s long-term potential is destroyed, but not the vast majority of it? Given Ord and Bostrom’s definitions, I think that the risk of that should not be called an existential risk, and that its occurrence should not be called an existential catastrophe. Instead, I’d put such possibilities alongside existential catastrophes in the broader category of things that could cause “Persistent trajectory changes”. More specifically, I’d put them in a category I’ll term in an upcoming post “non-existential trajectory changes”. (Note that “non-existential” does not mean “not important”.)

(Relatedly, my impression from a couple videos or podcasts is that Will MacAskill is currently interested in thinking more about a broad set of trajectory changes longtermists could try to cause/prevent, including but not limited to existential catastrophes.)

I expect Ajeya knows all these things. And I think it's reasonable for a person to think that extinction risks are far more important than other existential risks, that the strongest argument for longtermism rests on totalism, and that longtermists should only/almost only prioritise existential/extinction risk reduction. (My own views are probably more moderate versions of those stances.) But it seems to me that it's valuable to not imply that those things are necessarily true or true by definition.

(Though it's of course easy to state things in ways that are less than perfectly accurate or nuanced when speaking in an interview rather than producing edited, written content. And I did find a lot of the rest of that section of the interview quite interesting and useful.)

Somewhat relatedly, Ajeya seems to sort-of imply that "the animal-inclusive worldview" is necessarily neartermist, and that "the longtermist worldview" is necessarily human-centric. For example, the above quote about longtermism focuses on "people", which I think would typically be interpreted as just meaning humans, and as very likely excluding at least some beings that might be moral patients (e.g., insects). And later she says:

And then within the near-termism camp, there’s a very analogous question of, are we inclusive of animals or not?

But I think the questions of neartermism vs longtermism and animal-inclusivity vs human-centrism are actually fairly distinct. Indeed, I consider myself an animal-inclusive longtermist.

I do think it's reasonable to be a human-centric longtermist. And I do tentatively think that even animal-inclusive longtermism should still prioritise existential risks, and still with extinction risks as a/the main focus within that. 

But I think animal-inclusivity makes at least some difference (e.g., pushing a bit in favour of prioritising reducing risks of unrecoverable dystopias). And it might make a larger difference. And in any case, it seems worth avoiding implying that all longtermists must be focused only or primarily on benefitting humans, since that isn't accurate.

(But as with my above comment, I expect that Ajeya knows these things, and that the fact she was speaking rather than producing edited written content is relevant here.)

I haven't finished listening to the podcast episode yet but I picked up on a few of these inaccuracies and was disappointed to hear them. As you say, I would be surprised if Ajeya isn't aware of these things. Anyone who has read Greaves and MacAskill's paper The Case for Strong Longtermism should know that longtermism doesn't necessarily mean a focus on reducing x-risk, and that it is at least plausible that longtermism is not conditional on a total utilitarian population axiology*.

However, given that many people listening to the show might not have read that paper, I feel these inaccuracies are important and might mislead people. If longtermism is robust to different views (or at least if this is plausible), then it is very important for EAs to be aware of this. I think that it is important for EAs to be aware of anything that might be important in deciding between cause areas, given the potentially vast differences in value between them. 

*Even the importance of reducing extinction risk isn't conditional on total utilitarianism. For example, it could be vastly important under average utilitarianism if we expect the future to be good, conditional on humans not going extinct. That said, I'm not sure how many people take average utilitarianism seriously.

Thank you for writing this critique; it was a thought I had while listening as well. In my experience many EAs make the same mistake, not just Ajeya.

Update: I sort-of adapted this comment into a question for Ajeya's AMA, and her answer clarifies her views. (It seems like she and I do in fact basically agree on all of these points.)

Is eventually expanding beyond our solar system necessary for achieving a long period with very low extinction risk?

As part of the discussion of "Effective size of the long-term future", Ajeya and Rob discussed the barriers to and likelihood of various forms of space colonisation. I found this quite interesting. 

During that section, I got the impression that Ajeya was implicitly thinking that a stable, low-extinction-risk future would require some kind of expansion beyond our solar system. (Though I don't think she said that explicitly, so maybe I'm making a faulty inference. Perhaps what she actually had in mind was just that such expansion could be one way to get a stable, low-extinction-risk future, such that the likelihood of such expansion was one important question in determining whether we can get such a future, and a good question to start with.)

If she does indeed think that, that seems a bit surprising to me. I haven't really thought about this before, but I think I'd guess that we could have a stable, low-extinction-risk future - for, let's say, hundreds of millions of years - without expanding beyond our solar system. Such expansion could of course help[1], both because it creates "backups" and because there are certain astronomical extinction events that would by default happen eventually to Earth/our solar system. But it seems to me plausible that the right kind of improved technologies and institutions would allow us to reduce extinction risks to negligible levels just on Earth for hundreds of millions of years. 

But I've never really directly thought about this question before, so I could definitely be wrong. If anyone happens to have thoughts on this, I'd be interested to hear them.

[1] I'm not saying it'd definitely help - there are ways it could be net negative. And I'm definitely not saying that trying to advance expansion beyond our solar system is an efficient way to reduce extinction risk.

In my view, the AI timelines work is exactly the kind of research longtermists should do more of to persuade those more skeptical, and test their own assumptions.

Another example of such research is Christian Tarsney's "The Epistemic Challenge to Longtermism".

What I'd like to see next are grounded estimates of the causal effects of longtermist interventions, like research and policy, and including risks of backfire (e.g. accelerating AI development).

The sections "Biggest challenges with writing big reports" and "What it’s like working at Open Phil" were interesting and relatable

A lot of what was said in these sections aligned quite a bit with my own experiences from researching/writing about EA topics, both as part of EA orgs and independently. 

For example, Ajeya said:

One thing that’s really tough is that academic fields that have been around for a while have an intuition or an aesthetic that they pass on to new members about, what’s a unit of publishable work? It’s sometimes called a ‘publon’. What kind of result is big enough? What kind of argument is compelling enough and complete enough that you can package it into a paper and publish it? And I think with the work that we’re trying to do — partly because it’s new, and partly because of the nature of the work itself — it’s much less clear what a publishable unit is, or when you’re done. And you almost always find yourself in a situation where there’s a lot more research you could do than you assumed naively, going in. And it’s not always a bad thing.

It’s not always you’re being inefficient or you’re going down rabbit holes, if you choose to do that research and just end up doing a much bigger project than you thought you were going to do. I think this was the case with all of the timelines work that we did at Open Phil. My report and then other reports. It was always the case that we came in, we thought, I thought I would do a more simple evaluation of arguments made by our technical advisors, but then complications came up. And then it just became a much longer project. And I don’t regret most of that. So it’s not as simple as saying, just really force yourself to guess at the outset how much time you want to spend on it and just spend that time. But at the same time, there definitely are rabbit holes, and there definitely are things you can do that eat up a bunch of time without giving you much epistemic value. So standards for that seemed like a big, difficult issue with this work.

I think most of the EA-related things I've started looking into and writing up, except those that I deprioritised very early on, ended up growing and spawning spinoff tangent docs/posts. And then those spinoffs often ended up spawning their own spinoffs, and so on. And I think this was usually actually productive, and sometimes the spinoffs were more valuable than the original thing, but it definitely meant a lot of missed deadlines, changed plans, and uncertainties about when to just declare something finished and move on.

I don't have a lot of experience with research/writing on non EA-related topics, so maybe this is just a matter of my own (perhaps flawed) approach, or maybe it's just fairly normal. (One thing that comes to mind here is that - if I recall correctly - Joe Henrich says in his newest book, The WEIRDest People in the World, that his previous book - The Secret of Our Success - was all basically just meant to be introductory chapters to WEIRDest People. And the prior book is itself quite long and quite fascinating!) 

But I did do ~0.5 FTE-years of academic psychology research during my Honours year. There I came up with the question and basic design before even starting, and the final product stuck pretty closely to that plan, and on schedule, with no tangents. So there's at least weak evidence that my more recent tangent-heavy approach (which I think I actually endorse) reflects the nature of this newer kind of work, rather than being an approach I'd adopt even in more established fields. 

A few other things Ajeya said in those sections that resonated with me:

So a lot of the feeling of collaboration and teamyness and collegiality is partly driven by like, does each part of this super siloed organisation have its own critical mass.

[...]

And then [in terms of what I dislike about my job], it comes back to the thing I was saying about how it’s a pretty siloed organisation. So each particular team is quite small, and then within each team, people are spread thin. So there’s one person thinking about timelines and there’s one person thinking about biosecurity, and it means the collaboration you can get from your colleagues — and even the feeling of team and the encouragement you can get from your colleagues — is more limited. Because they don’t have their head in what you’re up to. And it’s very hard for them to get their head in what you’re up to. And so people often find that people don’t read their reports that they worked really hard on as much as they would like, except for their manager or a small set of decision makers who are looking to read that thing.

And so I think that can be disheartening. 

It was interesting - and sort of nice, in a weird way! - to hear that even someone with a relatively senior role at one of the most prominent and well-resourced EA orgs has those experiences and perceptions.

(To be clear, I've overall been very happy with the EA-related roles I've worked in! Ajeya also talked about a bunch of stuff about her job that's really positive and that also resonated with me.)

One other part of those sections that feels worth highlighting:

Rob Wiblin: Is there anything you can say to people who I guess either don’t think it’s possible they’ll get hired by Open Phil and maybe were a bit disappointed by that, or have applied and maybe didn’t manage to get a trial?

Ajeya Cotra: Yeah. I guess my first thought is that Open Phil is not people’s only opportunity to do good. Even doing generalist research of the kind that I think Open Phil does a lot of, especially for that kind of research, I think it’s a blessing and a curse, but you just need a desk and a computer to do it. I would love to see people giving it a shot more, and I think it’s a great way to get noticed. So when we write reports, all the reports we put out recently have long lists of open questions that I think people could work on. And I know of people doing work on them and that’s really exciting to me. So that’s one way to just get your foot in the door, both in terms of potentially being noticed at a place like Open Phil or a place like FHI or GPI, and also just get a sense of what does it feel like to do this? And do you like it? Or are the cons outweighing the pros for you?

I sort-of effectively followed similar advice, and have been very happy with the apparent results for my own career. And I definitely agree that there are a remarkable number of open questions (e.g., here and here) which it seems like a variety of people could just independently have a crack at, thereby testing their fit and/or directly providing useful insights.

The doomsday argument, the self-sampling assumption (SSA), and the self-indication assumption (SIA)

The interview contained an interesting discussion of those ideas. I was surprised to find that, during that discussion, I felt like I actually understood what the ideas of SSA and SIA were, and why that mattered. (Whereas there've been a few previous times when I tried to learn about those things, but always ended up mostly still feeling confused. That said, it's very possible I currently just have an illusion of understanding.)

While listening, I felt like maybe that section of the interview could be summarised as follows (though note that I may be misunderstanding things, such that this summary might be misleading):

"We seem to exist 'early' in the sequence of possible humans. We're more likely to observe that if the sequence of possible humans will actually be cut off relatively early than if more of the sequence will occur. This should update us towards thinking the sequence will be cut off relatively early - i.e., towards thinking there will be relatively few future generations. This is how the SSA leads to the doomsday argument.

But, we also just seem to exist at all. And we're more likely to observe that (rather than observing nothing at all) the more people will exist in total - i.e., the more of the sequence of possible humans will occur. This should update us towards thinking the sequence won't be cut off relatively early. This is how the SIA pushes against the doomsday argument.

Those two updates might roughly cancel out [I'm not actually sure if they're meant to exactly, roughly, or only very roughly cancel out]. Thus, these very abstract considerations have relatively little bearing on how large we should estimate the future will be."

(I'd be interested in people's thoughts on whether my attempted summary seems accurate, as well as on whether it seems relatively clear and easy to follow.)
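For what it's worth, the cancellation described in that summary can be illustrated with a toy two-hypothesis Bayesian calculation. (The population sizes and birth rank below are made-up numbers purely for illustration, not anything from the episode; the point is just the structure of the two updates.)

```python
from fractions import Fraction

# Toy model: two hypotheses about how many humans will ever exist.
# "Doom soon" => 100 people total; "doom late" => 1000 people total.
# We observe that we are person number 50 - an "early" birth rank.
N_SMALL, N_LARGE = 100, 1000
RANK = 50  # must be <= N_SMALL for the observation to be possible on both hypotheses

prior = {N_SMALL: Fraction(1, 2), N_LARGE: Fraction(1, 2)}

def normalise(weights):
    total = sum(weights.values())
    return {h: w / total for h, w in weights.items()}

# SSA: given a world with N people, the chance of having any particular
# rank is 1/N, so the likelihood penalises large worlds - the doomsday update.
ssa = normalise({n: prior[n] * Fraction(1, n) for n in prior})

# SIA first reweights the prior by population size (you are more likely to
# exist at all in a bigger world), then applies the same 1/N rank term.
sia_prior = normalise({n: prior[n] * n for n in prior})
combined = normalise({n: sia_prior[n] * Fraction(1, n) for n in sia_prior})

print(float(ssa[N_SMALL]))       # ~0.909: SSA alone strongly favours "doom soon"
print(float(combined[N_SMALL]))  # 0.5: the SIA reweighting exactly cancels it
```

In this simple setup the N and 1/N factors cancel exactly, so the posterior equals the prior; my understanding is that the exactness of the cancellation depends on the modelling assumptions, which is part of what I was unsure about above.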

One other thing on this section of the interview: Ajeya and Rob both say that the way the SSA leads to the doomsday argument seems sort-of "suspicious". Ajeya then says that, on the other hand, the way the SIA causes an opposing update also seems suspicious. 

But I think all of her illustrations of how updates based on the SIA can seem suspicious involved infinities. And we already know that loads of things involving infinities can seem counterintuitive or suspicious. So it seems to me like this isn't much reason to feel that SIA in particular can cause suspicious updates. In other words, it seems like maybe the "active ingredient" causing the suspiciousness in the examples she gives is infinity, not SIA. Whereas the way the SSA leads to the doomsday argument doesn't have to involve infinity, so there it seems like SSA is itself suspicious.

I'm not sure whether this is a valid or important point, but maybe it is? (I obviously don't think we should necessarily dismiss things just because they feel "suspicious", but it could make sense to update a bit away from them for that reason, and, to the extent that that's true, a difference in the suspiciousness of SSA vs SIA could matter.)

I like Ajeya's metaphor of different "sects" of the EA movement being different stops on the "train to crazy town":

So, you can think about the longtermist team as trying to be the best utilitarian philosophers they can be, and trying to philosophy their way into the best goals, and win that way. Where at least moderately good execution on these goals that were identified as good (with a lot of philosophical work) is the bet they’re making, the way they’re trying to win and make their mark on the world. And then the near-termist team is trying to be the best utilitarian economists they can be, trying to be rigorous, and empirical, and quantitative, and smart. And trying to moneyball regular philanthropy, sort of. And they see their competitive advantage as being the economist-y thinking as opposed to the philosopher-y thinking.

And so when the philosopher takes you to a very weird unintuitive place — and, furthermore, wants you to give up all of the other goals that on other ways of thinking about the world that aren’t philosophical seem like they’re worth pursuing — they’re just like, stop… I sometimes think of it as a train going to crazy town, and the near-termist side is like, I’m going to get off the train before we get to the point where all we’re focusing on is existential risk because of the astronomical waste argument. And then the longtermist side stays on the train, and there may be further stops.

The idea of neartermism and longtermism reflecting economist-like and philosopher-like ways of thinking struck a chord with me. I feel as if there are conflicting parts of me that want to follow different approaches to saving the world.

I'm not finished yet with the whole episode, but I didn't understand the part about fairness agreements and the veil of ignorance that Rob and Ajeya were talking about as a way to figure out how much money to allocate per worldview. This was the part from 00:27:50 to 00:41:05.  I think I understood the outlier opportunities principle though.

I've re-read the transcript once to try and understand it more but I still don't. I also googled about the Veil of Ignorance, and it started to make more sense, but I still don't understand the fairness agreements part. Is there a different article that explains what Ajeya meant by that? Or can someone explain it in a better way? Thanks!

(It turns out I was slightly mistaken in my other comment - there actually are a few public written paragraphs on the idea of fairness agreements in one section of a post by Holden Karnofsky in 2018.)

A separate comment I had been writing about that section of the interview:

  • Ajeya and Rob discussed "Fairness agreements". This seemed to me like a novel and interesting approach that could be used for normative/moral uncertainty (though Open Phil seem to be using it for worldview uncertainty, which is related but a bit different)
    • I currently feel more inclined towards some other approaches to moral uncertainty
      • But at this stage where the topic of moral uncertainty has received so little attention, it seems useful to come up with additional potential approaches
      • And it may be that, for a while, it remains useful to have multiple approaches one can bring to bear on the same question, to see where their results converge and diverge
    • On a meta level, I found it interesting that the staff of an organisation primarily focused on grantmaking appear to have come up with what might be a novel and interesting approach to normative/moral uncertainty
      • That seems like the sort of abstract theoretical philosophy work that one might expect to only be produced by academic philosophers, rather than people at a more "applied" org

A more direct response to your comment:

  • I haven't heard of the idea before, and had read a decent amount on moral uncertainty around the start of 2020. That, plus the way the topic was introduced in this episode, makes me think that this might be a new idea that hasn't been publicly written up yet.
    • (See also the final bullet point here)
    • [Update: I was slightly mistaken; there are a few written paragraphs on the idea here]
  • I think it's understandable to have been a bit confused by that part; I don't think I fully understood the idea myself, and I got the impression that it was still at a somewhat fuzzy stage
    • (I'd guess that with an hour of effort I could re-read that part of the transcript and write an ok explainer, but unfortunately I don't have time right now. But hopefully someone else will be able to do that, ideally better and more easily than I could!)

No worries that you don't have the time to explain it Michael! I'm glad to hear that others haven't heard of the idea before and that this is a new topic. Hopefully someone else can explain it in more depth. I think sometimes concepts featured in 80K podcast episodes or other EA content can be really hard to grasp, and maybe others can create visuals, videos, or better explanations to help.

An example of another hard-to-grasp topic in 80K's past episodes is complex cluelessness. I think Hilary Greaves and Arden did a good/okay job of explaining it, and I kinda get the idea, but it would be hard for me to explain without looking up the paper, reading the transcript, or listening to the podcast again.

I also still find the concept of complex cluelessness slippery, and am under the impression that many EAs misunderstand and misuse the term compared to Greaves' intention. But if you haven't seen it already, you may find this talk from Greaves helpful.