Implications of evidential cooperation in large worlds

Lukas Finnveden

21 min read · Aug 23, 2023

Comments 1

Wei Dai

One way to affect things is to increase the probability that humanity ends up building a healthy and philosophically competent civilization. (But we already knew that was important.)

Do you know anyone who is actually working on this, especially the second part (philosophical competence)? I've been thinking about this myself, and wrote some LW posts on the topic. (In short, my main message is that if we care about our collective philosophical competence, the AI transition represents both a high risk and a unique opportunity.) But I feel like my public and private efforts to attract more attention and work to this area haven't yielded much. Do you see things differently?

Footnotes

For even more references, see all the content gathered on this page, and more recently, this post written by Paul Christiano and this paper by Johannes Treutlein.

If you know any plausible implication that I don’t list here — then either I don’t buy that it’s an implication of ECL, or it doesn’t seem sufficiently decision-relevant to me, or I haven’t thought about it / forgot about it and you should let me know.

Whereas today, we can focus on handing off the future to a broadly competent and healthy civilization, and trust decisions about what to do with the future to them.

When I discuss how we should “care more about other humans’ universe-wide values”, I exclusively refer to universe-wide values held by humans on our current planet Earth, as opposed to values that might be held by distant human-like species. But the reason to benefit such values is to generate evidence that other people benefit our values on distant planets (not just here, on planet Earth). So why focus specifically on humans’ values? The reason is that we are more confident that some people treasure them, and it’s easy to benefit them via supporting humans who support them. For more, see here.

“Misaligned AI” refers to AI whose values are very different from what was intended by the evolved species that first created it. If a distant species has very different values from us, and successfully aligns the AI systems that they create, I wouldn’t count those as “misaligned AIs”.

Or any other kind of acausal effect.

Premature commitments are often a gamble that might gain you a better bargaining position while carrying a risk of everyone getting a lower payoff. Since that’s quite uncooperative, it seems plausible that ECL could discourage premature commitments. So this might be a reason to spread knowledge about ECL.

Though it is also possible that careless thinking could increase them, given that they are by their nature caused by humanity making errors in the order in which it learns about and commits to doing certain things.

And ideally, you would also think about other opportunities that faction A and faction B would have of benefiting each other, since you might also be providing evidence about those. Even more ideally, you might think about possible gains from trades that involve even more factions.

Though the total effort that goes to each should perhaps still be allocated based on the number of people who support each set of values and who are sympathetic to ECL. Potentially adjusted by speculation about whether either set of values is underrepresented (among ECL-sympathizers) on Earth compared to the universe-at-large, in which case we should prioritize that set of values higher.

It will be the most evidence for the actions of people in exactly my position. But this is not where most of my acausal influence will come from, since even a small amount of evidence across a sufficiently large number of actors will weigh higher. The hypothesis that I’m putting forward here is that there might be some fairly broad class of actors which still share some key similarities with you, whose decisions your decisions provide more evidence about. And that your values might be (or be correlated with) one of the key similarities.

Though I am personally somewhat sympathetic to both upside- and downside-focused values, so this doesn’t have a big impact on my all-things-considered view.

Even if the aliens who went extinct shared our values, their choice to prioritize non-AI extinction risk less could still have been net-positive ex-ante. For example, they might have reallocated resources in a way that reduced AI takeover risk by 0.1% and increased non-AI extinction risk by 0.1001%. The added 0.0001% of x-risk might have been worth the benefit of leaving behind empty space rather than AI-controlled space in 0.1% of worlds.
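
The arithmetic in this footnote can be checked with a rough sketch. The function name `net_benefit` and the three outcome values passed to it are illustrative assumptions, not figures from the post; only the two probability shifts (0.1% and 0.1001%) come from the footnote.

```python
def net_benefit(v_good, v_empty, v_ai, d_ai=0.001, d_ext=0.001001):
    """Change in expected value from the reallocation in the footnote.

    d_ai  : reduction in AI takeover probability (0.1%, from the footnote)
    d_ext : increase in non-AI extinction probability (0.1001%, from the footnote)
    v_good, v_empty, v_ai : hypothetical values of a good future, of empty
        space, and of AI-controlled space (assumed inputs, not from the post)
    """
    # Probability of a good future changes by d_ai - d_ext (about -0.0001%).
    d_good = d_ai - d_ext
    return d_good * v_good + d_ext * v_empty - d_ai * v_ai


# If AI-controlled space is even slightly worse than empty space
# (here -0.01 versus 0, with a good future normalized to 1),
# the reallocation comes out positive in expectation.
print(net_benefit(v_good=1.0, v_empty=0.0, v_ai=-0.01) > 0)

# If AI-controlled space is exactly as good as empty space,
# only the extra 0.0001% of extinction risk remains, so it is negative.
print(net_benefit(v_good=1.0, v_empty=0.0, v_ai=0.0) < 0)
```

Under these assumptions the trade is worthwhile roughly when the gap between empty space and AI-controlled space exceeds one thousandth of the gap between a good future and empty space.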

In particular, ECL suggests that we should discount benefits to aliens insofar as they on average correlate less strongly with us than the average civilizations-with-our-values do. (When making relevant decisions.)

As an example of someone with this view: This Facebook post by Eliezer Yudkowsky starts “I think that I care about things that would, in your native mental ontology, be imagined as having a sort of tangible red-experience or green-experience, and I prefer such beings not to have pain-experiences. Happiness I value highly is more complicated.” Yudkowsky has also written about the complexity and fragility of value elsewhere, e.g. here.

In particular, ECL suggests that we should discount benefits to AI insofar as they correlate less strongly with us than actors-with-our-values do.

Contents

Summary (with links to sub-sections)

Affect whether (and how) future actors do ECL

Futures with aligned AI

Futures with misaligned AI

How us doing ECL affects our priorities

Care more about other humans’ universe-wide values

It matters less which universe-wide values control future resources (seems minor in practice?)

Upside- and downside-focused longtermists should care more about each others’ values

Care more about evolved aliens’ universe-wide values

Minor: Prioritize non-AI extinction risk less highly

Influence how AI benefits/harms alien civilizations’ values

Possibly: Weigh suffering-focused values somewhat higher if they are more universal

Care more about misaligned AIs’ universe-wide values

Minor: Prioritize AI takeover risk less highly

Positively influence misaligned AI

More

Appendices

What values do you need for this to be relevant?

More details on the split between humans, evolved species, and misaligned AI

Why distinguish humans from aliens?

Why distinguish evolved aliens from misaligned AIs?

Acknowledgments