The statement we will debate during the forum-choice debate week will be based on: "By default, the world where AI goes well for humans will also go well for other sentient beings".
The debate week will be held during the week of March 23–29.
This post is primarily to finalise the topic we will be discussing, and to ask for input from the Forum on the exact phrasing. However, I’m sure a bunch of readers who were following the vote for the forum-choice debate week will be wondering — why did the third-place entry win? I’ll address that first.
Why the third-place entry won
In the voting post, in a footnote, I wrote: “Again, generic caveat here: there are legal and practical reasons that CEA might want to veto a topic. I personally think this is very unlikely, but we do reserve the right to veto if necessary. In the case of a veto, we'd move to the second highest karma topic.”
Unfortunately, what I thought was unlikely has indeed happened[1]. Even more unfortunately, we have vetoed the first two topics. As with most internal wranglings, there isn't much I can reasonably share here, so I probably can't answer further questions.
I’d like to make it very clear that Sarah and I think it is important that the EA movement has these conversations (and that we both favour transparency where possible). As of now, all I can say is that:
- We may be able to run one of the top two voted debates later in the year. I’ll let you know if this becomes possible.
- Our policy on politics on the EA Forum remains unchanged.
Thanks (prematurely) for understanding.
Refining the statement
As always, the first pass at a debate statement is too ambiguous. Since this is the forum-choice debate week, I'd love to get your input on how to rephrase the statement.
The current phrasing is: "By default, the world where AI goes well for humans will also go well for other sentient beings"
Ambiguities:
- "by default" - this must mean something like "in the nearest possible worlds where AI goes well for humans..."
- "the world where AI goes well" - what does it mean for "AI" to go well? The release of the first AGI? Or the integration of less powerful AI systems into society?
- "goes well for humans" - do we want to point to a specific outcome here (i.e. humans in control) or does "goes well" just mean "does not cause human extinction?"
- "sentient beings" - we could discuss non-human animals and/or digital minds
A few refined versions
"On the margin, it is better for animals to work on the transition to AGI going well, than directly working on AI for animal welfare"
@Kevin Xia 🔸 originally raised the debate statement here. By way of explanation he wrote:
It could probably be much more precise and nuanced, but specifically, I would want to assess whether "trying to make AI go well for all sentient beings" is marginally better supported through directly related work (e.g., AIxAnimals work) or through conventional AI safety measures - the latter of which would be supported if, e.g., making AI go well for humans will inevitably or is necessary to make sure that AI goes well for all. Although if it is necessary, it would depend further on how likely AI will go well for humans and such; but I think a general assessment of AI futures that go well for humans would be a great and useful starting point for me.
We could interpret this as a narrower version of our earlier debate week, where we discussed "on the margin, it is better to work on reducing the chance of our extinction than increasing the value of the future where we survive".
This phrasing still has some key ambiguities. Does "AI going well" mean aligned AGI? If it does, does it matter if the AGI is aligned but humans are disempowered, or do we also need controllable aligned AGI? And what counts as direct work on AI for animal welfare?
"Without extra animal-focused work, even aligned superintelligence would be bad for non-human animals"
In this one I've opted for 'aligned' and 'superintelligence' to get at the idea of a future with a very powerful AI over which humans have limited or no control, but which acts in accordance with our values. As systems marketed as "AGI" get closer and closer, using "superintelligence" feels like it gets the vibe across a little better (though it is imprecise).
'Animal-focused work' still feels uncomfortably vague...
"AGI which doesn't cause human extinction or disempowerment will value animal welfare"
This focuses more on one of the cruxes that Kevin raised: whether aligning AI would inevitably make it good for animals. However, it somewhat marginalises the question of AIxAnimals work.
Leave your feedback below
I'd like to announce a finalised statement by the end of the week. Especially useful feedback would:
- Argue for a particular phrasing.
- Point out an ambiguous term which could derail the debate.
[1] Though, as all good Bayesians know, that doesn't mean I was wrong!
