
The statement we will be debating for the forum-choice debate week will be based on: "By default, the world where AI goes well for humans will also go well for other sentient beings".

The debate week will be held during the week of March 23–29.

This post is primarily to finalise the topic we will be discussing, and to ask for input from the Forum on the exact phrasing. However, I’m sure a bunch of readers who were following the vote for the forum-choice debate week will be wondering — why did the third-place entry win? I’ll address that first.

Why the third place entry won

In the voting post, in a footnote, I wrote: “Again, generic caveat here: there are legal and practical reasons that CEA might want to veto a topic. I personally think this is very unlikely, but we do reserve the right to veto if necessary. In the case of a veto, we'd move to the second highest karma topic.”

Unfortunately, what I thought was unlikely has indeed happened.[1] Extra unfortunately, we have vetoed the first two topics. As with most internal wranglings, there isn’t much that it’s rational for me to share here, so I probably can’t answer your further questions. 

I’d like to make it very clear that Sarah and I think it is important that the EA movement has these conversations (and that we both favour transparency where possible). As of now, all I can say is that:

  1. We may be able to run one of the top two voted debates later in the year. I’ll let you know if this becomes possible. 
  2. Our policy on politics on the EA Forum remains this.

Thanks (prematurely) for understanding.

Refining the statement

As always, the first pass at a debate statement is too ambiguous. Since this is a Forum-choice debate week, I'd love to get your input on how to rephrase the statement. 

The current phrasing is: "By default, the world where AI goes well for humans will also go well for other sentient beings"

Ambiguities:

  • "by default" - this must mean something like "in the nearest possible worlds where AI goes well for humans..."
  • "the world where AI goes well" - what does it mean for "AI" to go well? The release of the first AGI? Or the integration of less powerful AI systems into society?
  • "goes well for humans" - do we want to point to a specific outcome here (i.e. humans in control) or does "goes well" just mean "does not cause human extinction?"
  • "sentient beings" - we could discuss non-human animals and/or digital minds

A few refined versions

"On the margin, it is better for animals to work on the transition to AGI going well, than directly working on AI for animal welfare"

 @Kevin Xia 🔸 originally raised the debate statement here. By way of explanation, he wrote:

It could probably be much more precise and nuanced, but specifically, I would want to assess whether "trying to make AI go well for all sentient beings" is marginally better supported through directly related work (e.g., AIxAnimals work) or through conventional AI safety measures - the latter of which would be supported if, e.g., making AI go well for humans will inevitably or is necessary to make sure that AI goes well for all. Although if it is necessary, it would depend further on how likely AI will go well for humans and such; but I think a general assessment of AI futures that go well for humans would be a great and useful starting point for me. 

We could interpret this as a narrower version of our earlier debate week, where we discussed "on the margin, it is better to work on reducing the chance of our extinction than increasing the value of the future where we survive".  

This phrasing still has some key ambiguities. Does "AI going well" mean aligned AGI? If it does, does it matter if the AGI is aligned but humans are disempowered, or do we also need controllable aligned AGI? And what counts as direct work on AI for animal welfare? 

"Without extra animal-focused work, even aligned superintelligence would be bad for non-human animals"

In this one I've opted for 'aligned' and 'superintelligence' to get at the idea of a future with a very powerful AI over which humans have limited or no control, but which acts in accordance with our values. As systems marketed as "AGI" get closer and closer, using "superintelligence" feels like it gets the vibe across a little better (though it is imprecise). 

'Animal-focused work' still feels uncomfortably vague... 

"AGI which doesn't cause human extinction or disempowerment will value animal welfare"

This focuses more on one of the cruxes that Kevin raised - whether aligned AI would inevitably be good for animals. However, it somewhat marginalises the question about AIxAnimals work. 

Leave your feedback below

I'd like to announce a finalised statement by the end of the week. Especially useful feedback would:

  • Argue for a particular phrasing.
  • Point out an ambiguous term which could derail the debate. 
  1. ^

    Though, as all good Bayesians know, that doesn’t mean I was wrong!


Comments

I'm really looking forward to the debate on this topic! 

Some thoughts:

  1. I like that debate topics aren't overly operationalized. Allowing people to take slightly different interpretations means they can focus on the variations which seem most important to them. However, this can come at the expense of understanding each other crisply, and it complicates interpreting the (quantified) agreement scale.
    1. I'm not sure what were the main takeaways from previous debates, but I felt that I cared more about hearing interesting new takes and people's reactions to them than I cared about assessing the overall community opinion.
  2. "By default" - One possible ambiguity here is whether this means with >50% probability or with >99.9% probability.
  3. "The world where" -> "The worlds where". Also, perhaps this notion of conceiving of possible futures as possible worlds is a bit too heavy on EA/rationalist-lingo.
  4. "AI goes well for humans" - I broadly like this. I would be interested in people's opinions under both neartermist and longtermist worldviews, and under maxipok or flourishing futures.
  5. "Sentient beings" - Here I think the discussion should be contained to nonhuman animals because the other case seemed to be handled in the previous AI welfare debate.
  6. I don't think that the statement of the debate should be about "what we should do" but rather about the worldview directly. It's a bit hard for me to pinpoint exactly why I think so and I may regret this.
    1. I think that an operationalization which is too close to people's actual decisions may cause more people to defend their existing views or to take a stance based on what's more salient. I'm not sure why exactly, but framings like "Without extra animal-focused work, even aligned superintelligence would be bad for non-human animals" feel like they would generate more ideologically-oriented responses.
    2. This makes the question more complex with more moving parts.
  7. I think that the framing of "AGI which doesn't cause human extinction or disempowerment will value animal welfare" is quite good. Perhaps this should include CAIS or multipolar scenarios.