Being mindful of the incentives created by pressure campaigns

I've spent the past few months trying to think about the whys and hows of large-scale public pressure campaigns (especially those targeting companies — of the sort that have been successful in animal advocacy).

A high-level view of these campaigns is that they use public awareness and corporate reputation as a lever to adjust corporate incentives. But making sure that you are adjusting the right incentives is more challenging than it seems. Ironically, I think this is closely connected to specification gaming: it's often easy to accidentally incentivize companies to do more to look better, rather than doing more to be better.

For example, an AI-focused campaign calling out RSPs recently began running ads that single out AI labs for speaking openly about existential risk (quoting leaders acknowledging that things could go catastrophically wrong). I can see why this is a "juicy" lever — most of the public would be pretty astonished/outraged to learn some of the beliefs that are held by AI researchers. But I'm not sure if pulling this lever is really incentivizing the right thing.

As far as I can tell, AI leaders speaking openly about existential risk is good. It won't solve anything in and of itself, but it's a start — it encourages legislators and the public to take the issue seriously. In general, I think it's worth praising this when it happens. I think the same is true of implementing safety policies like RSPs, whether or not such policies are sufficient in and of themselves.

If these things are used as ammunition to try to squeeze out stronger concessions, it might just incentivize the company to stop doing the good-but-inadequate thing (i.e. CEOs are less inclined to speak about the dangers of their product when it will be used as a soundbite in a campaign, and labs are probably less inclined to release good-but-inadequate safety policies when doing so creates more public backlash than they were facing before releasing the policy). It also risks directing public and legislative scrutiny to actors who actually do things like speak openly about (or simply believe in) existential risks, as opposed to those who don't.

So, what do you do when companies are making progress, but not enough? I'm not sure, but it seems like a careful balance of carrots and sticks.

For example, animal welfare campaigns are full of press releases like this: Mercy for Animals "commends" Popeyes for making a commitment to broiler welfare reforms. Spoiler alert: it probably wasn't written by someone who thought that Popeyes had totally absolved itself of animal abuse with a single commitment. Rather, it served as a strategic signal to the company and to its competitors (basically, "If you lead relative to your competitors on animal welfare, we'll give you carrots. If you don't, we'll give you the stick."). If they had reacted by demanding more (which in my heart I may feel is appropriate), it would have sent a very different message: "We'll punish you even if you make progress." Even when demanding more is justified [1], the incentives it creates can leave everybody worse off.

There are lots of other ways that I think campaigns can warp incentives in the wrong ways, but this one feels topical.

  1. Popeyes probably still does, in fact, have animal abuse in its supply chain ↩︎

So I'm sympathetic to this perspective, but I want to add a different perspective on this point:

an AI-focused campaign calling out RSPs recently began running ads that single out AI labs for speaking openly about existential risk (quoting leaders acknowledging that things could go catastrophically wrong). I can see why this is a "juicy" lever — most of the public would be pretty astonished/outraged to learn some of the beliefs that are held by AI researchers.

I don't think they view this as a 'juicy' lever. It might just be the right lever (from their PoV).

If some of these leaders/labs think that there is a non-negligible chance that AGI could pose an existential risk in the near term (let's say 10%+ within ~10-20 years), then I think 'letting the public know' has very strong normative and pragmatic support. The astonishment and outrage would rightfully come from the instinctive response of 'wait, if you believe this, then why the hell are you working on it at all?'

So I guess it's not just the beliefs the public would find astonishing, but the seeming dissonance between beliefs and actions - and I think that's a fair response.

I think just letting the public know about AI lab leaders’ p(doom)s makes sense. In fact, I think most AI researchers are on board with that too (they wouldn’t say these things on podcasts or live on stage if not).

It seems to me this campaign isn’t just meant to raise awareness of X-risk though — it’s meant to punish a particular AI lab for releasing what they see as an inadequate safety policy, and to generate public/legislative opposition to that policy.

I think the public should know about X-risk, but I worry that using soundbites about it to generate reputational harm and counter labs’ safety agendas might make them less likely to speak about it in the future. It’s kind of like a repeated game: if the behavior you want in the coming years is safety-oriented, you should cooperate when your opponent exhibits that behavior. Only when they don’t should you defect.

So, for clarity, I'm much closer to your position than to theirs, and very much agree with your concerns.

But I think, from their perspective, the major AI labs are already defecting by scaling up models that are inherently unsafe despite knowing that this has a significant chance of wiping out humanity (this is my understanding of their view, not my own opinion[1]).

I'm going to write a response to Connor's main post and link to it here, which might help explain where their perspective is coming from (based on my own interpretation) [update: my comment is here, which is my attempt to communicate what the position is, or at least where their scepticism of RSPs has come from]

  1. ^ fwiw my opinion is here

I would be most interested to hear what seasoned animal rights campaigners think about this, but I'm not sure this take matches the way social norms have changed in the past.

First, I think it's useful to turn to what evidence we have. Animal rights and climate change campaigners have shown that, somewhat counterintuitively, more extreme, belligerent activism moves the Overton window and actually makes things easier for moderate campaigners. There is a post on the forum and a talk at EA Nordic about this that I can't find right now.

"So, what do you do when companies are making progress, but not enough? I'm not sure, but it seems like a careful balance of carrots and sticks."

On the basis of what evidence we have, I would lean more towards piling on both more sticks and more carrots. I think the risk of AI lab heads going to ground publicly is close to zero. They don't want to lose the control of the discourse they have right now. If one goes to ground, others will take over the public sphere anyway.

One slightly more extreme organisation can call out the hypocrisy of AI leaders not talking publicly about their p(doom), while another org can praise them for the speaking out they are doing. Sticks and carrots.

I'm not sure there can ever be "too much pressure" that would cause negative outcomes, but I could be wrong. It might help if you can point to a historical example. I think small victories can be followed by even more pressure.

Mercy for Animals would probably be ok with commending Popeyes one day for making progress, then haranguing them again the next day to do even better, but I could be wrong.

As a side note, I feel like we in the EA community might be at primary school level sometimes when discussing advocacy and activism. I would love to hear the take of some seasoned expert activists about where they think AI policy work and advocacy should go.

I think the lesson we can draw from climate and animal rights that you mention - the radical flank effect - shows that extreme actions concerning an issue in general might make incremental change more palatable to the public. But I don’t think it shows that extreme action attacking incremental change makes that particular incremental change more likely.

If I had to guess, the analogue to this in the animal activist world would be groups like PETA raising awareness about the “scam” that is cage-free. I don’t think there’s any reason to think this has increased the likelihood of cage-free reforms taking place — in fact, my experience from advocating for cage-free tells me that it just worsened social myths that the reform was meaningless despite evidence showing it reduced total hours spent suffering by nearly 50%.

So, I would like to see an activist ecosystem where there are different groups with different tactics - and some who maybe never offer carrots. But directing the stick to incremental improvements seems to have gone badly in past movements, and I wouldn’t want to see the same mistake made here.

Thanks Tyler, nice job explaining. I think I've changed my mind on the specific case of attacking a small positive incremental change. Like you, I struggle to see how that's helpful. Better to praise the incremental change (or say nothing), then push harder.

Have retracted my previous comment.

I'm heartened as well that you have had experience in animal campaigns.

Some exciting news from the animal welfare world: this morning, in a very ideologically-diverse 5-4 ruling, the US Supreme Court upheld California's Proposition 12, one of the strongest animal welfare laws in the world!

Consider Using a Reading Ruler!

Digital reading rulers are tools that create parallel lines across a page of text, usually tinted a certain color, which scroll along with the text as you read. They were originally designed as a tool to aid comprehension for dyslexic readers, based on what was once a very simple strategy: physically moving a ruler down a page as you read.

There is some recent evidence showing that reading rulers improve speed and comprehension in non-dyslexic readers, as well. Also, many reading disabilities are probably something of a spectrum disorder, and I suspect it’s possible to have minor challenges with reading that slightly limit speed/comprehension but don’t create enough of a problem to be noticed early in life or qualify for a diagnosis.

Because of this, I suggest most regular readers at least try using one and see what they think. I’ve had the surprising experience that reading has felt much easier to me while using one, so I plan to continue to use reading rulers for large books and articles in the foreseeable future.

There are browser extensions that can offer reading rulers for articles, and the Amazon Kindle app for iOS added reading rulers two years ago. I’d be curious to hear if anyone else has had a positive experience with them.

