
Introduction: some contemporary AI governance context

It’s a confusing time in AI governance. Several countries’ governments recently changed hands. DeepSeek and other technical developments have called into question certain assumptions about the strategic landscape. Political discourse has swung dramatically away from catastrophic risk and toward framings of innovation and national competitiveness.

Meanwhile, the new governments have issued statements of policy, and AI companies (mostly) continue to publish or update their risk evaluation and mitigation approaches. Interpreting these words and actions has become an important art for AI governance practitioners: does the phrase “human flourishing” in the new executive order signal concern about superintelligence, or just that we should focus on AI’s economic and medical potential and not “hand-wring” about safety? How seriously should we take the many references to safety in the UK’s AI Opportunities Action Plan, given the unreserved AI optimism in the announcement? Does Meta’s emphasis on “unique” risks take into account whether a model’s weights are openly released? The answers matter not only for predicting future actions but also for influencing them: it’s useful to know an institution’s relative appetite for different kinds of suggestions, e.g. more export controls versus maintaining Commerce’s reporting requirements.

So, many people who work in AI governance spend a lot of time trying to read between the lines of these public statements, talking to their contacts at these institutions, and comparing their assessments of the evidence with others’. This means they can wind up with a lot of non-public information — and often, they also have context that casual observers (or people doing heads-down technical work in the Bay) might lack.

All of that is to say: if you hear someone express a view about how an institution is thinking about AI (or about many other topics), you might be tempted to update your own view towards theirs, especially if they have expertise or non-public information. And, of course, this is sometimes the correct response.

But this post argues that you should take these claims with a grain of salt. The rest of the post shifts to a much higher level of abstraction than the above, partly because I don’t want to “put anyone on blast,” and partly because this is a general phenomenon. Note that many of the reasons below are generic reasons to doubt claims you can’t independently verify, but some are specific to powerful institutions.

Biases towards claiming agreement with one’s own beliefs

Let’s say you hear Alice say that a powerful institution (like a political party, important company, government, etc.) agrees with her position on a controversial topic more than you might think.

If you have reason to think that Alice knows more about that institution than you do, or just has some information that you don’t have, you might be inclined to believe Alice and update your views accordingly: maybe that institution is actually more sympathetic to Alice’s views than you realized!

This might be true, of course. But I’d like to point out a few reasons to be skeptical of this claim.

  • Maybe Alice is basing her claim on interactions with people in the institution whose views aren’t publicly known. But this evidence is liable to be biased:
    • The people Alice knows within the institution probably agree with Alice more than the average person in that institution. After all, they are somehow connected to Alice. This means they’re more likely than the average person in that institution to share some characteristic with Alice, like both having lived in the Bay Area, or both having worked in the national security space. Or maybe it’s even just that Alice has convinced them individually.
    • Those people are also incentivized to convince Alice that they agree with her more than they do. Giving Alice the impression that they’re on her side probably makes Alice more likely to take actions that help them rather than obstruct them, or gives her the impression that they’ve done her a meaningful favor (“I passed along that idea you mentioned, and I think there’s buy-in for it – we’ll see!”).
  • Maybe Alice is making this claim strategically, e.g. because expressing support for the institution makes them more likely to listen to her, and/or she’s trying to “incept” the idea that they hold this view.
  • Maybe Alice would be better off if it were true, and even though Alice doesn’t knowingly lie, the selfish parts of her brain can convince the reasoning parts of her brain that convenient things are true.
    • For example, maybe Alice’s work is at least partly aimed at influencing the institution, and Alice would be better able to recruit and fundraise to the extent that people believe that influencing this institution is tractable.
    • Or perhaps Alice is on record predicting that the institution will agree with her, and it would make her look prescient if people believe it does (or embarrass her if not). 

Weaker biases towards claiming disagreement with one’s own beliefs

Now imagine that you hear Bob, who agrees with Alice’s view, make the opposite claim: actually, the institution disagrees with us!

Not all of the factors above apply – and I think, on net, these effects are stronger for those claiming agreement than for those claiming disagreement, roughly in proportion to how powerful the institution is. But some of them still apply, at least in modified form:

  • Symmetrically, maybe Bob publicly predicted that the institution wouldn’t agree with him and stands to gain or lose status depending on whether people believe it does.
  • Maybe Bob isn’t trying to influence that institution – call it Institution A – but rather is trying to influence some opposing institution called Institution B.
    • By saying Institution A disagrees with him, he could be demonstrating his opposition to Institution A and thus his affiliation with Institution B, and trying to negatively polarize Institution B towards his view.
    • I think this effect is probably especially weak, but if Bob can make it look intractable to influence Institution A, this makes his own efforts to influence Institution B more appealing to employers and funders.

Conclusion

I wouldn’t totally dismiss either claim, especially if Alice/Bob do have some private information, even if I knew that they had many of these biases. Claims like theirs are a valuable source of evidence. But I would take both claims (especially Alice’s) with a grain of salt, and if the strength of these claims were relevant for an important decision, I’d consider whether and to what extent these biases might be at play. This means giving a bit more weight to my own prior views of the institution and my own interpretations of the evidence, albeit only to the extent that I think biases like the above apply less to me than to the source of the claim.
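To make the “grain of salt” concrete, here is a minimal toy sketch (my own illustration, not a model from this post) of one way to formalize it: treat Alice’s report as Bayesian evidence whose diagnosticity you shrink in proportion to how biased you suspect her to be. All of the numbers and the particular discounting rule are illustrative assumptions.

```python
# Toy sketch: discount a possibly-biased report when updating on whether an
# institution "agrees." Numbers and the discounting rule are illustrative only.
import math

def update(prior_prob: float, likelihood_ratio: float, bias: float) -> float:
    """Posterior probability that the institution agrees.

    prior_prob: your own prior that the institution agrees.
    likelihood_ratio: how much more likely Alice's report is if the institution
        agrees than if it doesn't, assuming she were unbiased.
    bias: 0 = no suspected bias (full update), 1 = fully biased (ignore report).
    """
    # Shrink the log-likelihood ratio toward 0 in proportion to suspected bias.
    discounted_llr = (1 - bias) * math.log(likelihood_ratio)
    prior_odds = prior_prob / (1 - prior_prob)
    posterior_odds = prior_odds * math.exp(discounted_llr)
    return posterior_odds / (1 + posterior_odds)

print(update(0.2, 4.0, 0.0))  # naive update on Alice's claim -> 0.50
print(update(0.2, 4.0, 0.6))  # same claim, heavily discounted -> ~0.30
```

The point of the sketch is just the qualitative shape: the more of the biases above you think apply, the closer your posterior should stay to your own prior view of the institution.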
