I originally wrote this as a project for a BlueDot Impact AI Safety course in January 2024 and have revised it substantially since. While the governance landscape has developed encouragingly since then, I believe the gap identified here remains largely unaddressed.
I am moderately confident in the core claim about operating entities being neglected, drawing from personal experience implementing enterprise software as a tech consultant and advising educational institutions on AI policy. I have been in many meetings where large organisations procured tools with no framework beyond the vendor's marketing materials.
The examples I give are primarily from the Global North and the operating entity gap may look different in other regulatory contexts. However, given the AI accessibility gap between developing and high-income economies, I expect that having a values framework to draw from will become even more crucial as AI tools become more widely available globally.
I used AI tools (Claude) for editorial feedback, structural suggestions, fact-checking, and citation verification during the revision process. The core arguments, examples, and analysis are my own. Views expressed here do not represent those of any current or former employer.
Most AI governance discourse focuses on two leverage points: the labs that build frontier AI systems, and the regulators who constrain them. But there is a third category of stakeholder: operating entities, the organisations that actually deploy AI in hospitals, schools, courts, and government agencies, and they are systematically underserved by existing safety frameworks. This article argues that developing practical, values-based adoption guidance for these organisations is one of the most tractable interventions available to the EA community.
The challenge of coordinating human stakeholders around shared AI values has been recognised in the AI safety literature, notably in Critch and Krueger's work on multi-principal alignment (2020) and the cooperative AI research agenda (Dafoe et al., 2021). The Centre for the Governance of AI (GovAI) has produced substantial work on AI governance institutions, though its focus has been primarily on frontier AI policy and international coordination rather than deployment-level guidance. International frameworks like the OECD AI Principles and the EU AI Act have also adopted multi-stakeholder models, but a significant gap remains. Provisions such as the deployer obligations in Articles 26-27 of the AI Act tell deployers which risk categories exist and what documentation is required, but they don't help a school district reason about whether an AI-assisted grading tool aligns with its educational mission.
There's a specific gap between "legally compliant" and "values-aligned" that existing institutions are not filling, and that EA's analytical toolkit is well-suited to address.
Much of the EA community's work on AI safety focuses on technical alignment, ensuring that AI systems pursue goals that reflect human values. But there is a prior coordination problem that receives comparatively less attention: human stakeholders themselves are not aligned on what values AI should reflect in the first place. AI labs develop under commercial pressures; regulators respond to political incentives; consumers express preferences through purchasing behaviour; governments articulate interests through policy. When these groups pull in different directions, the resulting AI landscape reflects whichever stakeholder group exerts the most pressure in the moment. Not necessarily the one with the best values.
The Alignment Problem beneath the Alignment Problem
Misalignment between human values and artificial intelligence poses serious risks if not addressed early in development. One crucial and underappreciated form of this misalignment stems not from AI systems themselves, but from conflicting interests between the stakeholders who influence AI's development. When the different groups that fund, build, deploy, regulate, and are affected by AI lack shared principles, it becomes difficult to steer the technology toward outcomes that benefit society broadly[1].
How can we possibly align AI if every group disagrees on what we should be aligning it to? This multi-principal alignment problem (Critch & Krueger, 2020) means that even technically well-aligned AI systems can produce harmful outcomes if the human institutions deploying them are pulling in different directions. The cooperative AI research agenda (Dafoe et al., 2021) and international frameworks like the OECD AI Principles have recognised this. But recognition is not resolution. In practice, one category of stakeholder, the operating entities, remains largely without the tools to participate meaningfully in this coordination. The rest of this article explains why that gap matters and what the EA community can do about it.
A stakeholder map
Understanding these stakeholder groups is essential for identifying where alignment efforts can have the greatest impact. Three broad categories of stakeholders shape AI development:
- Organisations: include both developing entities (those that invest in, build, and deploy AI products, e.g., OpenAI, Anthropic, Google, Microsoft) and operating entities (organisations that consume AI products, whether public or private sector). This category encompasses employees, shareholders, and management across both types, and their concerns centre on regulatory compliance, reputation, and profitability.
- Regulators and norm-setting entities: include governments, courts, and industry associations that legislate, regulate, enforce, and set standards for AI development and deployment. They are concerned with security, economic stability, and the interests of their constituents.
- The general public: includes end users of private and public AI services, civil society advocates, academics, and affected communities. Their primary concerns are privacy, data control, equity, and freedom from harm.
How pressure propagates
Developing entities, like OpenAI, Microsoft, Google, and Anthropic, currently exercise the greatest directive power over the values and goals instilled in AI. But they operate within the preferences, norms, and frameworks of the other stakeholders. Aligning those groups around positive AI principles therefore creates structural pressure on developing entities to embed beneficial traits in their models.
The regulatory lever is the most direct. An AI lab must operate within the laws and regulations of any jurisdiction it wishes to function in. The EU's General Data Protection Regulation, for instance, carries fines of up to €20 million or 4% of global annual revenue for violations.[2] The AI Act, now finalised, creates a risk-tiered regulatory regime that will constrain high-risk AI deployments across one of the world's largest markets.[3] These are not abstract incentives; they reshape development priorities.
The public pressure lever operates through political accountability. Regulators respond to their constituents, and collective public concern translates into political demand for government action. A telling example: in March 2023, Italy's data protection authority temporarily banned ChatGPT from operating in the country, citing GDPR violations related to data collection practices and the absence of age verification, a decision driven in part by public complaints filed with the regulator. OpenAI responded by implementing specific changes, including age verification gates, improved privacy disclosures, and an opt-out mechanism for training data, before the ban was lifted approximately one month later. The episode demonstrated that even a frontier AI company operating at global scale can be compelled to alter its deployment practices when public concern is channelled through regulatory institutions.[4]
This pattern is not new. In 2020, the UK government's use of an algorithm to predict A-level exam grades, which systematically downgraded students from lower-socioeconomic backgrounds, was reversed within days following public outcry.[5] While that case involved a statistical algorithm rather than generative AI, both episodes illustrate the same mechanism: organised public pressure, when it finds institutional channels, can override commercial and governmental inertia on AI deployment decisions.
At a broader scale, in 2023 a coalition of AI researchers, technologists, and public advocates signed the Future of Life Institute's open letter calling for a pause on giant AI experiments.[6] The letter generated substantial mainstream media coverage and contributed to the formation of government advisory committees and internal reviews across multiple jurisdictions. The Australian Government's interim response to its safe and responsible AI consultation, released in early 2024, acknowledged this public pressure directly, with Industry Minister Ed Husic stating that the government had "heard loud and clear that Australians want stronger guardrails to manage higher-risk AI".[7]
The market lever operates through operating entities as consumers of AI products. If organisations prefer to deploy AI systems with demonstrable privacy protections, developing entities face commercial incentives to build those features. A 2023 Pew Research Center survey found that 52% of Americans are more concerned than excited about the increased use of AI in daily life, with data privacy ranking among the top concerns. Among those who had heard about AI, 70% reported little to no trust in companies to make responsible decisions about AI use in their products.[8] A McKinsey Global Survey of over 1,300 business leaders and 3,000 consumers across 27 countries found that organisations positioned as leaders in digital trust are 1.6 times more likely than the global average to see revenue and EBIT growth of at least 10% annually, creating a direct commercial incentive for responsible AI practices.[9]
The public: an underestimated force
The general public is often treated as a passive recipient of AI governance decisions rather than an active participant in shaping them. This underestimates both their capacity and their track record of influence.
Beyond the UK exam algorithm case, consider the trajectory of facial recognition technology. Public and civil society backlash against police use of facial recognition systems, particularly following documented cases of misidentification disproportionately affecting people of colour, led multiple major US cities to introduce moratoriums or outright bans between 2019 and 2021. San Francisco, Boston, and Portland all enacted restrictions following organised public pressure and media scrutiny, without any federal mandate requiring them to do so.[10] While some of these restrictions have since been narrowed or revisited, the episode demonstrates that organised public pressure can override commercial and law enforcement interests in AI deployment.
The challenge, however, is that existential AI risk remains far outside the public's immediate frame of concern. The general public engages most readily with proximate, plausible harms, like job displacement, privacy violations, and discriminatory outputs, rather than long-horizon catastrophic risk. This is not a failure of rationality; it reflects how attention and political mobilisation actually work.
Kamala Harris's November 2023 address in London on the future of AI illustrates this dynamic. In a speech that had the potential to foreground existential risks, Harris focused instead on bias, disinformation, and economic displacement.[11] This was not a mistake; it was a strategic calibration to the concerns her constituents actually hold. The subsequent shift in US AI policy under the current administration underscores how dependent political-level AI governance is on electoral outcomes, and strengthens the case for building durable values frameworks at the deployment level that persist across political cycles. For AI safety advocates seeking to build public momentum, the implication is that messaging must bridge from near-term, tangible harms toward longer-horizon risks, building credibility and trust incrementally rather than leading with worst-case scenarios.
The Australian Government's stakeholder engagement workshops demonstrated a constructive approach to surfacing public values around AI adoption. Feedback was systematically compiled into a briefing that informed the government's principles for responsible AI, a genuine instance of public input shaping policy direction.[12] The subsequent response acknowledged community concerns as a driver of the government's risk-based approach.[7]
The neglected stakeholder: operating entities
If the general public is an underestimated force, then operating entities are the most systematically neglected stakeholder category in the AI safety governance landscape.
Developing entities face growing regulatory pressure and investor scrutiny. The general public has civil society advocates, media attention, and electoral accountability channels. But operating entities, the organisations that actually deploy AI products in hospitals, schools, government agencies, and businesses, largely lack formal, values-based guidance for how to be responsible AI consumers.
The Australian Government's guidance for public servants on using generative AI tools is instructive as an example of both the progress made and its limits. The Digital Transformation Agency's staff guidance, the primary document governing how thousands of public servants interact with AI, essentially advises not to enter sensitive or security-classified information into public generative AI tools.[13]
Since this guidance was first issued, the Australian Government has taken further steps, including consulting on mandatory AI guardrails for high-risk settings and releasing updated policy guidance through the Department of Industry, Science and Resources. These developments are welcome, but they remain primarily compliance-oriented. They tell operating entities what not to do with AI (don't input classified data, don't deploy in prohibited categories) rather than equipping them with a positive framework for how to deploy AI in ways that serve their institutional values and obligations. The gap between 'compliance floor' and 'values-informed deployment' is precisely where EA-style reasoning could add the most value.
The Australian case is illustrative, but the pattern is widespread. Operating entities across sectors are largely self-regulating their AI adoption, with whatever frameworks happen to be commercially available. As primary consumers of AI products, their collective procurement choices send powerful signals to developing entities about what features and values to prioritise. An operating entity that demands transparency, explainability, and human oversight in its AI procurement is exercising real market power. Most operating entities are not exercising that power deliberately.
So where should EA organisations focus?
Standards bodies produce compliance frameworks: ISO/IEC 42001:2023 establishes requirements for AI management systems, and NIST's AI Risk Management Framework (2023) provides a structured approach to AI risk. But these are necessarily generic and consensus-driven, establishing minimum requirements rather than best practice. While management consultancies do advise on AI procurement, their incentives align with vendor relationships rather than public benefit. GovAI and similar research organisations produce excellent analytical work on governance, but their focus is primarily on frontier AI policy rather than deployment-level guidance.
The EA community occupies a distinctive niche: it combines analytical rigour with a normative commitment to beneficial outcomes, and it has a track record of producing practical resources (e.g., 80,000 Hours career guides, GiveWell cost-effectiveness analyses) that translate complex reasoning into actionable frameworks. This is exactly what operating entities need.
This is tractable for several reasons. Operating entities are actively seeking guidance; the gap is not one of demand but of supply. The frameworks required are not primarily technical; they are ethical and procedural, drawing on exactly the reasoning tools EA practitioners use routinely. And unlike technical alignment work (where progress requires deep ML expertise) or frontier AI regulation (where influence requires political access), operating entity guidance can be developed and disseminated by organisations of modest size and resources.
Guidance for AI deployers is not entirely absent. A growing ecosystem of commercial content from AI governance platforms, enterprise software vendors, and consultancies offers deployers advice on regulatory compliance, risk management, and deployment best practices. But this content is overwhelmingly either compliance-oriented (focused on meeting regulatory requirements), generically principled (listing values like 'fairness' and 'transparency' without operationalising them for specific contexts), or commercially motivated (framed around business risk rather than obligations to affected communities). What's missing is not content; it's content that helps deployers ask values-based questions.
To be concrete about what 'values-based' means in this context: a compliance framework asks whether a deployment meets legal requirements. A quality assurance framework asks whether it works reliably. A values-based framework asks the prior question: should we deploy this system at all, given our institutional mission and obligations to the people we serve? It asks who benefits, who bears the risk, whether the deployment concentrates or distributes power, and whether the affected communities had meaningful input into the decision. These are the questions that EA's analytical tradition, with its emphasis on counterfactual impact, scope sensitivity, and impartial welfare, is distinctively equipped to structure.
Consider a mid-sized public hospital system evaluating whether to adopt an AI-assisted triage tool. The hospital's procurement team can assess cost, integration requirements, and vendor reliability. What they typically lack is a structured framework for asking: Does this tool's training data reflect our patient population? What happens when the tool's recommendation conflicts with clinical judgment, who has override authority, and is that documented? Has the vendor provided sufficient transparency about the model's decision-making process for us to meet our duty of care to patients? What recourse do patients have if the tool contributes to a misdiagnosis?
These are not technical AI safety questions; they are ethical and procedural questions that EA-style reasoning is well-equipped to structure. A sector-specific AI procurement checklist, developed with input from both AI safety researchers and healthcare practitioners, could make these questions routine rather than exceptional. I'm not suggesting EA organisations produce these frameworks alone, but that they catalyse their development: by funding collaborations between AI safety researchers and domain practitioners, by producing template frameworks that sector experts can adapt, or by advocating for operating entity representation in AI governance forums. The EA community's role is as a connector and catalyst, not a sole author.
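To illustrate what making these questions routine could look like in practice, here is a minimal sketch of a values-based procurement checklist encoded as a reusable artefact rather than a prose document. Everything in it is hypothetical: the `ChecklistItem` and `ProcurementReview` structures, the value categories, and the triage-tool scenario are invented for illustration and do not refer to any existing framework or tool.

```python
from dataclasses import dataclass, field
from enum import Enum

# Hypothetical sketch of a values-based procurement checklist, using the
# hospital triage example from the text. All names and categories are illustrative.

class Answer(Enum):
    YES = "yes"
    NO = "no"
    UNKNOWN = "unknown"   # the vendor has not yet provided enough information

@dataclass
class ChecklistItem:
    category: str         # e.g. "representativeness", "oversight", "recourse"
    question: str
    answer: Answer = Answer.UNKNOWN
    evidence: str = ""    # where the answer is documented (contract clause, audit, policy)

@dataclass
class ProcurementReview:
    system_name: str
    deploying_mission: str
    items: list[ChecklistItem] = field(default_factory=list)

    def open_issues(self) -> list[ChecklistItem]:
        """Items that should block sign-off until they can be answered 'yes' with evidence."""
        return [item for item in self.items if item.answer is not Answer.YES]

# The four triage-tool questions from the text, encoded as checklist items.
review = ProcurementReview(
    system_name="AI-assisted triage tool (hypothetical vendor)",
    deploying_mission="Safe, equitable emergency care for the patients we serve",
    items=[
        ChecklistItem("representativeness",
                      "Does the tool's training data reflect our patient population?"),
        ChecklistItem("oversight",
                      "Is clinical override authority defined and documented for cases where "
                      "the tool's recommendation conflicts with clinical judgment?"),
        ChecklistItem("transparency",
                      "Has the vendor disclosed enough about the model's decision-making "
                      "for us to meet our duty of care to patients?"),
        ChecklistItem("recourse",
                      "Do patients have a defined remedy if the tool contributes to a misdiagnosis?"),
    ],
)

for item in review.open_issues():
    print(f"[{item.category}] unresolved: {item.question}")
```

The design choice the sketch is meant to convey is simple: each question is tied to a named value category, requires documented evidence, and blocks sign-off by default until it is answered. That is one way a values framework becomes a routine procurement gate rather than an aspiration.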
While this is not a direct response to the existential risks posed by AI, operating entity alignment creates ecosystem pressure on developers. If thousands of deployers demand transparency, interpretability, and human oversight as procurement conditions, this shifts commercial incentives for frontier labs in safety-positive directions. The importance is high: operating entities mediate AI's real-world impact on millions. The neglectedness is high: no major organisation is producing values-based deployment guidance. The tractability is high: the frameworks required are ethical and procedural rather than deeply technical. This is an indirect but real pathway to influencing frontier AI development.
Achieving broad societal alignment on AI safety requires coordinating across all three stakeholder categories. Technical alignment research addresses the model. Regulatory advocacy addresses the governmental layer. The operating entity gap, organisations deploying AI without adequate values frameworks, is the most tractable and most neglected piece of the puzzle, and the EA community is well-positioned to fill it.
I'm currently exploring how to put this analysis into practice. If you're working on AI deployment guidance for operating entities, or know of existing efforts I should be aware of, I'd welcome the connection. Reach out in the comments or via direct message.
References
- ^ Fatima, N. & Bhattacherjee, A. (2021). Stakeholder theory and IT governance. Government Information Quarterly, 38(3). https://www.sciencedirect.com/science/article/pii/S0740624X21000885
- ^ European Parliament. (2023). GDPR fines and penalties.
- ^ European Parliament and Council of the European Union. (2024). Regulation (EU) 2024/1689 laying down harmonised rules on artificial intelligence (Artificial Intelligence Act). Official Journal of the European Union, L series, 12 July 2024.
- ^ Garante per la protezione dei dati personali. (2023, March 30). Artificial intelligence: stop to ChatGPT until it respects privacy. https://www.garanteprivacy.it/web/guest/home/docweb/-/docweb-display/docweb/9870847 For coverage of OpenAI's remedial measures and the subsequent lifting of the ban, see: Mukherjee, S. & Ferroni, A. (2023, April 28). ChatGPT is back in Italy after OpenAI addresses data privacy concerns. Reuters.
- ^ Brookings Institution. (2020). The tensions between explainable AI and good public policy. https://www.brookings.edu/articles/the-tensions-between-explainable-ai-and-good-public-policy/
- ^ Future of Life Institute. (2023). Pause Giant AI Experiments: An Open Letter. https://futureoflife.org/open-letter/pause-giant-ai-experiments/
- ^ ABC News. (2024, January 17). Government response to AI safety sets up risk-based approach.
- ^ Pew Research Center. (2023, August 28). Growing public concern about the role of artificial intelligence in daily life.
- ^ McKinsey & Company. (2022, September). Why digital trust truly matters. https://www.mckinsey.com/capabilities/quantumblack/our-insights/why-digital-trust-truly-matters
- ^ San Francisco Ordinance No. 190110 (May 2019); Boston Ordinance (June 2020); Portland Ordinance No. 190114 (effective January 2021). See also ACLU's overview:
- ^ White House. (2023, November 1). Remarks by Vice President Harris on the future of artificial intelligence, London.
- ^ Department of the Prime Minister and Cabinet. (2023). Summary of insights: Stakeholder engagement workshops.
- ^ Digital Transformation Agency. (2024). Staff guidance on public generative AI. https://www.digital.gov.au/policy/ai/staff-guidance-public-generative-ai
