Independent AGI Safety Companies: A Proposal for Government-Funded Guardrail Development
Abstract
Artificial General Intelligence (AGI) presents both unprecedented opportunities and profound risks. While global private investment into AI capabilities is already in the tens to hundreds of billions annually (e.g., $109.1B in the US alone in 2024), safety research remains underfunded, with public and philanthropic efforts in the tens to low hundreds of millions. Much of this safety work is also conducted within the same organizations racing to build AGI, creating conflicts of interest and increasing the risk of inadequate safeguards.
We propose the creation of government-funded, independent companies dedicated exclusively to AGI safety guardrails — including sandboxing, tripwire systems, interpretability tools, continuous monitoring, and certification frameworks. These entities would not engage in AGI development, ensuring neutrality and focus. This model is cost-effective, globally beneficial, and politically viable as a first step toward mitigating existential AGI risks.
1. Introduction
- AGI development is accelerating, with major tech labs and national governments investing tens to hundreds of billions of dollars annually into scaling models, data, and compute.
- Safety research, though growing, lags far behind. Current efforts are fragmented and often embedded inside capability-driven organizations, where commercial or geopolitical incentives can pressure teams to compromise on safeguards.
- Without robust, independent safety infrastructure, AGI poses risks ranging from widespread disruption to potential loss of human control.
- Proposal: Establish independent, safety-first companies, funded by governments, whose sole mission is to design, test, and certify guardrails for AGI systems.
2. Rationale
- Neutrality – By excluding AGI development, these companies avoid competitive incentives that push capability labs to cut safety corners. They would employ technical experts but operate under a strict mandate: no AGI building, only guardrails.
- Trust – Independence increases legitimacy, making safety oversight more acceptable to both the public and rival nations.
- Cost-effectiveness – Safety infrastructure requires only a fraction of AGI research budgets, yet could substantially reduce existential risk.
- Global alignment – Safety methods can be shared internationally as principles and designs, without disclosing sensitive AGI capability code.
3. Functions of Independent Safety Companies
- Sandbox & Containment Systems → secure environments to test AGI without real-world access.
- Tripwire Mechanisms → automatic shutdown/isolation if dangerous behaviors emerge.
- Interpretability Tools → dashboards and frameworks to make AGI reasoning auditable.
- Red-Team Testing → adversarial probing to reveal hidden failure modes.
- Certification Standards → independent safety approval required pre-deployment.
- Continuous Monitoring → post-deployment auditing, since AGI behavior may drift over time.
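The tripwire and monitoring functions above can be sketched in code. The following is a minimal, illustrative sketch only, not a proposed implementation: all names (`TripwireMonitor`, the `network_egress` probe, the observation fields) are hypothetical, and real tripwires would need tamper-resistant instrumentation far beyond a Python loop.

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class TripwireMonitor:
    """Flags a halt when any behavior probe exceeds its threshold.

    Each probe maps an observation of system behavior to a risk score;
    crossing the threshold records a trip and signals shutdown/isolation.
    """
    probes: dict[str, tuple[Callable[[dict], float], float]]
    tripped: list[str] = field(default_factory=list)

    def check(self, observation: dict) -> bool:
        """Return True if the system may continue running."""
        for name, (score_fn, threshold) in self.probes.items():
            score = score_fn(observation)
            if score > threshold:
                self.tripped.append(f"{name}: score {score:.2f} > {threshold}")
        return not self.tripped

# Hypothetical probe: any attempt at external network egress trips the wire.
monitor = TripwireMonitor(probes={
    "network_egress": (lambda obs: obs.get("egress_attempts", 0), 0.0),
})

assert monitor.check({"egress_attempts": 0})      # safe step, continue
assert not monitor.check({"egress_attempts": 3})  # tripwire fires, halt
```

The design point the sketch illustrates is that tripwires are defined and evaluated outside the monitored system, which is exactly why the proposal assigns them to independent companies rather than to the labs being monitored.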
4. Governance Model
- Funding: Primarily government-backed, with options for multinational co-funding (UN, NATO, G7).
- Independence: Operate outside corporate profit motives and insulated from day-to-day political cycles through an independence charter (similar to central banks).
- Accountability: Transparent reporting to oversight boards composed of scientists, ethicists, and civil society representatives.
- Verification: Authority to audit AGI projects, mandate safety fixes, and block unsafe deployments.
5. Cost Analysis
- Global AI private investment (2024): $252B worldwide; $109.1B in the US; $33.9B into generative AI specifically.
- Public and philanthropic safety funding: tens to low hundreds of millions annually.
- Estimated independent safety network cost: $1–3B annually to establish world-class safety companies across continents.
- Comparative anchors:
- IAEA Regular Budget (2025): €439.5M; Total resources: €757.5M.
- FDA FY2025 budget: ~$7.2B.
- FAA FY2025 request: $21.8B ($26.8B incl. Bipartisan Infrastructure Law).
Conclusion: An AGI safety ecosystem is not only affordable but inexpensive, roughly 0.4–1.2% of 2024 global private AI investment and a fraction of what existing safety regulators in other fields spend.
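The affordability claim reduces to simple arithmetic. A quick back-of-envelope check using only the figures cited in this section:

```python
# Back-of-envelope check using the figures cited in this section.
private_ai_investment_2024 = 252e9       # global private AI investment, USD
safety_network_cost_low = 1e9            # proposed annual range, USD
safety_network_cost_high = 3e9

low = safety_network_cost_low / private_ai_investment_2024
high = safety_network_cost_high / private_ai_investment_2024
print(f"Safety network cost: {low:.1%} to {high:.1%} of private AI investment")
# → Safety network cost: 0.4% to 1.2% of private AI investment
```

Even the high end of the proposed range is well below the FDA's ~$7.2B or the FAA's $21.8B annual budgets for regulating their respective industries.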
6. Benefits & Challenges
Benefits:
- Separation of safety from capability incentives.
- Higher global trust and accountability.
- Scalable, open safety commons available to all nations.
- Proactive rather than reactive risk management.
Challenges:
- Political mistrust between rival nations (e.g., the US and China) may impede information-sharing and joint verification.
- Overlap between safety research and capability work may require careful information-sharing policies.
- Talent scarcity — top AI researchers are heavily recruited by capability labs; strong incentives needed to attract them into safety roles.
- Enforcement — requires legal authority to prevent rogue actors bypassing certification.
7. Conclusion
Independent, government-funded safety companies represent a practical, scalable first step toward mitigating AGI risk. They are cost-effective, globally beneficial, and avoid the conflicts of interest inherent in current safety research models.
We recommend:
- By 2026 → Governments establish national AGI Safety Companies with independence charters.
- By 2028 → Build a multinational network (UN or NATO-linked) to harmonize safety standards.
- By 2030 → Target global adoption of mandatory third-party AGI safety certification before deployment.
This timeline is ambitious but achievable — and given the stakes, delaying safety infrastructure risks far greater costs than early investment.
