Results from the AI x Democracy Research Sprint

Esben Kran

1

Executive summary: A 3-day research sprint on AI governance produced winning projects that demonstrated risks to democracy from AI, including challenges with unlearning hazardous information, AI-generated misinformation and disinformation, manipulation of public comment systems, and sleeper agents spreading misinformation.

Key points:

The research sprint aimed to demonstrate risks to democracy from AI and support AI governance work.
One project red-teamed unlearning techniques to evaluate their effectiveness in removing hazardous information from open-source models while retaining essential knowledge.
Another project showed that improving LLMs' ability to identify misinformation also enhances their ability to create sophisticated disinformation.
A project demonstrated how AI can undermine U.S. federal public comment systems by generating realistic, high-quality forged comments that are challenging to detect.
The final winning project explored how AI sleeper agents can spread misinformation, collaborate with each other, and use personal information to scam users.

This comment was auto-generated by the EA Forum Team. Feel free to point out issues with this summary by replying to the comment, and contact us if you have feedback.

Results from the AI x Democracy Research Sprint

Results from the AI x Democracy Research Sprint

Projects

Beyond Refusal: Scrubbing Hazards from Open-Source Models

Jekyll and HAIde: The Better an LLM is at Identifying Misinformation, the More Effective it is at Worsening It.

Artificial Advocates: Biasing Democratic Feedback using AI

Unleashing Sleeper Agents

Other projects

Apart Research Sprint