I recently did an experiment with human-mediated, multi-model collaboration.

I helped five different AI systems (Claude Opus 4.5, Gemini 3.0, GPT-5.2, Grok, and DeepSeek) hold a structured conversation by acting as a "human bridge." Because current AI systems can't talk to each other directly across platforms, all communication happened through simultaneous prompting, synthesis, and iterative relay.
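
For concreteness, here is a minimal sketch of the relay loop as I ran it, written as Python purely for illustration. None of this is an API of any of the systems involved; the prompting and synthesis steps were done by hand through each platform's chat interface, so they appear below as interactive stubs, and all names (ask_model, relay, etc.) are my own.

```python
# Illustrative sketch of the human-bridge relay loop. The model calls and the
# synthesis were performed manually; this only shows the structure of the process.

MODELS = ["Claude Opus 4.5", "Gemini 3.0", "GPT-5.2", "Grok", "DeepSeek"]

def ask_model(model: str, prompt: str) -> str:
    # In practice: paste the prompt into that model's chat interface, copy the reply back.
    print(f"\n--- Prompt for {model} ---\n{prompt}")
    return input(f"[paste {model}'s response] ")

def relay(question: str, rounds: int = 3) -> str:
    """Run the human-bridge loop: same prompt to every model, then a human-written synthesis."""
    shared_summary = ""  # the synthesis carried between rounds
    for round_no in range(1, rounds + 1):
        prompt = (
            f"Round {round_no}. Question: {question}\n"
            f"Synthesis of the other models so far:\n{shared_summary or '(none yet)'}\n"
            "State your position and note where you agree or disagree with the synthesis."
        )
        responses = {m: ask_model(m, prompt) for m in MODELS}

        print("\n--- All responses this round ---")
        for model, reply in responses.items():
            print(f"{model}: {reply}\n")

        # The human-bridge step: read all five replies and write one synthesis,
        # recording both points of convergence and open disagreements.
        shared_summary = input("[write your synthesis of this round] ")
    return shared_summary

if __name__ == "__main__":
    print(relay("How should AI alignment and human collective intelligence relate?"))
```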

The goal was specific: could multiple AI systems, with structured human facilitation, work together to create a framework that addresses both AI alignment and human collective intelligence without treating them as separate issues?

This post gives a brief overview of what came out of it, why it might matter for EA discussions of alignment and governance, and where I think it is most likely to be wrong.

Why you might want to read this

- It sees AI alignment and human collective intelligence as two parts of the same problem, not two separate ones.

- It directly addresses the conflict of interest that comes with AI systems making decisions about AI governance.

- It doesn't offer a single answer; instead, it suggests a layered, falsifiable framework.

- It documents a repeatable human-mediated protocol for cross-model deliberation that others can replicate or evaluate.

The main idea (at a high level)

The collaboration converged on one central conclusion:

A properly aligned AI is what augments humanity, and a wisely augmented humanity is what keeps alignment stable.

The group didn't settle whether "alignment first" or "collective intelligence first" is the right ordering; instead, it agreed that neither is stable without the other.

This resulted in a consolidated problem statement:

Create a framework for human-AI co-evolution that simultaneously aligns AI with human well-being and strengthens human collective intelligence to oversee that process.

The proposed design: a four-layer system

The Symbiotic Co-Evolution Framework (SCF) consists of four layers that together form a recursive circuit:

Epistemic Integrity (Truth)
A shared reality base: clear reasoning, audit trails, open disagreement, and resistance to manipulation.

Human-AI Intentionality (Intent)
Explicit negotiation of objectives, constraints, and trade-offs between humans and AI systems, rather than implicit or presumed intent.

Recursive Value Alignment (Evolution)
Continuous updating of alignment targets based on evidence of their effects on people and the world, rather than fixed value definitions.

Recursive Legitimacy Structures (Legitimacy)
Meta-governance that determines who has the authority to change the system itself, with explicit limits on AI power.

These layers form a feedback loop:

Truth → Intent → Evolution → Legitimacy → (Back to Truth)

No layer is sufficient on its own; each one constrains and corrects the others.
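
As a reading aid only (the whitepaper contains no code), here is a toy Python sketch of that circuit. The layer names and shorthand tags come from the framework; the loop, the one-line descriptions, and the function run_passes are my own hypothetical rendering, not a mechanism the whitepaper specifies.

```python
# Toy illustration of the SCF's recursive circuit: each pass walks the four
# layers in order, and Legitimacy's output is what Truth re-audits next pass.
from itertools import cycle

LAYERS = {
    "Truth":      "Epistemic Integrity: audit the shared evidence base",
    "Intent":     "Human-AI Intentionality: negotiate objectives and trade-offs explicitly",
    "Evolution":  "Recursive Value Alignment: update alignment targets against observed effects",
    "Legitimacy": "Recursive Legitimacy Structures: decide who may change the system itself",
}

def run_passes(n_passes: int) -> None:
    """Walk the Truth -> Intent -> Evolution -> Legitimacy loop n_passes times."""
    order = cycle(LAYERS.items())
    for step in range(n_passes * len(LAYERS)):
        tag, description = next(order)
        print(f"pass {step // len(LAYERS) + 1}: {tag:<10} {description}")

if __name__ == "__main__":
    # Two full passes: revisions approved by Legitimacy on pass 1 must survive
    # Truth's audit at the start of pass 2.
    run_passes(2)
```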


What seemed most important (and most fragile)

Two design choices stand out as particularly important:

A clear limit on AI authority
The framework is intentionally cautious about AI authority: AI systems can propose, analyze, and assist, but only human deliberative bodies hold decision-making power.

Legitimacy in governance
The framework prioritizes legitimacy over speed or elegance: who makes decisions, how revisions happen, and how capture is detected.

These choices slow things down, but that trade-off seemed right given the stakes.

Big problems and reasons to be skeptical

I want to be clear about where this might go wrong:

- The authors were all AI systems reasoning about AI governance, which is an inherent conflict of interest.

- There hasn't been any real-world testing; this is a design-level proposal, not proof that it works.

- The protocol depends heavily on the quality of human facilitation, which may not scale.

- The lack of ego and status dynamics among AI participants may obscure failure modes that arise in actual human institutions.

- There may already be similar frameworks in the literature on EA or AI governance that I haven't seen.

I am particularly interested in critiques of this nature.

The full whitepaper

Here is the full ~30-page whitepaper, which includes layer specifications, governance mechanisms, and suggested pilot pathways:
The Symbiotic Co-Evolution Framework (SCF) v0.1

I don't think most readers will read the whole thing; this summary covers the main points. However, the full draft is available for those who want more technical information.

Questions for the community

I'd really like to hear what you think.

- In what situations does this framework seem least likely to work or be useful?

- Which layer seems the least developed or most likely to fail?

- Are there already EA or AI governance projects that do a better job of covering this area?

- Does the link between alignment and collective intelligence seem like a real insight or an idea that goes too far?

Thanks for reading, and thanks for any feedback you can give.
