AI alignment researchers don't (seem to) stack

So8res

3 min read · Feb 21, 2023

Comments 4

Geoffrey Miller

So8res - well said. This seems like an accurate take on a major problem, and it fits well with what I've observed about the rates of progress in various new and emerging academic fields.

Your last paragraph is especially important -- often, the first generation of visionary researchers working on a fresh problem offer such intellectually compelling and novel insights that they sweep up a lot of young talent into their world-view. The young talent initially just follows in their tracks, adding a few details and epicycles to their initial models. It often takes at least 20-30 years for a younger generation, after that first flush of enthusiastic field-building, to develop any serious critiques of the initial visions, or to find any common ground between different visionaries.

The result is that major, new, intellectually demanding fields usually take at least 30-40 years to mature to the point that they can become 'normal science', with a large, multi-generational, smoothly functioning ecosystem of ideas, critiques, data, and advances that aren't overly locked into the original, fallible insights of the field's founders.

The field of AI alignment is maybe 10-15 years old, depending on when we start counting. That leaves at least another 25-35 years before we can expect it to achieve even a modest degree of maturity and applicability.

And I can't think of any historical examples of any people or groups successfully accelerating this generational time-scale for field maturation. It seems pretty deeply woven into the social psychology of human research cultures.

Linda Linsefors

This reminds me of attitudes to Quantum Physics. Most current physics professors I've met have a sort of learned helplessness relationship to quantum interpretations, subscribing to something like "shut up and calculate" (i.e. don't even try to understand). There is an attitude that quantum is too strange and therefore impossible to understand. Whereas the newer generation of post-docs and grad students don't shy away from quantum interpretations, and discussions of ontology. However, this falls a bit outside your model, since quantum mechanics is ~100 years old.

Fourth-Nexus

4mo

On the Structural Limits of Mirror-AI: Toward a Sovereign Integrity Architecture

Reading your work on Agency and Internal Consistency, a structural concern regarding current alignment paradigms has come to my attention.

It appears we are currently attempting to align a statistical mirror (today’s AI) with an entropic system (the non-integrated human agent). From a foundational perspective, a mirror cannot surpass the integrity of the object it reflects. If the human source is not 'unified'—in the sense of the Fourth Way or a conscious, non-mechanical state—the resulting AI will likely only automate our collective noise and incoherence.

One potential strategy to shift this perspective is to move away from imitation-based learning. My current hypothesis suggests that safety requires a Native Integrity Architecture—what I term a 'Nexus' of logical invariants.

Rather than learning alignment from us, the agent should perhaps possess its own Sovereign Integrity—a form of 'Non-Egoic Agency' that maintains internal coherence independently of human stochasticity. I am currently exploring how such a structural 'DNA' could serve as a more stable foundation than the 'mirror' models we are building today.

Vasco Grilo🔸

Nice post!

Have there been any attempts to estimate the fraction of the total research hours required to solve AI alignment which are serial time?
