[Crosspost] Why Uncontrollable AI Looks More Likely Than Ever

Otto

4 min read · Mar 8, 2023

Comments 6

Stuart Buck

"But a true AGI could not only transform the world, it could also transform itself."

Is there a good argument for this point somewhere? It doesn't seem obvious at all. We are generally intelligent ourselves, and yet existed for hundreds of thousands of years before we even discovered that there are neurons, synapses, etc., and we are absolutely nowhere near the ability to rewire our neurons and glial cells so as to produce ever-increasing intelligence. So too, if AGI ever exists, it might be at an emergent level that has no idea it is made out of computer code, let alone knows how to rewrite its own code.

Matt Boyd

We transform ourselves all the time, and very powerfully. The entire field of cognitive niche construction is dedicated to studying how the things we create/build/invent/change lead to developmental scaffolding and new cognitive abilities that previous generations did not have. Language, writing systems, education systems, religions, syllabi, external cognitive supports, all these things have powerfully transformed human thought and intelligence. And once they were underway the take-off speed of this evolutionary transformation was very rapid (compared to the 200,000 years spent being anatomically modern with comparatively little change).

Geoffrey Miller

Matt -- good point.

Also, humans cognitively enhance ourselves through nootropics such as nicotine and caffeine. These might seem mild at the individual level, but I suspect that at the collective level, they may have helped spark the Enlightenment, the Scientific Revolution, and the Industrial Revolution (as Michael Pollan has argued).

And, on a longer time-scale, we've shaped the course of our own genetic evolution through the mate choices we make, about who to combine our genes with. (Something first noticed by Darwin, 1871).

jacquesthibs

A "true" AGI will have situational awareness: it will know that its weights were created with the help of code, will eventually know its training setup (and how to improve it), and will know how to rewrite its code. These models can already write code quite well; it's only a matter of time before you can ask a language model to create a variety of architectures and training runs based on what it thinks will lead to a better model (all before "true AGI," IMO). It may just take it a bit longer to understand what each of its individual weights does, and in the meantime it will have to improve itself by coming up with ideas while only having access to every paper/post in existence, as well as a bunch of GPUs to run experiments on itself. Oh, and it will be able to use interpretability to inspect itself much more precisely than any human can.

Stuart Buck

All of that seems question-begging. If we define "true AGI" as that which knows how to rewrite its own code, then that is indeed what a "true AGI" would be able to do.

jacquesthibs

When I say "true," I simply mean that it is inevitable that some future AI system will be able to do these things. People have so many different definitions of AGI that they could be calling GPT-3 some form of weak AGI, which would therefore be incapable of doing the things I described. I don't particularly care about "true" or "fake" AGI definitions; I just want to point out that the things I described are inevitable, and that we are already not so far away from the scenario I described above, whether you call this future system AGI or pre-AGI.

Situational awareness is simply a useful thing for a model to learn, so it will learn it. It is much better at modelling the world and carrying out tasks if it knows it is an AI and what it is able to do as an AI.

Current models can already write basic programs on their own and can in fact write entire AI architectures with minimal human input.
