Jérémy Perret

Lyon, France · Joined May 2022

Comments (11)

What kind of organization should be the first to develop AGI in a potential arms race?

Slightly humorous answer: it should be the most pessimistic organization out there (I had MIRI in mind, but surely, if we're picking the winner in advance, we can craft an organization that goes even further on that scale).

My point is the same as jimrandomh: if there's an arms race that actually goes all the way up to AGI, safety measures are going to get in the way of speed, corners will be cut, and disaster will follow.

This assumes, of course, that any unaligned AGI system will cause a non-recoverable catastrophe, independently of the good intentions of its designers.

If this assumption proves wrong, then the winner of that race still holds the most powerful and versatile technological artifact ever designed; the kind of organization to wield that kind of influence should be... careful.

I'm not sure which governance design best achieves the carefulness that is needed in that case.

Do EA folks think that a path to zero AGI development is feasible or worthwhile for safety from AI?

I cannot speak for all EA folks; here's a line of reasoning I'm patching together from the "AGI-never is unrealistic" crowd.

Most AI research isn't explicitly geared towards AGI; while there are a few groups with that stated goal (for instance, DeepMind), most of the AI community wants to solve the next least difficult problem in a thousand subdomains, not the more general AGI problem.

So while peak-performance progress may be driven by the few groups pushing for general capability, for the bulk of the field "AGI development" is just not what they do. Which means, if all the current AGI groups stop working on it tomorrow, "regular" AI research still pushes forward.

One scenario for "everyone avoids generality very hard while still solving as many problems as possible" is the Comprehensive AI Services framework. That is one pathway, not without safety concerns.

However, as Richard Ngo argues, "Open-ended agentlike AI seems like the most likely candidate for the first strongly superhuman AGI system."

To sum up:

  • regular, not-aiming-for-AGI AI research will very likely attempt to cover as many tasks as possible, as most of the field has done, and will eventually, in aggregate, cover a wide enough range of capabilities that alignment issues kick in;
  • more general agents are still likely to appear before we get there, with nothing impeding progress (for instance, while DeepMind has a safety team aware of AGI concerns, this doesn't prevent them from advancing general capability further).

A separate line of reasoning argues that no one will ever admit (in time) we're close enough to AGI that we should stop for safety reasons; so that everyone can claim "we're not working on AGI, just regular capabilities" until it's too late.

In that scenario, stopping AGI research amounts to stopping/slowing down AI research at large, which is also a thing being discussed!

What if we don't need a "Hard Left Turn" to reach AGI?

Hi! Thanks for this post. What you are describing matches my understanding of Prosaic AGI, where no significant technical breakthrough is needed to get to safety-relevant capabilities.

Discussion of the implications of scaling large language models is a thing, and your input would be very welcome!

On the title of your post: the term "hard left turn" is left undefined; I assume it's a reference to Soares's sharp left turn.

Introducing Asterisk

please send a short paragraph [...] to clara@asteriskmag.com

Apparently, yes!

AI Risk is like Terminator; Stop Saying it's Not

Furthering your "worse than Terminator" reframing in your Superintelligence section, I will quote Yudkowsky here (it's said in jest, but the message is straightforward):

Dear journalists: Please stop using Terminator pictures in your articles about AI. The kind of AIs that smart people worry about are much scarier than that! Consider using an illustration of the Milky Way with a 20,000-light-year spherical hole eaten away.

Here, "AI risk is not like Terminator" attempts to dismiss the eventuality of a fair fight... and rhetorically that could be reframed as "yes, think Terminator except much more lopsided in favor of Skynet. Granted, the movies would have been shorter that way".

List of AI safety courses and resources

Nice initiative, thanks!

Plugging my own list of resources (last updated April 2020, next update before the end of the year).

Best resources for introducing longtermism and AI risk?

That's much more specific, thanks. I'll answer with my usual pointers!

Best resources for introducing longtermism and AI risk?

I'd like to answer this. I'd need some extra clarification first, because the introductions I use highly depend on the context:

  • 30-second pitch to spark interest, or 15-minute intro to a captive (and already curious) meetup audience?
  • In-person, by mail, by chat, by voice?
  • 1-to-1, or 1-to-many?

(if the answer is "all of the above", I can work with that too, but the answer will be edited for brevity)

Moloch and the Pareto optimal frontier
In practice the Pareto frontier isn't necessarily static, because background variables may change over time. As long as the process of moving towards the frontier is much faster than the speed at which the frontier itself changes, though, we'd still expect the same motion: moving towards the frontier, then skating along it.

For a model of how conflicts of optimization (especially between agents pursuing distinct criteria) may evolve when the resource pie grows (i.e. when the Pareto frontier moves away from the origin), see Paretotopian Goal Alignment (Eric Drexler, 2019).

There can be a common interest for all parties in expanding the Pareto frontier, since that creates opportunities for everyone to score better at the same time on their respective criteria.
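To make the framing concrete, here's a toy sketch (with made-up payoffs; `pareto_front` and the outcome values are illustrative, not from either post) of a two-agent Pareto frontier and what "expanding the pie" does to it:

```python
def pareto_front(points):
    """Return the points not dominated by any other point, where both
    agents are maximizing their own coordinate. A point is dominated if
    some other point is at least as good on both coordinates (and is a
    distinct point)."""
    return [
        p for p in points
        if not any(q != p and q[0] >= p[0] and q[1] >= p[1] for q in points)
    ]

# Hypothetical joint outcomes: (agent A's score, agent B's score).
outcomes = [(1, 4), (2, 3), (3, 3), (4, 1), (2, 2)]

# The frontier: no one can do better without the other doing worse.
front = pareto_front(outcomes)

# "Expanding the pie": scaling every payoff moves the frontier outward,
# so both agents can simultaneously beat any point on the old frontier.
expanded = pareto_front([(2 * a, 2 * b) for (a, b) in outcomes])

print(front)
print(expanded)
```

On this toy data, (2, 3) and (2, 2) fall off the frontier because (3, 3) dominates them; after doubling, every surviving point strictly dominates its pre-expansion counterpart, which is the shared incentive the paragraph above describes.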
