Weak point in "most important century": lock-in

Holden Karnofsky

This is the second of (for now) two posts covering what I see as the weakest points in the “most important century” series. (The first one is here.)

The weak point I’ll cover here is the discussion of “lock-in”: the idea that transformative AI could lead to societies that are stable for billions of years. If true, this means that how things go this century could affect what life is like in predictable, systematic ways for unfathomable amounts of time.

My main coverage of this topic is in a section of my piece on digital people. It’s pretty hand-wavy, not super thorough, and isn’t backed by an in-depth technical report (though I do link to some informal notes from physicist Jess Riedel that he made while working at Open Philanthropy). Overcoming Bias critiqued me on this point, leading to a brief exchange in the comments.

I’m not going to be dramatically more thorough or convincing here, but I will say a bit more about how the overall “most important century” argument is affected if we ignore this part of it, and a bit more about why I find “lock-in” plausible.

(Also note that "lock-in" will be discussed at some length in an upcoming book by Will MacAskill, What We Owe the Future.)

Throughout this piece, I’ll be using “lock-in” to mean “key things about society, such as who is in power or which religions/ideologies are dominant, are locked into place indefinitely, plausibly for billions of years,” and “dynamism” or “instability” to mean the opposite: “such things change on much shorter time horizons, as in decades/centuries/millennia.” As noted previously, I consider "lock-in" to be a scary possibility by default, though it's imaginable that certain kinds of lock-in (e.g., of human rights protections) could be good.

“Most important century” minus “lock-in”

First, let’s just see what happens if we throw out this entire part of the argument and assume that “lock-in” isn’t a possibility at all, but accept the rest of the claims. In other words, we assume that:

Something like PASTA (advanced AI that automates scientific and technological advancement) is likely to be developed this century.
That, in turn, would lead to explosive scientific and technological advancement, resulting in a world run by digital people or misaligned AI or something else that would make it fair to say we have "transitioned to a state in which humans as we know them are no longer the main force in world events."
But it would not lead to any particular aspect of the world being permanently set in stone. There would remain billions of years full of unpredictable developments.

In this case, I think there is still an important sense in which this would be the “most important century for humanity”: it would be our last chance to shape the transition from a world run by humans to a world run by something very much unlike humans. This is one of the two definitions of “most important century” given here.

More broadly, in this case, I think there’s an important sense in which the “most important century” series should be thought of as “Pointing to a drastically underrated issue; correct in its most consequential, controversial implications, if not in every detail.” When people talk about the most significant issues of our time (in fact, even when they are specifically talking about likely consequences of advanced AI), they rarely include much discussion of the sorts of issues emphasized in this series; and they should, whether or not this series is correct about the possibility of “lock-in.”

As noted here, I ultimately care more about whether the “most important century” series is correct in this sense - pointing at drastically underappreciated issues - than about how likely its title is to end up describing reality. (Though I care about both.) It’s for this reason that I think the relatively thin discussion of lock-in is a less important “weak point” than the weak point I wrote about previously, which raises questions about whether advanced AI would change the world very quickly or very much at all.

But I’ve included the mention of lock-in because I think it’s a real possibility, and it would make the stakes of this century even higher.

Dissecting “lock-in”

There have probably been many people in history (emperors, dictators) who had enormous power over their society and would’ve liked to keep things going just as they were forever. There may also have been points in time when democratically elected governments would have “locked in” at least some things about their society for good, if they could have.

But they couldn’t. Why not?

I think the reasons broadly fall into a few categories, and digital people (or misaligned AI, but I’ll focus on digital people to keep things simple for now) could change the picture quite a bit.

First I'll list factors that seem particularly susceptible to being changed by technology, then one factor that seems less so.

Factors that seem particularly susceptible to being changed by technology

Aging and death. Any given powerful person has to die at some point. They can try to transfer power to children or allies, but a lot changes in the handoff (and over very long periods of time, there are a lot of handoffs).

Digital people need not age or die. (More broadly, sufficient advances in science and technology seem pretty likely to be able to eliminate aging and death, even if not via digital people.) So if some particular set of them had power over some particular part of the galaxy, death and aging need not interfere here at all.

Other population changes. Over time, the composition of any given population changes, and in particular, one generation replaces the previous one. This tends to lead to changes in values and power dynamics.

Without aging or death, and with extreme productivity, we could end up quickly exhausting the carrying capacity of any particular area - so that area might not see changes in population composition at all (or might see much smaller, more controlled changes than we are used to today - no cases where a whole generation is replaced by a new one). Generational turnover seems like quite a big driver of dynamism to date.

Chaos. To date, even when some government is officially “in charge” of a society, it has very limited ability to monitor and intervene in everything that’s going on. But I think technological advancement to date has already greatly increased the ability of a government to exercise control over a large number of people and large geography. An explosion in scientific and technological advancement could radically further increase governments’ in-practice control of what’s going on.

(Digital people provide an extreme example: controlling the server running a virtual environment would mean being able to monitor and control everything about the people in that environment. And powerful figures could create many copies of themselves for monitoring and enforcement.)

Natural events. All kinds of things might disrupt a human society: changes in the weather/climate, running lower on resources, etc. Sufficient advances in science and technology could drive this sort of disruption to extremely low levels (and in particular, digital people have pretty limited resource needs, such that they need not run low on resources for billions of years).

Seeking improvement. While some dictators and emperors might prefer to keep things as they are forever, most of today’s governments don’t tend to have this as an aspiration: elected officials see themselves as accountable to large populations whose lives they are trying to improve.

But dramatic advances in science and technology would mean dramatically more control over the world, as well as potentially less scope for further improvement (I generally expect that the rate of improvement has to trail off at some point). This could make it increasingly likely that some government or polity decides they’d prefer to lock things in as they are.

But could these factors be eliminated so thoroughly as to cause stability for billions of years? I think so, if enough of society were digital (e.g., digital people), such that those seeking stability could use digital error correction (essentially, making multiple copies of any key thing, which can be used to roll back anything that changes for any reason - for more, see Jess Riedel’s informal notes, which argue that digital error correction could be used to reach quite extreme levels of stability).

A tangible example here would be tightly controlled virtual environments, containing digital people, programmed to reset entirely (or reset key properties) if any key thing changed. These represent one hypothetical way of essentially eliminating all of the above factors as sources of change.
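To make the "multiple copies, roll back anything that changes" idea a bit more concrete, here is a minimal sketch of redundancy-based repair by majority vote. It is only an illustration under simple assumptions (a few independent copies, with fewer than half corrupted at any one time); the function name `repair` and the example values are hypothetical, and this is not the scheme described in Jess Riedel's notes.

```python
from collections import Counter


def repair(copies):
    """Reset every copy to the value held by a strict majority of copies.

    Assumption for this sketch: corruption hits copies independently, so
    as long as fewer than half have changed, the majority value is the
    original one.
    """
    majority_value, count = Counter(copies).most_common(1)[0]
    if count <= len(copies) // 2:
        # No strict majority: we cannot tell which value is the original.
        raise ValueError("no strict majority; cannot repair safely")
    return [majority_value] * len(copies)


# Five hypothetical copies of some "key property"; one has drifted.
copies = ["key-property-v1"] * 5
copies[2] = "key-property-altered"

restored = repair(copies)
print(restored)
```

Running the repair step frequently, relative to how often copies get corrupted, is what keeps a majority intact. More copies and more frequent checks make an unrecoverable change less likely, which is the rough intuition behind "quite extreme levels of stability."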

But even if we prefer to avoid thinking about such specific scenarios, I think there are broader cases for explosive scientific and technological advancement radically reducing the role of each of the above factors, as outlined above.

Of course, just because some government could achieve "lock-in" doesn't mean it would. But over the course of a long enough time, it seems that "anti-lock-in" societies would simply gain ever more chances to become "pro-lock-in" societies, whereas even a few years of a "pro-lock-in" society could result in indefinite lock-in. (And in a world of digital people operating a lot faster than humans, a lot of "time" could pass before the end of this century.)
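The asymmetry here (lock-in is a one-way door, while staying "anti-lock-in" has to be re-chosen over and over) can be sketched with a toy calculation. The assumptions are mine and purely illustrative: each period carries the same small, independent chance `p_per_period` of a society flipping to "pro-lock-in," and once it flips it never flips back. The function name and the numbers are hypothetical, not estimates.

```python
def lock_in_probability(p_per_period, periods):
    """Chance that lock-in has happened at least once within `periods` periods.

    Toy model: each period has an independent probability `p_per_period`
    of flipping to lock-in, and lock-in is permanent (an absorbing state).
    """
    return 1.0 - (1.0 - p_per_period) ** periods


# Illustrative only: a 1% chance per period, over longer and longer horizons.
for periods in (10, 100, 1000):
    print(periods, round(lock_in_probability(0.01, periods), 4))
```

With a 1% chance per period, the cumulative probability is roughly 63% after 100 periods and very close to 100% after 1,000. The general point is that under these assumptions any fixed nonzero per-period chance pushes the cumulative probability toward 1 as the horizon grows, and digital people operating much faster than humans could fit many such "periods" into a single calendar century.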

A factor that seems less susceptible to being changed by technology: competition between societies

Even if a government had complete control over its society, this wouldn’t ensure stability, because it could always be attacked from outside. And unlike the above factors, this is not something that radical advances in science and technology seem particularly likely to change: in a world of digital people, different governments would still be able to attack each other, and would be able to negotiate with each other with the threat of attack in the background.

This could cause sustained instability such that the world is constantly changing. This is the point emphasized by the Overcoming Bias critique.

I think this dynamic might - or might not - be an enduring source of dynamism. Some reasons it might not:

If AI caused an explosion in scientific and technological advancement, then whoever develops it first could quickly become very powerful - being “first to develop PASTA by a few months” could effectively mean developing the equivalent of a several-centuries lead in science and technology after that. This could lead to consolidation of power on Earth, and there are no signs of intelligent life outside Earth - so that could be the end of “attack” dynamics as a force for instability.
Awareness of the above risk might cause the major powers to explicitly negotiate and divide up the galaxy, committing (perhaps enforceably, depending on how the technological picture shakes out) never to encroach on each other’s territory. In this case, any particular part of the galaxy would not be subject to attacks.
It might turn out that space settlements are generally easier to defend than attack, such that once someone establishes one, it is essentially not subject to attack.

Any of the above, or a combination (e.g., attacks are possible but risky and costly; world powers choose not to attack each other in order not to set off a war), could lead to the permanent disappearance of military competition as a factor, and open up the possibility for some governments to “lock in” key characteristics of their societies.

Three categories of long-run future

Above, I’ve listed some factors that may - or may not - continue to be sources of dynamism even after explosive scientific and technological advancement. I think I have started to give a sense for why, at a minimum, sources of dynamism could be greatly reduced in the case of digital people or other radically advanced technology, compared to today.

Now I want to divide the different possible futures into three broad categories:

Full discretionary lock-in. This is where a given government (or coalition or negotiated setup) is able to essentially lock in whatever properties it chooses for its society, indefinitely.

This could happen if essentially every source of dynamism outlined above goes away, and governments choose to pursue lock-in.

Predictable competitive dynamics. I think the source of dynamism that is most likely to persist (in a world of digital people or comparably advanced science and technology) is the last one discussed in the above section: military competition between advanced societies.

However, I think it could persist in a way that makes the long-run outcomes importantly predictable. In fact, I think “importantly predictable long-run outcomes” is part of the vision implied by the Overcoming Bias critique, which argues that the world will need to be near-exclusively populated by beings that spend nearly their entire existence working (since the population will expand to the point that it’s necessary to work constantly just to survive).

If we end up with a world full of digital beings that have full control over their environment except for having to deal with military competition from others, we might expect that there will be strong pressures for the digital beings that are most ambitious, most productive, hardest-working, most aggressive, etc. to end up populating most of the galaxy. These may be beings that do little else but strive for resources.

True dynamism. Rather than a world where governments lock in whatever properties they (and/or majorities of their constituents) want, or a world where digital beings compete with largely predictable consequences, we could end up with a world in which there is true freedom and dynamism - perhaps deliberately preserved via putting specific measures in place to stop the above two possibilities, and enforce some level of diversity and even randomness.

Having listed these possibilities, I want to raise the hypothesis that if we could end up with any of these three, and this century determines which (or which mix) we end up with, that makes a pretty good case for this century having especially noteworthy impacts, and thereby being the most important century of all time for intelligent life.

For example, say that from today’s vantage point, we’re equally likely to get (a) a world where powerful governments employ “lock-in,” (b) a world where unfettered competition leads the galaxy to be dominated by the strong/productive/aggressive, or (c) a truly dynamic world where future events are unpredictable and important. In that case, if we end up with (c), and future events end up being enormously interesting and consequential, I would think that there would still be an important sense in which the most important development of all time was the establishment of that very dynamic. (Given that one of the other two could have instead ended up determining the shape of civilization across the galaxy over the long run.)

Another way of putting this: if lock-in (and/or predictably competitive dynamics) is a serious possibility starting this century, the opportunity to prevent it could make this century the most important one.

Boiling it down

This has been a lot of detail about radically unfamiliar futures, and readers may have the sense at this point that things have gotten too specific and complex to put much stock in. But I think the broad intuitions here are fairly simple and solid, so I’m going to give a more high-level summary:

Scientific and technological advancement can reduce or eliminate many of today’s sources of instability, from aging and death to chaos and natural events. An explosion in scientific and technological advancement could therefore lead to a big drop in dynamism. (And as one vivid example, digital people could set up tightly controlled virtual environments with very robust error correction - something I consider a scary possibility by default, as noted in the intro.)
Dynamism may or may not remain, depending on a number of factors about how consolidated power ends up being and how different governments/societies deal with each other. The “may or may not” could be determined this century.
I think this is a serious enough possibility that it heightens the stakes of the “most important century,” but I’m far from confident in the thinking here, and I think most of the spirit of the “most important century” hypothesis survives even if we forget about all of it.

Hopefully these additional thoughts have been helpful context on where I’m coming from, but I continue to acknowledge that this is one of the more under-developed parts of the series, and I’m interested in further exploration of the topic.

Effective Altruism Forum