Written by Daniel Kokotajlo for AI Impacts.
[Epistemic status: I wrote this for Blog Post Day II. Sorry it’s late.]
In this post, I distinguish between three different kinds of competitiveness — Performance, Cost, and Date — and explain why I think these distinctions are worth the brainspace they occupy. For example, they help me introduce and discuss a problem for AI safety proposals having to do with aligned AIs being outcompeted by unaligned AIs.
Distinguishing three kinds of competitiveness and competition
A system is performance-competitive insofar as its ability to perform relevant tasks compares with competing systems. If it is better than any competing system at the relevant tasks, it is very performance-competitive. If it is almost as good as the best competing system, it is less performance-competitive.
(For AI in particular, “speed,” “quality,” and “collective” intelligence, as Bostrom defines them, all contribute to performance-competitiveness.)
A system is cost-competitive to the extent that it costs less to build and/or operate than its competitors. If it is more expensive, it is less cost-competitive, and if it is much more expensive, it is not at all cost-competitive.
A system is date-competitive to the extent that it can be created sooner than (or not much later than) its competitors. If it can only be created after a prohibitive delay, it is not at all date-competitive.
A performance competition is a competition that performance-competitiveness helps you win. The more important performance-competitiveness is to winning, the more intense the performance competition is.
Likewise for cost and date competitions. Most competitions are all three types, to varying degrees. Some competitions are none of the types; e.g. a “competition” where the winner is chosen randomly.
I briefly searched the AI alignment forum for uses of the word “competitive.” It seems that when people talk about competitiveness of AI systems, they usually mean performance-competitiveness, but sometimes mean cost-competitiveness, and sometimes both at once. Meanwhile, I suspect that this important post can be summarized as “We should do prosaic AI alignment in case only prosaic AI is date-competitive.”
Putting these distinctions to work
First, I’ll sketch some different future scenarios. Then I’ll sketch how different AI safety schemes might be more or less viable depending on which scenario occurs. For me at least, having these distinctions handy makes this stuff easier to think and talk about.
Disclaimer: The three scenarios I sketch aren’t supposed to represent the scenarios I think most likely; similarly, my comments on the three safety proposals are mere hot takes. I’m just trying to illustrate how these distinctions can be used.
Scenario: FOOM: There is a level of performance which leads to a localized FOOM, i.e. very rapid gains in performance combined with very rapid drops in cost, all within a single AI system (or family of systems in a single AI lab). Moreover, these gains & drops are enough to give decisive strategic advantage to the faction that benefits from them. Thus, in this scenario, control over the future is mostly a date competition. If there are two competing AI projects, and one project is building a system which is twice as capable and half the price but takes 100 days longer to build, that project will lose.
Scenario: Gradual Economic Takeover: The world economy gradually accelerates over several decades, and becomes increasingly dominated by billions of AGI agents. However, no one entity (AI or human, individual or group) has most of the power. In this scenario, control over the future is mostly a cost and performance competition. The values which shape the future will be the values of the bulk of the economy, and that in turn will be the values of the most popular and successful AGI designs, which in turn will be the designs that have the best combination of performance- and cost-competitiveness. Date-competitiveness is mostly irrelevant.
Scenario: Final Conflict: It’s just like the Gradual Economic Takeover scenario, except that several powerful factions are maneuvering and scheming against each other, in a Final Conflict to decide the fate of the world. This Final Conflict takes almost a decade, and mostly involves “cold” warfare, propaganda, coalition-building, alliance-breaking, and that sort of thing. Importantly, the victor in this conflict will be determined not so much by economic might as by clever strategy; a less well resourced faction that is nevertheless more far-sighted and strategic will gradually undermine and overtake a larger/richer but more dysfunctional faction. In this context, having the most capable AI advisors is of the utmost importance; having your AIs be cheap is much less important. In this scenario, control of the future is mostly a performance competition. (Meanwhile, in this same scenario, popularity in the wider economy is a moderately intense competition of all three kinds.)
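One way to make the three scenarios concrete is to treat each as a different weighting over the three kinds of competitiveness. The toy model below is only a sketch under loud assumptions: the scoring formula, the weights, and the names (`System`, `Competition`, `score`, `winner`) are all hypothetical illustrations, not anything measured or proposed above. The two systems mirror the FOOM example: one is ready immediately, the other is twice as capable and half the price but arrives 100 days later.

```python
import math
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class System:
    name: str
    performance: float  # relative capability; higher is better
    cost: float         # relative price; lower is better
    delay_days: float   # days until the system is ready; lower is better


@dataclass(frozen=True)
class Competition:
    name: str
    performance_weight: float  # reward per doubling of performance
    cost_weight: float         # penalty per doubling of cost
    date_weight: float         # penalty per day of delay


def score(system: System, competition: Competition) -> float:
    """Hypothetical scoring rule: log-scaled performance and cost, linear delay."""
    return (
        competition.performance_weight * math.log2(system.performance)
        - competition.cost_weight * math.log2(system.cost)
        - competition.date_weight * system.delay_days
    )


def winner(systems: Iterable[System], competition: Competition) -> System:
    """Return the system with the highest score in the given competition."""
    return max(systems, key=lambda s: score(s, competition))


# The two projects from the FOOM example (numbers are relative, not real data).
EARLY = System("early", performance=1.0, cost=1.0, delay_days=0.0)
LATE = System("late", performance=2.0, cost=0.5, delay_days=100.0)
SYSTEMS = [EARLY, LATE]

# Made-up weights chosen only to reflect which kind of competition dominates.
FOOM = Competition("FOOM", performance_weight=0.1, cost_weight=0.1, date_weight=1.0)
GRADUAL = Competition("Gradual Economic Takeover", 1.0, 1.0, 0.0)
FINAL_CONFLICT = Competition("Final Conflict", 1.0, 0.1, 0.001)

if __name__ == "__main__":
    for competition in (FOOM, GRADUAL, FINAL_CONFLICT):
        print(competition.name, "->", winner(SYSTEMS, competition).name)
```

Under these assumed weights, the earlier project wins when the date weight dominates, and the later but better and cheaper project wins otherwise. The point is only that the same pair of systems can rank differently depending on which kind of competition they are in; the specific numbers carry no significance.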
Proposal: Value Learning: By this I mean schemes that take state-of-the-art AIs and train them to have human values. I currently think of these schemes as not very date-competitive, but pretty cost-competitive and very performance-competitive. I say value learning isn’t date-competitive because my impression is that it is probably harder to get right, and thus slower to get working, than other alignment proposals. Value learning would be better suited to the Gradual Economic Takeover scenario because the world will change slowly, so we can afford to spend the time necessary to get it right, and once we do it’ll be a nice add-on to the existing state-of-the-art systems that won’t sacrifice much cost or performance.
Proposal: Iterated Distillation and Amplification: By this I mean… well, it’s hard to summarize. It involves training AIs to imitate humans, and then scaling them up until they are arbitrarily powerful while still human-aligned. I currently think of this scheme as decently date-competitive but not as cost-competitive or performance-competitive. But lack of performance-competitiveness isn’t a problem in the FOOM scenario because IDA is above the threshold needed to go FOOM; similarly, lack of cost-competitiveness is only a minor problem because the first project to build FOOM-capable AI, if it doesn’t have enough money already, will probably be able to attract a ton of investment (e.g. via being nationalized) without even using its AI for anything, and can then spend that money on the extra cost of aligning the AI via IDA.
Proposal: Impact regularization: By this I mean attempts to modify state-of-the-art AI designs so that they deliberately avoid having a big impact on the world. I think of this scheme as being cost-competitive and fairly date-competitive. I think of it as being performance-uncompetitive in some competitions, but performance-competitive in others. In particular, I suspect it would be very performance-uncompetitive in the Final Conflict scenario (because AI advisors of world leaders need to be impactful to do anything), yet nevertheless performance-competitive in the Gradual Economic Takeover scenario.
Putting these distinctions to work again
I came up with these distinctions because they helped me puzzle through the following problem:
Lots of people worry that in a vastly multipolar, hypercompetitive AI economy (such as described in Hanson’s Age of Em or Bostrom’s “Disneyland without children” scenario) eventually pretty much everything of merely intrinsic value will be stripped away from the economy; the world will be dominated by hyper-efficient self-replicators of various kinds, performing their roles in the economy very well and seeking out new roles to populate but not spending any time on art, philosophy, leisure, etc. Some value might remain, but the overall situation will be Malthusian.
Well, why not apply this reasoning more broadly? Shouldn’t we be pessimistic about any AI alignment proposal that involves using aligned AI to compete with unaligned AIs? After all, at least one of the unaligned AIs will be willing to cut various ethical corners that the aligned AIs won’t, and this will give it an advantage.
This problem is more serious the more the competition is cost-intensive and performance-intensive. Sacrificing things humans value is likely to lead to cost- and performance-competitiveness gains, so the more intense the competition is in those ways, the worse our outlook is.
However, it’s plausible that the gains from such sacrifices are small. If so, we need only worry in scenarios of extremely intense cost and performance competition.
Moreover, the extent to which the competition is date-intensive seems relevant. Optimizing away things humans value, and gradually outcompeting systems which didn’t do that, takes time. And plausibly, scenarios which are not at all date competitions are also very intense performance and cost competitions. (Given enough time, lots of different designs will appear, and minor differences in performance and cost will have time to overcome differences in luck.) On the other hand, aligning AI systems might take time too, so if the competition is too date-intensive things look grim also. Perhaps we should hope for a scenario in between, where control of the future is a moderate date competition.
These distinctions seem to have been useful for me. However, I could be overestimating their usefulness. Time will tell; we shall see if others make use of them.
If you think they would be better if the definitions were rebranded or modified, now would be a good time to say so! I currently expect that a year from now my opinions on which phrasings and definitions are most useful will have evolved. If so, I’ll come back and update this post.