“Biological anchors” is about bounding, not pinpointing, AI timelines

Holden Karnofsky

13 min read · Nov 18, 2021

kokotajlod

I'm going on record as someone who would be mildly surprised if we didn't have AI-PONR by 2036. :) (That is, an AI-induced point of no return.) I also think TAI would follow within 5 years of such an event, possibly within 5 seconds depending on how fast takeoff goes.

And that said, all of the above is a set of “coulds” and “mights” - every case I’ve heard for “transformative AI by 2036” seems to require a number of uncertain pieces to click into place.
If “long-horizon” tasks turn out to be important, Bio Anchors shows that it’s hard to imagine there will be enough compute for the needed training runs.
Even if there is plenty of compute, 15 years might not be enough time to resolve challenges like assembling the right training data and environments.
It’s certainly possible that some completely different paradigm will emerge - perhaps inspired by neuroscience - and transformative AI will be developed in ways that don’t require Bio-Anchors-like “training runs” at all. But I don’t see any particular reason to expect that to happen in the next 15 years.
So I also don’t have a lot of sympathy for people who think that there’s a >50% chance of transformative AI by 2036.

I think the case for AI-PONR by 2036 is more disjunctive than conjunctive, or to put it another way, every case I've heard for "No AI-PONR by 2036" seems to require a number of uncertain pieces to click into place. It's something like "All PONR-inducing tasks are long-horizon, AND we won't figure out a way to get human-level performance at any PONR-inducing task via generalization, AND we won't figure out a way to get human-level performance at any PONR-inducing task via decomposition, AND we won't have a new paradigm that is more data-efficient, AND we won't be able to significantly accelerate AI R&D despite having human-brain-sized NN's that are superhuman at every short-horizon task we have data for" (And note that each of those conjuncts was itself pretty conjunctive, e.g. AI-PONR doesn't even require AGI or agency.)

(There are some disjuncts, such as "OR maybe the scaling laws will break down soon, OR there'll be a massive world war or something that stifles AI progress". I'd be interested to hear a list of such disjuncts.)

Holden Karnofsky

I think the sense in which your case is disjunctive is mostly that there are multiple potential "PONR-inducing tasks," and multiple potential ways to get to each one (brute-force trial-and-error on the full task, generalization from easier-to-learn tasks, decomposition into easier-to-learn tasks, breakthrough new paradigm). But this sort of disjunctiveness seems like it was fundamentally there in 1970 and in 1990 - if it didn't predict transformative AI (or PONR AI) within 15 years then, what's different today?

I'm guessing your answer is something like "Today, we are close to being able to train human-brain-sized models, if only on small-number-of-timestep tasks." I do think that's relevant. But with GPT-3 having been out for more than a year, within 1000x of the "human brain size" threshold, and with seemingly nobody having found a way to get it to do something that seems all that much like a human doing some economically relevant task, this doesn't seem like enough to get over 50% probability by 2036.

kokotajlod

Hmm, good point. I wonder if this comes down to how "meta" we like to be: My initial reaction to your question was

"What's different today?!?? Loads of things! We have AlphaStar and GPT-3 and scaling laws and transfer learning and image recognition and theory underpinning the scaling laws and almost-human-brain-sized-NNs ... There are multiple PONR-inducing tasks that seem plausibly within reach now via multiple methods, whereas 15 years ago before the deep learning boom there was only one method: the 'maybe we'll have some huge unprecedented breakthrough' method!"

But I imagine you'd say: "Yes, those are all differences, but one can always find differences if one looks. In 2006 you would have been able to list a bunch of things that happened since 1991, for example. The meta-strategy you are employing of thinking about recent AI progress and then noting the various ways in which it brings us closer to AGI etc. is a bad one; it has consistently failed for half a century so we shouldn't expect it to work now."

The way to settle this would be to wipe my memory and take me back in time to 2006 and see if I am similarly bullish about AI timelines, i.e. see if I actually am using that meta-strategy. It doesn't feel like I am, but maybe I am self-deceived. (I had 20-30 year timelines up until about 2 years ago.) Unfortunately we don't have the right equipment.

Anyhow, I dislike this meta stuff. I think it's better to reason on the object level, at least in cases like this where there aren't people with more expertise to defer to. And on the object level, it seems like there are now multiple plausible paths to AI-PONR whereas 15 years ago there were none, or maybe just the "maybe we'll have some unprecedented breakthrough" one. (This has a lot to do with the human brain size anchor, yes, but also with various other things like the scaling laws and the recent deep learning boom and GPT-3 etc.) That said, I wasn't thinking about these things 15 years ago and it's possible that if I was I'd have been raving about the impending singularity. :P So I admit you do have a point.

When would your skepticism cease? I feel like it'll always be true that APS-AI or AI-PONR will require a number of uncertain pieces to click into place, in some sense, until it's literally happening. What matters is the "in some sense." What sorts of signs and portents would convince you that there's a >50% chance of APS-AI or AI-PONR or TAI within 15 years?

To the point about GPT-3 being out for more than a year: One year is not a very long time and 1000x smaller than a human brain is not very big and I don't care about economic relevance primarily anyway (what I care about is AI-PONR, which I tentatively expect to come before GWP accelerates) and we are talking 10-year timelines not 2-year timelines. Suppose AI really will accelerate GWP 10 years from now. Does that confidently predict that GPT-3 would find massive economic application within 1 year? I don't think so. It's definitely evidence, but not strong evidence I think.

ETA: I forgot to mention that my timelines are generated from Ajeya's model, not from reasoning about disjunctiveness and impressiveness. I just put different weights into the different anchors than she does. The reasons I put different weights come down to different interpretations of evidence like the scaling laws, transfer learning, etc. and different intuitions about how hard it'll be, etc. It's of course possible that I'm biased and 15 years ago would have put loads of weight on milestones that have already passed... but it's more plausible IMO that actually I wouldn't have; this human brain anchor stuff plausibly would have appealed to me then as now, for example.

Holden Karnofsky

Re: "When would your skepticism cease?", it is certainly hard to lay out hypothetical observations that would correspond to particular AI timelines! But I'll give it a shot. Some example observations that seem like they should make my timelines a lot shorter than they are now, down to 15 years or shorter:

"There are cases of successful, impressive training runs that required key rewards to be very sparse (100,000+ timesteps each)." In this case, I'd doubt that compute was a bottleneck anymore, and think we were down to environment design.
"A large number of tasks have been trained without such sparse rewards; when looking at the list of tasks, I no longer find it plausible to think that there are key 'long-horizon' tasks that will need more compute than these tasks have needed, and it seems like a pretty good guess that it's affordable compute-wise to train any task." Similar to the previous case, with sparse-reward training turning out to be unnecessary for the kinds of training runs I would've guessed it'd be necessary for.
"People are actually running training runs that seem like they could theoretically generate an AGI." By default I expect that to be happening at least several years before we see actual AGI, as I expect it will take a while between "This sort of thing seems very concretely doable" and "This is working."
"We pretty much have a proof-of-concept for nearly every task an AI would need to do to be transformative/PONR; many are expensive and impractical and unreliable; people are engaged in massive data collection and environment design efforts to improve this." Again, I think we'd still be at least several years out by default here.
"Siri, Assistant, etc. can carry out a number of multi-step tasks about as well and reliably as a human virtual assistant would, and someone has made a decent case that their level of autonomy, creativity, etc. is on an upward trend that implies being able to do the kind of work top scientists do soon." I'm not currently aware of any analysis of such trends, and subjectively don't feel there's been impressive progress over the last few years.

I thought these up with Bio Anchors in mind, and accordingly, most explicitly involve some evidence that we don't still have orders of magnitude of compute-affordability to go. But there are probably lots of other configurations of in-the-lab-but-not-really-working-yet experiments, out-in-the-world performance, opinions from people closest to the work, etc. that would lead me to have shorter timelines (and there are probably things that some people would argue qualify as the above, that wouldn't).

More prosaically, I can keep comparing reality to the implied predictions of my preferred weightings for Bio Anchors. As the size of the biggest affordable training run gets bigger - and/or I see examples of successful training runs that seem like they should've required more compute, causing me to feel that the "effective" size of the biggest affordable training run has gotten bigger - I hope to update accordingly.

I do think it's possible that we'll get such a sudden jump that none of these sorts of things happens far in advance. I just don't think it's more than 50% likely.

I wouldn't guess you'd have had the same timelines in 2006, and I don't think I would have either. I think a lot has changed. But the basic fact that there are a lot of imaginable paths to AGI doesn't seem to have changed.

The fact that there are several that now seem "plausible" has changed to some degree, but looking over your list, those paths do all seem quite unlikely to get us all the way to "PONR-inducing AI" by 2036 (and they're not independent either). It might be interesting to try to specify the probabilities you see for each potential path.

kokotajlod

Thanks for the thoughtful reply, that's a good list! I'll make a list of my own below. Warning: Wall of text incoming, I won't be offended if you don't read it!

The fact that there are several that now seem "plausible" has changed to some degree, but looking over your list, those paths do all seem quite unlikely to get us all the way to "PONR-inducing AI" by 2036 (and they're not independent either). It might be interesting to try to specify the probabilities you see for each potential path.

This is the crux I guess, haha. Here's a stab:

Let's suppose it's 2030 and algorithmic and hardware progress have continued at the rates Ajeya projects and so has willingness-to-spend. Also let's suppose the scaling laws have continued to hold.

Here is a disjunctive list of paths-to-AI-PONR:

a. Some PONR-inducing task turns out to be short-horizon

b. Some PONR-inducing task turns out to work with smallish brains and medium horizons

c. Some PONR-inducing task can be reached via generalization (in short-horizon-pre-trained human-size brains)

d. Some PONR-inducing task can be reached via task decomposition (e.g. bureaucracies of AIs of the aforementioned types)

e. New algorithmic advancements appear that make it possible to do long-horizon training a few OOMs more data-efficiently (I guess I mean this to also be the catch-all category for paradigm shifts and the like)

I should now say what the main PONR-inducing tasks are in my opinion. They are:

--APS-AI [EDIT: Advanced, Planning, Strategically aware. See this report.]

--Persuasion tools good enough to cause major ideological strife and/or major degradation of public epistemology

--R&D acceleration

--Unknown/catchall

Technically R&D acceleration isn't PONR-inducing but it would lead to something PONR-inducing pretty quickly so I include it.

Ok, credences:

a. I think APS-AI is probably not short-horizon, but persuasion and R&D acceleration and unknown might be. (Maybe if we did AlphaFold but bigger and for AI R&D it would make a kickass tool for designing new AI architectures. Input hyperparameters, it predicts what training curve and performance on benchmarks will be!) Let's say 50% chance for persuasion, 25% for R&D acceleration, and 15% for unknown, and 65% for combined.

b. I worry that maybe a small neural net trained long-horizon-style to be APS-AI might actually succeed at some PONR-inducing task even though it is smaller than the human brain. I don’t worry too much about this, but… think about how GPT-2 is able to write sensible English even though it’s 5 OOMs smaller than the human brain. Or how AlphaStar can go toe-to-toe with human experts despite being 7 OOMs smaller! Let’s say 20%.

c. I’m more worried about big pre-trained brains generalizing (perhaps with a bit of fine-tuning.) I know there has been some research done into scaling laws for transfer, and Rohin extrapolated to calculate that this would only knock off 1.5 OOMs of cost from a hypothetical long-horizon training run… but I’m still nervous. Put it this way: Humans are FAR from optimal at long-horizon tasks anyway. There is no reason to think that we are as good as a human-brain-sized neural net trained for 10^14 data points each one the length of a subjective human lifetime. There’s every reason to think that neural net would instead be dramatically better than us. What sorts of things does an AI need to do to be APS-AI? Planning, strategically aware… arguably GPT-3 can already do those things, it just can’t do them well. But once it’s bigger, and fine-tuned… maybe it’ll be able to go toe-to-toe with humans, while still being far from optimal. Or even if it can’t be APS-AI, maybe it can be smart enough to accelerate AI R&D. (One could also imagine making a brain bigger than the human brain, and then pre-training it, and then using it as an oracle… ask it to predict which AI architecture will yield the best results, etc.) I say 60%.

d. I think bureaucracies of neural nets are pretty brittle and finicky now, but (a) that might change in the future as we get more practice with them, and (b) I get the impression that they do reasonably well when you can fine-tune them / retrain them into their new roles. See e.g. the recent OpenAI crawl-the-internet-and-do-research-with-which-to-answer-questions bot. I say 25%.

e. Let's suppose there have been 2 paradigm shifts in the last 60 years of AI research. Seems like the recent shift to deep learning was one. Seems very plausible that if we have a new shift that is to deep learning what deep learning was to the previous shitty stuff in the early 2000s, then we are going to get AI-PONR very shortly thereafter. So anyhow maybe this suggests something like a 33% chance of another such shift by 2030, going on base rates? Could go down if you think there have been fewer paradigm shifts in the past, could go up if you think there have been more. I'd love to see someone measure the recent increase in investment and calculate whether we are more likely to get paradigm shifts now than any time in the past, taking into account ideas-getting-harder-to-find effects. (Huh, you know, I don't think I realized how high the chance of paradigm shift is until now... I guess this means my timelines should be shorter...)

f. I’m not sure which category this fits in, but what about just scaling up EfficientZero? As far as I know its architecture is pretty damn general, not game-specific at all. You should be able to hook it up to a robot or a chatbot (perhaps with a pre-trained model like GPT-3 as a seed) and let rip. Napkin math time: Instead of spending 1 day training on hardware that costs $10,000, let's make a custom supercomputer that is 6 OOMs bigger. Cost: $10B. Run it for 100 days instead of 1. That gives us 8 OOMs more compute to work with than EfficientZero had. Use 5 OOMs to increase the subjective training time from 2 hours to 22 years. Use 3 OOMs to increase parameter count. Maybe this setup would work for something much more complex than Atari… I’m gonna say 20%.
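The napkin math in (f) can be checked with a few lines of Python. This is only a sketch: it assumes, as the comment implicitly does, that hardware cost scales linearly with machine size and that total compute is simply hardware size times wall-clock days. The variable names are made up for illustration; the input figures are the ones quoted above.

```python
import math

# Baseline figures quoted in the comment for EfficientZero.
BASE_HARDWARE_COST_USD = 10_000   # "hardware that costs $10,000"
BASE_TRAINING_DAYS = 1            # "1 day training"
BASE_SUBJECTIVE_HOURS = 2         # "2 hours" of subjective training time

# Proposed scale-up.
HARDWARE_OOMS = 6                 # "6 OOMs bigger" supercomputer
SCALED_TRAINING_DAYS = 100        # "100 days instead of 1"

# Assumption: cost scales linearly with hardware size.
scaled_cost_usd = BASE_HARDWARE_COST_USD * 10 ** HARDWARE_OOMS

# Assumption: compute = hardware size x wall-clock time, so OOMs add.
duration_ooms = math.log10(SCALED_TRAINING_DAYS / BASE_TRAINING_DAYS)
total_compute_ooms = HARDWARE_OOMS + duration_ooms

# How the comment proposes to spend the extra compute.
TRAINING_TIME_OOMS = 5            # longer subjective training time
PARAMETER_OOMS = 3                # larger parameter count

HOURS_PER_YEAR = 24 * 365
scaled_subjective_years = (
    BASE_SUBJECTIVE_HOURS * 10 ** TRAINING_TIME_OOMS / HOURS_PER_YEAR
)

print(f"cost: ${scaled_cost_usd:,}")
print(f"extra compute: {total_compute_ooms:.0f} OOMs")
print(f"subjective training time: {scaled_subjective_years:.1f} years")
```

Under these assumptions the budget adds up: 6 OOMs of hardware plus 2 OOMs of duration gives the 8 OOMs that are then split 5 + 3, and 2 hours scaled by 5 OOMs is a little under 23 years, in line with the "22 years" figure.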

Anyhow, all of this is off the cuff, out of my ass, etc. but it really does feel like it adds up to significantly more than 50% to me, more like 80% or so. So then why aren’t my timelines 80% by 2030? Well, remember all of this was conditioning on “algorithmic and hardware progress have continued at the rates Ajeya projects and so has willingness-to-spend. Also let's suppose the scaling laws have continued to hold.” Also I wish to be humble etc. and defer to people like yourself and Ajeya and Paul at least a little bit.
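One rough way to see how the per-path credences above could add up: the sketch below combines the six stated numbers (a through f) under two extreme assumptions about how the paths relate. Neither is claimed to be the method actually used in the comment; the function names are hypothetical. Treating the paths as independent gives one estimate, and treating them as fully overlapping (every path succeeds only in worlds where the most likely path also succeeds) gives a floor.

```python
from math import prod

# Credences stated in the comment, all conditional on the 2030 scenario.
path_credences = {
    "a_short_horizon": 0.65,
    "b_small_brain_medium_horizon": 0.20,
    "c_generalization": 0.60,
    "d_decomposition": 0.25,
    "e_paradigm_shift": 0.33,
    "f_scaled_efficientzero": 0.20,
}

def any_path_if_independent(credences):
    """P(at least one path works), assuming the paths are independent."""
    return 1 - prod(1 - p for p in credences)

def any_path_if_fully_overlapping(credences):
    """P(at least one path works) if all paths are nested inside the likeliest one."""
    return max(credences)

independent_estimate = any_path_if_independent(path_credences.values())
overlapping_floor = any_path_if_fully_overlapping(path_credences.values())

print(f"independent paths:       {independent_estimate:.3f}")
print(f"fully overlapping paths: {overlapping_floor:.3f}")
```

With these inputs the independent estimate is about 0.955 and the floor is 0.65. Positive correlation between paths pulls the combined figure down from the first number toward the second, so a gut figure of "80% or so" sits between the two extremes.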

My promised list: Here are some example observations that would go a long way towards lengthening my timelines, e.g. to 20-30 years instead of 10:

1. AI winter. Progress slows, investment dries up. People generally agree that the amount of compute used for the largest training runs will stop growing for the next decade or so, rather than grow by a couple OOMs as is currently expected.

2. Roadblock that doesn't quickly fall: My brief (5-year) experience watching AI progress is a story of many repeated instances of purported roadblocks being smashed through almost as soon as I hear about them. E.g. transfer learning, imperfect-information games, common sense understanding, reasoning, real-time games, sim-2-real, ... the list goes on. Most recently people I respect a lot (Ajeya, Paul, etc.) taught me about horizon lengths and data inefficiency and I came to believe that modern AI methods were fundamentally less data-efficient than the human brain... but then along came EfficientZero! So, I'd lengthen my timelines if someone clearly articulates a major roadblock to all important milestones (AGI/TAI/APS-AI/etc.), DeepMind and OpenAI etc. throw themselves at overcoming it for a few years, and fail. (Maybe this has already happened and I haven't heard about it because of publication bias?) (Also it's important that the roadblock plausibly block us from AGI/TAI/APS-AI/etc. Data-efficiency is on thin ice by this metric because plausibly even if AI is dramatically less data-efficient than humans there might still be a way to make AGI/TAI/APS-AI/etc. out of it. Causal reasoning and common sense and imperfect-information games do much better by this metric; too bad we smashed through them so easily.)

3. Solid evidence that human intelligence comes from "special sauce" that needs to either be painstakingly imitated via much greater knowledge of neuroscience, or brute-force rediscovered via at least genome-anchor-like levels of artificial evolution. As far as I know there isn't really any solid evidence for the special sauce hypothesis; if actually AGI is really easy and there is no special sauce whatsoever, my brain would still look exactly the way it does. (To date there has been no experiment along the lines of “make a 100T parameter dense model and train it for a billion time steps,” not even close.) The best piece of evidence I know of is along the lines of "If there's no special sauce, then we should be able to make AIs as smart as animal brains of similar size, and we can't." Except that so far it seems like we can actually? We can make image recognizers better than bee brains, for example, as OpenPhil's investigation showed. I haven't yet heard of an intellectual task tiny-brained animals can do that we know current AI methods can't also do.

4. People trying to build AGI with a track record of success change their minds and start disagreeing with me about timelines: My impression is that the people actually trying to build AGI, especially the ones at the cutting edge with the best track records, tend to have even shorter timelines than me!

Holden Karnofsky

Interesting, thanks! Yep, those probabilities definitely seem too high to me :) How much would you shade them down for 5 years instead of 15? It seems like if your 5-year probabilities are anywhere near your 15-year probabilities, then the next 5 years have a lot of potential to update you one way or the other (e.g., if none of the "paths to PONR" you're describing work out in that time, that seems like it should be a significant update).

I'm not going to comment comprehensively on the paths you laid out, but a few things:

I think EfficientZero is sample-efficient but not compute-efficient: it's compensating for its small number of data points by simulating a large number, and I don't think there are big surprises on how much compute it's using to do that. This doesn't seem to be competing with human "efficiency" in the most important (e.g., compute costs) sense.
I don't know what you mean by APS-AI.
I'm pretty skeptical that "Persuasion tools good enough to cause major ideological strife and/or major degradation of public epistemology" is a serious PONR candidate. (There's already a lot of ideological strife and public confusion ...) I think the level of persuasiveness needed here would need to be incredibly extreme - far beyond "can build a QAnon-like following" and more like "Can get more than half the population to take whatever actions one wants them to take." This probably requires reasoning about neuroscience or something, and doesn't seem to me to be adding much in the way of independent possibility relative to the R&D possibility.

kokotajlod

Gaah, sorry, I keep forgetting to put links in -- APS-AI means Advanced, Planning, Strategically Aware AI -- the thing the Carlsmith report talks about. I'll edit to put links in retroactively.

I've written a short story about what I expect the next 5 years to look like. Insofar as AI progress is systematically slower and less impressive than what is depicted in that story, I'll update towards longer timelines, yeah.

I'm currently at something like 20% that AI-PONR will be crossed in the next 5 years, and so insofar as that doesn't seem to have happened 5 years from now then that'll be a 20%-sized blow to my timelines in the usual Bayesian way. It's important to note that this won't necessarily lengthen my timelines all things considered, because what happens in those 5 years might be more than a 20% blow to 20+year timelines. (For example, and this is what I actually think is most likely, 5 years from now the world could look like it does at the end of my short story, in which case I'd have become more confident that the point of no return will come sometime between 2026 and 2036 than I am now, not less, because things would be more on track towards that outcome than they currently seem to be.)
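The "usual Bayesian way" can be sketched as follows. Only the 20% figure for the next 5 years comes from the comment; the split of the remaining 80% between "2026 to 2036" and "after 2036", and the likelihoods in the second update, are hypothetical numbers chosen purely for illustration.

```python
def bayes_update(prior, likelihood):
    """Return the posterior over hypotheses given P(evidence | hypothesis)."""
    unnormalized = {h: prior[h] * likelihood[h] for h in prior}
    total = sum(unnormalized.values())
    return {h: weight / total for h, weight in unnormalized.items()}

# 0.20 is from the comment; 0.45 and 0.35 are hypothetical placeholders.
prior = {"by_2026": 0.20, "2026_to_2036": 0.45, "after_2036": 0.35}

# Evidence 1: no AI-PONR has happened by 2026. This rules out the first
# bucket and says nothing else about the other two.
no_ponr_by_2026 = {"by_2026": 0.0, "2026_to_2036": 1.0, "after_2036": 1.0}
after_five_years = bayes_update(prior, no_ponr_by_2026)

# Evidence 2 (hypothetical likelihoods): the world in 2026 looks like the
# short story, which is assumed here to be more likely under medium
# timelines than under long ones.
world_matches_story = {"by_2026": 0.0, "2026_to_2036": 0.6, "after_2036": 0.2}
after_story = bayes_update(prior, world_matches_story)

print(after_five_years)
print(after_story)
```

In the first update the surviving buckets are simply rescaled by 1 / 0.8, so their ratio is unchanged. In the second, the extra evidence shifts weight toward "2026 to 2036", which is the situation the comment describes: passing 5 years without AI-PONR need not lengthen timelines all things considered.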

Re: persuasion tools: You seem to have a different model of how persuasion tools cause PONR than I do. What I have in mind is mundane, not exotic--I'm not imagining AIs building QAnon-like cult followings, I'm imagining the cost of censorship/propaganda* continuing to drop rapidly and the effectiveness continuing to increase rapidly, and (given a few years for society to catch up) ideological strife to intensify in general. This in turn isn't an x-risk by itself but it's certainly a risk factor, and insofar as our impact comes from convincing key parts of society (e.g. government, tech companies) to recognize and navigate a tricky novel problem (AI risk) it seems plausible to me that our probability of success diminishes rapidly as ideological strife in those parts of society intensifies. So when you say "there's already a lot of ideological strife and public confusion" my response is "yeah exactly, and isn't it already causing big problems and e.g. making our collective handling of COVID worse? Now imagine that said strife and confusion gets a lot worse in the next five years, and worse still in the five years after that."

*I mean these terms in a broad sense. I'm talking about the main ways in which ideologies strengthen their hold on existing hosts and spread themselves to new ones. For more on this see the aforementioned story, this post, and this comment.

Re: EfficientZero: Fair, I need to think about that more... I guess it would be really helpful to have examples of EfficientZero being done on more complex environments than Atari, such as e.g. real-world robot control or Starcraft or text prediction.

Holden Karnofsky

Sorry for the long delay, I let a lot of comments to respond to pile up!

APS seems like a category of systems that includes some of the others you listed (“Advanced capability: they outperform the best humans on some set of tasks which when performed at advanced levels grant significant power in today’s world (tasks like scientific research, business/military/political strategy, engineering, and persuasion/manipulation) … “). I still don’t feel clear on what you have in mind here in terms of specific transformative capabilities. If we condition on not having extreme capabilities for persuasion or research/engineering, I’m quite skeptical that something in the "business/military/political strategy" category is a great candidate to have transformative impact on its own.

Thanks for the links re: persuasion! This seems like a major theme for you and a big place where we currently disagree. I'm not sure what to make of your take, and I think I'd have to think a lot more to have stable views on it, but here are quick reactions:

If we made a chart of some number capturing "how easy it is to convince key parts of society to recognize and navigate a tricky novel problem" (which I'll abbreviate as "epistemic responsiveness") since the dawn of civilization, what would that chart look like? My guess is that it would be pretty chaotic; that it would sometimes go quite low and sometimes go quite high; and that it would be very hard to predict the impact of a given technology or other development on epistemic responsiveness. Maybe there have been one-off points in history when epistemic responsiveness was very high; maybe it is much lower today compared to peak, such that someone could already claim we have passed the "point of no return"; maybe "persuasion AI" will drive it lower or higher, depending partly on who you think will have access to the biggest and best persuasion AIs and how they will use them. So I think even if we grant a lot of your views about how much AI could change the "memetic environment," it's not clear how this relates to the "point of no return."
I think I feel a lot less impressed/scared than you with respect to today's "persuasion techniques."
- I'd be interested in seeing literature on how big an effect size you can get out of things like focus groups and A/B testing. My guess is that going from completely incompetent at persuasion (e.g., basically modeling your audience as yourself, which is where most people start) to "empirically understanding and incorporating your audience's different-from-you characteristics" causes a big jump from a very low level of effectiveness, but that things flatten out quickly after that, and that pouring more effort into focus groups and testing leads to only moderate effects, such that "doubling effectiveness" on the margin shouldn't be a very impressive/scary idea.
- I think most media is optimizing for engagement rather than persuasion, and that it's natural for things to continue this way as AI advances. Engagement is dramatically easier to measure than persuasion, so data-hungry AI should help more with engagement than persuasion; targeting engagement is in some sense "self-reinforcing" and "self-funding" in a way that targeting persuasion isn't (so persuasion targeters need some sort of subsidy to compete with engagement targeters); and there are norms against targeting persuasion as well. I do expect some people and institutions to invest a lot in persuasion targeting (as they do today), but my modal expectation does not involve it becoming pervasive on nearly all websites, the way yours seems to.
- I feel like a lot of today's "persuasion" is either (a) extremely immersive (someone is raised in a social setting that is very committed to some set of views or practices); or (b) involves persuading previously-close-to-indifferent people to believe things that call for low-cost actions (in many cases this means voting and social media posting; in some cases it can mean more consequential, but still ultimately not-super-high-personal-cost, actions). (b) can lead over time to shifting coalitions and identities, but the transition from (b) to (a) seems long.
- I particularly don't feel that today's "persuaders" have much ability to accomplish the things that you're pointing to with "chatbots," "coaches," "Imperius curses" and "drugs." (Are there cases of drugs being used to systematically cause people to make durable, sustained, action-relevant changes to their views, especially when not accompanied by broader social immersion?)
I'm not really all that sure what the special role of AI is here, if we assume (for the sake of your argument that AI need not do other things to be transformative or PONR-y) a lack of scientific/engineering ability. What has/had higher ex ante probability of leading to a dramatic change in the memetic environment: further development of AI language models that could be used to write more propaganda, or the recent (last 20 years) explosion in communication channels and data, or many other changes over the last few hundred years such as the advent of radio and television, or the change in business models for media that we're living through now? This comparison is intended to be an argument both that "your kind of reasoning would've led us to expect many previous persuasion-related PONRs without needing special AI advances" and that "if we condition on persuasion-related PONRs being the big thing to think about, we shouldn't necessarily be all that focused on AI."

I liked the story you wrote! A lot of it seems reasonably likely to be reasonably on point to me - I especially liked your bits about AIs confusing people when asked about their internal lives. However:

I think the story is missing a kind of quantification or "quantified attitude" that seems important if we want to be talking about whether this story playing out "would mean we're probably looking at transformative/PONR-AI in the following five years." For example, I do expect progress in digital assistants, but it matters an awful lot how much progress and economic impact there is. Same goes for just how effective the "pervasive persuasion targeting" is. I think this story could be consistent with worlds in which I've updated a lot toward shorter transformative AI timelines, and with worlds in which I haven't at all (or have updated toward longer ones.)
As my comments probably indicate, I'm not sold on this section.
- I'll be pretty surprised if e.g. the NYT is using a lot of persuasion targeting, as opposed to engagement targeting.
- I do expect "People who still remember 2021 think of it as the golden days, when conformism and censorship and polarization were noticeably less than they are now" will be true, but that's primarily because (a) I think people are just really quick to hallucinate declinist dynamics and call past times "golden ages"; (b) 2021 does seem to have extremely little conformism and censorship (and basically normal polarization) by historical standards, and actually does kinda seem like a sort of epistemic golden age to me.
  - For people who are strongly and genuinely interested in understanding the world, I think we are in the midst of an explosion in useful websites, tools, and blogs that will someday be seen nostalgically;* a number of these websites/tools/blogs are remarkably influential among powerful people; and while most people are taking a lot less advantage than they could and seem to have pretty poorly epistemically grounded views, I'm extremely unconvinced that things looked better on this front in the past - here's one post on that topic.

I do generally think that persuasion is an underexplored topic, and could have many implications for transformative AI strategy. Such implications could include something like "Today's data explosion is already causing dramatic improvements in the ability of websites and other media to convince people of arbitrary things; we should assign a reasonably high probability that language models will further speed this in a way that transforms the world." That just isn't my guess at the moment.

*To be clear, I don't think this will be because websites/tools/blogs will be less useful in the future. I just think people will be more impressed with those of our time, which are picking a lot of low-hanging fruit in terms of improving on the status quo, so they'll feel impressive to read while knowing that the points they were making were novel at the time.

kokotajlod

I'm a fan of lengthy asynchronous intellectual exchanges like this one, so no need to apologize for the delay. I hope you don't mind my delay either? As usual, no need to reply to this message.

If we condition on not having extreme capabilities for persuasion or research/engineering, I’m quite skeptical that something in the "business/military/political strategy" category is a great candidate to have transformative impact on its own.

I think I agree with this.

Re: quantification: I agree; currently I don't have good metrics to forecast on, much less good forecasts, for persuasion stuff and AI-PONR stuff. I am working on fixing that problem. :)

Re persuasion: For the past two years I have agreed with the claims made in "The misinformation problem seems like misinformation."(!!!) The problem isn't lack of access to information; information is more available than it ever was before. Nor is the problem "fake news" or other falsehoods. (Most propaganda is true.) Being politically polarized and extremist correlates positively with being well-informed, not negatively! (Anecdotally, my grad school friends with the craziest/most-extreme/most-dangerous/least-epistemically-virtuous political beliefs were generally the people best informed about politics. Analogous to how 9/11 truthers will probably know a lot more about 9/11 than you or me.) This is indeed an epistemic golden age... for people who are able to resist the temptations of various filter bubbles and the propaganda of various ideologies. (And everyone thinks themself one such person, so everyone thinks this is an epistemic golden age for them.)

I do disagree with your claim that this is currently an epistemic golden age. I think it's important to distinguish between ways in which it is and isn't. I mentioned above a way that it is.

If we made a chart of some number capturing "how easy it is to convince key parts of society to recognize and navigate a tricky novel problem" ... since the dawn of civilization, what would that chart look like? My guess is that it would be pretty chaotic; that it would sometimes go quite low and sometimes go quite high

Agreed. I argued this, in fact.

and that it would be very hard to predict the impact of a given technology or other development on epistemic responsiveness.

Disagree. I mean, I don't know, maybe this is true. But I feel like we shouldn't just throw our hands up in the air here; we haven't even tried! I've sketched an argument for why we should expect epistemic responsiveness to decrease in the near future (propaganda and censorship are bad for epistemic responsiveness & they are getting a lot cheaper and more effective & no pro-epistemic-responsiveness-force seems to be rising to counter it).

Maybe there have been one-off points in history when epistemic responsiveness was very high; maybe it is much lower today compared to peak, such that someone could already claim we have passed the "point of no return"; maybe "persuasion AI" will drive it lower or higher, depending partly on who you think will have access to the biggest and best persuasion AIs and how they will use them.

Agreed. I argued this, in fact. (Note: "point of no return" is a relative notion; it may be that relative to us in 2010 the point of no return was e.g. the founding of OpenAI, and nevertheless relative to us now the point of no return is still years in the future.)

So I think even if we grant a lot of your views about how much AI could change the "memetic environment," it's not clear how this relates to the "point of no return."

The conclusion I built was "We should direct more research effort at understanding and forecasting this stuff because it seems important." I think that conclusion is supported by the above claims about the possible effects of persuasion tools.

What has/had higher ex ante probability of leading to a dramatic change in the memetic environment: further development of AI language models that could be used to write more propaganda, or the recent (last 20 years) explosion in communication channels and data, or many other changes over the last few hundred years such as the advent of radio and television, or the change in business models for media that we're living through now? This comparison is intended to be an argument both that "your kind of reasoning would've led us to expect many previous persuasion-related PONRs without needing special AI advances" and that "if we condition on persuasion-related PONRs being the big thing to think about, we shouldn't necessarily be all that focused on AI."

Good argument. To hazard a guess:
1. Explosion in communication channels and data (i.e. the Internet + Big Data)
2. AI language models useful for propaganda and censorship
3. Advent of radio and television
4. Change in business models for media

However, I'm pretty uncertain about this; I could easily see the order being different. Note that from what I've heard the advent of radio and television DID have a big effect on public epistemology; e.g. it partly enabled totalitarianism. Prior to that, the printing press is argued to have also had disruptive effects.

This is why I emphasized elsewhere that I'm not arguing for anything unprecedented. Public epistemology / epistemic responsiveness has waxed and waned over time and has occasionally gotten extremely bad (e.g. in totalitarian regimes and the freer societies that went totalitarian), so we shouldn't be surprised if it happens again; and if someone has an argument that it might be about to happen again, it should be taken seriously and investigated. (I'm not saying you yourself need to investigate this, you probably have better things to do.) Also, I totally agree that we shouldn't just be focused on AI; in fact I'd go further and say that most of the improvements in propaganda+censorship will come from non-AI stuff like Big Data. But AI will help too; it seems to make censorship a lot cheaper, for example.

I'd be interested in seeing literature on how big an effect size you can get out of things like focus groups and A/B testing. My guess is that going from completely incompetent at persuasion (e.g., basically modeling your audience as yourself, which is where most people start) to "empirically understanding and incorporating your audience's different-from-you characteristics" causes a big jump from a very low level of effectiveness, but that things flatten out quickly after that, and that pouring more effort into focus groups and testing leads to only moderate effects, such that "doubling effectiveness" on the margin shouldn't be a very impressive/scary idea.

I think most media is optimizing for engagement rather than persuasion, and that it's natural for things to continue this way as AI advances. Engagement is dramatically easier to measure than persuasion, so data-hungry AI should help more with engagement than persuasion; targeting engagement is in some sense "self-reinforcing" and "self-funding" in a way that targeting persuasion isn't (so persuasion targeters need some sort of subsidy to compete with engagement targeters); and there are norms against targeting persuasion as well. I do expect some people and institutions to invest a lot in persuasion targeting (as they do today), but my modal expectation does not involve it becoming pervasive on nearly all websites, the way yours seems to.
I feel like a lot of today's "persuasion" is either (a) extremely immersive (someone is raised in a social setting that is very committed to some set of views or practices); or (b) involves persuading previously-close-to-indifferent people to believe things that call for low-cost actions (in many cases this means voting and social media posting; in some cases it can mean more consequential, but still ultimately not-super-high-personal-cost, actions). (b) can lead over time to shifting coalitions and identities, but the transition from (b) to (a) seems long.
I particularly don't feel that today's "persuaders" have much ability to accomplish the things that you're pointing to with "chatbots," "coaches," "Imperius curses" and "drugs." (Are there cases of drugs being used to systematically cause people to make durable, sustained, action-relevant changes to their views, especially when not accompanied by broader social immersion?)

These are all good points. This is exactly the sort of thing I wish there was more research into, and that I'm considering doing more research on myself.

Re: pervasiveness on almost all websites: Currently propaganda and censorship both seem pretty widespread and also seem to be on a trend of becoming more so. (The list of things that get censored is growing, not shrinking, for example.) This is despite the fact that censorship is costly and so theoretically platforms that do it should be outcompeted by platforms that just maximize engagement. Also, IIRC Facebook uses large language models to do the censoring more efficiently and cheaply, and I assume the other companies do too. As far as I know they aren't measuring user opinions and directly using that as a feedback signal, thank goodness, but... is it that much of a stretch to think that they might? It's only been two years since GPT-3.
