The Basic Argument for AI Safety

Richard Y Chappell🔸

This is a linkpost for https://www.goodthoughts.blog/p/the-basic-argument-for-ai-safety

High-stakes uncertainty warrants caution and research

When I see confident dismissals of AI risk from other philosophers, it’s usually not clear whether our disagreement is ultimately empirical or decision-theoretic in nature. (Are they confident that there’s no non-negligible risk here, or do they think we should ignore the risk even though it’s non-negligible?) Either option seems pretty unreasonable to me, for the general reasons I previously outlined in X-Risk Agnosticism. But let me now take a stab at spelling out an ultra-minimal argument for worrying about AI safety in particular:

It’s just a matter of time until humanity develops artificial superintelligence (ASI). There’s no in-principle barrier to such technology, nor should we by default expect sociopolitical barriers to automatically prevent the innovation.
1. Indeed, we can’t even be confident that it’s more than a decade away.
2. Reasonable uncertainty should allow at least a 1% chance that it occurs within 5 years (let alone 10).
The stakes surrounding ASI are extremely high, to the point that we can’t be confident that humanity would long survive this development.
Even on tamer timelines (with no “acute jumps in capabilities”), gradual disempowerment of humanity is a highly credible concern.
We should not neglect credible near-term risks of human disempowerment or even extinction. Such risks warrant urgent further investigation and investment in precautionary measures.
1. If there’s even a 1% chance that, within a decade, we’ll develop technology that we can’t be confident humanity would survive—that easily qualifies as a “credible near-term risk” for purposes of applying this principle.

Conclusion: AI risk warrants urgent further investigation and precautionary measures.^[1]

Sufficient probability density in the danger zone?

My question for those who disagree with the conclusion: which premise(s) do you reject?

13 Reactions

Comments20

Sorted by

New & upvoted

Click to highlight new comments since: Today at 1:42 PM

fergusqJan 923

Since you ask the viewpoint of those who disagree, here is a summary of my objections to your argument. It consists of two parts, first my objection to your probability of AI risk and then my objection to your conclusion.

It’s just a matter of time until humanity develops artificial superintelligence (ASI). There’s no in-principle barrier to such technology, nor should we by default expect sociopolitical barriers to automatically prevent the innovation.
Indeed, we can’t even be confident that it’s more than a decade away.
Reasonable uncertainty should allow at least a 1% chance that it occurs within 5 years (let alone 10).

A reasonable prior is that we will not develop ASI in near future (out of all possible decades, each single decade has a very small probability of ASI being developed, way less than 1%). To overcome this prior, we would need evidence. However, there is little to no evidence that suggests that any AGI/ASI technologies are possible in near future.

It is clear that our current LLM tech is not sufficient for AGI, lacking several properties that an AGI would require, such as learning-planning^[1]. Since the current progress is not going towards an AGI, it does not count as good evidence for AGI technology surfacing in near future.

We should not neglect credible near-term risks of human disempowerment or even extinction. Such risks warrant urgent further investigation and investment in precautionary measures.
If there’s even a 1% chance that, within a decade, we’ll develop technology that we can’t be confident humanity would survive—that easily qualifies as a “credible near-term risk” for purposes of applying this principle.

I'm a firm believer of the neglectedness, tractability and importance framework whenever deciding on possible interventions. Therefore, if the question is should we neglect a risk, first thing to ask is, do others neglect it. In the case of AI risk, the answer is, in my opinion, no. AI risk is not neglected. It is, in fact, taken very seriously by major AI companies, numerous other organizations, and even some governments. AI is researched in almost every university on our planet, and massive funds are used for AI safety research. So I believe AI risk fails the neglectedness criterion.

But even more crucially, I think it also fails tractability. Because the AGI technology does not exist, we cannot research it. Most so called "AI safety research" focuses on unimportant sidetracks that do not have any measurable effect on AI risk. Similarly, it is very difficult to establish any governmental policy to limit AI development, as we do not even know what kind of technology we need to regulate aside from a blanket ban on AI research, which most our politicians correctly deem would be an overreaching and harmful policy, since current AI tech is harmless from the X-risk viewpoint (and there would be no way out of that ban since we cannot research safety of non-existing tech).

I do not believe AI risk is important as there is no good reason to believe we will develop ASI in near future. But even if we believed so, it fails the two other criteria of the ITN framework and thus would not be a good target for interventions.

^{^}
Learning-planning is what I call the ability to assess one's own abilities and efficiently learn missing abilities in a targeted way. Currently, machine learning algorithms are extremely inefficient, and models lack introspection capabilities required to assess missing abilities.

Richard Y Chappell🔸Jan 95

Thanks for explaining your view!

On the first point: I think we should view ASI as disproportionately likely in decades that already feature (i) recent extraordinary progress in AI capabilities that surprises almost everyone, and (ii) a fair number of experts in the field who appear to take seriously the possibility that continued progress in this vein could soon result in ASI.

I'd then think we should view it as disproportionately unlikely that ASI will either be (a) achieved before any such initial signs of impressive progress, OR (b) achieved centuries after such initial progress. (If not achieved within a century, I'd think it more likely to be outright unachievable.)

I don't really know enough about AI safety research to comment on the latter disagreement. I'm curious to hear others' views.

fergusqJan 914

While it might feel to you that AI progress has been rapid in the past decade, most innovations behind it such as neural networks, gradient descent, backpropagation, and the concept of language models are very old innovations. The only major innovation in the past decade is the Transformer architecture from 2017, and almost everything else is just incremental progress and scaling on larger models and datasets. Thus, the pace of AI architecture development is very slow and the idea that a groundbreaking new AGI architecture will surface has a low probability.

ASI is the ultimate form of AI and in some sense computer science as a whole. Claiming that we will reach it just because we've just got started in computer science seems premature, akin to claiming that physics is soon solved just because we've made so much progress recently. Science (and AI in particular) is often compared to an infinite ladder: you can take as many steps as you like, and there will still be infinite steps ahead. I don't believe there are literally infinite steps to ASI, but assuming there must be only a few steps ahead just because there are a lot of steps behind is a fallacy.

I was recently reading Ada Lovelace's "Translator's Notes" from 1843, and came across this timeless quote (emphasis original):

It is desirable to guard against the possibility of exaggerated ideas that might arise as to the powers of the Analytical Engine. In considering any new subject, there is frequently a tendency, first, to overrate what we find to be already interesting or remarkable; and, secondly, by a sort of natural reaction, to undervalue the true state of the case, when we do discover that our notions have surpassed those that were really tenable.

This is a comment to the text by Luigi Menabrea she was translating, in which he was hyping that the "conceptions of intelligence" could be encoded into the instructions of the Analytical Engine^[1]. Having a much better technical understanding of the machine than Menabrea, Lovelace was skeptical of his ideas and urged him to calm down.

The rest of their discussion is much more focused on concrete programs the machine could execute, but this short quote stroke me as very familiar of our current discussion. There existed (some level of) scientific discussion of artificial intelligence in 1840s, and their talking points seem so similar to ours, with some hyping and others being skeptical!

From the perspective of Lovelace and Menabrea, computer science progressed incredibly fast. Babbage's Analytical Engine was a schematic for a working computer that was much better than earlier plans such as the Differential Engine. Designing complex programs became possible. I can feel their excitement while reading their texts. But even despite this, it took a hundred years until ENIAC, the first general purpose digital computer, was built in 1945. The fact that a field progresses fast in its early days does not mean much when predicting its progress in future.

^{^}
The quote she was commenting: "Considered under the most general point of view, the essential object of the machine being to calculate, according to the laws dictated to it, the values of numerical coefficients which it is then to distribute appropriately on the columns which represent the variables, it follows that the interpretation of formulae and of results is beyond its province, unless indeed this very interpretation be itself susceptible to expression by means of the symbols which the machine employs. Thus, although it is not itself the being that reflects, it may yet be considered as the being which executes the conceptions of intelligence."

Richard Y Chappell🔸Jan 92

I really want to stress the difference between saying something will (definitely) happen versus saying there's a credible chance (>1%) that it will happen. They're very different claims!

Lovelace and Menabrea probably should have regarded their time as disproportionately likely (compared to arbitrary decades) to see continued rapid progress. That's compatible with thinking it overwhelmingly likely (~99%) that they'd soon hit a hurdle.

As a heuristic, ask: if one were, at the end of history, to plot the 100 (or even just the 50) greatest breakthrough periods in computer science prior to ASI (of course there could always be more breakthroughs after that), should we expect our current period to make the cut? I think it would be incredible to deny it.

fergusqJan 1011

It is true that Lovelace and Menabrea should have assumed a credible chance of rapid progress. Who knows, maybe if they had had the right resources and people, we could have had computers much earlier than we ultimately had.

But when talking about ASI, we are not just talking about rapid progress, we are talking about the most extreme progress imaginable. Extraordinary claims require extraordinary evidence, and so forth. We do not know what breakthroughs ASI requires, nor do we know how far we are from it.

It all comes down to the question of whether the current tech is relevant for ASI or not. In my estimation, it is not – something else entirely is required. The probability for us discovering that something else just now is low.

David Mathers🔸Jan 122

"It all comes down to the question of whether the current tech is relevant for ASI or not. In my estimation, it is not – something else entirely is required. The probability for us discovering that something else just now is low."

I think Richard's idea is that you shouldn't have *super-high* confidence in your estimation here, but should put some non-negligible credence on the idea that it is wrong, and current progress is relevant. Why be close to certainty about a question that you probably think is hard and that other smart people disagree about being the reasoning? And once you open yourself up to a small chance that current progress is in fact relevant, it then becomes at least somewhat unclear that you should be way below 1% in the chanc of AGI in the relatively near term or in current safety work being relevant. (Not necessarily endorsing the line of thought in this paragraph myself.)

fergusqJan 1212

Thanks for your answer.

other smart people disagree

I'm generally against this sort of appeal to authority. While I'm open to hear the arguments of smart people, we should evaluate those arguments themselves and not the people giving them. So far, I've heard no argument that would change my opinion on this matter.

You seem to make a similar argument in your other comment:

[...] But when I ask myself what evidence I have for "there are not >20 similar sized jumps before AGI" I come up short. I don't necessarily think the burden of proof here is actually on people arguing that the chance of AGI in the next decade is non-negligible though: it's a goal of some serious people within the relevant science [...]

Again, I think just because there are serious people with this goal, that doesn't mean it is a justified belief. As you say yourself, you can't find evidence for your view. Extraordinary claims require extraordinary evidence, and the burden of proof lays on the person making the claim. That some serious/smart people believe in it is not enough evidence.

1%

I want to stress that even if we gave AGI/ASI a 1% probability in the next decade, my other point that AI safety work is not tractable and not neglected still stands, and it is thus not a good intervention that people in EA should focus on.

David Mathers🔸Jan 122

And it's not so much that I think I have zero evidence: I keep up with progress in AI to some degrees, I have some idea of what the remaining gaps are to general intelligence, I've seen the speed at which capabilities have improved in recent years etc. It's that how to evaluate that evidence is not obvious, and so simply presenting a skeptic with it probably won't move them, especially as the skeptic-in this case you-probably already has most of the evidence I have anyway. If it was just some random person who had never heard of AI asking why I thought the chance of mildly-over-human level AI in 10 years was not far under 1%, there are things I could say. It's just you already know those things, probably, so there's not much point in my saying them to you.

David Mathers🔸Jan 122

"I'm generally against this sort of appeal to authority. While I'm open to hear the arguments of smart people, we should evaluate those arguments themselves and not the people giving them. So far, I've heard no argument that would change my opinion on this matter."

I think this attitude is just a mistake if your goal is to form the most accurate credences you can. Obviously, it is always good practice to ask people for their arguments rather than only taking what they say on trust. But your evaluation of other people's arguments is fallible, and you know it is fallible. So you should distribute some of your confidence to cases where your personal evaluations of credible people's arguments are just wrong. This isn't the same as failing to question purported experts. I can question an expert, and even disagree with them overall, and still move my credences somewhat towards theirs. (I'm much more confident about this general claim than I am about what credences in ASI in the next decade are or aren't reasonable, or how much credibility anyone saying ASI is coming in the next decade should get.)

Yarrow Bouchard 🔸Jan 22*10

There is a kernel of truth in this; some version of this argument is a good argument. But the devil is in the details.

If you’re not a scientist or a person with relevant expertise and you feel inclined to disagree with the ~97-99.9% of climate scientists who think anthropogenic climate change is happening, you better adopt a boatload of epistemic humility. In practice, many non-scientists or non-experts disagree with the scientific consensus. I’m not aware of one example of such a person adopting the appropriate level of epistemic humility.

On the other hand, people invoke this same epistemic humility argument when it comes to fringe stuff like UFOs, ESP, and conspiracy theories. A lot of smart people believe in fringe stuff and, boy howdy, have they ever spent a lot of time researching it and thinking about it.

I think the difference between these two examples of epistemic humility arguments is some form of Carl Sagan-esque scientific skepticism.^[1]

The difference also has to do with the idea of expertise, the idea of science (both the scientific method and scientific institutions or communities), and the idea of academia. This goes beyond the general concept of disagreement, or the general concept of some number of people holding an opinion. Climate scientists are an expert community. People who believe in alien UFOs are not an expert community. There is something to be said here about knowledge production and how it happens, which is deeper than just some group of people believing something.

There’s lots to be said here about epistemology, and about science or scientific skepticism as an approach to knowledge versus less reliable approaches. For instance, around unfalsifiable claims. The claim that there is a 2% chance that AGI or superintelligence will be developed within the next decade is unfalsifiable. Even if the next decade plays out exactly how someone who assigns a 0.002% (or 1 in 50,000) probability to AGI would expect, a person who assigned a 2% or 20% or even 60% probability could still think they were right to do so. It’s not exactly a scientific question. It’s also outside the scope of the sort of forecasting questions that the forecasting research literature can support. This is strongly disanalogous to epistemic deference toward scientific experts.^[2]

^{^}
In my view, it’s not a coincidence that a hugely disproportionate number of people who believe in a high level of existential risk from near-term AGI a) are strongly skeptical of mainstream, institutional science and strongly advocate fringe, dubious, borderline, or minority scientific views, b) believe in fringe stuff like aliens, conspiracy theories, or ESP, or c) have joined a cult.
The rate of cult membership among people who believe in a high level of existential risk from near-term AGI has got to be something like 10,000x higher than the general population. The LessWrong community in the Bay Area is, I think, around 1,000 people (if not less) and it has started half a dozen cults in 17 years. Former members have committed as many murders in that timeframe. Small towns with much larger populations and much longer histories typically have never had a single cult (and very rarely have cults that murder people).
Clearly, the argument for epistemic humility that applies to say, climate scientists, doesn’t apply cleanly to this community. If there is strong independent evidence of a community frequently holding extreme, false beliefs, an appeal to their overall rationality doesn’t work. (Also, LessWrong’s primary thought leader has confidently made incorrect predictions about the timing of AGI and about the imminent existential dangers from advanced technology in the past — another reason for skepticism.)
The EA community is also directly implicated. One of the aforementioned cults, Leverage Research, ran the Centre for Effective Altruism for a few years, and organized the first EA conferences.
^{^}
The number of people who believe in Hinduism is vastly larger than the number who believe in near-term AGI: 15% of the world population, 1.2 billion people. Far more university professors, scientists, philosophers, and so on believe in Hinduism than near-term AGI. Should you assign more than a 1% chance to the probability that Hinduism is a correct theory of metaphysics, cosmology, consciousness, and morality? The stakes are also very large in this case.

David Mathers🔸Jan 222

Sure, but I I wasn't really thinking of people on LessWrong, but rather of the fact that at least some relevant experts outside of the LW milieu seem worried and/or think that AGI is not THAT far. I.e. Hinton, Bengio, Stuart Russell (for danger) and even people often quoted as skeptical experts* like Gary Marcus or Yann LeCunn often give back of the envelope timelines of 20 years, which is not actually THAT long. Furthermore I actually do think the predictions of relatively near term AGI by Anthropic and the fact that DeepMind and OpenAI have building AGI as a goal to carry some weight here. Please don't misunderstand me, I am not saying that these orgs are trustworthy exactly: I expect them to lie in their own interests to some degree, including about how fast their models will get better, and also to genuinely overestimate how fast they will make progress. Nonetheless they are somewhat credible in the sense that a) they are at the absolute cutting edge of the science here and have made some striking advancements, and b) they have some incentive not to overpromise so much that no one ever believes anything they say ever again, and c) they are convincing enough to outsiders with money that they keep throwing large sums of money at them, which suggests those outsiders at least expect reasonably rapid advancement, whether or not they expect AGI itself, and which is also evidence that these are serious organizations.

I'd also say that near-term AGI is somewhat disanalogous to Hinduism, ESP, Christianity, crystal healing etc. in that all these things are actually in conflict with a basic scientific worldview fairly directly, in that they describe things that would plausibly violate known laws of physics, or are clearly supernatural in a fairly straightforward sense. That's not true of near-term AGI.

Having said that I certainly agree that it is not completely obvious that there is enough real expertise behind predictions of near-term AGI to treat them with deference. My personal judgment is that there is, but once we get away from obvious edge cases like textbook hard science on the one hand and "experts" in plainly supernatural things on the other, it gets hard to tell how much deference people deserve.

There's also an issue of priors here of course: I don't think "AGI will be developed in the next 100 years" is an "extraordinary" claim in the same sense as supernatural claims, or even just something unlikely but possible like "Scotland will win the next football world cup". We know it is possible in principle, and that technology can advance quickly over a timespan of decades-just compare where flight was in 1900 to 1960-and that trillions of dollars are going to be spent advancing AI in the near term, and while Moore's law is breaking down, we haven't actually hit theoretical in principle limits on how good chips can be, and that more money and labour is currently going into making advanced AI than ever before. If we say there's a 25% chance of it being developed in the next hundred years, an even divide per decade of that would say 2.5% chance of it arriving next decade. Even if we cut that 5x for the next decade, that would give a 0.5% chance which I think is worth worrying about given how dramatic the consequence of AGI would be. (Of course, you personally have lots of arguments against it being near, but I am just talking about what it's reasonable to expect from broad features of the current situation before we get into the details.) But of course, forecasting technology 100 years out is extraordinarily hard. In general because forecasting is hard beyond the next few years, so I guess maybe 25% is way too high (although the uncertainty cuts both ways.)

I get that the worry here is that people can just say any possible technology might happen soon, so if it was very consequential we should worry about it now. But my response is just that if it's a technology that several of the world's largest or, fastest growing or most innovative companies claim or hint to be building it, and a Nobel winning scientist in the area in question agree that they very well might be, probably this is right, whereas if no one serious is working towards a technology, a higher degree of skepticism is probably warranted. (Presumption could be overcome if almost all scientists with relevant expertise think that Bengio and Hinton are complete cranks on this, but I haven't seen strong evidence of that.)

*In fairness, my guess is that they are actually more bullish on AGI than many people in machine learning, but that is only a guess.

fergusqJan 227

I believe that you are underestimating just how strong incentives OpenAI etc. have to lie about AGI. For them, it is an existential question, as there is a real chance of them becoming bankrupt if they do not deliver. This means that we should expect them to always say that AGI is close regardless of their true beliefs, because no CEO is ever going to make public claims that could risk the whole company.

Even in case of companies such as Microsoft and Google that would not fail if there is no AGI, saying out loud that there won't be AGI would possibly crash their stocks. They will likely maintain the illusion as long as they can.

I will also push back a little on relying too much on views of individual researchers such as Hinton or Bengio, which would be much more credible if they dared to present any evidence for their claims. See, for instance, this report from October 2025 by Bengio, Hinton, and others. It fails to provide any good evidence for progress in capabilities required for general intelligence, mainly focusing on how AI systems are better at some benchmarks, despite those benchmarks not really being related to AGI in any way. Instead, the report admits that while "AI systems continue to improve on most standardised evaluations," they "show lower success rates on more realistic workplace tasks", hinting that even the benchmark progress is fluff to at least some degree.

If even their own report doesn't find any progress towards AGI, what is the basis for their short timelines? I think we are right to require more evidence before using their opinion as a basis for EA interventions or funding.

David Mathers🔸Jan 238

I don't know if/how much EA money should go to AI safety either. EAs are trying to find the single best thing, and it's very hard to know what that is, and many worthwhile things will fail that bar. Maybe David Thorstad is right, and small X-risk reductions have relatively low value because another X-risk will get us in the next few centuries anyway*. What I do think is that society as a whole spending some resources caring about the risk of AGI arriving in the next ten years is likely optimal, and that it's not more silly to do so than to do many other obviously good things. I don't actually give to AI safety myself, and I only work on AI-related stuff-forecasting etc., I'm not a techy person-because it's what people are prepared to pay more for, and people being prepared to pay me to work on near-termist causes is less common, though it does happen. I myself give to animal welfare, not AI safety.

If you really believe that everyone putting money into Open AI etc. will only see returns if they achieve AGI that seems to me to be a point in favour of "there is a non-negligible risk of AGI in the next 10 years". I don't believe that, but if I did I that alone would significantly raise the chance I give to AGI within the next 10 years. But yes, they have some incentive to lie here, or to lie to themselves, obviously. Nonetheless, I don't think that means their opinion should get zero weight. For it to actually have been some amazing strategy for them to talk up the chances of AGI, *because it attracted cash* you'd have to believe they can fool outsiders with serious money on the line, and that this will be profitable for them in the long term, rather than crashing and burning when AGI does not arrive. I don't think that is wildly unlikely or anything, indeed, I think it is somewhat plausible-though my guess is Anthropic in particular believe their own hype. But it does require a fairly high amount of foolishness on behalf of other quite serious actors. I'm much more sure of "raising large amounts of money for stuff that obviously won't work is relatively hard" than I am of any argument about how far we are from AGI that looks at the direct evidence, since the latter sort of arguments are very hard to evaluate. I'd feel very differently here if we were arguing about 50% chance of AI in ten years, or even 10% chance. It's common for people to invest in things that probably won't work but have a high pay-off if they do. But what your saying is that Richard is wrong for thinking there is a non-negligible risk, because the chance is significantly under 1%. I doubt there are many takers for like "1 in 1000" chance of a big pay-off.

It is of course not THAT unlikely that they are fooling the serious money: serious investors make mistakes and even the stock market does. Nonetheless, being able to attract serious investment that is genuinely only investing because they think you'll achieve X, whilst simultaneously being under huge media attention and scrutiny is a credible signal that you'll eventually achieve X.

I don't think the argument I've just given is all that definitive, because they have other incentives to hype, like attracting top researchers (who I think it is probably eaiser to fool, because if they are fooled about AGI working at a big lab was probably good for them anyway; quite different from what happens to people funding the labs who are fooled who just lose money.) So it's possible that the people pouring serious money in don't take any of the AGI stuff seriously. Nonetheless, I trust "serious organisations with technical prowess seem to be trying to do this" as a signal to take something minimally seriously, even if they have some incentive to lie.

Similarly, if you really think Microsoft and Google have taken decisions that will crash their stock if AGI doesn't arrive, I think a similar argument applies: Are you really sure you're better at evaluating whether there is a non-negligible chance that a tech will be achieved by the tech industry than Microsoft and Google? Eventually, if AGI is not arriving from the huge training runs that are being planned in the near future, people will notice, and Microsoft and Google don't want to lose money in 5 years from now either. Again, it's not THAT implausible that they are mistaken, mistakes happen. But you aren't arguing that there probably won't be AGI in ten years-a claim I actually strongly agree with!-but rather that Richard was way out in saying that it's a tail risk we should take seriously given how important it would be.

Slower progress on one thing than another does not mean no progress on the slower thing.

"despite those benchmarks not really being related to AGI in any way." This is your judgment, but clearly it is not the judgment of some of the world's leading scientific experts in the area. (Though there may well be other experts who agree with you.)

*Actually Thorstad's opinion is more complicated than that, he says that this is true conditional on X-risk currently being non-negligible, but he doesn't himself endorse the view that it is currently non-negligible as far as I can tell.

fergusqJan 23*11

I'm not an economist, but the general consensus among the economists I have spoken to is that different kinds of bubbles (such as the dot-com bubble) are commonplace and natural, and even large companies make stupid mistakes that affect their stock hugely.

Anecdotally, there are a lot of small companies that are clearly overvalued, such as the Swedish startup Lovable, which recently reached the valuation of $6.6 billion. It is insane for a startup whose only product is a wrapper for another company's LLM in a space where every AI lab has their own coding tool. If people are willing to invest money in that, I'd assume they'd invest even more in larger companies that actually do have a product, even if it is overhyped. Overvaluation leads to a cycle where the company must keep overpromising to avoid a market correction, which in turn leads to even more overvaluation.

Again, I'm not an economist so all that is said with the caveat that I might misunderstand some market mechanisms. But what I am is an AI researcher. I would be more than willing to believe that the investors are right, if they provided evidence. But my experience talking with people who are investing seriously in AI is that they don't understand the tech at all. At my day-job, talking to the management that wants AI to do things it cannot do and allocating resources for hopeless projects. At investor-sponsored events, where people are basically handing off money to any company that seems reasonable enough (e.g., employing qualified people, having experience in doing business) regardless of the AI project they are proposing. I know a person who got bought for millions due to having a good-sounding idea even though they have no product, no clients and no research backing up their idea. Some people are just happy to get rid of their money.

There do exist reasonable investors too, but there is not too much they can do about the valuations of private companies. And even though in theory markets allow things such as shorting that can correct overvaluations, these instruments are very risky and I think these people are more likely to just stay away of AI companies as a whole than attempt to short a bubble when they don't know when it crashes.

"despite those benchmarks not really being related to AGI in any way." This is your judgment, but clearly it is not the judgment of some of the world's leading scientific experts in the area. (Though there may well be other experts who agree with you.)

Are you sure? Even in that report, they carefully avoid using the term "AGI" and instead refer to "general-purpose AI systems", a legal term used in the EU AI Act and generally thought to refer to current LLMs. Although both terms contain the word "general", they mean very different things, which is something that the authors mention as well^[1].

They also quite straight say that they do not have any strong evidence. According to them, "[t]he pace and unpredictability of advancements in general-purpose AI pose an ‘evidence dilemma’ for policymakers." (This PDF, page 14). They continue that due to the "rapid an unexpected" advancements, policymakers will have to make decisions "without having a large body of scientific evidence available". They admit that "pre-emptive risk mitigation measures based on limited evidence might turn out to be ineffective or unnecessary." Still, they claim that "waiting for stronger evidence of impending risk could leave society unprepared or even make mitigation impossible".

This type of text using wordings such as "impending risk" even though they do not have strong evidence adds an artificial layer of urgency to the issue that is not based on facts.

Even their "weak evidence" is not very good. They list several risks, but of us the most important is the risk they call "loss of control", under which they put X-risks. They reference several highly speculative papers on this issue that on a cursory reading make several mistakes and have generally a quite low quality. Going through them would be worth of a whole article, but as an example, the paper by Dung (2024) argues that "given that no one has identified an important capacity which does not improve with scaling, the currently best supported hypothesis is arguably that further scaling will bring about AGI." That is just wrong: Based on a recent survey, 76% of AI experts believe that scaling is not enough. The most important missing capability is continual learning, which is a feature current algorithms simply lack that cannot be scaled. Dung does mention some objections to his argument, but omits the most obvious ones that he should have been aware of. This is not the kind of argument we should take even as weak evidence.

Going back to the International AI Safety Report, it seems that Bengio et al. know that their evidence is not enough, and they have carefully written their report so that it doesn't claim anything that is technically incorrect, but overall gives a much more alarmist tone than justified by their findings. If they really had better evidence, they would say so more clearly.

^{^}
"General-purpose AI is not to be confused with ‘artificial general intelligence’ (AGI). The term AGI lacks a universal definition but is typically used to refer to a potential future AI that equals or surpasses human performance on all or almost all cognitive tasks. By contrast, several of today’s AI models and systems already meet the criteria for counting as general-purpose AI as defined in this report." (This PDF, page 27).

David Mathers🔸Jan 234

I guess I'm just slightly confused about what economists actually think here since I'd always thought they took the idea that markets and investors were mostly quite efficient most of the time fairly seriously.

Yarrow Bouchard 🔸Jan 26*3

I don't know much about this topic myself, but my understanding is that market efficiency is less about having the objectively correct view (or making the objectively right decision) and more about the difficulty of any individual investor making investments that systematically outperform the market. (An explainer page here helps clarify the concept). So, the concept, I think, is not that the market is always right, but when the market is wrong (e.g. that generative AI is a great investment), you're probably wrong too. Or, more precisely, that you're unlikely to be systematically right more often then the market is right, and systematically wrong less often than the market is wrong.

As I understand it, there are differing views among economists on how efficient the market really is. And there is the somewhat paradoxical fact that people disagreeing with the market is part of what makes it as efficient as it is in the first place. For instance, some people worry that the rise of passive investing (e.g. via Vanguard ETFs) will make the market less efficient, since more people are just deferring to the market to make all the calls, and not trying to make calls themselves. If nobody ever tried to beat the market, then the market would become completely inefficient.

There is an analogy here to forecasting, with regard to epistemic deference to other forecasters versus herding that throws out outlier data and makes the aggregate forecast less accurate. If all forecasters just circularly updated until all their individual views were the aggregate view, surely that would be a big mistake. Right?

Do you have a specific forecast for AGI, e.g. a median year or a certain probability within a certain timeframe?

If so, I'd be curious to know how important AI investment is to that forecast. How much would your forecast change if it turned out the AI industry is in a bubble and the bubble popped, and the valuations of AI-related companies dropped significantly? (Rather than trying to specifically operationalize "bubble", we could just defer the definition of bubble to credible journalists.)

There are a few different reasons you've cited for credence in near-term AGI — investment in AI companies, the beliefs of certain AI industry leaders (e.g. Sam Altman), the beliefs of certain AI researchers (e.g. Geoffrey Hinton), etc. — and I wonder how significant each of them is. I think each of these different considerations could be spun out into its own lengthy discussion.

I wrote a draft of a comment that addresses several different topics you raised, topic-by-topic, but it's far too long (2000 words) and I'll have to put in a lot of work if I want to revise it down to a normal comment length. There are multiple different rabbit holes to go down, like Sam Altman's history of lying (which is why the OpenAI Board fired him) or Geoffrey Hinton's belief that LLMs have near-human-level consciousness.

I feel like going deeper into each individual reason for credence in near-term AGI and figuring out how significant each one is for your overall forecast could be a really interesting discussion. The EA Forum has a little-used feature called Dialogues that could be well-suited for this.

fergusqJan 243

I guess markets are efficient most of the time, but stock market bubbles do exist and are common even, which goes against the efficient market hypothesis. I believe it is a debated topic in economics and I don't know what the current consensus regarding it is.

My own experience points to the direction that there is an AI bubble, as cases like Lovable indicate that investors are overvaluing companies. I cannot explain their valuation, other than that investors bet on things they do not understand. As I mentioned, anecdotally this seems to often be the case.

David Mathers🔸Jan 232

The report has many authors, some of whom maybe much less concerned or think the whole thing is silly. I never claimed that Bengio and Hinton's views were a consensus, and in any case, I was citing their views as evidence for taking the idea that AGI may arrive soon seriously, not their views on how risky AI is. I'm pretty sure I've seen them give relatively short time-lines when speaking individually, but I guess I could be misremembering. For what it's worth Yann LeCunn seems to think 10 years is about right, and Gary Marcus seems to think a guess of 10-20 years is reasonable: https://helentoner.substack.com/p/long-timelines-to-advanced-ai-have

[comment deleted]Jan 122

Deleted by David Mathers🔸, 01/12/2026

Reason: Posted in response to the wrong comment in the thread

David Mathers🔸Jan 94

It seems like if you find it incredible to deny and he doesn't, it's very hard to make further progress :( I'm on your side about the chance being over 1% in the next decade, I think, but I don't know how I'd prove it to a skeptic, except to gesture and say that capabilities have improved loads in a short time, and it doesn't seem like the are >20 similar sized jumps before AGI. But when I ask myself what evidence I have for "there are not >20 similar sized jumps before AGI" I come up short. I don't necessarily think the burden of proof here is actually on people arguing that the chance of AGI in the next decade is non-negligible though: it's a goal of some serious people within the relevant science, and they are not making zero progress, and some identifiable quantifiable individual capabilities have improved very fast. Plus the extreme difficulty of forecasting technological breakthroughs over more than a couple of years cuts both ways.

Vasco Grilo🔸Jan 87

Thanks for crossposting this, Richard!

The stakes surrounding ASI are extremely high, to the point that we can’t be confident that humanity would long survive this development

I guess the risk of human extinction over the next 10 years is 10^-7, and I am not aware of any quantitative empirical modelling suggesting otherwise. I do not think that is high enough to justify your conclusion that "AI risk warrants urgent further investigation and precautionary measures", even accounting for longterm effects. I very much agree there should be some investigation and precautionary measures, but I do not consider this "urgent" at the margin.