All of David Mathers🔸's Comments + Replies

The report has many authors, some of whom may be much less concerned, or think the whole thing is silly. I never claimed that Bengio and Hinton's views were a consensus, and in any case, I was citing their views as evidence for taking seriously the idea that AGI may arrive soon, not their views on how risky AI is. I'm pretty sure I've seen them give relatively short timelines when speaking individually, but I guess I could be misremembering. For what it's worth, Yann LeCun seems to think 10 years is about right, and Gary Marcus seems to think a guess of 10-20 years is reasonable: https://helentoner.substack.com/p/long-timelines-to-advanced-ai-have

I guess I'm just slightly confused about what economists actually think here, since I'd always thought they took fairly seriously the idea that markets and investors are mostly quite efficient most of the time.

3
Yarrow Bouchard 🔸
I don't know much about this topic myself, but my understanding is that market efficiency is less about having the objectively correct view (or making the objectively right decision) and more about the difficulty of any individual investor making investments that systematically outperform the market. (An explainer page here helps clarify the concept). So, the concept, I think, is not that the market is always right, but when the market is wrong (e.g. that generative AI is a great investment), you're probably wrong too. Or, more precisely, that you're unlikely to be systematically right more often than the market is right, and systematically wrong less often than the market is wrong.

As I understand it, there are differing views among economists on how efficient the market really is. And there is the somewhat paradoxical fact that people disagreeing with the market is part of what makes it as efficient as it is in the first place. For instance, some people worry that the rise of passive investing (e.g. via Vanguard ETFs) will make the market less efficient, since more people are just deferring to the market to make all the calls, and not trying to make calls themselves. If nobody ever tried to beat the market, then the market would become completely inefficient.

There is an analogy here to forecasting, with regard to epistemic deference to other forecasters versus herding that throws out outlier data and makes the aggregate forecast less accurate. If all forecasters just circularly updated until all their individual views were the aggregate view, surely that would be a big mistake. Right?

Do you have a specific forecast for AGI, e.g. a median year or a certain probability within a certain timeframe? If so, I'd be curious to know how important AI investment is to that forecast. How much would your forecast change if it turned out the AI industry is in a bubble and the bubble popped, and the valuations of AI-related companies dropped significantly? (Rather than try
3
fergusq
I guess markets are efficient most of the time, but stock market bubbles do exist and are even common, which goes against the efficient market hypothesis. I believe it is a debated topic in economics, and I don't know what the current consensus on it is. My own experience points in the direction of there being an AI bubble, as cases like Lovable indicate that investors are overvaluing companies. I cannot explain their valuation, other than that investors bet on things they do not understand. As I mentioned, anecdotally this seems to often be the case.

I don't know if/how much EA money should go to AI safety either. EAs are trying to find the single best thing, and it's very hard to know what that is, and many worthwhile things will fail that bar. Maybe David Thorstad is right, and small X-risk reductions have relatively low value because another X-risk will get us in the next few centuries anyway*. What I do think is that society as a whole spending some resources caring about the risk of AGI arriving in the next ten years is likely optimal, and that it's not more silly to do so than to do many other ob... (read more)

I'm not an economist, but the general consensus among the economists I have spoken to is that different kinds of bubbles (such as the dot-com bubble) are commonplace and natural, and even large companies make stupid mistakes that affect their stock hugely.

Anecdotally, there are a lot of small companies that are clearly overvalued, such as the Swedish startup Lovable, which recently reached a valuation of $6.6 billion. It is insane for a startup whose only product is a wrapper for another company's LLM in a space where every AI lab has their own coding to... (read more)

METR has an official internal view on what time horizons correspond to "takeover not ruled out"? 

6
Thomas Kwa🔹
See the GPT-5 report. "Working lower bound" is maybe too strong; maybe it's more accurate to describe it as an initial guess at a warning threshold for rogue replication and 10x uplift (if we can even measure time horizons that long). I don't know what the exact reasoning behind 40 hours was, but one fact is that humans can't really start viable companies using plans that only take about a week of work. IMO, if AIs could do the equivalent with only a 40-human-hour time horizon and continuously evade detection, they'd need to use their own advantages and have made up for many current disadvantages relative to humans (like being bad at adversarial and multi-agent settings).

Yeah, I am inclined to agree (for what my opinion is worth, which on this topic is probably not that much) that there will be many things AIs can't do even once they have a METR 80% time horizon of, say, 2 days. But I am less sure of that than I am of the meta-level point about this being an important crux.

Sure, but I wasn't really thinking of people on LessWrong, but rather of the fact that at least some relevant experts outside of the LW milieu seem worried and/or think that AGI is not THAT far, i.e. Hinton, Bengio, Stuart Russell (for danger), and even people often quoted as skeptical experts* like Gary Marcus or Yann LeCun often give back-of-the-envelope timelines of 20 years, which is not actually THAT long. Furthermore, I actually do think the predictions of relatively near-term AGI by Anthropic and the fact that DeepMind and OpenAI have building AGI ... (read more)

7
fergusq
I believe that you are underestimating just how strong the incentives OpenAI etc. have to lie about AGI. For them, it is an existential question, as there is a real chance of them going bankrupt if they do not deliver. This means that we should expect them to always say that AGI is close regardless of their true beliefs, because no CEO is ever going to make public claims that could risk the whole company. Even in the case of companies such as Microsoft and Google, which would not fail if there is no AGI, saying out loud that there won't be AGI would possibly crash their stocks. They will likely maintain the illusion as long as they can.

I will also push back a little on relying too much on the views of individual researchers such as Hinton or Bengio, which would be much more credible if they dared to present any evidence for their claims. See, for instance, this report from October 2025 by Bengio, Hinton, and others. It fails to provide any good evidence for progress in the capabilities required for general intelligence, mainly focusing on how AI systems are better at some benchmarks, despite those benchmarks not really being related to AGI in any way. Instead, the report admits that while "AI systems continue to improve on most standardised evaluations," they "show lower success rates on more realistic workplace tasks", hinting that even the benchmark progress is fluff to at least some degree. If even their own report doesn't find any progress towards AGI, what is the basis for their short timelines? I think we are right to require more evidence before using their opinion as a basis for EA interventions or funding.

I don't think this is sufficient to explain EA disinterest, because there are also neartermist EAs who are skeptical about near-term AGI, or just don't incorporate it into their assessment of cause areas and interventions. 

Somewhat surprised to hear that people can successfully pull that off. 

It seems to me like the really important thing is interpreting what "METR 80% time horizon goes to a year", or whatever endpoint you have in mind, actually means. It's important if that takes longer than AI 2027 predicts, obviously, but it seems more crux-y to me whether getting to that point means transformative AI is near or not, since the difference between "3 years and 7 years", say, while important, seems less important to me than the difference between "definitely in 7 years" and "who knows, could still be 20+ years away".

2
Vasco Grilo🔸
Agreed, David. The post Where’s my ten minute AGI? by Anson Ho discusses why METR's task time horizon does not translate into as much automation as one may naively expect.

I think part of the issue here probably is that EAs mostly don't think biodiversity is good in itself, and instead believe that only humans and animals experiencing well-being is good, and that the impact on well-being of promoting biodiversity is complex, uncertain, and probably varies a lot with how and where biodiversity is being promoted. It's hard to try to direct biodiversity funding if you don't really clearly agree with raising biodiversity as a goal.

0
Vasco Grilo🔸
Agreed, David. Nitpick: I would say humans, animals, microorganisms, and digital beings.
1
David Goodman
I agree with some of the comments below -- I think most EAs support things like lab-grown meat for animal welfare reasons. If there's a strong argument (which I think there is) for lab-grown meat ALSO being the best possible thing you could do for biodiversity, and making that argument to the right people could literally 10x the amount of money going to lab-grown meat R&D per year, then I think we should be making that argument. If you're consequentialist about it, the motives of the GBF are irrelevant. What matters is that they could massively fund lab-grown meat, and nobody is arguing to them that it's in their interest to do so.

And about huw's point below (i.e. many lobbyists make arguments that don't align with their true motivations), I think that's how lobbying usually works. It's pretty easy to imagine EAs going to a COP and making the 100% true and good-faith argument that lab-grown meat would be more effective for protecting biodiversity than, say, "protecting" on paper a random 150-square-km patch of water in the South Pacific. Those EAs might not care about biodiversity themselves, but if they succeeded in getting 0.1% of the budget dedicated to lab-grown meat R&D, and thus DOUBLING annual investment in the sector, that would also be awesome for animal welfare.
9
NickLaing
I agree that most EAs probably don't think biodiversity is good in and of itself. I'm in the minority that do - I'm not just a hedonistic utilitarian. Also, to reassure people, it's OK to be an EA and not believe that the only thing that matters in this universe is how much well-being there is. I think the OP has a very good point, and with this much money moving around, biodiversity funding might well be an interesting area for some people to look into.
6
Thomas Kwa🔹
It's plausible to me that biodiversity is valuable, but with AGI on the horizon it seems a lot cheaper in expectation to do more out-there interventions, like influencing AI companies to care about biodiversity (alongside wild animal welfare), recording the DNA of undiscovered rainforest species about to go extinct, and buying the cheapest land possible (middle of Siberia or Australian desert, not productive farmland). Then, when the technology is available in a few decades and we're better at constructing stable ecosystems de novo, we can terraform the deserts into highly biodiverse nature preserves. Another advantage of this is that we'll know more about animal welfare; as it stands now, the sign of habitat preservation is pretty unclear.

That’s not strictly true: a lot of animal orgs are farmer-facing and will speak to a motivation the farmer cares about (yield) while they secretly harbour another one (welfare of animals). I’ve heard that some orgs go to great lengths to hide their true intentions and sometimes even take money for their services just to appear as if they have a non-suspicious motivation.

I am actually curious why a similar approach hasn’t been tried in biodiversity—if it was just EAs yucking biodiversity (which I have seen, same as you), that’d be really disappointing.

Oh, OK, I agree: if the number of deer is the same afterwards as it would be counterfactually, it seems plausibly net positive, yes.

Also, it's certainly not common sense that it is always better to have fewer beings with higher welfare. It's not common sense that a world with 10 incredibly happy people is better than one with a billion very slightly less happy people.

And not every theory that avoids the repugnant conclusion delivers this result, either. 

1
Tristan Katz
No - and I wasn't meaning to say that fewer beings with higher welfare is always better. Like I said, I don't think the common-sense view will be philosophically satisfying. But a second common-sense view is: if there are some beings whose existence depends on harming others, then them not coming into existence is preferable. I expect you can find some counterexample to that, but I think most people will believe this in most situations (and certainly those involving parasites).

I agree, it is unclear whether welfare is actually positive. 

Those are fair points in themselves, but I don't think "fewer deer is fine, so long as they have a higher standard of living" has anything like the same commonsense standing as "we should protect people from malaria with insecticide even if the insecticide hurts insects".

And it's not clear to me that assuming fewer deer is fine in itself, even if their lives are good, is avoiding taking a stance on the intractable philosophical debate, rather than just implicitly taking one side of it.

3
Tristan Katz
Oh, I see, I'd misunderstood your point. I thought you were concerned about lowering the number of warble flies. This policy wouldn't lower the number of deer - it would maintain the population at the same level. This is for the sake of avoiding unwanted ecological effects. If you think it's better to have more deer, fair enough - but then you've got to weigh that against the very uncertain ecological consequences of having more deer (probably something like what happened in Yellowstone National Park: fewer young trees, more open fields, fewer animals that depend on those trees, more erosion, etc.).

"A potentially lower-risk example might be the warble fly (Hypoderma), which burrows under the skin of cattle and deer, causing great discomfort, yet rarely kills its host. The warble fly is small in biomass, host-specific (so doesn't greatly affect other species), and has more limited interactions beyond its host-parasite relationship. Although it does reduce the grazing and reproductive activity of hosts, these effects are comparatively minor and could be offset with non-invasive fertility control"

Remember that it's not uncontroversial that it is prefera... (read more)

1
Tristan Katz
You're right that this is philosophically controversial. I find the debate interesting, and don't mean to dismiss it - but I also find it incredibly difficult. The challenge I see is whether such philosophical debates, ones that are totally unresolved, should inform our practical thinking and policy recommendations. Because within ordinary, day-to-day thinking, the idea that "it's preferable to have more beings with lower welfare" is controversial. If you were committed to this view, and thought insects have positive welfare (I agree with @Jim Buhler that this isn't clear), then it seems you would also have to say that the Against Malaria Foundation is doing overall bad work. Maybe you're willing to bite that bullet - but my own inclination is to assume a more common-sense view, even if philosophically incoherent, until there is something closer to a consensus on this topic.
3
Jim Buhler
...if you think welfare is net positive either way, yes. This seems like a tough case to make. I see how one can opt for agnosticism over believing net negative but I doubt there exists anything remotely close to a good case that WAW currently is net positive (and not just highly uncertain).

And it's not so much that I think I have zero evidence: I keep up with progress in AI to some degree, I have some idea of what the remaining gaps are to general intelligence, I've seen the speed at which capabilities have improved in recent years, etc. It's that how to evaluate that evidence is not obvious, and so simply presenting a skeptic with it probably won't move them, especially as the skeptic (in this case you) probably already has most of the evidence I have anyway. If it was just some random person who had never heard of AI asking why I thought the... (read more)

Yeah, I agree that in some sense saying "we should instantly reject a theory that recommends WD" doesn't combine super-well with belief in classical U, for the reasons you give. That's compatible with classical U's problems with WD being less bad than NU's problems with it, is all I'm saying.

"I'm generally against this sort of appeal to authority. While I'm open to hear the arguments of smart people, we should evaluate those arguments themselves and not the people giving them. So far, I've heard no argument that would change my opinion on this matter."

I think this attitude is just a mistake if your goal is to form the most accurate credences you can. Obviously, it is always good practice to ask people for their arguments rather than only taking what they say on trust. But your evaluation of other people's arguments is fallible, and you know it... (read more)

There is a kernel of truth in this; some version of this argument is a good argument. But the devil is in the details. 

If you’re not a scientist or a person with relevant expertise and you feel inclined to disagree with the ~97-99.9% of climate scientists who think anthropogenic climate change is happening, you better adopt a boatload of epistemic humility. In practice, many non-scientists or non-experts disagree with the scientific consensus. I’m not aware of one example of such a person adopting the appropriate level of epistemic humility.

On th... (read more)

"It all comes down to the question of whether the current tech is relevant for ASI or not. In my estimation, it is not – something else entirely is required. The probability for us discovering that something else just now is low." 

I think Richard's idea is that you shouldn't have *super-high* confidence in your estimation here, but should put some non-negligible credence on the idea that it is wrong, and current progress is relevant. Why be close to certainty about a question that you probably think is hard and that other smart people disagree about b... (read more)

Thanks for your answer.

other smart people disagree

I'm generally against this sort of appeal to authority. While I'm open to hear the arguments of smart people, we should evaluate those arguments themselves and not the people giving them. So far, I've heard no argument that would change my opinion on this matter.

You seem to make a similar argument in your other comment:

[...] But when I ask myself what evidence I have for "there are not >20 similar sized jumps before AGI" I come up short. I don't necessarily think the burden of proof here is actually on p

... (read more)

It seems like if you find it incredible to deny and he doesn't, it's very hard to make further progress :( I'm on your side about the chance being over 1% in the next decade, I think, but I don't know how I'd prove it to a skeptic, except to gesture and say that capabilities have improved loads in a short time, and it doesn't seem like there are >20 similar-sized jumps before AGI. But when I ask myself what evidence I have for "there are not >20 similar-sized jumps before AGI" I come up short. I don't necessarily think the burden of proof here is... (read more)

Saying for "other thoughts on why NU doesn't recommend extinction" is a bit of a misnomer here. The Knutsson argument you've just state doesn't even try to show NU doesn't recommend extinction, it just makes a case that it is part of a wider class of more popular theories that also sometimes do this. 

An obvious response to Knutsson is that it also matters in what circumstances a theory recommends extinction, and that NU probably recommends extinction in a wider variety of circumstances than other forms of consequentialism do, including ones where ... (read more)

2
JoA🔸
Thank you so much for pushing back on my simplistic comment! I agree that my framing was misleading (I commented without even re-reading what I had said). Thanks for highlighting crucial considerations on counterintuitive conclusions in NU and CU. Your comment makes me realize that an objection based on utopian situations makes sense (and I've found it reasonable in the past as a crux against NU). I guess my frustration with the use of the World Destruction Argument against NU, in the ways EAs often bring it up, is that it criticizes the fact that NU recommends extinction in our world (which contains suffering), even though CU has a decent chance of recommending extinction in our world (as soon as we determine whether wild invertebrates are living net-negative lives or not!).[1]

1. ^ Though again, if there are higher chances of astronomically good than astronomically bad futures, animal suffering is easily outweighed in CU, but not in NU (but CUs could change their mind on the empirical aspect and recommend extinction). But my impression is that this isn't what people (among non-philosophers, which includes me) are objecting to? They mostly seem to find deliberate extinction repugnant (which is something I think many views can agree upon).

Also, I notice there are no references to anything about the concentration of power or wealth here. Isn't that probably something we want to avoid if we want to reach a good destination, at least all else being equal?

Even if we are bad at answering the "what would utopia look like" question, what's the reason to think we'd be any better answering the "what would viatopia look like" question? If we are just as bad or worse at answering the second question, it's either useless or actively counterproductive to switch from utopian to viatopian planning. 

N=1, but I looked at an ARC puzzle https://arcprize.org/play?task=e3721c99, and I couldn't just do it in a few minutes, and I have a PhD from the University of Oxford. I don't doubt that most of the puzzles are trivial for some humans, and some of the puzzles are trivial for most humans, or that I could probably outscore any AI across the whole ARC-2 data set. But at the same time, I am a general intelligence, so being able to solve all ARC puzzles doesn't seem like a necessary criterion. Maybe this is the opposite of how doing well on benchmarks doesn't always generalize to real-world tasks, and I am just dumb at these but smart overall, and the same could be true for an LLM.

 

2
Yarrow Bouchard 🔸
Ah, okay, that is tricky! I totally missed one of the rules that the examples are telling us about. Once you see it, it seems simple and obvious, but it's easy to miss. If you want to see the solution, it's here.  I believe all ARC-AGI-2 puzzles contain (at least?) two different rules that you have to combine. I forgot about that part! I was trying to solve the puzzle as if there was just one rule to figure out. I tried the next puzzle and was able to solve it right away, on the first try, keeping in mind the 'two rules' thing. These puzzles are actually pretty fun, I might do more.

Gavi does vaccines, something that governments and other big bureaucratic orgs sure seem to handle well in other cases. Government funding for vaccines is how we eliminated smallpox, for example. I think "other vaccination programs" are a much better reference class for Gavi than the nebulous category of "social programs" in general. Indeed, the Rossi piece you've linked to actually says "In the social program field, nothing has yet been invented which is as effective in its way as the smallpox vaccine was for the field of public health." I'm not sure i... (read more)

It can also be indeterminate over a short time who the winner of an election is because the deciding vote is being cast and plausibly there is at least some very short duration of time where it is indeterminate whether the process of that vote being cast is finished yet. It can be indeterminate how many animals were killed for food if one animal was killed for multiple reasons of which "to eat" was one reason but not the major one. Etc. etc. 

Also, pretty much every empirical outcome is potentially fuzzy in some possible situation. (Elections can be tied, whether an experience is painful and hence a harm, or neutral, can be unclear, etc.)

1
tobycrisford 🔸
Maybe, although an election being tied is about the only way that particular example can be fuzzy, and there is a well defined process for what happens in that situation (like flipping a coin). There is ultimately only one winner, and it is possible for a single vote to make the difference. Whether an experience is painful or not is extremely unclear, but if your metric is just something like "number of animals killed for meat each year" then again that is something well defined and precise, and it must in principle be possible to change it with an individual purchase.

Thanks, I get what you meant now.

The relatively more orthodox view amongst philosophers about the heap case is roughly that there is a kind of ambiguous region of successive values of n where it is neither true nor false that n grains make a heap. This is a very, very technical literature though, so possibly that characterization isn't quite right. None of the solutions are exactly great, though, and some experts do think there is an exact n at which some grains become a heap.

I don't understand the bit about the IVT at all, absent further spelling out of the reasoning. 

What's the connection to the paradox? The sorites is far from trivial or solved. 

5
tobycrisford 🔸
Ironically, I might also be guilty of using some technical terminology incorrectly here! I had in mind the discussion of valuing actions with imperceptible effects from the "Five Mistakes in Moral Mathematics" chapter in Reasons and Persons (relevant to all the examples mentioned in the IVT section of this post), where, if I remember right, Parfit makes an explicit comparison with the "paradox of the heap" (I think this is where I first came across the term).

It feels the same in that for both cases we have a function from natural numbers (number of grains of sand in our potential heap, or number of people voting/buying meat) to some other set (boolean 'heap' vs 'not heap', or winner of election, or number of animals harmed). And the point is that, mathematically, this function must at some point change with the addition of a single +1 to the input, or it can never change at all. Moreover, the sum of the expected value of lots of potential additions must equal the expected value of all of them being applied together, so that if the collective has a large effect, the individual effects can't be smaller, on average, than the collective effect divided by the number of constituents.

I suppose the point is that this paradox is non-trivial and possibly unsolved when the output is fuzzy (like whether some grains of sand are a heap or not) but trivially true when the output is precise or quantitative (like who wins an election or how many animals are harmed)?
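For what it's worth, the expected-value step in the comment above can be written out in one line. This is just a sketch of the standard linearity-of-expectation argument, in my own notation rather than Parfit's: let X_i be the effect of the i-th individual act and S the total effect of all N acts together.

```latex
% Linearity of expectation: the expected collective effect equals the sum of
% the expected individual effects, so the average expected individual effect
% is exactly the expected collective effect divided by N.
\[
  \mathbb{E}[S]
  = \mathbb{E}\!\left[\sum_{i=1}^{N} X_i\right]
  = \sum_{i=1}^{N} \mathbb{E}[X_i]
  \quad\Longrightarrow\quad
  \frac{1}{N}\sum_{i=1}^{N} \mathbb{E}[X_i] = \frac{\mathbb{E}[S]}{N}.
\]
```

So if the collective act has a large expected effect, the individual contributions cannot all have negligible expected effects.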

To be fair to Richard, there is a difference between a) stating your own personal probability for the time of perils and b) making clear that, for long-termist arguments to fail solely because they rely on the time of perils, you need it to have extremely low probability, not just low probability, at least if you accept that expected value theory and subjective probability estimates can legitimately be applied at all here, as you seemed to be doing for the sake of making an internal critique. I took it to be the latter that Richard was complaining your paper doesn't do. ... (read more)

Fair point: when I re-checked the paper, it doesn't clearly and explicitly display knowledge of the point you are making. I still highly doubt that Thorstad really misunderstands it, though. I think he was probably just not being super-careful.

I am far from sure that Thorstad is wrong that the time of perils should be assigned ultra-low probability. (I do suspect he is wrong, but this stuff is extremely hard to assess.) But in my view there are multiple pretty obvious reasons why "time of carols" is a poor analogy to "time of perils":

  1. "Time of carols" is just way more specific, in a bad way than time of perils. I know that there are indefinitely many ways time of carols could happen if you get really fine-grained, but it nonetheless, intuitively, there is in some sense way more significantly differen
... (read more)

Obviously David, as a highly trained moral philosopher with years of engagement with EA, understands how expected value works, though. I think the dispute must really be about whether to assign the time of perils very low credence. (A dispute where I would probably side with you if "very low" is below, say, 1 in 10,000.)

9
Richard Y Chappell🔸
There's "understanding" in the weak sense of having the info tokened in a belief-box somewhere, and then there's understanding in the sense of never falling for tempting-but-fallacious inferences like those I discuss in my post. Have you read the paper I was responding to? I really don't think it's at all "obvious" that all "highly trained moral philosophers" have internalized the point I make in my blog post (that was the whole point of my writing it!), and I offered textual support. For example, Thorstad wrote: "the time of perils hypothesis is probably false. I conclude that existential risk pessimism may tell against the overwhelming importance of existential risk mitigation." This is a strange thing to write if he recognized that merely being "probably false" doesn't suffice to threaten the longtermist argument!  (Edited to add: the obvious reading is that he's making precisely the sort of "best model fallacy" that I critique in my post: assessing which empirical model we should regard as true, and then determining expected value on the basis of that one model. Even very senior philosophers, like Eric Schwitzgebel, have made the same mistake.) Going back to the OP's claims about what is or isn't "a good way to argue," I think it's important to pay attention to the actual text of what someone wrote. That's what my blog post did, and it's annoying to be subject to criticism (and now downvoting) from people who aren't willing to extend the same basic courtesy to me.

I think my basic reaction here is that longtermism is importantly correct about the central goal of EA if there are longtermist interventions that are actionable, promising and genuinely longtermist in the weak sense of "better than any other causes because of long-term effects", even if there are zero examples of LT interventions that meet the "novelty" criteria, or lack some significant near-term benefits. 

Firstly, I'd distinguish here between longtermism as a research program, and longtermism as a position about what causes should be prioritized ri... (read more)

3
Yarrow Bouchard 🔸
Whether society ends up spending, in the end, more money on asteroid defense or, possibly, more money on monitoring large volcanoes, is orders of magnitude more important than whether people in the EA community (or outside of it) understand the intellectual lineage of these ideas and how novel or non-novel they are. I don't know if that's exactly what you were saying, but I'm happy to concede that point anyway. To be clear, NASA's NEO Surveyor mission is one of the things I'm most excited about in the world. It makes me feel so happy thinking about it. And exposure to Bostrom's arguments from the early 2000s to the early 2010s is a major part of what convinced me that we, as a society, were underrating low-probability, high-impact risks. (The Canadian journalist Dan Gardner's book Risk also helped convince me of that, as did other people I'm probably forgetting right now.) Even so, I still think it's important to point out ideas are not novel or not that novel if they aren't, for all the sorts of reasons you would normally give to sweat the small stuff, and not let something slide that, on its own, seems like an error or a bit of a problem, just because it might plausibly benefit the world in some way. It's a slippery slope, for one... I may not have made this clear enough in the post, but I completely agree that if, for example, asteroid defense is not a novel idea, but a novel idea, X, tells you that you should spend 2x more money on asteroid defense, then spending 2x more on asteroid defense counts as a novel X-ist intervention. That's an important point, I'm glad you made it, and I probably wasn't clear enough about it. However, I am making the case that all the compelling arguments to do anything differently, including spend more on asteroid defense, or re-prioritize different interventions, were already made long before "longtermism" was coined. If you want to argue that "longtermism" was a successful re-branding of "existential risk", with some mistakes

Year of AGI

25 years seems about right to me, but with huge uncertainty. 

I think on the racism front, Yarrow is referring to the perception that the reason Moskovitz won't fund rationalist stuff is that either he thinks that a lot of rationalists believe Black people have lower average IQs than whites for genetic reasons, or he thinks that other people believe that and doesn't want the hassle. I think that belief genuinely is quite common among rationalists, no? Although there are clearly rationalists who don't believe it, and most rationalists are not right-wing extremists as far as I can tell.

-9
Yarrow Bouchard 🔸

What have EA funders done that's upset you? 

Not everything being funded here even IS alignment techniques, but also, insofar as you just want a generally better understanding of AI as a domain through science, why wouldn't you learn useful stuff from applying techniques to current models? If the claim is that current models are too different from any possible AGI for this info to be useful, why do you think "do science" would help prepare for AGI at all? Assuming you do think that, which still seems unclear to me.

2
Yarrow Bouchard 🔸
You might learn useful stuff about current models from research on current models, but not necessarily anything useful about AGI (except maybe in the slightest, most indirect way). For example, I don't know if anyone thinks that if we had invested 100x or 1,000x more into research on symbolic AI systems 30 years ago, we would know meaningfully more about AGI today. So, as you anticipated, the relevance of this research to AGI depends on an assumption about the similarity between a hypothetical future AGI and current models.

However, even if you think AGI will be similar to current models, or it might be similar, there might be no cost to delaying research related to alignment, safety, control, preparedness, value lock-in, governance, and so on until more fundamental research progress on capabilities has been made. If in five or ten or fifteen years or whatever we understand much better how AGI will be built, then a single $1 million grant to a few researchers might produce more useful knowledge about alignment, safety, etc. than Dustin Moskovitz's entire net worth would produce today if it were spent on research into the same topics.

My argument about "doing basic science" vs. "mitigating existential risk" is that these collapse into the same thing unless you make very specific assumptions about which theory of AGI is correct. I don't think those assumptions are justifiable. Put it this way: let's say we are concerned that, for reasons due to fundamental physics, the universe might spontaneously end. But we also suspect that, if this is true, there may be something we can do to prevent it. What we want to know is a) if the universe is in danger in the first place, b) if so, how soon, and c) if so, what we can do about it. To know any of these three things, (a), (b), or (c), we need to know which fundamental theory of physics is correct, and what the fundamental physical properties of our universe are. Problem is, there are half a dozen competing versions of string

I asked about genuine research creativity, not AGI, but I don't think this conversation is going anywhere at this point. It seems obvious to me that "does stuff mathematicians say makes up the building blocks of real research" is meaningful evidence that the chance that models will do research-level maths in the near future is not ultra-low, given that capabilities do increase with time. I don't think this is analogous to IQ tests or the bar exam, and for other benchmarks, I would really need to see what you're claiming is the equivalent of the transfer from FrontierMath 4 to real math that was intuitive but failed.

2
Yarrow Bouchard 🔸
What percentage probability would you assign to your ability to accurately forecast this particular question? I'm not sure why you're interested in getting me to forecast this. I haven't ever made any forecasts about AI systems' ability to do math research. I haven't made any statements about AI systems' current math capabilities. I haven't said that evidence of AI systems' ability to do math research would affect how I think about AGI. So, what's the relevance? Does it have a deeper significance, or is it just a random tangent? If there is a connection to the broader topic of AGI or AI capabilities, I already gave a bunch of examples of evidence I would consider to be relevant and that would change my mind. Math wasn't one of them. I would be happy to think of more examples as well.

I think a potentially good counterexample to your argument about FrontierMath → original math research is natural language processing → replacing human translators. Surely you would agree that LLMs have mastered the basic building blocks of translation? So, 2-3 years after GPT-4, why is demand for human translators still growing? One analysis claims that growth is counterfactually less than it would have been without the increase in the usage of machine translation, but demand is still growing.

I think this points to the difficulty in making these sorts of predictions. If back in 2015, someone had described to you the capabilities and benchmark performance of GPT-4 in 2023, as well as the rate of scaling of new models and progress on benchmarks, would you have thought that demand for human translators would continue to grow for at least the next 2-3 years?

I don't have any particular point other than that what seems intuitively obvious in the realm of AI capabilities forecasting may in fact be false, and I am skeptical of hazy extrapolations. The most famous example of a failed prediction of this sort is Geoffrey Hinton’s prediction in 2016 that radiologists’ jobs would be fully au

The forum is kind of a bit dead generally, for one thing. 

I don't really get on what grounds you are saying that the Coefficient grants are not to people to do science, apart from the governance ones. I also think you are switching back and forth between "no one knows when AGI will arrive, and the best way to prepare just in case is more normal AI science" and "we know that AGI is far off, so there's no point doing normal science to prepare against AGI now, although there might be other reasons to do normal science".

2
Yarrow Bouchard 🔸
If we don’t know which of the infinitely or astronomically many possible theories about AGI are more likely to be correct than the others, how can we prepare? Maybe alignment techniques conceived based on our current wrong theory make otherwise benevolent and safe AGIs murderous and evil on the correct theory. Or maybe they’re just inapplicable. Who knows?

I guess I still just want to ask: if models hit 80% on FrontierMath by, like, June 2027, how much does that change your opinion on whether models will be capable of "genuine creativity" in at least one domain by 2033? I'm not asking for an exact figure, just a ballpark guess. If the answer is "hardly at all", is there anything short of a 100% clear example of a novel publishable research insight in some domain that would change your opinion on when "real creativity" will arrive?

0
Yarrow Bouchard 🔸
What I just said: AI systems acting like a toddler or a cat would make me think AGI might be developed soon. I’m not sure FrontierMath is any more meaningful than any other benchmark, including those on which LLMs have already gotten high scores. But I don’t know.

I think what you are saying here is mostly reasonable, even if I am not sure how much I agree: it seems to turn on very complicated issues in the philosophy of probability/decision theory, what you should do when accurate prediction is hard, and exactly how bad predictions have to be to be valueless. Having said that, I don't think you're going to succeed in steering the conversation away from forecasts if you keep writing about how unlikely it is that AGI will arrive near term. Which you have done a lot, right?

I'm genuinely not sure how much EA funding... (read more)

2
Yarrow Bouchard 🔸
I don’t really know all the specifics of all the different projects and grants, but my general impression is that very little (if any) of the current funding makes sense or can be justified if the goal is to do something useful about AGI (as opposed to, say, make sure Claude doesn’t give risky medical advice). Absent concerns about AGI, I don’t know if Coefficient Giving would be funding any of this stuff. To make it a bit concrete, there are at least five different proposed pathways to AGI, and I imagine the research Coefficient Giving funds is only relevant to one of the five pathways, if it’s even relevant to that one. But the number five is arbitrary here. The actual decision-relevant number might be a hundred, or a thousand, or a million, or infinity. It just doesn’t feel meaningful or practical to try to map out the full space of possible theories of how the mind works and apply the precautionary principle against the whole possibility space. Why not just do science instead?

By word count, I think I’ve written significantly more about object-level technical issues relevant to AGI than directly about AGI forecasts or my subjective guesses of timelines or probabilities. The object-level technical issues are what I’ve tried to emphasize. Unfortunately, commenters seem fixated on surveys, forecasts, and bets, and don’t seem to be as interested in the object-level technical topics. I keep trying to steer the conversation in a technical direction. But people keep wanting to steer it back toward forecasting, subjective guesses, and bets. For example, I wrote a 2,000-word post called "Unsolved research problems on the road to AGI". There are two top-level comments. The one with the most karma proposes a bet. My post "Frozen skills aren’t general intelligence" mainly focuses on object-level technical issues, including some of the research problems discussed in the other post. You have the top comment on that post (besides SummaryBot) and your comment is about a forecasting s

I guess I feel like: if being able to solve mathematical problems designed by research mathematicians to be similar to the kind of problems they solve in their actual work is not decent evidence that AIs are on track to be able to do original research in mathematics in less than, say, 8 years, then what would you EVER accept as empirical evidence that we are on track for that, but not there yet?

Note that I am not saying this should push your overall confidence to over 50% or anything, just that it ought to move you up by a non-trivial amount relative to... (read more)

2
Yarrow Bouchard 🔸
I am not breaking new ground by saying it would be far more interesting to see an AI system behave like a playful, curious toddler or a playful, curious cat than a mathematician. That would be a sign of fundamental, paradigm-shifting capabilities improvement and would make me think maybe AGI is coming soon. I agree that IQ tests were designed for humans, not machines, and that’s a reason to think it’s a poor test for machines, but what about all the other tests that were designed for machines? GPT-4 scored quite high on a number of LLM benchmarks in March 2023. Has enough time passed that we can say LLM benchmark performance doesn’t meaningfully translate into real world capabilities? Or do we have to reserve judgment for some number of years still? If your argument is that math as a domain is uniquely well-suited to the talents of LLMs, that could be true. I don’t know. Maybe LLMs will become an amazing AI tool for math, similar to AlphaFold for protein structure prediction. That would certainly be interesting, and would be exciting progress for AI. I would say this argument is highly irreducibly uncertain and approaches the level of uncertainty of something like guessing whether the fundamental structure of physical reality matches the fundamental mathematical structure of string theory. I’m not sure it’s meaningful to assign probabilities to that. It also doesn’t seem like it would be particularly consequential outside of mathematics, or outside of things that mathematical research directly affects. If benchmark performance in other domains doesn’t generalize to research, but benchmark performance in math does generalize to math research, well, then, that affects math research and only math research. Which is really interesting, but would be a breakthrough akin to AlphaFold — consequential for one domain and not others. You said that my argument against accepting FrontierMath performance as evidence for AIs soon being able to perform original math research i

"Rob Wiblin opines that the fertility crash would be a global priority if not for AI likely replacing human labor soon and obviating the need for countries to have large human populations"

This is a case where it really matters whether you are giving an extremely high chance that AGI is coming within 20-30 years, or merely a decently high chance. If you think the chance is like 75%, and the claim that, conditional on no AGI, low fertility would be a big problem is correct, then the problem is only cut by 4x, which is compatible with it still being large and ... (read more)
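Spelling out the arithmetic behind the "cut by 4x" figure (my own illustration of the numbers in the comment above): if AGI arrives and obviates the problem with probability 0.75, the problem's expected importance is scaled by the remaining probability mass.

```latex
% Expected importance of the fertility problem, discounted by P(AGI makes it moot):
\[
  (1 - 0.75)\times \text{importance} = 0.25 \times \text{importance},
  \qquad \frac{1}{0.25} = 4\text{-fold reduction.}
\]
```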

I'm not actually that interested in defending:

  1. The personal honor of Yudkowsky, who I've barely read and don't much like, or his influence on other people's intellectual style. I am not a rationalist, though I've met some impressive people who probably are.
  2. The specific judgment calls and arguments made in AI 2027.
  3. Using the METR graph to forecast superhuman coders (even if I probably do think this is MORE reasonable than you do; but I'm not super-confident about its validity as a measure of real-world coding. But I was not trying to describe how I personally
... (read more)
8
Yarrow Bouchard 🔸
Strong upvoted. Thank you for clarifying your views. That’s helpful. We might be getting somewhere. With regard to AI 2027, I get the impression that a lot of people in EA and in the wider world were not initially aware that AI 2027 was an exercise in judgmental forecasting. The AI 2027 authors did not sufficiently foreground this in the presentation of their "results". I would guess there are still a lot of people in EA and outside it who think AI 2027 is something more rigorous, empirical, quantitative, and/or scientific than a judgmental forecasting exercise. I think this was a case of some people in EA being fooled or tricked (even if that was not the authors’ intention). They didn’t evaluate the evidence they were looking at properly. You were quick to agree with my characterization of AI 2027 as a forecast based on subjective intuitions. However, in one previous instance on the EA Forum, I also cited nostalgebraist’s eloquent post and made essentially the same argument I just made, and someone strongly disagreed. So, I think people are just getting fooled, thinking that evidence exists that really doesn’t. What does the forecasting literature say about long-term technology forecasting? I’ve only looked into it a little bit, but generally technology forecasting seems really inaccurate, and the questions forecasters/experts are being asked in those studies seem way easier than forecasting something like AGI. So, I’m not sure there is a credible scientific basis for the idea of AGI forecasting. I have been saying from the beginning and I’ll say once again that my forecast of the probability and timeline of AGI is just a subjective guess and there’s a high level of irreducible uncertainty here. I wish that people would stop talking so much about forecasting and their subjective guesses. This eats up an inordinate portion of the conversation, despite its low epistemic value and credibility. For months, I have been trying to steer the conversation away from fore

My thought process didn't go beyond "Yarrow seems committed to a very low chance of AI having real, creative research insights in the next few years; here is something that puts some pressure on that". Obviously I agree that when AGI will arrive is a different question from when models will have real insights in research mathematics. Nonetheless, I got the feeling (maybe incorrectly) that your strength of conviction about AGI is partly based on things like "models in the current paradigm can't have 'real insight'", so it seemed relevant, even though "real ins... (read more)

2
Yarrow Bouchard 🔸
I have no idea when AI systems will be able to do math research and generate original, creative ideas autonomously, but it will certainly be very interesting if/when they do. It seems like there’s not much of a connection between the FrontierMath benchmark and this, though. LLMs have been scoring well on question-and-answer benchmarks in multiple domains for years and haven’t produced any original, correct ideas yet, as far as I’m aware. So, why would this be different? LLMs have been scoring above 100 on IQ tests for years and yet can’t do most of the things humans who score above 100 on IQ tests can do. If an LLM does well on math problems that are hard for mathematicians or math grad students or whatever, that doesn’t necessarily imply it will be able to do the other things, even within the domain of math, that mathematicians or math grad students do.

We have good evidence for this because LLMs as far back as GPT-4 nearly 3 years ago have done well on a bunch of written tests. Despite there being probably over 1 billion regular users of LLMs and trillions of queries put to LLMs, there’s no indication I’m aware of that an LLM has come up with a novel, correct idea of any note in any academic or technical field. Is there a reason to think performance on the FrontierMath benchmark would be different from the trend we’ve already seen with other benchmarks over the last few years?

The FrontierMath problems may indeed require creativity from humans to solve them, but that doesn’t necessarily mean solving them is a sign of creativity from LLMs. By analogy, playing grandmaster-level chess may require creativity from humans, but not from computers. This is related to an old idea in AI called Moravec’s paradox, which warns us not to assume what is hard for humans is hard for computers, or what is easy for humans is easy for computers.

Working on AI isn't the same as doing EA work on AI to reduce X-risk. Most people working in AI are just trying to make the AI more capable and reliable. There probably is a case for saying that "more reliable" is actually EA X-risk work in disguise, even if unintentionally, but it's definitely not obvious this is true. 

4
Denkenberger🔸
I agree, though I think the large reduction in EA funding for non-AI GCR work is not optimal (but I'm biased with my ALLFED association).

"Any sort of significant credible evidence of a major increase in AI capabilities, such as LLMs being able to autonomously and independently come up with new correct ideas in science, technology, engineering, medicine, philosophy, economics, psychology, etc"

Just in the spirit of pinning people to concrete claims: would you count progress on FrontierMath 4, like, say, models hitting 40%*, as evidence that this is not so far off for mathematics specifically? (To be clear, I think it is very easy to imagine models that are doing genuinely significant re... (read more)

2
Yarrow Bouchard 🔸
I wonder if you noticed that you changed the question? Did you not notice or did you change the question deliberately? What I brought up as a potential form of important evidence for near-term AGI was: You turned the question into: Now, rather than asking me about the evidence I use to forecast near-term AGI, you’re asking me to forecast the arrival of the evidence I would use for forecasting near-term AGI? Why?

Yeah, it's a fair objection that even answering the "why" question like I did presupposes that EAs are wrong, or at least merely luckily right. (I think this is a matter of degree, and that EAs overrated the imminence of AGI and the risk of takeover on average, but it's still at least reasonable to believe AI safety and governance work can have very high expected value for roughly the reasons EAs do.) But I was responding to Yarrow, who does think that EAs are just totally wrong, so I guess really I was saying that "conditional on a sociological explanation being appropriate, I don't think it's as LW-driven as Yarrow thinks", although LW is undoubtedly important.

4
Linch
Right, to be clear, I'm far from certain that the stereotypical "EA view" is right here. Sure, that makes a lot of sense! I was mostly just using your comment to riff on a related concept.

I think reality is often complicated and confusing, and it's hard to separate out contingency vs. inevitability stories for why people believe what they believe. But I think the correct view is that EAs' belief on AGI probability and risk (within an order of magnitude or so) is mostly not contingent (as of the year 2025), even if it turns out to be ultimately wrong. The Google ads example was the best example I could think of to illustrate this. I'm far from certain that Google's decision to use ads was actually the best source of long-term revenue (never mind being morally good lol). But it still seemed like the internet as we understand it meant it was implausible that Google ads was counterfactually due to their specific acquisitions.

Similarly, even if EAs had ignored AI before for some reason, and never interacted with LW or Bostrom, it's implausible that, as of 2025, people who are concerned with ambitious, large-scale altruistic impact (and have other epistemic, cultural, and maybe demographic properties characteristic of the movement) would not think of AI as a big deal. AI is just a big thing in the world that's growing fast. Anybody capable of reading graphs can see that.

That said, specific micro-level beliefs (and maybe macro ones) within EA and AI risk might be different without influence from either LW or the Oxford crowd. For example, there might be a stronger accelerationist arm. Alternatively, people might be more queasy with the closeness to the major AI companies, and there might be a stronger and more well-funded contingent of folks interested in public messaging on pausing or stopping AI. And in general, if the movement hadn't "woken up" to AI concerns at all pre-ChatGPT, I think we'd be in a more confused spot.