Yarrow Bouchard 🔸

1299 karma · Joined · Canada · strangecosmos.substack.com

Bio

Pronouns: she/her or they/them. 

I got interested in effective altruism back before it was called effective altruism, back before Giving What We Can had a website. Later on, I got involved in my university EA group and helped run it for a few years. Now I’m trying to figure out where effective altruism can fit into my life these days and what it means to me.

Sequences
2

Criticism of specific accounts of imminent AGI
Skepticism about near-term AGI

Comments
531

Topic contributions
2

I’ll say just a little bit more on the topic of the precautionary principle for now. I have a complex, multi-part argument on this that takes more explaining than I’ll attempt here; I’ve covered much of it in previous posts and comments. The three main points I’d make in relation to the precautionary principle and AGI risk are:

  • Near-term AGI is highly unlikely, much less than a 0.05% chance in the next decade

  • We don’t have enough knowledge of how AGI will be built to usefully prepare now

  • As knowledge of how to build AGI is gained, investment into preparing for AGI becomes vastly more useful, such that the benefits of investing resources into preparation at higher levels of knowledge totally overwhelm the benefits of investing resources at lower levels of knowledge

The point of the FTX comparison is that, in the wake of the FTX collapse, many people in EA were eager to reflect on the collapse and try to see if there were any lessons for EA. In the wake of the AI bubble popping, people in EA could either choose to reflect in a similar way, or they could choose not to. The two situations are analogous insofar as they are both financial collapses and both could lead to soul-searching. They are disanalogous insofar as the AI bubble popping won’t affect EA funding and won’t associate EA in the public’s mind with financial crimes or a moral scandal. 

It’s possible that, in the wake of the AI bubble popping, nobody in EA will try to learn anything. I fear that possibility. The comparisons I made to Ray Kurzweil and Elon Musk show that it is entirely possible to avoid learning anything, even when you ought to. So, EA could go multiple different ways with this; I’m just saying that what I hope will happen is the sort of reflection that happened post-FTX.

If the AI bubble popping wouldn’t convince you that EA’s focus on near-term AGI has been a mistake — or at least convince you to start seriously reflecting on whether it has been or not — what evidence would convince you? 

I think it’s fair to criticize Yudkowsky and Soares’ belief that there is a very high probability of AGI being created within ~5-20 years because that is a central part of their argument. The purpose of the book is to argue for an aggressive global moratorium on AI R&D. For such a moratorium to make sense, probabilities need to be high and timelines need to be short. If Yudkowsky and Soares believed there was an extremely low chance of AGI being developed within the next few decades, they wouldn’t be arguing for the moratorium. 

So, I think Oscar is right to notice and critique this part of their argument. I don’t think it’s fair to say Oscar is critiquing a straw man. 

You can respond with a logical, sensible appeal to the precautionary principle: shouldn’t we prepare anyway, just in case? First, I would say that even if this is the correct response, it doesn’t make Oscar’s critique wrong or not worth making. Second, arguments about whether AGI will be safe or unsafe, easy or hard to align, and what to do to prepare for it all depend on specific assumptions about how AGI will be built. So, this is not actually a separate question from the topic Oscar raised in this post.

It would be nice if there were something we could do just in case, to make any potential future AGI system safer or easier to align, but I don’t see how we can do this in advance of knowing what technology or science will be used to build AGI. So, the precautionary principle response doesn’t add up, either, in my view.

Eliezer Yudkowsky forecasts a 99.5% chance of human extinction from AGI "well before 2050", unless we implement his proposed aggressive global moratorium on AI R&D. Yudkowsky deliberately avoids giving more than a vague AGI forecast, but he often strongly hints at a timeline. For example, in December 2022, he tweeted:

Pouring some cold water on the latest wave of AI hype:  I could be wrong, but my guess is that we do *not* get AGI just by scaling ChatGPT, and that it takes *surprisingly* long from here.  Parents conceiving today may have a fair chance of their child living to see kindergarten.

In April 2022, when Metaculus’ forecast for AGI was in the 2040s and 2050s, Yudkowsky harshly criticized Metaculus for having too long a timeline and not updating it downwards fast enough.

In his July 2023 TED Talk, Yudkowsky said:

At some point, the companies rushing headlong to scale AI will cough out something that's smarter than humanity. Nobody knows how to calculate when that will happen. My wild guess is that it will happen after zero to two more breakthroughs the size of transformers.

In March 2023, during an interview with Lex Fridman, Fridman asked Yudkowsky what advice he had for young people. Yudkowsky said:

Don’t expect it to be a long life. Don’t put your happiness into the future. The future is probably not that long at this point.

In that segment, he also said, "we are not in the shape to frantically at the last minute do decades’ worth of work."

After reading these examples, do you still think Yudkowsky only believes that AGI is "not unlikely to be built in the future", "if not in 5 then maybe in 50 years"?
 

I can’t thank titotal enough for writing this post and for talking to the Forecasting Research Institute about the error described in this post.

I’m also incredibly thankful to the Forecasting Research Institute for listening to and integrating feedback from me and, in this case, mostly from titotal. It’s not nothing to be responsive to criticism and correction. I can only express appreciation for people who are willing to do this. Nobody loves criticism, but the acceptance of criticism is what it takes to move science, philosophy, and other fields forward. So, hallelujah for that.

I want to be clear that, as titotal noted, we’re just zeroing in here on one specific question discussed in the report, out of 18 total. It’s unfortunate that you can work hard on something large in scope, get it almost entirely correct (I haven’t reviewed the rest of the report, but I’ll give it the benefit of the doubt), and then watch the discussion focus on the one mistake you made. I don’t want research or writing to be a thankless task that only elicits criticism, and I want to be thoughtful about how to raise criticism in the future.

For completeness, so readers have a full picture: I made three distinct and independent criticisms of this survey question and how it was reported. First, I noted that the probability of the rapid scenario was reported as an unqualified probability, rather than as the probability of the scenario being the best matching of the three — “best matching” is the wording the question used. The Forecasting Research Institute was quick to accept this point and promise to revise the report.
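To illustrate that distinction, here is a minimal sketch with deliberately made-up numbers — nothing below comes from the report; the credences are purely hypothetical:

```python
# Made-up numbers, purely to illustrate the distinction — not from the report.
# A panelist's credence about which of the three scenarios 2030 will *best match*:
p_best_match = {"rapid": 0.10, "moderate": 0.55, "slow": 0.35}  # sums to 1 by construction

# The same panelist's credence that each scenario's described capabilities
# actually materialize — a different question, with no constraint to sum to 1:
p_materializes = {"rapid": 0.04, "moderate": 0.30, "slow": 0.25}

# Reporting the 0.10 as an unqualified "probability of the rapid scenario"
# conflates these two quantities.
print(p_best_match["rapid"], "vs", p_materializes["rapid"])  # 0.1 vs 0.04
```

The point is only that the two numbers can come apart, so the wording of the report matters.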

Second, I raised the problem around the intersubjective resolution/metaprediction framing that titotal describes in this post. After a few attempts, I passed the baton to titotal, figuring that titotal’s reputation and math knowledge would make them more convincing. The Forecasting Research Institute has now revised the report in response, as well as their EA Forum post about the report.

Third, the primary issue I raised in my original post on this topic is a potential anchoring effect or question-wording bias in the survey question.[1] The slow progress scenario is extremely aggressive and optimistic about the amount of progress in AI capabilities between now and the end of 2030. I would personally guess the probability of AI gaining the sort of capabilities described in the slow progress scenario by the end of 2030 is significantly less than 0.1%, or 1 in 1,000. I imagine most AI experts would say it’s unlikely if presented with the scenario in isolation and asked directly about its probability.

For example, here is what is said about household robots in the slow progress scenario:

By the end of 2030 in this slower-progress future, AI is a capable assisting technology for humans; it can … conduct relatively standard tasks that are currently (2025) performed by humans in homes and factories.

Also:

Meanwhile, household robots can make a cup of coffee and unload and load a dishwasher in some modern homes—but they can’t do it as fast as most humans and they require a consistent environment and occasional human guidance.

Even Metaculus, which is known to be aggressive and optimistic about AI capabilities, and which is heavily used by the effective altruist and LessWrong communities, where belief in near-term AGI is strong, puts the median date for the question “When will a reliable and general household robot be developed?” in mid-2032. The resolution criteria for the Metaculus question are compatible with the sentence in the slow progress scenario, although they also stipulate many details that the slow progress scenario does not.

An expert panel surveyed in 2020 and 2021 was asked, “[5/10] years from now, what percentage of the time that currently goes into this task can be automated?” and answered 47% for dish washing in 10 years, i.e., in 2030 or 2031. I find this framing somewhat confusing — what does it mean for 47% of the time involved in dish washing to be automated? — but it suggests the baseline scenario in the LEAP survey involves contested claims, not just things we can take for granted.

Adam Jonas, a financial analyst at Morgan Stanley with a track record of being extremely optimistic about AI and robotics (sometimes mistakenly so), and whose forecasts the financial world reads as aggressive and optimistic, predicts that a “general-purpose humanoid” robot for household chores will require “technological progress in both hardware and AI models, which should take about another decade”, meaning around 2035. So, even an optimist on Wall Street seems to be less optimistic than the LEAP survey’s slow progress scenario.

If the baseline scenario is more optimistic about AI capabilities progress than Metaculus, the results of a previous expert survey, and a Wall Street analyst on the optimistic end of the spectrum, then it seems plausible that the baseline scenario is already more optimistic than what the LEAP panelists would have given as their median forecast had they been asked in a different way. It seems far too aggressive as a baseline scenario. This makes it hard to know how to interpret the panelists' answers (in addition to the interpretive difficulty raised by the problem described in titotal's post above).

  1. ^

    I have also used the term “framing effect” to describe this before — following the Forecasting Research Institute and AI Impacts — but on rechecking the definition of that term in psychology, it seems to refer specifically to framing the same information as positive or negative, which doesn’t apply here.

Update #2: titotal has published a full breakdown of the error involving the intersubjective resolution/metaprediction framing of the survey question. It’s a great post that explains the error very well. Many thanks to titotal for taking the time to write the post and for talking to the Forecasting Research Institute about this. Thanks again to the Forecasting Research Institute for revising the report and this post.

Do you stand by your accusation of bad faith?

Your accusation of bad faith seems to rest on your view that the constraints the laws of physics impose on space travel make an alien invasion or attack extremely improbable. Such an event may indeed be extremely improbable, but the laws of physics do not say so.

I have to imagine that you are referring to the speeds of spacecraft and the distances involved. The Milky Way is a disc about 100,000 light-years in diameter and roughly 1,000 light-years thick. NASA’s Parker Solar Probe has travelled at 0.064% of the speed of light; let’s round that down to 0.05% for simplicity. At 0.05% of the speed of light, the Parker Solar Probe could travel between the two farthest points in the Milky Way in 200 million years.

That means that if the maximum speed of spacecraft in the galaxy were limited to the top speed of NASA’s fastest space probe today, an alien civilization that reached an advanced stage of science and technology — perhaps including things like AGI, advanced nanotechnology/atomically precise manufacturing, cheap nuclear fusion, interstellar spaceships, and so on — more than 200 million years ago would have had plenty of time to establish a presence in every star system of the Milky Way. At 1% of the speed of light, the window of time shrinks to 10 million years, and so on.

Spacecraft designs that credible scientists and engineers have thought Earth could actually build in the near future include a light-sail probe that would supposedly travel at 15-20% of the speed of light. Such a probe could traverse the diameter of the Milky Way in under 1 million years at top speed. Acceleration and deceleration complicate the picture somewhat, but the fundamental idea still holds.
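For anyone who wants to check the arithmetic, here’s a minimal sketch; the 100,000 light-year diameter and the speeds are the round numbers quoted above, and the calculation deliberately ignores acceleration and deceleration:

```python
# Rough crossing times for the Milky Way's ~100,000 light-year diameter at a
# constant speed. At a speed of f * c, covering d light-years takes d / f years.
GALAXY_DIAMETER_LY = 100_000

def crossing_time_years(fraction_of_c: float) -> float:
    return GALAXY_DIAMETER_LY / fraction_of_c

for label, f in [
    ("0.05% of c (Parker Solar Probe, rounded down)", 0.0005),
    ("1% of c", 0.01),
    ("15% of c (proposed light-sail probe)", 0.15),
]:
    print(f"{label}: {crossing_time_years(f):,.0f} years")

# Prints:
# 0.05% of c (Parker Solar Probe, rounded down): 200,000,000 years
# 1% of c: 10,000,000 years
# 15% of c (proposed light-sail probe): 666,667 years
```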

If there are alien civilizations in our galaxy, we don’t have any clear, compelling scientific reason to think they wouldn’t be many millions of years older than our civilization. The Earth formed 4.5 billion years ago, so if a habitable planet elsewhere in the galaxy had formed just 10% earlier and put life there on the same trajectory as on ours, the aliens would be 450 million years ahead of us — plenty of time to reach everywhere in the galaxy.

The Fermi paradox has been considered and discussed by people working in physics, astronomy, rocket/spacecraft engineering, SETI, and related fields for decades. There is no consensus on the correct resolution to the paradox. Certainly, there is no consensus that the laws of physics resolve it.

So, if I’m understanding your reasoning correctly — that surely I must be behaving in a dishonest or deceitful way, i.e., engaging in bad faith, because obviously everyone knows the constraints the laws of physics impose on space travel make an alien attack on Earth extremely improbable — then your accusation of bad faith seems to rest on a mistake.

Thanks for giving me the opportunity to talk about this because the Fermi paradox is always so much fun to talk about.

My list is very similar to yours. I believe items 1, 2, 3, 4, and 5 have already been achieved to substantial degrees and we continue to see progress in the relevant areas on a quarterly basis. I don't know about the status of 6.

It’s hard to know what "to substantial degrees" means. That sounds very subjective. Without the "to substantial degrees" caveat, it would be easy to prove that 1, 3, 4, and 5 have not been achieved, and fairly straightforward to make a strong case that 2 has not been achieved.

For example, it is simply a fact that Waymo vehicles have a human in the loop — Waymo openly says so — so Waymo has not achieved Level 4/5 autonomy without a human in the loop. Has Waymo achieved Level 4/5 autonomy without humans in the loop "to a substantial degree"? That seems subjective. I don’t know what "to a substantial degree" means to you, and it might mean something different to me, or to other people.

Humanoid robots have not been deployed in any profitable new applications in recent years, as far as I’m aware. Again, I don’t know what achieving this "to a substantial degree" might mean to you.

I would be curious to know what progress you think has been made recently on the fundamental research problems I mentioned, or what the closest examples are to LLMs engaging in the sort of creative intellectual act I described. I imagine the examples you have in mind are not something the majority of AI experts would agree fit the descriptions I gave.

For clarity on item 1, AI company revenues in 2025 are on track to cover 2024 costs, so on a product basis, AI models are profitable; it's the cost of new models that pull annual figures into the red. I think this will stop being true soon, but that's my speculation, not evidence, so I remain open that scaling will continue to make progress towards AGI, potentially soon.

We should distinguish here between gold mining and selling picks and shovels. I’m talking about applications of LLMs and AI tools that are profitable for end users. Nvidia is extremely profitable because it sells GPUs to AI companies. In theory, AI companies could become profitable by selling AI models as a service (e.g., API tokens, subscriptions) to businesses. But would those business customers then see any profit from the use of LLMs (or other AI tools)? That’s what I’m talking about. Nvidia is selling picks and shovels, and to some extent even the AI companies are selling picks and shovels. Where’s the gold?

The six-item list I gave comprises things that — each on its own, but especially in combination — would go a long way toward convincing me that I’m wrong and that my near-term AGI skepticism is a mistake. When you say your list is similar, I’m not quite sure what you mean. Do you mean that if those things didn’t happen, that would convince you that the probability or level of credence you assign to near-term AGI is way too high? I was trying to ask you what evidence would convince you that you’re wrong.

This is directly answered in the post. Edit: Can you explain why you don’t find what is said about this in the post satisfactory?
