
BenjaminTereick


Comments (13)

I don't think this is an accurate summary of the disagreement, but I've tried to clarify my point twice already, so I'm going to leave it at that.

1. Are you referring to your exchange with David Mathers here?

3. I'm not sure what you're saying here. Just to clarify my point: you're arguing in the post that the slow scenario actually describes big improvements in AI capabilities. My counterpoint is that this scenario is not given a lot of weight by the respondents, suggesting that they mostly don't agree with you on this.

Thanks for the replies! 

  1. I agree that the scenario question could have been phrased better in hindsight, or maybe could have included an option like "progress falling behind all three scenarios". I also agree that, given the way the question was asked, the summary on p. 38 is slightly inaccurate (it doesn't seem like a big issue to me, but that's probably downstream of my disagreeing that the "slow" scenario describes AGI-like capabilities).
  2. Fair.
  3. I'm not saying that the responses show that there's no framing effect. I'm saying that they seem to indicate that, at least for most respondents, the description of the slow scenario didn't seem wildly off as a floor of what could be expected.

I forgot to include the text of the question in my post. I just added it now.

I think it would also be fair to include the disclaimer in the question I quoted above.

There are several capabilities mentioned in the slow progress scenario that seem indicative of AGI or something close, such as the ability of AI systems to largely automate various kinds of labour (e.g. research assistant, software engineer, customer service, novelist, musician, personal assistant, travel agent).

I would read the scenario as AI being able to do some of the tasks required by these jobs, but not being able to fully replace the humans doing them, which I would think is the defining characteristic of slow AI progress scenarios.

I’m confused by this, for a few reasons:

  1. The question asks what scenario a future panel would believe is "best matching" the general level of AI progress in 2030, so if things fell short of the "slow" scenario, it would still be the best matching. This point is also reinforced by the instructions: "Reasonable people may disagree with our characterization of what constitutes slow, moderate, or rapid AI progress. Or they may expect to see slow progress observed with some AI capabilities and moderate or fast progress in others. Nevertheless, we ask you to select which scenario, in sum, you feel best represents your views" (p. 104).
  2. There are several other questions in the survey that allow responses indicating very low capability levels or low societal impact. If there is a huge framing effect in the scenario question, it would have to strongly affect answers to these other questions, too (which I think is implausible), or else you should be able to show a mismatch between these questions and the scenario question (which I don't think there is).
  3. The actual answers don’t seem to reflect the view that most respondents believe that the slow scenario represents a very high bar (unless, again, you believe the framing effect is extremely strong): "By 2030, the average expert thinks there is a 23% chance that the state of AI most closely mirrors a “rapid” progress scenario [...]. Conversely, they give a 28% chance of a slow-progress scenario, in which AI is a widely useful assisting technology but falls short of transformative impact".

Aside from these methodological points, I’m also surprised that you believe that the slow scenario constitutes AI that is "either nearly AGI or AGI outright". Out of curiosity, what capability mentioned in the "slow" scenario do you think is the most implausible by 2030? To me, most of these seem pretty close to what we already have in 2025.

[disclaimer: I recommended a major grant to FRI this year, and I’ve discussed LEAP with them several times]

Seems true. But then again, it feels suspicious that a cleverly worded and funny Quick Take would just happen to be one of the rare true generalizing statements about human psychology…

Thanks, that's a helpful clarification! "Allowed" still feels like a strong choice of words, but I can see that the line between that and "I'm not sure if this will be perceived as annoying" is blurry, and also that the latter feels frustrating enough.

I'm only speaking in personal capacity here, but my strong preference would always be for these questions to be raised! 

I'm not sure if I'm allowed to ask this [...].


Maybe you don't mean this literally, but I find this really sad and kind of horrifying. Who do you think wouldn't allow you to ask this question, and why?

Disagree-voted. I think there are issues with the Neglectedness heuristic, but I don’t think the N in ITN is fully captured by I and T. 

For example, one possible rephrasing of ITN (certainly not covering all the ways in which it is used) is:

  1. Would it be good to solve problem P?
  2. Can I solve P?
  3. How many other people are trying to solve P?

I think this is a great way to decompose some decision problems. For instance, it seems very useful for thinking about prioritizing research, because (3) helps you answer the important question "If I don’t solve P, will someone else?" (even if this is also affected by 2).

(Edited: originally, I put the question "If I don’t solve P, will someone else?" under 3., which was a bit sloppy.)

I think it’s borderline whether reports of this type are forecasting as commonly understood, but would personally lean no in the specific cases you mention (except maybe the bio anchors report).
 
I really don’t think that this intuition is driven by the amount of time or effort that went into them, but rather by the percentage of intellectual labor that went into something like “quantifying uncertainty” (rather than, e.g., establishing empirical facts, reviewing the literature, or analyzing the structure of commonly-made arguments).

As for our grantmaking program: I expect we’ll have a more detailed description of what we want to cover later this year, where we might also address points about the boundaries to worldview investigations.

Hi Dan, 

Thanks for writing this! Some (weakly-held) points of skepticism: 

  1. I find it a bit nebulous what you do and don't count as a rationale. Similarly to Eli,* I think on some readings of your post, “forecasting” becomes very broad and just encompasses all of research. Obviously, research is important!
  2. Rationales are costly! Taking that into account, I think there is a role to play for “just the numbers” forecasting, e.g.:
    1. Sometimes you just want to defer to others, especially if an existing track record establishes that the numbers are reliable. For instance, when looking at weather forecasts, or (at least until last year) looking at 538’s numbers for an upcoming election, it would be great if you understood all the details of what goes into the numbers, but the numbers themselves are plenty useful, too. 
    2. Even without a track record, just-the-number forecasts give you a baseline of what people believe, which allows you to observe big shifts. I’ve heard many people express things like “I don’t defer to Metaculus on AGI arrival, but it was surely informative to see by how much the community prediction has moved over the last few years”.
    3. Just-the-number forecasts let you spot disagreements with other people, which helps you find out where talking about rationales/models is particularly important.
       
  3. I’m worried that, in the context of getting high-stakes decision makers to use forecasts, some of the demand for rationales is due to a lack of trust in the forecasts. Replying to this demand with AI-generated rationales might shift the skeptical take from “they’re just making up numbers” to “it’s all based on LLM hallucinations”, which I’m not sure really addresses the underlying problem.
     

*OTOH, I think Eli is also hinting at a definition of forecasting that is too narrow. I do think that generating models/rationales is part of forecasting as it is commonly understood (including in EA circles), and certainly don't agree that forecasting by definition means that little effort was put into it!
Maybe the right place to draw the line between forecasting rationales and “just general research” is asking “is the model/rationale for the most part tightly linked to the numerical forecast?” If yes, it's forecasting; if not, it's something else.


 
