
[Epistemic status: I am optimising for providing arguments worth considering, which means I'm not trying to make certain every argument is strictly valid. I'm not trying to be safe to defer to, so I'm not minimising false-positives. I am just trying to expand the range of available tools to explore this question with, so I'm minimising false-negatives.]


Prize questions

The Future Fund's AI Worldview Prize will award up to $1.5M for work that substantially changes their probabilities on the following three propositions.

  1. “P(misalignment x-risk|AGI)”: Conditional on AGI being developed by 2070, humanity will go extinct or drastically curtail its future potential due to loss of control of AGI
  2. AGI will be developed by January 1, 2043
  3. AGI will be developed by January 1, 2100

See their post if you want the details. Here, I just want to offer some very broad and hasty arguments for why I'm hesitant about the value of AI forecasting in general, and this prize in particular.


Arguments

  • It's worth stating explicitly that an actual working technical solution to the alignment problem, with a low alignment tax, would substantially update the panel's beliefs about the first proposition (P(misalignment x-risk|AGI)). So this isn't necessarily a contest about forecasting arguments, even if it's (misguidedly, imo) presented as one.
  • I think this matters because I don't see forecasting as being a very targeted use of time for making the world better. In the worst case, this contest can make people less effective because it incentivises them to work on forecasting when they otherwise would have worked on generating solutions. If the latter is substantially more important on the margin, this contest is probably bad.
    • This seems especially true in a research community with high rates of intrinsic motivation.
      • I think a research community functions best when the people who have a well-formed opinion about what is the most effective thing for them to do, end up actually doing that thing. Especially in a pre-paradigmatic field like alignment, where there's no authority you can defer to that will safely ensure that you end up working on something worthwhile.
      • As long as external incentives are approximately smoothly distributed across tasks, researchers' intrinsic motivation to do what they think is the most effective thing for them to do is more likely to win out over their competing motivations.
      • So I'd be reluctant to introduce imbalanced incentives, because they might displace motivation to act on individual prioritisation.
    • In general, introducing disproportionately strong incentives for a narrow subset of tasks in an area where task prioritisation has very broad uncertainty, seems bad.
      • Amdahl's law: "the overall performance improvement gained by optimizing a single part of a system is limited by the fraction of time that the improved part is actually used."
      • It's like running notepad--and only notepad--on an RTX 3080 Ti GPU.
  • These words don't have strict definitions, but I think of "forecasting" as spending optimisation on predicting what will happen, and "problem-solving" as trying to change what will happen (by generating new ideas and solutions). Forecasting is differentiating between what's already known, problem-solving is generating something that doesn't exist yet.
    • Forecasters search broadly and exploit what's already written because that's the more reliable way to form informed opinions. They analyse more arguments than they generate, because the former is more cost-effective in terms of value of information related to the forecasting questions.
    • Problem-solvers explore unknown territory with uncertain (hits-based) payoff. If they're part of a research community that can do parallel search, this is the optimal strategy for generating novel solutions that everyone can benefit from, but it's not optimal for making the best forecasts.
    • Forecasting can be important for directing object-level work and choosing between different alignment strategies, and is therefore essential to the project. But it is a step removed from actually trying to develop a solution.
    • At the theoretical limit, you can be perfectly calibrated on predicting everything about what will happen related to AGI, even if you do no work that actually increases the chances that it ends up aligned.
    • It's questionable to what extent prioritisation between alignment strategies is sensitive to differences in timeline forecasts, as opposed to mostly just being sensitive to their technical plausibility in the first place. If it's mostly the latter, effort to prioritise between different strategies is better spent on researching their plausibility, and less on figuring out where they fit on the timeline. But I expect different people will have wildly different takes on this.
    • For an individual, creative brainpower spent on forecasting doesn't translate as readily into competence for problem-solving compared to brainpower spent directly on problem-solving.
    • Written work on forecasting does not inform work on problem-solving as readily as problem-solving work informs forecasting work.
    • What is more robustly usefwl across a range of the most plausible scenarios: work on forecasting or problem-solving? I think others are in a better position to answer this question than I am, but I'd nudge you to consider Amdahl's law again.
  • Optimisation spent on predicting worthwhile forecasting questions is plausibly hitting diminishing marginal returns much faster than trying to generate solutions. I have no theoretical model to support this (though I feel like one might exist), I just expect it to be the case in practice.
    • I expect promising research avenues targeted at producing solutions to last longer than promising research avenues targeted at estimating forecasting questions.
    • That is, if on the y-axis you plot the marginal value of information of spending one more hour researching a question, and on the x-axis you plot time spent researching it... I expect the distribution for AI forecasting questions to be front-loaded, at least as compared to AI problem-solving questions.
    • This relates to it being easier, in practice, to verify lines of thinking (e.g. while searching through existing literature as a foxy forecaster) compared to generating them in the first place.
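
To make the Amdahl's law point above a bit more concrete, here is a minimal sketch of the standard formula: if a fraction p of the total work is sped up by a factor s, the overall speedup is 1 / ((1 - p) + p / s). The function name and the example numbers are my own illustrative assumptions, and mapping "fraction of the work" onto forecasting versus problem-solving is only an analogy, not a measured quantity.

```python
def amdahl_speedup(fraction_improved: float, local_speedup: float) -> float:
    """Overall speedup under Amdahl's law.

    fraction_improved: share of the total work that gets optimised (0 to 1).
    local_speedup: how much faster that share becomes (must be > 0).
    """
    if not 0.0 <= fraction_improved <= 1.0:
        raise ValueError("fraction_improved must be between 0 and 1")
    if local_speedup <= 0.0:
        raise ValueError("local_speedup must be positive")
    untouched = 1.0 - fraction_improved
    return 1.0 / (untouched + fraction_improved / local_speedup)


if __name__ == "__main__":
    # Hypothetical numbers: the optimised part is 10% of the overall work.
    for s in (2.0, 10.0, 1000.0):
        print(f"local speedup {s:>6}: overall {amdahl_speedup(0.1, s):.3f}")
    # However large the local speedup, the overall gain is capped at 1 / (1 - p).
    print(f"upper bound for p = 0.1: {1.0 / (1.0 - 0.1):.3f}")
```

Under these assumed numbers, even an enormous improvement to a part that only accounts for 10% of the work leaves the overall gain below roughly 1.11x, which is the sense in which a strong incentive on a narrow subset of tasks can be mostly wasted.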

Thanks to Mihnea Maftei for some helpfwl discussion on this.

Comments (6)

I think the value of information is really high for the Future Fund. If p(doom) is really high (e.g., the largest prize is claimed), they might decide to almost exclusively focus on AI stuff — this would be a major organizational change that (potentially/hopefully) would help with AI risk reduction quite a bit.

Mh, agreed. The general arguments in the post are probably overwhelmed in most cases by considerations specific to each case.

Some thoughts as I was reading your post:

> I don't see forecasting as being a very targeted use of time for making the world better

I disagree with this. I think that one generator of this is that I like to think of the thing that I do when forecasting as "improving my models of the world", and assigning probabilities as a tool that I use to cull inaccurate models. One past example where I think this worked was with estimates of [nuclear risk](https://forum.effectivealtruism.org/posts/2nDTrDPZJBEerZGrk/samotsvety-nuclear-risk-update-october-2022), where I think that forecasting was a useful lens & affected some decisions.

> this contest can make people less effective because it incentivises them to work on forecasting when they otherwise would have worked on generating solutions

At the same time, forecasting can make it more legible or more apparent to people that working on AI safety is important. It could also end up concluding that it is *not* that important. I'm also interested in other ways to do that, like [this contest](https://forum.effectivealtruism.org/posts/noDYmqoDxYk5TXoNm/usd5k-challenge-to-quantify-the-impact-of-80-000-hours-top) to quantify the value of different career paths.

> [some stuff about how direct work would be better]

So I agree that conditional on AI being a top priority, direct work is more important. As Misha mentioned, however, one still has to determine to what extent that is the case. Forecasting isn't the only tool to do that, though.

> These words don't have strict definitions, but I think of "forecasting" as spending optimisation on predicting what will happen, and "problem-solving" as trying to change what will happen (by generating new ideas and solutions). Forecasting is differentiating between what's already known, problem-solving is generating something that doesn't exist yet.

Idk, man, I like the "acquiring better models of the world" framing better.

You might also get some mileage out of this old blogpost of mine: Building Blocks of Utility Maximization; maybe rewording your objections in terms of the specific parts of expected utility maximization will be clarifying.


Overall I'd guess most of the disagreement can be rounded off to you thinking that AI safety is known to be the top priority, and so the benefits of forecasting in terms of prioritization are pretty small. Is that fair?

I kinda disagree with yesterday-me on how important these arguments are. I'm not entirely sure why. I think writing out this post helped me see how limited they are, and decision-relevant evidence related to specific cases will likely overwhelm them. But anyway:

Clarification

> Overall I'd guess most of the disagreement can be rounded off to you thinking that AI safety is known to be the top priority, and so the benefits of forecasting in terms of prioritization are pretty small. Is that fair?

  1. I don't try to argue the object-level. I'm instead suggesting reasons why direct work could be a higher priority under a greater range of uncertainty than people might think.
    1. If this is true, it doesn't necessarily mean that people should deprioritise forecasting. But it does mean that if your estimates are already within the range where direct work is higher priority, and expected evidence seems unlikely to shift estimates out of that range, then forecasting is marginally wastefwl.
  2. The Future Fund's estimates and resilience (or that of a large part of the community) might not be within that range, however. In which case they should probably prioritise forecasting.
  3. I'm only saying "if you think this, then that". The arguments could still be valid (and therefore potentially usefwl), even if the premises don't hold in specific cases.
  4. I'm not saying "forecasting is wastefwl", I'm saying "here are some reasons that may help you analyse". My opinions shouldn't matter to the value of the post, since I explicitly say that people shouldn't defer to me.
    1. The arguments are entirely filtered for anti-forecasting, because I expect people to already be aware of the pro-forecasting arguments I currently have on offer, and I only wish to provide tools that they may not already have.

Role-based socioepistemology, and "forecasters" vs "explorers"

  1. I'm supposed to try to figure out what a good research community looks like, and that will involve different people filling different roles. I believe there are tangible methodological differences between optimal forecasting and optimal exploring, and I want to refine my model of what those differences are.
  2. When I talk about "forecasters", it's usually because I want to contrast that with what I think good methodologies for "explorers" are. Truth is, I have no idea how to do good forecasting, so it usually ends up being rather strawman-ish.
  3. When I say "explorer" I think of people like V.S. Ramachandran, Feynman, Kary Mullis, and people who aren't afraid of being wrong a bunch in order to be extremely right on occasion.
  4. Forecasters, by contrast, need to produce work that people can safely defer to and use for prioritising between consequential decisions, so the negative impact of being wrong is much greater.

Exploring helps forecasting more than the other way around

  1. The way I usually update my estimates on the importance of doing X (e.g. animal advocacy or AI alignment) is by spending most of my time actually doing X and thereby learning how worthwhile it is.
    1. If X hits diminishing returns, or I uncover evidence that reduces my confidence in X, then I'll spend more resources trying to look for alternative paths.
    2. This way, I still get evidence related to prioritisation and forecasting, but I also make progress on object-level projects. The forecasting-related flow of information from project-execution is often sufficient that I don't need to spend much time explicitly forecasting.
      1. (I realise the terms are insufficiently well-defined, but hopefwly I communicate my intention.)
  2. It seems plausible that if something like this algorithm is widely adopted in the community, we not only make progress on important projects faster, but we also uncover more evidence related to prioritisation and forecasting.

Thanks a lot : )

(Honestly just posting comments on posts linking to relevant stuff you can think of is both cheap and decent value.)