
elifland

2859 karma · Working (0-5 years)
www.elilifland.com/

Bio

You can give me anonymous feedback here. I often change my mind and don't necessarily endorse past writings.


Thanks titotal for taking the time to dig deep into our model and write up your thoughts, it's much appreciated. This comment speaks for Daniel Kokotajlo and me, not necessarily any of the other authors on the timelines forecast or AI 2027. It addresses most but not all of titotal’s post.

Overall view: titotal pointed out a few mistakes and communication issues which we will mostly fix. We are therefore going to give titotal a $500 bounty to represent our appreciation.  However, we continue to disagree on the core points regarding whether the model’s takeaways are valid and whether it was reasonable to publish a model with this level of polish. We think titotal’s critiques aren’t strong enough to overturn the core conclusion that superhuman coders by 2027 are a serious possibility, nor to significantly move our overall median. Moreover, we continue to think that AI 2027’s timelines forecast is (unfortunately) the world’s state-of-the-art, and challenge others to do better. If instead of surpassing us, people simply want to offer us critiques, that’s helpful too; we hope to surpass ourselves every year in part by incorporating and responding to such critiques.

Clarification regarding the updated model

My apologies for quietly updating the timelines forecast without announcing it; we are aiming to announce the update soon. I'm glad that titotal was able to see it.

A few clarifications:

  1. titotal says “it predicts years longer timescales than the AI2027 short story anyway.” While the medians are indeed 2029 and 2030, the models still give ~25-40% to superhuman coders by the end of 2027.
  2. Other team members (e.g. Daniel K) haven’t reviewed the updated model in depth, and have not integrated it into their overall views. Daniel is planning to do this soon, and will publish a blog post about it when he does.

Most important disagreements

I'll let titotal correct us if we misrepresent them on any of this.

  1. Whether to estimate and model dynamics for which we don't have empirical data. e.g. titotal says there is "very little empirical validation of the model," and especially criticizes the modeling of superexponentiality as having no empirical backing. We agree that it would be great to have more empirical validation of more of the model components, but unfortunately that's not feasible at the moment while incorporating all of the highly relevant factors.[1]
    1. Whether to adjust our estimates based on factors outside the data. For example, titotal criticizes us for making judgmental forecasts for the date of RE-Bench saturation, rather than plugging in the logistic fit. I’m strongly in favor of allowing intuitive adjustments on top of quantitative modeling when estimating parameters.
  2. [Unsure about level of disagreement] The value of a "least bad" timelines model. While the model is certainly imperfect due to limited time and the inherent difficulties around forecasting AGI timelines, we still think overall it’s the “least bad” timelines model out there and it’s the model that features most prominently in my overall timelines views. I think titotal disagrees, though I’m not sure which one they consider least bad (perhaps METR’s simpler one in their time horizon paper?). But even if titotal agreed that ours was “least bad,” my sense is that they might still be much more negative on it than us. Some reasons I’m excited about publishing a least bad model:
    1. Reasoning transparency. We wanted to justify the timelines in AI 2027, given limited time. We think it’s valuable to be transparent about where our estimates come from even if the modeling is flawed in significant ways. Additionally, it allows others like titotal to critique it.
    2. Advancing the state of the art. Even if a model is flawed, it seems best to publish to inform others’ opinions and to allow others to build on top of it.
  3. The likelihood of time horizon growth being superexponential, before accounting for AI R&D automation. See this section for our arguments in favor of superexponentiality being plausible, and titotal’s responses (I put it at 45% in our original model). This comment thread has further discussion. If you are very confident in no inherent superexponentiality, superhuman coders by end of 2027 become significantly less likely, though they are still >10% if you agree with the rest of our modeling choices (see here for a side-by-side graph generated from my latest model).
    1. How strongly superexponential the progress would be. This section argues that our choice of superexponential function is arbitrary. While we agree that the choice is fairly arbitrary and ideally we would have uncertainty over the best function, my intuition is that titotal’s proposed alternative curve feels less plausible than the one we use in the report, conditional on some level of superexponentiality.
    2. Whether the argument for superexponentiality is stronger at higher time horizons. titotal is confused about why there would sometimes be a delayed superexponential rather than superexponential growth starting from the beginning of the simulation. The reasoning is that the conceptual argument for superexponentiality is much stronger at higher time horizons (e.g. going from a 100-year to a 1,000-year horizon seems much easier than going from 1 to 10 days, while the comparison is less clear for 1 to 10 weeks vs. 1 to 10 days). It’s unclear that a delayed superexponential is exactly the right way to model that, but it’s what I came up with for now. (A toy sketch of the exponential vs. superexponential distinction follows this list.)
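To make the exponential vs. superexponential distinction concrete, here is a toy Python sketch, not the actual timelines model: in the exponential case the time horizon doubles on a fixed calendar schedule, while in the superexponential case each successive doubling takes a fixed fraction less calendar time than the previous one. The starting horizon, 4-month initial doubling time, 10% shrink factor, and target horizon are all illustrative assumptions, not the report's fitted parameters.

```python
# Toy illustration (not the actual AI 2027 timelines model): compare an
# exponential time-horizon trajectory (constant doubling time) with a
# superexponential one (each successive doubling takes a fixed fraction
# less calendar time). All parameter values are illustrative assumptions.

def months_to_reach(target_horizon_hours,
                    start_horizon_hours=8.0,
                    first_doubling_months=4.0,
                    shrink_per_doubling=0.0):
    """Calendar months until the time horizon reaches the target.

    shrink_per_doubling = 0.0 gives plain exponential growth; e.g. 0.1
    makes each doubling 10% faster than the previous one, a simple form
    of superexponential growth.
    """
    horizon = start_horizon_hours
    doubling_time = first_doubling_months
    months = 0.0
    while horizon < target_horizon_hours:
        months += doubling_time
        horizon *= 2
        doubling_time *= (1 - shrink_per_doubling)
    return months

# 160,000 work-hours is roughly 80 work-years; a purely illustrative target.
TARGET = 160_000
print(f"exponential:      {months_to_reach(TARGET):.1f} months")
print(f"superexponential: {months_to_reach(TARGET, shrink_per_doubling=0.1):.1f} months")
```

With these made-up numbers the exponential trajectory takes 60 months and the superexponential one about 32; the point is only to show how shrinking doubling times pull the arrival date forward, not to reproduce any forecast.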

Other disagreements

  1. Intermediate speedups: Unfortunately we haven’t had the chance to dig deeply into this section of titotal’s critique, and since it’s mostly based on the original version of the model rather than the updated one, we probably will not get to it. The speedup from including AI R&D automation seems pretty reasonable intuitively at the moment (you can see a side-by-side here).
  2. RE-Bench logistic fit (section): We think it’s reasonable to set the ceiling of the logistic at wherever we think the maximum achievable performance would be. We don’t think it makes sense to give weight to a fit whose ceiling is 0.5 when we know reference solutions achieve 1.0 and we have reason to believe it’s possible to get substantially higher. We agree that we are making a guess (or, with more positive connotation, an “estimate”) about the maximum score, but that seems better than the alternative of doing no fit. (A minimal sketch of what fixing the ceiling means appears after this list.)
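For concreteness, here is a minimal sketch of the fixed-ceiling point. The scores below are fabricated for illustration and are not RE-Bench data; the 1.0 ceiling simply mirrors the reference-solution level mentioned above. The sketch fits a logistic over time once with the ceiling as a free parameter and once with it pinned at the assumed maximum achievable score.

```python
# Minimal sketch of fitting a logistic with a free vs. fixed ceiling.
# The data points are fabricated for illustration only.
import numpy as np
from scipy.optimize import curve_fit

def logistic(t, k, t0, L):
    """Logistic curve in time with ceiling L, steepness k, midpoint t0."""
    return L / (1 + np.exp(-k * (t - t0)))

# Fabricated benchmark scores over time (decimal years), leveling off near 0.5.
t = np.array([2022.5, 2023.0, 2023.5, 2024.0, 2024.5])
score = np.array([0.04, 0.12, 0.28, 0.42, 0.48])

# Free-ceiling fit: the ceiling L is estimated from the data, so it can land
# near the data's current plateau even if higher scores are achievable.
(k_free, t0_free, L_free), _ = curve_fit(
    logistic, t, score, p0=[2.5, 2023.5, 0.6], maxfev=10000)

# Fixed-ceiling fit: pin L at the assumed maximum achievable score and fit
# only the steepness and midpoint.
L_MAX = 1.0
(k_fix, t0_fix), _ = curve_fit(
    lambda tt, k, t0: logistic(tt, k, t0, L_MAX), t, score,
    p0=[2.5, 2023.5], maxfev=10000)

print(f"free ceiling:  L = {L_free:.2f}, midpoint year = {t0_free:.1f}")
print(f"fixed ceiling: L = {L_MAX:.2f}, midpoint year = {t0_fix:.1f}")
```

The design choice being illustrated: with the ceiling pinned higher, the fitted midpoint (and hence any implied saturation date) shifts later, so the assumed maximum achievable score materially affects the forecast.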

Mistakes that titotal pointed out

  1. We agree that the graph we’ve tweeted is not closely representative of the typical trajectory of our timelines model conditional on superhuman coders in March 2027. Sorry about that, we should have prioritized making it more precisely faithful to the model. We will fix this in future communications.
  2. They convinced us to remove the public vs. internal argument as a consideration in favor of superexponentiality (section).
  3. We like the analysis done regarding the inconsistency of the RE-Bench saturation forecasts with an interpolation of the time horizons progression. We agree that it’s plausible that we should just not have RE-Bench in the benchmarks and gaps model; this is partially an artifact of a version of the model that existed before the METR time horizons paper.

In accordance with our bounties program, we will award $500 to titotal for pointing these out.

Communication issues

There were several issues with communication that titotal pointed out which we agree should be clarified, and we will do so. These issues arose from lack of polish rather than malice. Two of the most important ones:

  1. The “exponential” time horizon case still has superexponential growth once you account for automation of AI R&D (see the toy sketch after this list).
  2. The forecasts for RE-Bench saturation were adjusted based on other factors on top of the logistic fit.
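To illustrate why the first point holds in principle, here is a toy Python sketch, not drawn from the actual model: the base doubling time of the time horizon is held constant (the “exponential” case), but an AI R&D speed multiplier that grows as capabilities grow compresses the calendar time each doubling takes, so overall growth ends up superexponential. The starting horizon, 4-month base doubling time, and the 0.2-per-doubling speedup increment are illustrative assumptions.

```python
# Toy illustration (not the actual model): constant base doubling time
# ("exponential" case) plus a growing AI R&D speed multiplier yields
# superexponential growth overall. All parameter values are illustrative.

def horizon_after(calendar_months,
                  start_horizon_hours=8.0,
                  base_doubling_months=4.0,
                  speedup_gain_per_doubling=0.2):
    """Time horizon (hours) reached after a given number of calendar months."""
    horizon = start_horizon_hours
    speedup = 1.0  # AI R&D progress multiplier; starts with no automation boost
    elapsed = 0.0
    while True:
        step = base_doubling_months / speedup  # calendar time for the next doubling
        if elapsed + step > calendar_months:
            return horizon
        elapsed += step
        horizon *= 2
        speedup += speedup_gain_per_doubling  # automation speeds up R&D further

print(horizon_after(24, speedup_gain_per_doubling=0.0))  # pure exponential: 512.0
print(horizon_after(24))                                  # with R&D speedup: 8192.0
```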
[1] Relatedly, titotal thinks that we made our model too complicated, while I think it's important to make our best guess for how each relevant factor affects our forecast.

Centre for the Governance of AI does alignment research and policy research. It appears to focus primarily on the former, which, as I've discussed, I'm not as optimistic about. (And I don't like policy research as much as policy advocacy.)

I'm confused; is the claim here that GovAI does more technical alignment than policy research?

Would you be interested in making quantitative predictions on the revenue of OpenAI/Anthropic in upcoming years, and/or on when various benchmarks like these (and OSWorld, released since that series was created) will be saturated, and/or on when various Preparedness/ASL levels will be triggered?

Want to discuss bot-building with other competitors? We’ve set up a Discord channel just for this series. Join it here.

 

I get "Invite Invalid"

How did you decide to target Cognition? 

IMO it makes much more sense to target AI developers who are training foundation models with huge amounts of compute. My understanding is that Cognition isn't training foundation models, and is more of a "wrapper" in the sense that they are building on top of others' foundation models to apply scaffolding, and/or fine-tuning with <~1% of the foundation model training compute. Correct me if I'm wrong.

Gesturing at some of the reasons I think that wrappers should be deprioritized:

  1. Much of the risk from scheming AIs routes through internal AI R&D via internal foundation models
  2. Over time, I'd guess that wrapper companies working on AI R&D-relevant tasks, like Cognition, will either get acquired or fade into irrelevance, since there will be pressure to build AI R&D agents internally (though maybe this campaign is still useful if Cognition gets acquired?)
  3. Accelerating LM agent scaffolding has unclear sign for safety

Maybe the answer is that Cognition was way better than foundation model developers on other dimensions, in which case, fair enough.

Thanks for organizing this! Tentatively excited about work in this domain.

I do think that generating models/rationales is part of forecasting as it is commonly understood (including in EA circles), and certainly don't agree that forecasting by definition means that little effort was put into it!
Maybe the right place to draw the line between forecasting rationales and “just general research” is asking “is the model/rationale for the most part tightly linked to the numerical forecast?” If yes, it's forecasting; if not, it's something else.

 

Thanks for clarifying! Would you consider OpenPhil worldview investigations reports such as Scheming AIs, Is power-seeking AI an existential risk, Bio Anchors, and Davidson's takeoff model to be forecasting? It seems to me that they are forecasting in a relevant sense, and (for all except maybe Scheming AIs?) in the sense you describe of the rationale being tightly linked to a numerical forecast, but they wouldn't fit under the OP forecasting program area (correct me if I'm wrong).

Maybe it's not worth spending too much time on these terminological disputes; perhaps the relevant question for the community is what the scope of your grantmaking program is. If the months-to-years-long reports above indeed wouldn't be covered, then it seems to me that the amount of effort spent is a relevant dimension of what counts as "research with a forecast attached" vs. "forecasting as generally understood in EA circles and covered under your program", so it might be worth clarifying the boundaries there. If you would indeed consider reports like the worldview investigations ones under your program, then never mind, but it would be good to clarify, as I'd guess most people would not expect that.

Thanks for writing this up, and I'm excited about FutureSearch! I agree with most of this, but I'm not sure framing it as more in-depth forecasting is the most natural choice, given how people generally use the word forecasting in EA circles (i.e. associated with Tetlock-style superforecasting, often the aggregation of very part-time forecasters' views, etc.). It might, imo, be more natural to think of it as a need for in-depth research, perhaps with a forecasting flavor. Here's part of a comment I left on a draft.

However, I kind of think the framing of the essay is wrong [ETA: I might hedge wrong a bit if writing on EAF :p] in that it categorizes a thing as "forecasting" that I think is more naturally categorized as "research" to avoid confusion. See point (2)(a)(ii) at https://www.foxy-scout.com/forecasting-interventions/ ; basically I think calling "forecasting" anything where you slap a number on the end is confusing, because basically every intellectual task/decision can be framed as forecasting.

It feels like this essay is overall arguing that AI safety macrostrategy research is more important than AI safety superforecasting (and the superforecasting is what EAs mean when they say "forecasting"). I don't think the distinction being pointed to here is necessarily whether you put a number at the end of your research project (though I think that's usually useful as well), but the difference between deep research projects and Tetlock-style superforecasting.

I don't think they are necessarily independent btw, they might be complementary (see https://www.foxy-scout.com/forecasting-interventions/ (6)(b)(ii) ), but I agree with you that the research is generally more important to focus on at the current margin.

[...] Like, it seems more intuitive to call https://arxiv.org/abs/2311.08379 a research project rather than forecasting project even though one of the conclusions is a forecast (because as you say, the vast majority of the value of that research doesn't come from the number at the end).

Thanks Ozzie for chatting! A few notes reflecting on places I think my arguments in the conversation were weak:

  1. It's unclear what short timelines would mean for AI-specific forecasting. If AI timelines are short, you shouldn't forecast non-AI things much, but it's less clear what that means for forecasting AI itself: there's less time for effects to compound, but you have more information and more proximity to the most important decisions. So short timelines do discount non-AI forecasting a lot, and some flavors of AI forecasting.
  2. I also feel weird about the comparison I made between forecasting and waiting for things to happen in the world. There might be something to it, but I think it is valuable to force yourself to think deeply about what will happen, to help form better models of the world, in order to better interpret new events as they happen.

Just chatted with @Ozzie Gooen about this and will hopefully release audio soon. I probably overstated a few things / gave a false impression of confidence in the parent in a few places (e.g., my tone was probably a little too harsh on non-AI-specific projects); hopefully the audio convo will give a more nuanced sense of my views. I'm also very interested in criticisms of my views and others sharing competing viewpoints.

Also want to emphasize the clarifications from my reply to Ozzie:

  1. While I think it's valuable to share thoughts about the value of different types of work candidly, I am very appreciative of both the people working on forecasting projects and the grantmakers in the space for their work trying to make the world a better place (and am friendly with many of them). As I maybe should have made more obvious, I am myself affiliated with Samotsvety Forecasting and with Sage, which has done several forecasting projects (and am for the most part more pessimistic about forecasting than others in these groups/orgs). And I'm also doing AI forecasting research atm, though not the type that would be covered under the grantmaking program.
  2. I'm not trying to claim with significant confidence that this program shouldn't exist. I am trying to share my current views on the value of previous forecasting grants and the areas that seem most promising to me going forward. I'm also open to changing my mind on lots of this!