kokotajlod

2522 · Joined Aug 2015

Bio

Most of my stuff (even the stuff of interest to EAs) can be found on LessWrong: https://www.lesswrong.com/users/daniel-kokotajlo

Sequences (2)

Tiny Probabilities of Vast Utilities: A Problem for Longtermism?
What to do about short timelines?

Comments (353)

I'm so excited to see this go live! I've learned a lot from it & consider it to do for takeoff speeds what Ajeya's report did for timelines, i.e. it's an actual fucking serious-ass gears-level model, the best that exists in the world for now. Future work will critique it and build off it rather than start from scratch, I say. Thanks Tom and Epoch and everyone else who contributed!

I strongly encourage everyone reading this to spend 10 minutes playing around with the model, trying out different settings, etc. For example: try to get it to match what you intuitively felt timelines and takeoff would look like, and see how hard it is to do so. Or: go through the top 5-10 variables one by one and change them to what you think they should be (leaving unchanged the ones about which you have no opinion), then see what effect each change has.

Almost two years ago I wrote this story of what the next five years would look like on my median timeline. At the time I had the bio anchors framework in mind, with a median training requirement of 3e29 FLOP. So, you can use this takeoff model as a nice complement to that story:

  • Go to takeoffspeeds.com and load the preset: best guess scenario.
  • Set AGI training requirements to 3e29 instead of 1e36
  • (Optional) Set software returns to 2.5 instead of 1.25 (I endorse this change in general, because it's more consistent with the empirical evidence. See Tom's report for details & decide whether his justification for cutting it in half, to 1.25, is convincing.)
  • (Optional) Set FLOP gap to 1e2 instead of 1e4 (In general, as Tom discusses in the report, if training requirements are smaller then probably the FLOP gap is smaller too. So if we are starting with Tom's best guess scenario and lowering the training requirements we should also lower the FLOP gap.)
  • The result: 

In 2024, 4% of AI R&D tasks are automated; then 32% in 2026; and then the singularity happens around when I expected, in mid-2028. This is close enough to what I had expected when I wrote the story that I'm tentatively making it canon.
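If it helps to keep track of the deltas from the preset, here is a minimal sketch expressing them as a plain Python dict. The field names are illustrative placeholders, not takeoffspeeds.com's actual parameter names; the values are just the ones listed above.

```python
# Hypothetical sketch: the best-guess preset values mentioned above, plus the
# overrides described in the bullet list. Names are illustrative, not the
# model's real parameter names.
best_guess_preset = {
    "agi_training_requirements_flop": 1e36,  # preset value
    "software_returns": 1.25,                # preset value
    "flop_gap": 1e4,                         # preset value
}

my_overrides = {
    "agi_training_requirements_flop": 3e29,  # bio-anchors-style median
    "software_returns": 2.5,                 # optional change
    "flop_gap": 1e2,                         # optional change
}

# Merge the overrides into a copy of the preset and print the scenario,
# so it's easy to record and compare different runs.
scenario = {**best_guess_preset, **my_overrides}
for name, value in scenario.items():
    print(f"{name}: {value:.2e}")
```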

Oh, also, a citation for my contribution to this post (Tom was going to make this a footnote but ran into technical difficulties): the extremely janky graph/diagram was made by me in May 2021, to help explain Ajeya's Bio Anchors model. The graph that forms the bottom-left corner came from some ARK Invest webpage which I can't find now.

I didn't say it was easy! I just said that rational analysis, rational gathering of evidence, etc. can pay dividends.

And indeed, if you go back 5 years and look at what people were saying at the time, some people did do way better than most at predicting what happened.* I happen to remember being at dinner parties in the Bay in late 2018 and early 2019 where LWers were discussing the question "If, as now seems quite plausible, predicting text is the key to general intelligence & will scale to AGI, what implications does that have?" This may even have been before GPT-2 was public; I don't remember. Probably it was shortly after.

That's on hard mode though -- to prove my point all I have to do is point out that most of the world has been surprised by the general pace of progress in AI, and in particular progress towards AGI, in the last 5 years. It wasn't even on the radar for most people. But for some people not only was it on the radar but it was basically what they expected. (MIRI's timelines haven't changed much in the last 5 years, I hear, because things have more or less proceeded about as quickly as they thought. Different in the details of course, but not generally slower or faster.)

*And I don't think they just got lucky. They were well-connected and following the field closely, and took the forecasting job unusually seriously, and were unusually rational as people.

I mean, yeah, some things are basically impossible to get any signal/evidence/etc. on and for those things the NYT and Metaculus and the best forecaster in the world are all equally useless.

But predicting AGI isn't one of those topics. It instead is one of the vast majority of topics where rational analysis, rational gathering of evidence, etc. can pay dividends.

It's all relative. I trust my own forecasts more than Metaculus' forecasts (in some domains; in other domains I don't) because I've thought through the arguments for myself. But for topics I know little about, who would I trust more -- Metaculus or my Twitter feed or some op-ed in the New York Times? Answer: Metaculus. 

Obviously I'd drop Metaculus in an instant if I had a better source of evidence, it's just that often I don't.

"(The first public 'general AI' system is predicted in 2038, which makes me a bit confused. I fail to see how there's an 11 year gap between weak and 'strong' AI, especially with superintelligence ~10 months after the first AGI. Am I missing something?)."

Nice that you noticed this! This is, I think, an inconsistency in Metaculus' answers, one that has been pointed out at least twice before but still hasn't been corrected.

  1. I've opted out from workplace retirement/pension schemes.
  2. Plans to have a second child were put on hold indefinitely due to my timelines collapsing in 2020. This sucks, as my wife and I both really want to have a second child & could have done so by now.
  3. I'm making trade-offs for 'career' over 'family' that I wouldn't normally make, most notably spending two-thirds of my time in SF whereas my wife and kid are in Boston. If I had 30-year timelines like I did in 2019, I'd probably be looking to settle down in Boston even at some cost to my productivity.
  4. While I've mostly reverted to my hedonic set point, I am probably somewhat more grim than I used to be, and I find myself dealing with small flashes of sadness on a daily basis.

I think work tests are a great way to hire people (because they are less biased than the alternatives) but I agree they should be paid for. I didn't know unpaid work tests were a thing. Which orgs had extensive unpaid work tests?

Wikipedia quotes someone disparaging the antislavery efforts saying they used less than 5% of Royal Navy ships, which I interpret as meaning they probably used between 4% and 5%.

This article says: "At its peak in the 1840s and 1850s, British operations off the West African coast involved up to 36 vessels and more than 4,000 men, costing an estimated half of all naval spending – amounting to between 1% and 2% of British government expenditure."

Big discrepancy between "half of all naval spending" and "less than 5%." Maybe the peak was half and the min was 4%?

Either way, I'm pretty impressed. I wonder what fraction of typical Great Power military budgets go to humanitarian efforts.


I would hope that there is a citation/reference/source listed in the book! Alas, I don't have the book myself. I too am curious about this question. 2% of GDP is quite a lot.

I wonder what % of Royal Navy resources were devoted to it. In some ways that's a more impressive metric than GDP, because governments make stupid decisions that cost huge fractions of GDP all the time; just because a government made a decision that cost a fraction of GDP doesn't mean it knowingly accepted the sacrifice.

I just wanna say, if that's the best you can do for "EA is deceptive" then it seems like EA is significantly less deceptive than the average social movement, corporation, university, government, etc.

As for misaligned, yes definitely, in the way you mention. This is true of pretty much all human groups though, e.g. social movements, corporations, universities, governments. The officially stated goals, and the metrics that are officially optimized for, would cause existential catastrophe if optimized at superintelligent levels.
