2344 · Joined Aug 2015


Most of my stuff (even the stuff of interest to EAs) can be found on LessWrong: https://www.lesswrong.com/users/daniel-kokotajlo


Tiny Probabilities of Vast Utilities: A Problem for Longtermism?
What to do about short timelines?


Oops, accidentally voted twice on this. Didn't occur to me that the LW and EAF versions were the same underlying poll.

In addition to that, it's important not just that you actually have high integrity but that people believe you do. And people will be rightly hesitant to believe that you do if you are going around saying that the morally correct thing to do is maximize expected utility, but don't worry, it's always and everywhere true that the way to maximize expected utility is to act as if you have high integrity. There are two strategies available, then: actually have high integrity, which means not being 100% a utilitarian/consequentialist, or carry out an extremely convincing deception campaign to fool people into thinking you have high integrity. I recommend the former & if you attempt the latter, fuck you.

In practice, many of the utilitarians/consequentialists don't see the negative outcomes themselves, or at least sufficiently many of them don't that things will go to shit pretty quickly. (Relatedly, see the Unilateralists' Curse, the Epistemic Prisoner's Dilemma, and pretty much the entire literature of game theory, all those collective action problems...).

Hey, no need to apologize, and besides I wasn't even expecting a reply since I didn't ask a question. 

Your points 1 and 2 are good. I should have clarified what I meant by "people." I didn't mean everyone, I guess I meant something like "Most of the people who are likely to read this." But maybe I should be less extreme, as you mentioned, and exclude people who satisfy 1a+1b. Fair enough.

Re point 2: Yeah. I think your post is good; explicitly thinking about and working through feelings & biases etc. is an important complement to object-level thinking about a topic. I guess I was coming from a place of frustration with the way meta-level stuff seems to get more attention/clicks/discussion on forums like this than object-level analysis. At least that's how it seems to me. But on reflection I'm not sure my impression is correct; I feel like the ideal ratio of object-level to meta stuff should be 9:1 or so, and I haven't bothered to check. Maybe we aren't that far off on this forum (on the subject of timelines).


Note that Crunch Time is different for different people & different paths-to-impact. For example, maybe when it comes to AI alignment, crunch time begins 1 year before powerbase ability, because that's when people are deciding which alignment techniques to use on the model(s) that will seize power if they aren't aligned, and the value of learning & growing in the years immediately prior is huge. Yet at the same time it could be that for AI governance crunch time begins 5 years before powerbase ability, because coordination of labs and governments gets exponentially harder the closer you get to powerbase ability as the race to AGI heats up, and the value of learning and growing in those last few years is relatively small since it's more about implementing the obvious things (labs should coordinate, slow down, invest more in safety, etc.).

Thanks for this critique! I agree this is an important subject that is relatively understudied compared to other aspects of the problem. As far as I can tell there just isn't a science of takeover; there's military science, there's the science of how to win elections in a democracy, and there's a bit of research and a few books on the topic of how to seize power in a dictatorship... but for such an important subject, when you think about it, it's unfortunate that there isn't a general study of how agents in multi-agent environments accumulate influence and achieve large-scale goals over long time periods.

I'm going to give my reactions below as I read:

These passages seem to imply that the rate of scientific progress is primarily limited by the number and intelligence level of those working on scientific research. It is not clear, however, that the evidence supports this.

I mean it's clearly more than JUST the number and intelligence of the people involved, but surely those are major factors! Piece of evidence: Across many industries, performance on important metrics (e.g. price) seems to predictably improve exponentially with investment/effort (this is called the experience curve effect). Another piece of evidence: AlphaFold 2.
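To make the experience curve point concrete, here is a minimal sketch of one common formalization (often called Wright's law), in which unit cost is multiplied by a fixed factor every time cumulative output doubles. The function name, the 100.0 starting cost, and the 0.8 progress ratio are illustrative assumptions, not figures from this thread.

```python
import math


def unit_cost(cumulative_units: float, first_unit_cost: float, progress_ratio: float) -> float:
    """Cost of a unit after `cumulative_units` have been produced.

    progress_ratio is the factor that cost is multiplied by per doubling
    of cumulative output (0.8 means a 20% drop per doubling). Hypothetical
    parameter values; pick ones that fit the industry in question.
    """
    exponent = math.log2(progress_ratio)  # negative whenever progress_ratio < 1
    return first_unit_cost * cumulative_units ** exponent


# Each doubling of cumulative output multiplies the cost by the same factor.
for n in (1, 2, 4, 8):
    print(n, round(unit_cost(n, 100.0, 0.8), 2))
```

Under this model the improvement is exponential in the number of doublings of cumulative effort, which is the sense in which progress tracks investment.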

Later you mention the gradual accumulation of ideas and cite the common occurrence of repeated independent discoveries. I think this is quite plausible. But note that a society of AIs would be thinking and communicating much faster than a society of humans, so the process of ideas gradually accumulating in their society would also be sped up.

First, though the actual model training was rapid, the entire process of developing Alpha Zero was far more protracted. Focusing on the day of training presents a highly misleading picture of the actual rate of progress of this particular example.

Sure, and similarly if AI R&D ability is like AI Go ability, there'll be a series of better and better AIs over the course of many years that gradually get better at various aspects of R&D, until one day an AI is trained that is better than the most brilliant genius scientists. I actually expect things to be slower and more smoothed out than this, probably, because training will take more like a year. This is all part of the standard picture of AI takeover, not an objection to it.

Second, Go is a fully-observable, discrete-time, zero-sum, two-player board game. 

I agree that the real world is more complex etc. and that just doing the same sort of self-play won't work. There may be more sophisticated forms of self-play that work though. Also you don't need self-play to be superhuman at something, e.g. you could use decision transformers + imitation learning.

These all take time to develop and put into place, which is why the development of novel technologies takes a long time. For example, the Lockheed Martin F-35 took about fifteen years from initial design to scale production. The Gerald R. Ford aircraft carrier took about ten years to build and fit out. Semiconductor fabrication plants cost billions of dollars, and the entire process from the design of a chip to manufacturing takes years. Given such examples, it seems reasonable to expect that even a nascent AGI would require years to design and build a functioning nanofactory. Doing so in secret or without outside interference would be even more difficult given all the specialised equipment, raw materials, and human talent that would be needed. A bunch of humans hired online cannot simply construct a nanofactory from nothing in a few months, regardless of how advanced is the AGI overseeing the process.

I'd be interested to hear your thoughts on this post which details a combination of "near-future" military technologies. Perhaps you'll agree that the technologies on this list could be built in a few months or years by a developed nation with the help of superintelligent AI? Then the crux would be whether this tech would allow that nation to take over the world. I personally think that military takeover scenarios are unlikely because there are much easier and safer methods, but I still think military takeover is at least on the table -- crazier things have happened in history. 

That said, I don't concede the point -- You are right that it would take modern humans many years to build nanofactories etc. but I don't think this is strong evidence that a superintelligence would also take many years. Consider video games and speedrunning. Even if speedrunners don't allow themselves to use bugs/exploits, they still usually go significantly faster than reasonably good players. Consider also human engineers building something they already understand well how to build vs. building something for the first time ever. The point is, if you are really smart and know what you are doing, you can do stuff much faster. You said that a lot of experimentation and experience is necessary -- well, maybe it's not. In general there's a tradeoff between smarts and experimentation/experience; if you have more of one you need less of the other to reach the same level of performance. Maybe if you crank up smarts to superintelligence level -- so intelligent that the best human geniuses seem a rounding error away from the average -- you can get away with orders of magnitude less experimentation/experience. Not for everything perhaps, but for some things. Suppose there are N crazy sci-fi technologies that an AI could use to get a huge advantage: nanofactories, fusion, quantum shenanigans, bioengineering ... All it takes is for 1 of them to be such that you can mostly substitute superintelligence for experimentation. And also you can still do experimentation, and you can do it much faster than humans do it too because you know what you are doing. Instead of toying around until hypotheses gradually coalesce in your brain, you can begin with a million carefully crafted hypotheses consistent with all the evidence you've seen so far and an experiment regime designed to optimally search through the space of hypotheses as fast as possible.
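The "all it takes is 1 of N" step can be sketched numerically. This is a toy calculation under two loud assumptions that the comment does not make: every technology has the same chance of being one where superintelligence can mostly substitute for experimentation, and those chances are independent. The function name and the example numbers are hypothetical.

```python
def prob_at_least_one(p_each: float, n_technologies: int) -> float:
    """Chance that at least one of n candidate technologies pans out.

    Assumes (for illustration only) equal, independent probabilities.
    """
    prob_none = (1.0 - p_each) ** n_technologies  # every candidate fails
    return 1.0 - prob_none


# Even a modest per-technology chance adds up across several candidates.
for n in (1, 5, 10):
    print(n, round(prob_at_least_one(0.1, n), 3))
```

If the candidates are correlated (for example, they all fail for the same underlying reason), the combined chance would be lower than this sketch suggests.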

I expect it to take somewhere between a day and five years to go from what you might call human-level AI to nanobot swarms. Perhaps this isn't that different from what you think? (Maybe you'd say something like 3 to 10 years?)

Relying on a ‘front man’ to serve as the face of the AGI would be highly dangerous, as the AGI would become dependent on this person for ensuring the loyalty of its followers. Of course one might argue that a combination of bribery and threats could be sufficient, but this is not the primary means by which successful leaders in history have obtained obedience and popularity, so an AGI limited to these tools would be at a significant disadvantage. Furthermore, an AGI reliant on control over money is susceptible to intervention by government authorities to freeze assets and hamper the transfer of funds. This would not be an issue if the AGI had control over its own territory, but then it would be subject to blockade and economic sanctions. For instance, it would take an AGI considerable effort to acquire the power of Vladimir Putin, and yet he is still facing considerable practical difficulties in exerting his will on his own (and neighbouring) populations without the intervention of the rest of the world. While none of these problems are necessarily insuperable, I believe they are significant issues that must be considered in an assessment of the plausibility of various AI takeover scenarios.

History has many examples of people ruling from behind the throne, so to speak. Often they have no official title whatsoever, but the people with the official titles are all loyal to them. Sometimes the people with the official titles do rebel and stop listening to the power behind the throne, and then said power behind the throne loses power. Other times, this doesn't happen.

AGI need not rule from behind the scenes though. If it's charismatic enough it can rule over a group of Blake Lemoines. Have you seen the movie Her? Did you find the behavior of the humans super implausible in that movie -- no way they would form personal relationships with an AI, no way they would trust it?

It is also unclear how an AGI would gain the skills needed to manipulate and manage large numbers of humans in the first place. It is by no means evident why an AGI would be constructed with this capability, or how it would even be trained for this task, which does not seem very amenable to traditional reinforcement learning approaches. In many discussions, an AGI is simply defined as having such abilities, but it is not explained why such skills would be expected to accompany general problem-solving or planning skills. Even if a generally competent AGI had instrumental reasons to develop such skills, would it have the capability of doing so? Humans learn social skills through years of interaction with other humans, and even then, many otherwise intelligent and wealthy humans possess such skills only to a minimal degree. Unless a credible explanation can be given as to how such an AI would acquire such skills or why they should necessarily follow from broader capabilities, I do not think it is reasonable to simply define an AGI as possessing them, and then assume this as part of a broader takeover narrative. This presents a major issue for takeover scenarios which rely on an AGI engaging large numbers of humans in its employment for the development of weapons or novel technologies.

It currently looks like most future AIs, and in particular AGIs, will have been trained on reading the whole internet & chatting to millions of humans over the course of several months. So, that's how they'll gain those skills.

(But also, if you are really good at generalizing to new tasks/situations, maybe manipulation of humans is one of the things you can generalize to. And if you aren't really good at generalizing to new tasks/situations, maybe you don't count as AGI.)

So far all I've done is critique your arguments but hopefully one day I'll have assembled some writing laying out my own arguments on this subject.

Anyhow, thanks again for writing this! I strongly disagree with your conclusions but I'm glad to see this topic getting serious & thoughtful attention.


OK. I'll DM Nuno.

Something about your characterization of what happened continues to feel unfair & inaccurate to me, but there's definitely truth in it & I think your advice is good so I will stop arguing & accept the criticism & try to remember it going forward. :)

Thanks for this thoughtful explanation & model.

(Aside: So, did I or didn't I come across as unfriendly/hostile? I never suggested that you said that, only that maybe it was true. This matters because I genuinely worry that I did & am thinking about being more cautious in the future as a result.)

So, given that I wanted to do both 1 and 2, would you think it would have been fine if I had just made them as separate comments, instead of mentioning 1 in passing in the thread on 2? Or do you think I really should have picked one to do and not done both?

The thing about changing my mind also resonates -- that definitely happened to some extent during this conversation, because (as mentioned above) I didn't realize Nuno was talking about people who put lots of probability mass on the evolution anchor. For those people, a shift up or down by a couple OOMs really matters, and so the BOTEC I did about how the environment can probably be simulated for less than 10^41 flops needs to be held to a higher standard of scrutiny & could end up being judged insufficient.


Did I come across as unfriendly and hostile? I am sorry if so, that was not my intent.

It seems like you think I was strongly disagreeing with your claims; I wasn't. I upvoted your response and said basically "Seems plausible idk. Could go either way." 

And then I said that it doesn't really impact the bottom line much, for reasons XYZ. And you agree. 

But now it seems like we are opposed somehow even though we seem to basically be on the same page.

For context: I think I didn't realize until now that some people actually took the evolution anchor seriously as an argument for AGI by 2100, not in the sense I endorse (which is as a loose upper bound on our probability distribution over OOMs of compute) but in the much stronger sense I don't endorse (as an actual place to clump lots of probability mass around, and to naively extrapolate Moore's law towards across many decades). I think insofar as people are doing that naive thing I don't endorse, they should totally stop. And yes, as Nuno has pointed out, insofar as they are doing that naive thing, then they should really pay more attention to the environment cost as well as the brain-simulation cost, because it could maaaybe add a few OOMs to the estimate, which would push the extrapolated date of AGI back by decades or even centuries.
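For a rough sense of why a few extra OOMs translate into decades under the naive extrapolation, here is a back-of-the-envelope sketch. It assumes available compute doubles at a fixed interval; the function name and the doubling times used below are illustrative assumptions, not estimates from this thread or from Ajeya's report.

```python
import math


def delay_years(extra_ooms: float, doubling_time_years: float) -> float:
    """Years of extra waiting if the compute requirement rises by `extra_ooms`.

    Assumes (hypothetically) that available compute doubles every
    `doubling_time_years`, forever, which is the naive extrapolation.
    """
    doublings_needed = extra_ooms * math.log2(10)  # about 3.32 doublings per OOM
    return doublings_needed * doubling_time_years


# Try a fast and a slow assumed doubling time for 3 and 6 extra OOMs.
for ooms in (3, 6):
    for doubling_time in (2.0, 10.0):
        print(ooms, doubling_time, round(delay_years(ooms, doubling_time), 1))
```

With an assumed 2-year doubling time, 6 extra OOMs costs roughly four decades; with a much slower assumed doubling time the same shift stretches to centuries, which is why the size of the anchor matters a lot to anyone clumping probability mass on it.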

Huh, I guess I didn't realize how much weight some people put on the evolution anchor.  I thought everyone was (like me) treating it as a loose upper bound basically, not something to actually clump lots of probability mass on.

In other words: The people I know who were using the evolutionary anchor (people like myself, Ajeya, etc.) weren't using it in a way that would be significantly undermined by having to push the anchor up 6 OOMs or so. Like I said, it would be a minor change to the bottom line according to the spreadsheet. Insofar as people were arguing for AGI this century in a way which can be undermined by adding 6 OOMs to the evolutionary anchor then those people are silly & should stop, for multiple reasons, one of which is that maaaybe environmental simulation costs mean that the evolution anchor really is 6 OOMs bigger than Ajeya estimates.

