All of kokotajlod's Comments + Replies

I agree that as time goes on states will take an increasing and eventually dominant role in AI stuff.

My position is that timelines are short enough, and takeoff is fast enough, that e.g. decisions and character traits of the CEO of an AI lab will explain more of the variance in outcomes than decisions and character traits of the US President.

4
Linch
2mo
Makes sense! I agree that fast takeoff + short timelines makes my position outlined above much weaker.  I want to flag that if an AI lab and the US gov't are equally responsible for something, then the comparison will still favor the AI lab CEO, as lab CEOs have much greater control of their company than the president has over the USG. 
  1. My understanding is that relatively few EAs are actual hardcore classic hedonist utilitarians. I think this is ~sufficient to explain why more haven't become accelerationists.
  2. Have you cornered a classic hedonist utilitarian EA and asked them? Have you cornered three? What did they say?
2
JWS
2mo
Don't know why this is being disagree-voted. I think point 1 is basically correct: it doesn't take diverging far from being a "hardcore classic hedonist utilitarian" to not support the case Matthew makes in the OP.

Thanks for discussing with me!

(I forgot to mention an important part of my argument, oops -- you wouldn't have said "at least 100 years off"; you would have said "at least 5000 years off." Because you are anchoring to recent-past rates of progress rather than looking at how rates of progress increase over time and extrapolating. (This is just an analogy / data point, not the key part of my argument, but look at GWP growth rates as a proxy for tech progress rates: According to this, GWP doubling time was something like 600 years back then, whereas it's more l... (read more)

2
titotal
5mo
That argument does make more sense, although it still doesn't apply to me, as I would never confidently state a 5000-year forecast due to the inherent uncertainty of long-term predictions. (My estimates for nanotech are also highly uncertain, for the record.)

No worries, I enjoyed the debate!

I agree with the claims "this problem is extremely fucking hard" and "humans aren't cracking this anytime soon" and I suspect Yudkowsky does too these days.

I disagree that nanotech has to predate taking over the world; that wasn't an assumption I was making or a conclusion I was arguing for at any rate. I agree it is less likely that ASIs will make nanotech before takeover than that they will make nanotech while still on earth.

I like your suggestion to model a more earthly scenario but I lack the energy and interest to do so right now.

My closing statement ... (read more)

2
titotal
5mo
Sorry, to be clear, I wasn't actually making a prediction as to whether nanotech predates AI takeover. My point is that these discussions are in the context of the question "can nanotech be used to defeat humanity". If AI can only invent nanotech after defeating humanity, that's interesting but has no bearing on the question. I also lack the energy or interest to do the modelling, so we'll have to leave it there.

My closing rebuttal: I have never stated that I am certain that nanotech is impossible. I have only stated that it could be impossible, impractical, or disappointing, and that the timelines for development are large, and would remain so even with the advent of AGI.

If I had stated in 1600 that flying machines, moving pictures, thinking machines, etc. were at least 100 years off, I would have been entirely correct and accurate. And for every great technological change that turned out to be true and transformative, there are a hundred great ideas that turned out to be prohibitively expensive, or impractical, or just plain not workable. And as for the ones that did work out, and did transform the world: it almost always took a long time to build them, once we had the ability to. And even then they started out shitty as hell, and took a long, long time to become as flawless as they are today.

I'm not saying new tech can't change the world, I'm just saying it can't do it instantly.

Cool. Seems you and I are mostly agreed on terminology then.

Yeah we definitely disagree about that crux. You'll see. Happy to talk about it more sometime if you like.

Re: galaxy vs. earth: The difference is one of degree, not kind. In both cases we have a finite amount of resources and a finite amount of time with which to do experiments. The proper way to handle this, I think, is to smear out our uncertainty over many orders of magnitude. E.g. the first OOM gets 5% of our probability mass, the second OOM gets 5% of the remaining probability mass, and so fo... (read more)
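For concreteness, here is a minimal sketch (an editorial illustration, not part of the original comment) of the smearing rule described above: each successive order of magnitude (OOM) of required inputs gets 5% of whatever probability mass remains, and the ten-OOM cutoff is arbitrary.

```python
# Illustrative sketch of spreading uncertainty over orders of magnitude (OOMs):
# each successive OOM of required inputs gets 5% of the remaining probability mass.
p_per_oom = 0.05
remaining = 1.0
for k in range(1, 11):
    mass = remaining * p_per_oom      # mass assigned to the k-th OOM
    remaining -= mass                 # mass left over for later OOMs
    print(f"OOM {k}: {mass:.3f} (cumulative {1 - remaining:.3f})")
```

Under this rule roughly 40% of the probability mass falls within the first ten OOMs, with a long tail beyond.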

2
titotal
5mo
It really seems to me like the galaxy thing is just going to mislead, rather than elucidate. I can make my judgements about a system where one planet is converted into computronium, one planet contains a store of every available element, one planet is tiled completely with experimental labs doing automated experiments, etc. But the results of that hypothetical won't scale down to what we actually care about. For example, it wouldn't account for the infrastructure that needs to be built to assemble any of those components in bulk.

If someone wants to try their hand at modelling a more earthly scenario, I'd be happy to offer my insights. Remember, this development of nanotech has to predate the AI taking over the world, or else the whole exercise is pointless. You could look at something like "AI blackmails the dictator of a small country into starting a research program" as a starting point.

Personally, I don't think there is very much you can be certain about, beyond: "this problem is extremely fucking hard", and "humans aren't cracking this anytime soon". I think building the physical infrastructure required to properly do the research in bulk could easily take more than a year on its own.

What if he just said "Some sort of super-powerful nanofactory-like thing?" 

He's not citing some existing literature that shows how to do it, but rather citing some existing literature which should make it plausible to a reasonable judge that a million superintelligences working for a year could figure out how to do it. (If you dispute the plausibility of this, what's your argument? We have an unfinished exchange on this point elsewhere in this comment section. Seems you agree that a galaxy full of superintelligences could do it; I feel like it's pretty plausible that if a galaxy of superintelligences could do it, a mere million also could do it.)

6
titotal
5mo
I would vastly prefer this phrasing, because it would be an accurate relaying of his beliefs, and would not involve the use of scientific terms that are at best misleading and at worst active misinformation.

As for the "millions of superintelligences", one of my main cruxes is that I do not think we will have millions of superintelligences in my lifetime. We may have lots of AGI, but I do not believe that AGI = superintelligence. Also, I think that if a few superintelligences come into existence they may prevent others from being built out of self-preservation. These points are probably out of scope here though.

I don't think a million superintelligences could invent nanotech in a year, with only the available resources on Earth. Unlike the galaxy, there is limited computational power available on Earth, and limited everything else as well. I do not think the sheer scale of experimentation required could be assembled in a year, without having already invented nanotech. The galaxy situation is fundamentally misleading.

Lastly, I think even if nanotech is invented, it will probably end up being disappointing or limited in some way. This tends to be the case with all technologies: did anyone predict that we would build an AI that could easily pass a simple Turing test but be unable to multiply large numbers together? Hypothetical technologies get to be perfect in our minds, but as something actually gets built, it accumulates shortcomings and weaknesses from the inevitable brushes with engineering.

I think the tech companies -- and in particular the AGI companies -- are already too powerful for such an informal public backlash to slow them down significantly.

Disagree. Almost every successful moral campaign in history started out as an informal public backlash against some evil or danger.

The AGI companies involve a few thousand people versus 8 billion, a few tens of billions of funding versus 360 trillion total global assets, and about 3 key nation-states (US, UK, China) versus 195 nation-states in the world. 

Compared to actually powerful industries, AGI companies are very small potatoes. Very few people would miss them if they were set on 'pause'.

4
Greg_Colbourn
6mo
I imagine it going hand in hand with more formal backlashes (i.e. regulation, law, treaties).

I said IMO. In context it was unnecessary for me to justify the claim, because I was asking whether or not you agreed with it.

I take it that not only do you disagree, you agree it's the crux? Or don't you? If you agree it's the crux (i.e. you agree that probably a million cooperating superintelligences with an obedient nation of humans would be able to make some pretty awesome self-replicating nanotech within a few years) then I can turn to the task of justifying the claim that such a scenario is plausible. If you don't agree, and think that even such a su... (read more)

What part of the scenario would you dispute? A million superintelligences will probably exist by 2030, IMO; the hard part is getting to superintelligence at all, not getting to a million of them (since you'll probably have enough compute to make a million copies).

I agree that the question is about the actual scenario, not the galaxy. The galaxy is a helpful thought experiment though; it seems to have succeeded in establishing the right foundations: How many OOMs of various inputs (compute, experiments, genius insights) will be needed? Presumably a galaxy's ... (read more)

8
titotal
6mo
This is a very wild claim to throw out with no argumentation to back it up. Cotra puts a 15% chance on transformative AI by 2036, and I find her assumptions incredibly optimistic about AI arrival. (It's also worth noting that transformative AI and superintelligence are not the same thing.) The other thing I dispute is that a million superintelligences would cooperate. They would presumably have different goals and interests: surely at least some of them would betray the others' plans for a leg-up from humanity.

You don't think some of the people of the "obedient nation" are gonna tip anyone off about the nanotech plan? Unless you think the AIs have some sort of mind-control powers, in which case why the hell would they need nanotech?

I also would like to see such breakdowns, but I think you are drawing the wrong conclusions from this example.

Just because Yudkowsky's first guess about how to make nanotech, as an amateur, didn't pan out, doesn't mean that nanotech is impossible for a million superintelligences working for a year. In fact it's very little evidence. When there are a million superintelligences they will surely be able to produce many technological marvels very quickly, and for each such marvel, if you had asked Yudkowsky to speculate about how to build it, he would have fai... (read more)

Thanks for this thoughtful and detailed deep dive!

I think it misses the main cruxes though. Yes, some people (Drexler and young Yudkowsky) thought that ordinary human science would get us all the way to atomically precise manufacturing in our lifetimes. For the reasons you mention, that seems probably wrong.

But the question I'm interested in is whether a million superintelligences could figure it out in a few years or less. (If it takes them, say, 10 years or longer, then probably they'll have better ways of taking over the world) Since that's the situatio... (read more)

5
EliezerYudkowsky
5mo
I broadly endorse this reply and have mostly shifted to trying to talk about "covalently bonded" bacteria, since using the term "diamondoid" (tightly covalently bonded CHON) causes people to panic about the lack of currently known mechanosynthesis pathways for tetrahedral carbon lattices.

Hey, thanks for engaging. I saved the AGI theorizing for last because it's the most inherently speculative: I am highly uncertain about it, and everyone else should be too. 

But the question I'm interested in is whether a million superintelligences could figure it out in a few years or less. (If it takes them, say, 10 years or longer, then probably they'll have better ways of taking over the world) Since that's the situation we'll actually be facing.

I would dispute that "a million superintelligences exist and cooperate with each other to invent MNT" is ... (read more)

OK, so our credences aren't actually that different after all. I'm actually at less than 65%, funnily enough! (But that's for doom = extinction. I think human extinction is unlikely for reasons to do with acausal trade; there will be a small minority of AIs that care about humans, just not on Earth. I usually use a broader definition of "doom" as "About as bad as human extinction, or worse.")

I am pretty confident that what happens in the next 100 years will straightforwardly translate to what happens in the long run. If humans are still well-cared-for in 2... (read more)

Those words were not yours, but you did say you agreed it was the main crux, and in context it seemed like you were agreeing that it was a crux for you too. I see now on reread that I misread you and you were instead saying it was a secondary crux. Here, let's cut through the semantics and get quantitative:

What is your credence in doom conditional on AIs not caring for humans? 

If it's >50%, then I'm mildly surprised that you think the risk of accidentally creating a permanent pause is worse than the risks from not-pausing. I guess you did say that ... (read more)

2
Denkenberger
7mo
Paul Christiano argues here that AI would only need to have "pico-pseudokindness" (caring about humans one part in a trillion) to take over the universe but not trash Earth's environment to the point of uninhabitability, and that at least this amount of kindness is likely.
2
Matthew_Barnett
7mo
How much do they care about humans, and what counts as doom? I think these things matter. If we're assuming all AIs don't care at all about humans and doom = human extinction, then I think the probability is pretty high, like 65%. If we're allowed to assume that some small minority of AIs cares about humans, or AIs care about humans to some degree, perhaps in the way humans care about wildlife species preservation, then I think the probability is quite a lot lower, at maybe 25%. For precision, both of these estimates are over the next 100 years, since I have almost no idea what will happen in the very long run.

In most of these stories, including in Ajeya's story IIRC, humanity just doesn't seem to try very hard to reduce misalignment? I don't think that's a very reasonable assumption. (Charitably, it could be interpreted as a warning rather than a prediction.) I think that as systems get more capable, we will see a large increase in our alignment efforts and monitoring of AI systems, even without any further intervention from longtermists.

I'm happy to meet up some time and explain in person. I'll try to remember to DM you later about that, but if I forget, then feel free to remind me.

First of all, you are goal-post-moving if you make this about "confident belief in total doom by default" instead of the original "if you really don't think unchecked AI will kill everyone." You need to defend the position that the probability of existential catastrophe conditional on misaligned AI is <50%.

Secondly, "AI motives will generalize extremely poorly from the training distribution" is a confused and misleading way of putting it. The problem is that it'll generalize in a way that wasn't the way we hoped it would generalize.

Third, to answer your... (read more)

7
Matthew_Barnett
7mo
I never said "I don't think unchecked AI will kill everyone". That quote was not from me. What I did say was, "Even if AIs end up not caring much for humans, it is dubious that they would decide to kill all of us." Google informs me that dubious means "not to be relied upon; suspect".

I don't see how the first part of that leads to the second part. Humanity could be on the sidelines in a way that doesn't lead to total oppression and subjugation. The idea that these things will necessarily happen just seems like speculation. I could speculate that the opposite will occur and AIs will leave us alone. That doesn't get us anywhere.

The question I'm asking is: why? You have told me what you expect to happen, but I want to see an argument for why you'd expect that to happen. In the absence of some evidence-based model of the situation, I don't think speculating about specific scenarios is a reliable guide.

Thanks! 

I think this is evidence for a groupthink phenomenon amongst superforecasters. Interestingly my other experiences talking with superforecasters have also made me update in this direction (they seemed much more groupthinky than I expected, as if they were deferring to each other a lot. Which, come to think of it, makes perfect sense -- I imagine if I were participating in forecasting tournaments, I'd gradually learn to reflexively defer to superforecasters too, since they genuinely would be performing well.)

Ironically, one of the two predictions you quote as an example of bad prediction is in fact an example of a good prediction: "The most realistic estimate for a seed AI transcendence is 2020."

Currently it seems that AGI/superintelligence/singularity/etc. will happen sometime in the 2020s. Yudkowsky's median estimate in 1999 was apparently 2020, so he probably had something like 30% of his probability mass in the 2020s, and maybe 15% of it in the 2025-2030 period, when IMO it's most likely to happen (see the rough sketch below).

Now let's compare to what other people would have been saying... (read more)
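To make those probability-mass figures concrete, here is a rough editorial sketch. The distribution shape (a lognormal over years-until-AGI as seen from 1999, with median 2020 and an arbitrarily chosen spread) is an assumption added purely for illustration; it is not something Yudkowsky or the comment above specifies.

```python
# Purely illustrative: a 1999 forecaster with a median of 2020, modeled as a
# lognormal over "years until AGI". The spread (s=0.5) is an arbitrary assumption.
from scipy.stats import lognorm

median_years = 2020 - 1999                    # median of 21 years
dist = lognorm(s=0.5, scale=median_years)     # scale = median for a lognormal

p_2020s = dist.cdf(2030 - 1999) - dist.cdf(2020 - 1999)
p_2025_2030 = dist.cdf(2030 - 1999) - dist.cdf(2025 - 1999)
print(f"P(2020-2030): {p_2020s:.0%}")         # ~28% under these assumptions
print(f"P(2025-2030): {p_2025_2030:.0%}")     # ~12% under these assumptions
```

With a wider or narrower spread the numbers move around, but something in the ballpark of the quoted 30% / 15% falls out of fairly natural assumptions.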

The XPT forecast about compute in 2030 still boggles my mind. I'm genuinely confused what happened there. Is anybody reading this familiar with the answer?

FWIW you can see more information, including some of the reasoning, on page 655 (# written on pdf) /  659 (# according to page searcher) of the report. (H/t Isabel.) See also page 214 for the definition of the question.

Some tidbits:

Experts started out much higher than superforecasters, but updated downwards after discussion. Superforecasters updated a bit upward, but less:

(Chart not reproduced here; the y-axis values are in billions.)

This was surprising to me. I think the experts' predictions look too low even before updating, and look much worse after updating!

The part of the ... (read more)

Reminds me of this:
A kind of conservativeness of "expert" opinion that doesn't correctly appreciate (rapid) exponential growth.

Fair, but still: In 2019 Microsoft invested a billion dollars in OpenAI, roughly half of which was compute: Microsoft invests billions more dollars in OpenAI, extends partnership | TechCrunch

And then GPT-3 happened, and was widely regarded to be a huge success and proof that scaling is a good idea etc.

So the amount of compute-spending that the most aggressive forecasters think could be spent on a single training run in 2032... is about 25% as much compute-spending as Microsoft gave OpenAI starting in 2019, before GPT-3 and before the scaling hypothesis. Th... (read more)

Also, if you do various searches on LW and Astral Codex Ten looking for comments I've made, you might see some useful ones maybe.

No, alas. However I do have this short summary doc I wrote back in 2021: The Master Argument for <10-year Timelines - Google Docs

And this sequence of posts making narrower points: AI Timelines - LessWrong


The XPT forecasters are so in the dark about compute spending that I just pretend they gave more reasonable numbers. I'm honestly baffled how they could be so bad. The most aggressive of them thinks that in 2025 the most expensive training run will be $70M, and that it'll take 6+ years to double thereafter, so that in 2032 we'll have reached $140M training run spending... do these people have any idea how much GPT-4 cost in 2022?!?!? Did they not hear about the investments Microsoft has been making in OpenAI? And remember that's what the most aggressive among them thought! The conservatives seem to be living in an alternate reality where GPT-3 proved that scaling doesn't work and an AI winter set in in 2020.

1
erickb
9mo
Remember these predictions were made in summer 2022, before ChatGPT, before the big Microsoft investment and before any serious info about GPT-4. They're still low, but not ridiculous.
3
kokotajlod
9mo
Perhaps this should be a top-level comment.
  • I haven’t considered all of the inputs to Cotra’s model, most notably the 2020 training computation requirements distribution. Without forming a view on that, I can’t really say that ~53% represents my overall view.

Sorry to bang on about this again and again, but it's important to repeat for the benefit of those who don't know: The training computation requirements distribution is by far the biggest cruxy input to the whole thing; it's the input that matters most to the bottom line and is most subjective. If you hold fixed everything else Ajeya inputs, but... (read more)
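To illustrate why this input dominates, here is a toy editorial sketch with entirely made-up numbers; it is not Ajeya Cotra's actual model, just the basic structure of compute-based timelines: effective compute grows by some number of OOMs per year, and AGI arrives when it crosses the (highly uncertain) requirement.

```python
# Toy sketch, made-up numbers: arrival year as a function of a hypothetical
# training-compute requirement, assuming effective compute for the largest run
# grows by a fixed number of OOMs per year. Each extra OOM of requirement
# delays arrival by 1 / ooms_per_year years, so a requirements distribution
# spanning many OOMs swings the median timeline by decades.
def arrival_year(required_log10_flop, start_year=2023,
                 start_log10_flop=26.0, ooms_per_year=0.5):
    """Year when effective compute first reaches the hypothetical requirement."""
    return start_year + max(0.0, required_log10_flop - start_log10_flop) / ooms_per_year

for req in (30, 33, 36, 39):  # hypothetical requirements, in log10 FLOP
    print(f"1e{req} FLOP needed -> arrival around {arrival_year(req):.0f}")
```

Disagreements about hardware, spending, or algorithmic progress shift ooms_per_year somewhat; disagreements about the requirement shift required_log10_flop by many OOMs, which is why (on this view) it carries the bottom line.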

8
Lukas Finnveden
9mo
It's the crux between you and Ajeya, because you're relatively more in agreement on the other numbers. But I think that adopting the XPT numbers on these other variables would lengthen your own timelines notably, because of the almost complete lack of increase in spending. That said, if the forecasters agreed with your compute requirements, they would probably also forecast higher spending.
4
JoshuaBlake
9mo
Do you have a write-up of your beliefs that lead you to 2030 as your median?

Don't apologise; I think it's a helpful point!

I agree that the training computation requirements distribution is more subjective and matters more to the eventual output.

I also want to note that while on your view of the compute reqs distribution, the hardware/spending/algorithmic progress inputs are a rounding error, this isn't true for other views of the compute reqs distribution. E.g. for anyone who does agree with Ajeya on the compute reqs distribution, the XPT hardware/spending/algorithmic progress inputs shift median timelines from ~2050 to ~2090, which... (read more)

Another nice story! I consider this to be more realistic than the previous one about open-source LLMs. In fact I think this sort of 'soft power takeover' via persuasion is a lot more probable than most people seem to think. That said, I do think that hacking and R&D acceleration are also going to be important factors, and my main critique of this story is that it doesn't discuss those elements and implies that they aren't important.

In addition to building more data centers, MegaAI starts constructing highly automated factories, which will produce the

... (read more)
3
Karl von Wendt
10mo
As replied on LW, we'll discuss this. Thanks!

I think it mostly means that you should be looking to get quick wins. When calculating the effectiveness of an intervention, don't assume things like "over the course of an 85-year lifespan this person will be healthier due to better nutrition now" or "this person will have better education and thus more income 20 years from now." Instead just think: how much good does this intervention accomplish in the next 5 years? (Or, if you want to get fancy, use e.g. a 10%/yr discount rate.)

See Neartermists should consider AGI timelines in their spending decisions - ... (read more)
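As a small editorial sketch of the discounting suggestion above (the benefit stream and units are hypothetical), a 10%/yr discount rate makes value that arrives decades out count for very little:

```python
# Minimal sketch: net present value of a constant annual benefit under a
# 10%/yr exponential discount. Numbers are hypothetical, for illustration only.
def discounted_value(annual_benefit, years, rate=0.10):
    """Sum of annual benefits, each discounted back to the present."""
    return sum(annual_benefit / (1 + rate) ** t for t in range(1, years + 1))

benefit = 1.0  # arbitrary units of good per year
print(f"5-year horizon:  {discounted_value(benefit, 5):.2f}")   # ~3.79
print(f"85-year horizon: {discounted_value(benefit, 85):.2f}")  # ~10.0
```

The 85-year stream is worth only about 2.6x the 5-year stream, because everything past roughly the first couple of decades is discounted away, which is the same practical upshot as just asking about the next 5 years.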

FWIW I think that it's pretty likely that AGI etc. will happen within 10 years absent strong regulation, and moreover that if it doesn't, the 'crying wolf' effect will be relatively minor, enough that even if I had 20-year medians I wouldn't worry about it compared to the benefits.

I also guess cry wolf-effects won't be as large as one might think - e.g. I think people will look more at how strong AI systems appear at a given point than at whether people have previously warned about AI risk.

Normally I'd say yes, but my AGI timelines are now 50% in ~4 years, so there isn't much time for R&D to make a difference. I'd recommend interventions that pay off quickly, therefore. Bed nets, GiveDirectly, etc.

Ouch, I wasn't aware of those rules, they do seem quite restrictive. If it's a website rather than an app, how easy would it be to set it up so that you can access it with a single button press? I guess you can have favorites, default sites, etc.

3
Aaron
1y
On both iOS and Android, you can add websites to your home screen like apps. It’s just usually not quite as nice an experience and not as intuitive for people.

But really, before anyone actually goes and invests significant effort into building this, you should coordinate with me + other people in this comment section.

Minimum acceptable features are the bits prior to "That's it really."


Oh dang. How about: When you press the button it doesn't donate the money right away, but just adds it to an internal tally, and then once a quarter you get a notification saying 'time to actually process this quarter's donations, press here to submit your face for scanning, sorry bout the inconvenience'

Oh dang. I definitely want it to be the former, not the latter. Maybe we can get around the iOS platform constraints somehow, e.g. when you press the button it doesn't donate the money right away, but just adds it to an internal tally, and then once a quarter you get a notification saying 'time to actually process this quarter's donations, press here to submit your face for scanning, sorry bout the inconvenience'

Everyone please downvote this comment of mine if they want to support the app idea but don't want to give me karma as a byproduct of my polling strategy; this cancels out the karma I get from the OP.

Hmmm. I really don't want the karma, I was using it as a signal of how good the idea is. Like, creating this app is only worth someone's time and money if it becomes a popular app that lots of people use. So if it only gets  like 20 karma then it isn't worth it, and arguably even if it gets 50 karma it isn't worth it. But if it blows up and hundreds of people like it, that's a signal that it's going to be used by lots of people.

Maybe I should have just asked "Comment in the comments if you'd use this app; if at least 30 people do so then I'll fund this app." Idk. If y'all think I should do something like that instead I'm happy to do so.

ETA: Edited the OP to remove the vote-brigady aspect.


Yep! I love when old threads get resurrected.

Not sure I'm assuming that. Maybe. The way I'd put it is, selection pressure towards grabby values seems to require lots of diverse agents competing over a lengthy period, with the more successful ones reproducing more / acquiring more influence / etc. Currently we have this with humans competing for influence over AGI development, but it's overall fairly weak pressure. What sorts of things are you imagining happening that would strengthen the pressure? Can you elaborate on the sort of scenario you have in mind?

1
Jim Buhler
1y
Right so assuming no early value lock-in and the values of the AGI being (at least somewhat) controlled/influenced by its creators, I imagine these creators to have values that are grabby to varying extents, and these values are competing against one another in the big tournament that is cultural evolution. For simplicity, say there are only two types of creators: the pure grabbers (who value grabbing (quasi-)intrinsically) and the safe grabbers (who are in favor of grabbing only if it is done in a "safe" way, whatever that means).

Since we're assuming there hasn't been any early value lock-in, the AGI isn't committed to some form of compromise between the values of the pure and safe grabbers. Therefore, you can imagine that the AGI allows for competition and helps both groups accomplish what they want proportionally to their size, or something like that. From there, I see two plausible scenarios:

A) The pure and safe grabbers are two cleanly separated groups running a space expansion race against one another, and we should -- all else equal -- expect the pure grabbers to win, for the same reasons why we should -- all else equal -- expect the AGI race to be won by the labs optimizing for AI capabilities rather than for AI safety.

B) The safe grabbers "infiltrate" the pure grabbers in an attempt to make their space-expansion efforts "safer", but are progressively selected against since they drag the pure-grabby project down. The few safe grabbers who might manage not to value drift and not to get kicked out of the pure grabbers are those who are complacent and not pushing really hard for more safety.

The reason why the intra-civ grabby values selection is currently fairly weak on Earth, as you point out, is that humans didn't even start colonizing space, which makes something like A or B very unlikely to have happened yet. Arguably, the process that may eventually lead to something like A or B hasn't even begun for real. We're unlikely to notice a selection for gr

Nice post! My current guess is that the inter-civ selection effect is extremely weak and that the intra-civ selection effect is fairly weak. N=1, but in our civilization the people gunning for control of AGI seem more grabby than average but not drastically so, and it seems possible for this trend to reverse e.g. if the US government nationalizes all the AGI projects.

3
Jim Buhler
1y
Thanks for the comment! :) You're assuming that the AGI's values will be pretty much locked-in forever once it is deployed such that the evolution of values will stop, right? Assuming this, I agree. But I can also imagine worlds where the AGI is made very corrigible (such that the overseers stay in control of the AGI's values) and where intra-civ value evolution continues/accelerates. I'd be curious if you see reasons to think these worlds are unlikely.

Nice. Well, I guess we just have different intuitions then -- for me, the chance of extinction or worse in the Octopcracy case seems a lot bigger than "small but non-negligible" (though I also wouldn't put it as high as 99%).

Human groups struggle against each other for influence/power/control constantly; why wouldn't these octopi (or AIs) also seek influence? You don't need to be an expected utility maximizer to instrumentally converge; humans instrumentally converge all the time.

Oh also you might be interested in Joe Carlsmith's report on power-seeking AI, it has a relatively thorough discussion of the overall argument for risk.

Nice analysis! 

I think a main point of disagreement is that I don't think systems need to be "dangerous maximizers" in the sense you described in order to predictably disempower humanity and then kill everyone. Humans aren't dangerous maximizers, yet we've killed off many species of animals, the Neanderthals, and various other human groups (genocide, wars, oppression of populations by governments, etc.). Katja's scenario sounds plausible to me except for the part where somehow it all turns out OK in the end for humans. :)

Another, related point of disagreement:... (read more)

3
Violet Hour
1y
thnx! : ) Your analogy successfully motivates the "man, I'd really like more people to be thinking about the potentially looming Octopcracy" sentiment, and my intuitions here feel pretty similar to the AI case. I would expect the relevant systems (AIs, von-Neumann-Squidwards, etc) to inherit human-like properties wrt human cognition (including normative cognition, like plan search), and a small-but-non-negligible chance that we end up with extinction (or worse).

On maximizers: to me, the most plausible reason for believing that continued human survival would be unstable in Grace's story either consists in the emergence of dangerous maximizers, or the emergence of related behaviors like rapacious influence-seeking (e.g., Part II of What Failure Looks Like). I agree that maximizers aren't necessary for human extinction, but it does seem like the most plausible route to 'human extinction' rather than 'something else weird and potentially not great'.

Alright, let's make it happen! I'll DM you + Timothy + anyone else who replies to this comment in the next few days, and we can arrange something.

3
demirev
1y
did you end up doing this? If it's still upcoming, I'd also be interested
3
Dan Valentine
1y
Also interested!
3
Isobel P
1y
+1 I'm interested :)
3
Quadratic Reciprocity
1y
+1, also interested
3
Angelina Li
1y
+1, I'd be interested in this if it happens :)
3
RachelM
1y
I'd join, time zones permitting.

Great list, thanks!
My current tentative expectation is that we'll see a couple things in 1, but nothing in 2+, until it's already too late (i.e. until humanity is already basically in a game of chess with a superior opponent, i.e. until there's no longer a realistic hope of humanity coordinating to stop the slide into oblivion, by contrast with today where we are on a path to oblivion but there's a realistic possibility of changing course.)

I definitely agree that near term, non-agentic AI will cause a lot of chaos. I just don't expect it to be so much chaos that the world as a whole feels significantly more chaotic than usual. But I also agree that might happen too.

I also agree that this sort of thing will have a warning-shot effect that makes a Covid-in-feb-2020-type response plausible.

It seems we maybe don't actually disagree that much?

Re: uncharitability: I think I was about as uncharitable as you were. That said, I do apologize -- I should hold myself to a higher standard.

I agree they might be impossible. (If it only finds some niche application in medicine, that means it's impossible, btw. Anything remotely similar to what Drexler described would be much more revolutionary than that.)

If they are possible though, and it takes (say) 50 years for ordinary human scientists to figure it out starting now... then it's quite plausible to me that it could take 2 OOMs less time than that, or po... (read more)

Oh, I thought you had much more intense things in mind than that. Malicious actor using LLMs in some hacking scheme to get security breaches seems probable to me.

But that wouldn't cause instability to go above baseline. Things like this happen every year. Russia invaded Ukraine last year, for example--for the world to generally become less stable there needs to be either events that are a much bigger deal than that invasion, or events like that invasion happening every few months.

2
Max_He-Ho
1y
I guess that really depends on how deep this particular problem runs. If it makes most big companies very vulnerable, since most employees use LLMs which are susceptible to prompt injections, I'd expect this to cause more chaos in the US than Russia's invasion of Ukraine.

I think we're talking slightly past each other though; I wanted to make the point that the baseline (non-existential) chaos from agentic AI should be high, since near-term, non-agentic AI may already cause a lot of chaos. I was not comparing it to other causes of chaos, though I'm very uncertain about how these will compare.

I'm surprised, btw, that you don't expect a (sufficient) fire alarm solely on the basis of short timelines. To me, the relevant issue seems more 'how many more misaligned AIs with what level of capabilities will be deployed before takeoff'. Since a lot more models with higher capabilities got deployed recently, it doesn't change the picture for me. If anything, I expect non-existential disasters before takeoff more than I did a few months ago, since AI companies seem to just release every model & new feature they've got. I'd also expect a slow takeoff of misaligned AI to raise the chances of a loud warning shot & the general public having a Covid-in-Feb-2020 wake-up moment on the issue.

If you can convince me of the "many many years" claim, that would be an update. Other than that you are just saying things I already know and believe.

I never claimed that nanotech would be the best plan, nor that it would be Yudkowky's bathtub-nanotech scenario instead of a scenario involving huge amounts of experimentation. I was just reacting to your terrible leaps of logic, e.g. "nanobots are a terrible method for world destruction given that they have not been invented yet" and "making nanobots requires experimentation and resources therefore AIs won't... (read more)

2
titotal
1y
These are both statements I still believe are true. None of them are "terrible leaps of logic", as I have patiently explained the logic behind them with arguments. I do not appreciate the lack of charity you have displayed here.

Well, I think there's a pretty decent chance that they are impossible. See this post for several reasons why. If they are possible, I would suspect it would take decades at the least to make something that is useful for anyone, and also that the results would still fail to live up to the nigh-magical expectations set by science fiction scenarios. The most likely scenario involves making a toy nanobot system in a lab somewhere that is stupidly expensive to make and doesn't work that well, which eventually finds some niche applications in medicine or something.

It seems a lot of people are interested in this one! For my part, the answer is "Infohazards kinda, but mostly it's just that I haven't gotten around to it yet." I was going to do it two years ago but never finished the story.

If there's enough interest, perhaps we should just have a group video call sometime and talk it over? That would be easier for me than writing up a post, and plus, I have no idea what kinds of things you find plausible and implausible, so it'll be valuable data for me to hear these things from you.

3
aaron_mai
1y
I'd be very interested in this!
3
Timothy Chan
1y
I'd be interested in this :)

A superintelligent AI will be able to do significant amounts of experimentation and acquire significant amounts of resources. 

3
titotal
1y
I'm not talking about tinkering in someone's backyard; making nanomachines feasible would require ridiculous amounts of funding and resources over many, many years. It's an extremely risky plan that carries a significant risk of exposure. Why would an AI choose this plan, instead of something with a much lower footprint like bio-weapons?