Preface

I'm going to start this post with a personal story, in part because people tend to enjoy writing that does that. If you don't want the gossip, just skip section one, the takeaway of which is: "EA has a strong cultural bias in favor of believing arbitrary problems are solvable".

The gossip - and this takeaway - are not the only insight I'm trying to communicate. I don't mean for this post to be a "community" post overall, but rather one that is action relevant to doing good on the object level.

N=1

I had a two week work trial with a prominent EA org. There were some red flags. Nobody would tell me the projected salary, despite the job opportunity taking place across the country and in one of the most expensive cities on Earth. But whatever. I quit my job and flew over.

It didn't work out. My best guess is that this was for cultural reasons. My would-be manager didn't think I'd been making fast enough progress understanding a technical framework, but the jobs I've had since have involved that framework, and I've received overwhelmingly positive feedback, working on products dramatically more complicated than the job opportunity called for. C'est la vie.

Much later, I was told some of the things in my file for that organization. I was told by the organization's leader in a totally open way - nothing sneaky or "here's the dirt", just some feedback to help me improve. I appreciate this, and welcomed it. But here's the part relevant to the post:

One of the negative things in my file was that someone had said I was "a bit of a downer". Much like with my technical competency, maybe so. But it's worth mentioning that in my day to day life, my coworkers generally think I'm weirdly positive, and often comment that my outlook is shockingly sanguine.

I believe that both are true. I'm unusually optimistic. But professional EA culture is much, much more so.

That's not a bad thing (he said, optimistically). But it's also not all good.

(Why) is there an optimism bias?

If you want to complete an ambitious project, it's extremely useful to presume that (almost) any challenge can be met. This is a big part of being "agentic", a much-celebrated and indeed valuable virtue within the EA community. (And also within elite circles more generally.) The high-end professional world has lots of upside opportunities and relatively little downside risk (you will probably always find a pretty great job as a fallback), so it's rational to make lots of bets on long odds and try to find holy grails.

Therefore, people who are flagged as "ambitious", "impressive", "agentic", will both be selected for and encouraged to further cultivate a mindset where you never say a problem is insurmountable, merely challenging or, if you truly must, "not a top priority right now".

But yeah. No odds are too long to be worth a shot!

How is this action relevant?

To avoid burying the lede, it's a major part of why I donate my 10% pledge to the Against Malaria Foundation rather than to x-risk reduction efforts. I'll trace out the argument, then pile on the caveats.

On the 80,000 Hours Podcast, Will MacAskill put the odds of a misaligned AI takeover around 3%. Many community figures put the odds much higher, but I feel pretty comfortable anchoring on a combination of Will and Katja Grace, who put the odds at 7% that AI destroys the world. Low to mid single digits. Okay.

So here's a valid argument, given its premises:

Premise One: There is at least a 6% chance that AI destroys the world, or removes all humans from it.

Premise Two: There exist interventions that can reliably reduce the risk we face by at least 10% (of the risk, not of the total - so 6% would turn into 5.4%, not -4%/0%).

Premise Three: We can identify these interventions with at least 10% probability.

Premise Four: We can pursue these interventions, and have at least 10% odds of succeeding, provided we've found the right ones.

Premise Five: If the world ends, about 8 billion people die.

Conclusion One: Pursuing the basic plan entailed in premises 1-4 saves, in expectation, at least 480,000 lives (8,000,000,000 * 0.06 * 0.1 * 0.1 * 0.1).
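As a rough check on Conclusion One, here is a minimal sketch of the multiplication. The constant and function names are illustrative only; the values are the lower bounds stated in Premises One through Five, and the steps are treated as independent, as the argument does.

```python
# Lower-bound figures taken from Premises One through Five.
WORLD_POPULATION = 8_000_000_000   # Premise Five
P_AI_CATASTROPHE = 0.06            # Premise One
RELATIVE_RISK_REDUCTION = 0.10     # Premise Two
P_IDENTIFY = 0.10                  # Premise Three
P_EXECUTE = 0.10                   # Premise Four


def expected_lives_saved(population, p_catastrophe, risk_reduction,
                         p_identify, p_execute):
    """Population times the product of every probability in the chain."""
    return population * p_catastrophe * risk_reduction * p_identify * p_execute


lives = expected_lives_saved(
    WORLD_POPULATION,
    P_AI_CATASTROPHE,
    RELATIVE_RISK_REDUCTION,
    P_IDENTIFY,
    P_EXECUTE,
)
print(f"Expected lives saved: {lives:,.0f}")
```

Because every term is a straight multiplier, lowering any single premise by a factor of ten lowers the result by the same factor.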

Let's take that as an anchor point and add two further premises.

Premise Six: The (next) best opportunity to save human lives is the Against Malaria Foundation, and saving lives through AMF costs approximately $4,000 per life.

Premise Seven: We want to save as many lives as possible in expectation.

Conclusion Two: We should pursue AI x-risk mitigation if strategies in line with the above premises cost $1.92B or less.
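The break-even figure in Conclusion Two is just the expected lives saved multiplied by AMF's cost per life. A small sketch, with illustrative names and the numbers from Conclusion One and Premise Six:

```python
EXPECTED_LIVES_SAVED = 480_000   # from Conclusion One
AMF_COST_PER_LIFE_USD = 4_000    # from Premise Six


def break_even_budget(expected_lives, cost_per_life):
    """Largest spend at which the plan still matches the benchmark charity."""
    return expected_lives * cost_per_life


budget = break_even_budget(EXPECTED_LIVES_SAVED, AMF_COST_PER_LIFE_USD)
print(f"Break-even budget: ${budget / 1e9:.2f}B")  # prints "Break-even budget: $1.92B"
```

Any plan that costs more than this, under these premises, saves fewer lives per dollar than the benchmark.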

This is simplified in many ways. I could see arguments to challenge every premise in every direction. And, of course, longtermist arguments massively change the calculus by counting all future generations on the ledger (and guessing that there's some chance there will be a truly staggering number of such generations).

We'll come back to longtermism, which deserves its own section. But first let's focus on premises three and four.

Are we overoptimistic about key premises?

Yes, I think so.

Premise Three: We can identify these interventions with at least 10% probability.

I won't dispute that there exist interventions that would reduce risk. There exist actions that achieve nearly anything. But can we find them? 

Recall that there are decent reasons to think goal alignment is impossible - in other words, it's not a priori obvious that there's any way to declare a goal and have some other agent pursue that goal exactly as you mean it.

Recall that engineering ideas very, very rarely work on the first try, and that if we only have one chance at anything, failure is very likely.

Recall that interventions that slow technological development in the face of strong economic pressures seem extremely hard to find, to the degree that I'm not sure one has ever worked (and if one had, I'd guess it had been in a dictatorship rather than a liberal democracy).

10% seems significantly too high to me, even to reduce the relative risk by 1%. As it looks to me, there are broad swathes of possible worlds where the problem either basically never comes up or solves itself, and broad swathes of possible worlds where our trajectory is already set in stone and we're going down.

The presumption that we live in the sweet spot and need merely roll up our sleeves strikes me as an example of powerful optimism bias.

Premise Four: We can pursue these interventions, and have at least 10% odds of succeeding, provided we've found the right ones.

Recall that getting "humanity" to agree on a good spec for ethical behavior is extremely difficult: some places are against gene drives to reduce mosquito populations, for example, despite this saving many lives in expectation.

Recall that there is a gigantic economic incentive to keep pushing AI capabilities up, and referenda to reduce animal suffering in exchange for more expensive meat tend to fail.

Recall that we have to implement any solution in a way that appeals to the cultural sensibilities of all major and technically savvy governments on the planet, plus major tech companies, plus, under certain circumstances, idiosyncratic ultra-talented individual hackers.

The we-only-get-one-shot idea applies on this stage too.

So again, 10% strikes me as really optimistic. It's worth mentioning here, too, that I don't tend to see these premises valued at 10% in most analyses, or even part of the calculation. Most often it's taken as a given that levers exist to reduce risk by at least 1% (or much more), and that we're competent to push those levers.

$1.92B to save 480,000 lives in expectation is a great deal. But it seems really, really rosy to think we can accomplish simultaneous extremely difficult and currently poorly specified political, philosophical, and technical tasks on a global scale at that price point. Heck, we've been working on figuring out how good deworming is for a decade. This stuff is hard.

So, presuming shorttermism, AMF looks like the better option. Of course, we shouldn't presume shorttermism. Let's get into that.

The promised longtermism section

Thank you for your characteristic patience, longtermists.

I am going to risk looking stupid here, because longtermist arguments are often strong and generally really complicated. Please presume I know what I'm talking about and will get to relevant objections, then judge me extra if I in fact leave a big one out.

Here's a longtermist argument as I understand it:

Premise One: If you do the math on certain x-risk reduction initiatives with the proposed benefit as being "save 8 billion lives", it may or may not be cost effective.

Premise Two: However, extinction is a really big deal beyond that, because we lose not only the 8 billion lives, but also however many lives there would have been counterfactually.

These premises are both totally solid. I am not nearly so arrogant as to pick a fight with the legendary Derek Parfit.

Premise Three: In expectation, there are extremely many people in the long-term future, such that we should model x-risk reduction initiatives as saving orders of magnitude more lives than a mere 8 billion.

I am not as sold on premise three.

Some basic (and troubling) anchor points

Recall that until nuclear weapons were developed, humanity had no realistic shot at accidentally messing up the biome we live in so badly that our survival as a species was at risk.

Recall that bioweapons - including engineered pathogens that could kill almost everyone - are possible and protections against biological attack are spectacularly underfunded (for the record, I am very in favor of improving this state of affairs and think people are doing incredible work here).

Scenarios if we pass the (potential) AI sieve

Imagine that we create well-aligned AGI. There are several things this could mean.

Scenario One: The AGI does global surveillance good enough to prevent rogue actors from destroying the planet, no matter how powerful technology gets. Everyone's cool with this level of surveillance, and it doesn't cause permanent rule by some bad ideology or other. Also, no cosmic threats happen to manifest.

Scenario Two: Same as scenario one, but there's a black hole/alien invasion/unstoppable asteroid/solar flare/some other astronomical event we don't know about yet that unavoidably destroys the planet in the next millennium or two. (I don't think this scenario is likely, but it is possible.)

Scenario Three: The AGI causes massive technological progress, but there are actually lots of AGIs basically at once. None of them is trying to kill us all, but none of them is given permission to surveil everyone all the time. We have many more "make sure the world doesn't get destroyed" sieves to get through as a species, and over time it gets easier and easier for rogue actors or industrial accidents to kill us all. Eventually, and probably in under 1,000 years, one does.

Scenario Four: The AGI causes massive technological progress, but less than we currently imagine a "singularity" to look like, due to limits to returns from intelligence far above what our species can reach, but far short of godlike powers. Same problems as scenario three, but slower.

And many others! The most likely case to me is that if AI x-risk is solved or turns out not to be a serious issue, we just keep facing x-risks in proportion to how strong our technology gets, forever. Eventually we draw a black ball and all die. If technology keeps improving really fast, that's likely in the next 500 years or so. The second most likely case to me is that stuff just gets so weird as to be unrecognizable, but it's not straightforwardly catastrophic.
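One way to make the "black ball" picture concrete is a constant-hazard sketch. This is an assumption added for illustration, not something the scenarios above specify: suppose every year carries the same independent probability of extinction. The code then asks what annual hazard would make extinction within 500 years a coin flip.

```python
def survival_probability(annual_hazard, years):
    """Chance of surviving `years` years under a constant, independent annual hazard."""
    return (1.0 - annual_hazard) ** years


def hazard_for_median(years):
    """Annual hazard at which surviving `years` years has probability 0.5."""
    return 1.0 - 0.5 ** (1.0 / years)


# Hypothetical: extinction within 500 years is exactly as likely as not.
h = hazard_for_median(500)
print(f"Implied annual hazard: {h:.4%}")
print(f"P(survive 500 years):   {survival_probability(h, 500):.3f}")
print(f"P(survive 5,000 years): {survival_probability(h, 5000):.5f}")
```

Under this toy model the implied hazard is roughly 0.14% per year, and surviving ten such half-lives (5,000 years) has a probability of about one in a thousand. If technology raised the hazard over time, as the paragraph above suggests, the numbers would be worse.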

Key Longtermist Objection: Use expected value

Okay, a longtermist might say. Maybe the odds are really slim that we thread this needle, and then also the subsequent needles required to create an interstellar civilization spanning billions of years. But the value of that scenario is so high that if you shut up and multiply, it's worth putting a lot of resources in that direction.

To which I say... man, I dunno, this starts to feel like Pascal's Mugging. There are a lot of unknown unknowns, interstellar travel seems really hard and like no particular generation has a strong incentive to bear the huge sacrifices to make it happen, and it's just very suspicious to suppose we're at the precipice of the one sieve that matters most and all further ones are comfortably manageable by our descendants. We're just getting into territory that feels roughly analogous to assigning some probability mass to each fundamentalist major religion being true. I can't easily put into words why I don't want to do that, but I really, really don't, and I feel like digging deeper into it will make me less sane rather than more sane.

End of longtermist section

So okay, am I anti-longtermism? No, I don't think so. I think Will MacAskill's argument that we're dramatically underfunding the long term future is just straightforwardly right, even if that future only extends 500 years in expectation. On net we should move the needle up.

But that doesn't mean the most cost effective interventions are longtermist-motivated x-risk reduction, just that longtermist organizations and projects are way better than the baseline in terms of projects that exist and are funded/well regarded.

So, scrap AI x-risk projects?

No! This is awkward, because I actually spend a bunch of my own time professionally working on those. I find AI really interesting, and I think that much like with longtermism, more resources should go toward AI safety rather than fewer, and we as a planet should take AI risk more seriously.

But much like Tyler Cowen's view on EA as a whole, I think something can be overrated locally and (badly) underrated globally. Buy your local AI safety researcher a coffee. Heck, buy your local AI safety research editor (that's me) a coffee, while you're at it.

But does it beat AMF? Is it clearly the highest value altruistic project available? Is it a slam dunk that reducing x-risk from AI should be our North Star?

I don't think so. I think the assumptions behind that conclusion are biased severely - maybe irrecoverably - by a culture of intense optimism. When I put on my AI x-risk hat, I myself prefer to have that optimism. Conditional on having selected that as a project to work on, it's probably good.

But when I'm deciding where to donate, and I do the math, I genuinely think the most effectively altruistic option is just saving little children from malaria.


Thanks for posting! I'm sympathetic to the broad intuition that any one person being at the sweet spot where they make a decisive impact seems unlikely, but I'm not sold on most of the specific arguments given here.

Recall that there are decent reasons to think goal alignment is impossible - in other words, it's not a priori obvious that there's any way to declare a goal and have some other agent pursue that goal exactly as you mean it.

I don't see why this is the relevant standard. "Just" avoiding egregiously unintended behavior seems sufficient for avoiding the worst accidents (and is clearly possible, since humans do it often).

Also, I don't think I've heard these decent reasons--what are they?

Recall that engineering ideas very, very rarely work on the first try, and that if we only have one chance at anything, failure is very likely.

It's also unclear that we only have one chance at this. Optimistically (but not that optimistically?), incremental progress and failsafes can allow for effectively multiple chances. (The main argument against seems to involve assumptions of very discontinuous or abrupt AI progress, but I haven't seen very strong arguments for expecting that.)

Recall that getting "humanity" to agree on a good spec for ethical behavior is extremely difficult: some places are against gene drives to reduce mosquito populations, for example, despite this saving many lives in expectation.

Agree, but also unclear why this is the relevant standard. A smaller set of actors agreeing on a more limited goal might be enough to help.

Recall that there is a gigantic economic incentive to keep pushing AI capabilities up, and referenda to reduce animal suffering in exchange for more expensive meat tend to fail.

Yup, though we should make sure not to double-count this, since this point was also included earlier (which isn't to say you're necessarily double-counting).

Recall that we have to implement any solution in a way that appeals to the cultural sensibilities of all major and technically savvy governments on the planet, plus major tech companies, plus, under certain circumstances, idiosyncratic ultra-talented individual hackers.

This also seems like an unnecessarily high standard, since regulations have been passed and enforced before without unanimous support from affected companies.

Also, getting acceptance from all major governments does seem very hard but not quite as hard as the above quote makes it sound. After all, many major governments (developed Western ones) have relatively similar cultural sensibilities, and ambitious efforts to prevent unilateral actions have previously gotten very broad acceptance (e.g. many actors could have made and launched nukes, done large-scale human germline editing, or maybe done large-scale climate engineering, but to my knowledge none of those have happened).

The we-only-get-one-shot idea applies on this stage too.

Yup, though this is also potential double-counting.

Yeah, I share the view that the "Recalls" are the weakest part -- I mostly was trying to get my fuzzy, accumulated-over-many-years vague sense of "whoa no we're being way too confident about this" into a more postable form. Seeing your criticisms I think the main issue is a little bit of a Motte-and-Bailey sort of thing where I'm kind of responding to a Yudkowskian model, but smuggling in a more moderate perspective's odds (i.e. Yudkowsky thinks we need to get it right on the first try, but Grace and MacAskill may be agnostic there).

I may think more about this! I do think there's something there sort of between the parts you're quoting, by which I mean yes, we could get agreement to a narrower standard than solving ethics, but even just making ethical progress at all, or coming up with standards that go anywhere good/predictable politically seems hard. Like, the political dimension and the technical/problem specification dimensions both seem super hard in a way where we'd have to trust ourselves to be extremely competent across both dimensions, and our actual testable experiments against either outcome are mostly a wash (i.e. we can't get a US congressperson elected yet, or get affordable lab-grown meat on grocery store shelves, so doing harder versions of both at once seems... I dunno, might hedge my portfolio far beyond that!).

"EA has a strong cultural bias in favor of believing arbitrary problems are solvable".

I think you're pointing to a real phenomenon here (though I might not call it an "optimism bias"—EAs also tend to be unusually pessimistic about some things).

I have pretty strong disagreements with a lot of the more concrete points in the post though; I've tried to focus on the most important ones below.

Conclusion One: Pursuing the basic plan entailed in premises 1-4 saves, in expectation, at least 4.8 million lives (800,000 * 0.06 * 0.1 * 0.1). 

(I think you may have missed the factor of 0.01, the relative risk reduction you postulated? I get 8 billion * 0.06 * 0.01 * 0.1 * 0.1 = 48,000. So AI safety would look worse by a factor of 100 compared to your numbers.)

But anyway, I strongly disagree with those numbers, and I'm pretty confused as to what kind of model generates them. Specifically, you seem to be extremely confident that we can't solve AI X-risk (< 1/10,000 chance if we multiply together the 1% relative reduction with your two 10% chances). On the other hand, you think we'll most likely be fine by default (94%). So you seem to be saying that there probably isn't any problem in the first place, but if there is, then we should be extremely certain that it's basically intractable. This seems weird to me. Why are you so sure that there isn't a problem which would lead to catastrophe by default, but which could be solved by e.g. 1,000 AI safety researchers working for 10 years? To get to your level of certainty (<1/10,000 is a lot!), you'd need a very detailed model of AI X-risk IMO, more detailed than I think anyone has written about. A lot of the uncertainty people tend to have about AI X-risk comes specifically from the fact that we're unsure what the main sources of risk are etc., so it's unclear how you'd exclude the possibility that there are significant sources of risk that are reasonably easy to address.

As to why I'm not convinced by the argument that leads you to the <1/10,000 chance: the methodology of "split my claim into a conjunction of subclaims, then assign reasonable-sounding probabilities to each, then multiply" often just doesn't work well (there are exceptions, but this certainly isn't one of them IMO). You can get basically arbitrary results by splitting up the claim in different ways, since what probabilities are "reasonable-sounding" isn't very consistent in humans.
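The worry about multiplying "reasonable-sounding" probabilities can be illustrated with a small sketch. Everything here is hypothetical: the same claim is imagined as split into 2, 3, or 5 subclaims, each given a 30% chance and treated as independent. None of these figures come from the post.

```python
import math


def conjunction_probability(step_probabilities):
    """Probability that every step holds, assuming the steps are independent."""
    return math.prod(step_probabilities)


PER_STEP = 0.3  # hypothetical "reasonable-sounding" value for each subclaim

for n_steps in (2, 3, 5):
    p = conjunction_probability([PER_STEP] * n_steps)
    print(f"{n_steps} subclaims at {PER_STEP:.0%} each -> {p:.5f}")
```

Going from two subclaims to five moves the headline estimate from 9% to roughly 0.24%, even though no individual number changed. That is the sense in which the choice of decomposition can drive the result.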

Okay, a longtermist might say. Maybe the odds are really slim that we thread this needle, and then also the subsequent needles required to create an interstellar civilization spanning billions of years. But the value of that scenario is so high that if you shut up and multiply, it's worth putting a lot of resources in that direction.

I can't speak for all longtermists of course, but that is decidedly not an argument I want to make (and FWIW, my impression is that this is not the key objection most longtermists would raise). If you convinced me that our chances of preventing an AI existential catastrophe were <1/10,000, and that additionally we'd very likely die in a few centuries anyway (not sure just how likely you think that is?), then I would probably throw the expected value calculations out the window and start from scratch trying to figure out what's important. Basically for exactly the reasons you mention: at some point this starts feeling like a Pascal's mugging, and that seems fishy and confusing.

But I think the actual chances we prevent an AI existential catastrophe are way higher than 1/10,000 (more like 1/10 in terms of the order of magnitude). And I think conditioned on that, our chances of surviving for billions of years are pretty decent (very spontaneous take: >=50%). Those feel like cruxes to me way more than whether we should blindly do expected value calculations with tiny probabilities, because my probabilities aren't tiny.

 

Scenario Two: Same as scenario one, but there's a black hole/alien invasion/unstoppable asteroid/solar flare/some other astronomical event we don't know about yet that unavoidably destroys the planet in the next millennium or two. (I don't think this scenario is likely, but it is possible.)

I agree it's possible in a very weak sense, but I think we can say something stronger about just how unlikely this is (over the next millennium or two): Nothing like this has happened over the past 65 million years (where I'm counting the asteroid back then as "unstoppable" even though I think we could stop that soon after AGI). So unless you think that alien invasions are reasonably likely to happen soon (but weren't likely before we sent out radio waves, for example), this scenario seems to be firmly in the "not really worth thinking about" category.

This may seem really nitpicky, but I think it's important when we talk about how likely it is that we'll continue living for billions of years. You give several scenarios for how things could go badly, but it would be just as easy to list scenarios for how things could go well. Listing very unlikely scenarios, especially just on one side, actively makes our impression of the overall probabilities worse.
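The "nothing like this in 65 million years" point can be turned into a rough number. The sketch below uses Laplace's rule of succession, which is my choice of estimator rather than anything the comment specifies, and it assumes each 2,000-year window is an exchangeable trial with a uniform prior on the event rate.

```python
def rule_of_succession(events_observed, windows_observed):
    """Laplace's estimate of the chance that the next window contains an event."""
    return (events_observed + 1) / (windows_observed + 2)


YEARS_WITHOUT_EVENT = 65_000_000   # span cited in the comment
WINDOW_YEARS = 2_000               # "the next millennium or two"

windows = YEARS_WITHOUT_EVENT // WINDOW_YEARS
p_next_window = rule_of_succession(0, windows)
print(f"Windows observed: {windows:,}")
print(f"Estimated chance in the next window: {p_next_window:.2e}")
```

Under those assumptions the estimate is on the order of 3 in 100,000 for the next two millennia, which fits the comment's "not really worth thinking about" reading. A different prior or window size would shift the number but not the order of magnitude by much.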

Ah yeah, you're right - I think basically I put in the percent rather than the probability. So it would indeed be very expensive to be competitive with AMF. Though so is everything else, so that's not hugely surprising.

As for the numbers, yeah, it does just strike me as really, really unlikely that we can solve AI x-risk right now. 1/10,000 does feel about right to me. I certainly wouldn't expect everyone else to agree though! I think some people would put the odds much higher, and others (like Tyler Cowen maybe?) would put them a bit lower. Probably the 1% step is the step I'm least confident in - wouldn't surprise me if the (hard to find, hard to execute) solutions that are findable would reduce risk significantly more.

EDIT: tried to fix the math and switched the "relative risk reduction term" to 10%. I feel like among findable, executable interventions there's probably a lot of variance, and it's plausible some of the best ones do reduce risk by 10% or so. And 1/1000 feels about as plausible as 1/10000 to me. So, somewhere in there.

it does just strike me as really, really unlikely that we can solve AI x-risk right now

I think Erik wasn't commenting so much on this number, but rather its combination with the assumption that there is a 94% chance things are fine by default.

I.e. you are assuming that there is a 94% chance it's trivially easy, and 6% chance it's insanely hard.

Very few problems have such a bimodal nature, and I also would be interested to understand what's generating it for you.

I think you should be substantially more optimistic about the effects of aligned AGI. Once we have aligned AGI, this basically means high end cognitive labor becomes very cheap, as once an AI system is trained, it is relatively cheap to deploy it en masse. Some of these AI scientists would presumably work on making AIs at least cheaper if not more capable, which limits to a functionally infinite supply of high end scientists. Given a functionally infinite supply of high end scientists, we will quickly discover basically everything that can be discovered through parallelizable scientific labor, which is, if not everything, I think at least quite a few things (e.g. I have pretty high confidence that we could solve aging, develop extremely good vaccines to protect against biorisk, etc.). Moreover, this is only a lower bound; I think AGI will probably relatively quickly become significantly smarter than the smartest human, so we will probably do even better than the aforementioned scenario.

To me, "aligned" does a lot of work here. Like yes, if it's perfectly aligned and totally general, the benefits are mind boggling. But maybe we just get a bunch of AI that are mostly generating pretty good/safe outputs, but a few outputs here and there lower the threshold required for random small groups to wreak mass destruction, and then at least one of those groups blows up the biome.

But yeah, given the premise that we get AGI that mostly does what we tell it to, and we don't immediately tell it to do anything stupid, I do think it's very hard to predict what will happen but it's gonna be wild (and indeed possibly really good).

Strong upvote - I found your perspective really fresh:
"The most likely case to me is that if AI x-risk is solved or turns out not to be a serious issue, and we just keep facing x-risks in proportion to how strong our technology gets, forever. Eventually we draw a black ball and all die."

Lots of us are considering a career pivot into AI safety. Is it...actually tractable at all? How hopeful should we be about it? No idea.

Thank you! My perspective is: "figuring out if it's tractable is at least tractable enough that it's worth a lot more time/attention going there than is currently", but not necessarily "working on it is far and away the best use of time/money/attention for altruistic purposes", and almost certainly not "working on it is the best use of time/money/attention under a wide variety of ethical frameworks and it should dominate a healthy moral parliament".

It's hard to say. Considering that an estimated fewer than 300 people work on AI Safety and the field is still just starting to gain traction, I wouldn't expect us to know a ton about it yet.

Even in established fields people are expected to usually take years or even decades before they can produce truly great research. 

Psychology was still using lobotomies until 55 years ago. We've learned a lot since then and there's still much more to learn. It took a similar amount of time for AI capabilities to get to where they are now. AI Safety is much newer and could look completely different in 10 years. Or, if nobody works on it or the people working on it are unable to make progress, it could look relatively similar. 

Data point: I wasn't there for this but Justis is a friend of mine, and on an interpersonal level he's one of the chillest, highest-contentment-set-point people I know. He doesn't brim over with cheerleading or American dynamism, but my default assumption is if someone calls him a downer they can't mean interpersonal affect.

Re optimism bias

Towards the top of the post I think you made a claim that EAs are often very optimistic (particularly agentic ones doing ambitious things or in ‘elitist’ positions).

I just wanted to flag that this isn’t my impression of many EAs who I think are doing ambitious projects. I think a disproportionate number of agentic people I know in EA are pretty pessimistic in general.

I think the optimism thing and something like desire to try hard/motivation/enthusiasm for projects are getting a bit confused here, but low confidence.

Justis, do you, as someone involved in AI safety research, think that AI safety researchers would mostly dislike the total termination of AI research (assuming they all found great alternative jobs, etc)?

Hmm. I think reactions to that would vary really widely between researchers, and be super sensitive to when it happened, why, whether it was permanent, and other considerations.

I wonder if they are truly against AGI or ASI, or if they just want the safe versions? I am not sure if there are really two positions here (one for AI, one against), or really just one with caveats.

One of the negative things in my file was that someone had said I was "a bit of a downer". Much like with my technical competency, maybe so. But it's worth mentioning that in my day to day life, my coworkers generally think I'm weirdly positive, and often comment that my outlook is shockingly sanguine.

I wonder how much of this is an EA thing vs idiosyncrasies of the org you trialed at, or for that matter, West Coast American culture overall. Fwiw, my own experience is that I worked at three non-EA tech companies (Epic, Impossible Foods, and Google), and broadly people seemed more positive/confident in the organization than people I know in EA orgs. Certainly EA funders seem more pessimistic (though I've never talked to top VCs).

I had a two week work trial with a prominent EA org. There were some red flags. Nobody would tell me the projected salary, despite the job opportunity taking place across the country and in one of the most expensive cities on Earth. But whatever. I quit my job and flew over.

This seems quite bad and I'm sorry you had to go through that. The org's actions feel rather unprofessional to me tbh.

Yeah I think a lot of it is West Coast American culture! I imagine EA would have super different vibes if it were mostly centered in New York.

For a contrasting opinion by Kat Woods and Amber Dawn, here's this post: Two reasons we might be closer to solving alignment than it seems.

Link below:

https://forum.effectivealtruism.org/posts/RkpdA8763yGtEovj9/two-reasons-we-might-be-closer-to-solving-alignment-than-it

(Comment to flag that I looked back over this and just totally pretended 4,000 was equal to 1,000. Whoops. Don't think it affects the argument very strongly, but I have multiplied the relevant dollar figures by 4.)

I think you raise an interesting point around:

Slow takeoff + alignment solved + competitive dynamics + xrisk due to future technologies invented with help of AGI (in competitive world)

I tend to assume world government with high probability because of the number of ways in which future tech (possibly invented with help of AGI) can be used to snowball towards one, with or without the consent of all political actors at the time.

There is a taboo around discussing ways to attempt totalitarian power grabs in general so I won't be doing that here.