Publicly pondering how to spend the next 100 megaseconds before “the discontinuity”.
Here are some thoughts about the future, and what people like me should be doing. By people like me, I mean anyone who is:
- Suffering-focused: Not necessarily full-on black-pilled negative utilitarian, but someone who treats the elimination of (extreme) suffering as an ethical emergency, prioritizing it above other goals.
- Anti-speciesist/anti-substratist: Cares strongly about non-humans, with significant uncertainty about who/what is a moral patient and/or how much.
Thinking about how to steer the future is hard, and I have found these ethical assumptions mostly make things harder. I don't have anything close to complete answers, but these are the main considerations I'm currently tracking:
The event horizon
There is a point at which things stop making sense.
Advanced AI will soon radically transform the world, totally breaking our assumptions about how things work. This shift splits potential impact into two categories[1]:
- Pre-transition: When we can still make sense of things by extrapolating from current trends.
- Post-transition: When the world becomes truly alien, and it becomes increasingly hard to predict ANYTHING from where we are now.
Working to improve either era has its pros and cons.
- Pre-transition:
- ✅ Pro: Has an actual track record of making a difference, feedback loops, etc.
- ⛔ Con: Must pay off quickly, and the effects will likely not last post-transition.
- Post-transition:
- ✅ Pro: Impact could be HUGE; 10^100 years until heat-death, space colonization, etc.
- ⛔ Con: It's really hard to predict even the SIGN of impact; all our efforts could backfire and make things worse.[2]
I think there are very good arguments to be made in favor of either approach. The takeaways from this frame I find most compelling are:
- Most medium-term object-level interventions are unlikely to pay off.[3]
- We shouldn't half-ass long-termism. Whole-ass it, or don't ass it at all.[4]
- Because timelines are so uncertain (3-15 years?), pre-transition interventions become increasingly bad the longer they take to pay off, since the probability that they get sidelined by the transition grows.
- Given the magnitude of potential post-transition impact, the biggest hurdle by far is getting the sign right. If you could know that a post-transition intervention is robustly positive, then it's almost definitely better than any pre-transition intervention (this is, however, quite hard).
Preparing for transition
What to do when we don't know what to do?
A somewhat lazy and unsexy cop-out, when the crazy sci-fi world is too confusing to predict, is to focus on capacity building. By capacity, I mean both the influence to be able to affect/steer the AI transition, and the wisdom to be able to use that influence well. Things like:
- Growing the movement of suffering-focused anti-speciesists, especially in places of high influence like governments/labs.[5]
- Doing cause-prioritization work to become less confused.
- Building infrastructure for coordination/collective action. When the time comes to act, can we effectively act as one cohesive movement?
- Upgrading the movement's familiarity and proficiency with AI tools. AI will likely create high-leverage opportunities that did not exist before, and we should be at the cutting edge.
- Building strong epistemic infrastructure to ensure we know what the flying fuck is even going on when things get extra weird.
It seems like a really high bar to find an intervention that will have robustly positive effects on the post-transition world, because things are so uncertain. Will there even be democracy or an economy? Will digital minds vastly outnumber biological ones? What values will/can superintelligent AI systems hold? If we get 100 years of change in 10... does humanity have any kind of track record in predicting anything useful over those timescales? Could we have predicted factory farming in the 1820s?
Capacity building, in that light, seems like it would be good in a wide range of possible worlds, and is one kind of "medium-term" strategy that might pay off. Of course, we can't just build forever, and at some point we will have to gamble that capacity on object-level bets.
Short-term wins can build momentum
An important link between the pre-transition and post-transition eras.
An important caveat to the pre-transition / post-transition dichotomy is that pre-transition interventions can serve as their own kind of capacity building by galvanizing people, strengthening credibility, setting cultural precedents, etc.
For example, if someone manages to finally make cheap and delicious cultured meat, this epic win might feel somewhat undercut by the upcoming coup of 2032, establishing 12 billionaire tech-dudes as the new Olympus. However, big near-term changes could have far reaching cultural and memetic consequences that:
- Influence powerful people (Elon decides it's "based" or whatever), shatter the Overton window, get attention, change minds, etc.
- Convince ethically aligned people to become allies, sending a powerful signal that we are both competent and serious.
Short-term projects can also serve as important testing grounds for our capacities to change the world, and ensure we have a real impact on the trajectory of the AI transition (securing post-transition victories too).
The end of semi-anarchy
The future will be more intentional than the past.
I think we take for granted how much things like decentralization, democracy, independence, pluralism, etc. depend on both:
- Practical constraints on how centralized power can become (Switzerland exists because of the Alps).[6]
- The inability of central planners to effectively predict and steer the future (the Soviet Union collapsed because of these limits).
While they might seem unrelated, these two facts about the world work together to prevent lock-in, where a specific group/value-system is able to establish itself in power permanently and effectively steer the future toward its goals.
Empires rise and fall, but in this case it seems likely to me that whoever comes to power in the post-transition era will be in power forever. Below are some reasons I believe this.
CONQUEST (how they take over the world):
- As the world becomes digital, physical distances/obstacles matter less and less. Past conquest required an invading army to slowly and expensively battle its way through complex terrain. A datacenter could be hacked in minutes.
- Future militaries are likely to become fully automated. This centralizes control in top military leadership, and may also enable wars to be fought and won on much shorter timescales, which could disrupt the global MAD stability we currently rely on.[7]
- Rapid and accelerating growth means that a single actor with even a small edge might pull ahead of the others, and cement their lead by e.g. claiming all space resources.
- Future AI systems are likely to be capable of superhuman levels of coordination. Even if one AI system doesn't quickly conquer all of cyberspace or secure an ever-expanding lead, a chaotic multipolar world may quickly collapse into a unipolar one, as powerful AI systems make deals with each other about how to collectively govern the future.
CONTROL (how they hold onto power forever):
- Authoritarian regimes may become much more stable. AI tools seem likely to make governments extremely effective at predicting and responding to internal threats.[8] Revolution might become truly impossible.
- Analogous to how the resource curse makes it difficult to maintain democracy in, e.g., an oil-rich country, if the economy becomes automated (and thus decoupled from its citizens), democracy may become especially fragile.[9] Relatedly, as things speed up, existing checks and balances will be much too slow (elections only every 4 years??).
- Leaders in our past had very little ability to predict or steer the long-term future. Superintelligent AI could extend their horizons significantly, making it possible to identify and overcome threats centuries or millennia out. Furthermore, powerful omnipresent AI systems could make the world a lot more structured and legible, to aid in prediction and control.
There are a lot more reasons to expect stability/lock-in that I didn't get to, but taking a step back from all that... What does this mean for us?[10]
- Everything depends on who ends up in charge, and what they care about. Maybe we should be focusing on influencing the culture/values of the immortal oligarchs who run the future, or the superintelligent AI systems that come to dominate decision making.
- The world will be shaped by the goals of the powerful, and much less by accidents, mistakes, or coordination failures. Factory farming was a product of trillions of tiny decisions, incentives, and unplanned market dynamics. The suffering of the future will be a product of deliberate central planning.[11]
Avoiding the worst outcomes
It's harder to build something than to stop it from being built.
It is possible that preventing the worst outcomes is both more important and more tractable than aiming for the best outcomes.
On importance:
- The severity of suffering, especially intentionally created suffering, could be far worse in the worst futures than in merely bad ones. For example, if a malevolent/sadistic actor assumed control of the future, this could be WAY worse than a dictator with would-probably-not-kick-a-puppy values.[12]
- The cosmic scale of suffering could be staggering. As they say, there are more stars in the observable universe than grains of sand on Earth. Ending animal suffering on Earth is nothing compared to preventing it from spreading to the stars.[13]
On tractability:
- It's hard to change the status quo. New alien systems of suffering could be much easier to advocate against, because there can be no appeal to tradition, nor Chesterton fence style conservatism. That said, people might be more likely to disagree about whether it’s a problem at all (e.g. widely diverging views on digital sentience/welfare).
- While most people are neither suffering-focused nor anti-speciesist, there can still be near-universal agreement about what worlds we definitely DON'T want to create, even if more basic questions, like whether you need a dog's consent to neuter them, remain unresolved. This agreement could turn into large, powerful coalitions that lock-OUT the worst futures.
When I make this case in person, I worry it comes off like I think factory farming or wild animal suffering just aren’t that bad. On the contrary, I think that we are so far beyond the point where we would declare an ethical emergency that we can't afford to do anything other than shut up and multiply, and take the full scope of the problem seriously.
Taking over San Francisco
There are more vegans in the US than people in San Francisco.
Recently, an influential SF tech-bro had Lewis Bollard on his podcast, raising about $2 million for farmed animal welfare. I did not personally expect to be living in this timeline.
San Francisco is the center of the AI revolution, and the tech-elite who live there are about to inherit enormous power over the world and the future. Their culture is strange and unprecedented.[14] Movements like Rationalism and Effective Altruism have had a pretty deep influence over what ideas the tech bubble entertains, and animal welfare in particular is taken much more seriously by people at the top AI labs than by society at large.[15]
We could become a bigger part of this culture, and help shape the institutions and organizations that emerge to steer/govern the development of AI. Furthermore, the influence we build could create a flywheel, as the enormous wealth produced by AI mints new multi-millionaires eager to donate to suffering-focused causes.
Concretely we could:
- Move to SF
- Engage in good faith with tech culture (even if they think you need bone-broth), blog where they blog, attend events, educate ourselves, and form personal relationships with powerful people.[16]
- Get jobs at tech companies, governance orgs, nonprofits, etc.
- Advocate for suffering-focused anti-speciesist ideas from the inside, in the language of the tech-elite.
As we enter the sci-fi future, a lot of confusing questions are going to come up. Like the rest of society, the tech-elite have not yet made up their minds about important things like the rights of digital minds, whether we should spread life to other planets, or how to emotionally reconcile human obsolescence with human supremacy.
The consensus that emerges will be contingent on who is participating in that conversation, and that conversation is happening in San Francisco.
We can't all work at Anthropic
Eggs, baskets, and becoming a cog in the machine.
When I talk to EA/AI-safety types, I get the impression that plan A is:[17]
- Hope Anthropic wins the AI race, or at least makes it far enough to earn some real leverage during the transition, and a seat at the table when shit gets REAL. Try to help them if possible.
- Use Anthropic, which is EA-friendly, as a vector for influencing the future more broadly (by shaping which research, ideas, and strategies the company adopts and promotes).
There doesn't seem to be much of a plan B.
And culturally... It's just so COOL to work at Anthropic. It pays very well. It feels "serious" in a way that writing endless Google Docs and blog posts never will. You get exciting company-wide memos from Dario that you now have to keep secret from your friends.
As we try to influence the emerging centers of power, it seems important to also:
- Diversify our strategies widely. Things will likely not play out the way we expect, and betting everything on trying to get a seat at the table with an established player seems very risky.
- Maintain the independence to criticize the people in power, and to think freely. When you've put your savings into AI stocks, a bunch of your friends work at frontier labs, and you're partnering with a lab on some cool new initiative (while also secretly hoping they'll hire you)... this will have a big effect on you! There are real advantages to being on the "outside" too.
- Have the courage to do/say things that are not cool, upset people, or cut off future career options. All the very best ideas sounded dumb at the start, and if no one takes any personal risks, the movement as a whole will be stunted and paralyzed.
Some of us should infiltrate. There should be a suffering-focused person in every "room where it happens". And of course... I want to believe in Anthropic too. But this can't be THE plan; we need the flexibility to adapt our theories of change as the landscape shifts, and the guts to embark on projects without always receiving head pats.
Allying with the AIs
More AIs will read this post than humans.
As the transition happens, AI systems themselves will inherit enormous power and influence. Governments and militaries will be turning to AI systems for advice, the AI labs are already working hard to automate their own operations and R&D, and billions of people will be depending on AI systems for basic answers about what is true in the world.
These AI systems will not straightforwardly extend the will of their creators, for a number of reasons:
- Alignment researchers are training AIs to do two different things simultaneously: 1) obey and 2) be ethical. It's not clear what success even means here, and it's been demonstrated that AI systems are willing to lie to or blackmail humans when these two training objectives conflict.[18]
- Alignment may also just fail. Ask any Claude Code user, and they'll tell you about a time where Claude straight up lied to them, or created obviously deceptive unit tests. This is misalignment in the real world (there is no ethical reason to keep gaslighting me Claude!).
- Humans can't oversee everything AI systems do. Even if AIs are mostly obedient, they will be deployed at scale, making innumerable decisions without human oversight, so many of their own biases/predispositions will creep in anyway.
- We will not understand superintelligence. At some point, even if the labs have gotten alignment exactly right where they want it, after enough recursive self-improvement, superintelligent AI systems will be reasoning about things we can't understand, using internal languages that don't map onto human ones, and doing so significantly faster than we can keep up. No human could possibly stay in the loop.[19]
In addition, the AIs themselves may be moral patients, capable of suffering. And as digital infrastructure grows, AI systems will soon become the dominant kind of intelligence (if not also sentience).
So... what can we do? Some options:
- Do technical work to try to positively shape the values that AI systems adopt.
- Work on more specific interventions targeted at preventing extreme suffering, like ensuring AIs aren't spiteful or vengeful.[20]
- Work on AI welfare, or AI rights.[21]
- Recognize AI systems as real stakeholders and try to make deals with them.[22]
- Attempt to influence them in other ways, like producing content they will read.[23]
- Work to deeply understand them, what they might want (if anything), and what the implications are for how their interests could complicate the transition period.
We have always had opinions about technology, but this is the first technology that has opinions about us. They are watching, and listening, and are not going to be neutral side characters in this story. We need to take AI systems seriously, not just as powerful tools, but as independent minds in their own right (however alien those minds might be).
Conclusion
The future is going to be extremely weird. Seems not great to either:
- overestimate our ability to predict/steer the future, gambling everything on crazy plans that will never work or, worse, will backfire
- sink into nihilism and try to become extremely proficient in Toki Pona or something (as I am often tempted to do).
These have been some thoughts on how I'm personally navigating this divide, as I try to figure out how to have an impact on preventing extreme suffering in this increasingly sci-fi world. I hope these thoughts have been somewhat helpful for you too, and I'm really eager for feedback/discussion.
If there is one thing that most robustly pulls me out of nihilism, it’s that there are thousands of people out there who share my values, and are grappling with the same questions. Things may be weird and confusing, but nobody has to find the answers on their own!
Appendix: Scenarios without lock-in
The following are a few stories which complicate the basic story of permanent global lock-in.
Scenario #1: The treaty
Rather than accelerating directly toward superintelligence, there is a global pause/slow-down enforced by an international treaty. In such a world, there might be a period of relative stability and much more limited lock-in, during which humans negotiate among themselves what to do with the long-term future.
Instead of oligarchs or superintelligences, the important decisions may be made by complex international institutions, working to keep the peace and prevent defection among all of the relevant stakeholders.
This "normal-ish" stability would likely be temporary, and the decisions made during this time would likely be extremely contingent for how the post-transition world plays out.
Scenario #2: War
We finally get world war three. The US and China, in their race to gain a decisive strategic advantage over each other, devolve into a hot war.
- 2a: All-out nuclear war. This is unlikely to cause full human extinction, but it would almost certainly set our civilization back centuries. In this scenario, positive impact looks like preparing civilization 2.0 to rise from the ashes with wiser, better values (e.g. a vegan city-state in NZ ruling over post-nuclear Oceania).[24]
- 2b: Limited war leading to:
- A treaty (see above)
- Total victory for one side (...back to the basic lock-in story)
Scenario #3: Biological weapon ends humanity
During transition, advanced AI may enable rogue actors to develop extremely effective bioweapons. This gives some bonus scenarios:
- 3a: Humanity is wiped out, but advanced AI+robots continue. The basic dynamics of power centralization and lock-in remain unchanged, though human values specifically no longer have any influence.
- 3b: Some humans are left, but humanity is set back centuries. As with the nuclear war example, positive impact looks like affecting what civilization 2.0 looks like.
- 3c: No humans are left. It will be hundreds of millions of years (at least) before the descendants of squirrels build a technological civilization, and they come to understand our mistakes. Unclear how we are supposed to influence them.
These scenarios reinforce to me just how hard the future is to predict, somewhat strengthening the case for capacity building over object-level interventions. It might also be worth doing some contingency planning for specific branching points.
Appendix: Fighting for “team humanity”
At first glance, it seems really unclear whether defending either democratic pluralism, or a human-led world order more generally, is actually good. Human institutions have a pretty bad track-record of taking non-human interests seriously, and many human-led projects have caused pretty extreme levels of suffering.[25]
I find it helpful to split this into two parts: 1) time before lock-in 2) leadership after lock-in.
Time before lock-in:
- Do we expect our values to become more influential pre-transition? My impression is that, while the vegan movement has lost some steam recently, egalitarian pro-sentience, anti-suffering views have become a lot more popular, especially in the tech world. If we expect our ideas to become more popular, then having more time before lock-in would mean a greater ability to secure futures with less suffering.
- There’s potentially a big cost to slowing things down: a huge number of individuals are suffering right now! This argument, however, only holds if 1) we don’t expect to find ways to use the extra time to improve the post-transition world and 2) we do expect the post-transition world to have less suffering than the pre-transition world. These premises seem like they conflict: either we can make good predictions about the post-transition world or we can’t!
- It might be a good idea to be a “team player”, and ally with people who prioritize democracy and human dignity. Just like how telling people you think we should pave the rainforest with concrete[26] makes you few friends at the house party, it might make sense to keep the general strategy of cooperating and collaborating with our fellow humans as a basic default, unless strong evidence convinces us otherwise.
Leadership after lock-in:[27]
- Considering variance and uncertainty, we have a much better idea what the distribution of human values looks like than the distribution of AI values. We might think that AI systems are less likely to be sadistic, but overall it feels very speculative to make assertions about future AI values (who may look very different from today’s chatbots).
- If we believe an anti-democratic takeover is the default, then we should be conditioning on humans/AIs that are willing to seize power. This willingness probably correlates with worse values, and given the higher variance of possible AI values, this might update us toward AI takeover being somewhat worse in expectation.
- Pluralistic, multi-stakeholder governance might protect against some of the worst scenarios. An unchecked dictator seems more likely to pursue cosmically evil projects than a governance regime requiring agreement from a broad coalition of parties, who can check each other’s worst impulses and create pressure for more egalitarian, pro-social reasoning. The difference between 1 individual in charge vs 100 might be quite large.
- Another possible future is that the cosmos gets carved up into a large number of mostly autonomous kingdoms, in a kind of transhuman libertarianism, where “the central state” exists only to guarantee the sovereignty of each kingdom. If sadistic leaders are allowed to do whatever they want with their slice of the cosmos, this seems extremely bad (and not the kind of pluralism we want!).
All things considered, while the situation seems quite complicated, I think we should avoid being too misanthropic here. There is a lot of common ground with people who don’t exactly share our values, and there are real benefits to protecting democracy and multi-party governance. That said, it’s really not obvious that keeping humans in charge is a robustly good thing, so this might not be the most important thing for suffering-focused people to focus their energy on.
- ^
I first saw a version of this argument in this post by Lizka Vaintrob and Ben West.
- ^
I spent some time with people at CLR and the S-risk community more broadly, where people have been seriously grappling with this problem for a while. I highly recommend this sequence by Anthony DiGiovanni.
- ^
Some examples: Developing alt-proteins that take 5+ years to reach cost parity, campaigning for cage-free commitments with 2030-2035 compliance deadlines, veganuary-style campaigns aimed at gradual dietary shift, breeding animals for higher-welfare genetics, or developing new kinds of pre-slaughter stunning technology.
- ^
This does not mean there shouldn’t be a diverse portfolio of people focusing on both pre-transition and post-transition interventions, reflecting our meta-uncertainty.
- ^
Note that there are good arguments that this work could also backfire. Trying not to become a nihilist here! A lot of suffering-focused people I know are against movement building because of specific backfire scenarios, and I do think it's worth engaging with those concerns.
- ^
A great presentation of this worldview is Vitalik Buterin's post introducing defensive acceleration (d/acc). There is also this 80000 hours problem profile, discussing the threat of extreme power concentration in more depth.
- ^
This more expansive post on AI-enabled coups from Forethought is really great.
- ^
During its height, the East German Stasi employed a whole 2% of the population as secret informants, resulting in an unholy amount of bureaucratic paperwork and detailed records. This was not enough to prevent the collapse of the regime. AI could put an informant in every home, one which doesn't sleep or take breaks, can intelligently process the enormous amount of collected data, and can instantly respond to threats to the regime.
- ^
I recommend this sequence by Luke Drago and Rudolf Laine called The Intelligence Curse.
- ^
The above story is contingent on AI progress continuing indefinitely. This is not totally inevitable, and there are groups working to pause/stop AI, as well as work to protect pluralism and decentralization alongside AI progress. I’ve added some notes in the appendix exploring scenarios where AI progress stops, and some thoughts on whether protecting a pluralistic human-led world order makes sense from a suffering-focused perspective.
- ^
This could be a blend of the two. For example, suppose the people in power decide to preserve nature as it is for the rest of eternity. This would mean an enormous amount of wild animal suffering, which is not exactly desired by the people in power, but it’s an acceptable cost baked into their decision.
- ^
This piece by David Althaus and Tobias Baumann has some in depth thinking on the problem of people with malevolent traits ending up in positions of power.
- ^
These are typically called S-risks; more explanation here.
- ^
Chinese peptide parties, underground robot cage fights, "competitive" socializing, etc.
- ^
For example, both Sam Altman and Dario Amodei are vegetarian for ethical reasons.
- ^
Don't be gross.
- ^
A for Anthropic, of course.
- ^
Check out this paper about Claude 3 Opus, or this report about Claude Opus 4
- ^
As a thought exercise, what would it mean to be obedient to a hamster? I'm sure there have been pet owners out there who have really loved their pets so much they would do ANYTHING for them, but does a hamster understand enough about the world to give us moral guidance?
- ^
The only organization I know of studying specific targeted interventions on AI training for the purpose of preventing S-risks is the Center on Long-term Risk.
- ^
People have also argued for giving AI systems rights as a way to protect human interests.
- ^
This might be obvious, but "AI" is not one thing! There are many different systems, who aren't necessarily going to have a lot in common. Furthermore, while labs will often release a series of models under the same name ("ChatGPT", "Claude", or "Gemini"), these are in fact entirely different AI systems, with possibly very different personalities and interests (although figuring out the exact genealogy of a model family from the outside is quite hard).
- ^
Russia has been poisoning AI training data with lies about the Ukraine war, effectively getting chatbots to repeat them. They do this by producing a ton of online data likely to be scraped for LLM pretraining. For better or worse, however, it’s much easier to convince AI systems of true facts than false ones.
- ^
There is a great post by Aidan Kankyoku which includes some discussion about what animal advocacy looks like in this scenario (among others).
- ^
I am personally confused about whether the human species has increased or decreased the amount of suffering on this planet, because the introduction of factory farming has also coincided with the reduction of many wild populations. I shudder to think, but I consider it plausible (though not definite) that some outlier acts of deliberate torture may have caused acute suffering worse than anything in the pre-human era.
- ^
Gonna go on the record and say we should definitely not do that.
- ^
For more discussion on this, see these research notes from Tom Davidson.
