Existential risk
Discussions of risks which threaten the destruction of the long-term potential of life

Quick takes

36 · 2d · 2
Scrappy note on the AI safety landscape. Very incomplete, but probably a good way to get oriented to (a) some of the orgs in the space, and (b) how the space is carved up more generally.

(A) Technical

(i) A lot of the safety work happens in the scaling-based AGI companies (OpenAI, GDM, Anthropic, and possibly Meta, xAI, Mistral, and some Chinese players). Some of it is directly useful, some of it is indirectly useful (e.g. negative results, datasets, open-source models, position pieces, etc.), and some is not useful and/or a distraction. It's worth developing good assessment mechanisms/instincts about these.

(ii) A lot of safety work happens in collaboration with the AGI companies, but by individuals/organisations with some amount of independence and/or different incentives. Some examples: METR, Redwood, UK AISI, Epoch, Apollo. It's worth understanding what they're doing with AGI cos and what their theories of change are.

(iii) Orgs that don't seem to work directly with AGI cos but are deeply technically engaging with frontier models and their relationship to catastrophic risk: places like Palisade, FAR AI, CAIS. These orgs maintain even more independence, and are able to do/say things which the previous tier maybe cannot. A recent cool thing was CAIS finding that models do well on only 2.5% of remote-work tasks, in contrast to OpenAI's findings in GDPval, which suggest models have an almost 50% win rate against industry professionals on a suite of "economically valuable, real-world tasks".

(iv) Orgs that are pursuing other* technical AI safety bets, different from the AGI cos: FAR AI, ARC, Timaeus, Simplex AI, AE Studio, LawZero, many independents, some academics at e.g. CHAI/Berkeley, MIT, Stanford, MILA, Vector Institute, Oxford, Cambridge, UCL and elsewhere. It's worth understanding why they want to make these bets, including whether it's their comparative advantage, an alignment with their incentives/grants, or whether they
12 · 2d
I try to maintain this public doc of AI safety cheap tests and resources, although it's due a deep overhaul. Suggestions and feedback welcome!
23 · 24d
PSA: If you're doing evals things, every now and then you should look back at OpenPhil's page on capabilities evals to check against their desiderata and questions in sections 2.1-2.2, 3.1-3.4, 4.1-4.3 as a way to critically appraise the work you're doing.
17 · 22d · 1
If the people arguing that there is an AI bubble turn out to be correct and the bubble pops, to what extent would that change people's minds about near-term AGI?

I strongly suspect there is an AI bubble, because the financial expectations around AI seem to be based on AI significantly enhancing productivity, and the evidence seems to show it doesn't do that yet. This could change, and I think that's what a lot of people in the business world are thinking and hoping. But my view is (a) LLMs have fundamental weaknesses that make this unlikely and (b) scaling is running out of steam.

Scaling running out of steam actually means three things:

1) Each new 10x increase in compute is less practically or qualitatively valuable than previous 10x increases in compute.
2) Each new 10x increase in compute is getting harder to pull off because the amount of money involved is getting unwieldy.
3) There is an absolute ceiling to the amount of data LLMs can train on, which they are probably approaching.

So, AI investment depends on financial expectations that in turn depend on LLMs enhancing productivity, which isn't happening and probably won't happen, due to fundamental problems with LLMs and due to scaling becoming less valuable and less feasible. This implies an AI bubble, which implies the bubble will eventually pop.

So, if the bubble pops, will that lead people who currently have a much higher estimation than I do of LLMs' current capabilities and near-term prospects to lower that estimation? If AI investment turns out to be a bubble, and it pops, would you change your mind about near-term AGI? Would you think it's much less likely, or that AGI is probably much farther away?
17 · 1mo
Ajeya Cotra writes: Like Ajeya, I haven't thought about this a ton. But I do feel quite confident in recommending that generalist EAs (especially the "get shit done" kind) at least strongly consider working on biosecurity if they're looking for their next thing.
48 · 4mo · 2
AI governance could be much more relevant in the EU if the EU were willing to regulate ASML. Tell ASML it can only service compliant semiconductor foundries, where a "compliant semiconductor foundry" is defined as a foundry which only allows its chips to be used by compliant AI companies. I think this is a really promising path for slower, more responsible AI development globally.

The EU is known for its cautious approach to regulation. Many EAs believe that a cautious, risk-averse approach to AI development is appropriate. Yet EU regulations are often viewed as less important, since major AI firms are mostly outside the EU. However, ASML is located in the EU, and serves as a chokepoint for the entire AI industry. Regulating ASML addresses the standard complaint that "AI firms will simply relocate to the most permissive jurisdiction". Advocating this path could be a high-leverage way to make global AI development more responsible without the need for an international treaty.
10 · 20d · 8
I have a question I would like some thoughts on: As a utilitarian, I personally believe alignment to be the most important cause area, though, weirdly enough, even though I believe x-risk reduction to be positive in expectation, I believe the future is most likely to be net negative. I personally believe, without a high level of certainty, that the current utilities on earth are net negative due to wild animal suffering. If we therefore give the current world a utility value of -1, I would describe my beliefs about future scenarios like this:

1. ~5% likelihood: ~10^1000 (very good future, e.g. hedonium shockwave)
2. ~90% likelihood: -1 (the future is likely good for humans, but this is still negligible compared to wild animal suffering, which will remain)
3. ~5% likelihood: ~-10^100 (s-risk-like scenarios)

My reasoning for thinking "scenario 2" is more likely than "scenario 1" is based on what seem to be the values of the general public currently. Most people seem to care about nature conservation, but no one seems interested in mass-producing (artificial) happiness. While the earth is only expected to remain habitable for about two billion years (and humans, assuming we avoid any x-risks, are likely to remain for much longer), I think, when it comes to it, we'll find a way to keep the earth habitable, and thus wild animal suffering.

Based on these three scenarios, you don't have to be a great mathematician to realize that the future is most likely to be net negative, yet positive in expectation. While I still, based on this, find alignment to be the most important cause area, I find it quite demotivating that I spend so much of my time (donations) to preserve a future that I find unlikely to be positive. But the thing is, most longtermists (and EAs in general) don't seem to have similar beliefs to me. Even someone like Brian Tomasik has said that if you're a classical utilitarian, the future is likely to be good.

So now I'm asking, what am I getting wro
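The "net negative yet positive in expectation" claim above can be checked with a few lines of arithmetic. This is only a sketch using the poster's illustrative probabilities and utilities (which are assumptions, not established figures); `Fraction` is used because multiplying a float by 10^1000 would overflow:

```python
from fractions import Fraction

# Illustrative scenario table from the quick take above (utilities in units
# where the current world is -1). These numbers are the poster's assumptions.
scenarios = [
    (Fraction(5, 100), 10**1000),    # ~5%: very good future (e.g. hedonium shockwave)
    (Fraction(90, 100), -1),         # ~90%: future dominated by wild animal suffering
    (Fraction(5, 100), -(10**100)),  # ~5%: s-risk-like scenarios
]

# Expected value: the tiny chance of an astronomically large upside dominates.
expected_value = sum(p * u for p, u in scenarios)

# Probability mass on net-negative outcomes: scenarios 2 and 3 together.
prob_net_negative = sum(p for p, u in scenarios if u < 0)

print(expected_value > 0)        # True: positive in expectation
print(float(prob_net_negative))  # 0.95: yet most likely net negative
```

This makes the tension explicit: with these numbers, the future is negative with 95% probability, but the expected value is still enormously positive because 0.05 × 10^1000 swamps both negative terms.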
80 · 9mo · 1
I recently created a simple workflow to allow people to write to the Attorneys General of California and Delaware to share thoughts and encourage scrutiny of the upcoming OpenAI nonprofit conversion attempt: Write a letter to the CA and DE Attorneys General.

I think this might be a high-leverage opportunity for outreach. Both AG offices have already begun investigations, and AGs are elected officials who are primarily tasked with protecting the public interest, so they should care what the public thinks and prioritizes. Unlike e.g. congresspeople, I don't think AGs often receive grassroots outreach (I found ~0 examples of this in the past), and an influx of polite and thoughtful letters may have some influence, especially from CA and DE residents, although I think anyone impacted by their decision should feel comfortable contacting them.

Personally I don't expect the conversion to be blocked, but I do think the value and nature of the eventual deal might be significantly influenced by the degree of scrutiny on the transaction. Please consider writing a short letter; even a few sentences is fine. Our partner handles the actual delivery, so all you need to do is submit the form. If you want to write one on your own and can't find contact info, feel free to DM me.