Quick takes

Scrappy note on the AI safety landscape. Very incomplete, but probably a good way to get oriented to (a) some of the orgs in the space, and (b) how the space is carved up more generally.

(A) Technical

(i) A lot of the safety work happens in the scaling-based AGI companies (OpenAI, GDM, Anthropic, and possibly Meta, xAI, Mistral, and some Chinese players). Some of it is directly useful, some of it is indirectly useful (e.g. negative results, datasets, open-source models, position pieces), and some is not useful and/or a distraction. It's worth developing good assessment mechanisms/instincts about these.

(ii) A lot of safety work happens in collaboration with the AGI companies, but by individuals/organisations with some amount of independence and/or different incentives. Some examples: METR, Redwood, UK AISI, Epoch, Apollo. It's worth understanding what they're doing with AGI cos and what their theories of change are.

(iii) Orgs that don't seem to work directly with AGI cos but are deeply technically engaged with frontier models and their relationship to catastrophic risk: places like Palisade, FAR AI, CAIS. These orgs maintain even more independence, and are able to do/say things which the previous tier maybe can't. A recent cool example: CAIS found that models don't do well on remote-work tasks (completing only 2.5% of them), in contrast to OpenAI's GDPval findings, which suggest models have an almost 50% win rate against industry professionals on a suite of "economically valuable, real-world tasks".

(iv) Orgs that are pursuing other* technical AI safety bets, different from the AGI cos: FAR AI, ARC, Timaeus, Simplex AI, AE Studio, LawZero, many independents, and some academics at e.g. CHAI/Berkeley, MIT, Stanford, MILA, Vector Institute, Oxford, Cambridge, UCL and elsewhere. It's worth understanding why they want to make these bets, including whether it's their comparative advantage, an alignment with their incentives/grants, or whether they
I try to maintain this public doc of AI safety cheap tests and resources, although it's due a deep overhaul. Suggestions and feedback welcome!
I just learned via Martin Sustrik about the late Sofia Corradi, the driving force behind the Erasmus student exchange programme. Sustrik points out that none of the glowing obituaries for her mention the sheer scale of Erasmus; the Fulbright in the US is the second-largest comparable program, but it's a very distant second. Sustrik argues that the Erasmus programme is gargantuan-scale social engineering done right. The backstory to how Sofia came to focus on Erasmus is touching. I've previously wondered what a shortlist of people who've beneficially impacted the world at the scale of ~100 milliBorlaugs might look like, and suggested Melinda & Bill Gates and Tom Frieden. (A "Borlaug" is a unit of impact I made up; it means a billion lives saved.) If you buy Corradi's argument that the Erasmus programme is at heart really a peace programme, and that it deserves some credit for the long period of relative peace we've experienced globally post-WWII, then Sofia Corradi seems eminently deserving of inclusion.
People are underrating making the future go well conditional on no AI takeover. This deserves a full post, but for now a quick take: in my opinion, P(no AI takeover) = 75%, P(future goes extremely well | no AI takeover) = 20%, and most of the value of the future is in worlds where it goes extremely well (comparatively little value comes from locking in a world that's good-but-not-great). Under this view, an intervention is valuable insofar as it increases P(no AI takeover) * P(things go really well | no AI takeover). Suppose that a given intervention can change P(no AI takeover) and/or P(things go really well | no AI takeover). Then, to first order, the overall effect of the intervention is proportional to ΔP(no AI takeover) * P(things go really well | no AI takeover) + P(no AI takeover) * ΔP(things go really well | no AI takeover). Plugging in my numbers, this gives us 0.2 * ΔP(no AI takeover) + 0.75 * ΔP(things go really well | no AI takeover). And yet, I think that very little AI safety work focuses on increasing P(things go really well | no AI takeover). Probably Forethought is doing the best work in this space. (And I don't think it's a tractability issue: I think affecting P(things go really well | no AI takeover) is pretty tractable!) (Of course, if you think P(AI takeover) is 90%, that would probably be a crux.)
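A minimal sketch of the arithmetic above, using the post's stated probabilities; the one-percentage-point deltas and the function name are hypothetical illustrations, not anything from the original:

```python
# Sketch of the first-order intervention-value formula from the quick take.
# Assumptions: the two probabilities are the author's stated estimates;
# the 1-percentage-point deltas below are illustrative only.

P_NO_TAKEOVER = 0.75              # P(no AI takeover)
P_WELL_GIVEN_NO_TAKEOVER = 0.20   # P(future goes extremely well | no AI takeover)

def intervention_effect(delta_no_takeover: float, delta_well_given_no_takeover: float) -> float:
    """First-order change in P(no takeover AND future goes extremely well)."""
    return (delta_no_takeover * P_WELL_GIVEN_NO_TAKEOVER
            + P_NO_TAKEOVER * delta_well_given_no_takeover)

# Shift each probability by one percentage point, one at a time:
print(intervention_effect(0.01, 0.00))  # ≈ 0.002   (reducing takeover risk)
print(intervention_effect(0.00, 0.01))  # ≈ 0.0075  (making the future go well, given no takeover)
```

Under these numbers, a one-percentage-point improvement to P(things go really well | no AI takeover) is worth roughly 3.75x as much as the same-sized improvement to P(no AI takeover), which is the point about where the marginal value lies.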
Nancy Pelosi is retiring; consider donating to Scott Wiener. [Link to donate; or consider a bank transfer option to avoid fees, see below.]

Nancy Pelosi has just announced that she is retiring. Previously I wrote up a case for donating to Scott Wiener, who is running for her seat, in which I estimated a 60% chance that she would retire. While I recommended donating on the day that he announced his campaign launch, I noted that donations would look much better ex post in worlds where Pelosi retires, and that my recommendation to donate on launch day was sensitive to my assessment of the probability that she would retire. I know some people who read my post and decided (quite reasonably) to wait to see whether Pelosi retired. If that was you, consider donating today!

How to donate

You can donate through ActBlue here (please use this link rather than going directly to his website, because the URL lets his team know that these are donations from people who care about AI safety). Note that ActBlue charges a 4% fee. I think that's not a huge deal; however, if you want to make a large contribution and are already comfortable making bank transfers, shoot me a DM and I'll give you instructions for making the bank transfer!