AI safety
Studying and reducing the existential risks posed by advanced artificial intelligence

Quick takes

17 · 2d
Hey everyone, in collaboration with Apart Research, I'm helping organize a hackathon this weekend to build tools for accelerating alignment research. This hackathon is closely related to my effort to build an "Alignment Research Assistant." Here's the announcement post:

2 days until we revolutionize AI alignment research at the Research Augmentation Hackathon! As AI safety researchers, we pour countless hours into crucial work. It's time we built tools to accelerate our efforts! Join us in creating AI assistants that could supercharge the very research we're passionate about.

Date: July 26th to 28th, online and in person
Prizes: $2,000 in prizes

Why join?
* Build tools that matter for the future of AI
* Learn from top minds in AI alignment
* Boost your skills and portfolio

We've got a Hackbook with an exciting project waiting for you! No advanced AI knowledge required - just bring your creativity! Register now: sign up on the website here, and don't miss this chance to shape the future of AI research!
50 · 17d · 2
The recently released 2024 Republican platform says the party will repeal the recent White House Executive Order on AI, which many in this community considered a necessary first step toward making future AI progress safer and more secure. This seems bad. See the bottom of p. 9 at https://s3.documentcloud.org/documents/24795758/read-the-2024-republican-party-platform.pdf.
14 · 4d · 3
Meta has just released Llama 3.1 405B. It's open source, and on many benchmarks it beats GPT-4o and Claude 3.5 Sonnet. See Zuck's letter, "Open Source AI Is the Path Forward".
68 · 2mo · 8
We should expect the incentives and culture of AI-focused companies to make them uniquely terrible for producing safe AGI.

From a "safety from catastrophic risk" perspective, I suspect an "AI-focused company" (e.g. Anthropic, OpenAI, Mistral) is abstractly pretty close to the worst possible organizational structure for getting us towards AGI. I have two distinct but related reasons:
1. Incentives
2. Culture

From an incentives perspective, consider realistic alternative organizational structures to the "AI-focused company" that nonetheless have enough firepower to host successful multibillion-dollar scientific/engineering projects:
1. Part of an intergovernmental effort (e.g. CERN's Large Hadron Collider, the ISS)
2. Part of a governmental effort of a single country (e.g. the Apollo Program, the Manhattan Project, China's Tiangong)
3. Part of a larger company (e.g. Google DeepMind, Meta AI)

In each of those cases, I claim there are stronger (though still not ideal) organizational incentives to slow down, pause/stop, or roll back deployment if there is sufficient evidence or reason to believe that further development could result in major catastrophe. In contrast, an AI-focused company has every incentive to go ahead on AI when the case for pausing is uncertain, and minimal incentive to stop or even take things slowly.

From a culture perspective, I claim that without knowing any details of the specific companies, you should expect AI-focused companies to be more likely than the plausible contenders to have the following cultural elements:
1. Ideological AGI vision: AI-focused companies may have a large contingent of "true believers" who are ideologically motivated to build AGI at all costs, and
2. No pre-existing safety culture: AI-focused companies may have minimal or no strong "safety" culture in which people deeply understand, have experience in, and are motivated by a desire to avoid catastrophic outcomes.

The first one should be self-explanatory. […]
30 · 2mo · 5
Pretty wild discussion in this podcast about how aggressively the USSR cut corners on safety in its space program in order to stay ahead of the US. In the author's telling of the history, this was in large part because Khrushchev wanted to rack up as many "firsts" (e.g., first satellite, first woman in space) as possible. This seems to have been most proximately for prestige and propaganda rather than for any immediate strategic or technological benefit (though of course the space program did eventually produce such benefits). It's evidence for the following claim about AI: people may not need high material benefits as a reason to cut corners on safety; they may do so just for the prestige and glory of being first. https://www.lawfaremedia.org/article/chatter--the-harrowing-history-of-the-soviet-space-program-with-john-strausbaugh
23 · 1mo · 5
Hey everyone, my name is Jacques. I'm an independent technical alignment researcher (primarily focused on evaluations, interpretability, and scalable oversight), and I'm now focusing more of my attention on building an Alignment Research Assistant. I'm looking for people who would like to contribute to the project. This project will be private unless I say otherwise.

Side note: I helped build the Alignment Research Dataset ~2 years ago. It has been used at OpenAI (by someone on the alignment team), (as far as I know) at Anthropic for evals, and is now the backend for Stampy.ai.

If you are interested in potentially helping out (or know someone who might be!), send me a DM with a bit of your background and why you'd like to help. To keep things focused, I may or may not accept.

I have written up the vision and core features for the project here. I expect the features to evolve, but the vision will likely remain the same. I'm currently working on some of the features and have delegated some tasks to others (tasks are in a private GitHub project board). I'm also collaborating with different groups. For now, the focus is to build core features that can be used individually but will eventually work together in the core product. In 2-3 months, I want to get it to a place where I know whether it is useful for other researchers and whether we should apply for additional funding to turn it into a serious project.
27 · 2mo · 8
Besides Ilya Sutskever, is there any person not related to the EA community who quit or was fired from OpenAI for safety concerns?
9 · 16d · 1
Microsoft has backed out of its OpenAI board observer seat, and Apple will refuse a rumoured seat, both in response to antitrust threats from US regulators, per Reuters. I don't know how to parse this: I think it's likely that the US regulators don't care much about safety in this decision, nor do I think it meaningfully changes Microsoft's power over the firm. Apple's rumoured seat was interesting, but unlikely to have had any bearing either.