AI safety
Studying and reducing the existential risks posed by advanced artificial intelligence

Quick takes

One thing the AI Pause Debate Week has made salient to me: there appears to be a mismatch between the kind of slowing that on-the-ground AI policy folks talk about, versus the type that AI policy researchers and technical alignment people talk about.

My impression from talking to policy folks who are in or close to government—admittedly a sample of only five or so—is that the main[1] coordination problem for reducing AI x-risk is about ensuring the so-called alignment tax gets paid (i.e., ensuring that all the big labs put some time/money/effort into safety, and that none “defect” by skimping on safety to jump ahead on capabilities). This seems to rest on the assumption that the alignment tax is a coherent notion and that technical alignment people are somewhat on track to pay this tax.

On the other hand, my impression is that technical alignment people, and AI policy researchers at EA-oriented orgs,[2] are not at all confident that there is a viable level of time/money/effort that will produce safe AGI on the default trajectory. The type of policy action that’s needed, so they seem to say, is much more drastic: for example, something in the vein of global coordination to slow, limit, or outright stop development and deployment of AI capabilities (see, e.g., Larsen’s,[3] Bensinger’s, and Stein-Perlman’s debate week posts), whilst alignment researchers scramble to figure out how on earth to align frontier systems.

I’m concerned by this mismatch. It would appear that the game plans of two adjacent clusters of people working to reduce AI x-risk are at odds. (Clearly, this is an oversimplification and there is a range of takes within both clusters, but my current epistemic status is that this oversimplification gestures at a true and important pattern.)

Am I simply mistaken about there being a mismatch here? If not, is anyone working to remedy the situation? Or does anyone have thoughts on how this arose, how it could be rectified, or how to prevent similar mismatches?
I've just written a blog post to summarise EA-relevant UK political news from the last ~six weeks. The post is here: AI summit, semiconductor trade policy, and a green light for alternative proteins (substack.com) I'm planning to circulate this around some EAs, but also some people working in the Civil Service, political consulting and journalism. Many might already be familiar with the stories. But I think this might be useful if I can (a) provide insightful UK political context for EAs, or (b) provide an EA perspective to curious adjacents. I'll probably continue this if I think either (a) or (b) is paying off. (I work at Rethink Priorities, but this is entirely in my personal capacity).
Immigration is such a tight constraint for me. My next career steps after I'm done with my TCS Masters are primarily bottlenecked by "what allows me to remain in the UK" and then "keeps me on track to contribute to technical AI safety research". What I would like to do for the next 1–2 years ("independent research" / "further upskilling to get into a top ML PhD program") is not all that viable a path given my visa constraints.

Above all, I want to avoid wasting N more years by taking a detour through software engineering again so I can get visa sponsorship. [I'm not conscientious enough to pursue AI safety research/ML upskilling while managing a full-time job.]

I might just try and see if I can pursue a TCS PhD at my current university and do TCS research that I think would be valuable for theoretical AI safety research. The main downside of that is I'd have to spend N more years in <city>, and I was really hoping to come down to London.

Advice very, very welcome. [Not sure who to tag.]
Protesting at leading AI labs may be significantly more effective than most protests, even ignoring the object-level arguments for the importance of AI safety as a cause area. The impact per protester is likely unusually big, since early protests involve only a handful of people and impact probably scales sublinearly with size. And very early protests are unprecedented and hence more likely (for their size) to attract attention, shape future protests, and have other effects that boost their impact.
On Twitter and elsewhere, I've seen a bunch of people argue that AI company execs and academics are only talking about AI existential risk because they want to manufacture concern to increase investment, and/or as a distraction from near-term risks, and/or for regulatory capture. This is obviously false.

However, there is a nearby argument that is likely true: incentives shape how people talk about AI risk, as well as which specific regulations or interventions they ask for. This is likely to happen both explicitly and unconsciously. It's important (as always) to have extremely solid epistemics, and to understand that even apparent allies may have (large) degrees of self-interest and motivated reasoning.

Safety-washing is a significant concern; similar dynamics have played out in other fields, have likely already occurred in AI, and will likely recur in the months and years to come, especially if/as policymakers and/or the general public become increasingly uneasy about AI.
Some lawyers claim that there may be significant (though far from ideal) whistleblower protections for individuals at AI companies that don't fully comply with the Voluntary Commitments: https://katzbanks.com/wp-content/uploads/KBK-Law360-Despite-Regulation-Lag-AI-Whistleblowers-Have-Protections.pdf
Is someone planning on doing an overview post of all the AI Pause discussion? I’m guessing some people would appreciate it if someone took the time to make an unbiased synthesis of the posts and discussions.
Quick updates:
* Our next critique (on Conjecture) will be published in 2 weeks.
* The critique after that will be on Anthropic. If you'd like to be a reviewer, or have critiques you'd like to share, please message us or email anonymouseaomega@gmail.com.