
Hi, I've been asked to recommend a couple of short introductions/overviews about the key issues in AI safety and AI alignment. This will be for the Philosophy, Politics, & Economics (PPE) major at Oxford University, which trains some of the brightest undergrads in Britain, many of whom go on to influential positions in government and industry.

Ideal readings would be recent (e.g. 2024 onwards), short (e.g. less than 4,000 words), non-technical, vivid & engaging, and reputable (in terms of author(s) and/or outlets). 

Any suggestions would be much appreciated! 


2 Answers

Hi, fellow Oxford neighbour here!

The AI Safety Atlas is an amazing resource, and just the type I think you are looking for (understandable by me, a pianist with zero STEM background beyond high-school). 

For your purposes I'd recommend Chapter 1.4 + maybe one or two extra chapters specifically around capabilities and 'the bitter lesson', then perhaps this video by Rational Animations about goal misgeneralisation.

Since you're in Oxford, I'd also recommend reaching out to the Oxford AI Safety Initiative, a student-led group doing amazing work to educate people about AI safety issues.

I've been doing their Core Fellowship this term and it has been amazing. 
