
Background

In the fall of 2023, I'm teaching a course called "Philosophy and The Challenge of the Future"[1] which is focused on AI risk and safety. I designed the syllabus keeping in mind that my students:

  • will have no prior exposure to what AI is or how it works
  • will not necessarily have a strong philosophy background (the course is offered by the Philosophy department, but is open to everyone)
  • will not necessarily be familiar with Effective Altruism at all

Goals

My approach combines three perspectives: 1) philosophy, 2) AI safety, and 3) Science, Technology, and Society (STS). This combination reflects my training in these fields and is an attempt to create an alternative introduction to AI safety, one that doesn't simply copy the AISF curriculum. That said, I plan to recommend the AISF course towards the end of the semester; since my students are majoring in everything from CS to psychology, it would be great if some of them considered AI safety research as a career path.

Course Overview 

INTRO TO AI 

Week 1 (8/28-9/1): The foundations of Artificial Intelligence (AI)

Required Readings: 

  • Artificial Intelligence: A Modern Approach, pp. 1-27, Russell & Norvig. 
  • Superintelligence, pp. 1-16, Bostrom. 

Week 2 (9/5-8): AI, Machine Learning (ML), and Deep Learning (DL)

Required Readings: 

Week 3 (9/11-16): What can current AI models do? 

Required Readings: 

AI AND THE FUTURE OF HUMANITY 

Week 4 (9/18-22): What are the stakes? 

Required Readings: 

  • The Precipice, pp. 15-21, Ord. 
  • Existential risk and human extinction: An intellectual history, Moynihan.  
  • Everything might change forever this century (video) 

Week 5 (9/25-29): What are the risks? 

Required Readings: 

  • Taxonomy of Risks posed by Language Models, Weidinger et al. 
  • Human Compatible, pp. 140-152, Russell. 
  • Loss of Control: “Normal Accidents and AI Systems”, Chan. 

Week 6 (10/2-6): From Intelligence to Superintelligence 

Required Readings: 

  • A Collection of Definitions of Intelligence, Legg & Hutter. 
  • Artificial Intelligence as a Positive and Negative Factor in Global Risk, Yudkowsky. 
  • Paths to Superintelligence, Bostrom.

Week 7 (10/10-13): Human-Machine interaction and cooperation 

Required Readings: 

THE BASICS OF AI SAFETY 

Week 8 (10/16-20): Value learning and goal-directed behavior  

Required Readings: 

  • Machines Learning Values, Petersen.
  • The Basic AI Drives, Omohundro.  
  • The Value Learning Problem, Soares. 

Week 9 (10/23-27): Instrumental rationality and the orthogonality thesis  

Required Readings: 

  • The Superintelligent Will: Motivation and Instrumental Rationality in Advanced Artificial Agents, Bostrom.  
  • General Purpose Intelligence: Arguing The Orthogonality Thesis, Armstrong. 

METAPHYSICAL & EPISTEMOLOGICAL CONSIDERATIONS 

Week 10 (10/30-11/4): Thinking about the Singularity

Required Readings: 

  • The Singularity: A Philosophical Analysis, Chalmers.
  • Can Intelligence Explode?, Hutter. 

Week 11 (11/6-11): AI and Consciousness 

Required Readings: 

  • Could a Large Language Model be Conscious?, Chalmers. 
  • Will AI Achieve Consciousness? Wrong Question, Dennett. 

ETHICAL QUESTIONS

Week 12 (11/13-17): What are the moral challenges of high-risk technologies?   

Required Readings: 

  • Human Compatible, “Misuses of AI”, Russell.
  • The Ethics of Invention, “Risk and Responsibility”, Jasanoff. 

Week 13 (11/20-22): Do we owe anything to the future? 

Required Readings: 

WHAT CAN WE DO NOW 

Week 14 (11/27-12/1): Technical AI Alignment 

Required Readings: 

Week 15 (12/4-8): AI governance and regulation 

Required Readings: 

 

Feedback is welcome, especially if you have readings in mind that you can imagine your 19-year-old self being excited about. 

  1. It's Phil 122, at Queens College, CUNY. 

Comments (3)


I think Thorstad's "Against the singularity hypothesis" might complement the week 10 readings.

I'd also potentially include the latest version of Carlsmith's chapter on power-seeking AI.

This seems great!
