Summaries: Alignment Fundamentals Curriculum

Leon Lang

This is a linkpost for https://docs.google.com/document/d/1mVwQgyrEgWJ9xsO6wIizDp2T3578OqNe4K5goKiOLgE/edit?usp=sharing

The linked document provides my summaries for most core readings and many further readings of the alignment fundamentals curriculum composed by Richard Ngo, as accessed from July to early September 2022. Additionally, it often contains my preliminary opinions on the texts. Note that I’m not an expert on the topic.

I read all the texts while working full-time in a job unrelated to AI alignment. Due to these time constraints, many summaries probably contain mistakes, and my opinions would change upon further reflection. Additionally:

I only streamlined the process after a few weeks, so the summaries from the first weeks are of lower quality, and more of those summaries, or my opinions on them, are missing
Some summaries are also missing since I had a minor repetitive strain issue along the way, and since the curriculum changed while I was reading through it
Sometimes, the formatting is not ideal since I originally wrote the summaries in a Slack channel and then copy-pasted them into Google Docs

Nevertheless, I was told that these summaries are useful, and therefore I’m sharing them with the wider community of people interested in alignment.

If anyone wants to contribute their own summary, please add it as a suggestion in the Google Doc, and I will accept it with attribution to the (optionally anonymous) author.

Note: I posted this on LessWrong before and wanted to crosspost it, but there was a bug, so I'm now posting it manually to the EA Forum as well.

Acknowledgments: I want to thank Albert Garde, Benjamin Kolb, Fritz Dorn, Jens Brandt, and Tom Lieberum for discussions on the curriculum.
