Related: AI policy ideas: Reading list.

This document is about ideas for AI labs. It's mostly from an x-risk perspective. Its underlying organization black-boxes technical AI stuff, including technical AI safety.

Lists & discussion



Maybe I should make a separate post on desiderata for labs (for existential safety).



See generally The Role of Cooperation in Responsible AI Development (Askell et al. 2019).


Transparency enables coordination (and some regulation).

Publication practices

Labs should minimize/delay the diffusion of their capabilities research.

Structured access to AI models

Governance structure


See also

Some sources are roughly sorted within sections by a combination of x-risk relevance, quality, and influence, but sometimes I didn't bother to sort them, and I haven't read all of them.

Please have a low bar for suggesting additions, substitutions, rearrangements, etc.

Current as of: 9 July 2023.

  1. ^

    At various levels of abstraction, coordination can look like:
    - Avoiding a race to the bottom
    - Internalizing some externalities
    - Sharing some benefits and risks
    - Differentially advancing more prosocial actors?
    - More?

  2. ^

    Policymaking in the Pause (FLI 2023) cites A Systematic Review on Model Watermarking for Neural Networks (Boenisch 2021); I don't know if that source is good. (Note: this disclaimer does not imply that I know that the other sources in this doc are good!)

    I am not excited about watermarking. (Note: this disclaimer does not imply that I am excited about the other ideas in this doc! But I am excited about most of them.)





Comments

Zach - it can be helpful to develop reading lists. But in my experience, busy people are much more likely to dive into a list of 3-4 items, each no more than 2,000 words, than into a comprehensive list of all the great things they could possibly read if they had a few extra weeks of life.

So, the ideally targeted 'AI risk/AI alignment' reading list, IMHO, would involve no more than 8,000 words total (that could be read in about 40 minutes).

That would be good too! And it would fill a different niche. This list is mostly meant for AI strategy researchers rather than busy laymen, and it's certainly not meant to be read cover to cover.

(Note also that this list isn't really about AI risk and certainly isn't about AI alignment.)

(Note also that I'm not trying to make people "more likely" to read it; it's optimal for some people to engage with it and not optimal for others.)
