Disclosure: I do contract work for Ought.
Iterated Distillation and Amplification (IDA) is a framework for training ML models.
IDA is related to existing frameworks like imitation learning and reinforcement learning, but it aims to solve tasks that humans can neither solve directly nor construct a suitable reward function for.
This document reviews IDA and proposes three projects that explore aspects of IDA. Project 1 applies IDA to problems in high school mathematics and investigates whether learning to decompose problems can improve performance over supervised learning. Project 2 applies IDA to neural program interpretation, where neural nets are trained on the internal behavior (execution traces) of traditional computer programs. Project 3 investigates whether adaptive computation time (varying compute at inference time as a function of the input) can improve the robustness and efficiency of IDA.
Our goal in outlining these projects is to generate discussion and encourage research on IDA. We are not (as of June 2019) working on these projects, but we are interested in collaboration.
The document also contains one of the clearer explanations of Iterated Distillation and Amplification I've come across (see section 0.1).