Data collection for AI alignment - Career review

Benjamin Hilton; 80000_Hours

This is a linkpost for https://80000hours.org/career-reviews/alignment-data-expert/

This is a (slightly edited) cross-post of 80,000 Hours' newest career review, on data collection for AI alignment. Thanks to Beth Barnes, Leo Gao, Long Ouyang and Connor Leahy for helpful comments and conversation when writing this review.

In a nutshell:

To reduce the risks posed by the rise of artificial intelligence, we need to figure out how to make sure that powerful AI systems do what we want. Many potential solutions to this problem will require a lot of high-quality data from humans to train machine learning models. Building excellent pipelines so that this data can be collected more easily could be an important way to support technical research into AI alignment, as well as lay the foundation for actually building aligned AIs in the future. If not handled correctly, this work risks making things worse, so this path needs people who can and will change directions if needed.

This career will be some people’s highest-impact option if their personal fit is especially good.

Why might becoming an expert in data collection for AI alignment be high impact?

We think it’s crucial that we work to positively shape the development of AI, including through technical research on how to ensure that any potentially transformative AI we develop does what we want it to do (known as the alignment problem). If we don’t find ways to align AI with our values and goals — or worse, don’t find ways to prevent AI from actively harming us or otherwise working against our values — the development of AI could pose an existential threat to humanity.

There are lots of different proposals for building aligned AI, and it’s unclear which (if any) of these approaches will work. A sizeable subset of these approaches require humans to give data to machine learning models, including AI safety via debate, microscope AI, and iterated amplification.

These proposals involve collecting human data on tasks like:

Evaluating whether a critique of an argument was good
Breaking a difficult question into easier subquestions
Examining the outputs of tools that interpret deep neural networks
Using one model as a tool to make a judgement on how good or bad the outputs of another model are
Finding ways to make models behave badly (e.g. generating adversarial examples by hand)

Collecting this data (ideally by setting up scalable systems both to contract people to carry out these sorts of tasks and to collect and communicate the results) could be a valuable way to support alignment researchers who use it in their experiments.

But also, once we have good alignment techniques, we may need AI companies around the world to have the capacity to implement them. That means developing systems and pipelines for the collection of this data now could make it easier to implement alignment solutions that require this data in the future. And if it’s easier, it’s more likely to actually happen.

What does this path involve?

Human data collection mostly involves hiring contractors to answer relevant questions and then creating well-designed systems to collect high-quality data from them.

This includes:

Figuring out who will be good at actually generating this data (i.e. doing the sorts of tasks that we listed earlier, like evaluating arguments), as well as how to find and hire these people
Designing training materials, processes, pay levels, and incentivisation structures for contractors
Ensuring good communication between researchers and contractors, for example by translating researcher needs into clear instructions for contractors (as well as being able to predict and prevent people misinterpreting these instructions)
Designing user interfaces to make it easy for contractors to complete their tasks as well as for alignment researchers to design and update tasks for contractors to carry out
Scheduling workloads among contractors, for example making sure that when data needs to be moved in sequence among contractors, the entire data collection can happen reasonably quickly
Assessing data quality, including developing ways of rapidly detecting problems with your data or using hierarchical schemes of more and less trusted contractors
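
As an illustration of the last item, here is a minimal sketch of one possible hierarchical scheme: labels from more trusted contractors count for more, and items where the weighted agreement is low are flagged for review. The function name, the trust weights, and the 0.7 threshold are all hypothetical choices made for this example, not something described in the review or used by any particular organisation.

```python
from collections import defaultdict


def aggregate_labels(labels, trust, min_agreement=0.7):
    """Combine contractor labels using trust weights and flag weak consensus.

    labels: iterable of (item_id, contractor_id, label) tuples.
    trust:  dict mapping contractor_id to a positive weight (hypothetical values).
    """
    votes = defaultdict(lambda: defaultdict(float))
    for item_id, contractor_id, label in labels:
        # Assumption: contractors without a known trust score get weight 1.0.
        votes[item_id][label] += trust.get(contractor_id, 1.0)

    results = {}
    for item_id, label_weights in votes.items():
        total = sum(label_weights.values())
        best_label, best_weight = max(label_weights.items(), key=lambda kv: kv[1])
        # Share of the total weight that backs the winning label.
        agreement = best_weight / total if total > 0 else 0.0
        results[item_id] = {
            "label": best_label,
            "agreement": agreement,
            "needs_review": agreement < min_agreement,
        }
    return results


# Made-up example data: three contractors judge two critiques.
example_labels = [
    ("critique-1", "senior", "good"),
    ("critique-1", "new-a", "good"),
    ("critique-1", "new-b", "bad"),
    ("critique-2", "senior", "bad"),
    ("critique-2", "new-a", "good"),
    ("critique-2", "new-b", "good"),
]
example_trust = {"senior": 3.0, "new-a": 1.0, "new-b": 1.0}

for item, summary in aggregate_labels(example_labels, example_trust).items():
    print(item, summary)
```

In this made-up data, the second critique is flagged because the more trusted contractor disagrees with the two newer ones. A real pipeline would also need some way to set and update the trust weights, which this sketch leaves out.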

Being able to do all these things well is a pretty unique and rare skillset (similar to entrepreneurship or operations), so if you’re a good fit for this type of work, it could be the most impactful thing you could do.

Avoiding harm

If you follow this path, it’s particularly important to make sure that you are able to exercise excellent judgement about when not to provide these services.

We think it’s extremely difficult to make accurate calls about when research into AI capabilities could be harmful.

For example, it sounds pretty likely to us that work that helps make current AI systems safe and useful will be fairly different from work that is useful for making transformative AI (when we’re able to build it) safe and useful. You’ll need to be able to make judgements about whether the work you are doing is good for this future task.

If you think you might be a good fit for this career path, but aren’t sure how to avoid doing harm, our advising team may be able to help you decide what to do.

Example people

Long Ouyang

After majoring in psychology, Long went on to do a PhD in cognitive psychology at Stanford. His research was at the intersection of psychology and machine learning. During his PhD, Long was convinced that it would be valuable to contribute to work on AI safety. He got a grant from the Future of Life Institute to research psychology and intent alignment. However, Long found it difficult to self-motivate in this research; as an entirely independent researcher, he felt too disconnected from important things going on elsewhere.

At the time, OpenAI was hiring social scientists to help with AI safety via debate. While the work ended up going in a different direction, Long was useful to the OpenAI safety team because of his experience in machine learning. At one point, the safety team started discussing how it would be useful to have a cognitive psychologist around to help collect human data, and Long volunteered himself for this new role. He now works as a research scientist doing human data collection at OpenAI.

How to predict your fit in advance

The best experts at human data collection will have:

Experience designing surveys and social science experiments
Ability to analyse the data collected from experiments
Some familiarity with the field of AI alignment
Enough knowledge about machine learning to understand what sorts of data are useful to collect and the machine learning research process
At least some front-end software engineering knowledge
Some aptitude for entrepreneurship or operations

Data collection is often considered somewhat less glamorous than research, making it especially hard to find good people. So if you have three or more of these skills, you’re likely a better candidate than most!

How to enter

If you already have experience in this area, there are two main ways you might get a job as a human data expert:

Find jobs at organisations working on alignment, particularly those doing empirical alignment research. For example, OpenAI, DeepMind, Anthropic, Redwood Research, and Ought are all good choices. (As of March 2022, Anthropic is hiring for these sorts of roles.) Surge AI is a startup that is also carrying out this sort of work.
Consider founding an organisation to do this work, as suggested by alignment researcher Beth Barnes. If you’re interested in doing this, contact our team.

If you don’t have enough experience to work directly on this now, you can gain experience in a few ways:

Do academic research, for example in psychology, sociology, economics, or another social science.
Work in human-computer interaction or software crowdsourcing.
Work on labelling teams at machine learning companies. Because these roles are less popular, they can be a great way to rapidly gain experience and promotions in machine learning organisations.

The Effective Altruism Long-Term Future Fund and the Survival and Flourishing Fund may provide funding for promising individuals to learn skills relevant to helping future generations — including human data collection. As a way of learning the necessary skills (and directly helping at the same time), you could apply for a grant to build a dataset that you think could be useful for AI alignment. The Machine Intelligence Research Institute has put up a bounty for such a dataset.

The review on the 80,000 Hours website also includes links to our job board, one-on-one advice, and a short reading list.

L Rudolf L, Jun 3 2022

There is currently an active cofounder matching process going on for an organisation to do this, expected to finish in mid-to-late June and with work starting at the latest a month or two later. Feel free to DM me or Marc-Everin Carauleanu (who independently submitted this idea to the FTX FF idea competition) if you want to know more.

Any concrete information on the following would be very welcome: exactly what service alignment researchers most need, how much this problem is estimated to block progress on alignment, the pros and cons of each existing org having its own internal service for this, and how large the alignment-related data market is.

Effective Altruism Forum