Hiring engineers and researchers to help align GPT-3

What the team does

In the past, Reflection focused on fine-tuning GPT-3 using a reward function learned from human feedback. Our most recent results are here; they had the unusual virtue of being exciting enough to ML researchers to be accepted at NeurIPS while also being described by Eliezer as “directly, straight-up relevant to real alignment problems.”
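For readers unfamiliar with the idea of learning a reward function from human feedback, here is a minimal sketch of one common formulation: a labeler compares two model outputs, and the reward model is trained so that the preferred output scores higher. The function name and the logistic (Bradley-Terry style) loss are illustrative assumptions, not a description of the team's actual code.

```python
import math


def preference_loss(reward_preferred: float, reward_rejected: float) -> float:
    """Pairwise logistic loss for one human comparison (illustrative assumption).

    Returns -log(sigmoid(reward_preferred - reward_rejected)). The loss is small
    when the reward model already scores the human-preferred sample higher.
    """
    margin = reward_preferred - reward_rejected
    # Two algebraically equal branches, chosen so exp() never overflows.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# The loss shrinks as the reward model agrees more strongly with the labeler.
agree = preference_loss(2.0, 0.0)
disagree = preference_loss(0.0, 2.0)
```

In a real system the two rewards would come from a neural network scoring the two samples, and this loss would be averaged over many comparisons and minimized by gradient descent.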

We’re currently working on three things:

[20%] Applying basic alignment approaches to the API, aiming to close the gap between theory and practice.

[60%] Extending existing approaches to tasks that are too hard for humans to evaluate; in particular, we are training models that summarize more text than human trainers have time to read. Our approach is to use weaker ML systems operating over shorter contexts to help oversee stronger ones over longer contexts. This is conceptually straightforward but still poses significant engineering and ML challenges.

[20%] Conceptual research on domains that no one knows how to oversee, and empirical work on debates between humans (see our 2019 writeup). I think the biggest open problem is figuring out whether and how human overseers can leverage “knowledge” the model acquired during training (see an example here).

If successful, ideas will eventually move up this list, from the conceptual stage to ML prototypes to real deployments. We view this as practice for integrating alignment into transformative AI deployed by OpenAI or another organization.
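To make the second item above more concrete, here is a rough sketch of how a component that only handles short contexts can be applied recursively to a text far longer than any one evaluator reads in a single step. Everything here is an assumption for illustration: `summarize_long`, `chunk_words`, and the `summarize_short` callable are hypothetical names, and the real systems are ML models rather than the toy function used at the end.

```python
from typing import Callable, List


def chunk_words(words: List[str], max_words: int) -> List[List[str]]:
    """Split a word list into consecutive chunks of at most max_words words."""
    return [words[i:i + max_words] for i in range(0, len(words), max_words)]


def summarize_long(
    text: str,
    summarize_short: Callable[[str], str],
    max_words: int,
) -> str:
    """Summarize a long text using only a short-context summarizer.

    summarize_short is a stand-in for a weaker system (or a human) that can
    only handle passages of up to max_words words. Each round summarizes every
    chunk, joins the summaries, and repeats until the text fits in one context.
    """
    words = text.split()
    while len(words) > max_words:
        summaries = [
            summarize_short(" ".join(chunk))
            for chunk in chunk_words(words, max_words)
        ]
        new_words = " ".join(summaries).split()
        if len(new_words) >= len(words):
            # Guard against a summarizer that never shortens its input.
            raise ValueError("summarize_short must shorten its input")
        words = new_words
    return summarize_short(" ".join(words))


def first_three_words(passage: str) -> str:
    """Toy stand-in for a summarizer: keep the first three words."""
    return " ".join(passage.split()[:3])


book = " ".join(f"w{i}" for i in range(20))
summary = summarize_long(book, first_three_words, max_words=5)
```

Because every call to `summarize_short` sees at most `max_words` words, each individual step stays small enough to be checked without reading the whole text, which is the property this sketch is meant to illustrate.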

What you’d do

Most people on the team do a subset of these core tasks:

Design+build+maintain code for experimenting with novel training strategies for large language models. This infrastructure needs to support a diversity of experimental changes that are hard to anticipate in advance, work as a solid base to build on for 6-12 months, and handle the complexity of working with large language models. Most of our code is maintained by 1-3 people and consumed by 2-4 people (all on the team).

Oversee ML training. Evaluate how well models are learning, figure out why they are learning badly, and identify+prioritize+implement changes to make them learn better. Tune hyperparameters and manage computing resources. Process datasets for machine consumption; understand datasets and how they affect the model’s behavior.

Design and conduct experiments to answer questions about our models or our training strategies.

Design+build+maintain code for delegating work to ~70 people who provide input to training. We automate workflows like sampling text from books, getting multiple workers’ answers to questions about that text, running a language model on those answers, then showing the results to someone else for evaluation. This work also involves monitoring worker throughput and quality, automating decisions about what tasks to delegate to whom, and making it easy to add new work or change what people are working on.

Participate in high-level discussion about what the team should be working on, and help brainstorm and prioritize projects and approaches. Complicated projects seem to go more smoothly if everyone understands why they are doing what they are doing, is on the lookout for things that might slip through the cracks, is thinking about the big picture and helping prioritize, and cares about the success of the whole project.
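As a rough picture of the delegation workflow described in the list above (several workers answer a question about a passage, a language model runs on their answers, and someone else evaluates the result), here is a small sketch. All names (`Worker`, `pick_workers`, `run_task`) and the least-loaded assignment rule are hypothetical choices made for illustration; the team's actual tooling is not described in this post.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List


@dataclass
class Worker:
    name: str
    completed: int = 0  # tasks finished so far, used for simple load balancing


def pick_workers(workers: List[Worker], k: int,
                 exclude: Iterable[str] = ()) -> List[Worker]:
    """Pick the k least-loaded workers whose names are not in exclude."""
    excluded = set(exclude)
    eligible = [w for w in workers if w.name not in excluded]
    if len(eligible) < k:
        raise ValueError("not enough eligible workers")
    # sorted() is stable, so ties are broken by roster order.
    return sorted(eligible, key=lambda w: w.completed)[:k]


def run_task(
    passage: str,
    question: str,
    workers: List[Worker],
    answer_fn: Callable[[str, str, str], str],  # stand-in for a human answering
    model_fn: Callable[[List[str]], str],       # stand-in for the language model
    n_answers: int = 2,
) -> Dict[str, object]:
    """Collect several answers, run the model on them, and assign an evaluator."""
    answers: Dict[str, str] = {}
    for worker in pick_workers(workers, n_answers):
        answers[worker.name] = answer_fn(worker.name, passage, question)
        worker.completed += 1
    model_output = model_fn(list(answers.values()))
    # The evaluator must be someone who did not write any of the answers.
    evaluator = pick_workers(workers, 1, exclude=answers.keys())[0]
    evaluator.completed += 1
    return {
        "answers": answers,
        "model_output": model_output,
        "evaluator": evaluator.name,
    }


roster = [Worker("a"), Worker("b"), Worker("c")]
result = run_task(
    "some passage", "q?", roster,
    answer_fn=lambda name, passage, question: f"{name}:{question}",
    model_fn=lambda answers: " | ".join(answers),
)
```

Tracking `completed` per worker is one simple way to approach the throughput monitoring mentioned above; quality tracking and smarter assignment rules would be layered on top in a real system.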

If you are excited about this work, apply here for the ML engineer role and here for the ML researcher role.