
Crossposted from the AI Alignment Forum. May contain more technical jargon than usual. [Additional note: this post was crossposted with the author's permission, but not by the author. The author may not see or respond to comments on this post. You can see the original post here.]

Many technical alignment researchers are bad-to-mediocre at writing up their ideas and results in a form intelligible to other people. And even for those who are reasonably good at it, writing up a good intuitive explanation still takes a lot of work, and that work lengthens the turnaround time on publishing new results. For instance, a couple months ago I wrote a post which formalized the idea of abstractions as redundant information, and argued that it’s equivalent to abstractions as information relevant at a distance. That post came out about two months after I had the rough math worked out, because it took a lot of work to explain it decently - and I don’t even think the end result was all that good an explanation! And I still don’t have a post which explains well why that result is interesting.

I think there’s a lot of potential space in the field for people who are good at figuring out what other researchers’ math is saying intuitively, and why it’s interesting, and then communicating that clearly - i.e. the skill of distillation. This post will briefly sketch out what two kinds of distillation roles might look like, what skills are needed, and talk about how one might get started in such a role.

Two Distiller Roles

The two types of distiller role I’ll sketch are:

  • “Independent” distiller: someone who works independently, understanding work published by other researchers and producing distillations of that work.
  • “Adjunct” distiller: someone who works directly with one researcher or a small team, producing regular write-ups of what the person/team is thinking about and why.

These two roles add value in slightly different ways.

An independent distiller’s main value-adds are:

  • Explaining the motivation and intended applications
  • Coming up with new examples
  • Boiling down the “key intuitive story” behind an argument
  • Showing how the intuitive story fits into the context of the intended applications

I expect that coming up with novel examples and boiling down the core intuitive story behind a bunch of math are the rate-limiting skills here.

Rob Miles is a good example of an existing independent distiller in the field. He makes YouTube videos intuitively explaining various technical results and arguments. Rob’s work is aimed somewhat more at a popular audience than what I have in mind, but it’s nonetheless been useful for people in the field.

I expect an adjunct distiller’s main value-adds are:

  • Writing up explanations, examples, and intuitions, similar to the independent distiller
  • Saving time for the technical researcher/team, allowing more specialization
  • Providing more external visibility/legibility into the research process and motivation
  • Accelerating the research process directly by coming up with good examples and intuitive explanations

I expect finding a researcher/team to work with is the rate-limiting step to this sort of work.

Mark Xu is a good example of an existing adjunct distiller. He’s worked with both Evan Hubinger and Paul Christiano, and has written up decent distillations of some of their thoughts. I believe Mark did this with the aim of later doing technical research himself, rather than mostly being a distiller. That is a pretty good strategy and I expect it to be a common pathway, though naturally I expect people who aim to specialize in distillation long-term will end up better at distillation.

What Kind Of Skills Are Needed?

I expect the key rate-limiting skills are:

  • Ability to independently generate intuitive examples when reading mathematical arguments, or having a mathematical discussion
  • Ability to extract the core intuitive story from a mathematical argument
  • Writing/drawing skills to clearly convey technical intuitions to a wider audience
  • Ability to do most of the work of crossing the communication gap yourself - both so that researchers do not need to spend a lot of effort communicating to you, and so that readers do not need to spend a lot of effort understanding you
  • For the adjunct role, ability to write decent things quickly and frequently without too much perfectionism
  • For the independent role, ability to do all this relatively independently

How To Get Started

Getting started in an independent distiller role should be pretty straightforward: choose some research, and produce some distillations. It’s inherently a very legible job, so you should pretty quickly have some good example pieces which you could showcase in a grant application (e.g. from the Long Term Future Fund or FTX Future Fund). That said, bear in mind that you may need some practice before you actually start to produce very good distillations.

An adjunct role is more difficult, because you need someone to work with. Obvious advice: just asking people is an underutilized strategy, and works surprisingly well. Be sure to emphasize your intended value-add to the researcher(s). If you want to prove yourself a bit before reaching out, independently distilling some of a researcher’s existing public work is another obvious step. You might also try interviewing a researcher on some part of their work, and then distilling that, in order to get a better feel for what it would be like to work together before actually committing.







See also: "Research Debt" (Olah & Carter, 2017).

At what point can large language models start to do distillation, especially of the early LW sequences?

I was just talking about this with some friends! Has anyone trained a GPT-3 language model on LessWrong or EA posts and then had it create or try to distill some posts? I think this would be mostly entertaining, kinda like the postmodern philosophy generator or subreddit simulator. But it seems like a win-win regardless of whether it is a good or bad distiller.

If it is bad, it could impart some valuable lessons about recognizing vacuous GPT-3-generated ideas. If it is good, then maybe it could really distill some ideas well or generate new ones (doubtful atm?).

Also there could be a monthly award for whoever predicts which post is AI generated :).

This made me incredibly excited about distilling research! However, I don't really know where to find research that would be valuable to distill. Could you give me some general pointers to help me get started? Also, do you have examples of great distillations that I can use as a benchmark? I'm fairly new to technical AI, since I've been majoring in Chemistry for the last three years, but I'm determined to upskill in AI quickly, and distilling seems like a great challenge to boost my learning while being impactful.

Jonas Hallgren shared this Distillation Alignment Practicum with me, which answers all my questions and much more.

If you are looking for someone with 'distiller' skills, consider people with experience as a "technical writer" in industry, especially those who have worked on complicated, technical software products.
