
Tl;dr: Are there any research tasks/projects in AI or biosecurity that could potentially benefit from having a large group of lightly-supervised (but preferably paid) interns work on the problem? (E.g., collecting and codifying data for a dataset, compiling and tagging literature on a topic)



Some of my previous unpaid internships might be negatively described as “intern mills”: 

  • At the START Consortium, roughly 60-70 interns worked on various teams to collect and codify data for terrorism research, including the widely cited Global Terrorism Database (GTD). The ratio of paid supervisors to unpaid interns was perhaps around 1:6.
  • At the Think Tanks and Civil Societies Project (TTCSP), roughly 100 interns worked on various teams to collect organizational data (e.g., budgets, founding dates, contact information) and scholarly literature on think tanks. There was only one (partially) paid director, but the interns were organized into a hierarchy of “executive” interns, team leads (which I was), and regular interns. None of the interns were paid.

Some people may look at this and consider the degree of unpaid labor appalling. Personally, I found the overall programs somewhat revelatory: you can point and shoot large numbers of interns at some problems and come out with useful outputs like the GTD, while the interns get to work on topic areas they might be interested in and develop skills/experience that may be useful for future job applications. TTCSP was really the standout example here: a single person could lead an organization of roughly 100 unpaid interns in producing various reports, databases, and literature compilations! (Full disclosure: the quality of the work predictably suffered from such dramatic overextension, but I feel fairly confident it was better than nothing, and with more funding it definitely could have improved.)

However, I didn’t want to do work on terrorism or on think tanks generally: I’ve sought work on AI or biosecurity for the past three years, but none of the six internships I managed to get have related to those topics, and only one heavily focused on technology more broadly. Five of the six internships were unpaid.

Given my past experiences, I’ve long been wondering “why can’t some EA organization just do some kind of internship project-family on various cause area topics, including AI or biosecurity? Surely there has to be some task or project out there that a large group of just lightly-supervised, unpaid (or minimally-paid) interns could help out with, whether it’s some kind of dataset to establish base rates, creating literature/argument/epistemic maps, tracking science funding or government projects related to AI/biosecurity, etc.”

This all leads me to the question stated up front: Are there any research tasks/projects in AI or biosecurity that could potentially benefit from having a large group of lightly-supervised interns work on the problem? (E.g., collecting and codifying data for a dataset, compiling and tagging literature on a topic)


Note: In this question post I’m not trying to defend the merits of such a proposal; I’m mainly just trying to solicit topic ideas from people (and preview the idea), which I will then incorporate into a normal post that proposes/discusses the idea. That being said, I would love to hear any initial feedback people have on the idea, including any kinds of objections you think I should address in a larger post (if I were to proceed with this)!

4 Answers

I'm running Redwood Research's interpretability research.

I've considered running an "interpretability mine": we get 50 interns, put them through a three-week training course on transformers and our interpretability tools, and then put them to work on building mechanistic explanations of parts of some model like GPT-2 for the rest of their internship.

My usual joke is "GPT-2 has 12 attention heads per layer and 48 layers. If we had 50 interns and gave them each a different attention head every day, we'd have an intern-day of analysis of each attention head in 11 days."
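As a quick check of the arithmetic in that joke, here is a minimal sketch. The head and layer counts are taken as quoted above and are not checked against any particular GPT-2 size; the function name `days_to_cover` is just illustrative. With these inputs the division comes out slightly above 11, so covering every head takes 12 whole days.

```python
import math

# Figures as quoted in the joke (assumptions, not verified against a specific GPT-2 size).
HEADS_PER_LAYER = 12
LAYERS = 48
INTERNS = 50  # each intern analyses one attention head per day


def days_to_cover(total_items: int, workers: int) -> int:
    """Whole days needed so that every item receives one worker-day."""
    return math.ceil(total_items / workers)


total_heads = HEADS_PER_LAYER * LAYERS
days_exact = total_heads / INTERNS
days_whole = days_to_cover(total_heads, INTERNS)

print(total_heads)  # 576
print(days_exact)   # 11.52
print(days_whole)   # 12
```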

This is bottlenecked on various things:

  • having a good operationalization of what it means to interpret an attention head, and some way to assess the quality of the explanations the interns produce. This could also be phrased as "having more of a paradigm for interpretability work".
  • having organizational structures that would make this work.
  • building interpretability tools that make this work relatively easy for a smart CS/math undergrad who has done our three-week course.

I think there's a 30% chance that in July, we'll wish that we had 50 interns to do something like this. Unfortunately this is too low a probability for it to make sense for us to organize the internship.

Now that it's after July, did you ever end up wishing you had 50 interns to do something like this?

I am glad we did not have 50 interns in July. But I’m 75% confident that we’ll run a giant event like this with at least 25 participants by the end of January. I’ll publish something about this in maybe a month.
Peter Wildeford

One potential idea I've had—which I'll admit isn't strictly related to AI or biosecurity but does seem like it could heavily scale with larger numbers of researchers—is to have interns flesh out a visualized reasoning model (perhaps similar to an "epistemic map") regarding existential risk/recovery scenarios, inspired by the work that Luisa Rodriguez did regarding this topic (see her post here on the forum and her appearance on the 80K podcast).

Such a project could basically lay out different scenarios with their input variables (e.g., "Only Q number of people survive and they are distributed in L locations", "ash/dust particles cover X% of the world for Y years and reduce light/photosynthesis by Z%"), diagram how different variables or world states may importantly interact with each other, conduct and input research on details like "what is the population size-viability likelihood curve (given concerns of genetic diversity in conditions which may feature higher rates of infant/maternal mortality)", theorize on various dynamics (e.g., how serious will security dilemmas be for groups with pre-apocalyptic firearms but without pre-apocalyptic law enforcement and military structures), etc. In addition to creating such maps (which ideally could be published and subsequently explored at a user's direction), the interns could produce reports with noteworthy "extinction vignettes" or "extinction variable-combinations" (e.g., what are some key conditions that seem likely to cause slow extinction/non-recovery in scenarios that do not feature near-immediate widespread extinction) as well as key uncertainties (e.g., climate models).

I'm curious whether other people find this idea potentially interesting or valuable?

(I briefly searched for more on this a few months ago but didn't see anything similar, and Luisa Rodriguez didn't mention any such projects when I emailed her, but if there already is something like this please let me know!)

Idea: take a long list of project ideas and have interns prioritise them. For example, list out 200-300 bio, AI, or EA meta projects and have 3 interns each write a separate one-day review of every project. This could be done with minimal oversight, listing the ideas could be quick, and in theory it could create a useful resource.
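To get a rough sense of scale, here is a minimal sketch of the workload implied by those numbers (200-300 projects, 3 independent reviews each, one day per review). The cohort size of 50 is purely a hypothetical input for illustration, and the function names are made up for this example.

```python
import math


def review_intern_days(projects: int, reviews_per_project: int = 3,
                       days_per_review: int = 1) -> int:
    """Total intern-days needed to review every project the stated number of times."""
    return projects * reviews_per_project * days_per_review


def calendar_days(intern_days: int, cohort_size: int) -> int:
    """Whole working days needed if the whole cohort reviews in parallel."""
    return math.ceil(intern_days / cohort_size)


low = review_intern_days(200)
high = review_intern_days(300)
print(low, high)  # 600 900

# Hypothetical cohort of 50 interns (an assumption, not a figure from the idea itself).
print(calendar_days(low, 50), calendar_days(high, 50))  # 12 18
```

So the full exercise would cost somewhere between 600 and 900 intern-days, which a large cohort could plausibly finish in a few working weeks.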

Of course, I'm not sure how well it would match an expert take on the topic, and there are lots of challenges and potential problems with unpaid intern labour.

If someone wants to organise this and has an intern army I would be happy to discuss / help.