
Over the last 2.5 to 3 years, I have been trying to learn about and gain experience in AI safety, so that when I finish my PhD in physics I might be able to contribute to the AI safety community's efforts. During these years I have noticed that:

  • There seem to be large groups of potential junior researchers, and the community has several programs in place for them such as the AI Safety Camp or AI Safety Research Program.
  • Funding is growing, but it is still largely concentrated in a handful of places: university groups such as CHAI, FHI and CSER, institutes not affiliated with universities (e.g. the Center on Long-Term Risk), and a few companies (DeepMind, OpenAI, Anthropic). It seems to me that there are still great places out there where research could happen but currently does not.

So, given that the existence of more senior researchers seems to be a bottleneck: what can the community do to get them more interested in this topic? I read somewhere that there are two ways to get people involved:

  • Telling them this is an important problem.
  • Telling them this is an interesting problem.

I think it may be worth trying a combination of both, e.g. "hey, this is an interesting problem that could be important for making systems reliable". I really don't think one needs to convince them of longtermism as a prior step.

In any case, I wanted to use this question to generate concrete actions that people such as the EA Long-Term Future Fund managers could fund to address this bottleneck. The only time I have seen something similar is the grant "Claudia Shi ($5,000): Organizing a 'Human-Aligned AI' event at NeurIPS", registered at https://funds.effectivealtruism.org/funds/payouts/september-2020-long-term-future-fund-grants.

There might also be other ways, but I don't think I know academic dynamics well enough to say. In any case, I am aware that publishing technical AI safety work at these conferences does not seem to be an issue, so I believe the bottleneck is in getting senior researchers genuinely interested in the topic.





Academics choose to work on things when they're doable, important, interesting, publishable, and fundable. Importance and interestingness seem to be the least bottlenecked parts of that list.

The root of the problem is difficulty in evaluating the quality of work. There's no public benchmark for AI safety that people really believe in (nor do I think there can be, yet - talk about AI safety is still a pre-paradigmatic problem), so evaluating the quality of work actually requires trusted experts sitting down and thinking hard about a paper - much harder than just checking if it beat the state of the art. This difficulty restricts doability, publishability, and fundability. It also makes un-vetted research even less useful to you than it is in other fields.

Perhaps the solution is the production of a lot more experts, but becoming an expert on this "weird" problem takes work - work that is not particularly important or publishable, and so working academics aren't going to take a year or two off to do it. At best we could sponsor outreach events/conferences/symposia aimed at giving academics some information and context to make somewhat better evaluations of the quality of AI safety work.

Thus I think we're stuck with growing the ranks of experts not slowly per se (we could certainly be growing faster), but at least gradually, and then we have to leverage that network of trust both to evaluate academic AI safety work for fundability / publishability, and also to inform it to improve doability.

Create a journal of AI safety, and get prestigious people like Russell publishing in it.

Basically many people in academia are stuck chasing publications. Aligning that incentive seems important.

The problem is that journals are hard work, and require people with a very specific profile to push them forward.

Here is a post mortem of a previous attempt: https://distill.pub/2021/distill-hiatus/

From your comment, I just learned that Distill.pub is shutting down and this is sad.

The site was beautiful. The attention to detail, and attention to the reader and presentation were amazing. 

Their mission seems relevant to AI safety and risk.

Relevant to the main post and the comment above, the issues with Distill.pub seem not to be structural/institutional/academic/social—but operational, related to resources and burnout.

This seems entirely fixable by money, maybe even a reasonable amount compared to other major interventions in the AI/longtermist space?

To explain, consider the explanation on the page:

But it is not sustainable for us to continue running the journal in its current form...We also think that it’s a lot healthier for us and frees up our energy to do new projects that provide value to the community.


We set extremely high standards for ourselves: with early articles, volunteer editors would often spend 50 or more hours improving articles that were submitted to Distill and bringing them up to the level of quality we aspired to. This invisible effort was comparable to the work of writing a short article of one’s own. It wasn’t sustainable, and this left us with a c

...
Jaime Sevilla
I am more bullish about this. I think for Distill to succeed it needs to have at least two full-time editors committed to the mission. Managing people is hard. Managing people, training them and making sure the vision of the project is preserved is insanely hard - a full-time job for at least two people. Plus, the part Distill was bottlenecked on is very highly skilled labour, which needed a special aesthetic sensitivity and commitment. 50 senior hours per draft sounds insane - but I do believe the Distill staff when they say it is needed. This wraps back to why new journals are so difficult: you need talented researchers with additional entrepreneurial skills to push them forward. But researchers by and large would much rather just work on their research than manage a journal.
Charles He
Hi, this is another great comment, thank you! Note: I think what was meant here was "bearish", not bullish. I think what you're saying is you're bearish or have a lower view of this intervention because the editor/founders have a rare combination of vision, aesthetic view and commitment.  You point out this highly skilled management/leadership/labor is not fungible—we can't just hire 10 AI practitioners and 10 designers to equal the editors who may have left.
Jaime Sevilla
Oops yes 🐻 Yes, exactly. I think what I am pointing towards is something like "if you are one such highly skilled editor, and your plan is to work on something like this part time while delegating work to more junior people, then you are going to find yourself burnt out very soon. Managing a team of junior people / people who do not share your aesthetic sense to do highly skilled labor will be, at least for the first six months or so, much more work than doing it on your own." I think an editor will be ten times more likely to succeed if:
1. They have a highly skilled co-founder who shares their vision
2. They have a plan to work on something like this full time, at least for a while
3. They have a plan for training aligned junior people on skills OR for teaching taste to experts
In hindsight I think my comment was too negative, since I would still be excited about someone retrying a Distill-like experiment and throwing money at it.

I think you can still publish in conferences, and I have seen that at least AAAI lists safety and trustworthiness among its areas of interest. I would say, then, that this is not the main issue?

Creating a good journal seems like a good thing to do, but I think it addresses a slightly different problem, "how to align researchers' incentives with publishing quality results", not necessarily getting them excited about AIS.

I think the point is more that the number of academics 'excited' about AIS would increase as the number of venues for publication grew.

Another idea is replicating something like Hilbert's speech in 1900, in which he listed 23 open maths problems and which seems to have had some impact on agenda setting for the whole mathematical community. https://en.wikipedia.org/wiki/Hilbert's_problems

Doing this well for the field of AI might get some attention from AI scientists and funders.

I wonder if a movie about realistic AI x-risk scenarios might have promise. I have somewhere in the back of my mind that Dr. Strangelove possibly inspired some people to work on the threat of nuclear war (the Wikipedia article is surprisingly sparse on the topic of the movie’s impact, though).

  • There was a 2020 documentary We Need To Talk About AI. All-star lineup of interviewees! Stuart Russell, Roman Yampolskiy, Max Tegmark, Sam Harris, Jurgen Schmidhuber, …. I've seen it, but it appears to be pretty obscure, AFAICT.
  • I happened to watch the 2020 Melissa McCarthy film Superintelligence yesterday. It's umm, not what you're looking for. The superintelligent AI's story arc was a mix of 20% arguably-plausible things that experts say about superintelligent AGI, and 80% deliberately absurd things for comedy. I doubt it made anyone in the audience think very hard about anything in particular. (I did like it as a romantic comedy :-P )
  • There's some potential tension between "things that make for a good movie" and "realistic", I think.

At least the novel the movie is based on seems to have had significant influence:

Kubrick had researched the subject for years, consulted experts, and worked closely with a former R.A.F. pilot, Peter George, on the screenplay of the film. George’s novel about the risk of accidental nuclear war, “Red Alert,” was the source for most of “Strangelove” ’s plot. Unbeknownst to both Kubrick and George, a top official at the Department of Defense had already sent a copy of “Red Alert” to every member of the Pentagon’s Scientific Advisory Committee for Ballistic M

...
Maybe one could send a free copy of Brian Christian's "The Alignment Problem" or Russell's "Human Compatible" to the office addresses of all AI researchers who might find it interesting?
Steven Byrnes
I saw Jeff Hawkins mention (in some online video) that someone had sent Human Compatible to him unsolicited but he didn't say who. And then (separately) a bit later the mystery was resolved: I saw some EA-affiliated person or institution mention that they had sent Human Compatible to a bunch of AI researchers. But I can't remember where I saw that, or who it was.   :-(
Interesting anyway, thanks! Did you by any chance notice if he reacted positively or negatively to being sent the book? I was a bit worried it might be considered spammy. On the other hand, I remember reading that Andrew Gelman regularly gets sent copies of books he might be interested in so that he can write a blurb or review, so maybe it's just a thing that happens to scientists and one needn't be worried.
Steven Byrnes
See here, the first post is a video of a research meeting where he talks dismissively about Stuart Russell's argument, and then the ensuing forum discussion features a lot of posts by me trying to sell everyone on AI risk :-P (Other context here.)
Perfect, so he appreciated it despite finding the accompanying letter pretty generic, and thought he received it because someone (the letter listed Max Tegmark, Yoshua Bengio and Tim O'Reilly, though w/o signatures) believed he'd find it interesting and that the book is important for the field. Pretty much what one could hope for. And thanks for the work trying to get them to take this more seriously; it would be really great if you could find more neuroscience people to contribute to AI safety.

I'm not an academic, nor do I otherwise have much familiarity with academic careers, but I have occasionally heard people talk about the importance of incentive structures like publication requirements for tenure, the ease of staying in established fields, the difficulty/opportunity cost of changing research fields later in your career, etc. Thus I think it would be helpful/interesting to look at things more from that incentive-structure side, in addition to asking "how can we convince people AI safety is important and interesting?"

I agree that the creation of incentives is a good framing for the problem. I wanted to note a few things though:

  • Academics often have considerable freedom to research what they want, and the main incentives are the number of publications or citations. Since you can publish AIS papers in standard top conferences, I do not see a big problem here, although I might be wrong, of course.
  • Changing the incentives either is more difficult (changing protocols at universities or government bodies?) or amounts to just giving money, which the community seems to be doing already. That's what makes me think that academic interest is more of a bottleneck, but I am not super informed.