Calling for Student Submissions: AI Safety Distillation Contest

a_e_r

Comments 28

Dan Hendrycks and I would love for somebody to distill some of his papers! https://arxiv.org/abs/2008.02275 https://arxiv.org/abs/2110.13136

mariushobbhahn

Hi, are PhD students also allowed to submit? I would like to submit a distillation and would be fine with not receiving any money in case I win a prize. In case this complicates things too much, I could understand if you don't want that.

a_e_r

Hi! I’ve been thinking about this a bit more and I do think I want graduate students to be able to submit! However, since the main audience is meant to be undergraduate students, I may have to be harsher in evaluation or, more excitingly, maybe I could create a new tier for graduate students? For now I’d say feel free to submit and I’ll work out more specifics on my end and make an edit (+ reply to this) if I make official changes!

mariushobbhahn

That sounds very reasonable. Thanks for the swift reply.

ChanaMessinger

This is exciting, I really like this idea, and I'm glad it's being put into action. Do you know who your judges are? I don't have any technical knowledge myself, so I'm not speaking from inside view, but one concern I have is whether it might be relatively easy to write distillations that seem good but contain subtle misunderstandings that matter and that would be hard for someone outside the field to catch. (Certainly in my conversations with friends who know more than me, talking through my amateur understanding of the Discord chats has exposed some important gaps on my part.)

a_e_r

Great point! Early on, I had someone more connected than me make a list of potential judges. We have 15 names brainstormed and sectioned off by how much they know about alignment. I can say with pretty high certainty that we will have at least one person whose full-time job is alignment reading the submissions (likely someone with a CS doctorate), and hopefully we can get even more expertise :)

ChanaMessinger

Awesome! Will you be releasing the list at some point?

Also, if you're still on the lookout for more judges, I can potentially send people your way! If not, great!

a_e_r

I wasn’t anticipating releasing the list (in some part because people may try to pander to a certain judge’s background and in some part to allow myself and the judges more flexibility in adding people last second).

Sending some judge recommendations my way would be great! I think having a variety of readers would be helpful :) Thank you!

MaxRa

Really cool, just last week I was thinking about whether the alignment community should (massively) scale up prizes with relatively low barriers to entry!

Have you considered making this bigger? E.g. with more prizes and more active outreach to other universities?

I initially thought that ideally every contribution that clears a certain bar should be rewarded accordingly; that way there's less uncertainty about payoffs and more people will contribute.
I think you could likely find more texts to recommend, but even duplicated distillations are still valuable for getting students thinking about alignment research and for identifying particularly promising candidates.
Evaluation time is a likely bottleneck, but you could probably find a handful of e.g. AGI Safety Fundamentals alumni to volunteer a few hours, or many more if you offer compensation for helping out.

a_e_r

Thank you! These are thoughtful comments! I think I will try to add more texts and find more readers, as you suggest.

I've been thinking of going into working on creating contests in the future as a potentially serious work project, so I hope to create some contests that can be larger scale then! Right now, I'm rather limited in capacity. Thankfully, I'm connected with some other great university organizers who I've let know about advertising at their schools.

I think it would be tricky to have clear baseline cutoffs for distillation that still capture quality since writing varies so much between people. Do you have any ideas of clear cutoffs that would retain quality (for future contests if nothing else)?

MaxRa

You have probably already seen that the contest was featured on AstralCodexTen, so you might get more obviously good submissions than you have prizes for, and it would kinda feel like a wasted opportunity not to clearly signal (i.e. with money) to those authors that their work is highly appreciated and that we would love for them to do more of this work.

MaxRa

I think I will try to add more texts and find more readers, as you suggest.
I've been thinking of going into working on creating contests in the future as a potentially serious work project

Nice and nice! :)

Do you have any ideas of clear cutoffs that would retain quality (for future contests if nothing else)?

Hmm, is your worry that distillations that in hindsight seem to be fairly sub-optimal (e.g. with major mistakes or confusing explanations) end up receiving the lowest-tier prize because there is some noise introduced by the people who rate the distillations? I think this might happen only rarely, for maybe 2 in 100 distillations? I think your list of scoring criteria already goes a long way toward giving raters a good idea of what solid work looks like. The money for the lowest tier would also not be a lot, maybe $200. Giving a prize to in-hindsight subpar work would maybe reduce the prestige of the prize a little bit, but I think it's a fairly junior prize anyway that mostly encourages and rewards initial solid efforts. Also, you would still have the higher tiers for especially good work, which would lose little prestige.

a_e_r

I do think it's possible that we might award more prizes retroactively if we recognize that we receive a lot of valuable submissions! Maybe an "honorable mentions" category.

Ah, I think my worry is that it feels difficult for me to find a standard to rate that actually tracks quality. If I give a couple of examples, people may feel limited to having their work look like those examples. I might say "make your distillation 1,000 words and explain two papers and I'll give you a prize" but 1,500 words on one paper might have made an optimal submission and I would have limited people's abilities. I think I find it hard to quantify a bar on writing since everyone has such different approaches. I think the real bar is something more like "the judges who know more about AI Safety than me believe that you have communicated this idea really well" and because of that it feels wrong for me to try to say "and if you do x you will definitely win something."

MaxRa

Maybe an "honorable mentions" category.

If they already get a prize, I wouldn't call it "honorable mentions" because that unnecessarily diminishes it in my eyes. Just have anything that seems like it would get a B- in school be in the same category as the $250 prize?

Ah, I think my worry is that it feels difficult for me to find a standard to rate that actually tracks quality.

Ah, interesting, I have the opposite intuition! :D I completely agree that you shouldn't give advice about the length of the distillations, but the criteria you mention here just seem really useful, and I'd be surprised if e.g. you found something clearly presented and accessible and I didn't.

Depth of understanding
Clarity of presentation
Rigor of work
Concision/Length (longer papers will need to present more information than shorter papers)
Originality of insight
Accessibility

And I feel like somebody who has spent ~40 hours reading and discussing AI Safety material (e.g. as part of the AGI Safety Fundamentals course) could do a reasonably coherent job of rating the understanding and rigor. Originality seems maybe the trickiest, as you probably have to have some grasp of which ideas/framings are already in the water and which aren't.

Miranda_Zhang

Really excited for this - I think distillation will be useful not only for checking the distiller's understanding, but also in better communicating ideas around AI safety. Thanks for starting up this project!

aog

Hi, would Anthropic's research agenda be a good candidate for distilling?

badeliz

Would the definition of "student enrolled in a university/college" include master's students? I would normally think so but the ACX signal boost from today describes this as an "undergraduate" contest, so I wanted to double check.

a_e_r

The primary audience for this contest is undergrads, but Master’s students are allowed!

mlc

more distillations, yay! 🥳

the Distillation Contest, is now open to any student enrolled in a university/college

could you organize this to also include people that aren't enrolled students?

a_e_r

Thank you!

Could you clarify what you mean? Do you mean students who are on a break from college, newly admitted students who aren’t yet attending, or something else?

mlc

I am referring to people that chose alternative career paths to AI, autodidacts and independent ML researchers for example.

a_e_r

Unfortunately, I created this contest to help build up university groups, so I think keeping the contest limited to enrolled students (including students who are entering college later this year and students who will graduate before the contest ends) would be the best way to ensure that students feel like they have an advantage in the contest. Thank you for clarifying!

mlc

Thanks for the explanation.

other thoughts: Abram’s decision theory and Vanessa’s infrabayes work might be good for distillation. Also, might be worth thinking about some type of collab with current distillers, such as Robert Miles or Mark Xu, and the site distill.pub?

a_e_r

Oh, great idea! If nothing else, distill.pub is a great resource for me to list!

mlc

Thanks for pointing that out!

TW123

Distill was never really about distillations in the sense this post is referring to. It was a journal that focused on having very high-quality presentation/visualizations. It's also no longer active: https://distill.pub/2021/distill-hiatus/

Ishan Mukherjee

Are multiple submissions allowed?

a_e_r

Sure :)
