I completely agree! The definition of marginal is somewhat ambiguous the way I've written it. What I mean to say is that the marginal project is the one that is close to the funding bar, like you pointed out.

Thanks for writing this, it's great to hear your thoughts on talent pipelines in AIS.

I agree with your model of AISC, MATS and your diagram of talent pipelines. I generally see MATS as a "next step" after AISC for many participants. Because of that, it's true that we can't cleanly compare the cost-per-researcher-produced between programs at different points in the pipeline, since they are complements rather than substitutes.

A funder would have to consider how to distribute funding between these options (e.g. conversion vs. acceleration) and that's something I'm hoping to model mathematically at some point.

I believe the "carrying capacity" of the AI safety research field is largely bottlenecked on good research leads (i.e., who can scope and lead useful AIS research projects), especially given how many competent software engineers are flooding into AIS. It seems a mistake not to account for this source of impact in this review.

Good idea, this could be a valuable follow-up analysis. To give this a proper treatment, we would need a model of how students and mentors interact to (say) produce more research, and an estimate of how much they complement each other.

In general, we assumed that impacts were negligible if we couldn't model or measure them well in order to get a more conservative estimate. But hopefully we can build the capacity to consider these things!

Thanks for the clarification. A 20% discount rate changes things a lot; I'll have to read into why they chose that.

Let's try to update it. I'm not sure how to categorize different roles into scientists vs. engineers, but eyeballing the list of positions AISC participants got, assume half become scientists and disregard the contributions of research engineers. With a 20% discount rate, 10 years of work in a row is more like 4.5 years, so we get:

333 * 0.45 / 2 = ~75 QARYs / \$1M

The real number would be lower, since AISC focuses on new researchers who take some time to enter the field; e.g., a 3-year delay would roughly halve this value.
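For anyone who wants to check the arithmetic, here is a minimal Python sketch of the adjustment. The discounting convention is my assumption (each year is worth 80% of the previous one, with the first year undiscounted); it happens to reproduce the "more like 4.5" figure, but the original model may discount differently. The function and variable names are made up for illustration.

```python
def discounted_years(years: int, annual_discount: float, delay_years: int = 0) -> float:
    """Sum the discounted value of `years` consecutive years of work.

    Assumed convention: a year of work t years from now is worth
    (1 - annual_discount) ** t of a year of work today.
    """
    keep = 1.0 - annual_discount
    return sum(keep ** (delay_years + t) for t in range(years))


BASELINE_QARYS_PER_MILLION = 333   # undiscounted figure, assuming 10 years per researcher
CAREER_YEARS = 10
SCIENTIST_SHARE = 0.5              # assume half of participants become scientists

# Ten years of work at a 20% discount rate, starting now.
effective = discounted_years(CAREER_YEARS, 0.20)
adjusted = BASELINE_QARYS_PER_MILLION * (effective / CAREER_YEARS) * SCIENTIST_SHARE

# Same ten years of work, but starting after a 3-year delay.
effective_delayed = discounted_years(CAREER_YEARS, 0.20, delay_years=3)
delayed = BASELINE_QARYS_PER_MILLION * (effective_delayed / CAREER_YEARS) * SCIENTIST_SHARE

print(f"effective years: {effective:.2f}")
print(f"QARYs per $1M: {adjusted:.0f}")
print(f"QARYs per $1M with a 3-year delay: {delayed:.0f}")
```

Under this convention, a 3-year delay multiplies the result by 0.8^3 ≈ 0.51, which is where "halve" comes from.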

Yes we're definitely interested in doing more work along these lines! Personally, I think there are returns-to-scale to doing these kinds of assessments across similar programs since we can compare across programs and draw more general lessons.

Probably the best way for people to contact us in general is to email us at hi@arbresearch.com. Misha and I can have a few meetings with you to determine if/how we can help.

I'm going to refrain from giving you a cost estimate since it's not really my department and depends pretty heavily on how many participants you have, the kinds of things you want to measure, etc. But we have flexibility and work with orgs of all budgets/sizes.

Yes, we were particularly concerned that earlier camps were in-person and likely had a stronger selection bias for people interested in AIS (due to AI/AIS being more niche at the time), as well as a geographic selection bias. That's why I have more trust in the participant tracking data for camps 4-6, which were more recent, virtual and had a more consistent format.

Since AISC 8 is so big, it will be interesting to re-do this analysis with a single group under the same format and degree of selection.

Thank you for the pointer! I hadn't seen this before and it looks like there's a lot of interesting thinking on how to study AI safety field building. I appreciate having more cost-effectiveness estimates to compare to.

I haven't given it a full read, but it seems like the quality-adjusted researcher year is very similar to the metric I'm proposing here.

To compare our estimates, let's assume a new AIS researcher does 10 years of quality-adjusted, time-discounted AIS research (note that timelines become pretty important here). Then we get:

(10 QARYs/researcher) / (\$30K/researcher) = 3.33E-4 QARYs per dollar = 333 QARYs per \$1M

That seems similar to the CAIS estimates for MLSS, so it seems like these approaches have pretty comparable results!
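The conversion above is simple enough to sketch in a few lines of Python, which also makes it easy to see how sensitive the result is to the assumed career length. The function name and the alternative career lengths are just illustrative assumptions, not part of either analysis.

```python
def qarys_per_million(qarys_per_researcher: float, cost_per_researcher_usd: float) -> float:
    """Convert a per-researcher cost into QARYs bought per $1M of funding."""
    return qarys_per_researcher / cost_per_researcher_usd * 1_000_000


# Figures from the comment: 10 QARYs per new researcher, $30K per researcher.
estimate = qarys_per_million(10, 30_000)
print(f"{estimate:.0f} QARYs per $1M")

# Hypothetical sensitivity check: shorter timelines mean fewer QARYs per researcher.
for assumed_qarys in (3, 5, 10):
    print(assumed_qarys, round(qarys_per_million(assumed_qarys, 30_000)))
```

The result scales linearly with the assumed QARYs per researcher, so halving the expected career length halves the estimate.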

In the future I'm also interested in modelling how to distribute funding optimally in talent pipelines like these.