2019 AI Alignment Literature Review and Charity Comparison

Larks

75 min read, Dec 19, 2019

Comments 28

SethBaum

My commendations on another detailed and thoughtful review. A few reactions (my views, not GCRI's):

The only case I can think of where scientists are relatively happy about punitive safety regulations, nuclear power, is one where many of those initially concerned were scientists themselves.

Actually, a lot of scientists & engineers in nuclear power are not happy about the strict regulations on nuclear power. Note, I've been exposed to this because my father worked as an engineer in the nuclear power industry, and I've had other interactions with it through my career in climate change & risk analysis. Basically, widespread overestimation of the medical harms from radiation has caused nuclear power to be held to a much higher standard than other sources, especially fossil fuels.

A better example would be recombinant DNA - see Katja Grace's very nice study of it. The key point is the importance of the scientists/engineers buying into the regulation. This is consistent with other work I'm familiar with on risk regulation etc., and with work I've published, e.g. this and this.

My impression is that policy on most subjects, especially those that are more technical than emotional is generally made by the government and civil servants in consultation with, and being lobbied by, outside experts and interests

More precisely, the distinction is between issues that matter to voters in elections (plus campaign donors etc.) and issues that fly more under the radar. For now at least, AI still flies under the radar, creating more opportunity for expert insiders (like us) to have significant impact, as do most other global catastrophic risks. The big exception is climate change. (I'm speaking in terms of US politics/policy. I don't know about other countries.)

Without expert (e.g. top ML researchers in academia and industry) consensus, no useful policy will be enacted. Pushing directly for policy seems if anything likely to hinder expert consensus. Attempts to directly influence the government to regulate AI research seem very adversarial

This depends on the policy. A lot of policy is not about restricting AI, but instead about coordination, harmonizing standards, ensuring quality applications, setting directions for the field, etc. That said, it is definitely important to factor the reactions of AI communities into policy outreach efforts. (As I have been pushing for in e.g. the work referenced above.)

With regard to published research, in general I think it is better for it to be open access, rather than behind journal paywalls, to maximise impact. Reducing this impact by a significant amount in order for the researcher to gain a small amount of prestige does not seem like an efficient way of compensating researchers to me.

It varies from case to case. For a lot of research, the primary audience is other researchers/experts in the field. They generally have access to paywall journals and place significant weight on journal quality/prestige. Also open access journals typically charge author publication fees, generally in the range of hundreds to thousands of dollars. That raises the question of whether it's a good use of funds. I'm not at all against open access (I like open access!); I only mean to note that there are other factors that may make it not always the best option.

it seems a bit of a waste to have to charge for books

Again it depends. Mass-market books typically get a lot more attention when they're from a major publisher. These books are more than just books - they are platforms for a lot of attention and discussion. If e.g. Bostrom had self-published Superintelligence, it probably wouldn't have gotten nearly the same attention. Also good publishers have editors who improve the books, and that costs money. I see a stronger case for self-publishing technical reports that have a narrower audience, especially if the author and/or their organization have the resources to do editing, page layout, promotion, etc.

More prosaically, organisations should make sure to upload the research they have published to their website

Yes, definitely! I for one frequent the websites of peer organizations, and often wish they were more up to date.

in general I do not give full credence to charities saying they need more funding because they want much more than a 18 months or so of runway in the bank

I might worry that this could bias the field away from more senior people who may have larger financial responsibilities (family, mortgage, etc.) and better alternative opportunities for income. There's no guarantee that future donations will be made, which creates a risk for the worker even if they're doing excellent work.

the conventional peer review system seems to be extremely bad at dealing with this issue

Peer review should filter out bad/unoriginal research, sort it by topic (journal X publishes on topic X etc.), and improve papers via revision requests. Good journals do this. Not all journals are good. Overall I for one find significantly better quality work in peer reviewed journals (especially good journals) than outside of peer review.

The Bay Area

I can't speak to concerns about the Bay Area, but I can say that GCRI has found a lot of value in connecting with people outside the usual geographic hubs, and that this is something ripe for further investment in (whether via GCRI or other entities). See e.g. this on GCRI's 2019 advising/collaboration program, which we're continuing in 2020.

Misha_Yagudin

A bit of a tangent. I am confused by SFF's grant to OAK (Optimizing Awakening and Kindness). Could any recommender comment on its purpose, or at least briefly describe what OAK is about? The hyperlink is not very informative.

PeterMcCluskey

OAK intends to train people who are likely to have important impacts on AI, to help them be kinder or something like that. So I see a good deal of overlap with the reasons why CFAR is valuable.

I attended a 2-day OAK retreat. It was run in a professional manner that suggests they'll provide a good deal of benefit to people who they train. But my intuition is that the impact will be mainly to make those people happier, and I expect that OAK's impact will have less effect on people's behavior than CFAR has.

I considered donating to OAK as an EA charity, but have decided it isn't quite effective enough for me to treat it that way.

I believe that the person who promoted that grant at SFF has more experience with OAK than I do.

I'm surprised that SFF gave more to OAK than to ALLFED.

Misha_Yagudin

Peter, thank you! I am slightly confused by your phrasing.

To benchmark, would you say that

(a) CFAR mainline workshops are aimed to train [...] to "people who are likely to have important impacts on AI";
(b) AIRCS workshops are aimed at the same audience;
(c) MSFP is aimed at the same audience?

PeterMcCluskey

Nearly all of CFAR's activity is motivated by their effects on people who are likely to impact AI. As a donor, I don't distinguish much between the various types of workshops.

There are many ways that people can impact AI, and I presume the different types of workshop are slightly optimized for different strategies and different skills, and differ a bit in how strongly they're selecting for people who have a high probability of doing AI-relevant things. CFAR likely doesn't have a good prediction in advance about whether any individual person will prioritize AI, and we shouldn't expect them to try to admit only those with high probabilities of working on AI-related tasks.

Misha_Yagudin

Thank you, Peter. If you are curious, Anna Salamon connected various types of activities with CFAR's mission in the recent Q&A.

Habryka [Deactivated]

I was a recommender for the round, but did not recommend a grant to OAK, so I sadly can't speak to it. I think to get more detailed reasoning on this, you would have to get an answer from the specific recommenders who made that grant. However, if you can't get ahold of them, I can probably give a somewhat bad summary of their views, though I think it would be better to hear things directly from them.

Ofer

Financial Reserves

You listed important considerations; here are some additional points to consider:

1. As suggested in SethBaum's comment, a short runway may deter people from joining the org (especially people with larger personal financial responsibilities and opportunity cost).

2. It seems likely that—all other things being equal—orgs with a longer runway are "less vulnerable to Goodhart's law" and generally less prone to optimize for short-term impressiveness in costly ways. Selection effects alone seem sufficient to justify this belief: Orgs with a short runway that don't optimize for short-term impressiveness seem less likely to keep on existing.

Habryka [Deactivated]

Some minor clarifications for the Long Term Future Fund section (Larks did reach out to us and ask for feedback, though it looks like not all of our corrections made it into the final version). [Correction: We hadn't actually sent the relevant corrections back yet, so this is not Larks's fault. Sorry for that!]

Notably this means the funds only paid out $150,000 to CFAR (10%), as the balance was made up by a private donor after CEA did not approve the second grant.

The thing as written is correct, but I want to clarify that CEA did not reject the grant, but was still in the process of deciding whether to approve the grant, when a private donor stepped in. I expect that CEA would have eventually approved this grant, though there is definitely still some uncertainty in that.

I was not impressed that one grant that saw harsh and accurate criticism on the forum after the first round was re-submitted for the second round. Ex post this didn’t matter as CEA rejected it on substantive grounds the second time

This is correct as written, but I think I want to clarify what is meant by resubmission here:

We’ve had a few grants that have run into logistical difficulties (like there not being a clear way to make this grant compatible with CEA’s charitable objectives), and in cases like this we’ve worked with the potential grantee to resolve those issues. I think that support we provide should be independent from the evaluations of grants that we make, and I don’t think we should reject grants because of logistical issues like that, if they are easily fixable.

The Lauren Lee grant ran into some issues in this space that took a while to resolve, so CEA ended up only properly evaluating the grant in the round following the one in which we recommended it, and then subsequently rejected it. The “resubmission” in that sense shouldn’t be seen as an additional strong endorsement of the grant, but is just a thing that could happen to any grant and I think doesn’t say much about how good we thought the grant was, after we had made the decision to recommend the grant.

Geography chart

We were about to send Larks an updated version of the geography data before this post went up. Here is a graph with my best guesses (this includes all recommendations we made, even for grants that didn't end up going through):

And here is one excluding the three grants that ended up being covered by private donors:

[Edit Note: I briefly had a version of the comment up that showed the geographic distribution by count instead of by grant amount. This is now fixed.]

Habryka [Deactivated]

Also, obviously: thank you a lot for writing this review. As someone with a strong interest in having good public discourse around AI Alignment, I am deeply grateful for all of your work on this, and deeply appreciate the care and effort that goes into these reviews and the effects they have on people trying to successfully navigate the growing AI Alignment landscape.

RyanCarey

Of these categories, I am most excited by the Individual Research, Event and Platform projects. I am generally somewhat sceptical of paying people to ‘level up’ their skills.

If I'm understanding the categories correctly, I agree here.

While generally good, one side effect of this (perhaps combined with the fact that many low-hanging fruits of the insight tree have been plucked) is that a considerable amount of low-quality work has been produced. Furthermore, the conventional peer review system seems to be extremely bad at dealing with this issue... Perhaps you, enlightened reader, can judge that “How to solve AI Ethics: Just use RNNs” is not great. But is it really efficient to require everyone to independently work this out?

I agree. I think part of the equation is that peer review does not just filter papers "in" or "out" - it accepts them to a journal of a certain quality. Many bad papers will get into weak journals, but will usually get read much less. Researchers who read these papers cite them, also taking their quality into account, thereby boosting the readership of good papers. Finally, some core of elite researchers bats down arguments that, being weirdly attractive yet misguided, manage to make it through the earlier filters. I think this process works okay in general, and can also work okay in AI safety.

I do have some ideas for improving our process though, basically to establish a steeper incentive gradient for research quality (in the dimensions of quality that we care about): (i) more private and public criticism of misguided work, (ii) stronger filters on papers being published in safety workshops, probably by agreeing to have fewer workshops, with fewer papers, and by largely ignoring any extra workshops from "rogue" creators, and (iii) funding undersupervised talent-pipeline projects a bit more carefully.

One thing I would like to see more of in the future is grants for PhD students who want to work in the area. Unfortunately at present I am not aware of many ways for individual donors to practically support this.

Filtering ~100 applicants down to a few accepted scholarship recipients is not that different to what CHAI and FHI already do in selecting interns. The expected outputs seem at least comparably-high. So I think choosing scholarship recipients would be similarly good value in terms of evaluators' time, and also a pretty good use of funds.

It's an impressive effort as in previous years! One meta-thought: if you stop providing this service at some point, it might be worth reaching out to the authors of the alignment newsletter, to ask whether they or anyone they know would jump in to fill the breach.

Ben_West🔸

Thanks for writing this up Larks! A highlight of the year, as always.

Derek

Why isn't there a GiveWell-style evaluator for longtermist (or specifically AI safety) orgs?

cole_haus

I'd guess it's because it's very hard to apply a GiveWell approach (i.e. explicit quantitative modeling based on a substantial body of empirical evidence) to many long-termist orgs (whose impacts will often definitionally be unknown for the foreseeable future, and around which there is large and pervasive uncertainty). Open Philanthropy describes an evaluation methodology that seems more suited to long-termist orgs.

Milan Griffes

It would be extremely surprising to me if someone, acting as a self-interested owner of all the world's shoe companies (for example) found it more profitable to protect biodiversity than to raise the price of shoes. Fortunately, in practice universal investors are quite supportive of competition.

Do you have a take about why universal investors tend to support competition, in practice?

Milan Griffes

Cummings's On the referendum #31: Project Maven, procurement, lollapalooza results & nuclear/AGI safety discusses various important trends, including a sophisticated discussion of AGI safety. This is mainly noteworthy because the author is the mastermind of Brexit and the recent Conservative landslide in the UK, and perhaps the most influential man in the UK as a result.

Want to signal-boost this. It's a big, big deal. Probably the most important EA / rationalist thing that happened this decade (in terms of discrete events, harder to say for trends).

See also Cummings' High-performance government.

Sean_o_h

Worth noting the changes that are apparently going to be made in the UK civil service, likely by Cummings' design, and which seem quite compatible with a lot of rationalist thinking.

More scientists in the civil service
Data science, systems thinking, and superforecasting training prioritised.

https://twitter.com/JohnRentoul/status/1212486981713899520

Milan Griffes

More on this, hot off the presses: https://dominiccummings.com/2020/01/02/two-hands-are-a-lot-were-hiring-data-scientists-project-managers-policy-experts-assorted-weirdos/

Milan Griffes

MIRI, in collaboration with CFAR, runs a series of four-day workshop/camps, the AI Risk for Computer Scientists workshops, which gather mathematicians/computer scientists who are potentially interested in the issue in one place to learn and interact. This sort of workshop seems very valuable to me as an on-ramp for technically talented researchers, which is one of the major bottlenecks in my mind.

Curious why CFAR wasn't included in the review as a standalone org, given your high opinion of AIRCS workshops?

Some more context in this comment.

Milan Griffes

On MIRI's finances:

They spent $3,750,000 in 2018 and $6,000,000 in 2019, and plan to spend around $6,800,000 in 2020.

I was surprised to see such a big increase between 2018 & 2019.

Also, $6M for 18 staff seems quite high in absolute terms (~$333,000 per capita).

But I don't have complete information here + have pretty coarse models about all the costs concomitant with administering a nonprofit research org.

Do you have any more information about their finances, why they increased so much from 2018 to 2019, and where the money is going?

Malo

(I'm COO at MIRI.)

Just wanted to provide some info that might be helpful:

We currently have 30 staff at the moment.
Our 2019 fundraiser post has a high-level breakdown of our spending estimates for 2020.
In our 2018 fundraiser post there's a budget estimate for 2019. The upper end of our estimated spending for 2019 in that post was $5.5M. I expect we'll actually come in under $6M but definitely over $5.5M. (This is in line with the upper end of updated spending estimates I generated internally in Q1 2019.)
Our 2018 in review post has a high-level breakdown of our 2018 spending. You can also see audited financial statements on our transparency page. (Note that figures in the financial statements and in the review post might not match up for a bunch of reasons, e.g., differences in how expenses are categorized, or expenses for leasehold improvements and equipment being treated as fixed assets that depreciate over time on the financial statements.)
The notable increase in our spending after 2017 is for the most part due to doubling the size of our staff, where more new staff were added in 2019 than in 2018.
The above doesn't include AI Impacts, which operates on its own restricted funding.

Milan Griffes

Thanks!

We currently have 30 staff at the moment.

Were most of the 12 new staff onboarded early enough in 2019 such that it makes sense to include them in a 2019 per capita expenditure estimate?

Our 2019 fundraiser post has a high-level breakdown of our spending estimates for 2020.

Thanks for the pointer – this is helpful.

Is it right that you're estimating 2020's compensation expenditure at ~$182,000 per employee? (($3.56M + $1.4M + $0.51M) / 30 employees)
What's included in the "cost of doing business" category? $0.8M strikes me as high, but I don't have a granular understanding here.

Malo

What's included in the "cost of doing business" category? $0.8M strikes me as high, but I don't have a granular understanding here.

It includes things like rent, utilities, general office expenses, furnishings/equipment, bank/processing fees, software/services, insurance, bookkeeping/accounting, and visas/legal. The largest expense in the estimated ~$0.8M is rent, which accounts for just over half.

Is it right that you're estimating 2020's compensation expenditure at ~$182,000 per employee? (($3.56M + $1.4M + $0.51M) / 30 employees)

No, that will be an overestimate for a few reasons:

The $0.51M is an estimate of what new research staff we'll add to the team in 2020 will cost. (So above the 30 we have at the moment.)
The $1.4M estimate for General Personnel assumes we'll add one new operations staff in 2020.
The $3.56M estimate for Research Personnel largely represents salaries and related costs for existing research staff, but it also includes compensation for research interns and research contractors.

Were most of the 12 new staff onboarded early enough in 2019 such that it makes sense to include them in a 2019 per capita expenditure estimate?

We added 8 new staff in 2019. When I make our spending estimates, I assume new staff are added evenly throughout the year, i.e., I assume the spending on all new staff in a given year will be ~50% of their total annual cost. In practice, given that we aren't talking about very large numbers here, the accuracy of that estimate varies quite a bit. The distribution of when new staff were added in 2019 was pretty centered on the middle of the year, though the salary levels of those staff will likely complicate things here. (I haven't run those numbers.)
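The half-year convention described above can be written out as a short calculation. This is only an illustrative sketch, not MIRI's own model: it assumes no departures during 2019 (so 30 current staff minus 8 new hires gives 22 at the start of the year), it uses the "over $5.5M, under $6M" range quoted earlier in the thread for total 2019 spending, and the function name `effective_headcount` is hypothetical. It also divides total spending (rent and other costs included) by headcount, so it is not a compensation figure.

```python
def effective_headcount(start_staff: int, new_hires: int, fraction_of_year: float = 0.5) -> float:
    """Full-year-equivalent staff when new hires arrive evenly through the year."""
    return start_staff + new_hires * fraction_of_year


# Assumption: no departures in 2019, so 30 current staff - 8 new hires = 22 at the start.
staff_2019 = effective_headcount(start_staff=30 - 8, new_hires=8)

# Range quoted in the thread for total 2019 spending (over $5.5M, under $6M).
low_spend, high_spend = 5_500_000, 6_000_000

per_capita_low = low_spend / staff_2019
per_capita_high = high_spend / staff_2019

print(f"Effective 2019 headcount: {staff_2019:.0f}")
print(f"Total spending per effective staff member: ${per_capita_low:,.0f} to ${per_capita_high:,.0f}")
```

Under these assumptions the effective 2019 headcount is 26 rather than 18 or 30, which is why the per capita figure depends heavily on which staff count is used.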

Milan Griffes

No, that will be an overestimate for a few reasons...

Got it. Is ~$142,000 per employee closer to what MIRI is estimating for 2020 compensation expenditure?

(Removing the $0.51M term entirely, removing $0.56M from the Research Personnel estimate to account for interns & contractors, and adding one new ops employee to the denominator gives ($3M + $1.4M) / 31 employees = $142k per employee)
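For anyone wanting to check the arithmetic, the two per-employee estimates in this exchange can be reproduced as below. This is a minimal sketch using only the budget lines quoted in the thread; the constant names are made up for illustration, and the $0.56M deduction for interns and contractors is the commenter's own rough assumption rather than a figure MIRI published.

```python
# Figures (in dollars) are MIRI's 2020 budget estimates as quoted in this thread.
RESEARCH_PERSONNEL = 3_560_000
GENERAL_PERSONNEL = 1_400_000
NEW_RESEARCH_STAFF = 510_000

CURRENT_STAFF = 30

# First estimate: all three personnel lines divided by the 30 current staff.
naive_per_employee = (
    RESEARCH_PERSONNEL + GENERAL_PERSONNEL + NEW_RESEARCH_STAFF
) / CURRENT_STAFF

# Revised estimate: drop the new-hire line, subtract an assumed $0.56M for
# interns and contractors, and count the one planned operations hire.
INTERNS_AND_CONTRACTORS = 560_000  # commenter's assumption, not a MIRI figure
adjusted_per_employee = (
    RESEARCH_PERSONNEL - INTERNS_AND_CONTRACTORS + GENERAL_PERSONNEL
) / (CURRENT_STAFF + 1)

print(f"First estimate:   ${naive_per_employee:,.0f} per employee")
print(f"Revised estimate: ${adjusted_per_employee:,.0f} per employee")
```

The revised figure is sensitive to the intern and contractor deduction, so it is best read as a rough estimate rather than an exact salary average.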

Malo

Yeah, that should be a reasonably good estimate.

Milan Griffes

Ought recently published a technical progress update on their recent research.

MichaelA🔸

Thanks for this - this seems a very valuable service, and will inform my donations. (Though it's harder to say what counterfactual influence it'll have, given I was already likely to donate to some of the organisations/funders you speak highly of and suggest you'll likely donate to.)

MichaelA🔸

One question/nit-pick: In the discussion of Zabel & Muehlhauser's post, you write "Researchers from Google Brain were also named authors on the paper." Is that accurate? It seems to me that only Zabel & Muehlhauser are listed as authors of that post, and Google Brain itself (as opposed to Google more generally) isn't mentioned anywhere in the post. (But maybe I'm missing/misunderstanding something.)
