2019 AI Alignment Literature Review and Charity Comparison

My commendations on another detailed and thoughtful review. A few reactions (my views, not GCRI's):

The only case I can think of where scientists are relatively happy about punitive safety regulations, nuclear power, is one where many of those initially concerned were scientists themselves.

Actually, a lot of scientists & engineers in nuclear power are not happy about the strict regulations on nuclear power. Note, I've been exposed to this because my father worked as an engineer in the nuclear power industry, and I've had other interactions with it through my career in climate change & risk analysis. Basically, widespread overestimation of the medical harms from radiation has caused nuclear power to be held to a much higher standard than other sources, especially fossil fuels.

A better example would be recombinant DNA - see Katja Grace's very nice study of it. The key point is the importance of the scientists/engineers buying into the regulation. This is consistent with other work I'm familiar with on risk regulation etc., and with work I've published, e.g. this and this.

My impression is that policy on most subjects, especially those that are more technical than emotional is generally made by the government and civil servants in consultation with, and being lobbied by, outside experts and interests

More precisely, the distinction is between issues that matter to voters in elections (plus campaign donors etc.) and issues that fly more under the radar. For now at least, AI still flies under the radar, creating more opportunity for expert insiders (like us) to have significant impact, as do most other global catastrophic risks. The big exception is climate change. (I'm speaking in terms of US politics/policy. I don't know about other countries.)

Without expert (e.g. top ML researchers in academia and industry) consensus, no useful policy will be enacted. Pushing directly for policy seems if anything likely to hinder expert consensus. Attempts to directly influence the government to regulate AI research seem very adversarial

This depends on the policy. A lot of policy is not about restricting AI, but instead about coordination, harmonizing standards, ensuring quality applications, setting directions for the field, etc. That said, it is definitely important to factor the reactions of AI communities into policy outreach efforts. (As I have been pushing for in e.g. the work referenced above.)

With regard to published research, in general I think it is better for it to be open access, rather than behind journal paywalls, to maximise impact. Reducing this impact by a significant amount in order for the researcher to gain a small amount of prestige does not seem like an efficient way of compensating researchers to me.

It varies from case to case. For a lot of research, the primary audience is other researchers/experts in the field. They generally have access to paywall journals and place significant weight on journal quality/prestige. Also open access journals typically charge author publication fees, generally in the range of hundreds to thousands of dollars. That raises the question of whether it's a good use of funds. I'm not at all against open access (I like open access!); I only mean to note that there are other factors that may make it not always the best option.

it seems a bit of a waste to have to charge for books

Again it depends. Mass-market books typically get a lot more attention when they're from a major publisher. These books are more than just books - they are platforms for a lot of attention and discussion. If e.g. Bostrom had self-published Superintelligence, it probably wouldn't have gotten nearly the same attention. Also good publishers have editors who improve the books, and that costs money. I see a stronger case for self-publishing technical reports that have a narrower audience, especially if the author and/or their organization have the resources to do editing, page layout, promotion, etc.

More prosaically, organisations should make sure to upload the research they have published to their website

Yes, definitely! I for one frequent the websites of peer organizations, and often wish they were more up to date.

in general I do not give full credence to charities saying they need more funding because they want much more than a 18 months or so of runway in the bank

I might worry that this could bias the field away from more senior people who may have larger financial responsibilities (family, mortgage, etc.) and better alternative opportunities for income. There's no guarantee that future donations will be made, which creates a risk for the worker even if they're doing excellent work.

the conventional peer review system seems to be extremely bad at dealing with this issue

Peer review should filter out bad/unoriginal research, sort it by topic (journal X publishes on topic X etc.), and improve papers via revision requests. Good journals do this. Not all journals are good. Overall I for one find significantly better quality work in peer reviewed journals (especially good journals) than outside of peer review.

The Bay Area

I can't speak to concerns about the Bay Area, but I can say that GCRI has found a lot of value in connecting with people outside the usual geographic hubs, and that this is something ripe for further investment in (whether via GCRI or other entities). See e.g. this on GCRI's 2019 advising/collaboration program, which we're continuing in 2020.

Long-Term Future Fund: April 2019 grant recommendations

Thanks, that makes sense. This is one aspect in which audience is an important factor. Our two recent nuclear war model papers (on the probability and impacts) were written to be accessible to wider audiences, including audiences less familiar with risk analysis. This is of course a factor for all research groups that work on topics of interest to multiple audiences, not just GCRI.

Long-Term Future Fund: April 2019 grant recommendations

All good to know, thanks.

I'll briefly note that I am currently working on a more extended discussion of policy outreach suitable for posting online, possibly on this site, that is oriented toward improving the understanding of people in the EA-LTF-GCR community. It's not certain I'll have the chance to complete it given my other responsibilities, but hopefully I will.

Also if it would help I can provide suggestions of people at other organizations who can give perspectives on various aspects of GCRI's work. We could follow up privately about that.

Long-Term Future Fund: April 2019 grant recommendations

I actually had a sense that these broad overviews were significantly less valuable to me than some of the other GCRI papers that I've read and I predict that other people who have thought about global catastrophic risks for a while would feel the same.

That is interesting to hear. Some aspects of the overviews are of course going to be more familiar to domain experts. The integrated assessment paper in particular describes an agenda and is not intended to have much in the way of original conclusions.

The argument seemed to mostly consist of a few concrete examples, most of which seemed relatively tenuous to me. (Happy to go into more depth on that).

I would be quite interested in further thoughts you have on this. I’ve actually found that the central ideas of the far future argument paper have held up quite well, possibly even better than I had originally expected. Ditto for the primary follow-up to this paper, “Reconciliation between factions focused on near-term and long-term artificial intelligence”, which is a deeper dive on this theme in the context of AI. Some examples of work that is in this spirit:

· Open Philanthropy Project’s grant for the new Georgetown CSET group, which pursues “opportunities to inform current and future policies that could affect long-term outcomes” (link)

· The study The Malicious Use of Artificial Intelligence, which, despite being led by FHI and CSER, is focused on near-term and sub-existential risks from AI

· The paper Bridging near- and long-term concerns about AI by Stephen Cave and Seán S. ÓhÉigeartaigh of CSER/CFI

All of these are more recent than the GCRI papers, though I don’t actually know how influential GCRI’s work was in any of the above. The Cave and ÓhÉigeartaigh paper is the only one that cites our work, and I know that some other people have independently reached the same conclusion about synergies between near-term and long-term AI. Even if GCRI’s work was not causative in these cases, these data points show that the underlying ideas have wider currency, and that GCRI may have been (probably was?) ahead of the curve.

One kind of bad operationalization might be "research that would give the best people at FHI, MIRI and Open Phil a concrete sense of being able to make better decisions in the GCR space".

That’s fine, but note that those organizations have much larger budgets than GCRI. Of them, GCRI has closest ties to FHI. Indeed, two FHI researchers were co-authors on the long-term trajectories paper. Also, if GCRI were to be funded specifically for research to improve the decision-making of people at those organizations, then we would invest more in interacting with them, learning what they don't know / are getting wrong, and focusing our work accordingly. I would be open to considering such funding, but that is not what we have been funded for, so our existing body of work may be oriented in an at least somewhat different direction.

It may also be worth noting that the long-term trajectories paper functioned as more of a consensus paper, and so I had to be more restrained with respect to bolder and more controversial claims. To me, the paper’s primary contributions are in showing broad consensus for the topic, integrating the many co-authors’ perspectives into one narrative, breaking ground especially in the empirical analysis of long-term trajectories, and providing entry points for a wider range of researchers to contribute to the topic. Most of the existing literature is primarily theoretical/philosophical, but the empirical details are very important. (The paper also played a professional development role for me in that it gave me experience leading a massively-multi-authored paper.)

Given the consensus format of the paper, I was intrigued that the co-author group was able to support the (admittedly toned down) punch-line in the conclusion “contrary to some claims in the catastrophic risk literature, extinction risks may not be categorically more important than large subextinction risks”. A bolder/more controversial idea that I have a lot of affinity for is that the common emphasis on extinction risk is wrong, and that a wider—potentially much wider—set of risks merits comparable concern. Related to this is the idea that “existential risk” is either bad terminology or not the right thing to prioritize. I have not yet had the chance to develop these ideas exactly as I see them (largely due to lack of funding for it), but the long-term trajectories paper does cover a lot of the relevant ground.

(I have also not had the chance to do much to engage the wider range of researchers who could contribute to the topic, again due to lack of funding for it. These would mainly be researchers with expertise on important empirical details. That sort of follow-up is a thing that funding often goes toward, but we didn't even have dedicated funding for the original paper, so we've instead focused on other work.)

Overall, the response to the long-term trajectories paper has been quite positive. Some public examples:

· The 2018 AI Alignment Literature Review and Charity Comparison, which wrote: “The scope is very broad but the analysis is still quite detailed; it reminds me of Superintelligence a bit. I think this paper has a strong claim to becoming the default reference for the topic.”

· A BBC article on the long-term future, which calls the paper “intriguing and readable” and then describes it in detail. The BBC also invited me to contribute an article on the topic for them, which turned into this.

Long-Term Future Fund: April 2019 grant recommendations

I do view this publishing of the LTF-responses as part of an iterative process.

That makes sense. I might suggest making this clear to other applicants. It was not obvious to me.

Long-Term Future Fund: April 2019 grant recommendations

Oliver Habryka's comments raise some important issues, concerns, and ideas for future directions. I elaborate on these below. First, I would like to express my appreciation for his writing these comments and making them available for public discussion. Doing this on top of the reviews themselves strikes me as quite a lot of work, but also very valuable for advancing grant-making and activity on the long-term future.

My understanding of Oliver's comments is that while he found GCRI's research to be of a high intellectual quality, he did not have confidence that the research is having sufficient positive impact. There seem to be four issues at play: GCRI’s audience, the value of policy outreach on global catastrophic risk (GCR), the review of proposals on unfamiliar topics, and the extent to which GCRI’s research addresses fundamental issues in GCR.

(1) GCRI’s audience

I would certainly agree that it is important for research to have a positive impact on the issues at hand and not just be an intellectual exercise. To have an impact, it needs an audience.

Oliver's stated impression is that GCRI's audience is primarily policy makers, and not the EA long-term future (EA-LTF) community or global catastrophic risk (GCR) experts. I would agree that GCRI's audience includes policy makers, but I would disagree that our audience does not include the EA-LTF community or GCR experts. I would add that our audience also includes scholars who work on topics adjacent to GCR and can make important contributions to GCR, as well as people in other relevant sectors, e.g. private companies working on AI. We try to prioritize our outreach to these audiences based on what will have the most positive impact on reducing GCR given our (unfortunately rather limited) resources and our need to also make progress on the research we are funded for. We very much welcome suggestions on how we can do this better.

The GCRI paper that Oliver described ("the paper that lists and analyzes all the nuclear weapon close-calls") is A Model for the Probability of Nuclear War. This paper is indeed framed for policy audiences, which was in part due to the specifications of the sponsor of this work (the Global Challenges Foundation) and in part because the policy audience is the most important audience for work on nuclear weapons. It is easy to see how reading that paper could suggest that policy makers are GCRI's primary audience. Nonetheless, we did manage to embed some EA themes into the paper, such as the question of how much nuclear war should be prioritized relative to other issues. This is an example of us trying to stretch our limited resources in directions of relevance to wider audiences including EA.

Some other examples: Long-term trajectories of human civilization was largely written for audiences of EA-LTF, GCR experts, and scholars of adjacent topics. Global Catastrophes: The Most Extreme Risks was largely written for the professional risk analysis community. Reconciliation between factions focused on near-term and long-term artificial intelligence was largely written for… well, the title speaks for itself, and is a good example of GCRI engaging across multiple audiences.

The question of GCRI’s audience is a detail for which an iterative review process could have helped. Had GCRI known that our audience would be an important factor in the review, we could have spoken to this more clearly in our proposal. An iterative process would increase the workload, but perhaps in some cases it would be worth it.

(2) The value of policy outreach

Oliver writes, “I am broadly not super excited about reaching out to policy makers at this stage of the GCR community's strategic understanding, and am confused enough about policy capacity-building that I feel uncomfortable making strong recommendations based on my models there.”

This is consistent with comments I've heard expressed by other people in the EA-LTF-GCR community, and some colleagues report hearing things like this too. The general trend has been that people within this community who are not active in policy outreach are much less comfortable with it than those who are. This makes sense, but it also is a problem that holds us back from having a larger positive impact on policy. This includes GCRI’s funding and the work that the funding supports, but it is definitely bigger than GCRI.

This is not the space for a lengthy discussion of policy outreach. For now, it suffices to note that there is considerable policy expertise within the EA-LTF-GCR community, including at GCRI and several other organizations. There are some legitimately tricky policy outreach issues, such as in drawing attention to certain aspects of risky technologies. Those of us who are active in policy outreach are very attentive to these issues. A lot of the outreach is more straightforward, and a nontrivial portion is actually rather mundane. Improving awareness about policy outreach within the EA-LTF-GCR community should be an ongoing project.

It is also worth distinguishing between policy outreach and policy research. Much of GCRI's policy-oriented work is the latter. The research can and often does inform the outreach. Where there is uncertainty about what policy outreach to do, policy research is an appropriate investment. While I'm not quite sure what is meant by "this stage of the GCR community's strategic understanding", there's a good chance that this understanding could be improved by research by groups like GCRI, if we were funded to do so.

(3) Reviewing proposals on unfamiliar topics

We should in general expect better results when proposals are reviewed by people who are knowledgeable of the domains covered in the proposals. Insofar as Oliver is not knowledgeable about policy outreach or other aspects of GCRI's work, then arguably someone else should have reviewed GCRI’s proposal, or at least these aspects of GCRI’s proposal.

This makes me wonder if the Long-Term Future Fund may benefit from a more decentralized review process, possibly including some form of peer review. It seems like an enormous burden for the fund’s team to have to know all the nuances of all the projects and issue areas that they could be funding. I certainly would not want to do all that on my own. It is common for funding proposal evaluation to include peer review, especially in the sciences. Perhaps that could be a way for the fund’s team to lighten its load while bringing in a wider mix of perspectives and expertise. I know I would volunteer to review some proposals, and I'm confident at least some of my colleagues would too.

It may be worth noting that the sciences struggle to review interdisciplinary funding proposals. Studies report a perceived bias against interdisciplinary proposals: “peers tend to favor research belonging to their own field” (link), so work that cuts across fields is funded less. Some evidence supports this perception (link). GCRI’s work is highly interdisciplinary, and it is plausible that this creates a bias against us among funders. Ditto for other interdisciplinary projects. This is a problem because a lot of the most important work is cross-cutting and interdisciplinary.

(4) GCRI’s research on fundamental issues in GCR

As noted above, GCRI does work for a variety of audiences. Some of our work is not oriented toward fundamental issues in GCR. But here is some that is:

* Long-term trajectories of human civilization is on (among other things) the relative importance of extinction vs. sub-extinction risks.

* The far future argument for confronting catastrophic threats to humanity: Practical significance and alternatives is on strategy for how to reduce GCR in a world that is mostly not dedicated to reducing GCR.

* Towards an integrated assessment of global catastrophic risk outlines an agenda for identifying and evaluating the best ways of reducing the entirety of global catastrophic risk.

See also our pages on Cross-Risk Evaluation & Prioritization, Solutions & Strategy, and perhaps also Risk & Decision Analysis.

Oliver writes “I did not have a sense that they were trying to make conceptual progress on what I consider to be the current fundamental confusions around global catastrophic risk, which I think are more centered around a set of broad strategic questions and a set of technical problems.” He can speak for himself on what he sees the fundamental confusions as being, but I find it hard to conclude that GCRI’s work is not substantially oriented toward fundamental issues in GCR.

I will note that GCRI has always wanted to focus primarily on the big cross-cutting GCR issues, but we have never gotten significant funding for it. Instead, our funding has gone almost exclusively to more narrow work on specific risks. That is important work too, and we are grateful for the funding, but I think a case can be made for more support for cross-cutting work on the big issues. We still find ways to do some work on the big issues, but our funding reality prevents us from doing much.

Have we underestimated the risk of a NATO-Russia nuclear war? Can we do anything about it?

Thanks for this conversation. Here are a few comments.

Regarding the Ukraine crisis and the current NATO-Russia situation, I think Max Fisher at Vox is right to raise the issue as he has, with an excellent mix of insider perspectives. There should be more effort like this, in particular to understand Russia's viewpoint. For more on this topic I recommend recent work by Rajan Menon [], [], [] and Martin Hellman's blog []. I do think Fisher somewhat overstates the risk by understating the possibility of a "frozen conflict" - see Daniel Drezner's discussion of this []. That said, the Ukraine crisis clearly increases the probability of nuclear war, though I think it also increases the prospects and opportunities for resolving major international tensions by drawing them to attention []. Never let a good crisis go to waste.

Regarding the merits of the EA community working on nuclear war risk, I think it's worth pursuing. Yes, the existence of an established nuclear weapons community means there is more supply of work on this topic, but there is also more demand, especially more high-level demand. I see a favorable supply-demand balance, which is a core reason why GCRI has done a lot on this topic. (We also happen to have relevant background and connections.) Of note, the established community has less inclination towards quantitative risk analysis, and also often takes partisan nationalistic or ideological perspectives; people with EA backgrounds can make valuable contributions on both fronts. My big piece of advice for EAs seeking to get involved is to immerse yourself in the nuclear weapons community to understand its concepts, perspectives, etc., and to respect all that it has already accomplished, instead of showing up expecting to immediately teach them things they didn't know already. This is comparable to the situation with foreign aid projects that don't bother to see what local communities actually benefit from.

I am Seth Baum, AMA!

I see the logic here, but I would hesitate to treat it as universally applicable. Under some circumstances, more centralized structures can outperform. For example, if China or Wal-Mart decides to reduce greenhouse gas emissions, then you can get a lot more than if the US or the corner store decides to, because the latter are more decentralized. That's for avoiding catastrophes. For surviving them, sometimes you can get similar effects. However, local self-sufficiency can be really important. We argued this in

As for anti-trust, perhaps this could help, but this doesn't strike me as the right place to start. It seems like a difficult area to make progress on relative to the potential gains in terms of GCR reduction. But I could be wrong, as I've not looked into it in any detail.

I am Seth Baum, AMA!

OK, I'm wrapping up for the evening. Thank you all for these great questions and discussion. And thanks again to Ryan Carey for organizing.

I'll check back in tomorrow morning and try to answer any new questions that show up.
