MichaelA

I'm a Researcher and Writer for Convergence Analysis (https://www.convergenceanalysis.org/), an existential risk strategy research group.

Posts of mine that were written for/with Convergence will mention that fact. In other posts, and in most of my comments, opinions expressed are my own.

I'm always very interested in feedback, comments, ideas, etc., and potentially research/writing collaborations.

About half of my posts are on LessWrong: https://www.lesswrong.com/users/michaela

MichaelA's Comments

State Space of X-Risk Trajectories

Good points.

Also, this comment reminded me of somewhat similar arguments in this older post by Justin (and Ozzie Gooen).

My personal cruxes for working on AI safety

Thanks for writing this! As others have commented, I thought the focus on your actual cruxes and uncertainties, rather than just trying to lay out a clean or convincing argument, was really great. I'd be excited to see more talks/write-ups of a similar style from other people working on AI safety or other causes.

I think that long-term, it's not acceptable to have there be people who have the ability to kill everyone. It so happens that so far no one has been able to kill everyone. This seems good. I think long-term we're either going to have to fix the problem where some portion of humans want to kill everyone or fix the problem where humans are able to kill everyone.

This, and the section it's a part of, reminded me quite a bit of Nick Bostrom's Vulnerable World Hypothesis paper (and specifically his "easy nukes" thought experiment). From that paper's abstract:

Scientific and technological progress might change people’s capabilities or incentives in ways that would destabilize civilization. For example, advances in DIY biohacking tools might make it easy for anybody with basic training in biology to kill millions; novel military technologies could trigger arms races in which whoever strikes first has a decisive advantage; or some economically advantageous process may be invented that produces disastrous negative global externalities that are hard to regulate. This paper introduces the concept of a vulnerable world: roughly, one in which there is some level of technological development at which civilization almost certainly gets devastated by default, i.e. unless it has exited the ‘semi-anarchic default condition’. [...] A general ability to stabilize a vulnerable world would require greatly amplified capacities for preventive policing and global governance.

I'd recommend that paper for people who found that section of this post interesting.

Update on civilizational collapse research

Thanks for writing this. Just wanted to note two things, for future readers:

  • I thought the talk landfish linked to (and the rest of that video) was great, and would recommend it to others.
    • I also found it easier to take substantive insights from that video than from this post, as this post is intentionally just a quick summary of overall impressions (which is fine too)
  • I've begun a list of sources on civilizational collapse here. I hope to expand it over time, and would also be keen to have others comment additional sources (or lists of sources) there.

MichaelA's Shortform

Sources I've found that seem very relevant to the topic of civilizational collapse

Civilization Re-Emerging After a Catastrophe - Karim Jebari [EAGx Nordics]

Civilizational Collapse: Scenarios, Prevention, Responses - Dave Denkenberger, Jeffrey Ladish [talks + Q&A]

Update on civilizational collapse research - Ladish [EA Forum] (I found his talk more useful, personally)

Long-Term Trajectories of Human Civilization - Baum et al. [open access paper] (the authors never actually write "collapse", but their section 4 is very relevant to the topic, and the paper is great in general)

Defence in Depth Against Human Extinction: Prevention, Response, Resilience, and Why They All Matter - Cotton-Barratt, Daniel, Sandberg [open access paper] (collapse is only explicitly addressed briefly, but the paper as a whole still seems quite relevant and useful)

Civilization: Institutions, Knowledge and the Future - Samo Burja [Foresight talk]

Things I haven't properly read/watched/listened to yet but which might be relevant

The long-term significance of reducing global catastrophic risks - Beckstead [GiveWell/OPP]

Why and how civilisations collapse - Kemp [CSER]

https://en.wikipedia.org/wiki/Societal_collapse

https://en.wikipedia.org/wiki/Collapse:_How_Societies_Choose_to_Fail_or_Succeed

I intend to add to this list over time. If you know of other relevant work, please mention it in a comment.

What are information hazards?

Just so you know, we've now (finally!) published the post on how to deal with potential information hazards over on LessWrong.

We'll be putting most of our posts on the topic on that forum, as part of a "sequence".

MichaelA's Shortform

All prior work I've found that seemed substantially relevant to the unilateralist’s curse

Unilateralist's curse [EA Concepts]

Horsepox synthesis: A case of the unilateralist's curse? [Lewis] (usefully connects the curse to other factors)

The Unilateralist's Curse and the Case for a Principle of Conformity [Bostrom et al.’s original paper]

Hard-to-reverse decisions destroy option value [CEA]

Somewhat less directly relevant

Managing risk in the EA policy space [EA Forum] (touches briefly on the curse)

Ways people trying to do good accidentally make things worse, and how to avoid them [80k] (only one section on the curse)

I intend to add to this list over time. If you know of other relevant work, please mention it in a comment.

MichaelA's Shortform

All prior work I found that explicitly uses the terms differential progress / intellectual progress / technological development

Differential Intellectual Progress as a Positive-Sum Project [FRI]

Differential technological development: Some early thinking [GiveWell]

Differential progress [EA Concepts]

Differential technological development [Wikipedia]

On Progress and Prosperity [EA Forum]

Differential intellectual progress [LW Wiki]

Existential Risks: Analyzing Human Extinction Scenarios [open access paper] (section 9.4) (introduced the term differential technological development, I think)

Intelligence Explosion: Evidence and Import [MIRI] (section 4.2) (introduced the term differential intellectual development, I think)

Some things that are quite relevant but that don’t explicitly use the terms

Strategic Implications of Openness in AI Development [open access paper]

I intend to add to this list over time. If you know of other relevant work, please mention it in a comment.

MichaelA's Shortform

All prior work I found that seemed substantially relevant to information hazards

(See also my/Convergence’s posts on the topic.)

Information hazards [EA concepts]

Information Hazards in Biotechnology - Lewis et al. - 2019 - Risk Analysis [open access paper]

Bioinfohazards [EA Forum]

Information Hazards [Bostrom’s original paper; open access]

Terrorism, Tylenol, and dangerous information [LessWrong]

Lessons from the Cold War on Information Hazards: Why Internal Communication is Critical [LessWrong]

Horsepox synthesis: A case of the unilateralist's curse? [Lewis]

Information hazard [LW Wiki]

Informational hazards and the cost-effectiveness of open discussion of catastrophic risks [EA Forum]

A point of clarification on infohazard terminology [LessWrong]

Somewhat less directly relevant

The Offense-Defense Balance of Scientific Knowledge: Does Publishing AI Research Reduce Misuse? [open access paper] (commentary here)

The Vulnerable World Hypothesis [open access paper] (footnotes 39 and 41 in particular)

Managing risk in the EA policy space [EA Forum] (touches briefly on information hazards)

Strategic Implications of Openness in AI Development [open access paper] (sort-of relevant, though not explicitly about information hazards)

I intend to add to this list over time. If you know of other relevant work, please mention it in a comment.

The Web of Prevention

Also related is the recent (very interesting) paper using that same term (linkpost).

(Interestingly, I don't recall the paper mentioning getting the term from computer security, and, skimming it again now, I indeed can't see them mention that. In fact, they only seem to say "defence in depth" once in the paper.

I wonder if they got the term from computer security and forgot they'd done so, if they got it from computer security but thought it wasn't worth mentioning, or if the term has now become fairly common outside of computer security, but with the same basic meaning, rather than the somewhat different military meaning. Not really an important question, though.)
