Introducing spirit hazards

brb243

This post contextualizes info hazards within their normative and power/attention dynamics environment: the greater the willingness and power to harm (spirit hazard), the greater the risk of sharing a piece of information. I suggest that notifications about the existence of hazardous info be shared in a way that highlights responsibility, and only with relevant stakeholders. Attention-constrained decisionmakers’ interest can be gained by risk topics that cannot be used to harm or by sincere support. At the end of this piece, I ask a few questions about spirit hazards in and beyond EA.

I am thankful to Rían O Mahoney, Stanislav Fořt, and Owen Cotton-Barratt who inspired this post. All errors are mine.

Epistemic status: to inspire rather than to provide accurate definitions.

Spirit hazard is the risk arising from the normalization of a harmful attitude toward a piece of information within a group. It is the product of the group’s power to harm with the information (info hazard?) and the probability that the group does so. Here, the group’s power to harm is the product of the information’s maximum harm per unit resource and the amount of these resources available to the group’s decisionmakers.

$$\text{power}_{\text{harm}} = \frac{\text{harm}_{\text{max}}}{\text{resource}} \times \text{resources}$$

For example, if a group can cause 1000 DALYs with information about making a landmine, and is 10% likely to use it, the spirit hazard is 100 DALYs (1000 × 10%). In this calculation, assume that the information about a landmine can maximally cause 10,000 DALYs with $100,000 and the group decisionmakers can use $10,000.
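The arithmetic in this example can be reproduced with a minimal sketch. The function and parameter names are illustrative assumptions, not terms used elsewhere in this post; the figures are those of the landmine example.

```python
def power_to_harm(max_harm: float, resources_for_max_harm: float,
                  available_resources: float) -> float:
    """Maximum harm per unit resource times the resources decisionmakers control."""
    harm_per_unit_resource = max_harm / resources_for_max_harm
    return harm_per_unit_resource * available_resources


def spirit_hazard(power: float, probability_of_harm: float) -> float:
    """Power to harm times the probability that the group uses it."""
    return power * probability_of_harm


# Landmine example: 10,000 DALYs per $100,000, with $10,000 available and a 10% probability.
power = power_to_harm(max_harm=10_000, resources_for_max_harm=100_000,
                      available_resources=10_000)
hazard = spirit_hazard(power, probability_of_harm=0.10)
print(f"power to harm: {power:.0f} DALYs, spirit hazard: {hazard:.0f} DALYs")
```

Keeping the two factors in separate functions mirrors the definition: one term depends on resources, the other on willingness.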

Spirit hazard is mitigated by decreasing a group’s power or probability to harm with certain information. The power to harm can be lowered by decreased availability of resources to decisionmakers and the probability to harm by introducing safety understanding among them.

For instance, if only $1,000 is available to the decisionmakers in the group in the above example, then the group can cause only 100 DALYs, and the spirit hazard at a 10% probability falls to 10 DALYs. If, in addition, safety understanding among decisionmakers rises to lower the probability to harm to 1%, then the spirit hazard is only 1 DALY.
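The two mitigation levers can be compared side by side in a short sketch. The scenario labels and names below are hypothetical; the harm rate of 0.1 DALY per dollar follows from the landmine example (10,000 DALYs with $100,000).

```python
# DALYs per dollar, taken from the landmine example (an assumption carried over from it).
HARM_PER_DOLLAR = 10_000 / 100_000


def spirit_hazard(available_dollars: float, probability_of_harm: float) -> float:
    """Harm per dollar times dollars available times probability of use."""
    return HARM_PER_DOLLAR * available_dollars * probability_of_harm


# (dollars available to decisionmakers, probability of harmful use)
scenarios = {
    "baseline": (10_000, 0.10),
    "fewer resources": (1_000, 0.10),
    "fewer resources and more safety understanding": (1_000, 0.01),
}

hazards = {name: spirit_hazard(dollars, prob) for name, (dollars, prob) in scenarios.items()}
for name, value in hazards.items():
    print(f"{name}: {value:.0f} DALYs")
```

Under these figures the three scenarios work out to 100, 10, and 1 DALYs: each tenfold cut in either resources or probability cuts the spirit hazard tenfold.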

Within EA, biosecurity and AI safety could be subject to the greatest spirit hazard, because of the information’s high maximum harm per unit resource. Within these two cause areas, extensive checks should be done so that EA does not raise either of the remaining subcomponents of spirit hazard: decisionmakers’ resources or willingness to harm.

EA can empower decisionmakers by providing them with funding, skills, networks, and other resources, and it can motivate them to cause harm by advancing narratives that encourage it. Since supported decisionmakers can also prevent or reduce harm, their motivation determines the sign of the effects of this support.

I suggest that risky topics are shared only with relevant decisionmakers and in a way that normalizes responsibility and rational decisionmaking. For example, information on biosecurity treaty provisions or budgetary recommendations should be shared with the few community members who inform these decisions. Or, for instance, AI safety should continue to be talked about in a way that focuses on the universal moral values alignment conundrum rather than the great harm it can cause.

If someone seeks to captivate people’s attention by sharing powerfully benevolent narratives, they can focus on topics that cannot be used to harm (such as protection from low-probability natural phenomena) or offer effective support for the target audience’s positive impact objectives. Memes about protection from ‘external risks’ can spread about as quickly as memes about risks that humans can increase. Skilled assistance, especially pro bono, can build reputation within networks.

Instead of a summary, I would like to ask:

Should spirit hazards be considered in EA? If so, how?
What are the optimal ways of mitigating spirit hazards in EA while retaining its fundamental focus?
Should EA seek to reduce spirit hazards in areas that it does not focus on?
How did hazardous info become popular in EA? How can this be different outside of EA?
What are some scenarios in which the existence of hazardous topics is introduced to a community member and negative outcomes of various extent occur? How can these scenarios be prevented?

Comments (2)

Karthik Tadepalli, 3y

In essence, is this an alteration of information hazard that accounts for a group's propensity to commit harm? That seems like a useful concept. It seems small, but I think it's important to differentiate between the groups that information is being spread to (e.g. spreading information on this forum probably has lower risk than spreading the same information on 4chan).

Regarding your question about whether EA should consider spirit hazards, I haven't read deeply enough into the concept of information hazards to know if this is well-trodden ground (I just read the top forum posts from your link), but it seems like spirit hazards are only one half of the cost-benefit equation in sharing information. You suggest that risky information should be shared only with decisionmakers, but I can think of two scenarios in which that would not be ideal.

1. Accountability - when information is spread more broadly, decisionmakers can be held accountable for their actions (e.g. an AI company could be called out for unsafe practices if it's broadly known what practices are unsafe and why).
2. Wisdom of crowds - spreading information about a risk allows many people to independently try to develop solutions to that risk, which can be more successful than a lone decisionmaker trying to develop solutions.

This whole balance is analogous to cybersecurity, where security researchers actively spread information on vulnerabilities, because that allows other security researchers to fix them + learn from them, to stay ahead of hackers. Of course, not every scenario calls for the same information spread - explosives experts would definitely not want to spread bomb recipes on internet forums in the hope of helping out other explosives experts.

Perhaps the difference between cybersecurity and explosives that makes information spread more beneficial in the former case is that hacking is much more accessible to malicious actors than building a bomb, so dangerous information will inevitably be discovered by someone. So spirit hazard depends on the likelihood of information spreading to bad actors anyway. In domains which have a higher cost of involvement (and thus are more centralized) this is less likely to be the case, so maybe spirit hazard is still high in cases that we generally care about.

brb243, 3y

Ok, that can be a better interpretation: adding the audience's capacity to commit harm into info hazards considerations.

It makes sense that information about the existence of potentially harmful info can also be shared with people who can hold decisionmakers accountable for using their knowledge positively.

Whether this will succeed can depend on the attitude of the public toward the topic, which can depend on the 'spirit' of those who share the info. Using your examples, if the info comes from a source such as the EA Forum, where the norm is to focus on impact and prevent harm, then even members of the public who would normatively influence decisionmakers can have similarly safe preferences regarding the topic.

However, one can also imagine that the public will seek to present the info as something that can be used for selfish gain or harm (since people may want to 'side' with a harmful entity out of fear, seek standing or attention on social media by posting about a threat, or aim to gain privilege for their group by harming others). Since the general public is not trained in thinking twice about the possible impacts of their actions, and since risk memes can spread faster than safety ones, publicly sharing the existence of risky topics, even in good faith, can normalize and expedite harmful advancement of these subjects.

Crowd wisdom can apply when solutions are not already developed and merely awaiting implementation by decisionmakers, and when the public has the skills to come up with these solutions. For example, if only a treaty needs to be signed and a budget spent on lab safety, then a few individuals can complete it. Or, people untrained in universal values research can have a limited ability to contribute to it.

Cybersecurity is an example of a field that requires cooperation of many experts who are not more likely to engage in a risky use of the info. Bomb recipes info, on the other hand, does not extensively help safety experts (who may specialize in legislation and regulations to prevent harm due to explosives) and could motivate otherwise uninterested actors to research this topic further. In this, cybersecurity can be analogous to AI safety and explosives info to biosecurity.

Spirit hazard can also make (empower or inspire) bad actors. The lower the cost of involvement (e.g. in consequences, money, and other resources), the riskier it can be to share the info; a low cost does not necessarily mean that potential bad actors already have it. So, risky info with a low cost of negative involvement should not be shared.

Risky info should be shared if i) the cost of involvement is high, ii) it is highly unlikely that the group would use it to increase the riskiness of norms, iii) it is likely that not sharing this security info with the group would make decisionmakers advance risk, and iv) the topic is not subject to the unilateralist's curse (e.g. if one person tried to make an explosive, many others would prevent them from doing so).
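As a rough illustration, the four conditions can be written as a checklist. This sketch assumes they must all hold together (the comment does not state whether they are joint or alternative), and the class and field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SharingContext:
    """One yes/no judgment per condition i) to iv); the field names are illustrative."""
    high_cost_of_involvement: bool        # i)
    group_unlikely_to_worsen_norms: bool  # ii)
    withholding_would_advance_risk: bool  # iii)
    free_of_unilateralists_curse: bool    # iv)


def should_share(ctx: SharingContext) -> bool:
    """Assumption: share only when every condition holds."""
    return all((
        ctx.high_cost_of_involvement,
        ctx.group_unlikely_to_worsen_norms,
        ctx.withholding_would_advance_risk,
        ctx.free_of_unilateralists_curse,
    ))


# A low cost of involvement alone is enough to block sharing under this reading.
print(should_share(SharingContext(False, True, True, True)))
```

In practice each judgment is a matter of degree, so the booleans stand in for assessments that would need to be made case by case.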