Principia Qualia: blueprint for a new cause area, consciousness research with an eye toward ethics and x-risk

by MikeJohnson 9th Dec 2016


Hi all,

Effective altruism has given a lot of attention to ethics, and in particular suffering reduction. However, nobody seems to have a clear definition for what suffering actually is, or what moral value is. The implicit assumptions seem to be:

  1. We (as individuals and as a community) tend to have reasonably good intuitions as to what suffering and moral value are, so there's little urgency to put things on a more formal basis; and/or
  2. Formally defining suffering & moral value is much too intractable to make progress on, so it would be wasted effort to try (this seems to be the implicit position of e.g., Foundational Research Institute, and other similar orgs).

I think both of these assumptions are wrong, and that formal research into consciousness and valence is tractable, time-sensitive, and critically important.


Consciousness research is critically important and time-sensitive

If one wants to do any sort of ethical calculus for a utilitarian intervention, or realistically estimate the magnitude of wild animal suffering, it's vital to have a good theory of both consciousness (what is conscious?) and valence (which conscious states feel good, which ones feel bad?). It seems obvious but is worth emphasizing: if your goal is to reduce suffering, it's important to know what suffering is.

But I would go further, and say that the fact that we don't have a good theory of consciousness & valence constitutes an existential risk.

First, here's Max Tegmark making the case that precision in moral terminology is critically important when teaching AIs what to value, and that we currently lack this precision: 

>Relentless progress in artificial intelligence (AI) is increasingly raising concerns that machines will replace humans on the job market, and perhaps altogether. Eliezer Yudkowsky and others have explored the possibility that a promising future for humankind could be guaranteed by a superintelligent "Friendly AI", designed to safeguard humanity and its values. I argue that, from a physics perspective where everything is simply an arrangement of elementary particles, this might be even harder than it appears. Indeed, it may require thinking rigorously about the meaning of life: What is "meaning" in a particle arrangement? What is "life"? What is the ultimate ethical imperative, i.e., how should we strive to rearrange the particles of our Universe and shape its future? If we fail to answer the last question rigorously, this future is unlikely to contain humans.

I discuss the potential for consciousness & valence research to help AI safety here. But the x-risk argument for consciousness research goes much further. Namely: if consciousness is a precondition for value, and we can't define what consciousness is, then we may inadvertently trade it away for competitive advantage. Nick Bostrom and Scott Alexander have both noted this possibility:

>We could thus imagine, as an extreme case, a technologically highly advanced society, containing many complex structures, some of them far more intricate and intelligent than anything that exists on the planet today – a society which nevertheless lacks any type of being that is conscious or whose welfare has moral significance. In a sense, this would be an uninhabited society. It would be a society of economic miracles and technological awesomeness, with nobody there to benefit. A Disneyland with no children. (Superintelligence)

>Moloch is exactly what the history books say he is. He is the god of Carthage. He is the god of child sacrifice, the fiery furnace into which you can toss your babies in exchange for victory in war.

>He always and everywhere offers the same deal: throw what you love most into the flames, and I will grant you power.

>The last value we have to sacrifice is being anything at all, having the lights on inside. With sufficient technology we will be “able” to give up even the final spark. (Meditations on Moloch)

Finally, here's Andres Gomez Emilsson on the danger of a highly competitive landscape which is indifferent toward consciousness & valence:

>I will define a pure replicator, in the context of agents and minds, to be an intelligence that is indifferent towards the valence of its conscious states and those of others. A pure replicator invests all of its energy and resources into surviving and reproducing, even at the cost of continuous suffering to themselves or others. Its main evolutionary advantage is that it does not need to spend any resources making the world a better place.

A decade ago, these concerns would have seemed very sci-fi. Today, they seem interesting and a little worrying. In ten years, they'll seem incredibly pressing and we'll wish we had started on them sooner.


Consciousness research and valence research are tractable

I'm writing this post not because I hope these topics ultimately turn out to be tractable. I'm writing it because I know they are, because I've spent several years of focused research on them and have progress to show for it.

The result of this research is Principia Qualia. Essentially, it's five things:

  1. A literature review on what affective neuroscience knows about pain & pleasure, why it's so difficult to research, and why a more principled approach is needed;
  2. A literature review on quantitative theories of consciousness, centered on IIT and its flaws;
  3. A framework for clarifying & generalizing IIT in order to fix these flaws;
  4. A crisp, concise, and falsifiable hypothesis about what valence is, in the context of a mathematical theory of consciousness;
  5. A blueprint for turning qualia research into a formal scientific discipline.

The most immediately significant takeaway is probably (4), a definition of valence (pain/pleasure) in terms of identity, not just correlation. The paper is long, but I don't think anything shorter could do all these topics justice. I encourage systems-thinkers who are genuinely interested in consciousness, morality, and x-risk to read it and comment.


Relatedly, I see this as the start of consciousness & valence research as an EA cause area, an area which is currently not being served within the community or by academia. As such, I strongly encourage organizations dealing with cause prioritization and suffering reduction to consider my case that this area is as important, time-sensitive, and tractable as I'm arguing it to be. A handful of researchers are working on the problem (mostly Andres Gomez Emilsson and me), and I've spoken with brilliant, talented folks who really want to work on the research program I've outlined, but we're significantly resource-constrained. (I'm happy to make a more detailed case later for prioritizing this cause area, covering which organizations are doing related work and why this specific area hasn't been addressed; I think saying more now would be premature.)




Strong claims invite critical examination. This post is intended as sort of an "open house" for EAs to examine the research, ask for clarification, discuss alternatives, et cetera. 

One point I'd like to stress is that this research is developed enough to make specific, object-level, novel, falsifiable predictions (Sections XI and XII). I've made the framework broad enough to be compatible with many different theories of consciousness, but in order to say anything meaningful about consciousness, we have to rule out certain possibilities. We can discuss metaphysics, but in my experience it's more effective to discuss things on the object level. So for objections such as, "consciousness can't be X sort of thing, because it's Y sort of thing," consider framing it as an object-level objection, i.e., a divergent prediction. A final point: the link above goes to an executive summary. The primary document, which can be found here, goes into much more detail.

All comments welcome.

Mike, Qualia Research Institute 


Edit, 12-20-16 & 1-9-17: In addition to the above remarks, qualia research also seems important for smoothing certain coordination problems between various EA and x-risk organizations. My comment to Jessica Taylor:

>I would expect the significance of this question [about qualia] to go up over time, both in terms of direct work MIRI expects to do, and in terms of MIRI's ability to strategically collaborate with other organizations. I.e., when things shift from "let's build alignable AGI" to "let's align the AGI", it would be very good to have some of this metaphysical fog cleared away so that people could get on the same ethical page, and see that they are in fact on the same page.

Right now, it's reasonable for EA organizations to think they're on the same page and working toward the same purpose. But as AGI approaches and the stakes get higher & our possible futures become more divergent, I fear apparently small differences may grow very large. Research into qualia alone won't solve this, but it would help a lot. This question seems to parallel a debate between Paul Christiano and Wei Dai, about whether philosophical confusion magnifies x-risk, and if so, how much.