Qualia Research Institute: History & 2021 Strategy

Hi Daniel,

Thanks for the reply! I am a bit surprised at this:

Getting more clarity on emotional valence does not seem particularly high-leverage to me. What's the argument that it is?

The quippy version is that, if we’re EAs trying to maximize utility, and we don’t have a good understanding of what utility is, more clarity on such concepts seems obviously insanely high-leverage.

I’ve written about specifics relevant to FAI here: https://opentheory.net/2015/09/fai_and_valence/

Relevance to building a better QALY here: https://opentheory.net/2015/06/effective-altruism-and-building-a-better-qaly/

And I discuss object-level considerations on how a better understanding of emotional valence could lead to novel therapies for well-being here: https://opentheory.net/2018/08/a-future-for-neuroscience/ and https://opentheory.net/2019/11/neural-annealing-toward-a-neural-theory-of-everything/

Your points about sufficiently advanced AIs obsoleting human philosophers are well-taken, though I would touch back on my concern that we won’t have particular clarity on philosophical path-dependencies in AI development without doing some of the initial work ourselves, and these questions could end up being incredibly significant for our long-term trajectory — I gave a talk about this for MCS that I’ll try to get transcribed (in the meantime I can share my slides if you’re interested). I’d also be curious to flip your criticism and ask about your positive model for directing EA donations — is the implication that there are no good places to donate to, or that narrow-sense AI safety is the only useful place for donations? What do you think the highest-leverage questions to work on are? And how big are your ‘metaphysical uncertainty error bars’? What sorts of work would shrink these bars?

Qualia Research Institute: History & 2021 Strategy

Hi Daniel,

Thanks for the remarks! Prioritization reasoning can get complicated, but to your first concern:

Is emotional valence a particularly confused and particularly high-leverage topic, and one that might plausibly be particularly conducive to getting clarity on? I think it would be hard to argue in the negative on the first two questions. Resolving the third question might be harder, but I’d point to our outputs and increasing momentum. That is, one can levy this skepticism at literally any cause, and I think we hold up excellently in a relative sense. We may have to jump to the object-level to say more.

To your second concern, I think a lot about AI and ‘order of operations’. Could we postulate that some future superintelligence might be better equipped to research consciousness than we mere mortals? Certainly. But might there be path-dependencies here such that the best futures happen if we gain more clarity on consciousness, emotional valence, the human nervous system, the nature of human preferences, and so on, before we reach certain critical thresholds in superintelligence development and capacity? Also — certainly.

Widening the lens a bit, qualia research is many things, and one of these things is an investment in the human-improvement ecosystem, which I think is a lot harder to invest effectively in (yet also arguably more default-safe) than the AI improvement ecosystem. Another ‘thing’ qualia research can be thought of as being is an investment in Schelling point exploration, and this is a particularly valuable thing for AI coordination.

Even if we grant that the majority of humanity's future trajectory will be determined by AGI trajectory — which seems plausible to me — I think it’s reasonable to argue that qualia research is one of the highest-leverage areas for positively influencing that trajectory and/or the overall AGI safety landscape.

New book — "Suffering-Focused Ethics: Defense and Implications"

Congratulations on the book! I think long works are surprisingly difficult and valuable (both to author and reader) and I'm really happy to see this.

My intuition on why there's little discussion of core values is a combination of "a certain value system [is] tacitly assumed" and "we avoid discussing it because ... discussing values is considered uncooperative." To wit, most people in this sphere are computationalists, and the people here who have thought the most about this realize that computationalism inherently denies the possibility of any 'satisfyingly objective' definition of core values (and suffering). Thus it's seen as a bit of a faux pas to dig at this -- the tacit assumption is, the more digging that is done, the less ground for cooperation there will be. (I believe this stance is unnecessarily cynical about the possibility of a formalism.)

I look forward to digging into the book. From a skim, I would just say I strongly agree about the badness of extreme suffering; when times are good we often forget just how bad things can be. A couple quick questions in the meantime:

  • If you could change people's minds on one thing, what would it be? I.e. what do you find the most frustrating/pernicious/widespread mistake on this topic?
  • One intuition pump I like to use is: 'if you were given 10 billion dollars and 10 years to move your field forward, how precisely would you allocate it, and what do you think you could achieve at the end?'

Reducing long-term risks from malevolent actors

A core 'hole' here is metrics for malevolence (and related traits) visible to present-day or near-future neuroimaging.

Briefly -- Qualia Research Institute's work around connectome-specific harmonic waves (CSHW) suggests a couple angles:

(1) proxying malevolence via the degree to which the consonance/harmony in your brain is correlated with the dissonance in nearby brains;
(2) proxying empathy (lack of psychopathy) by the degree to which your CSHWs show integration/coupling with the CSHWs around you.

Both of these analyses could be done today, given sufficient resource investment. We have all the algorithms and in-house expertise.
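For concreteness, here is a minimal Python sketch of what proxies (1) and (2) might look like, assuming each subject's CSHW decomposition yields per-timepoint harmonic-mode frequencies and amplitudes. The dissonance scorer below is a toy placeholder, not QRI's actual in-house algorithm, and all function names are hypothetical:

```python
import numpy as np

def toy_dissonance(freqs, amps):
    # Toy stand-in for a consonance/dissonance scorer: amplitude-weighted
    # deviation of each pair of mode frequencies from the nearest integer
    # ratio. QRI's actual algorithm would be more involved.
    total = 0.0
    for i in range(len(freqs)):
        for j in range(i + 1, len(freqs)):
            ratio = freqs[j] / freqs[i]
            total += amps[i] * amps[j] * abs(ratio - round(ratio))
    return total

def malevolence_proxy(my_dissonance_ts, other_dissonance_ts):
    # Metric (1): correlation between my consonance (negative dissonance)
    # and a nearby brain's dissonance, over time.
    return np.corrcoef(-np.asarray(my_dissonance_ts),
                       np.asarray(other_dissonance_ts))[0, 1]

def empathy_proxy(my_amps_ts, other_amps_ts):
    # Metric (2): mean per-mode correlation between two subjects' CSHW
    # amplitude time series, as a crude integration/coupling measure.
    a, b = np.asarray(my_amps_ts), np.asarray(other_amps_ts)  # (T, n_modes)
    return np.mean([np.corrcoef(a[:, k], b[:, k])[0, 1]
                    for k in range(a.shape[1])])
```

Again, everything here (the dissonance heuristic, the coupling measure) is illustrative; the real analyses would run QRI's consonance metrics over actual connectome-harmonic decompositions of neuroimaging data.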

Background about the paradigm: https://opentheory.net/2018/08/a-future-for-neuroscience/

Intro to Consciousness + QRI Reading List

Very important topic! I touch on McCabe's work in Against Functionalism (EA forum discussion); I hope this thread gets more airtime in EA, since it seems like a crucial consideration for long-term planning.

I'm Buck Shlegeris, I do research and outreach at MIRI, AMA

Hey Pablo! I think Andres has a few up on Metaculus; I just posted QRI's latest piece of neuroscience here, which has a bunch of predictions (though I haven't separated them out from the text):


I'm Buck Shlegeris, I do research and outreach at MIRI, AMA

We’ve looked for someone from the community to do a solid ‘adversarial review’ of our work, but we haven’t found anyone who feels qualified to do so and whom we trust to do a good job, aside from Scott, and he's not available at this time. If anyone comes to mind do let me know!

I'm Buck Shlegeris, I do research and outreach at MIRI, AMA

I think this is a great description. "What happens if we seek out symmetry gradients in brain networks, but STV isn't true?" is something we've considered, and determining ground-truth is definitely tricky. I refer to this scenario as the "Symmetry Theory of Homeostatic Regulation" - https://opentheory.net/2017/05/why-we-seek-out-pleasure-the-symmetry-theory-of-homeostatic-regulation/ (mostly worth looking at the title image, no need to read the post)

I'm (hopefully) about a week away from releasing an update to some of the things we discussed in Boston, basically a unification of Friston/Carhart-Harris's work on FEP/REBUS with Atasoy's work on CSHW -- will be glad to get your thoughts when it's posted.

I'm Buck Shlegeris, I do research and outreach at MIRI, AMA

I think we actually mostly agree: QRI doesn't 'need' you to believe qualia are real, that symmetry in some formalism of qualia corresponds to pleasure, or that there is any formalism about qualia to be found at all. If we find some cool predictions, you can strip out any mention of qualia from them, and use them within the functionalism frame. As you say, the existence of some cool predictions won't force you to update your metaphysics (your understanding of which things are ontologically 'first class objects').

But you won't be able to copy our generator by doing that -- the thing that created those novel predictions -- and I think that's significant, and gets into questions of elegance metrics and philosophy of science.

I actually think the electromagnetism analogy is a good one: skepticism is always defensible, and in 1600, 1700, 1800, 1862, and 2018, people could be skeptical of whether there's 'deep unifying structure' behind these things we call static, lightning, magnetism, shocks, and so on. But it was much more reasonable to be skeptical in 1600 than in 1862 (the year Maxwell's Equations were published), and more reasonable in 1862 than it was in 2018 (the era of the iPhone).

Whether there is 'deep structure' in qualia is of course an open question in 2019. I might suggest STV is equivalent to a very early draft of Maxwell's Equations: not a full systematization of qualia, but something that can be tested and built on in order to get there. And one that potentially ties together many disparate observations into a unified frame, and offers novel / falsifiable predictions (which seem incredibly worth trying to falsify!)

I'd definitely push back on the frame of dualism, although this might be a terminology nitpick: my preferred frame here is monism: https://opentheory.net/2019/06/taking-monism-seriously/ - and perhaps this somewhat addresses your objection that 'QRI posits the existence of too many things'.

I'm Buck Shlegeris, I do research and outreach at MIRI, AMA

Thanks Matthew! I agree issues of epistemology and metaphysics get very sticky very quickly when speaking of consciousness.

My basic approach is 'never argue metaphysics when you can argue physics' -- the core strategy we have for 'proving' we can mathematically model qualia is to make better and more elegant predictions using our frameworks, with predicting pain/pleasure from fMRI data as the pilot project.

One way to frame this is that at various points in time, it was completely reasonable to be a skeptic about modeling things like lightning, static, magnetic lodestones, and such, mathematically. This was true to an extent even after Faraday and Maxwell formalized things. But over time, with more and more unusual predictions and fantastic inventions built around electromagnetic theory, it became steadily less reasonable to be skeptical.

My metaphysical arguments are in my 'Against Functionalism' piece, and to date I don't believe any commenters have addressed my core claims:


But, I think metaphysical arguments change distressingly few people's minds. Experiments, and especially technology, change people's minds. So that's what our limited optimization energy is pointed at right now.
