EA’s brain-over-body bias, and the embodied value problem in AI alignment

Geoffrey Miller

Great post, Dr. Miller! I'll be curious what you think of my thoughts here!

And we view many blue-collar jobs as historically transient, soon to be automated by AI and robotics – freeing human bodies from the drudgery of actually working as bodies. (In the future, whoever used to work with their body will presumably just hang out, supported by Universal Basic Income, enjoying virtual-reality leisure time in avatar bodies, or indulging in a few physical arts and crafts, using their soft, uncalloused fingers)

This makes me recall Brave New World by Aldous Huxley, specifically the use of "soma", a drug that everyone takes to feel pleasure so that they never have to feel pain. Something seems off about imagining people living their lives in virtual reality in an everlasting "pleasurable" state. I think I find this unsettling because it does not fit my conception of a good life (the point of Brave New World). There is something powerful and rewarding in facing reality and going through a bit of pain. Although I have not explored the AI alignment problem in depth, there seems to be some neglect of the value of pain (I’m using "pain" in a relatively broad, nonserious manner here), for instance the pain of working an 8-hour shift moving boxes in a warehouse. This is an interesting idea to consider: Paul Bloom, in The Sweet Spot, reviews how there is frequently pleasure in pain. This might not be super problematic if there is pain that is inherently pleasurable, but the situation gets sticky when pain is only interpreted as worthwhile after the fact. Yet another layer of complexity.

EA consequentialism tends to assume that ethically relevant values (e.g. for AI alignment) are coterminous with sentience. This sentientism gets tricky enough when we consider whether non-cortical parts of our nervous system should be considered sentient, or treated as if they embody ethically relevant values. It gets even trickier when we ask whether body systems outside the nervous system, which may not be sentient in most traditional views, carry values worth considering.

I do think that the AI alignment problem becomes infinitely complex if we consider panpsychism. Or if we consider dualism for that matter. What if everything has subjective experience? What if subjective experience isn't dependent on the body? Or as you suggest "Are non-sentient, corporeal values possible?"

Although I (mostly) agree with many of the assumptions EAs make (as I see them: reductive materialism and the existence of objective truth), I agree that a wide array of beliefs and values is neglected.

Another problem that I see is that if we are considering reductive materialism to be correct, consciousness could be obtained by AI. If consciousness can be obtained by AI, we have created a system with values of its own. What do we owe to these new beings we have created? It is possible that the future is populated primarily by AI because their subjective experience is qualitatively better than humans' subjective experience.

The thermostat does not need to be fully sentient (capable of experiencing pleasure or pain) to have goals.

I partially disagree with this. As Searle points out, these machines do not have intentionality and lack a true understanding of what "they" are doing. I think I can grant that the thermostat has goals, but the thermostat does not truly understand those goals, because it lacks intentionality.

I don't think this point is central to your argument, though. I think we can get to the inherent value of the body because of its connection with the brain, and therefore the mind (subjective experience). Thus, the values of the body are only valuable because the body mediates the subjective experience of the person.

If this argument is correct, it means there may not be any top-down, generic, all-purpose way to achieve AI alignment until we have a much better understanding of the human body’s complex adaptations. If Artificial General Intelligence is likely to be developed within a few decades, but if it will take more than a few decades to have a very fine-grained understanding of body values, and if body values are crucial to align with, then we will not achieve AGI alignment. We would need, at minimum, a period of Long Reflection focused on developing better evolutionary medicine models of body values, before proceeding with AGI development.

I think this is largely pointing out that EAs have failed to consider that some people might not be comfortable with the idea of uploading their brain (mind) into a virtual reality device. I personally find this idea absurd; I don't have any urges to be immortal. Ah, I run into so many problems trying to think this through. What do we do if people consider their own demise valuable?

If an AI system is aligned with the human brain, but it ignores the microbiome hosted within the human body, then it won’t be aligned with human interests (or the microbiome’s interests).

Perhaps I do disagree with your conclusion here...

Following your above logic, I find it likely that AGI would be developed before we have a full understanding of the human body. I don't agree that it is necessary to have a full understanding of the human body for AGI to be generally aligned with the values of the body. It might be possible for AGI to be aligned with the human body in a more abstract way. "The body is a vital part of subjective experience, don't destroy it." Then, theoretically, AGI would be able to learn everything possible about the body to truly align itself with that interest. (Maybe this idea is impractical from the side of creating the AGI?)

So, which should our AI systems align with – our brains’ revealed preferences for donuts, or our bodies’ revealed preferences for leafy greens?

Could an AGI transcend this choice? Leafy greens that taste like donuts? Or donuts that have the nutritional value of leafy greens?

Regardless of this, I do get your point of conflicting values between the body and brain. I was mostly considering the values of the body and brain as highly conducive to each other. Not sure what to do about the frequent incongruencies.

She may have no idea how to verbally express her body’s biomechanical capabilities and vulnerabilities to the robot sparring partner. But it better get aligned with her body somehow – just as her human BJJ sparring partners do. And it better not take her stated preferences for maximum-intensity training too seriously.

"But it better get aligned with her body somehow" is a key point for me here. If the AGI has the general notion to not hurt human bodies, it might be possible that the AGI would just use caution in this situation. Or even refuse to play because it understands the risk. This is to say, there might be ways for AGI to be aligned with the body values without it having a complete understanding of the body. Although, a complete understanding would be best.

On one hand, I agree with the sentiment that we need to consider the body more! On the other hand, I'm not positive that we need to completely understand the body to align AGI with the body. Although it seems to be a logical possibility that understanding the body isn't necessary for AGI alignment, I'm not sure if it is a practical point.

Please provide some pushback! I don't feel strongly about any of my arguments here, and I know there is a lot of background that I'm missing out on.

Paul J. Watson

I am deeply impressed by the amount of ground this essay covers so thoughtfully. I have a few remarks. They pertain to Miller's focal topic as well as avoiding massive popular backlash against general AI and limited, expert system AI, backlash that will make current resistance against science and human "expertise" in general look pretty innocuous. I close with a remark on alignment of AI with animal interests.

I offer everyone an a priori apology if this comment seems pathologically wordy.

I think that AI alignment with the interests of the body is quite essential to achieving alignment with human minds; it is probably necessary but not sufficient. Cross-culturally, regardless of superficial and (anyway) dynamic differences in values amongst cultures, humans generally have a hard time being happy and content if they are worried about their bodily well-being. Sicknesses of all kinds lead to invasive thoughts and emotions amounting to existential dread for most people, even the likes of Warren Zevon.

The point is that I, and probably most people across cultures, would be delighted to have a human doctor or nurse walk into an exam or hospital room with a pleasant-looking robot that we (the patients) truly perceived, correctly so, to possess general diagnostic super-intelligence based on deep knowledge of the healthy functioning of every organ and physiological system in the human body. Personally, I've never had the experience that any doctor of mine, including renowned specialists, had much of a clue about any aspect of my biology. I'd also feel better right now if I knew there was an expert system that was going to be in charge of my palliative care, which I'll probably need sooner rather than later, a system that would customize my care to minimize my physical pain and allow me to die consciously, without irresistible distraction from physical suffering. Get to work on that, please.

Such a diagnostic AI system, like a deeply respected human shamanic healer treating a devout in-group religious follower, would even be capable of generating a supernormal placebo effect (current Western medicine and associated health-insurance systems most often produce strong nocebo effects, ugh), which it seems clear would be based on nonconscious mental processes in the patient. (I think one of the important albeit secondary adaptive functions of religions is to produce supernormal placebo effects; I have a hypothesis about why placebo effects exist and why religious healers in spiritual alignment with their patients are especially good at evoking them, a topic for a future essay.) The existence of placebo effects, and their opposite, is good evidence that AI alignment with the body is somewhat equivalent to alignment with the mind.

"Truly perceived" is important. That is one reason I recommend that a relaxed and competent human health professional accompany the visiting AI system. Even though the AI itself may speak to the patient, it is important to have this super-expert system, perhaps limited in its ability to engage emotionally with the patient (like the Hugh Laurie character, "House"), be gazed upon with admiration and a bit of awe by the human partner during any interaction with the patient. The human then at least competently pretends to understand the diagnosis and promises the patient to promptly implement the recommended treatment. They can also help answer questions the patient may have throughout the encounter. An appropriate religious professional could also be added to the team, as needed, as long as they too show deep respect for the AI system.

I think a big part of my point is that when an AI consequentially aligns with our bodies, it thereby engenders a powerful "pre-reflective" intimacy with the person. This will help preempt reflective objections to the existence and activities of any AI system. And this will work cross-culturally, with practically everyone, to ameliorate the alignment problem, at least as humans perceive it. It will promote AI adoption.

Stepping back a moment, as humans evolved the cognitive capacities to cooperate in large groups while preserving significant degrees of individual sovereignty (e.g., unlike social insects) and then promptly began to co-evolve capacities to engage in the quintessentially human cross-cultural way of life I'll call "complex contractual reciprocity" (CCR), a term better unpacked elsewhere, we also had to co-evolve a strong hunger for externally-sourced, maximally authoritative moral systems, preferably ones perceived as "sacred." (Enter a long history of natural selection for multiple cognitive traits favoring religiosity.) If it's not from a sacred external source, but amounts to some person's or subculture's opinion, then argument, instability, and the risk of chaos are going to be on everyone's minds. Durable, high degrees of moral alignment within groups (whose boundaries can, under competent leadership, adaptively expand and contract) facilitate maximally productive CCR, and that is almost synonymous with high, on average, individual lifetime inclusive fitness within groups.

AI expert systems, especially when accompanied by caring, compassionate human partners, can be made to look like highly authoritative, externally-sourced fountains of sacred knowledge related to fundamental aspects of our well-being. Operationally here, sacred means minimally questionable. As humans we instinctively need the culturally-supplied contractual boilerplate, our group's moral system (all about alignment), and other forms of knowledge intimately linked to our well-being to be minimally questionable. If a person feels that an AI system is BOTH morally aligned with them and their in-group, and can take care of their health practically like a god, then from the human standpoint, alignment doubts will be greatly ameliorated.

Finally, a side note, which I'll keep brief. Having studied animals in nature and in the lab for decades, I'm convinced that they suffer. This includes invertebrates. However, I don't think that even dogs reflect on their suffering. (Using meditative techniques, humans can get access to what it means to have a pre-reflective yet very real experience.) Anyway, for AI to ever become aligned with animals, I think it's going to require that the AI aligns with their whole bodies, not just their nervous systems or particular ganglia therein. Again, this is because with animals the AI is facing the challenge of ameliorating pre-reflective suffering. (I'd say most human suffering, because of the functional design of human consciousness, is on the reflective level.) So, by designing AI systems that can achieve alignment with humans in mind and body, I think we may simultaneously generate AI that is much more capable of tethering to the welfare of diverse animals.

Best wishes to all, PJW

john zac

Amazing. Thank you