This is a linkpost for

I'm posting this here because I think it's an interesting perspective on the nature of artificial intelligence that isn't very common in the EA and AI alignment communities. I care about x-risks, including possible x-risks and s-risks from "advanced AI," but I think that our arguments for prioritizing AI risk need to be more rigorous. One way in which we can strengthen these arguments is by grounding them in a better understanding of intelligence and AI themselves.


Since its beginning in the 1950s, the field of artificial intelligence has cycled several times between periods of optimistic predictions and massive investment ("AI spring") and periods of disappointment, loss of confidence, and reduced funding ("AI winter"). Even with today's seemingly fast pace of AI breakthroughs, the development of long-promised technologies such as self-driving cars, housekeeping robots, and conversational companions has turned out to be much harder than many people expected. One reason for these repeating cycles is our limited understanding of the nature and complexity of intelligence itself. In this paper I describe four fallacies in common assumptions made by AI researchers, which can lead to overconfident predictions about the field. I conclude by discussing the open questions spurred by these fallacies, including the age-old challenge of imbuing machines with humanlike common sense.

According to Mitchell, the four fallacies are:

  1. Narrow intelligence is on a continuum with general intelligence: Advances in narrow AI, such as GPT-3, aren't "first steps" toward AGI because they still lack common-sense knowledge.
  2. Easy things are easy and hard things are hard: Actually, the tasks that are easy for most humans are often hard to replicate in machines.
  3. "The lure of wishful mnemonics": Names used in the AI field, such as "Stanford Question Answering Dataset" for a question-answering benchmark, give the impression that AI programs that do well on a benchmark are performing the underlying task that the benchmark is designed to approximate, even though that task really requires general intelligence.
  4. Intelligence is all in the brain: Here, Mitchell questions the common assumption that "intelligence can in principle be 'disembodied'," or separated conceptually from the rest of the organism it occupies, because it is simply a form of information processing. Instead, evidence from neuroscience, psychology, and other disciplines suggests that human cognition is deeply integrated with the rest of the nervous system.
    1. The disembodiment assumption implies that, to achieve human-level AI, all we would need is the right algorithms and enough compute. According to Mitchell, the embodiment thesis does not support this implication. (Embodied cognition similarly purports to undermine the concept of mind uploading.)
    2. Mitchell also argues that it is unlikely that an AI system could be "'superintelligent' without any basic humanlike common sense, yet while seamlessly preserving the speed, precision and programmability of a computer," because human rationality is tied up with our emotions and cognitive biases. More generally, "human intelligence seems to be a strongly integrated system with closely interconnected attributes, includ[ing] emotions, desires, a strong sense of selfhood and autonomy, and a commonsense understanding of the world," and this may be true of AGI as well.

My opinion: I am compelled to put some weight on these points, which makes me think that creating AGI would be more technically difficult than I previously thought. However, I think Mitchell is wrongly assuming that it would have to resemble the human mind. Based on the embodied cognition hypothesis, I can conclude, at most, that it's probably harder for humans to create a mind without an underlying body, but I can't conclude that it's impossible. (Similarly, classic AGI risk arguments are probably also assuming too much about the nature of an AGI mind.) Also, even if intelligence is necessarily integrated with other cognitive functions like emotions (which I doubt), these don't necessarily decrease the safety risks that AGI systems would pose, and may even increase them.


5 comments

I was pleasantly surprised by this paper (given how much dross has been written on this topic). My thoughts on the four fallacies Mitchell identifies:

Fallacy 1: Narrow intelligence is on a continuum with general intelligence

This is hard to evaluate, since Mitchell only discusses it very briefly. I do think that people underestimate the gap between solving tasks with near-infinite data (like Starcraft) vs low-data tasks. But saying that GPT-3 isn't a step towards general intelligence also seems misguided, given the importance of few-shot learning.

Fallacy 2: Easy things are easy and hard things are hard

I agree that Moravec's paradox is important and underrated. But this also cuts the other way: if chess and Go were easy, then we should be open to the possibility that maths and physics are too.

Fallacy 3: The lure of wishful mnemonics

This is true and important. My favourite example is artificial planning. Tree search algorithms are radically different from human planning, which operates over abstractions. Yet this is hard to see because we use the same word for both.

Fallacy 4: Intelligence is all in the brain

This is the one I disagree with most, because "embodied cognition" is a very slippery concept. What does it mean? "The representation of conceptual knowledge is ... multimodal" - okay, but CLIP is multimodal.

"Thoughts are inextricably associated with perception, action, and emotion." Okay, but RL agents have perceptions and actions. And even if the body plays a crucial role in human emotions, it's a big leap to claim that disembodied agents therefore can't develop emotions.

Under this fallacy, Mitchell also discusses AI safety arguments by Bostrom and Russell. I agree that early characterisations of AIs as "purely rational" were misguided. Mitchell argues that AIs will likely also have emotions, cultural biases, a strong sense of selfhood and autonomy, and a commonsense understanding of the world. This seems plausible! But note that none of these directly solves the problem of misaligned goals. Sociopaths have all these traits, but we wouldn't want them to have superhuman intelligence.

This does raise the question: can early arguments for AI risk be reformulated to rely less on this "purely rational" characterisation? I think so - in fact, that's what I tried to do in this report.

I wish "relative skeptics" about deep learning capability timelines such as Melanie Mitchell and Gary Marcus would move beyond qualitative arguments and try to build models and make quantified predictions about how quickly they expect things to proceed, a la Cotra (2020) or Davidson (2021) or even Kurzweil. As things stand today, I can't even tell whether Mitchell or Marcus have more or less optimistic timelines than the people who have made quantified predictions, including e.g. authors from top ML conferences.

She does talk about century-plus timelines here and there.

Stuart Russell debated Melanie Mitchell in February 2021 in an episode of The Munk Debates, a debate series on major policy issues.

The question was “Be it resolved, the quest for true AI is one of the great existential risks of our time.” Stuart Russell argued for and Melanie Mitchell argued against.

You can listen to the debate here or on any podcast service.

I would love to read more about 'Fallacy 1: Narrow intelligence is on a continuum with general intelligence'. It sounds plausible to an outsider like myself, and not just in the 'minds have some sorcerous essence which silicon will never reach' sense.