The Social Dilemma is a new Netflix documentary which discusses the effects of social media algorithms on mental health, human behaviour, and social interactions. In the film, Tristan Harris from the Center for Humane Technology argues that these algorithms present an existential risk. Perhaps not one that "threatens the premature extinction of Earth-originating intelligent life" but one that threatens "the permanent and drastic destruction of its potential for desirable future development" (Bostrom, 2012).

Ivan Vendrov and Jeremy Nixon have written about Aligning Recommender Systems as Cause Area and Lê Nguyên Hoang has written about short-term AI alignment. However, I expect many EAs will find it valuable to discuss the arguments made in The Social Dilemma and I want to create a space for that to happen.


UPDATE (December 3, 2020)

Tristan Harris has been interviewed by Rob Wiblin on the 80,000 Hours podcast.

You can listen to the episode, read the transcript, and find other recommended reading on this topic here.

I haven't watched the documentary, but I'm antecedently skeptical of claims that social media constitute an existential risk in the sense in which EAs use that term. The brief summary provided by the Wikipedia article doesn't seem to support that characterization:

the film explores the rise of social media and the damage it has caused to society, focusing on its exploitation of its users for financial gain through surveillance capitalism and data mining, how its design is meant to nurture an addiction, its use in politics, its impact on mental health (including the mental health of adolescents and rising teen suicide rates), and its role in spreading conspiracy theories and aiding groups such as flat-earthers and white supremacists.

While many of these effects are terrible (and concern about them partly explains why I myself basically don't use social media), they do not appear to amount to threats of existential catastrophe. Maybe the claim is that the kind of surveillance made possible by social media and big tech firms more generally ("surveillance capitalism") has the potential to establish an unrecoverable global dystopia?

Are there other concrete mechanisms discussed by the documentary? 

The argument that concerned me most was that disinformation spreads 6 times faster than the truth.

The implication is that it’s becoming increasingly difficult for people to establish what the truth is. This undermines democracy and the ability to build consensus. I think we will see this play out with the results of the US election in November and the extent to which people believe and accept the result. 

There are some studies suggesting fake news isn't quite the problem some think.



There are also a number of papers which are sceptical of there being pervasive social media "echo chambers" or "filter bubbles".



Cf also this recent book by Hugo Mercier, which argues that people are less gullible than many think.

I don't know this literature well and am not quite sure what conclusions to draw. My impression is, however, that some claims of the dangers of fake news on social media are exaggerated.

Cf also my comment on the post on recommender systems, relating to other effects of social media.


I would be interested to see any evidence on whether citizen knowledge has increased or not since social media formed. People often assert this but don't argue for it and the long-term trend isn't that clear.

I'm not sure this answers your question but the Edelman Trust Barometer has been tracking levels of trust in societal institutions (government, business, NGOs and media) for the last 20 years. The trend shows a widening division between the "Informed Public" and the "Mass Population" using the following definitions:

Informed Public 

  • 500 respondents in U.S. and China; 200 in all other markets
  • Represents 17% of total global population 
  • Must meet 4 criteria
    • Ages 25-64 
    • College-educated
    • In top 25% of household income per age group in each market
    • Report significant media consumption and engagement in public policy and business news 

Mass Population 

  • All population not including informed public 
  • Represents 83% of total global population

I would just like to point out three "classical EA" arguments for taking recommender systems very seriously.

1) The dangerousness of AGI has been argued to be orthogonal to the purpose of AGI, as illustrated by the paperclip maximizers. If you accept this "orthogonality thesis" and if you are concerned about AGI, then you should be concerned about the most sophisticated maximization algorithms. Recommender systems seem to be today's most sophisticated maximization algorithms (a lot more money and computing power has been invested in optimizing recommender systems than in GPT-3). Given the enormous economic incentives, we should probably not dismiss the possibility that they will remain the most sophisticated maximization algorithms in the future.

As a result, arguments of the form "I don't see how recommender systems can pose an existential threat" seem akin to arguments of the form "I don't see how AGI can pose an existential threat".

(of course, if you reject the latter, I can see why you could reject the former 🙂)

2) Yudkowsky argues that “By far the greatest danger of Artificial Intelligence is that people conclude too early that they understand it.” Today's recommender systems are typical examples of something "that people conclude too early that they understand it". Such algorithms learn from enormous amounts of data which will definitely bias them in ways that no one can understand, since no one can view even an iota of what the YouTube algorithm sees. After all, YouTube receives 500 hours of new video per minute (!!), which it processes at least for copyright, hate speech filtering and automated captioning.

As a result, arguments of the form "I don't think the YouTube recommender system is intelligent/sophisticated" might be signs that you are underestimating today's algorithms. If so, then you might be prey to Yudkowsky's "greatest danger". At the very least, dismissing the dangerousness of large-scale algorithms without an adequate understanding of them should probably be regarded as a bad habit.

3) Toby Ord's latest book stresses the problem of risk factors. Typically, if everybody cared about political scandals while a deadly pandemic (much worse than COVID-19) was going on, then, surely, the probability of mitigating pandemic risks would be greatly diminished. Arguably, recommender systems are major risk factors, because they point the attention of billions of individuals away from the most pressing problems, including the attention of the brightest of us.

Bill Gates seems to have given a lot of importance to the risk factor of exposure to poor information, or to the lack of quality information, as his foundation has been investing a lot in "solutions journalism". Perhaps more interestingly still, he has decided to be a YouTuber himself. His channel has 2.3M views (!!) and 450 videos (!!). He publishes several videos per week, especially during this COVID-19 pandemic, probably because he considers that the battle of information is a major cause area! At the very least, he seems to believe that this huge investment is worth his (very valuable) time.

In particular, arguments of the form "I don't see how recommender systems can pose an existential threat" are at least as invalid as "I don't see how AGI can pose an existential threat"

Hold on for a second here. AGI is (by construction) capable of doing everything a recommender system can do plus presumably other things, so it cannot be the case that arguments for AGI posing an existential threat are necessarily weaker than arguments for recommender systems posing an existential threat.

NB: I've edited the sentence to clarify what I meant. The argument here is more that recommender systems are maximization algorithms, and that, if you buy the "orthogonality thesis", there is no reason to think that they cannot become AGI. In particular, you should not judge the capability of an algorithm by the simplicity of the task it is given. Of course, you may reject the orthogonality thesis. If so, please ignore the first argument.

If you buy it, there is a neat continuity between the problems with current social media and AI alignment, explained in some detail in What Failure Looks Like.

It’s already much easier to pursue easy-to-measure goals, but machine learning will widen the gap by letting us try a huge number of possible strategies and search over massive spaces of possible actions. That force will combine with and amplify existing institutional and social dynamics that already favor easily-measured goals.

From an AI safety perspective, the algorithms that create the feeds that social media users see do have some properties that make them potentially more concerning than most AI applications:

  1. The top capabilities are likely to be concentrated rather than distributed. For example, very few actors in the near future are likely to invest resources in such algorithms on a scale similar to Facebook's.
  2. The feed-creation-solution (or policy, in reinforcement learning terminology) being searched for has a very rich real-world action space (e.g. showing some post X to some user Y, where Y is any person from a set of 3 billion FB users).
  3. The social media company is incentivized to find a policy that maximizes users' time-spent over a long time horizon (rather than using a very small discount factor).
  4. Early failures/deception-attempts may be very hard to detect, especially if the social media company itself is not on the lookout for such failures.

These properties seem to make it less likely that relevant people would see sufficiently alarming small-scale failures before the point where some AI systems pose existential risks.
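
To make the discount-factor point in item 3 a bit more concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration only: the two "policies" (`clickbait` and `habit_forming`) and their daily time-spent numbers are made up, and real feed-ranking systems are far more complicated. It only shows how the same objective (discounted total time-spent) can favor different behavior depending on the discount factor `gamma`.

```python
def discounted_return(rewards, gamma):
    """Standard discounted return: sum of reward_t * gamma**t over time steps t."""
    return sum(r * gamma**t for t, r in enumerate(rewards))


# Hypothetical minutes of user time per day under two made-up feed policies.
clickbait = [30] + [5] * 29       # big spike on day 0, then the user drifts away
habit_forming = [10] + [20] * 29  # modest start, sustained use afterwards

# A small gamma means a short effective horizon; gamma near 1 means a long one.
for gamma in (0.1, 0.99):
    cb = discounted_return(clickbait, gamma)
    hf = discounted_return(habit_forming, gamma)
    preferred = "clickbait" if cb > hf else "habit_forming"
    print(f"gamma={gamma}: clickbait={cb:.1f}, habit_forming={hf:.1f}, "
          f"preferred={preferred}")
```

Under these made-up numbers, a short-horizon optimizer (small `gamma`) prefers the immediate spike, while a long-horizon optimizer (`gamma` close to 1) prefers whatever keeps users coming back. That second regime is the one the list above suggests a social media company is incentivized to pursue.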

(The following is long, sorry about that. Maybe I should have written it up already as a normal post. A one sentence abstract could be: "Social media algorithms could be dangerous as a part of the overall process of leading people to 'consent' to being lesser forms of themselves to further elite/AI/state goals, perhaps threatening the destruction of humanity's longterm potential.")

It seems plausible to me that something like algorithmic behavior modification (social media algorithms are algorithms designed to modify human behavior, to some extent; could be early examples of the phenomenon) could bend human preferences so that future humans freely (or "freely"?) choose things that we (the readers of this comment? reflective humans of 2020?) would consider non-optimal. If you combine that with the possibility of algorithms recommending changes in human genes, it's possible to rewrite human nature (with the consent of humans) into a form that AI (or the elite who control AI) find more convenient. For instance, humans could be simplified so that they consume fewer resources or present less of a political threat. The simplest humans are blobs of pleasure (easily satisfying hedonism) and/or "yes machines" (people who prefer cheap and easy things and thus whose preferences are trivial to satisfy). Whether this technically counts as existential risk, I'm not sure. It might be considered a "destruction of humanity's longterm potential". Part of human potential is the potential of humans to be something.

I suggest "freely" might ought to be in quotes for two reasons. One is the "scam phenomenon". A scammer can get a mark into a mindset in which they do things they wouldn't ordinarily do. (Withdraw a large sum of money from their bank account and give it to the scammer, just because the scammer asks for it.) The scammer never puts a gun to the mark's head. They just give them a plausible-enough story, and perhaps build a simple relationship, skillfully but not forcefully suggesting that the mark has something to gain from giving, or some obligation compelling it. If after "giving" the money, the mark wises up and feels regret, they might appeal to the police. Surely they were psychologically manipulated. And they were, they were in a kind of dream world woven by the scammer, who never forced anything but who drew the mark into an alternate reality. In some sense what happened was criminal, a form of theft. But the police will say "But it was of your own free will." The police are somewhat correct in what they say. The mark was "free" in some sense. But in another sense, the mark was not. We might fear that an algorithm (or AI) could be like a sophisticated scammer, and scam the human race, much like some humans have scammed large numbers of humans before.

The second reason is that adoption of changes (notably technology, but also social changes), of which changing human genes would be an example, and of which accepting algorithmic behavior modification could be another, is something that is only in a limited sense a satisfaction of the preferences of humans, or the result of their conscious decision. In the S-shaped curve of adoption, there are early adopters, late/non-adopters, and people in the middle. Early adopters probably really do affirm the innovations they adopt. Late or non-adopters probably really do have some kind of aversion to them. These people have true opinions about innovations. But most people, in the middle of the graph, are incentivized to a large extent by "doing whatever it is looks like is popular, is becoming popular, is something that looks pretty clear has become and will be popular". So technological adoption, or the adoption of any other innovation, is not necessarily something we as a whole species truly prefer or decide for, but there's enough momentum that we find ourselves falling in line.

I think more likely than the extreme of "blobs of pleasure / yes machines" are people who lack depth, are useless, and live in a VR dream world. On some, deeper, level they would be analogous to blobs/yes machines, but their subjective experience, on a surface level, would be more recognizably human. Their lives would be positive on some level and thus would be such that altruistic/paternalistic AI or AI-controlling elite could feel like they were doing the right thing by them. But their lives would be lacking in dimensions that perhaps AI or AI-controlling elite wouldn't think of including in their (the people's, or even the elite's/AI's own) experience. The people might not have to pay a significant price for anything and thus never value things (or other people) in a deeper way. They might be incapable of desiring anything other than "this life", such as a "spiritual world" (or something like a "spiritual world", a place of greater meaning) (something the author of Brave New World or Christians or Nietzscheans would all object to). In some objective sense, perhaps capability -- toward securing your own well-being, capability in general, behaving in a significant way, being able to behave in a way that really matters -- is something that is part of human well-being (and so civilization is both progress and regress as we make people who are less and less capable of, say, growing their own food, because of all the conveniences and safety we build up). We could further open up the thought that there is some objective state of affairs, something other than human perceptions of well-being or preference-satisfaction, which constitutes part of human well-being. Perhaps to be rightly related to reality (properly believing in God, or properly not believing in God, as the case may be).

So we might need to figure out exactly what human well-being is, or if we can't figure it out in advance for the whole human species (after all, each person has a claim to knowing what human well-being is), then try to keep technology and policy from doing things that hamper the ability of each person to come to discover and to pursue true human well-being. One could see in hedonism and preferentialism a kind of attempt at value agnosticism: we no longer say that God (a particular understanding of God), or the state, or some sacred site is the Real Value, we instead say "well, we as the state will support you or at least not hinder you in your preference for God, the state, or the sacred site, whatever you want, as long as it doesn't get in the way of someone else's preference -- whatever makes you happy". But preferentialism and hedonism aren't value-agnostic if they start to imply through their shaping of a person's experience "none of your sacred things are worth anything, we're just going to make you into a blob of pleasure who says yes, on most levels, with a veneer of human experience on the surface level of your consciousness." I think that a truly value-agnostic state/elite/AI might ought to try to maximize "the ability for each person to secure their own decision-making ability and basic physical movement", which could be taken as a proxy for the maximization of each person's agency and thus their ability to discover and pursue true human well-being. And to make fewer and fewer decisions for the populace, to try to make itself less and less necessary from a paternalistic point of view. Rather than paternalism, adopt a parental view -- parents tend to want their children to be capable, and to become, in a sense, their equals. All these are things that altruists who might influence the AI-controlling elite in the coming decades or centuries, or those who might want to align AI, could take into account.

We might be concerned with AI alignment, but we should also be concerned with the alignment of human civilization. Or the non-alignment, the drift of it. Fast take-off AI can give us stark stories where someone accidentally misaligns an AI to a fake utility function and it messes up human experience and/or existence irrevocably and suddenly -- and we consider that a fate to worry about and try to avoid. But slow take-off AI (I think) would/will involve the emergence of a bunch of powerful Tool AIs, each of which (I would expect) would be designed to be basically controllable by some human and to not obviously kill anyone or cause comparably clear harm (analogous to design of airplanes, bridges, etc.) -- that's what "alignment" means in that context [correct me if I'm wrong]; none of which are explicitly defined to take care of human well-being as a whole (something a fast-takeoff aligner might consciously worry about and decide about); no one of which rules decisively; all of which would be in some kind of equilibrium reminiscent of democracy, capitalism, and the geopolitical world. They would be more a continuation of human civilization than a break with it. Because the fake utility function imposition in a slow takeoff civilizational evolution is slow and "consensual", it is not stark and we can "sleep through it". The fact that Nietzsche and Huxley raised their complaints against this drift long ago shows that it's a slow and relatively steady one, a gradual iteration of versions of the status quo, easy for us to discount or adapt to. Social media algorithms are just a more recent expression of it.

Perhaps not one that "threatens the premature extinction of Earth-originating intelligent life" (Bostrom, 2012)

I just want to flag that the full sentence from that paper is: "An existential risk is one that threatens the premature extinction of Earth-originating intelligent life or the permanent and drastic destruction of its potential for desirable future development (Bostrom 2002)."

Good point. I worded that clumsily. I've edited the post so it incorporates the full quote now.
