Matthew_Barnett

Some people seem to think the risk from AI comes from AIs gaining dangerous capabilities, like situational awareness. I don't really agree. I view the main risk as simply arising from the fact that AIs will be increasingly integrated into our world, diminishing human control.

Under my view, the most important thing is whether AIs will be capable of automating economically valuable tasks, since this will prompt people to adopt AIs widely to automate labor. If AIs have situational awareness, but aren't economically important, that's not as concerning.

The risk is not so much that AIs will suddenly and unexpectedly take control of the world. It's that we will voluntarily hand over control to them anyway, and we want to make sure this handoff is handled responsibly. 

A sudden AI coup, while possible, is not necessary for this loss of control to occur.

> Barnett argues that future technology will be primarily used to satisfy economic consumption (aka selfish desires). That seems plausible to me; however, I'm not that concerned about this causing huge amounts of future suffering (at least compared to other s-risks). It seems to me that most humans place non-trivial value on the welfare of (neutral) others such as animals. Right now, this preference (for most people) isn't strong enough to outweigh the selfish benefits of eating meat. However, I'm relatively hopeful that future technology would make such tradeoffs much less costly.

At the same time that technological progress makes it less selfishly costly to be kind to animals, it could also make it more selfishly enticing to commit other moral tragedies. For example, it could hypothetically turn out, just as a brute empirical fact, that the most effective way of aligning AIs is to treat them terribly in some way, e.g. by brainwashing them or subjecting them to painful stimuli.

More generally, technological progress doesn't seem to asymmetrically make people more moral. Factory farming, as a chief example, allowed people to satisfy their desire for meat more cost-effectively, but at a larger moral cost compared to what existed previously. Even if factory farming is eventually replaced with something humane, there doesn't seem to be an obvious general trend here.

Of the arguments you allude to, the one I find most plausible is that incidental s-risks arising as a byproduct of economic activity might not be as bad as some other forms of s-risk. But even so, incidental s-risks seem plausibly quite bad in expectation.

In some circles that I frequent, I've gotten the impression that a decent fraction of the rhetoric around AI has become pretty emotionally charged. And I'm worried about the presence of what I perceive as demagoguery regarding the merits of AI capabilities and AI safety. Out of a desire to avoid calling out specific people or statements, I'll just discuss a hypothetical example for now.

Suppose an EA says, "I'm against OpenAI's strategy for straightforward reasons: OpenAI is selfishly gambling everyone's life in a dark gamble to make themselves immortal." Would this be a true, non-misleading statement? Would this statement likely convey the speaker's genuine beliefs about why they think OpenAI's strategy is bad for the world?

To begin to answer these questions, we can consider the following observations:

  1. AI powerful enough to end the world would presumably also be powerful enough to do lots of incredibly positive things, such as reducing global mortality and curing diseases. By delaying AI, we are therefore equally "gambling everyone's life" by forcing people to face ordinary mortality (see the rough arithmetic sketched after this list).
  2. Selfish motives can be, and frequently are, aligned with the public interest. For example, Jeff Bezos was very likely motivated by selfish desires in his accumulation of wealth, but building Amazon nonetheless benefitted millions of people in the process. Such win-win situations are common in business, especially when developing technologies.
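
To make the first point concrete, here is a rough, back-of-the-envelope calculation. The figures are my own illustrative assumptions rather than numbers from the comment itself: roughly 60 million people die worldwide each year from ordinary causes, so if sufficiently advanced AI could eventually prevent most of those deaths, each year of delay carries a substantial mortality cost of its own.

```python
# Back-of-the-envelope arithmetic for the "delaying AI also gambles lives" point.
# The ~60 million annual deaths figure is an approximate global statistic;
# the 10-year pause is a purely hypothetical scenario used for illustration.
annual_deaths_worldwide = 60_000_000
years_of_delay = 10
deaths_during_delay = annual_deaths_worldwide * years_of_delay
print(f"~{deaths_during_delay:,} deaths from ordinary causes during a "
      f"{years_of_delay}-year delay")
# prints: ~600,000,000 deaths from ordinary causes during a 10-year delay
```

This only counts one side of the ledger, of course; the expected cost of catastrophe has to be weighed against it, which is exactly where the disagreement lies.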

Because AI has the potential to bring both great risks and great benefits, it seems to me that there are plenty of plausible pro-social arguments one can give for favoring OpenAI's strategy of pushing forward with AI. It therefore seems pretty misleading to frame their mission as a dark and selfish gamble, at least on first impression.

Here's my point: depending on the speaker, I frequently think their actual reason for opposing OpenAI's strategy is not that they think OpenAI is undertaking a dark, selfish gamble. Instead, it's often just standard strong longtermism. A less misleading statement of their view would go something like this:

"I'm against OpenAI's strategy because I think potential future generations matter more than the current generation of people, and OpenAI is endangering future generations in their gamble to improve the lives of people who currently exist."

I claim this statement would—at least in many cases—be less misleading than the other statement because it captures a major genuine crux of the disagreement: whether you think potential future generations matter more than currently-existing people.

This statement also omits the "selfish" accusation, which I think is often just a red herring designed to mislead people: we don't normally accuse someone of being selfish when they do a good thing, even if the accusation is literally true.

(There can, of course, be further cruxes, such as your p(doom), your timelines, your beliefs about the normative value of unaligned AIs, and so on. But at the very least, a longtermist preference for potential future generations over currently existing people seems like a huge, genuine crux for many people in this debate, once they carefully work through these things together.)

Here's why I care about discussing this. I admit that I care a substantial amount (not an overwhelming amount, but hardly an insignificant one) about currently existing people. I want to see people around me live long, healthy and prosperous lives, and I don't want to see them die. And indeed, I think advancing AI could greatly help currently existing people. As a result, I find it pretty frustrating to see people use what I perceive to be essentially demagogic tactics designed to sway people against AI, rather than plainly stating the cruxes behind why they actually favor the policies they do.

These allegedly demagogic tactics include:

  1. Highlighting the risks of AI to argue against development while systematically omitting the potential benefits, thereby obscuring a more comprehensive assessment of your preferred policies.
  2. Highlighting extraneous drawbacks of AI development that you wouldn't ordinarily care much about in other discussions of innovation, such as the potential for job losses from automation. Much of the time, this type of rhetoric looks to me a lot like deceptively searching for arguments designed to persuade, rather than honestly explaining one's perspective.
  3. Conflating, or at least strongly associating, the selfish motives of people who work at AI firms with the allegedly harmful effects of their work. This rhetoric plays on public prejudices by appealing to a widespread but false belief that selfish motives are inherently suspicious, or can't translate into pro-social results. In fact, there is no contradiction in the idea that most people at OpenAI are in it for the money, status, and fame, but that what they're doing is nonetheless good for the world, and that they genuinely believe it is.

I'm against these tactics for a variety of reasons, but one of the biggest is that they can, in some cases and depending on the context, indicate a degree of dishonesty. And I'd really prefer EAs to focus on trying to be almost-maximally truth-seeking in both their beliefs and their words.

Speaking more generally—to drive one of my points home a little more—I think there are roughly three possible views you could have about pushing for AI capabilities relative to pushing for pausing or more caution:

  1. Full-steam ahead view: We should accelerate AI at any and all costs. We should oppose any regulations that might impede AI capabilities, and embark on a massive spending spree to accelerate AI capabilities.
  2. Full-safety view: We should try as hard as possible to shut down AI right now, and thwart any attempt to develop AI capabilities further, while simultaneously embarking on a massive spending spree to accelerate AI safety.
  3. Balanced view: We should support a substantial mix of both safety and acceleration efforts, attempting to carefully balance the risks and rewards of AI development to ensure that we can seize the benefits of AI without bearing intolerably high costs.

I tend to think most informed people, when pushed, advocate the third view, albeit with wide disagreement about the right mix of support for safety and acceleration. Yet on a superficial level, the level of rhetoric, I find that the first and second views are surprisingly common. On this level, I tend to find e/accs in the first camp, and a large fraction of EAs in the second camp.

But if your actual beliefs are something like the third view, I think that's an important fact to emphasize in honest discussions about what we should do with AI. If your rhetoric is consistently aligned with (1) or (2) but your actual beliefs are aligned with (3), I think that can often be misleading. And it can be especially misleading if you're trying to publicly paint other people in the same camp—the third one—as somehow having bad motives merely because they advocate a moderately higher mix of acceleration over safety efforts than you do, or vice versa.

I think OpenAI doesn't actually advocate a "full-steam ahead" approach in a strong sense. A hypothetical version of OpenAI that did would immediately gut its safety and preparedness teams, advocate subsidies for AI, and argue against any and all regulations that might impede its mission.

Now, of course, there might be political reasons why OpenAI doesn't come out and do this. They care about their image, and I'm not claiming we should take all their statements at face value. But another plausible theory is simply that OpenAI leaders care about both acceleration and safety. In fact, caring about both safety and acceleration seems quite rational from a purely selfish perspective.

I claim that such a stance wouldn't actually be much different from the allegedly "ordinary" view that I described previously: that acceleration, rather than pausing or shutting down AI, can be favored in many circumstances.

OpenAI might be less risk-averse than the general public, but in that case we're talking about a difference in degree, not a qualitative difference in motives.

> I think "if you believe a technology will make humanity go extinct with a probability of 1% or more, be very very cautious" would be endorsed by a large majority of the general population & intellectual 'elite'.

I'm not sure we disagree. A lot seems to depend on what is meant by "very very cautious". If it means shutting down AI as a field, I'm pretty skeptical that most people would endorse it. If it means regulating AI, then I agree, though I'd note that Sam Altman advocates regulation too.

I agree the general population would probably endorse the statement "if a technology will make humanity go extinct with a probability of 1% or more, be very very cautious" if it were put to them in a survey of some kind, but I think this statement is vague, and somewhat misleading as a frame for how people would think about AI if they were given more facts about the situation.

Firstly, we're not merely talking about any technology here; we're talking about a technology that has the potential both to disempower humans and to make their lives dramatically better. Almost every technology has risks as well as benefits. Probably the most common method people use when deciding whether to adopt a technology is to check whether the benefits outweigh the risks. Looking at the risks alone gives a misleading picture.

The relevant quantity is the risk-to-benefit ratio, and here it's really not obvious that most people would endorse shutting down AI if they were aware of all the facts. Yes, the risks are high, but so are the benefits.

If elites were made aware of both the risks and the benefits of AI development, most of them would likely want to proceed cautiously, rather than not proceed at all or pause AI for many years, as many EAs have suggested. To test this claim empirically, we can look at what governments are already doing with regard to AI risk policy after having been advised by experts; as far as I can tell, all of the relevant governments are substantially interested in both innovation and safety regulation.

Secondly, there's a persistent and often large gap between what people say (e.g. when answering surveys) and what they actually want as measured by their behavior. For example, plenty of polling has indicated that a large fraction of people are very cautious regarding GMOs, but in practice most people happily eat GM foods without much concern. People are often largely thoughtless when answering abstract questions posed to them, especially on topics they know little about. And this makes sense, because their responses typically have almost no immediate or direct consequences for them. Bryan Caplan has discussed these issues with surveys and voting before.

There's an IMO fairly simple and plausible explanation for why Sam Altman would want to accelerate AI that doesn't require positing massive cognitive biases or dark motives. The explanation is simply that, according to his moral views, accelerating AI is a good thing to do.

[ETA: also, presumably, Sam Altman thinks that some level of safety work is good. He just prefers a lower level of safety work/deceleration than a typical EA might recommend.]

It wouldn't be unusual for him to have such a moral view. If one's moral view puts substantial weight on the lives and preferences of currently existing humans, then plausible models of the tradeoff between safety and capabilities say that acceleration can easily be favored. This idea was illustrated by Nick Bostrom in 2003 and more recently by Chad Jones.
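
To illustrate the kind of tradeoff model being gestured at here, below is a minimal, hypothetical expected-value sketch written from a person-affecting perspective. It is not Bostrom's or Jones's actual model, and every parameter in it is a placeholder assumption; the point is only that, under assumptions like these, acceleration can come out ahead for currently existing people.

```python
# A minimal, hypothetical sketch of the safety/capabilities tradeoff from the
# perspective of currently existing people. All parameters are illustrative
# placeholders, not estimates from Bostrom (2003) or Jones's recent work.

def expected_life_years(p_doom: float, arrival_years: float,
                        baseline_expectancy: float = 40.0,
                        post_agi_expectancy: float = 500.0,
                        annual_mortality: float = 0.01) -> float:
    """Expected remaining life-years for a representative currently
    existing person under one AI-arrival scenario."""
    # Chance of surviving ordinary mortality until AI arrives.
    p_survive_until_agi = (1 - annual_mortality) ** arrival_years
    # If AI arrives and goes well, lifespan is greatly extended;
    # with probability p_doom, remaining life-years are treated as ~0.
    good_outcome = p_survive_until_agi * (1 - p_doom) * post_agi_expectancy
    # Crude approximation: those who die before arrival get about half
    # their baseline remaining life expectancy.
    pre_agi_years = (1 - p_survive_until_agi) * (baseline_expectancy / 2)
    return good_outcome + pre_agi_years

# Accelerated timeline with higher risk vs. delayed timeline with lower risk.
accelerate = expected_life_years(p_doom=0.05, arrival_years=10)
delay = expected_life_years(p_doom=0.01, arrival_years=50)
print(f"accelerate: {accelerate:.0f} expected life-years per person")
print(f"delay:      {delay:.0f} expected life-years per person")
```

Shift enough weight onto potential future generations, as strong longtermism does, and the comparison can easily reverse, which is the divergence in starting assumptions discussed above.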

Arguably, it is effective altruists who are the unusual ones here. The standard EA theory employed to justify extreme levels of caution around AI is strong longtermism. But most people, probably including Sam Altman, are not strong longtermists.

> Me being alive is a relatively small part of my values.

I agree some people (such as yourself) might be extremely altruistic, and therefore might not care much about their own life relative to other values they hold, but this position is fairly uncommon. Most people care a lot about their own lives (and especially the lives of their family and friends) relative to other things they care about. We can empirically test this hypothesis by looking at how people choose to spend their time and money; and the results are generally that people spend their money on themselves, their family and their friends.

> since I am not the director of the world, me personally being around to influence things is unlikely to have a decisive impact on things I value.

You don't need to be the director of the world to have influence over things. You can be just a small part of the world and still have influence over things you care about. This is essentially what you're already doing by living and using your income to make decisions that satisfy your own preferences. I'm claiming this situation could, and probably will, persist into the indefinite future for the agents that exist then.

I'm very skeptical that there will ever be a moment when there is a "director of the world" in a strong sense. And I doubt the developer of the first AGI will become anything even remotely like a director of the world (including versions of them that reflect on moral philosophy etc.). You might want to read my post about this.

One intuitive argument for why capitalism should be expected to advance AI faster than competing economic systems is that capitalist institutions incentivize capital accumulation, and AI progress is mainly driven by the accumulation of computer capital.

This is a straightforward argument: a core element of capitalist institutions is traditionally considered to be the ability to own physical capital and receive income from that ownership. AI progress and AI-driven growth require physical computer capital, both for training and for inference. Right now, all the major tech companies, including Microsoft, Meta and Google, are spending large sums to amass stockpiles of compute to train larger, more capable models and to serve AI services to customers via cloud APIs. The obvious reason these companies are taking these actions is that they expect to profit from their ownership of AI capital.

While it's true that competing economic systems also have mechanisms to accumulate capital, the capitalist system is practically synonymous with this motive. For example, while a centrally planned government could theoretically decide to spend 20% of GDP on computer capital, the politicians and bureaucrats within such a system might have only weak incentives to pursue such a strategy, since they may not directly profit from the decision over and above the gains received by the general population. By contrast, a decentralized property and price system makes such a decision extremely natural if one expects huge returns from investments in physical capital.

One can interpret this argument as a positive argument in favor of capitalist institutions (as I mostly do), or as an argument for reining in these institutions if you think that rapid AI progress is bad.

I have the feeling we're talking past each other a bit, and I suspect talking about this poll was something of a distraction. My sense is that I've been trying to convey a central point, but instead of getting that point across, the conversation keeps slipping into how to interpret minor things I said, which I don't see as very relevant.

I will probably take a break from replying for now, for these reasons, although I'd be happy to catch up some time and maybe have a call to discuss these questions in more depth. I definitely see you as trying a lot harder than most other EAs to make progress on these questions collaboratively with me.

This response still seems underspecified to me. Is the default unaligned alternative paperclip maximization in your view? I understand that Eliezer Yudkowsky has given arguments for this position, but it seems like you diverge significantly from Eliezer's general worldview, so I'd still prefer to hear this take spelled out in more detail from your own point of view.
