All of Matthew_Barnett's Comments + Replies

If waiting is indeed very risky, then an AI may face a difficult trade-off between the risk of attempting a takeover before it has enough resources to succeed, and waiting too long and being cut off from even being able to make an attempt.

Attempting takeover or biding one's time are not the only options available to an AI. Indeed, in the human world, world takeover is rarely contemplated. For an agent that is not more powerful than the rest of the world combined, it seems likely that it will consider alternative strategies for achieving its goals before con... (read more)

2
Ben Millwood
7d
Yeah I think this is quite sensible -- I feel like I noticed one thing missing from the normal doom scenario and didn't notice all of the implications of missing that thing; in particular, the reason the AI in the normal doom scenario takes over is that it is highly likely to succeed, and if it isn't, takeover seems much less interesting.

the original statement still just seems to imagine that norms will be a non-trivial reason to avoid theft, which seems quite unlikely for a moderately rational agent.

Sorry, I think you're still conflating two different concepts. I am not claiming:

  • Social norms will prevent single agents from stealing from others, even in the absence of mechanisms to enforce laws against theft

I am claiming:

  • Agents will likely not want to establish a collective norm that it's OK (on a collective level) to expropriate wealth from old, vulnerable individuals. The reason is becau
... (read more)

If the scenario were such that any one AI agent can expect to get away with defecting (expropriation from older agents) and the norm-breaking requires passing a non-small threshold of such actions

This isn't the scenario I intended to describe, since it seems very unlikely that a single agent could get away with mass expropriation. The more likely scenario is that any expropriation that occurs must have been a collective action to begin with, and thus the coordination problem you describe does not arise.

This is common in ordinary expropriation in the re... (read more)

2
Harrison Durland
18d
I don't find this response to be a compelling defense of what you actually wrote: It's one thing if the argument is "there will be effective enforcement mechanisms which prevent theft," but the original statement still just seems to imagine that norms will be a non-trivial reason to avoid theft, which seems quite unlikely for a moderately rational agent. Ultimately, perhaps much of your scenario was trying to convey a different idea from what I see as the straightforward interpretation, but I think it makes it hard for me to productively engage with it, as it feels like engaging with a motte-and-bailey.

My guess is that at some point someone will just solve the technical problem of alignment. Thus, future generations of AIs would be actually aligned to prior generations and the group they are aligned to would no longer need to worry about expropriation.

I don't think it's realistic that solutions to the alignment problem will be binary in the way you're describing. One could theoretically imagine a perfect solution — i.e. one that allows you to build an agent whose values never drift, that acts well on every possible input it could receive, whose preferenc... (read more)

1
Ryan Greenblatt
18d
The main reasons to expect nearly perfect (e.g. >99% of value) solutions to be doable are:
  • Corrigibility seems much easier.
  • Value might not be that fragile, such that if you get reasonably close you get nearly all the value. (E.g., I currently think the way I would utilize vast resources on reflection probably isn't that much better than that of other people whose philosophical views I broadly endorse.)
1
Ryan Greenblatt
18d
I don't think it's binary, but I do think it's likely to be a sigmoid in practice. And I expect this sigmoid will saturate relatively early.

Perhaps you think this view is worth dismissing because either:

  • You think humanity wouldn't do things which are better than what AIs would do, so it's unimportant. (E.g. because humanity is 99.9% selfish. I'm skeptical; I think this is going to be more like 50% selfish, and the naive billionaire extrapolation is more like 90% selfish.)

From an impartial (non-selfish) perspective, yes, I'm not particularly attached to human economic consumption relative to AI economic consumption. In general, my utilitarian intuitions are such that I don't have a strong preferen... (read more)

It could be that the AI can achieve much more of their objectives if it takes over (violently or non-violently) than it can achieve by playing by the rules.

Sure, that could be true, but I don't see why it would be true. In the human world, it isn't true that you can usually get what you want more easily by force. For example, the United States seems better off trading with small nations for their resources than attempting to invade and occupy them, even from a self-interested perspective.

More generally, war is costly, even between entities with very dif... (read more)

1
Ryan Greenblatt
18d
See here for some earlier discussion of whether violent takeover is likely. (For third parties to view, Matthew was in this discussion.)

Animals are not socially integrated into human society, and we do not share a common legal system or culture with them. We did not inherit legal traditions from them. Nor can we agree to mutual contracts, or coordinate with them in a meaningful way. These differences seem sufficient to explain why we treat them very differently, as you described.

If this difference in treatment were solely due to differences in power, you'd need to explain why vulnerable parties, such as old retired folks or small nations, are not regularly expropriated.

1
alexherwix
18d
I have never said that how we treat nonhuman animals is “solely” due to differences in power. The point that I have made is that AIs are not humans and I have tried to illustrate that differences between species tend to matter in culture and social systems. But we don’t even have to go to species differences, ethnic differences are already enough to create quite a bit of friction in our societies (e.g., racism, caste systems, etc.). Why don’t we all engage in mutually beneficial trade and cooperate to live happily ever after? Because while we have mostly converging needs in a biological sense, we have different values and beliefs.

It still roughly works out in the grand scheme of things because cultural checks and balances have evolved in environments where we had strongly overlapping values and interests. So most humans have comparable degrees of power or are kept in check by those checks and balances. That was basically our societal process of getting to value alignment but as you can probably tell by looking at the news, this process has not reached a satisfactory quality, yet. We have come far but it’s still a shit show out there. The powerful take what they can get and often only give a sh*t to the degree that they actually feel consequences from it.

So, my point is that your “loose” definition of value alignment is an illusion if you are talking about super powerful actors that have divergent needs and don’t share your values. They will play along as long as it suits them but will stop doing it as soon as an alternative more aligned with their needs and values is more convenient. And the key point here is that AIs are not humans and that they have very different needs from us. If they become much more powerful than us, only their values can keep them in check in the long run.

For my part, I define “alignment” as “the AI is trying to do things that the AGI designer had intended for it to be trying to do, as an end in itself and not just as a means-to-an-end towards some different goal that it really cares about.”

This is a reasonable definition, but it's important to note that under this definition of alignment, humans are routinely misaligned with each other. In almost any interaction I have with strangers -- for example, when buying a meal at a restaurant -- we are performing acts for each other because of mutually beneficia... (read more)

5
Steven Byrnes
18d
Humans are less than maximally aligned with each other (e.g. we care less about the welfare of a random stranger than about our own welfare), and humans are also less than maximally misaligned with each other (e.g. most people don’t feel a sadistic desire for random strangers to suffer). I hope that everyone can agree about both those obvious things. That still leaves the question of where we are on the vast spectrum in between those two extremes. But I think your claim “humans are largely misaligned with each other” is not meaningful enough to argue about. What percentage is “largely”, and how do we even measure that? Anyway, I am concerned that future AIs will be more misaligned with random humans than random humans are with each other, and that this difference will have important bad consequences, and I also think there are other disanalogies / reasons-for-concern as well. But this is supposed to be a post about terminology so maybe we shouldn’t get into that kind of stuff here.
2
alexherwix
18d
The difference is that a superintelligence or even an AGI is not human and they will likely need very different environments from us to truly thrive. Ask factory farmed animals or basically any other kind of nonhuman animal if our world is in a state of violence or war… As soon as strong power differentials and diverging needs show up, the value cocreation narrative starts to lose its magic. It works great for humans but it doesn’t really work with other species that are not very close and aligned with us. Dogs and cats have arguably fared quite well but only at the price of becoming strongly adapted to OUR needs and desires. In the end, if you don’t have anything valuable to offer there is not much more you can do besides hoping for, or ideally ensuring, value alignment in the strict sense. Your scenario may work well for some time but it’s not a long-term solution.

Is there a particular part of my post that you disagree with? Or do you think the post is misleading? If so, how?

I think there are a lot of ways AI could go wrong, and "AIs dominating humans like how humans dominate animals" does not exhaust the scope of potential issues.

I really don’t get the “simplicity” arguments for fanatical maximising behaviour. When you consider subgoals, it seems that secretly plotting to take over the world will obviously be much more complicated? Do you have any idea how much computing power and subgoals it takes to try and conquer the entire planet? 

I think this is underspecified because 

  1. The hard part of taking over the whole planet is being able to execute a strategy that actually works in a world with other agents (who are themselves vying for power), rather than the compute or compl
... (read more)

This seems like an isolated demand for rigor to me. I think it's fine to say something is "no evidence" when, speaking pedantically, it's only a negligible amount of evidence.

I think that's fair, but I'm still admittedly annoyed at this usage of language. I don't think it's an isolated demand for rigor because I have personally criticized many other similar uses of "no evidence" in the past.

I think future AIs will be much more aligned than humans, because we will have dramatically more control over them than over humans.

That's plausible to me, but I... (read more)

4
Nora Belrose
1mo
The goal realism section was an argument in the alternative. If you just agree with us that the indifference principle is invalid, then the counting argument fails, and it doesn't matter what you think about goal realism. If you think that some form of indifference reasoning still works—in a way that saves the counting argument for scheming—the most plausible view on which that's true is goal realism combined with Huemer's restricted indifference principle. We attack goal realism to try to close off that line of reasoning.

(I might write a longer response later, but I thought it would be worth writing a quick response now.)

I have a few points of agreement and a few points of disagreement:

Agreements:

  • The strict counting argument seems very weak as an argument for scheming, essentially for the reason you identified: it relies on a uniform prior over AI goals, which seems like a really bad model of the situation.
  • The hazy counting argument—while stronger than the strict counting argument—still seems like weak evidence for scheming. One way of seeing this is, as you pointed out, t
... (read more)
0
Nora Belrose
1mo
This seems like an isolated demand for rigor to me. I think it's fine to say something is "no evidence" when, speaking pedantically, it's only a negligible amount of evidence. I mean, we do in fact discuss the simplicity argument, although we don't go in as much depth. Without a concrete proposal about what that might look like, I don't feel the need to address this possibility. I think future AIs will be much more aligned than humans, because we will have dramatically more control over them than over humans. We did not intend to deny that some AIs will be well-described as having goals.

Superhuman agents ruthlessly optimize for a reward at the expense of anything else we might care about. The more capable the agent and the more ruthless the optimizer, the more extreme the results.

To the extent this is an empirical claim about superhuman agents we are likely to build and not merely a definition, it needs to be argued for, not merely assumed. "Ruthless" optimization could indeed be bad for us, but current AIs don't seem well-described as ruthless optimizers.

Instead, LLMs appear corrigible more-or-less by default, and there don't appear t... (read more)

Some people seem to think the risk from AI comes from AIs gaining dangerous capabilities, like situational awareness. I don't really agree. I view the main risk as simply arising from the fact that AIs will be increasingly integrated into our world, diminishing human control.

Under my view, the most important thing is whether AIs will be capable of automating economically valuable tasks, since this will prompt people to adopt AIs widely to automate labor. If AIs have situational awareness, but aren't economically important, that's not as concerning.

The risk... (read more)

Barnett argues that future technology will be primarily used to satisfy economic consumption (aka selfish desires). That even seems plausible to me; however, I'm not that concerned about this causing huge amounts of future suffering (at least compared to other s-risks). It seems to me that most humans place non-trivial value on the welfare of (neutral) others such as animals. Right now, this preference (for most people) isn't strong enough to outweigh the selfish benefits of eating meat. However, I'm relatively hopeful that future technology would mak

... (read more)
4
David_Althaus
1mo
Yes, agree. (For this and other reasons, I'm supportive of projects like, e.g., NYU MEP.) I also agree that there are no strong reasons to think that technological progress improves people's morality.

As you write, my main reason for worrying more about agential s-risks is that the greater the technological power of agents, the more their intrinsic preferences matter in what the universe will look like. To put it differently, actors whose terminal goals put some positive value on suffering (e.g., due to sadism, retributivism or other weird fanatical beliefs) would deliberately aim to arrange matter in such a way that it contains more suffering—this seems extremely worrisome if they have access to advanced technology.

Altruists would also have a much harder time trading with such actors, whereas purely selfish actors (who don't put positive value on suffering) could plausibly engage in mutually beneficial trades (e.g., they use (slightly) less efficient AI training/alignment methods which contain much less suffering and altruists give them some of their resources in return).

Yeah, despite what I have written above, I probably worry more about incidental s-risks than the average s-risk reducer.

In some circles that I frequent, I've gotten the impression that a decent fraction of existing rhetoric around AI has gotten pretty emotionally charged. And I'm worried about the presence of what I perceive as demagoguery regarding the merits of AI capabilities and AI safety. Out of a desire to avoid calling out specific people or statements, I'll just discuss a hypothetical example for now.

Suppose an EA says, "I'm against OpenAI's strategy for straightforward reasons: OpenAI is selfishly gambling everyone's life in a dark gamble to make themselves immorta... (read more)

I think OpenAI doesn't actually advocate a "full-speed-ahead approach" in a strong sense. A hypothetical version of OpenAI that advocated a full-speed-ahead approach would immediately gut its safety and preparedness teams, advocate subsidies for AI, and argue against any and all regulations that might impede its mission.

Now, of course, there might be political reasons why OpenAI doesn't come out and do this. They care about their image, and I'm not claiming we should take all their statements at face value. But another plausible theory is simply that Ope... (read more)

I think "if you believe the probability that a technology will make humanity go extinct with a probability of 1% or more, be very very cautious" would be endorsed by a large majority of the general population & intellectual 'elite'.

I'm not sure we disagree. A lot seems to depend on what is meant by "very very cautious". If it means shutting down AI as a field, I'm pretty skeptical. If it means regulating AI, then I agree, but I think Sam Altman advocates regulation too.

I agree the general population would probably endorse the statement "if a techn... (read more)

There's an IMO fairly simple and plausible explanation for why Sam Altman would want to accelerate AI that doesn't require positing massive cognitive biases or dark motives. The explanation is simply: according to his moral views, accelerating AI is a good thing to do.

[ETA: also, presumably, Sam Altman thinks that some level of safety work is good. He just prefers a lower level of safety work/deceleration than a typical EA might recommend.]

It wouldn't be unusual for him to have such a moral view. If one's moral view puts substantial weight on the lives and... (read more)

5
David Mathers
1mo
I think that whilst utilitarian but not longtermist views might well justify full-speed ahead, normal people are quite risk averse, and are not likely to react well to someone saying "let's take a 7% chance of extinction if it means we reach immortality slightly quicker and it benefits current people, rather than being a bit slower so that some people die and miss out". That's just a guess though. (Maybe Altman's probability is actually way lower, mine would be, but I don't think a probability more than an order of magnitude lower than that fits with the sort of stuff about X-risk he's said in the past.) 

Arguably, it is effective altruists who are the unusual ones here. The standard EA theory employed to justify extreme levels of caution around AI is strong longtermism.

This suggests people's expected x-risk levels are really small ('extreme levels of caution'), which isn't what people believe.

I think "if you believe the probability that a technology will make humanity go extinct with a probability of 1% or more, be very very cautious" would be endorsed by a large majority of the general population & intellectual 'elite'. It's not at all a fringe moral position.

1
Nick K.
1mo
You don't need to be an extreme longtermist to be sceptical about AI; it suffices to care about the next generation and not want extreme levels of change. I think looking too much into differing morals is the wrong lens here. The most obvious explanation for how Altman and people more concerned about AI safety (not specifically EAs) differ seems to be in their estimates of how likely AI risk is vs other risks. That being said, the point that it's disingenuous to ascribe cognitive bias to Altman for having whatever opinion he has is a fair one - and one shouldn't go too far with it in view of general discourse norms. Still, given Altman's exceptional capability for unilateral action due to his position, it's reasonable to be at least concerned about it.
4
NickLaing
1mo
I agree that's possible, but I'm not sure I've seen his rhetoric put that view forward in a clear way.

Me being alive is a relatively small part of my values.

I agree some people (such as yourself) might be extremely altruistic, and therefore might not care much about their own life relative to other values they hold, but this position is fairly uncommon. Most people care a lot about their own lives (and especially the lives of their family and friends) relative to other things they care about. We can empirically test this hypothesis by looking at how people choose to spend their time and money; and the results are generally that people spend their money on ... (read more)

One intuitive argument for why capitalism should be expected to advance AI faster than competing economic systems is that capitalist institutions incentivize capital accumulation, and AI progress is mainly driven by the accumulation of computer capital.

This is a straightforward argument: a core element of capitalist institutions is traditionally considered to be the ability to own physical capital and receive income from that ownership. AI progress and AI-driven growth require physical computer capital, both for training and for infe... (read more)

4
Erich_Grunewald
1mo
That makes sense. I agree that capitalism likely advances AI faster than other economic systems. I just don't think the difference is large enough for economic systems to be a very useful frame of analysis (or point of intervention) when it comes to existential risk, let alone the primary frame.

I have the feeling we're talking past each other a bit. I suspect talking about this poll was kind of a distraction. I personally have the sense of trying to convey a central point, and instead of getting the point across, I feel the conversation keeps slipping into talking about how to interpret minor things I said, which I don't see as very relevant.

I will probably take a break from replying for now, for these reasons, although I'd be happy to catch up some time and maybe have a call to discuss these questions in more depth. I definitely see you as trying a lot harder than most other EAs to make progress on these questions collaboratively with me.

4
JWS
2mo
I'd be very happy to have some discussion on these topics with you Matthew. For what it's worth, I really have found much of your work insightful, thought-provoking, and valuable. I think I just have some strong, core disagreements on multiple empirical/epistemological/moral levels with your latest series of posts. That doesn't mean I don't want you to share your views, or that they're not worth discussion, and I apologise if I came off as too hostile. An open invitation to have some kind of deeper discussion stands.[1]

1. ^ I'd like to try out the new dialogue feature on the Forum, but that's a weak preference
3
Ryan Greenblatt
2mo
Agreed, sorry about that.

This response still seems underspecified to me. Is the default unaligned alternative paperclip maximization in your view? I understand that Eliezer Yudkowsky has given arguments for this position, but it seems like you diverge significantly from Eliezer's general worldview, so I'd still prefer to hear this take spelled out in more detail from your own point of view.

1
Ryan Greenblatt
2mo
Your poll says: And then you say: So, I think more human control is better than more literal paperclip maximization, the option given in your poll. My overall position isn't that the AIs will certainly be paperclippers, I'm just arguing in isolation about why I think the choice given in the poll is defensible.

Like you claim there aren't any defensible reasons to think that what humans will do is better than literally maximizing paper clips?

I'm not exactly sure what you mean by this. There were three options, and human paperclippers were only one of these options. I was mainly discussing the choice between (1) and (2) in the comment, not between (1) and (3).

Here's my best guess at what you're saying: it sounds like you're repeating that you expect humans to be unusually altruistic or thoughtful compared to an unaligned alternative. But the point of my previou... (read more)

5
Ryan Greenblatt
2mo
+1, but I don't generally think it's worth counting on "the EA community" to do something like this. I've been vaguely trying to pitch Joe on doing something like this (though there are probably better uses of his time) and his recent blog posts touch on similar topics.
1
Ryan Greenblatt
2mo
You didn't make this clear, so I was just responding generically. Separately, I think I feel a pretty similar intuition for case (2): people literally only caring about their families seems pretty clearly worse.
1
Ryan Greenblatt
2mo
There, I'm just saying that human control is better than literal paperclip maximization.

When I say that people are partial to humanity, I'm including an irrational bias towards thinking that humans, or evolved beings, are unusually thoughtful or ethical compared to the alternatives (I believe this is in fact an irrational bias, since the arguments I've seen for thinking that unaligned AIs will be less thoughtful or ethical than aliens seem very weak to me).

In other cases, when people irrationally hold a certain group X to a higher standard than a group Y, it is routinely described as "being partial to group Y over group X". I think this is ju... (read more)

1
Ryan Greenblatt
2mo
Also, to be clear, I agree that the question of "how much worse/better is it for AIs to get vast amounts of resources without human society intending to grant those resources to the AIs from a longtermist perspective" is underinvestigated, but I think there are pretty good reasons to systematically expect human control to be a decent amount better.
3
Ryan Greenblatt
2mo
In that case, my main disagreement is with the idea that your twitter poll is evidence for your claims. More specifically: Like you claim there aren't any defensible reasons to think that what humans will do is better than literally maximizing paper clips? This seems totally wild to me.

This seems to underrate the arguments for Malthusian competition in the long run.

I'm mostly talking about what I expect to happen in the short-run in this thread. But I appreciate these arguments (and agree with most of them).

Plausibly my main disagreement with the concerns you raised is that I think coordination is maybe not very hard. Coordination seems to have gotten stronger over time, in the long run. AI could also potentially make coordination much easier. As Bostrom has pointed out, historical trends point towards the creation of a Singleton.

I'm ... (read more)

The confusing thing about that is, what if EA activities are a key reason why good countermeasures end up being taken against AI?

I find that quite unlikely. I think EA activities contribute on the margin, but it seems very likely to me that people would eventually have taken measures against AI risk in the absence of any EA movement.

In general, while I agree we should not take this argument so far that EA ideas become "victims of their own success", I also think neglectedness is a standard barometer EAs have used to judge the merits of their int... (read more)

I think the fact that people are partial to humanity explains a large fraction of the disagreement people have with me. But, fair enough, I exaggerated a bit. My true belief is a more moderate version of that claim.

When discussing why EAs in particular disagree with me, to overgeneralize by a fair bit, I've noticed that EAs are happy to concede that AIs could be moral patients, but are generally reluctant to admit AIs as moral agents, in the way they'd be happy to accept humans as independent moral agents (e.g. newborns) into our society. I'd call this "be... (read more)

I think the fact that people are partial to humanity explains a large fraction of the disagreement people have with me.

Maybe, it's hard for me to know. But I predict most of the pushback you're getting from relatively thoughtful longtermists isn't due to this.

I've noticed that EAs are happy to concede that AIs could be moral patients, but are generally reluctant to admit AIs as moral agents, in the way they'd be happy to accept humans as independent moral agents (e.g. newborns) into our society.

I agree with this.

I'd call this "being partial to humanity"

... (read more)

I don't think humanity is bad. I just think people are selfish, and generally driven by motives that look very different from impartial total utilitarianism. AIs (even potentially "random" ones) seem about as good in expectation, from an impartial standpoint. In my opinion, this view becomes even stronger if you recognize that AIs will be selected on the basis of how helpful, kind, and useful they are to users. (Perhaps notice how different these selection criteria are from the evolutionary pressures under which humans evolved.)

I understand that most people are pa... (read more)

6
Ryan Greenblatt
2mo
This is not why people disagree IMO.

And there is a distinction I haven't seen you acknowledge: while high "quality" doesn't require humans to be around, I ultimately judge quality by my values.

Is there any particular reason why you are partial towards humans generically controlling the future, relative to this particular current generation of humans? To me, it seems like being partial to one's own values, one's community, and especially one's own life, generally leads to an even stronger argument for accelerationism, since the best way to advance your own values is generally to actually "be t... (read more)

1
Pivocajs
1mo
I agree with this. I (strongly) disagree with this. Me being alive is a relatively small part of my values. And since I am not the director of the world, me personally being around to influence things is unlikely to have a decisive impact on things I value.

In more detail: Sure, all else being equal, me being there when AI happens is mildly helpful. But the outcome of building AI seems to be a function of, among other things, (i) values of the people building it + (ii) how much reflection they can do on those values + (iii) the environment dynamics these people are subject to (e.g., the current race dynamics between AI companies). And over time, I expect the potential decrease in (i) to be far outweighed by gains in (ii) and (iii).
  • The first issue is about (i): it is not actually me building the AGI, either now or in the future. But I am willing to grant that (all else being equal) the current generation is more likely to have values closer to my values.
  • However, I expect that factors (ii) and (iii) are just as influential. Regarding (ii), it seems we keep making progress at philosophy, ethics, etc., and to me, this currently far outweighs the value drift in (i).
  • Regarding (iii), my impression is that the current situation is so bad that it can't get much worse, and we might as well wait. This of course depends on how likely you think we are to get a bad outcome if we either (a) get superintelligence without additional progress on alignment or (b) get widespread human-level AI with no progress on alignment, institution design, etc.

If instead you believed the latter, that would set a significantly higher bar for unaligned AI, right?

That's right: if I thought human values would improve greatly in the face of enormous wealth and advanced technology, I'd definitely be open to seeing humans as special and extra valuable from a total utilitarian perspective. Note that many routes through which values could improve in the future could apply to unaligned AIs too. So, for example, I'd need to believe that humans would be more likely to reflect, and be more likely to do the right type of r... (read more)

I'm guessing preference utilitarians would typically say that only the preferences of conscious entities matter.

Perhaps. I don't know what most preference utilitarians believe.

I doubt any of them would care about satisfying an electron's "preference" to be near protons rather than ionized.

Are you familiar with Brian Tomasik? (He's written about the suffering of fundamental particles, and has also defended preference utilitarianism.)

I think Bostrom's argument merely weighs a pure x-risk (such as a huge asteroid hurtling towards Earth) against technological acceleration, and then concludes that reducing the probability of a pure x-risk is more important because the x-risk threatens the eventual colonization of the universe. I agree with this argument in the case of a pure x-risk, but as I noted in my original comment, I don't think that AI risk is a pure x-risk.

If, by contrast, all we're doing by doing AI safety research is influencing something like "the values of the agents in ... (read more)

I agree it's important to talk about and analyze the (relatively small) component of human values that is altruistic. I mostly just think this component is already over-emphasized.

Here's one guess at what I think you might be missing about my argument: 90% selfish values + 10% altruistic values isn't the same thing as, e.g., 90% valueless stuff + 10% utopia. The 90% selfish component can have negative effects on welfare from a total utilitarian perspective that aren't necessarily outweighed by the 10%.

90% selfish values is the type of thing that pr... (read more)

9
Ryan Greenblatt
2mo
Yep, this can be true, but I'm skeptical this will matter much in practice. I typically think things which aren't directly optimizing for value or disvalue won't have intended effects which are very important, and that in the future unintended effects (externalities) won't make up that much of total value/disvalue.

When we see the selfish consumption of current very rich people, it doesn't seem like the intentional effects are that morally good/bad relative to the best/worst uses of resources. (E.g. owning a large boat and having people think you're high status aren't that morally important relative to altruistic spending of similar amounts of money.) So for current very rich people the main issue would be that the economic process for producing the goods has bad externalities.

And, I expect that as technology advances, externalities reduce in moral importance relative to intended effects. Partially this is based on crazy transhumanist takes, but I feel like there is some broader perspective in which you'd expect this. E.g. for factory farming, the ultimately cheapest way to make meat in the limit of technological maturity would very likely not involve any animal suffering.

Separately, I think externalities will probably look pretty similar for selfish resource usage for unaligned AIs and humans because most serious economic activities will be pretty similar.
1
Ryan Greenblatt
2mo
I'd like to explicitly note that I don't think this is true in expectation for a reasonable notion of "selfish". Though maybe I think something which is sort of in this direction is true if we use a relatively narrow notion of altruism.

The idea that billionaires have 90% selfish values seems consistent with a claim of having "primarily selfish" values in my opinion. Can you clarify what you're objecting to here?

3
Ryan Greenblatt
2mo
The literal words of "primarily selfish" don't seem that bad, but I would maybe prefer "majority selfish"? And your top-level comment seems like it's not talking about/emphasizing the main reason to like human control, which is that maybe 10-20% of resources are spent well. It just seemed odd to me to not mention that "primarily selfish" still involves a pretty big fraction of altruism.

I agree your original argument was slightly different than the form I stated. I was speaking too loosely, and conflated what I thought Pablo might be thinking with what you stated originally. 

I think the important claim from my comment is "As far as I can tell, I haven't seen any argument in this thread that analyzed and compared the long-term effects in any detail, except perhaps in Ryan Greenblatt original comment, in which he linked to some other comments about a similar topic in a different thread (but I still don't see what the exact argument is)."

1
Ryan Greenblatt
2mo
Explicitly confirming that this seems right to me.

I was just claiming that the "indirect" effects dominate (by indirect, I just mean effects other than shifting the future closer in time).

I understand that. I wanted to know why you thought that. I'm asking for clarity. I don't currently understand your reasons. See this recent comment of mine for more info.

1
Ryan Greenblatt
2mo
(I don't think I'm going to engage further here, sorry.)

I was trying to hint at prima facie plausible ways in which the present generation can increase the value of the long-term future by more than one part in billions, rather than “assume” that this is the case, though of course I never gave anything resembling a rigorous argument.

As I understand it, the argument originally given was that there was a tiny effect from pushing for AI acceleration, which seems outweighed by unnamed and gigantic "indirect" effects in the long run from alternative strategies for improving the long-run future. I responded by trying to ge... (read more)

2
Pablo
2mo
Thanks for the clarification. Yes, I agree that we should consider the long-term effects of each intervention when comparing them. I focused on the short-term effects of hastening AI progress because it is those effects that are normally cited as the relevant justification in EA/utilitarian discussions of that intervention. For instance, those are the effects that Bostrom considers in ‘Astronomical waste’. Conceivably, there is a separate argument that appeals to the beneficial long-term effects of AI capability acceleration. I haven’t considered this argument because I haven’t seen many people make it, so I assume that accelerationist types tend to believe that the short-term effects dominate.
1
Ryan Greenblatt
2mo
To be clear, this wasn't the structure of my original argument (though it might be Pablo's). My argument was more like "you seem to be implying that action X is good because of its direct effect (literal first order acceleration), but actually the direct effect is small when considered from a particular perspective (longtermism), so from that perspective we need to consider indirect effects and the analysis for that looks pretty different". Note that I wasn't really trying to argue much about the sign of the indirect effect, though people have indeed discussed this in some detail in various contexts.

My stance is that we (more-or-less) know humans are conscious and have moral values that, while they have failed to prevent large amounts of harm, seem to have the potential to be good.

I claim there's a weird asymmetry here where you're happy to put trust in humans because they have the "potential" to do good, but you're not willing to say the same for AIs, even though they seem to have the same type of "potential".

Whatever your expectations about AIs, we already know that humans are not blank slates that may or may not be altruistic in the future: we ac... (read more)

7
Ben Millwood
2mo
I haven't read your entire post about this, but I understand you believe that if we created aligned AI, it would get essentially "current" human values, rather than e.g. some improved / more enlightened iteration of human values. If instead you believed the latter, that would set a significantly higher bar for unaligned AI, right?

It seems like you're just substantially more pessimistic than I am about humans. I think factory farming will be ended, and though it seems like humans have caused more suffering than happiness so far, I think their default trajectory will be to eventually stop doing that, and to ultimately do enough good to outweigh their ignoble past. I don't think this is certain by any means, but I think it's a reasonable extrapolation. (I maybe don't expect you to find it a reasonable extrapolation.)

Meanwhile I expect the typical unaligned AI may seize power for some ... (read more)

I basically buy that the values we get will be similar to just giving existing humans massive amounts of wealth, but I'm less sold that this will result in outcomes which are well described as "primarily selfish".

Current humans definitely seem primarily selfish (although I think they also care about their family and friends; I'm including that). Can you explain why you think giving humans a lot of wealth would turn them into something that isn't primarily selfish? What's the empirical evidence for that idea?

2
JWS
2mo
How are we defining selfish here? It seems like a pretty strong position to take on the topic of psychological egoism? Especially including family/friends in terms of selfish? In your original post, you say: But I don't know, it seems that as countries and individuals get wealthier, we seem to on the whole be getting better? Maybe factory farming acts against this, but the idea that factory farming is immoral and should be abolished exists and I think is only going to grow.

I don't think humans are just slaves to our base wants/desires, and think that is a remarkably impoverished view of both individual human psychology and social morality. As such, I don't really agree with much of this post. An AGI, when built, will be able to generate new ideas and hypotheses about the world, including moral ones. A strong-but-narrow AI could be worse (e.g. optimal-factory-farm-PT), but then the right response here isn't really technical alignment, it's AI governance and moral persuasion in general.
3
Ryan Greenblatt
2mo
The behavior of billionaires, which maybe indicates more like 10% of income spent on altruism. ETA: This is still literally majority selfish, but it's also plausible that 10% altruism is pretty great and looks pretty different than "current median person behavior with marginal money". (See my other comment about the percent of cosmic resources.)

For what it's worth, I think my reply to Pablo here responds to your comment fairly adequately too.

I'm claiming that it is not actually clear that we can take actions that don't merely wash out over the long term. In this case, you cannot simply assume that we can meaningfully and predictably affect how valuable the long-term future will be in, for example, billions of years. I agree that, yes, if you assume we can meaningfully affect the very long run, then all actions that merely have short-term effects will have "tiny" impacts by comparison. But the assumption that we can meaningfully and predictably affect the long run is precisely the thing that ne... (read more)

1
Ryan Greenblatt
2mo
I don't disagree with this. I was just claiming that the "indirect" effects dominate (by indirect, I just mean effects other than shifting the future closer in time). There is still the question of indirect/direct effects.
4
Pablo
2mo
I was trying to hint at prima facie plausible ways in which the present generation can increase the value of the long-term future by more than one part in billions, rather than “assume” that this is the case, though of course I never gave anything resembling a rigorous argument.

I do agree that the “washing out” hypothesis is a reasonable default and that one needs a positive reason for expecting our present actions to persist into the long-term. One seemingly plausible mechanism is influencing how a transformative technology unfolds: it seems that the first generation that creates AGI has significantly more influence on how much artificial sentience there is in the universe a trillion years from now than, say, the millionth generation. Do you disagree with this claim?

I’m not sure I understand the point you make in the second paragraph. What would be the predictable long-term effects of hastening the arrival of AGI in the short-term?

It seems to me that a big crux about the value of AI alignment work is what target you think AIs will ultimately be aligned to in the future in the optimistic scenario where we solve all the "core" AI risk problems to the extent they can be feasibly solved, e.g. technical AI safety problems, coordination problems, the problem of having "good" AI developers in charge etc.

There are a few targets that I've seen people predict AIs will be aligned to if we solve these problems: (1) "human values", (2) benevolent moral values, (3) the values of AI developers, (4... (read more)

2
aogara
2mo
This seems to underrate the arguments for Malthusian competition in the long run.

If we develop the technical capability to align AI systems with any conceivable goal, we'll start by aligning them with our own preferences. Some people are saints, and they'll make omnibenevolent AIs. Other people might have more sinister plans for their AIs. The world will remain full of human values, with all the good and bad that entails.

But current human values do not maximize our reproductive fitness. Maybe one human will start a cult devoted to sending self-replicating AI probes to the stars at almost light speed. That person's values will influence far-reaching corners of the universe that later humans will struggle to reach. Another human might use their AI to persuade others to join together and fight a war of conquest against a smaller, weaker group of enemies. If they win, their prize will be hardware, software, energy, and more power that they can use to continue to spread their values.

Even if most humans are not interested in maximizing the number and power of their descendants, those who are will have the most numerous and most powerful descendants. This selection pressure exists even if the humans involved are ignorant of it; even if they actively try to avoid it.

I think it's worth splitting the alignment problem into two quite distinct problems:
  1. The technical problem of intent alignment. Solving this does not solve coordination problems. There will still be private information and coordination problems after intent alignment is solved; therefore we'll still face coordination problems, fitter strategies will proliferate, and the world will be governed by values that maximize fitness.
  2. "Civilizational alignment"? Much harder problem to solve. The traditional answer is a Leviathan, or Singleton as the cool kids have been saying. It solves coordination problems, allowing society to coherently pursue a long-run objective such as flourishing rather
6
MichaelStJules
2mo
EDIT: I guess I'd think of human values as what people would actually just sincerely and directly endorse without further influencing them first (although maybe just asking them makes them take a position if they didn't have one before, e.g. if they've never thought much about the ethics of eating meat).

I think you're overstating the differences between revealed and endorsed preferences, including moral/human values, here. Probably only a small share of the population thinks eating meat is wrong or bad, and most probably think it's okay. Even if people generally would find it wrong or bad after reflecting long enough (I'm not sure they actually would), that doesn't reflect their actual values now. Actual human values do not generally find eating meat wrong.

To be clear, you can still complain that humans' actual/endorsed values are also far from ideal and maybe not worth aligning with, e.g. because people don't care enough about nonhuman animals or helping others. Do people care more about animals and helping others than an unaligned AI would, in expectation, though? Honestly, I'm not entirely sure.

Humans may care about animal welfare somewhat, but they also specifically want to exploit animals in large part because of their values, specifically food-related taste, culture, traditions and habit. Maybe people will also want to specifically exploit artificial moral patients for their own entertainment, curiosity or scientific research on them, not just because the artificial moral patients are generically useful, e.g. for acquiring resources and power and enacting preferences (which an unaligned AI could be prone to).
5
Ryan Greenblatt
2mo
I basically buy that the values we get will be similar to just giving existing humans massive amounts of wealth, but I'm less sold that this will result in outcomes which are well described as "primarily selfish". I feel like your comment is equivocating between "the situation is similar to making existing humans massively wealthy" and "of course this will result in primarily selfish usage similar to how the median person behaves with marginal money now".
3
Ryan Greenblatt
2mo
What percent of cosmic resources do you expect to be spent thoughtfully and altruistically? 0%? 10%? I would guess the thoughtful and altruistic subset of resources dominates in most scenarios where humans retain control. Then, my main argument for why human control would be good is that the fraction isn't that small (more like 20% in expectation than 0%) and that unaligned AI takeover seems probably worse than this.

Also, as an aside, I agree that little good public argumentation has been made about the relative value of unaligned AI control vs human control. I'm sympathetic to various discussions from Paul Christiano and Joe Carlsmith, but the public scope and detail is pretty limited thus far.

Under purely longtermist views, accelerating AI by 1 year increases available cosmic resources by 1 part in 10 billion. This is tiny.
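(A back-of-envelope sketch of where a figure like this can come from, assuming the quoted number treats the cosmic endowment as reachable over a horizon on the order of 10 billion years; this is an illustrative reading, not necessarily the commenter's own derivation.)

```latex
% Sketch under the stated assumption: reachable cosmic resources R shrink on a
% ~10^10-year timescale, so shifting timelines by 1 year changes them by roughly
\frac{\Delta R}{R} \;\approx\; \frac{1\ \text{year}}{10^{10}\ \text{years}} \;=\; 10^{-10}
\qquad \text{(about 1 part in 10 billion).}
```

On that reading, "tiny" means tiny as a fraction of total future resources, which is the framing the reply below pushes back on.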

Tiny compared to what? Are you assuming we can take some other action whose consequences don't wash out over the long-term, e.g. because of a value lock-in? In general, these assumptions just seem quite weak and underspecified to me.

What exactly is the alternative action that has vastly greater value in expectation, and why does it have greater value? If what you mean is that we can try to reduce the risk of extinction ins... (read more)

3
Ryan Greenblatt
2mo
Ensuring human control throughout the singularity rather than having AIs get control very obviously has relatively massive effects. Of course, we can debate the sign here; I'm just making a claim about the magnitude. I'm not talking about extinction of all smart beings on earth (AIs and humans), which seems like a small fraction of existential risk. (Separately, the badness of such extinction seems maybe somewhat overrated because intelligent life will pretty likely just re-evolve in the next 300 million years. Intelligent life doesn't seem that contingent. Also aliens.)
3
Pablo
2mo
I think it remains the case that the value of accelerating AI progress is tiny relative to other apparently available interventions, such as ensuring that AIs are sentient or improving their expected well-being conditional on their being sentient. The case for focusing on how a transformative technology unfolds, rather than on when it unfolds,[1] seems robust to a relatively wide range of technologies and assumptions. Still, this seems worth further investigation.

1. ^ Indeed, it seems that when the transformation unfolds is primarily important because of how it unfolds, insofar as the quality of a transformation is partly determined by its timing.

It would be surprising to me if making the transfer of power more voluntary/careful led to worse outcomes (or only led to slightly better outcomes such that the downsides of slowing down a bit made things worse).

Two questions here:

  1. Why would accelerating AI make the transition less voluntary? (In my own mind, I'd be inclined to reverse this sentiment a bit: delaying AI by regulation generally involves forcibly stopping people from adopting AI. Force might be justified if it brings about a greater good, but that's not the argument here.)
  2. I can understand bein
... (read more)
4
elifland
2mo
  1. So in the multi-agent slowly-replacing case, I'd argue that individual decisions don't necessarily represent a voluntary decision on behalf of society (I'm imagining something like this scenario). In the misaligned power-seeking case, it seems obvious to me that this is involuntary. I agree that it technically could be a collective voluntary decision to hand over power more quickly, though (and in that case I'd be somewhat less against it).
  2. I think emre's comment lays out the intuitive case for being careful / taking your time, as does Ryan's. I think the empirics are a bit messy once you take into account the benefits of preventing other risks, but I'd guess they come out in favor of delaying by at least a few years.

It's very likely that whatever change comes from AI development will be irreversible.

I think all actions are in a sense irreversible, but large changes tend to be less reversible than small changes. In this sense, the argument you gave seems reducible to "we should generally delay large changes to the world, to preserve option value". Is that a reasonable summary?

In this case I think it's just not obvious that delaying large changes is good. Would it have been good to delay the industrial revolution to preserve option value? I think this heuristic, if used in the past, would have generally demanded that we "pause" all sorts of social, material, and moral progress, which seems wrong.

6
Michael_PJ
2mo
I don't think we would have been able to use the additional information we would have gained from delaying the industrial revolution, but I think if we could have, the answer might be "yes". It's easy to see in hindsight that it went well overall, but that doesn't mean that the correct ex ante attitude shouldn't have been caution!

I'm curious why there hasn't been more work exploring a pro-AI or pro-AI-acceleration position from an effective altruist perspective. Some points:

  1. Unlike existential risk from other sources (e.g. an asteroid), AI x-risk is unique because humans would be replaced by other beings, rather than completely dying out. This means you can't simply apply a naive argument that AI threatens total extinction of value to make the case that AI safety is astronomically important, in the sense that you can for other x-risks. You generally need additional assumptions.
  2. Total
... (read more)
2
Vasco Grilo
2mo
Great points, Matthew! I have wondered about this too. Relatedly, readers may want to check the sequence otherness and control in the age of AGI from Joe Carlsmith, in particular Does AI risk “other” the AIs? One potential argument against accelerating AI is that it will increase the chance of catastrophes, which will then lead to overregulating AI (e.g. in the same way that nuclear power arguably was overregulated).
5
JWS
2mo
So I think it's likely you have some very different beliefs from most people/EAs/myself, particularly:
  1. Thinking that humans/humanity is bad, and AI is likely to be better
  2. Thinking that humanity isn't driven by ideational/moral concerns[1]
  3. That AI is very likely to be conscious, moral (as in, making better moral judgements than humans), and that the current/default trend in the industry is very likely to make them conscious moral agents in a way humans aren't

I don't know if the total utilitarian/accelerationist position in the OP is yours or not. I think Daniel is right that most EAs don't have this position. I think maybe Peter Singer gets closest to this in his interview with Tyler on the 'would you side with the Aliens or not question' here. But the answer to your descriptive question is simply that most EAs don't have the combination of moral and empirical views about the world to make the argument you present valid and sound, so that's why there isn't much talk in EA about naïve accelerationism.

Going off the vibe I get from this view though, I think it's a good heuristic that if your moral view sounds like a movie villain's monologue it might be worth reflecting, and a lot of this post reminded me of the Earth-Trisolaris Organisation from Cixin Liu's Three Body Problem. If someone's honest moral view is "Eliminate human tyranny! The world belongs to Trisolaris AIs!" then I don't know what else there is to do except quote Zvi's phrase "please speak directly into this microphone".

Another big issue I have with this post is that some of the counter-arguments just seem a bit like 'nu-uh', see: These (and other examples) are considerations for sure, but they need to be argued for. I don't think they can just be stated and then say "therefore, ACCELERATE!". I agree that AI Safety research needs to be more robust and the philosophical assumptions and views made more explicit, but one could already think of some counters to the questions that you rai
8
Arepo
2mo
I generally agree that we should be more concerned about this. In particular, I find people who will happily approve Shut Up and Multiply sentiment but reject this consideration (such as Eliezer) suspect in their reasoning.

A more extreme version of this is that, given the massively greater efficiency with which a digital consciousness could convert matter and energy to utilons (IIRC naively about 3 orders of magnitude according to Bostrom, before any increase from greater coordination), on strict expected value reasoning you have to be extremely confident that this won't happen - or at least have a much stronger rebuttal than 'AI won't necessarily be conscious'.

Separately, I think there might be a case for accelerationism even if you think it increases the risk of AI takeover and that AI takeover is bad, on the grounds that in many scenarios advancing faster might still increase the probability of human descendants getting through the time of perils before some other threat destroys us (every year we remain in our current state is another year in which we run the risk of, for example, a global nuclear war or civilisation-ending pandemic).
3
Pivocajs
2mo
My personal reason for not digging into this is that my naive model of how good the AI future is comes down to quality_of_future * amount_of_the_stuff. And there is a distinction I haven't seen you acknowledge: while high "quality" doesn't require humans to be around, I ultimately judge quality by my values. (A thing being conscious is an example. But this also includes things like not copy-pasting the same thing all over, not wiping out aliens, and presumably many other things I am not aware of. IIRC Yudkowsky talks about cosmopolitanism being a human value.)

Because of this, my impression is that if we hand over the future to a random AI, the "quality" will be very low. And so we can currently have a much larger impact by focusing on increasing the quality, which we can do by delaying "handing over the future to AI" and picking a good AI to hand over to, i.e. alignment.

(Still, I agree it would be nice if there was a better analysis of this, which exposed the assumptions.)
  1. My understanding is that relatively few EAs are actual hardcore classic hedonist utilitarians. I think this is ~sufficient to explain why more haven't become accelerationists.
  2. Have you cornered a classic hedonist utilitarian EA and asked them? Have you cornered three? What did they say?
4
Robi Rahman
2mo
I'm guessing preference utilitarians would typically say that only the preferences of conscious entities matter. I doubt any of them would care about satisfying an electron's "preference" to be near protons rather than ionized.
8
Ben Millwood
2mo
A lot of these points seem like arguments that it's possible that unaligned AI takeover will go well, e.g. that there's no reason not to think that AIs are conscious, or will have interesting moral values, etc.

My stance is that we (more or less) know humans are conscious and have moral values that, while they have failed to prevent large amounts of harm, seem to have the potential to be good. AIs may be conscious and may have welfare-promoting values, but we don't know that yet. We should try to better understand whether AIs are worthy successors before transitioning power to them.

Probably a core point of disagreement here is whether, presented with a "random" intelligent actor, we should expect it to promote welfare or prevent suffering "by default". My understanding is that some accelerationists believe that we should. I believe that we shouldn't. Moreover, I believe that it's enough to be substantially uncertain about whether this is or isn't the default to want to take a slower and more careful approach.

AI x-risk is unique because humans would be replaced by other beings, rather than completely dying out. This means you can't simply apply a naive argument that AI threatens total extinction of value

Paul Christiano wrote a piece a few years ago about ensuring that misaligned ASI is a “good successor” (in the moral value sense),[1] as a plan B to alignment (Medium version; LW version). I agree it’s odd that there hasn’t been more discussion since.[2]

Here's a non-exhaustive list of guesses for why I think EAs haven't historically been sympathetic

... (read more)
7
Ryan Greenblatt
2mo
I don't think this is a crux. Even if you prefer unaligned AI values over likely human values (weighted by power), you'd probably prefer doing research on further improving AI values over speeding things up.

Under purely longtermist views, accelerating AI by 1 year increases available cosmic resources by 1 part in 10 billion. This is tiny. So the first order effects of acceleration are tiny from a longtermist perspective.
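To spell out the arithmetic behind that figure (a rough sketch on my part, under the assumption that the resources reachable from Earth remain usable for on the order of $10^{10}$ years, so each year of delay forfeits roughly one part in that total):

$$
\frac{\text{resources gained by 1 year of acceleration}}{\text{total cosmic resources}} \;\approx\; \frac{1\ \text{year}}{10^{10}\ \text{years}} \;=\; 10^{-10},
$$

i.e. about one part in 10 billion, which is why the direct stakes of a one-year shift are so small on this view.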

Thus, a purely longtermist perspective doesn't care about the direct effects of delay/acceleration, and the question would come down to indirect effects.

I can see indirect effects going either way, but delay seems better on current margins (this might depend on how much optimism you have on current AI safety progress, governance/policy progress... (read more)

6
Isaac Dunn
2mo
I think misaligned AI values should be expected to be worse than human values, because it's not clear that misaligned AI systems would care about, e.g., their own welfare. Inasmuch as we expect misaligned AI systems to be conscious (or whatever we need to care about them) and also to be good at looking after their own interests, I agree that it's not clear from a total utilitarian perspective that the outcome would be bad. But the "values" of a misaligned AI system could be pretty arbitrary, so I don't think we should expect that.
4
Nathan_Barnard
2mo
Strongly agree that there should be more explicit defences of this argument. One way of doing this in a co-operative way might be working on co-operative AI stuff, since it seems to increase the likelihood that misaligned AI goes well, or at least less badly.
8
elifland
2mo
(edit: my point is basically the same as emre's)

I think there is very likely at some point going to be some sort of transition to a world where AIs are effectively in control. It seems worth it to slow down on the margin to try to shape this transition as best we can, especially slowing it down as we get closer to AGI and ASI.

It would be surprising to me if making the transfer of power more voluntary/careful led to worse outcomes (or only led to slightly better outcomes, such that the downsides of slowing down a bit made things worse). Delaying the arrival of AGI by a few years as we get close to it seems good regardless of parameters like the value of involuntary-AI-disempowerment futures, but delaying the arrival by hundreds of years seems more likely to be bad due to the tradeoff with other risks.

I think a more important reason is the additional value of the information and the option value. It's very likely that the change resulting from AI development will be irreversible. Since we're still able to learn about AI as we study it, taking additional time to think and plan before training the most powerful AI systems seems to reduce the likelihood of being locked into suboptimal outcomes. Increasing the likelihood of achieving "utopia" rather than landing in "mediocrity" by 2 percent seems far more important than speeding up utopia by 10 years.
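As a rough illustration of that comparison (my own back-of-envelope sketch, writing $V_u$ and $V_m$ for the value of the "utopia" and "mediocrity" outcomes, and reusing the ~1-part-in-10-billion-per-year figure from above):

$$
\underbrace{0.02\,(V_u - V_m)}_{\text{2\% better odds of utopia}}
\quad \text{vs.} \quad
\underbrace{\frac{10\ \text{years}}{10^{10}\ \text{years}}\,V_u = 10^{-9}\,V_u}_{\text{utopia arrives 10 years sooner}}
$$

Unless the gap between utopia and mediocrity is astronomically small relative to $V_u$, the option-value term on the left dominates by many orders of magnitude.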

In response to human labor being automated, a lot of people support a UBI funded by a tax on capital. I don't think this policy is necessarily unreasonable, but if later the UBI gets extended to AIs, this would be pretty bad for humans, whose only real assets will be capital.

As a result, the unintended consequence of such a policy may be to set a precedent for a massive wealth transfer from humans to AIs. This could be good if you are utilitarian and think the marginal utility of wealth is higher for AIs than humans. But selfishly, it's a big cost.

What reason is there to think that AI will shift the offense-defense balance absurdly towards offense? I admit such a thing is possible, but it doesn't seem like AI is really the issue here. Can you elaborate?

2
Ryan Greenblatt
2mo
I think the main abstract argument for why this is plausible is that AI will change many things very quickly and in a high-variance way, and some human processes will lag behind heavily. This could plausibly (though not obviously) lead to offense dominance.
2
Chris Leong
2mo
I'm not going to fully answer this question, b/c I have other work I should be doing, but I'll toss in one argument. If different domains (cyber, bio, manipulation, etc.) have different offense-defense balances, a sufficiently smart attacker will pick the domain with the worst balance. This recurses down further for at least some of these domains, where they aren't just a single thing but a broad collection of vaguely related things.

Your argument in objection 1 doesn't the position people who are worried about an absurd offense-defense imbalance.

I'm having trouble parsing this sentence. Can you clarify what you meant?

Additionally: It may be that no agent can take over the world, but that an agent can destroy the world.

What incentive is there to destroy the world, as opposed to take it over? If you destroy the world, aren't you sacrificing yourself at the same time?

4
Chris Leong
2mo
Oh, I can see why it is ambiguous. I meant whether it is easier to attack or defend, which is separate from the "power" attackers and defenders have.

"What incentive is there to destroy the world, as opposed to take it over? If you destroy the world, aren't you sacrificing yourself at the same time?"

Some would be willing to do that if they can't take it over.