Lukas_Gloor

Comments

The point I wanted to make in the short form was directed at a particular brand of skeptic. 

When I said,

Something has gone wrong if people think pausing can only make sense if the risks of AI ruin are >50%.

I didn't mean to imply that anyone who opposes pausing would consider >50% ruin levels their crux.

Likewise, I didn't mean to imply that "let's grant 5% risk levels" is something that every skeptic would go along with (but good that your comment is making this explicit!). 

For what it's worth, if I had to give a range for how much the people I currently respect most epistemically can disagree on this question today (June 2024), I would probably not include credences <<5% in that range (I'd maybe put it at more like 15-90%?). (This is of course subject to change if I encounter surprisingly good arguments for something outside the range.) But that's a separate(!) discussion, separate from the conditional statement that I wanted to argue for in my short form. (Obviously, other people will draw the line elsewhere.)

On the 80k article, I think it aged less well than what one could have written at the time, but it was written when AI risk concerns still seemed fringe. So, just because it didn't (in my view) age amazingly doesn't mean it was unreasonable at the time. Back then, I'd probably have called it "lower than what I would give, but within the range of what I consider reasonable."

Yeah, I agree. I wrote about timing considerations here; I agree this is an important part of the discussion.

[AI pause can be the best course of action even if the baseline risk of AI ruin were "only" 5%]

Some critics of pausing frontier AI research argue that AI pause would be an extreme measure (curtailing tech progress, which has historically led to so much good) that is only justifiable if we have very high confidence that the current path leads to ruin.

On the one hand, I feel like the baseline risk is indeed very high (I put it at 70% all-things-considered).

At the same time, I'm frustrated that there's so much discussion of "is the baseline risk really that high?" compared to "is not pausing really the optimal way for us to go into this major civilizational transition?"

I feel like arguing about ruin levels can be a distraction from what should be a similarly important crux. Something has gone wrong if people think pausing can only make sense if the risks of AI ruin are >50%.

The key question is: Will pausing reduce the risks?

Even if the baseline risk were "only" 5%, assuming we have a robust argument that pausing for (say) five years will reduce it to (say) 4%, that would clearly be good! (It would be very unfortunate for the people who will die preventable deaths in the next five years, but pausing would still be better on net, on these assumptions.)
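
To make the arithmetic concrete, here's a toy back-of-the-envelope comparison (a minimal sketch; all numbers, including the cost of delay, are made-up placeholders rather than estimates I'd defend):

```python
# Toy expected-value comparison of "continue" vs. "pause for five years".
# All numbers are illustrative placeholders, not actual estimates.

p_ruin_continue = 0.05   # assumed baseline risk of AI ruin if we continue
p_ruin_pause = 0.04      # assumed risk after a five-year pause
value_of_future = 1.0    # normalize the value of a non-ruined long-run future
cost_of_delay = 0.001    # assumed cost of five years of delayed benefits (same units)

ev_continue = (1 - p_ruin_continue) * value_of_future
ev_pause = (1 - p_ruin_pause) * value_of_future - cost_of_delay

print(f"EV(continue) = {ev_continue:.3f}")  # 0.950
print(f"EV(pause)    = {ev_pause:.3f}")     # 0.959
```

On these (made-up) numbers, pausing comes out ahead even at "only" 5% baseline risk; the real argument is then about whether pausing actually delivers the risk reduction and how large the costs of delay are.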

So, what assumptions would have to be true for continuing ahead to be better than pausing?

(Also, if someone is worried that there are negative side effects from pausing, such as that it'll be politically/societally hard to re-start things later on after alignment research has made breakthroughs, or that some country unaligned with Western values is getting closer to building TAI itself, that's a discussion worth having! However, then we have to look at "the best implementation of pausing we can realistically get if we advocate for the smartest thing," and not "a version of pausing that makes no attempts whatsoever to reduce bad side effects.")

That's good.

2. The author distinguishes between "utilitarianism as a personal goal" and utilitarianism as the single true morality everyone must adopt.

And I argue (or link to arguments in previous posts) that the latter interpretation isn't defensible. Utilitarianism as the true morality would have to be based on an objective axiology, but there's likely no such thing (only subjective axiologies).

Maybe also worth highlighting is that the post contains an argument about how we can put person-affecting views on more solid theoretical grounding. (This goes more into the weeds, but it's a topic that comes up a lot in EA discourse.) Here's a summary of that argument: 

  • The common arguments against person-affecting views seem to be based on the assumption, "we want an overarching framework that tells us what's best for both existing/sure-to-exist and possible people at the same time." 
  • However, since (so I argue) there's no objective axiology, it's worth asking whether this is maybe too steep of a requirement?
  • Person-affecting views seem well-grounded if we view them as a deliberate choice between two separate perspectives, where the non-person-affecting answer is "adopt a subjective axiology that tells us what's best for newly created people," and the person-affecting answer is "leave our axiology under-defined."
  • Leaving one's subjective axiology under-defined means that many actions we can take that affect new people will be equally "permissible." 
  • Still, this doesn't mean "anything goes," since we'll still have some guidance from minimal morality: In the context of creating new people/beings, minimal morality implies that we should (unless it's unreasonably demanding) not commit actions that are objectionable according to all plausible subjective axiologies.
  • Concretely, this means that it's permissible to do a range of things even if they are neither what's best on anti-natalist grounds, nor what's best on totalist grounds, as long as we don't do something that's bad on both these grounds. 

However, I suspect we should pick A instead. With Z available, A+ seems too unfair to the contingent people and too partial to the necessary/present people. Once the contingent people exist, Z would have been better than A+. And if Z is still an option at that point, we’d switch to it. So, anticipating this reasoning, whether or not we can later make the extra people better off, I suspect we should rule out A+ first, and then select A over Z.

I can imagine myself as one of the original necessary people in A. If we picked A+, I'd judge that to be too selfish of us and too unkind to the extra people relative to the much fairer Z. All of us together, with the extra people, would collectively judge Z to have been better. From my impartial perspective, I would then regret the choice of A+. On the other hand, if we (the original necessary people) collectively decide to stick with A to avoid Z and the unkindness of A+ relative to Z, it's no one else's business. We only hurt ourselves relative to A+. The extra people won't be around to have any claims.

Well said! I think that's what a well-constructed account of person-affecting views should say, and I think we can indeed say that without running into contradictions or other counterintuitive conclusions elsewhere (but it's worth working this out in detail). The rationale I would give is something like the following. "If we decide to create new people (which is never in itself important, but we might still want to do it for various reasons), it's important that we take into consideration the interests of these new people. This rules out some ways of creating new people [like A+ when Z is available] where they're worse off than they could be, at no compensatory moral gain." 

Some people might think it's strange that the acceptability of A+ depends on whether Z is also available. However, I think that's actually okay, and we can see that it is okay if we think about common-sense practical examples: I'd say it's acceptable today for people to have children even if they aren't particularly rich or generally best-positioned to give their children the best lives (assuming they're doing their best to have the happiest children that they can, and if we can expect the children to be happy on average). However, imagine a utopia where everyone is "rich" (or better yet: where money no longer matters because there's an abundance of everything) and well-positioned to give their children the best lives. In that utopia, it would be wrong to create children with unnecessarily restricted access to the utopia's resources, who will then only be as well off as the average child in our reality in 2024.

So, most people already believe that the ethics around having children are sensitive to the parents' means and their available options. (In the example with populations A, A+, and Z, we can think of the people in "A" as the "parent population." The parent population in A has the means to have lots of children at welfare level 3, but in A+, they decide to only have these children at welfare level 1 instead for comparatively trivial self-oriented gain. That doesn't seem okay, but it would be okay if A+ was them doing the best they could.)
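
To make the three options concrete, here's a minimal sketch with illustrative numbers (only the children's welfare levels, 1 in A+ versus 3, come from the paragraph above; the population sizes and the parents' levels are hypothetical, and nothing in the argument hangs on the totals or averages computed here):

```python
# Illustrative A / A+ / Z comparison. Each entry is (group, number of people, welfare level).
# Only the children's levels (1 in A+, 3 in Z) come from the text above;
# the rest are hypothetical numbers chosen to make the structure visible.

populations = {
    "A":  [("parents", 10, 5.0)],                          # parents only, no children
    "A+": [("parents", 10, 5.2), ("children", 10, 1.0)],   # trivial gain for parents, children at level 1
    "Z":  [("parents", 10, 3.0), ("children", 10, 3.0)],   # everyone equally at level 3
}

for name, groups in populations.items():
    total = sum(n * w for _, n, w in groups)
    size = sum(n for _, n, w in groups)
    print(f"{name}: {size} people, total welfare {total:.1f}, average {total / size:.2f}")
```

The point being argued above isn't read off these aggregates: it's that, with Z available, A+ shortchanges the contingent people (level 1 rather than 3) for a comparatively trivial gain to the parents, which is why A+ gets ruled out first.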

This is still at the brainstorming stage; I think there's probably a convincing line of argument for "AI alignment difficulty is high at least on priors" that includes the following points:

  • Many humans don't seem particularly aligned to "human values" (not just thinking of dark triad traits, but also things like self-deception, cowardice, etc.)
  • There's a loose analogy where AI is "more technological progress," and "technological progress" so far hasn't always been aligned to human flourishing (it has solved or improved a lot of long-term problems of civilization, like infant mortality, but has also created some new ones, like political polarization, obesity, unhappiness from constant bombardment with images of people who are richer and more successful than you, etc.). So, based on this analogy, why think things will somehow fall into place with AI training so that the new powers that be will for once become aligned?
  • AI will accelerate everything, and if you accelerate something that isn't set up in a secure way, it goes off the rails ("small issues will be magnified").

My guess is that, even then, there'll be a lot of people for whom it remains counterintuitive. (People may no longer use the strong word "repugnant" to describe it, but I think many will still find it counterintuitive.)

Which would support my point that many people find the repugnant conclusion counterintuitive not (just) because of aggregation concerns, but also because they have the intuition that adding new people doesn't make things better.

I just meant that my impression was that person-affecting views seem fairly orthogonal to the Repugnant Conclusion specifically. I imagine that many person-affecting believers would agree with this. Or, I assume that it's very possible to hold any combination of [strongly caring about the repugnant conclusion] or [not caring about it] with [having person-affecting views] or [not having them].

The (very briefly explained) example I mentioned is meant as something like,
Say there's a trolley problem. You could either accept scenario (A), where 100 people with happy lives are saved, or scenario (B), where 10,000 people with sort-of-decent lives are saved.

My guess was that this would still be an issue in many person-affecting views (I might well be wrong here though, feel free to correct me!). To me, this question is functionally equivalent to the Repugnant Conclusion. 

I'm pretty confident you're wrong about this. (Edit: I mean, you're right if you call it "repugnant conclusion" whenever we talk about choosing between a small very happy population and a sufficiently larger less happy one; however, my point is that it's no coincidence that people most often object to favoring the larger population over the smaller one in contexts of population ethics, i.e., when the populations are not already both in existence.) 
I've talked to a lot of suffering-focused EAs. Of the people who feel strongly about rejecting the repugnant conclusion in population ethics, at best only half feel that aggregation is altogether questionable. More importantly, even in those that feel that aggregation is altogether questionable, I'm pretty sure that's a separate intuition for them (and it's only triggered when we compare something as mild as dust specks to extremes like torture). Meaning, they might feel weird about "torture vs dustspecks," but they'll be perfectly okay with "there comes a point where letting a trolley run over a small paradise is better than letting it run over a sufficiently larger population of less happy (but still overall happy) people on the other track." By contrast, the impetus of their reaction to the original repugnant conclusion comes from the following. When they hear a description of "small-ish population with very high happiness," their intuition goes "hmm, that sounds pretty optimal," so they're not interested in adding costs just to add more happiness moments (or net happy lives) to the total.

To pass the Ideological Turing test for most people who don't want to accept the repugnant conclusion, you IMO have to engage with the intuition that it isn't morally important to create new happy people. (This is also what person-affecting views try to build on.)

I haven't done explicit surveys of this, but I'm still really confident that I'm right about this being what non-totalists in population ethics base their views on, and I find it strange that pretty much* every time totalists discuss the repugnant conclusion, they don't seem to see this.

(For instance, I've pointed this out here on the EA forum at least once to Gregory Lewis and Richard Yetter-Chappell (so you're in good company, but what is going on?))

*For an exception, this post by Joe Carlsmith doesn't mention the repugnant conclusion directly, but it engages with what I consider to be more crux-y arguments and viewpoints in relation to it.

Somewhat relatedly, what about using AI to improve not your own (or your project's) epistemics, but public discourse? Something like "improve news" or "improve where people get their info on controversial topics."

Edit: To give more context, I was picturing something like training LLMs to pass ideological Turing tests and then create summaries of the strongest arguments for and against, as well as takedowns of common arguments by each side that are clearly bad. And maybe combine that with commenting on current events as they unfold (to gain traction), handling the tough balance of having to compete in the attention landscape while still adhering to high epistemic standards. The goal then being something like "trusted source of balanced reporting," which you can later direct to the issues that matter most (after gaining traction earlier by discussing all sorts of things).
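
As a rough sketch of the kind of pipeline I have in mind (illustrative only: `balanced_brief` and its prompts are hypothetical placeholders, and `generate` stands in for whatever LLM call you'd actually use):

```python
from typing import Callable

def balanced_brief(topic: str, generate: Callable[[str], str]) -> str:
    """Draft a two-sided brief that aims to pass an ideological Turing test.

    `generate` is a stand-in for an LLM call (any chat API would do); the
    prompts are illustrative placeholders, not a tested recipe.
    """
    case_for = generate(
        f"State the strongest case FOR this position, as its most thoughtful "
        f"proponents would put it: {topic}"
    )
    case_against = generate(
        f"State the strongest case AGAINST this position, as its most thoughtful "
        f"opponents would put it: {topic}"
    )
    weak_arguments = generate(
        f"List common but clearly weak arguments made by EACH side on '{topic}', "
        f"and briefly explain why they fail."
    )
    return (
        f"STRONGEST CASE FOR:\n{case_for}\n\n"
        f"STRONGEST CASE AGAINST:\n{case_against}\n\n"
        f"WEAK ARGUMENTS ON BOTH SIDES:\n{weak_arguments}"
    )
```

The hard part, as noted, would be the editorial layer on top of something like this: picking which current events to cover and keeping the output engaging enough to compete for attention while holding to high epistemic standards.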

Off the cuff answers that may change as I reflect more:

  • Maybe around 25% of people in leadership positions in the EA ecosystem qualify? Somewhat lower for positions at orgs that are unusually "ambitious"; somewhat higher for positions that are more like "iterate on a proven system" or "have a slow-paced research org that doesn't involve itself too much in politics."
  • For the ambitious leaders, I unfortunately have no examples where I feel particularly confident, but can think of a few examples where I'm like "from a distance, it looks like they might be good leaders." I would count Holden in that category, even though I'd say the last couple of years seem suboptimal in terms of track record (and also want to flag that this is just a "from a distance" impression, so don't put much weight on it).
  • Why we're bad at identifying them: This probably isn't the only reason, but the task is just hard. If you look at people who have ambitious visions and are willing to try hard to make them happen, they tend to be above average on dark triad traits. You probably want someone who is very much not high on psychopathic traits, but still low enough on neuroticism that they won't be anxious all the time. Similarly, you want someone who isn't too high on narcissism, but they still need to have that ambitious vision and belief in being exceptional. You want someone who is humble and has inner warmth so they will uplift others along the way, i.e., high on the honesty-humility factor, but that factor correlates with agreeableness and neuroticism – which is a potential problem because you probably can't be too agreeable in the startup world or when running an ambitious org generally, and you can't be particularly neurotic.
    • (Edit) Another reason is, I think people often aren't "put into leadership positions" by others/some committee; instead, they put themselves there. Like, usually there isn't some committee with a great startup idea looking for a leader; instead, the leader comes with the vision and accumulates followers based on their conviction. And most people who aren't in leadership positions simply aren't vigilant or invested enough to care a lot about who becomes a leader. 

I think incentives matter, but I feel like if they're all that matters, then we're doomed anyway, because "Who will step up as a leader to set good incentives?" In other words, the position "incentives are all that matters" seems self-defeating, because to change things, you can't just sit on the sidelines and criticize "the incentives" or "the system." It also seems too cynical: just because, e.g., lots of money is at stake, that doesn't mean people who were previously morally motivated, cautious about their motivations, and trying to do the right thing will suddenly go off the rails.

To be clear, I think there's probably a limit for everyone and no person is forever safe from corruption, but my point is that it matters where on the spectrum someone falls. Even though most people who are low on corruptibility don't like power or would flail around helplessly and hopelessly if they had it, some of them probably have the right mix of traits to create, maintain, and grow pockets of sanity (well-run, well-functioning organizations, ecosystems, etc.).
