Lukas_Gloor
Thank you for engaging with my post!! :)

Also I'm not sure how I would form object-level moral convictions even if I wanted to. No matter what I decide today, why wouldn't I change my mind if I later hear a persuasive argument against it? The only thing I can think of is to hard-code something to prevent my mind being changed about a specific idea, or to prevent me from hearing or thinking arguments against a specific idea, but that seems like a dangerous hack that could mess up my entire belief system.

I don't think of "convinctions" as anywhere near as strong as hard-coding something. "Convictions," to me,  is little more than "whatever makes someone think that they're very confident they won't change their mind." Occasionally, someone will change their minds about stuff even after they said it's highly unlikely. (If this happens too often, one has a problem with calibration, and that would be bad by the person's own lights, for obvious reasons. It seems okay/fine/to-be-expected for this to happen infrequently.)

I say "litte more than [...]" rather than "is exactly [...]" because convictions are things that matter in the context of one's life goals. As such, there's a sense of importance attached to them, which will make people more concerned than usual about changing their views for reasons they wouldn't endorse (while still staying open for low-likelihood ways of changing their minds through a process they endorse!). (Compare this to: "I find it very unlikely that I'd ever come to like the taste of beetroot." If I did change my mind on this later because I joined a community where liking beetroot is seen as very cool, and I get peer-pressured into trying it a lot and trying to form positive associations with it when I eat it, and somehow this ends up working and I actually come to like it, I wouldn't consider this to be as much of a tragedy as if a similar thing happened with my moral convictions.) 

Also I'm not sure how I would form object-level moral convictions even if I wanted to.

Some people can't help it. I think this has a lot to do with reasoning styles. Since you're one of the people on LW/the EA Forum who place the most value on figuring out things related to moral uncertainty (and metaphilosophy), it seems likely that you're towards the far end of the spectrum of reasoning styles on this. (It also seems to me that you have a point, that these issues are indeed important/underappreciated – after all, I wrote a book-length sequence on something that directly bears on these questions, though I come at them from somewhere closer to the other end of that spectrum of reasoning styles.)

  1. I'm very confused/uncertain about many philosophical topics that seem highly relevant to morality/axiology, such as the nature of consciousness and whether there is such a thing as "measure" or "reality fluid" (and if so what is it based on). How can it be right or safe to form moral convictions under such confusion/uncertainty?

Those two are some good examples of things that I imagine most or maybe even all people* are still confused about. (I remember also naming consciousness/"Which computations do I care about?" in a discussion we had long ago on the same topic, as an example of something where I'd want more reflection.)

*(I don't say "all people" outright because I find it arrogant when people who don't themselves understand a topic declare that no one can understand it – for all I know, maybe Brian Tomasik's grasp on consciousness is solid enough that he could form convictions about certain aspects of it, if forming convictions there were something that his mind felt drawn to.)

So, especially since issues related to consciousness and reality don't seem too urgent for us to decide on, it seems like the most sensible option here, for people like you and me at least, is to defer.

Do you have any candidates for where you deem it safe enough to form object-level moral convictions?

Yeah; I think there are many other issues in morality/in "What are my goals?" that are independent of the two areas of confusion you brought up. We can discuss whether forming convictions early in those independent areas (and in particular, in areas where narrowing down our uncertainty would already be valuable** in the near term) is a defensible thing to do. (Obviously, it will depend on the person: it requires having a solid grasp of the options and the "option space" to conclude that you're unlikely to encounter view-shaking arguments or "better conceptualizations of what the debate is even about" in the future.)

**If someone buys into ECL, making up one's mind on one's values becomes less relevant because the best action according to ECL is to work on your comparative advantage among interventions that are valuable from the perspective of ECL-inclined, highly-goal-driven people around you. (One underlying assumption here is that, since we don't have much info about corners of the multiverse that look too different from ours, it makes sense to focus on cooperation partners that live in worlds relevantly similar to ours, i.e., worlds that contain the same value systems we see here among present-day humans.) Still, "buying into ECL" already involves having come to actionably-high confidence on some tricky decision theory questions. I don't think there's a categorical difference between "values" and "decision theory," so having confidence in ECL-allowing decision theories already involves having engaged in some degree of "forming convictions."

The most prominent areas where I think it makes sense for some people to form convictions early:

  • ECL pre-requirements (per the points in the paragraph above).
  • Should I reason about morality in ways that match my sequence takeaways here, or should I instead reason more in the way some moral realists think we should reason?
  • Should I pursue self-oriented values or devote my life to altruism (or do something in between)?
  • What do I think about population ethics and specifically the question of "How likely is it that I would endorse a broadly 'downside-focused' morality after long reflection?" 

These questions all have implications for how we should act in the near future. Furthermore, they're the sort of questions where I think it's possible to get a good enough grasp on the options and option space to form convictions early.

Altruism vs self-orientedness seems like the most straightforward one. You gotta choose something eventually (including the option of going with a mix), and you may as well choose now because the question is ~as urgent as it gets, and it's not like the features that make this question hard to decide on have much to do with complicated philosophical arguments or future-technology-requiring new insights. (This isn't to say that philosophical arguments have no bearing on the question – e.g., Famine, Affluence, and Morality, or Parfit on personal identity, contain arguments that some people might find unexpectedly compelling, so something is lost if someone makes up their mind without encountering those arguments. Or maybe some unusually privileged people would find themselves surprised if they read specific accounts of how hard life can be for non-privileged people, or if they became personally acquainted with some of these hardships. But all of these seem like things a person can investigate right here and now, without needing to wait for future superintelligent AI philosophy advisors. [Also, some of these seem like they may not just be "new considerations," but actually "transformative experiences" that change you into a different person. E.g., encountering someone facing hardship, helping them, and feeling deeply fulfilled by it can become the seed around which you form your altruistic identity.])

Next, for moral realism vs anti-realism (which is maybe more about “forming convictions on metaphilosophy” than about direct values, but just like with "decision theory vs values," I think "metaphilosophy vs values" is also a fluid/fuzzy distinction), I see it similarly. The issue has some urgent implications for EAs to decide on (though I don't think of it as the most important question), and there IMO are some good reasons to expect that future insights won't make it significantly easier/won't change the landscape in which we have to find our decision. Namely, the argument here is that this is a question that already permeates all the ways in which one would go about doing further reflection. You need to have some kind of reasoning framework to get started with thinking about values, so you can't avoid choosing. There's no "by default safe option." As I argued in my sequence, thinking that there's a committing wager for non-naturalist moral realism only works if you've formed the conviction I labelled "metaethical fanaticism" (see here), while the wager for moral naturalism (see the discussion in this post we're here commenting on) isn't strong enough to do all the work on its own.

Some people will likely object at this point that moral realism vs anti-realism is not independent from questions of consciousness. Some moral realists place a lot of weight on consciousness realism and the claim that consciousness gives us direct access to moral value. (This view tends to be associated with moral realist hedonist axiology, or, in David Pearce's case, with moral realist negative utilitarianism.) I addressed this claim here and found it unconvincing.

Lastly, population ethics might be the most controversial example, but I think it's fairly easy to see that there won't be a new consideration that will sway all sophisticated reasoners towards the same endpoint. 

[Edit: BTW, when I speak of "forming convictions on population ethics," I don't necessarily mean some super specific philosophical theory backed up with an academic paper or long blogpost. I mean more things like having strong confidence in broad features of a class of views. The more specific version is also possible/defensible, but I wouldn't want you to think of "I'm now a negative utilitarian" or "I'm now a hedonistic classical utilitarian" as the most central example of forming some convictions early.]

Firstly, there are the famous impossibility theorems. Personally, I am concerned that the frameworks in which people derive impossibility theorems often bake in non-obvious assumptions, so that they exclude options where we would come to think about population ethics from within a different ontology (meaning, with a different conceptual repertoire and different conceptualizations of "What question are we trying to answer here, what makes for a good solution, etc.?"). However, even within my moral anti-realism-informed framing of the option space in population ethics (see here), one eventually runs into the standard forking paths and dilemmas. I've observed that people have vastly different strong intuitions on fundamental aspects of population ethics, such as the question "Is non-existence a problem?" That means people will, in practice, end up taking different personal stances on population ethics, and I don't see where they'd be going wrong. Next to impossibility theorems, which show us that a solution is unlikely to come about, I think we can also gesture at this from the other side, seeing why it is unlikely to come about. I think population ethics has its reputation of being vexed/difficult because it is "stretching the domain of our most intuitive moral principles to a point where things become under-defined."

  • Fundamentally, I conceptualize ethics as being about others' interests. (Dismantling Hedonism-inspired Moral Realism explains why I don’t see ethics as being about experiences. Against Irreducible Normativity explains why I don’t see use in conceptualizing ethics as "irreducible," as being about things we can’t express in non-normative terminology.) So, something like preference utilitarianism feels like a pretty good answer to "How should a maximally wise and powerful god/shepherd AI take care of a fixed population of humans?" However, once we move away from having a fixed population of existing humans, the interests of not-yet-existing minds are under-defined, in two(!) ways even:
    (1) It's under-defined how many new others there will be.
    (2) It's under-defined who the others will be. E.g., some conceivable new happy minds will be very grateful for their existence, but others will be like "I'm happy and that's nice, but if I hadn't been born, that would be okay too."
    The underlying intuitions behind preference utilitarianism (the reasons why preference utilitarianism seems compelling in fixed-population contexts, namely, that it gives everyone what they want and care about) no longer help us decide in tricky population-ethics dilemmas. That suggests inherent under-definedness.

And yet, population ethics is urgently relevant to many aspects of effective altruism. So, people are drawn to thinking deeply about it. And some people will form convictions in the course of thinking about it. That's what happens, empirically. So, to disagree, you'd have to explain what the people to whom this happens are doing wrong. You might object with, "Why form confident views about anything that you already know (or suspect) won't be backed by a consensus of ideal reasoners?"

My dialogue in one of the last sections of the post we're here commenting on is relevant to that (quoting it here in full for ease of having everything in one location):

Critic: Why would moral anti-realists bother to form well-specified moral views? If they know that their motivation to act morally points in an arbitrary direction, shouldn’t they remain indifferent about the more contested aspects of morality? It seems that it’s part of the meaning of “morality” that this sort of arbitrariness shouldn’t happen.

Me: Empirically, many anti-realists do bother to form well-specified moral views. We see many examples among effective altruists who self-identify as moral anti-realists. That seems to be what people’s motivation often does in these circumstances.

Critic: Point taken, but I’m saying maybe they shouldn’t? At the very least, I don’t understand why they do it.

Me: You said that it’s “part of the meaning of morality” that arbitrariness “shouldn’t happen.” That captures the way moral non-naturalists think of morality. But in the moral naturalism picture, it seems perfectly coherent to consider that morality might be under-defined (or “indefinable”). If there are several defensible ways to systematize a target concept like “altruism/doing good impartially,” you can be indifferent between all those ways or favor one of them. Both options seem possible.

Critic: I understand being indifferent in the light of indefinability. If the true morality is under-defined, so be it. That part seems clear. What I don’t understand is favoring one of the options. Can you explain to me the thinking of someone who self-identifies as a moral anti-realist yet has moral convictions in domains where they think that other philosophically sophisticated reasoners won’t come to share them?

Me: I suspect that your beliefs about morality are too primed by moral realist ways of thinking. If you internalized moral anti-realism more, your intuitions about how morality needs to function could change.

Consider the concept of “athletic fitness.” Suppose many people grew up with a deep-seated need to study it to become ideally athletically fit. At some point in their studies, they discover that there are multiple options to cash out athletic fitness, e.g., the difference between marathon running vs. 100m-sprints. They may feel drawn to one of those options, or they may be indifferent.

Likewise, imagine that you became interested in moral philosophy after reading some moral arguments, such as Singer’s drowning child argument in Famine, Affluence and Morality. You developed the motivation to act morally as it became clear to you that, e.g., spending money on poverty reduction ranks “morally better” (in a sense that you care about) than spending money on a luxury watch. You continue to study morality. You become interested in contested subdomains of morality, like theories of well-being or population ethics. You experience some inner pressure to form opinions in those areas because when you think about various options and their implications, your mind goes, “Wow, these considerations matter.” As you learn more about metaethics and the option space for how to reason about morality, you begin to think that moral anti-realism is most likely true. In other words, you come to believe that there are likely different systematizations of “altruism/doing good impartially” that individual philosophically sophisticated reasoners will deem defensible. At this point, there are two options for how you might feel: either you’ll be undecided between theories, or you find that a specific moral view deeply appeals to you.

In the story I just described, your motivation to act morally comes from things that are very “emotionally and epistemically close” to you, such as the features of Peter Singer’s drowning child argument. Your moral motivation doesn’t come from conceptual analysis about “morality” as an irreducibly normative concept. (Some people do think that way, but this isn’t the story here!) It also doesn’t come from wanting other philosophical reasoners to necessarily share your motivation. Because we’re discussing a naturalist picture of morality, morality tangibly connects to your motivations. You want to act morally not “because it’s moral,” but because it relates to concrete things like helping people, etc. Once you find yourself with a moral conviction about something tangible, you don’t care whether others would form it as well.

I mean, you would care if you thought others not sharing your particular conviction was evidence that you’re making a mistake. If moral realism was true, it would be evidence of that. However, if anti-realism is indeed correct, then it wouldn’t have to weaken your conviction.

Critic: Why do some people form convictions and not others?

Me: It no longer feels like a choice when you see the option space clearly. You either find yourself having strong opinions on what to value (or how to morally reason), or you don’t.

The point I'm trying to make here is that people will have strong path-defining intuitions about population ethics for similar reasons to why they were strongly moved by the drowning child argument. When they contemplate why they get up in the morning, they either find themselves motivated to make happy people, or they don't. Just like some people find the drowning child argument compelling as a reason to re-orient a lot of their lives, while others don't. It's the same kind of question of where the motivation to form life goals comes from. See also my post here, in particular the subsection on "planning mode," which describes how I believe people decide on adopting an identity around some specific life goal. (And the primary point there is that it's not all that different from how people make less-high-stakes decisions, such as what job to take or whether to go skiing on a weekend vs stay cozily at home.)

One underlying assumption in my thinking here is that when people say they have a confident view on population ethics because [insert some complicated philosophical argument], it's often that the true reason they have that view*** is some fundamental intuition about some pretty straightforward thought experiment, and the theory surrounding it is more like "extra furnishing of that intuition." 

***Instead of saying "that view," I should rather say "a view that has implications that place it in this broad family of views (e.g., 'downside-focused' vs not)." For people to come up with highly specific and bullet-biting views like "negative utilitarianism" or "classical hedonistic utilitarianism," they do have to engage in a lot of abstract theoretical reasoning. However, why is someone drawn to theories that say it's important to create happy people? I feel like you can often track this down to some location in the chain of arguments where there's a pretty straightforward thought experiment and the person goes "this is where I stand my ground, I won't accept that." And people stand their ground at very different points, and sometimes you have a dilemma where someone is like "I always found the left path intuitive" and the other person is like "the left path is absolutely horrible, and I believe that more confidently than I'd believe the merits of more abstract arguments." 

This comment I just made on Will Aldred's Long Reflection Reading List seems relevant for this topic. 

Overall, I'd say there's for sure going to be some degree of moral convergence, but it's often overstated, and whether the degree of convergence is strong enough to warrant going for the AI strategies you discuss in your subsequent posts (e.g., here) would IMO depend on a tricky weighting of risks and benefits (including the degree to which alternatives seem promising).

Does moral realism imply the convergent morality thesis? Not strictly, although it’s suggestive. And even if you believe both, presumably there’s some causal mechanism behind convergent morality. Personally, though, I find many intuitions that used to make me sympathetic to realism now make me sympathetic to the convergent morality thesis.

I agree with this endnote. 

For my anti-realism sequence, I've actually made the stylistic choice of defining (one version of) moral realism as implying moral convergence (at least under ideal reasoning circumstances). That's notably different from how philosophers typically define it. I went for my idiosyncratic definition because, when I tried to find out what the action-guiding versions of moral realism are (here), many of the ways philosophers have defined "moral realism" in the literature don't actually seem relevant for what we should do as effective altruists. I could only come up with two (very different!) types of moral realism that would have clear implications for effective altruism.

(1) Non-naturalist moral realism based on the (elusive?) concept of irreducible normativity.

(2) Naturalist moral realism where the true morality is what people who are interested in "doing the most moral/altruistic thing" would converge on under ideal reflection conditions.

(See this endnote where I further justify my choice of (2) against some possible objections.)

I think (1) just doesn't work as a concept, and (2) is almost certainly false, at least in its strongest form. But yeah, there are going to be degrees of convergence, and moral reflection (even at the individual level, without convergence) is also relevant from within a moral anti-realist reasoning framework.

Many of those posts in the list seem really relevant to me for the cluster of things you're pointing at!

On some of the philosophical background assumptions, I would consider adding my ambitiously-titled post The Moral Uncertainty Rabbit Hole, Fully Excavated. (It's the last post in my metaethics/anti-realism sequence.)

Since the post is long and it says upfront that it doesn't work maximally well as a standalone piece (i.e., without two other posts from earlier in my sequence), it didn't get much engagement when I published it, so I feel like I should do some advertising for it here.

As the title indicates, I'm trying to answer questions in that post that many EAs don't ask themselves because they think about moral uncertainty or moral reflection in an IMO somewhat lazy way.

The post starts with a conundrum for the concept of moral uncertainty: 

In an earlier post, I argued that moral uncertainty and confident moral realism don’t go together. Accordingly, if we’re morally uncertain, we must either endorse moral anti-realism or at least put significant credence on it.

This insight has implications because it means we've been conflating a few different things under the "moral uncertainty" label:

  • Metaethical uncertainty (i.e., our remaining probability on moral realism) and the strength of possible wagers for acting as though moral realism is true even if our probability in it is low.
  • Uncertainty over the values we'd choose after long reflection (our "idealized values", which most people would be motivated to act upon even if moral realism is false).
  • Related to how we'd get to idealized values, the possibility of having under-defined values, i.e., the possibility that, because moral realism is false, even idealized moral reflection may lead to different endpoints based on very small changes to the procedure, or that a person's reflection doesn't "terminate" because their subjective feeling of uncertainty never goes away inside the envisioned reflection procedure.

My post is all about further elaborating on these distinctions and spelling out their implications for effective altruists.

I start out by introducing the notion of a moral reflection procedure to explain what moral reflection in an idealized setting could look like:

To specify the meaning of “perfectly wise and informed,” we can envision a suitable procedure for moral reflection that a person would hypothetically undergo. Such a reflection procedure comprises a reflection environment and a reflection strategy. The reflection environment describes the options at one’s disposal; the reflection strategy describes how a person would use those options.

Here’s one example of a reflection environment:

  • My favorite thinking environment: Imagine a comfortable environment tailored for creative intellectual pursuits (e.g., a Google campus or a cozy mansion on a scenic lake in the forest). At your disposal, you find a well-intentioned, superintelligent AI advisor fluent in various schools of philosophy and programmed to advise in a value-neutral fashion. (Insofar as that’s possible – since one cannot do philosophy without a specific methodology, the advisor must already endorse certain metaphilosophical commitments.) Besides answering questions, they can help set up experiments in virtual reality, such as ones with emulations of your brain or with modeled copies of your younger self. For instance, you can design experiments for learning what you'd value if you first encountered the EA community in San Francisco rather than in Oxford or started reading Derek Parfit or Peter Singer after the blog Lesswrong, instead of the other way around.[2] You can simulate conversations with select people (e.g., famous historical figures or contemporary philosophers). You can study how other people’s reflection concludes and how their moral views depend on their life circumstances. In the virtual-reality environment, you can augment your copy’s cognition or alter its perceptions to have it experience new types of emotions. You can test yourself for biases by simulating life as someone born with another gender(-orientation), ethnicity, or into a family with a different socioeconomic status. At the end of an experiment, your (near-)copies can produce write-ups of their insights, giving you inputs for your final moral deliberations. You can hand over authority about choosing your values to one of the simulated (near-)copies (if you trust the experimental setup and consider it too difficult to convey particular insights or experiences via text). Eventually, the person with the designated authority has to provide to your AI assistant a precise specification of values (the format – e.g., whether it’s a utility function or something else – is up to you to decide on). Those values then serve as your idealized values after moral reflection.

(Two other, more rigorously specified reflection procedures are indirect normativity and HCH.[3] Indirect normativity outputs a utility function whereas HCH attempts to formalize “idealized judgment,” which we could then consult for all kinds of tasks or situations.)[4]

“My favorite thinking environment” leaves you in charge as much as possible while providing flexible assistance. Any other structure is for you to specify: you decide the reflection strategy.[5] This includes what questions to ask the AI assistant, what experiments to do (if any), and when to conclude the reflection.

For reflection strategies (how to behave inside a reflection procedure), I discuss a continuum from "conservative" to "open-minded" reflection strategies.

Someone with a conservative reflection strategy is steadfast in their moral reasoning framework. ((What I mean by “moral-reasoning framework” is similar to what Wei Dai calls “metaphilosophy” – it implies having confidence in a particular metaphilosophical stance and using that stance to form convictions about one’s reasoning methodology or object-level moral views.)) They guard their opinions, which turns these into convictions (“convictions” being opinions that one safeguards against goal drift). At its extreme, someone with a maximally conservative reflection strategy has made up their mind and no longer benefits from any moral reflection. People can have moderately conservative reflection strategies where they have formed convictions on some issues but not others.

By contrast, people with open-minded moral reflection strategies are uncertain about either their moral reasoning framework or (at least) their object-level moral views. As the defining feature, they take a passive (“open-minded”) reflection approach focused on learning as much as possible without privileging specific views[7] and without (yet) entering a mode where they form convictions.

That said, “forming convictions” is not an entirely voluntary process – sometimes, we can’t help but feel confident about something after learning the details of a particular debate. As I’ll elaborate below, it is partly for this reason that I think no reflection strategy is inherently superior.

Comparing these two reflection strategies is a core theme of the post, and one takeaway I arrive at is that neither end of the spectrum is superior to the other. Instead, I see moral reflection as a bit of an art, and we just have to find our personal point on the spectrum.

Relatedly, there's also the question of "What's the benefit of reflection now?" vs. "How much do we want to just leave things to future selves or hypothetical future selves in a reflection procedure?" (The point being that it is not by-default obvious that moral reflection has to be postponed!)

Reflection procedures are thinking-and-acting sequences we'd undergo if we had unlimited time and resources. While we cannot properly run a moral reflection procedure right now in everyday life, we can still narrow down our uncertainty over the hypothetical reflection outcome. Spending time on that endeavor is worth it if the value of information – gaining clarity on one’s values – outweighs the opportunity cost from acting under one’s current (less certain) state of knowledge.

Gaining clarity on our values is easier for those who would employ a more conservative reflection strategy in their moral reflection procedure. After all, that means their strategy involves guarding some pre-existing convictions, which gives them advance knowledge of the direction of their moral reflection.[9]

By contrast, people who would employ more open-minded reflection strategies may not easily be able to move past specific layers of indecision. Because they may be uncertain how to approach moral reasoning in the first place, they can be “stuck” in their uncertainty. (Their hope is to get unstuck once they are inside the reflection procedure, once it becomes clearer how to proceed.)

[...]

If moral realism were true, the timing of that transition (“the reflection strategy becoming increasingly conservative as the person forms more convictions”) would be obvious. It would happen once the person knows enough to see the correct answers, once they see the correct way of narrowing down their reflection or (eventually) the correct values to adopt at the very end of it.

In the moral realist picture, expressions like “safeguarding opinions” or “forming convictions” (which I use interchangeably) seem out of place. Obviously, the idea is to “form convictions” about only the correct principles!

However, as I’ve argued in previous posts, moral realism is likely false.

This is then followed by a discussion on whether "idealized values" are chosen or discovered.

Under moral anti-realism, there are two empirical possibilities[10] for “When is someone ready to form convictions?.” In the first possibility, things work similarly to naturalist moral realism but on a personal/subjectivist basis. We can describe this option as “My idealized values are here for me to discover.” By this, I mean that, at any given moment, there’s a fact of the matter to “What I’d conclude with open-minded moral reflection.” (Specifically, a unique fact – it cannot be that I would conclude vastly different things in different runs of the reflection procedure or that I would find myself indifferent about a whole range of options.)

The second option is that my idealized values aren’t “here for me to discover.” In this view, open-minded reflection is too passive – therefore, we have to create our values actively. Arguments for this view include that (too) open-minded reflection doesn’t reliably terminate; instead, one must bring normative convictions to the table. “Forming convictions,” according to this second option, is about making a particular moral view/outlook a part of one’s identity as a morality-inspired actor. Finding one’s values, then, is not just about intellectual insights.

I will argue that the truth is somewhere in between.

Why do I think this? There's more in my post, but here are some of the interesting bits, which seem especially relevant to the topic of "long reflection":

There are two reasons why I think open-minded reflection isn’t automatically best:

  1. We have to make judgment calls about how to structure our reflection strategy. Making those judgment calls already gets us in the business of forming convictions. So, if we are qualified to do that (in “pre-reflection mode,” setting up our reflection procedure), why can’t we also form other convictions similarly early?
  2. Reflection procedures come with an overwhelming array of options, and they can be risky (in the sense of having pitfalls – see later in this section). Arguably, we are closer (in the sense of our intuitions being more accustomed and therefore more reliable) to many of the fundamental issues in moral philosophy than to matters like “carefully setting up a sequence of virtual reality thought experiments to aid an open-minded process of moral reflection.” Therefore, it seems reasonable/defensible to think of oneself as better positioned to form convictions about object-level morality (in places where we deem it safe enough).

Reflection strategies require judgment calls

In this section, I’ll elaborate on how specifying reflection strategies requires many judgment calls. The following are some dimensions along which judgment calls are required (many of these categories are interrelated/overlapping):

  • Social distortions: Spending years alone in the reflection environment could induce loneliness and boredom, which may have undesired effects on the reflection outcome. You could add other people to the reflection environment, but who you add is likely to influence your reflection (e.g., because of social signaling or via the added sympathy you may experience for the values of loved ones).
  • Transformative changes: Faced with questions like whether to augment your reasoning or capacity to experience things, there’s always the question “Would I still trust the judgment of this newly created version of myself?”
  • Distortions from (lack of) competition: As Wei Dai points out in this Lesswrong comment: “Current human deliberation and discourse are strongly tied up with a kind of resource gathering and competition.” By competition, he means things like “the need to signal intelligence, loyalty, wealth, or other ‘positive’ attributes.” Within some reflection procedures (and possibly depending on your reflection strategy), you may not have much of an incentive to compete. On the one hand, a lack of competition or status considerations could lead to “purer” or more careful reflection. On the other hand, perhaps competition functions as a safeguard, preventing people from adopting values where they cannot summon sufficient motivation under everyday circumstances. Without competition, people’s values could become decoupled from what ordinarily motivates them and more susceptible to idiosyncratic influences, perhaps becoming more extreme.
  • Lack of morally urgent causes: In the blogpost On Caring, Nate Soares writes: “It's not enough to think you should change the world — you also need the sort of desperation that comes from realizing that you would dedicate your entire life to solving the world's 100th biggest problem if you could, but you can't, because there are 99 bigger problems you have to address first.”
    In that passage, Soares points out that desperation can strongly motivate why some people develop an identity around effective altruism. Interestingly enough, in some reflection environments (including “My favorite thinking environment”), the outside world is on pause. As a result, the phenomenology of “desperation” that Soares described would be out of place. If you suffered from poverty, illnesses, or abuse, these hardships are no longer an issue. Also, there are no other people to lift out of poverty and no factory farms to shut down. You’re no longer in a race against time to prevent bad things from happening, seeking friends and allies while trying to defend your cause against corrosion from influence seekers. This constitutes a massive change in your “situation in the world.” Without morally urgent causes, you arguably become less likely to go all-out by adopting an identity around solving a class of problems you’d deem urgent in the real world but which don’t appear pressing inside the reflection procedure. Reflection inside the reflection procedure may feel more like writing that novel you’ve always wanted to write – it has less the feel of a “mission” and more of “doing justice to your long-term dream.”[11]
  • Ordering effects: The order in which you learn new considerations can influence your reflection outcome. (See page 7 in this paper. Consider a model of internal deliberation where your attachment to moral principles strengthens whenever you reach reflective equilibrium given everything you already know/endorse.)
  • Persuasion and framing effects: Even with an AI assistant designed to give you “value-neutral” advice, there will be free parameters in the AI’s reasoning that affect its guidance and how it words things. Framing effects may also play a role when interacting with other humans (e.g., epistemic peers, expert philosophers, friends, and loved ones).

Pitfalls of reflection procedures

There are also pitfalls to avoid when picking a reflection strategy. The failure modes I list below are avoidable in theory,[12] but they could be difficult to avoid in practice:

  • Going off the rails: Moral reflection environments could be unintentionally alienating (enormous option space; time spent reflecting could be unusually long). Failure modes related to the strangeness of the moral reflection environment include existential breakdown and impulsively deciding to lock in specific values to be done with it.
  • Issues with motivation and compliance: When you set up experiments in virtual reality, the people in them (including copies of you) may not always want to play along.
  • Value attacks: Attackers could simulate people’s reflection environments in the hope of influencing their reflection outcomes.
  • Addiction traps: Superstimuli in the reflection environment could cause you to lose track of your goals. For instance, imagine you started asking your AI assistant for an experiment in virtual reality to learn about pleasure-pain tradeoffs or different types of pleasures. Then, next thing you know, you’ve spent centuries in pleasure simulations and have forgotten many of your lofty ideals.
  • Unfairly persuasive arguments: Some arguments may appeal to people because they exploit design features of our minds rather than because they tell us “What humans truly want.” Reflection procedures with argument search (e.g., asking the AI assistant for arguments that are persuasive to lots of people) could run into these unfairly compelling arguments. For illustration, imagine a story like “Atlas Shrugged” but highly persuasive to most people. We can also think of “arguments” as sequences of experiences: Inspired by the Narnia story, perhaps there exists a sensation of eating a piece of candy so delicious that many people become willing to sell out all their other values for eating more of it. Internally, this may feel like becoming convinced of some candy-focused morality, but looking at it from the outside, we’ll feel like there’s something problematic about how the moral update came about.
  • Subtle pressures exerted by AI assistants: AI assistants trained to be “maximally helpful in a value-neutral fashion” may not be fully neutral, after all. (Complete) value-neutrality may be an illusory notion, and if the AI assistants mistakenly think they know our values better than we do, their advice could lead us astray. (See Wei Dai’s comments in this thread for more discussion and analysis.)

Conclusion: “One has to actively create oneself”

“Moral reflection” sounds straightforward – naively, one might think that the right path of reflection will somehow reveal itself. However, as we think of the complexities of setting up a suitable reflection environment and how we’d proceed inside it, what it would be like and how many judgment calls we’d have to make, we see that things can get tricky.

Joe Carlsmith summarized it as follows in an excellent post (what Carlsmith calls “idealizing subjectivism” corresponds to what I call “deferring to moral reflection”):

My current overall take is that especially absent certain strong empirical assumptions, idealizing subjectivism is ill-suited to the role some hope it can play: namely, providing a privileged and authoritative (even if subjective) standard of value. Rather, the version of the view I favor mostly reduces to the following (mundane) observations:

  • If you already value X, it’s possible to make instrumental mistakes relative to X.
  • You can choose to treat the outputs of various processes, and the attitudes of various hypothetical beings, as authoritative to different degrees.

This isn’t necessarily a problem. To me, though, it speaks against treating your “idealized values” the way a robust meta-ethical realist treats the “true values.” That is, you cannot forever aim to approximate the self you “would become”; you must actively create yourself, often in the here and now. Just as the world can’t tell you what to value, neither can your various hypothetical selves — unless you choose to let them. Ultimately, it’s on you.

In my ((Lukas's)) words, the difficulty with deferring to moral reflection too much is that the benefits of reflection procedures (having more information and more time to think; having access to augmented selves, etc.) don’t change what it feels like, fundamentally, to contemplate what to value. For all we know, many people would continue to feel apprehensive about doing their moral reasoning “the wrong way” since they’d have to make judgment calls left and right. Plausibly, no “correct answers” would suddenly appear to us. To avoid leaving our views under-defined, we have to – at some point – form convictions by committing to certain principles or ways of reasoning. As Carlsmith describes it, one has to – at some point – “actively create oneself.” (The alternative is to accept the possibility that one’s reflection outcome may be under-defined.)

It is possible to delay the moment of “actively creating oneself” to a time within the reflection procedure. (This would correspond to an open-minded reflection strategy; there are strong arguments to keep one’s reflection strategy at least moderately open-minded.) However, note that, in doing so, one “actively creates oneself” as someone who trusts the reflection procedure more than one’s object-level moral intuitions or reasoning principles. This may be true for some people, but it isn’t true for everyone. Alternatively, it could be true for someone in some domains but not others.[13]

I further discuss the notion of "having under-defined values." This happens if someone defers to moral reflection with the expectation that it'll terminate with a specific answer, but they're predisposed to following reflection strategies that are open-ended enough that the reflection will, in practice, have under-defined outcomes.

Having under-defined values isn't necessarily a problem – I discuss the pros and cons of it in the post.

Towards the end of the post, there's a section where I discuss the IMO most sophisticated wager for "acting as though moral realism is true" (the wager for naturalist moral realism, rather than the one for non-naturalist/irreducible-normativity-based moral realism which I discussed earlier in my sequence). In that discussion, I conclude that this naturalist moral realism wager actually often doesn't overpower what we'd do anyway under anti-realism. (The reasoning here is that naturalist moral realism feels somewhat watered down compared to non-naturalist moral realism, so that it's actually "built on the same currency" as how we'd anyway structure our reasoning under moral anti-realism. Consequently, whether naturalist moral realism is true isn't too different from the question of whether idealized values are chosen or discovered – it's just that now we're also asking about the degree of moral convergence between different people's reflection.)

Anyway, that section is hard to summarize, so I recommend just reading it in full in the post (it has pictures and a fun "mountain analogy.")

Lastly, I end the post with some condensed takeaways in the form of advice for someone's moral reflection:

Selected takeaways: good vs. bad reasons for deferring to (more) moral reflection

To condense a few takeaways from this post, I made a list of good and bad reasons for deferring (more) to moral reflection. (Note, again, that deferring to moral reflection comes on a spectrum.)

In this context, it’s important to note that deferring to moral reflection would be wise if moral realism is true or if idealized values are ((on the far end of the spectrum of)) “here for us to discover.” In this sequence, I argued that neither of those is true – but some (many?) readers may disagree.

Assuming that I’m right about the flavor of moral anti-realism I’ve advocated for in this sequence, below are my “good and bad reasons for deferring to moral reflection.”

(Note that this is not an exhaustive list, and it’s pretty subjective. Moral reflection feels more like an art than a science.)

Bad reasons for deferring strongly to moral reflection:

  • You haven’t contemplated the possibility that the feeling of “everything feels a bit arbitrary; I hope I’m not somehow doing moral reasoning the wrong way” may never go away unless you get into a habit of forming your own views. Therefore, you never practiced the steps that could lead to you forming convictions. Because you haven’t practiced those steps, you assume you’re far from understanding the option space well enough, which only reinforces your belief that it’s too early for you to form convictions.
  • You observe that other people’s fundamental intuitions about morality differ from yours. You consider that an argument for trusting your reasoning and your intuitions less than you otherwise would. As a result, you lack enough trust in your reasoning to form convictions early.
  • You have an unreflective belief that things don’t matter if moral anti-realism is true. You want to defer strongly to moral reflection because there’s a possibility that moral realism is true. However, you haven’t thought about the argument that naturalist moral realism and moral anti-realism use the same currency, i.e., that the moral views you’d adopt if moral anti-realism were true might matter just as much to you.

Good reasons for deferring strongly to moral reflection:

  • You don’t endorse any of the bad reasons, and you still feel drawn to deferring to moral reflection. For instance, you feel genuinely unsure how to reason about moral views or what to think about a specific debate (despite having tried to form opinions).
  • You think your present way of visualizing the moral option space is unlikely to be a sound basis for forming convictions. You suspect that it is likely to be highly incomplete or even misguided compared to how you’d frame your options after learning more science and philosophy inside an ideal reflection environment.

Bad reasons for forming some convictions early:

  • You think moral anti-realism means there’s no for-you-relevant sense in which you can be wrong about your values.
  • You think of yourself as a rational agent, and you believe rational agents must have well-specified “utility functions.” Hence, ending up with under-defined values (which is a possible side-effect of deferring strongly to moral reflection) seems irrational/unacceptable to you.

Good reasons for forming some convictions early:

  • You can’t help it, and you think you have a solid grasp of the moral option space (e.g., you’re likely to pass Ideological Turing tests of some prominent reasoners who conceptualize it differently).
  • You distrust your ability to guard yourself against unwanted opinion drift inside moral reflection procedures ((if you were to follow a more open-minded reflection strategy)), and the views you already hold feel too important to expose to that risk.
1. and 2. seem very similar to me. I think it's something like that.

The way I envision him (obviously I don't know and might be wrong):

  • Genuinely cares about safety and doing good.
  • Also really likes the thought of having power and doing earth-shaking stuff with powerful AI.
  • Looks at AI risk arguments with a lens of motivated cognition influenced by the bullet point above.
  • Mostly thinks things will go well, but this comes primarily from the instinctive optimism of a high-energy CEO – a type predominantly personality-selected for optimistic attitudes. If he were to really sit down and try to introspect on his views on the question (and stare into the abyss), as a very smart person, he might find that he thinks things might well go poorly, but then thoughts come up like "ehh, if I can't make AI go well, others probably can't either, and it's worth the risk especially because things could be really cool for a while or so before it all ends."
  • If he ever has thoughts like "Am I one of the bad guys here?," he'll shrug them off with "nah" rather than having the occasional existential crises and self-doubts around that sort of thing.
  • He maybe has no stable circle of people to whom he defers on knowledge questions; that is, no one outside himself he trusts as much as himself. He might say he updates to person x or y and considers them smarter than himself/better forecasters, but in reality, he "respects" whoever is good news for him as long as they are good news for him. If he learns that smart people around him are suddenly confident that what he's doing is bad, he'll feel system-1 annoyed at them, which prompts him to find reasons to now disagree with them and no longer consider them included in his circle of epistemic deference. (Maybe this trait isn't black and white; there's at least some chance that he'd change course if 100% of people he at one point in time respects spoke up against his plan all at once.)
  • Maybe doesn't have a lot of mental machinery built around treating it as a sacred mission to have true beliefs. So he might cite avoiding hardware overhang as an argument for OpenAI's strategy and later do something that seemingly contradicts that stance, because he was using arguments that felt like they'd fit, without really thinking hard about them or building a detailed forecasting model that he operates from for every such decision.

Related to your point 1:

I think one concrete complexity-increasing ingredient that many (but not all) people would want in a utopia is for one's interactions with other minds to be authentic – that is, they want the right kind of "contact with reality."

So, something that would already seem significantly suboptimal (to some people at least) is lots of private experience machines where everyone is living a varied and happy life, but everyone's life in the experience machines follows pretty much the same template, and the other characters in one's simulation aren't genuine, in the sense that they don't exist independently of one's interactions with them. (Meaning: your simulation is solipsistic, and the other characters in it may be computed to be the most exciting response to you, but their memories from "off-screen time" are fake.) While this scenario would already be a step up from "rats on heroin"/"brains in a vat with their pleasure hotspots wire-headed," it's still probably not the type of utopia many of us would find ideal. Instead, as social creatures who value meaning, we'd want worlds (whether simulated/virtual or not doesn't seem to matter) where the interactions we have with other minds are genuine – where these other minds aren't just characters programmed to react to us, but real minds with real memories and "real" (as far as this is a coherent concept) choices. Utopian world setups that allow for this sort of "contact with reality" presumably cannot be packed too tightly with sentient minds.

By contrast, things seem different for dystopias, which can be packed tightly. For dystopias, it matters less whether they are repetitive, whether they're lacking in options/freedom, or whether they have solipsistic aspects to them. (If anything, those features can make a particular dystopia more horrifying.)

To summarize, here's an excerpt from my post on alignment researchers arguably having a comparative advantage in reducing s-risks:

Asymmetries between utopia and dystopia. It seems that we can “pack” more bad things into dystopia than we can “pack” good things into utopia. Many people presumably value freedom, autonomy, some kind of “contact with reality.” The opposites of these values are easier to implement and easier to stack together: dystopia can be repetitive, solipsistic, lacking in options/freedom, etc. For these reasons, it feels like there’s at least some type of asymmetry between good things and bad things – even if someone were to otherwise see them as completely symmetric.

Here are (finally) some thoughts:

  • Owen clearly doesn't fit the pattern of grandiose narcissism or sociopathy. I could say more about this but I doubt it's anyone's crux, and I prefer to not spend too much time on this.
  • Next to grandiose narcissism or sociopathy, there are other patterns by which people can systematically cause harm to others. I'm mostly thinking of "harm through negligence" rather than harm with intent (though this isn't to say that grandiose narcissists cause all their harm fully consciously). Anyway, many of these other patterns IMO involve having a bad theory of mind, at least in certain domains, and we've seen that Owen has had this. However, I think it only becomes really vexed/hard to correct if someone (1) lacks a strong desire to improve their understanding of others so as to avoid harming them/make them more comfortable (i.e., with the prosocial goal being the primary motivation), or (2) is hopelessly bad at improving their understanding of others for reasons other than lacking such a desire in the first place.
  • On (1), I'm confident that Owen has a strong desire to improve his understanding of others so as to avoid harming them and make people positively comfortable. I don't remember the specifics, but I remember thinking that he's considerate in a way that many people aren't, which suggests that other people's feelings and comfort are often on his mind (edit: at least in some contexts). (E.g., in conversations about research, he'd often ask if what he's been saying so far is still helpful, or if we should move on to other questions. But more importantly, I think I also noticed signs of scrupulosity related to how he picks words carefully to make sure he doesn't convey the wrong thing, which I think is linked with not wanting to do badly socially or not wanting to come to the wrong conclusions about research relevant to the other party's path to impact.) Sure, people who are scared of social rejection can also be hypervigilant like that, and sometimes it's more people-pleasing than sincere concern, but I also felt like I picked up on "he's sincerely trying to be helpful" when talking to him. Though this is more of an intuition-based judgment than anything where I can say "this specific thing I've observed is the reason." (Edit: Actually, one concrete thing that comes to mind is that he was among the very few people who said things about s-risks that made my research priorities seem less important, so he was honest about this in a way that exposes him to me thinking less of him, if I were the sort of person who would take stuff like this personally.) Lastly, I think the apology really speaks to all this as well. (That said, I guess someone with cynical priors could point out that the apology might have been written very differently if it wasn't for showing it to friends – I doubt it would be fundamentally different, and in fact I don't even know if he showed it to other people before posting [though I think that would be the wise thing to do]. In any case, I agree it's important to look at whether what people say in their apology is consistent with what we know about them from other situations; my answer to that is "yes, feels very consistent.")
  • Regarding (2), I have no doubts that Owen can greatly improve his understanding of others. He seems among the most "interested in introspection and analyzing social stuff" men that I've met, and he's very intelligent, so it's not like he lacks the cognitive abilities or interest to improve.
  • This leaves us with "are there other character obstacles that we should expect to stop him from improving sufficiently?" The other negative patterns that I can think of in this domain are:
    (a.) Extreme entitlement;
    (b.) being very bad at taking feedback to heart because one "flinches away" from bad parts of one's psychology, to the point that all one's introspection is premature and always an exercise in ego protection;
    (c.) issues with externalizing shame/bad feelings, such as an underlying drive to emotionally control others or "drag them down to one’s level" when one is feeling bad;
    (d.) domain-specific tendency to form strong delusional beliefs, like believing that people who aren't attracted to you are attracted to you, and keeping these even in the light of clear counterevidence.
    Of the above, I think (c.) doesn't fit the pattern of observed harm in this case (it would be more relevant for people who make false accusations), so let's focus on (a.), (b.), and (d.).
  • Regarding (a.) ("extreme entitlement"), I haven't observed anything that would make me worry that Owen is very entitled. Admittedly, I only know him from a few professional (or semi-professional, e.g., talking in a small group of people after a retreat) contexts, and I'm not a woman he's attracted to, so I may not have seen all sides. Still, consistent with what I already thought about him before all of this based on my intuitions from meeting him, I note that extreme entitlement would be in tension with not causing bigger issues after being rejected (and it seems like he faced a lot of rejection in these cases), and it also isn't consistent with writing an apology where the fault isn't placed on other people.
  • Regarding (b.) ("a pattern of habitually flinching away from self-critical thoughts"), my best guess is that Owen’s interest in introspection is too deep for a person with this issue. (Edit: Also, he partly felt driven to mention his attraction because there was shame attached to it, and even though this maybe kept him from thinking clearly about everything, it's not the case, if I understand correctly, that he flinched away from mentioning or at least noticing the shame altogether.) The people I know who exhibited this "flinching away" pattern to a severe degree seemed uncomfortable with serious introspection in the first place. They sometimes were very quick to apologize, but they seemed to distort what they were accused of to the point where they were only apologizing for things that are easy to apologize for. By contrast, I feel like Owen's apology admits a bunch of things about himself that aren't easy to say, so it feels like genuine self-work went into it.
  • Regarding (d.) ("motivated cognition dialed up into proper delusions"), I get that people are concerned about this when they read the accounts of what happened. I am too, a little bit. However, I think it's not super uncommon to fail to pick up on people feeling uncomfortable when they try to hint at it discreetly. I also think there were at most one or two instances where Owen kept talking to someone about topics they had said they wanted him to stop talking about? I feel like if he were severely deluded, in the sense of flat-out not cognitively accepting when his advances are rejected, he'd have kept talking to people a lot more than he actually did. So, I think the levels of motivated cognition here were most likely "bad, but not too far out of the ordinary" (i.e., nowhere near "stalker levels"), and I'm not too worried about this for the future. We should also keep in mind that this was a big event in Owen's life that he's unlikely to forget, and how much updating he's probably been doing (he also said he's been discussing this stuff with a therapist).
  • Lastly, something that goes into my assessment of (d.) that I could imagine other people missing is that I feel I have a good model of what went wrong. In that model, it's not necessary for (d.) (or anything else that would spell trouble) to be present at a pathological level in Owen's psychology. Instead, I think what went poorly is primarily explained by (i) bad theory of mind and (ii) character-related scrupulosity that led to persistently bringing up things that were no longer appropriate (or were never appropriate in the first place, such as some of the comments or the locations where things were brought up).
  • [Now describing my model.] When people read Owen's account of why he did what he did, one reaction they might have is the following. "So, okay, he felt attracted to someone, he worried that this attraction would make him a bad person, he wanted to get feedback to reassure himself that he isn't bad for this. So far, so good. But then, out of everyone he could've picked for that conversation, why on earth would he pick the person he's attracted to to discuss this with? Why not discuss it with anyone else? Doesn't that mean there must be some more sinister underlying motive at play? How could 'wanting to be reassured that he isn't bad' be anything but an excuse to make repeated advances?" My reaction to that interpretation is "people who think this are probably missing something about what it's like to have character-related scrupulosity."
  • I'm only speculating here, but it feels like the sort of thing I'm likely right about at least directionally. Namely, my guess is that Owen, over the five or so years that this is about, confessed his attraction to the women themselves (instead of only seeking out other people to discuss his feelings with) and sought absolution from them partly because he cared particularly about what these women thought of him, which was related to him being attracted to them in the first place (meaning he admired them character-wise).
  • [Edit: paragraph added roughly 8h after posting the comment] Owen reached out to me (first time we communicated in several years) after I wrote this comment and said he felt seen by this paragraph (the one right above), but that I was missing a further factor. He felt – at the time – like the women in question had moral authority on this topic. Quoting: "If they didn't mind my attraction, then by their judgement my attraction wasn't bad. If they did mind it, it was bad (and I should therefore take further internal steps to suppress the feeling)." Don't take from this that Owen necessarily agrees with all the other descriptions in my long comment, but I wanted to highlight this bit in particular because it sheds more light on why he talked about his attraction to the women directly. I didn't think of this possibility independently, but it seems credible/consistent to me. (I got permission from Owen to share.)
  • It seems common for some men to feel like the ultimate judge of their character is a woman with desirable character qualities herself. If you have a strong desire to be trusted and accepted for who you are, not for who others think you are but for who you actually are, it makes sense that you might overshare weird details about yourself that you feel the least comfortable about. And someone with obsessive tendencies in that area may make the mistake of doing this too often or in contexts where it isn't appropriate. This is a common motivation (I know because I have the same feelings around some of this), and it IMO explains a lot of what happened, so it's not like we need to postulate other outlier-y things (extreme entitlement, extreme propensity to form delusional beliefs, etc.) to explain why Owen did what he did. (Other than poor theory of mind and common levels of motivated cognition, I mean.)
  • Lastly, and ironically, I think part of what caused things to go wrong here is actually protective for avoiding bad outcomes in the future. It makes it so much easier to evaluate someone's character if the person proactively gives you lots of information about it and if one of their primary drives seems to be to help you with that. It serves to rule out a lot of ways someone's character could be, but isn't. (Like, I’m pretty sure if Owen had had any sinister motives besides just having a crush on the women, he'd have confessed those other motives as well to the women in question, and we'd have a bigger scandal.) I think this is a positive thing and one reason I'm drawn to defending Owen here as someone who doesn't know him super well (without saying that he didn't do anything wrong) is because I can easily imagine how other people would react in similar situations, and I want to flag that I like it when people make it easier for others to evaluate them.
  • A caveat here is that, as we've seen, a desire to be trusted, and to earn trust by proactively doing lots of introspection and sharing negative information, by no means implies that someone is free from massive blind spots or self-deception. There's even a hypothesis that people who practice radical honesty and transparency thereby dial up their self-deception (see Holly Elmore’s post on privacy, which seems relevant here). That said, (a) "making oneself transparent on matters relevant to trust" isn't exactly the same as "declining to have any sort of privacy in any context," so (b) I'm not sure I'd say that self-deception is necessarily dialed up in all instances of trying to do the former, though it seems plausible that it could come with that sort of risk. Anyway, despite the concern that self-deception and blind spots remain very much possible/to be expected, I think that people who try to make themselves transparent so that their character can be more easily evaluated are in fact easier to evaluate for good character than people who don't do that.

I'm not sure why your comment was downvoted. I think it's a perfectly reasonable request since, as you say correctly in other comments, people who don't know enough to form their own opinion can't just trust that other forum commenters with direct opinions are well-calibrated/have decent people judgment about this.

I started writing down some points, but it's not easy and I don't want to do it in a half-baked fashion and then have readers go "oh, those data points and interpretations all sound pretty spurious, if that's all you have, it seems weird that you'd even voice an opinion." It's often hard to put in words convincingly why you believe something about someone.

I might still get around to finishing the comment at some point in the next few days, but don't count on it.

I agree that the women affected are what this is primarily about. But I also don't want to ascribe to anyone how we think they likely feel, without knowing much about them. Like, maybe at least some of the women who had negative experiences have nuanced feelings that aren't best described as "I feel bad/invalidated whenever I see someone say positive things about Owen, even if they take care to not thereby downplay that the things he did weren't acceptable." Maybe some feel things like, "this stuff was messed up and really needed to be dealt with, and it sucks that it took so long/initially seemed like it wasn't going to be dealt with, but it seems like things are developing in a good direction now." Or maybe not! Maybe they're still super upset and wish that Owen would never re-enter the community. That would be their right and seems understandable, too. In any case, the way I see it, we don't know at this point (at least I don't), and while I agree that it's important to create encouraging incentives so people will be likely to report future instances of misconduct, I don't think this requires a policy of "avoid at all costs saying things that might make someone who was affected uncomfortable." (In fact, there's also a risk of making people less likely to report uncomfortable experiences if they worry that there'll be a community overreaction. That's not the first thing I'd worry about, to be clear; I'm just pointing out that this could happen/be someone's reason to be hesitant about speaking up about something.) Personally, the message I find most important is something like: I want us to take seriously that it's unacceptable for people to predictably be at risk of having bad experiences like that in the EA community, and the community/Community Health takes this seriously and takes appropriate action.

I expressed support for another person's comment that contained many positive points about Owen. I hope that no one feels like this means I'm "on Owen's side" rather than on the side of the people who brought up these complaints. Owen seems to largely agree about the facts of what happened, and he seems genuinely committed to making sure similar stuff doesn't happen again, plus he accepted the consequences (stepping down, two-year ban). These features of the situation IMO make it possible to not view this as "either you support the victims, or you say a redeeming thing or two about Owen."

This option of "not thinking of the situation as one where it's about picking sides" isn't always available. If a person accused of causing harm goes DARVO and accuses the alleged victims to be malefactors in return who make up false stories, then one is forced to either side with the alleged victims, or with the accused. Similarly, sometimes someone does something that's immediately strong evidence that they are operating without even a desire to respect others (Milena Cenzler's comment originally contained a hyperlink to the case of Brock Turner, who sexually assaulted an unconscious woman, as an example of how comments by supporters can sometimes be re-traumatizing to the victims). In those cases, it also seems to me like one can't say much that's redeeming about the person who caused harm without this being disrespectful towards the victims, both because of the severity of what they faced and because there's not much redeeming you can say about someone who even lacks a basic desire/intent to respect others. But that sort of case has very different features from "confessed feeling attracted out of scrupulosity and misguided desire to get moral absolution from the people one is attracted to for not having to feel bad about the attraction." It's a very different thing. 
(Edit) Lastly, I guess sometimes someone can be a skilled manipulator who seems remorseful and accepts consequences while downplaying the extent of the harm and their "bad character." If Owen were like that and the people who said positive things about him had just fallen victim to his charm, that could also be invalidating for the people who were harmed. I don't think that's the case, but this would be another situation we want to try to avoid, so I feel like the people who say positive things about Owen have a responsibility to consider the possibility "am I being manipulated?"

I agree with those points and they seem important.

I didn't write this further above, but thinking about it now, I think there was also another dimension that fed into me thinking of this case as "atypical." (Maybe that isn't the best wording, and these things are more typical than we think; what I'm trying to gesture at is "the sort of thing that has high chances of getting fixed.") In any case, when I think of cases of "harm through neglect," where someone isn't ill-intentioned but still has a pattern of making others uncomfortable, some cases that come to mind involve people who are kind of hopeless, whose personality and psychology seem tragic, and who seem unlikely to improve without excessive amounts of supervision/handholding to fix all the stuff that is at risk of causing harm in different ways. Importantly, Owen very much doesn't seem to me like that either.* So, according to my interpretation and guesses, there is indeed less potential for future harm than in many other conceivable cases where someone, e.g., received a two-year ban for making people uncomfortable.

It's good that you made this point because I agree we shouldn't place too much importance on the "intentional harm vs unintentional harm" distinction. Instead, I think what matters is whether people overall have prosocially-oriented and corrigible cognition.

*That said, I acknowledge that Owen needed more than just the initial pointer from the first time he was approached by Community Health, so I'm not saying this is the most obvious call in the world. It would be a longer topic to elaborate on why I feel confidently optimistic, but people can read for themselves the document he wrote on all of this and see how much (if at all) it makes them feel reassured about steps Owen has taken since then and how much it seems he now has insight into what went wrong and why he did what he did, etc. 

I view power differentials, workplace dating, etc., as things that are risky/delicate but can be fine if done carefully. Even if something goes poorly in one instance, it doesn't necessarily mean that a person did something immoral.

However, when there's a pattern of several people complaining, that's indicative of some kind of problem.

It likely means that either the person was particularly prone to making people really uncomfortable with their advances when they made them, or that the person made a ton of advances in professional contexts (and a small portion of them left people unusually uncomfortable). I think both of these would be bad, for different reasons.

(Why is it bad to make tons of careful advances? I feel like it's bad because it reflects not taking seriously the view that one's prior should be against it being a good idea, especially if your professional context is about having impact rather than being a means for getting romance or sex.)
