Lukas_Gloor

6171 karma · Joined Jan 2015

Sequences
1

Moral Anti-Realism

Comments
494

  1. and 2. seem very similar to me. I think it's something like that.

The way I envision him (obviously I don't know and might be wrong):

  • Genuinely cares about safety and doing good.
  • Also really likes the thought of having power and doing earth-shaking stuff with powerful AI.
  • Looks at AI risk arguments with a lens of motivated cognition influenced by the bullet point above.
  • Mostly thinks things will go well, but this primarily comes from the instinctive feel typical of high-energy CEOs, who are predominantly personality-selected for optimistic attitudes. If he were to really sit down and try to introspect on his views on the question (and stare into the abyss), as a very smart person, he might find that he thinks things might well go poorly, but then thoughts come up like "ehh, if I can't make AI go well, others probably can't either, and it's worth the risk especially because things could be really cool for a while or so before it all ends."
  • If he ever has thoughts like "Am I one of the bad guys here?" he'll shrug them off with "nah" rather than having the occasional existential crisis and self-doubts around that sort of thing.
  • He maybe has no stable circle of people to whom he defers on knowledge questions; that is, no one outside himself he trusts as much as himself. He might say he updates to person x or y and considers them smarter than himself/better forecasters, but in reality, he "respects" whoever is good news for him as long as they are good news for him. If he learns that smart people around him are suddenly confident that what he's doing is bad, he'll feel system-1 annoyed at them, which prompts him to find reasons to now disagree with them and no longer consider them included in his circle of epistemic deference. (Maybe this trait isn't black and white; there's at least some chance that he'd change course if 100% of people he at one point in time respects spoke up against his plan all at once.)
  • Maybe doesn't have a lot of mental machinery built around treating it as a sacred mission to have true beliefs, so he might say things about avoiding hardware overhang as an argument for OpenAI's strategy and then later do something that seemingly contradicts his previous stance, because he was using arguments that felt like they'd fit but without really thinking hard about them and building a detailed model for forecasting that he operates from for every such decision.

Related to your point 1:

I think one concrete complexity-increasing ingredient that many (but not all) people would want in a utopia is for one's interactions with other minds to be authentic – that is, they want the right kind of "contact with reality."

So, something that would already seem significantly suboptimal (to some people at least) is lots of private experience machines where everyone is living a varied and happy life, but everyone's life in the experience machines follows pretty much the same template and other characters in one's simulation aren't genuine, in the sense that they don't exist independently of one's interaction with them (meaning that your simulation is solipsistic and other characters in your simulation may be computed to be the most exciting response to you, but their memories from "off-screen time" are fake). So, while this scenario would already be a step upwards from "rats on heroin"/"brains in a vat with their pleasure hotspots wire-headed," it's still probably not the type of utopia many of us would find ideal. Instead, as social creatures who value meaning, we'd want worlds (whether simulated/virtual or not doesn't seem to matter) where the interactions we have with other minds are genuine. That these other minds wouldn't just be characters programmed to react to us, but real minds with real memories and "real" (as far as this is a coherent concept) choices. Utopian world setups that allow for this sort of "contact with reality" presumably cannot be packed too tightly with sentient minds.

By contrast, things seem different for dystopias, which can be packed tightly. For dystopias, it matters less whether they are repetitive, whether they're lacking in options/freedom, or whether they have solipsistic aspects to them. (If anything, those features can make a particular dystopia more horrifying.)

To summarize, here's an excerpt from my post on alignment researchers arguably having a comparative advantage in reducing s-risks:

Asymmetries between utopia and dystopia. It seems that we can “pack” more bad things into dystopia than we can “pack” good things into utopia. Many people presumably value freedom, autonomy, some kind of “contact with reality.” The opposites of these values are easier to implement and easier to stack together: dystopia can be repetitive, solipsistic, lacking in options/freedom, etc. For these reasons, it feels like there’s at least some type of asymmetry between good things and bad things – even if someone were to otherwise see them as completely symmetric.

Here are (finally) some thoughts:

  • Owen clearly doesn't fit the pattern of grandiose narcissism or sociopathy. I could say more about this but I doubt it's anyone's crux, and I prefer to not spend too much time on this.
  • Besides grandiose narcissism or sociopathy, there are other patterns by which people can systematically cause harm to others. I'm mostly thinking of "harm through negligence" rather than with intent (but this isn't to say that grandiose narcissists cause all their harm fully-consciously). Anyway, many of these other patterns IMO involve having a bad theory of mind at least in certain domains. And we've seen that Owen has had this. However, I think it only becomes really vexed/hard to correct if someone (1) lacks a strong desire to improve their understanding of others so as to (i.e., with the prosocial goal being the primary motivation) avoid harming them/to make them more comfortable, or (2) if they are hopelessly bad at improving their understanding of others for reasons other than lacking such a desire in the first place.
  • On (1), I'm confident that Owen has a strong desire to improve his understanding of others so as to avoid harming them and make people positively comfortable. I don't remember the specifics, but I remember thinking that he's considerate in a way that many people aren't, which suggests that other people's feelings and comfort are often on his mind (edit) at least in some contexts. (E.g., in conversations about research, he'd often ask if what he's been saying so far is still helpful, or if we should move on to other questions. But more importantly, I think I also noticed signs of scrupulosity related to how he picks words carefully to make sure he doesn't convey the wrong thing, which I think is linked with not wanting to be bad socially or not wanting to come to the wrong conclusions about research relevant to the other party's path to impact.) Sure, people who are scared of social rejection can also be hypervigilant like that, and sometimes it's more people-pleasing than sincere concern, but I also felt like I picked up on "he's sincerely trying to be helpful" when talking to him. Though this is more of an intuition-based judgment than anything where I can say "this specific thing I've observed is the reason." (Edit: Actually, one concrete thing that comes to mind is that he was among the very few people who said things about s-risks that made my research priorities seem less important, so he was honest about this in a way that exposes him to me thinking less of him if I were the sort of person who would take stuff like this personally.) Lastly, I think the apology really speaks to all this as well. (That said, I guess someone with cynical priors could point out that the apology might have been written in a very different way if it wasn't for showing it to friends – I doubt that it would be fundamentally different, and in fact I don't even know if he showed it to other people before posting [though I think that would be the wise thing to do]. In any case, I agree it's important to look at whether what people say in their apology is consistent with what we know about them from other situations; anyway, my answer to that is "yes, feels very consistent.")
  • Regarding (2), I have no doubts that Owen can greatly improve his understanding of others. He seems among the most "interested in introspection and analyzing social stuff" men that I've met, and he's very intelligent, so it's not like he lacks the cognitive abilities or interest to improve.
  • This leaves us with "are there other character obstacles that we should expect that would stop him from improving sufficiently?" The other negative patterns that I can think of in this domain are:
    (a.) Extreme entitlement;
    (b.) being very bad at taking feedback to heart because one "flinches away" from bad parts of one's psychology, to the point that all one's introspection is premature and always an exercise in ego protection;
    (c.) issues with externalizing shame/bad feelings, such as (e.g.) an underlying drive to emotionally control others or "drag them down to one’s level" when one is feeling bad;
    (d.) domain-specific tendency to form strong delusional beliefs, like believing that people who aren't attracted to you are attracted to you, and keeping these even in the light of clear counterevidence.
    Of the above, I think (c.) doesn't fit the pattern of observed harm in this case here (this would be more relevant in people who make false accusations), so let's focus on (a.), (b.) and (d.).
  • Regarding (a.) ("extreme entitlement"), I haven’t observed anything that would make me worry that Owen is very entitled. Admittedly, I only know him from a few occasions in professional (or semi-professional – e.g., talking in a small group of people after a retreat) contexts, and I'm not a woman he's attracted to, so I may not have seen all sides. Still, as a further consistency check on what I already thought about him before all of this based on my intuitions from meeting him, I note that having extreme entitlement would be in tension with not causing bigger issues after being rejected (and it seems like he faced a lot of rejection in these cases), nor is it consistent with writing an apology where the fault isn't placed on other people.
  • Regarding (b.) ("a pattern of habitually flinching away from self-critical thoughts"), my best guess is that Owen’s interest in introspection is too deep for a person with this issue. (Edit: Also, he partly felt driven to mention his attraction because there was shame attached to it, and even though this made him maybe not think clearly about everything, it's not like, if I understand correctly, that he flinched away from mentioning or at least noticing the shame altogether.) The people I know who exhibited this "flinching away" pattern to a severe degree seemed uncomfortable with serious introspection in the first place. They sometimes were very quick to apologize, but they seemed to distort what they were accused of to a point where they were only apologizing for things that are easy to apologize for. By contrast, I feel like Owen's apology admits a bunch of things about himself that aren't easy to say, so it feels like genuine self-work went into it.
  • Regarding (d.) ("motivated cognition dialed up into proper delusions"), I get that people are concerned about this when they read the accounts of what happened. I am too, a little bit. However, I think that it's not super uncommon to not pick up on people feeling uncomfortable when they try to hint at this discreetly. I also think the instances where Owen kept talking to someone about topics they expressed they wanted him to stop talking about are at most only one or two instances? I feel like if he were severely deluded in the sense of flat-out not cognitively accepting when his advances are rejected, he'd have kept talking to people a lot more than what he actually did? So, I think the levels of motivated cognition here were most likely "bad, but not too far out of the ordinary" (i.e., nowhere near "stalker levels"), so I'm not too worried about this for the future. We should also keep in mind how this was a big event in Owen's life he's unlikely to forget, and how much updating he's probably been doing (and he also said he's been discussing this stuff with a therapist).
  •  Lastly, something that goes into my assessment of (d.) that I could imagine other people missing is that I feel I have a good model of what went wrong. In that model, it's not necessary for (d.) (or anything else that would spell trouble) to be pathological about Owen's psychology. Instead, I think what went poorly is primarily explained by (i) bad theory of mind and (ii) character-related scrupulosity that led to persistently bringing up things that were no longer appropriate (or were never appropriate in the first place, such as with some of the comments or locations where things were brought up).
  • [Now describing my model.] When people read Owen's account of why he did what he did, one reaction they might have is the following. "So, okay, he felt attracted to someone, he worried that this attraction would make him a bad person, he wanted to get feedback to reassure himself that he isn't bad for this. So far, so good. But then, out of everyone he could've picked for that conversation, why on earth would he pick the person he's attracted to to discuss this with? Why not discuss it with anyone else? Doesn't that mean there must be some more sinister underlying motive at play? How could 'wanting to be reassured that he isn't bad' be anything but an excuse to make repeated advances?" My reaction to that interpretation is "people who think this are probably missing something about what it's like to have character-related scrupulosity."
  •  I'm only speculating here, but it feels like the sort of thing I'm likely right about at least directionally. Namely, my guess is that Owen, over the five or so years that this is about, confessed his attraction also to the women he was attracted to (instead of only seeking out other people to discuss his feelings with) and sought absolution from them partly because he cared particularly about what these women think about him, which was related to him being attracted to them in the first place (meaning he admired them character-wise).
  • [Edit: paragraph added roughly 8h after posting the comment] Owen reached out to me (first time we communicated in several years) after I wrote this comment and said he felt seen by this paragraph (the one right above), but that I was missing a further factor. He felt – at the time – like the women in question had moral authority on this topic. Quoting: "If they didn't mind my attraction, then by their judgement my attraction wasn't bad. If they did mind it, it was bad (and I should take therefore take further internal steps to suppress the feeling)." Don't take from this that Owen necessarily agrees with all the other descriptions in my long comment, but I wanted to highlight this bit in particular because it sheds more light on why he talked about his attraction to the women directly. I didn't think of this possibility independently, but it seems credible/consistent to me. (I got permission from Owen to share.)
  • In some men, it's common to feel like the ultimate judge of your character is a woman with desirable character qualities herself. If you have a strong desire to be trusted and accepted for who you are, not for who others think you are but for who you actually are, it makes sense that you overshare weird details about yourself that you feel the least comfortable about. And someone with obsessive tendencies in that area may make the mistake of doing this too often or in contexts where it isn't appropriate. This is a common motivation (I know because I have the same feelings around some of this), and it IMO explains a lot of what happened, so it's not like we need to postulate other outlier-y things (extreme entitlement, extreme propensity to form delusional beliefs, etc.) to explain why Owen did what he did. (Other than poor theory of mind and common levels of motivated cognition, I mean.)
  • Lastly, and ironically, I think part of what caused things to go wrong here is actually protective for avoiding bad outcomes in the future. It makes it so much easier to evaluate someone's character if the person proactively gives you lots of information about it and if one of their primary drives seems to be to help you with that. It serves to rule out a lot of ways someone's character could be, but isn't. (Like, I’m pretty sure if Owen had had any sinister motives besides just having a crush on the women, he'd have confessed those other motives as well to the women in question, and we'd have a bigger scandal.) I think this is a positive thing and one reason I'm drawn to defending Owen here as someone who doesn't know him super well (without saying that he didn't do anything wrong) is because I can easily imagine how other people would react in similar situations, and I want to flag that I like it when people make it easier for others to evaluate them.
  • A caveat here is that, as we've seen, a desire to be trusted and act so as to earn trust by proactively doing lots of introspection and sharing negative information by no means implies that someone is free from massive blind spots or self-deception. (There's even a hypothesis that people who practice radical honesty and transparency thereby dial up their self-deception – see Holly Elmore’s post on privacy, which seems relevant here, though (a) "making oneself transparent on matters relevant to trust" isn't exactly the same as "declining to have any sort of privacy in any context," and so, (b), I'm not sure I'd say that self-deception is necessarily dialed up in all instances of trying to do the former. Still, it seems plausible that it could come with that sort of risk.) Anyway, despite the concern that self-deception and blind spots remain very much possible/to be expected, I think that people who try to make themselves transparent to have their character more easily evaluated are in fact easier to evaluate for good character than people who don't do that.

I'm not sure why your comment was downvoted. I think it's a perfectly reasonable request since, as you say correctly in other comments, people who don't know enough to form their own opinion can't just trust that other forum commenters with direct opinions are well-calibrated/have decent people judgment about this.

I started writing down some points, but it's not easy and I don't want to do it in a half-baked fashion and then have readers go "oh, those data points and interpretations all sound pretty spurious, if that's all you have, it seems weird that you'd even voice an opinion." It's often hard to put in words convincingly why you believe something about someone.

I might still get around to finishing the comment at some point in the next few days, but don't count on it.

I agree that the women affected are what this is primarily about. But there's also an issue with not wanting to ascribe to anyone how we think they likely feel, without knowing much about them. Like, maybe at least some of the women who had negative experiences have nuanced feelings that aren't best described as "I feel bad/invalidated whenever I see someone say positive things about Owen, even if they take care to not thereby downplay that the things he did weren't acceptable." Maybe some feel things like, "this stuff was messed up and really needed to be dealt with, and it sucks that it took so long/seemed like initially it wasn't going to be dealt with, but it seems like things are developing in a good direction now." Or maybe not! Maybe they're still super upset and wish that Owen never re-enter the community again. That would be their right and seems understandable, too. In any case, the way I see it, we don't know at this point (at least I don't), and while I agree that it's important to create encouraging incentives so people will be likely to report future instances of misconduct, I don't think this requires a policy of "avoid at all costs saying things that might make someone who was affected uncomfortable." (In fact, there's also a risk of making people less likely to report uncomfortable experiences if they worry that there'll be a community overreaction. That's not the first thing I'd worry about, to be clear; I'm just pointing out that this could happen/be someone's reason to be hesitant about speaking up about something.) Personally, the message I find most important is something like: I want us to take seriously that it's unacceptable for people to predictably be at risk of having bad experiences like that in the EA community, and the community/Community Health takes this seriously and takes appropriate action.

I expressed support of another person's comment that contained many positive points about Owen. I hope that no one feels like this means I'm "on Owen's side" rather than on the side of the people who brought up these complaints. Owen seems to largely agree about the facts of what happened, and he seems genuinely committed to making sure similar stuff doesn't happen again, plus he accepted the consequences (stepping down, two-year ban). These features of the situation IMO make it possible to not have to view this as "either you support the victims, or you can say a redeeming thing or two about Owen."

This option of "not thinking of the situation as one where it's about picking sides" isn't always available. If a person accused of causing harm goes DARVO and in return accuses the alleged victims of being malefactors who make up false stories, then one is forced to either side with the alleged victims, or with the accused. Similarly, sometimes someone does something that's immediately strong evidence that they are operating without even a desire to respect others (Milena Cenzler's comment originally contained a hyperlink to the case of Brock Turner, who sexually assaulted an unconscious woman, as an example of how comments by supporters can sometimes be re-traumatizing to the victims). In those cases, it also seems to me like one can't say much that's redeeming about the person who caused harm without this being disrespectful towards the victims, both because of the severity of what they faced and because there's not much redeeming you can say about someone who even lacks a basic desire/intent to respect others. But that sort of case has very different features from "confessed feeling attracted out of scrupulosity and misguided desire to get moral absolution from the people one is attracted to for not having to feel bad about the attraction." It's a very different thing.
(Edit) Lastly, I guess sometimes someone can be a skilled manipulator and seem remorseful and accept consequences but downplay the extent of the harm and downplay their "bad character." If Owen were like that and people who said positive things about him just fell victim to his charm, that could also be invalidating for the people who were harmed. I don't think that's the case, but this would be another situation we want to try to avoid, so I feel like the people who say positive things about Owen have a responsibility to consider the possibility "am I being manipulated?"

I agree with those points and they seem important.

I didn't write this further above, but thinking about it now, I think there was also another dimension that fed into me thinking of this case as "atypical." (But maybe this isn't the best wording and these things are more typical than we think, but what I'm trying to gesture at is "the sort of thing that has high chances of getting fixed.") In any case, when I think of cases of "harm through neglect," where someone isn't ill-intentioned but still has a pattern of making others uncomfortable, some cases that come to mind are with people who are kind of hopeless: their personality and psychology seem tragic, and they seem unlikely to improve and fix all the stuff that is at risk of causing harm in different ways without an excessive amount of supervision/handholding. Importantly, Owen very much doesn't seem to me like that either.* So, according to my interpretation and guesses, there is indeed less potential for future harm than in many other conceivable cases where someone, e.g., received a two-year ban for making people uncomfortable.

It's good that you made this point because I agree we shouldn't place too much importance on the "intentional harm vs unintentional harm" distinction. Instead, I think what matters is whether people overall have a prosocially-oriented and corrigible cognition.

*That said, I acknowledge that Owen needed more than just the initial pointer from the first time he was approached by Community Health, so I'm not saying this is the most obvious call in the world. It would be a longer topic to elaborate on why I feel confidently optimistic, but people can read for themselves the document he wrote on all of this and see how much (if at all) it makes them feel reassured about steps Owen has taken since then and how much it seems he now has insight into what went wrong and why he did what he did, etc. 

I view power differentials, workplace dating, etc., as something that's risky/delicate, but it can be fine if done carefully. Even if something goes poorly in one instance, it doesn't necessarily mean that a person did something immoral.

However, when there's a pattern of several people complaining, that's indicative of some kind of problem.

It likely means that either a person was particularly likely to make people really uncomfortable with their advances when they made them, or that the person made a ton of advances in professional contexts (and a small portion of them left people unusually uncomfortable). I think both of these would be bad, for different reasons.

(Why is it bad to make tons of careful advances? I feel like it's bad because it reflects not taking seriously the view that one's prior should be against it being a good idea, especially if your professional context is about having impact rather than a means for getting romance or sex.)

Not sure if everyone does it this way, but I find agree/disagree votes more important for what you're saying than merely upvotes. In cases like this, I would use agree/disagree votes if I know a lot about either Owen directly, or about Jonas's judgment in situations like this.* Even though it's technically anonymous, I think of agree/disagree votes in situations like this as "staking a small part of my own reputation on the claims in the comment." I'd use upvotes more liberally and upvote things that sound potentially important or insightful even if I'm still unsure about them.

*I guess a third case is if I think a comment uses weird reasoning that makes me think the person who wrote it has bad people judgment, I could also see myself disagree-voting it from a distance/without any more direct knowledge.

For reasons I went into here, I think it often sets things up for vexed discussion dynamics when we're criticizing how others are reacting or aren't reacting, and whether they are emphasizing the right points with the appropriate degree of strength. (I do this myself occasionally, and there isn't anything wrong with doing it, per se. I'm just pointing out why we're doomed to have an unpleasant discussion experience.)

I would even add that assuming that the community will conflate Owen and Epstein's case is patronizing and far-fetched;

I feel like you're being uncharitable here; I was commenting on "not leaving much space for things to be worse" rather than making a specific claim about conflation.

Edited to add:

I do not see anyone in the comment claiming that he is more than what is described in the post, and in general, I do not see anything pointing at overaction from anybody.

Just want to flag that I agree with this. It still doesn't seem unreasonable to me to proactively make the sort of comment that Jonas made (in fact, I liked the comment a lot). But I also see why you find it odd to do this "unpromptedly." 

2nd edit 24h later: Someone has now made a comment pointing out how maybe Jonas just got "charmed." The comment stating this hypothesis got lots of upvotes, which is okay/good for balance, but it shows why Jonas's comment was very valuable in the first place. When you ban someone from community spaces for two years, it really is in a lot of people's mind that the person might be antisocial and manipulative. That's the sort of associations this conjures up. It's important to point out how this case is atypical and that there are many reasons to assume that risk of future harm is very low – as others have pointed out with concrete examples/arguments. (Which I also believe to be the case firsthand, so not just based on updating to Jonas, lyra, Emma, etc.)

[T]here is a certain irony to see these two people coming to defend Owen while the community health head, Julia, admits to a certain level of bias when handling this affair since he was her friend.

Jonas's comment includes statements like "This obviously doesn’t make his past behavior any less bad and doesn’t excuse any of it" and "I think a temporary ban is important, both as an incentive against bad behavior and as a precaution so the harms don’t continue. That said, two years are a long time, [...]"

So, I don't think this would be repeating the mistakes that the community health team acknowledged. I also think Jonas made important points.

Generally, I think as long as someone acknowledges what's at stake for both sides (underreaction vs overreaction), it should be okay to try to add important nuance to a conversation.

Regarding lyra's comment, among other things, she says that she thinks Owen would be positively welcoming for lots of communities. In light of several people speaking up about their negative experiences, I can see why that's weird to hear. Like, even if it's true going forward (I think it might be!), I personally would have liked it better if lyra's comment had contained more of a mention of how she agrees the patterns described in the update were indeed concerning and very much warranted action. (I don't know who lyra is – it's plausible that she'd agree with that and simply didn't bother to write it because she thought readers can already see all the negative descriptions.)

Can you put yourself two seconds in the shoes of these women who received unwanted and pressing attention from Owen,

Again, I can see why you're writing this in response to lyra's comment. At the same time, I want to point out that "being concerned about overreactions" isn't just one-sidedly bad for the goal of improving community safety and welcomingness. If the community doesn't have a sense of how there's still a big difference in "potential for future harm" between someone like Owen and someone like Jeffrey Epstein, then how is that conducive to welcomingness and safety? This isn't yet a fleshed out argument, but I have the intuition that if we react with all-out outrage at even comparatively more fixable/improvable incidents of harm, we'll do a worse job at having the energy and vigilance to maximally pursue the worst cases of it, which often also involve a lot of deception and "enemy action" that distorts the discourse around what's happening and around "who's the bad party?" In those situations, you really want people who can pay attention to lots of subtleties of character and don't fall into outrage traps around whose side is currently winning.