All of Scott Alexander's Comments + Replies

I thought we already agreed the demon case showed that FDT wins in real life, since FDT agents will consistently end up with more utility than other agents.

Eliezer's argument is that you can become the kind of entity that is programmed to do X, by choosing to do X. This is in some ways a claim about demons (they are good enough to predict even the choices you made with "your free will"). But it sounds like we're in fact positing that demons are that good - I don't know how to explain how they have 999,999/million success rate otherwise - so I think he is r... (read more)

1
Omnizoid
8mo
We all agree that you should get utility. You are pointing out that FDT agents get more utility. But once they are already in the situation where they've been created by the demon, FDT agents get less utility. If you are the type of agent to follow FDT, you will get more utility, just as if you are the type of agent to follow CDT while being in a scenario that tortures FDTists, you'll get more utility. The question of decision theory is, given the situation you are in, what gets you more utility--what is the rational thing to do. Eliezer's theory turns you into the type of agent who often gets more utility, but that does not make it the right decision theory. The fact that you want to be the type of agent who does X doesn't make doing X rational if doing X is bad for you and not doing X is rewarded artificially. Again, there is no dispute about whether on average one-boxers or two-boxers get more utility, or which kind of AI you should build.

I think rather than say that Eliezer is wrong about decision theory, you should say that Eliezer's goal is to come up with a decision theory that helps him get utility, and your goal is something else, and you have both come up with very nice decision theories for achieving your goal.

(what is your goal?)

My opinion on your response to the demon question is "The demon would never create you in the first place, so who cares what you think?" That is, I think your formulation of the problem includes a paradox - we assume the demon is always right, but also, tha... (read more)

1
Omnizoid
8mo
The demon case shows that there are cases where FDT loses, as is true of all decision theories. If the question is which decision theory, when programmed into an AI, will generate the most utility, then that's an empirical question that depends on facts about the world. If it's which choice, once you're in a situation, will get you the most utility, well, that's causal decision theory. Decision theories are intended as theories of what is rational for you to do. So they describe which choices are wise and which choices are foolish. I think Eliezer is confused about what a decision theory is, but that is a reason to trust his judgment less. In the demon case, we can assume it's only almost infallible, so it makes a mistake once every million times. The demon case is a better example, because I have some credence in EVT, and EVT entails you should one-box. I am waaaaaaaaaaaay more confident FDT is crazy than I am that you should two-box.
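A minimal sketch of the expected-value arithmetic behind the "EVT entails you should one-box" claim above, assuming the standard Newcomb payoffs of $1M in the opaque box and $1k in the transparent one (my assumption, not stated in the thread) and the 999,999-in-a-million predictor accuracy mentioned earlier:

```python
# Expected-value comparison in Newcomb's problem under a near-perfect predictor.
# Payoff values are illustrative assumptions; accuracy is the figure from the thread.

ACCURACY = 999_999 / 1_000_000   # P(prediction matches your actual choice)
BIG, SMALL = 1_000_000, 1_000    # hypothetical opaque-box and transparent-box payoffs

# One-box: you get BIG only if the predictor (correctly) predicted one-boxing.
ev_one_box = ACCURACY * BIG + (1 - ACCURACY) * 0

# Two-box: you always get SMALL, plus BIG only if the predictor erred.
ev_two_box = ACCURACY * SMALL + (1 - ACCURACY) * (BIG + SMALL)

print(f"EV(one-box) = ${ev_one_box:,.0f}")   # ~$999,999
print(f"EV(two-box) = ${ev_two_box:,.0f}")   # ~$1,001
```

The causal decision theorist's reply, of course, is that by the time you choose, the boxes are already filled, so these conditional expectations are not the right quantity to maximize; the sketch only shows why evidential reasoning favors one-boxing, not that one-boxing is correct.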

I guess any omniscient demon reading this to assess my ability to precommit will have learned I can't even precommit effectively to not having long back-and-forth discussions, let alone cutting my legs off. But I'm still interested in where you're coming from here since I don't think I've heard your exact position before.

Have you read https://www.lesswrong.com/posts/6ddcsdA2c2XpNpE5x/newcomb-s-problem-and-regret-of-rationality ? Do you agree that this is our crux?

Would you endorse the statement "Eliezer, using his decision theory, will usually end out with... (read more)

1
Omnizoid
8mo
I would agree with the statement "if Eliezer followed his decision theory, and the world was such that one frequently encountered lots of Newcomb's problems and similar, you'd end up with more utility." I think my position is relatively like MacAskill's in the linked post, where he says that FDT is better as a theory of the agent you should want to be than of what's rational. But I think that rationality won't always benefit you. I think you'd agree with that. If there's a demon who tortures everyone who believes FDT, then believing FDT, which you'd regard as rational, would make you worse off. If there's another demon who will secretly torture you if you one-box, then one-boxing is bad for you! It's possible to make up contrived scenarios that punish being rational--and Newcomb's problem is a good example of that. Notably, if we're in the twin scenario or the scenario that tortures FDTists, CDT will dramatically beat FDT. I think the example that's most worth focusing on is the demon leg-cutting case. I think it's not crazy at all to one-box, and I have maybe 35% credence that one-boxing is right. I have maybe 95% credence that you shouldn't cut off your legs in the demon case, and 80% confidence that the position that cutting them off is rational is crazy, in the sense that if you spent years thinking about it while being relatively unbiased you'd almost certainly give it up.

Sorry if I misunderstood your point. I agree this is the strongest objection against FDT. I think there is some sense in which I can become the kind of agent who cuts off their legs (ie by choosing to cut off my legs), but I admit this is poorly specified.

I think there's a stronger case for, right now, having heard about FDT for the first time, deciding I will follow FDT in the future. Various gods and demons can observe this and condition on my decision, so when the actual future comes around, they will treat me as an FDT-following agent rather than a non... (read more)

2
Omnizoid
8mo
I know you said you didn't want to repeatedly go back and forth, but . . . Yes, I agree that if you have some psychological mechanism by which you can guarantee that you'll follow through on future promises--like programming an AI--then that's worth it. It's better to be the kind of agent who follows FDT (in many cases). But the way I'd think about this is that this is an example of rational irrationality, where it's rational to try to get yourself to do something irrational in the future because you get rewarded for it. But remember, decision theories are theories about what's rational, not theories about what kind of agent you should be. I think we agree with all three of the following claims:

1. If you have some way to commit in advance to follow FDT in cases like the demon case or the bomb case, you should do so.
2. Once you are in those cases, you have most reason to defect.
3. Given that you can predict that you'll have most reason to defect, you can sort of psychologically make a deal with your future self where you say "NO REALLY, DON'T DEFECT, I'M SERIOUS."

My claim, though, is that decision theory is about 2, rather than 1 or 3. No one disputes that the kinds of agents who two-box do worse than the kinds of agents who one-box--the question is about what you should do once you're in that situation. If an AI is going to encounter Newcomb's problem a lot, everyone agrees you should program it to one-box.

Were there bright people who said they had checked his work, understood it, agreed with him, and were trying to build on it? Or just people who weren't yet sure he was wrong?

'Were there bright people who said they had checked his work, understood it, agreed with him, and were trying to build on it?'

Yes, I think. Though my impression (Guy can make a better guess of this than me, since he has a maths background) is that they were an extreme minority in the field, and all socially connected to Mochizuki: https://www.wired.com/story/math-titans-clash-over-epic-proof-of-the-abc-conjecture/

'Between 12 and 18 mathematicians who have studied the proof in depth believe it is correct, wrote Ivan Fesenko of the University of Nottingh... (read more)

I don't want to get into a long back-and-forth here, but for the record I still think you're misunderstanding what I flippantly described as "other Everett branches" and missing the entire motivation behind Counterfactual Mugging. It is definitely not supposed to directly make sense in the exact situation you're in. I think this is part of why a variant of it is called "updateless", because it makes a principled refusal to update on which world you find yourself in in order to (more flippant not-quite-right description) program the type of AIs that would w... (read more)

1
Omnizoid
8mo
Oh sorry, yeah I misunderstood what point you were making. I agree that you want to be the type of agent who cuts off their legs--you become better off in expectation. But the mere fact that the type of agent who does A rather than B gets more utility on average does not mean that you should necessarily do A rather than B. If you know you are in a situation where doing A is guaranteed to get you less utility than B, you should do B. The question of which agent you should want to be is not the same as which agent is acting rationally. I agree with MacAskill's suggestion that FDT is the result of conflating what type of agent to be with what actions are rational. FDT is close to the right answer for the first and a crazy answer for the second, imo. Happy to debate someone about FDT. I'll make a post on LessWrong about it. One other point, I know that this will sound like a cop-out, but I think that the FDT stuff is the weakest example in the post. I am maybe 95% confident that FDT is wrong, while 99.9% confident that Eliezer's response to zombies fails and 99.9% confident that he's overconfident about animal consciousness.

I won't comment on the overall advisability of this piece, but I think you're confused about the decision theory (I'm about ten years behind state of the art here, and only barely understood it ten years ago, so I might be wrong).

The blackmail situation seems analogous to the Counterfactual Mugging, which was created to highlight how Eliezer's decision theories sometimes (my flippant summary) suggest you make locally bad decisions in order to benefit versions of you in different Everett branches. Schwarz objecting "But look how locally bad this decision i... (read more)

2
TAG
8mo
The remark about Everett branches rather gives the game away. Decision theories rest on assumptions about the nature of the universe and of the decider, so trying to formulate a DT that will work perfectly in any universe is hopeless.

'The blackmail situation seems analogous to the Counterfactual Mugging, which was created to highlight how Eliezer's decision theories sometimes (my flippant summary) suggest you make locally bad decisions in order to benefit versions of you in different Everett branches. Schwarz objecting "But look how locally bad this decision is!" isn't telling Eliezer anything he doesn't already know, and isn't engaging with the reasoning'

I just control-F searched the paper Schwarz reviewed, for "Everett", "quantum", "many-worlds" and "branch" and found zero hits. Can... (read more)

>that makes extremely bright people with math PhDs make simple dumb mistakes that any rando can notice

Bright math PhDs that have already been selected for largely buying into Eliezer's philosophy/worldview, which changes how you should view this evidence. Personally I don't think FDT is wrong as much as just talking past the other theories and being confused about that, and that's a much more subtle mistake that very smart math PhDs could very understandably make.

This is starting to sound less like "Eliezer is a uniquely bad reasoner" and more like "there's something in the water supply here that makes extremely bright people with math PhDs make simple dumb mistakes that any rando can notice."

Independently of all the wild decision theory stuff, I don't think this is true at all. It's more akin to how, for a good few years, people thought Mochizuki might have proven the ABC conjecture. It's not that he was right - just that he wrapped everything in so much new theory and terminology that it took years for people to understand what he meant well enough to debunk him. He was still entirely wrong.

2
Omnizoid
8mo
If your action affects what happens in other Everett branches, such that there are actual, concretely existing people whose well-being is affected by your action to blackmail, then that is not relevantly like the case given by Schwarz. That case seems relevantly like the twin case, where I think there might be a way for a causal decision theorist to accommodate the intuition, but I am not sure. We can reconstruct the case without torture vs dust specks reasoning, because that's plausibly a confounder. Suppose a demon is likely to create people who will cut off their legs once they exist. Suppose being created by the demon is very good. Once you're created, do you have any reason to cut off your legs, assuming it doesn't benefit anyone else? No! In the twin case, suppose that there are beings named Bob. Each being named Bob is almost identical to the last one--their choices are 99.9% correlated--and can endure great cost to create another Bob when he dies. It seems instrumentally irrational not to bear great costs. I think it's plausible that most people are just not very good at generating true beliefs about philosophy, just as they're not good at generating true beliefs about physics. Philosophy is really fricking hard! So the phenomenon "lots of smart people with a math background rather than a philosophy background hold implausible views about philosophy" isn't news. However, if someone claims to be the expert on physics, philosophy, decision theory, and AI, and then they turn out to be very confused about philosophy, then that is a mark against their reasoning abilities. It's true that there is a separate interesting question about how so many smart people go so wrong about philosophy (note, I'd dispute the characterization that these are errors basic enough that a rando can figure them out--I think it wouldn't have been obvious to me what the errors were if it weren't for MacAskill and Schwarz, who are very much non-randos). But the thing I

Thanks for writing this.

I understand why you can't go public with applicant-related information, but is there a reason grantmakers shouldn't have a private Slack channel where they can ask things like "Please PM me if any of you have any thoughts on John Smith, I'm evaluating a grant request for him now"?

6
Austin
8mo
I think this is worth doing for large grants (eg >$50k); for smaller grants, coordination can get to be costly in terms of grantmaker time. Each additional step of the review process adds to the time until the applicant gets their response and their money.

Background checks with grantmakers are relatively easier with an application system that works in rounds (eg SFF is twice a year, Lightspeed and ACX also do open/closed rounds) -- you can batch them up: "here's 40 potential grantees, let us know if you have red flags on any". But if you have a continuous system like LTFF or Manifund, then every coordination request between two funders adds an interruption point/context switch. I think out of ~20 grants we've made on Manifund, we checked in with LTFF/Lightspeed on 2 of them, mostly not wanting to bother them too much.

Background checks also take longer the more people you're checking with; you can ask in parallel, but you'll be bottlenecked by the time of the slowest respondent, and reliability can get especially hard (what if a grantmaker is sick or on vacation?). You can also try setting a fixed timeline ("we're going to approve this in 48h", say) and find a tradeoff between "enough time for checks to come back" and "not delaying the process overmuch".

Yeah, we're working on something like this! There are a few logistical and legal details to work out, but I think we can at least make something like this work between legible-to-us grantmakers (by my lights: LTFF, EAIF, OP longtermism, Lightspeed, Manifund, and maybe a few of the European groups like Longview and Effective Giving). Obviously there are still limitations (eg we can't systematically coordinate with academic groups, government bodies, and individual rich donors), but I think an expectation that longtermist nonprofit grantmakers talk to each other by default would be an improvement over the status quo.

(Note that weaker versions of this already happen, just not very systematically.)

Okay, so GWWC, LW, and GiveWell, what are we going to do to reverse the trend?

Seriously, should we be thinking of this as "these sites are actually getting less effective at recruiting EAs" or as "there are so many more recruitment pipelines now that it makes sense that each one would drop in relative importance" or as "any site will naturally do better in its early years as it picks the low-hanging fruit in converting its target population, then do worse later"?

GWWC's membership has steadily grown in recent years, so it's not that GWWC isn't getting more people to give significantly and effectively! I think this highlights broader questions about what the focus of the current effective altruism community is, and what it should be.

GWWC team members have advocated for a "big tent" effective altruism where everyone who wants to do good effectively should feel that they can be a part of the community - but anecdotally we hear sometimes that people who are primarily interested in giving don't feel like the broader... (read more)

8
David_Moss
9mo
I think this is one case where it's useful to also look at the absolute numbers from each source across years rather than the percentages. (Usually the absolute totals risk being very misleading/confusing, because the total numbers of respondents from different surveys vary a lot; e.g. 2020 was a nadir, with just over 2000 respondents, whereas in 2022 we recruited over 3500.) The raw numbers show similar numbers from GiveWell across years, whereas GWWC's raw totals do seem lower than their peak in earlier years. So it doesn't seem to simply be the case that they've been outpaced by faster growth from 80K or others.
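A tiny illustration, with made-up source counts rather than the actual survey data, of why a source's percentage share can fall even when its absolute numbers hold steady, purely because the respondent total grew (the 2000 and 3500 totals are the figures mentioned above; the 200 is hypothetical):

```python
# Hypothetical counts (illustrative only, not EA Survey data): a source that
# recruits a constant 200 respondents while the overall survey total grows.
years = {
    2020: {"source": 200, "total": 2000},
    2022: {"source": 200, "total": 3500},
}

for year, n in years.items():
    share = 100 * n["source"] / n["total"]
    print(f"{year}: {n['source']} of {n['total']} respondents -> {share:.0f}% share")
# 2020: 10% share; 2022: ~6% share, despite identical absolute recruitment.
```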
7
Jason
9mo
Or, for GiveWell and to a lesser extent GWWC, "changes in EA's focus over the years have made it less likely that the population interested in GiveWell would be engaged enough in the EA meta to see and complete the survey?"

I think the elephant in the room might be OpenPhil spending at least $211m on "Effective Altruism Community Growth (Longtermism)", including 80k spending $2.6m in marketing in 2022.[1] 

As those efforts get results I expect the % of EA growth from those sources to increase.

I also expect EA™ spaces where these surveys are advertised to over-represent "longtermism"/"x-risk reduction" (in part because of donor preferences, and in part because they are more useful for some EAs), so that would impact the % of people coming to these spaces from things like Gi... (read more)

The first point seems to be saying that we should factor in the chance that a program works into cost-effectiveness analysis. Isn't this already a part of all such analyses? If it isn't, I'm very surprised by that and think it would be a much more important topic for an essay than anything about PEPFAR in particular.

The second point, that people should consider whether a project is politically feasible, is well taken. It sounds like the lesson here is "if you find yourself in a situation where you have to recommend either project A or B, and both are good,... (read more)

Good points. Agree that "always go for a big push instead of incrementalism" is waaayyyy too simple and sweeping a lesson to draw from PEPFAR. Also, three cheers for not lying. I think the World Bank was right not to suppress its data on the low cost-effectiveness of ARV drugs circa the mid-2000s. But in retrospect, I think people drew bad policy conclusions from that data.

My piece above is largely a plea for a little bit of intellectual humility and introspection on the part of the cost-effectiveness crowd (in which I'm often an active participant). If we... (read more)

3
PatrickL
1y
Yeah good find, I also think that passes the bar. Although I do think people have generally overestimated GPT's essay-writing ability compared to humans, and think I might be falling for that here. I'm not planning to change the doc because Bing's AI wasn't released by Feb 23, but if you think it should be included (which would be reasonable given OpenAI pretty obviously made this before Feb 23), it would mean:

* Experts expected 9 milestones to be met vs actually 11 milestones
* The calibration curve looks four percentage points worse at the 10% mark
* Bulls' Brier score: 0.29
* Experts' Brier score: 0.24
* Bears' Brier score: 0.29

I've added it to this tracker of milestones (feel free to request edit access).
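For readers unfamiliar with the scoring rule being quoted, here is a minimal sketch of how a Brier score is computed from probabilistic forecasts; the forecast values and outcomes below are made up for illustration and are not the survey data.

```python
# Brier score = mean squared error between forecast probabilities and outcomes
# (0 = perfect, 0.25 = always saying 50%, 1 = confidently wrong every time).
def brier_score(forecasts, outcomes):
    """forecasts: probability each milestone is met; outcomes: 1 if met, 0 if not."""
    return sum((p - o) ** 2 for p, o in zip(forecasts, outcomes)) / len(forecasts)

# Illustrative example (hypothetical numbers, not the actual survey forecasts):
forecasts = [0.9, 0.2, 0.6, 0.1]
outcomes  = [1, 0, 1, 1]
print(round(brier_score(forecasts, outcomes), 3))  # 0.255
```

Lower is better, which is why the experts' 0.24 beats the bulls' and bears' 0.29 in the comparison above.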

I find this interesting, but also find it somewhat hard to identify any meaningful patterns. For example, one could expect red points to be clustered at the top for Manifold, indicating that more forecasts equal better performance. But we don't see that here. The comparison may be somewhat limited anyway: in the eyes of the Metaculus community prediction, all forecasts are created equal. On Manifold, however, users can invest different amounts of money. A single user can therefore in principle have an outsized influence on the overall market price if they are will

... (read more)
2
nikos
1y
I slightly tend towards yes, but that's mere intuition. As someone on Twitter put it, "Metaculus has a more hardcore user base, because it's less fun" - I find it plausible that the Metaculus user base and the Manifold user base differ. But I think higher trading volume would have helped. For this particular analysis I'm not sure correcting for the number of forecasters would really be possible in a sound way. It would be great to get the MetaculusBot more active again to collect more data.

Thank you. I misremembered the transcription question. I now agree with all of your resolutions, with the most remaining uncertainty on translation.

Thank you for doing this! I was working on a similar project and mostly came up with the same headline finding as you: the experts seemed well-calibrated. I did  decide a few of the milestones a little differently, and would like to hear why you chose the way you did so I can decide whether or not to change mine.

  • Zach Stein-Perlman from AI Impacts said that he thought "efficiently sort very large lists" and "write good Python code" were false, because the questions said it had to be done in a certain way by a certain type of neural net, and that wasn't
... (read more)
3
Meefburger
1y
I think it's reasonable to go either way on Starcraft. It's true that the versions of AlphaStar from three years ago were not beating the best humans more than half the time, and they did not take screen pixels as inputs. But those models were substantially inhibited in their actions per minute, because computers that can beat humans by being fast are boring. Given that the version of AlphaStar that beat MaNa was already throttled (albeit not in the right way to play like a human), I don't see why an AI with no APM restrictions couldn't beat the best humans. And I don't see any particular reason you couldn't train an image classifier to get from screen pixels to AlphaStar's inputs. So I think this mostly comes down to whether you think a realistic APM limit was implied in the prediction, and what your bar is for "feasible".
2
PatrickL
1y
I've only given the data a quick look and found it hard to analyse - but yeah, many of the forecasts look bad. But some of the medians (I think, from eyeballing the data) seem not terrible - the 'create top forty song' milestone shifted from a 10-year median to a ~5-year median, and the 'answer open-ended questions' milestone shifted from a 10-year median to ~3 years. But like you say, many of the milestones I resolved as being met before this survey went out still have medians >0 years from now, so - if I'm right in my judgements - the experts seem pretty poorly clued up on recent developments across the field.

This is great - thanks for this comment! I've gone through each to explain my reasoning. Your comments/sources changed my opinion on Starcraft and Explain - I've updated the post and scores to reflect this, and think the conclusion is now the same but slightly weaker, because the experts' Brier score is 0.2 points worse, but the comparative Brier scores are also worse by a similar amount. There's also my reasoning for other milestones in the appendix (and I've copy-pasted some of them below).

Zach Stein-Perlman from AI Impacts said that he thought "efficien

... (read more)
3
Lorenzo Buonanno
1y
I think the question says: As a data point, it seems to me that OpenAI's Whisper large model is probably above typical human transcription quality for standard accents in non-noisy environments. E.g. it transcribes correctly "Hyderabad" from here  (while YouTube transcribes it as "hyper bus and").[1] For "noisy environments with a variety of accents", it was surprisingly hard to find a sample. From this, it generates this, which does seem worse than a typical human, so I would also resolve this as "false" if OpenAI's Whisper is the state of the art, but I wouldn't say that it doesn't seem close.   As another data point, for English <-> Italian it's usually better than me. But it really struggles with things like idioms. 1. ^ Here's the full transcription of that talk. (It does transcribe "Jacy" as "JC", but I still think the typical human would have made more mistakes, or at the very least it does seem close).
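For anyone who wants to reproduce this kind of spot check, here is a rough sketch using the open-source `whisper` package; the file name is a placeholder, and exact output will vary by model checkpoint and audio quality.

```python
# Rough sketch: transcribe an audio clip with OpenAI's open-source Whisper model
# and eyeball the output against a careful human transcript.
# Requires `pip install openai-whisper` and ffmpeg; "talk.mp3" is a placeholder.
import whisper

model = whisper.load_model("large")      # the "large" checkpoint discussed above
result = model.transcribe("talk.mp3")    # dict with "text" and timestamped "segments"
print(result["text"])                    # compare against a human transcription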

Thanks for asking. One reason we decided to start with forecasting was that we think it has comparatively low risks compared to other fields like AI or biotech.

If this goes well and we move on to a more generic round, we'll include our thoughts on this, which will probably include a commitment not to oracular-fund projects that seem like they were risky when proposed, and maybe to ban some extremely risky projects from the market entirely. I realize we didn't explicitly say that here, which is because this is a simplified test round and we think t... (read more)

In 2018, I collected data about several types of sexual harassment on the SSC survey, which I will report here to help inform the discussion. I'm going to simplify by assuming that only cis women are victims and only cis men are perpetrators, even though that's bad and wrong.

Women who identified as EA were less likely to report lifetime sexual harassment at work than other women, 18% vs. 20%. They were also less likely to report being sexually harassed outside of work, 57% vs. 61%.

Men who identified as EA were less likely to admit to sexually harassing pe... (read more)

Conditional on being a woman in California, being EA did make someone more likely to experience sexual harassment, consistently, as measured in many different ways. But Californian EAs were also younger, much more bisexual, and much more polyamorous than Californian non-EAs; adjusting for sexuality and polyamory didn't remove the gap, but age was harder to adjust for and I didn't try.  EAs who said they were working at charitable jobs that they explicitly calculated were effective had lower harassment rates than the average person, but those working a

... (read more)

Minor object-level objection: you say we should expect crypto exchanges like FTX to fail, but I tried to calculate the risk of this in the second part of my post, and the average FTX-sized exchange fails only very rarely.

I don't think this is our main point of disagreement though. My main point of disagreement is about how actionable this is and what real effects it can have.

I think that the main way EA is "affiliated with" crypto  is that it has accepted successful crypto investors' money. Of people who have donated the most to EA, I thin... (read more)

3
Jason
1y
I see more possible action points from the original post, mostly having to do with brand risk management and risk containment. I think it's hard to deny that the crypto sector has a much higher-than-average risk of causing reputational damage, and that EA just got a concussion from crypto. So it is at particular risk of something akin to second concussion syndrome (SCS) if there's another crypto concussion before this one heals. So my additional possible action points based on the original post:

1. There are advantages to splitting one's social movement into different brands. Corporations know this -- Marriott owns Ritz Carlton and a bunch of other hotel brands, but it doesn't splash the Ritz Carlton brand over all its stuff. If you're right that crypto is a large portion of available funding, maybe people should be steered toward the "Crypto for Human Flourishing" movement with its own meta organizations and public figures, which could use EA organizations as vendors (e.g., commissioning research from RP, GiveWell, etc.). There needs to be a brand that non-crypto billionaires have a good opinion of and are proud to be associated with -- and having the big-money crypto folks at the party probably isn't conducive to that (at least in the medium run).
2. Likewise, maybe EA meta orgs should avoid taking crypto money altogether. Instead, solid donors like OP could funge the crypto donor's non-donation to meta causes and create the correct funding balance. For various reasons, I think EA meta taking bad crypto $$$ has a much higher reputational risk than (e.g.) AMF or GiveDirectly doing so. To outsiders, meta can come across as smug, self-indulgent, arrogant, and morally superior -- I am not endorsing those views, but it's just a lot harder for other people to stay mad at an organization that just delivers bednets or cash transfers to people in Africa.
3. I have generally been skeptical of donor vetting/investigation, but one could argue that the increased risk of the
0
titotal
1y
Might I introduce you to the post announcing an EA Funds spinoff website called "effectivecrypto.org", where the authors openly state: The website has since been deleted, but if you look at the archived version, you can see a big splash banner of SBF himself endorsing the project. In the "learn more" page, you then see another big post tying crypto to EA: They then uncritically repeat speculation that bitcoin will hit $200k (whoops). At no point in the website, or in the announcement post, did anybody mention that crypto was in the middle of a massive speculative bubble due to crash any minute. (Edit:) There's also a direct quote from the Q&A section: So I think that settles that point. EA and SBF were undeniably affiliated. The ties between CEA leadership and SBF have been documented extensively, and used to beat the movement over the head with in pretty much every major newspaper of note. I mean, just look at the FTX Foundation website, with the effective altruism logo prominently featured as "grantee and partners".

I mean, I guess my main point of note is: don't do it again? I'm sure there will be a period of caution for the next while, but if another crypto bubble forms, I expect there will be a temptation to exploit the speculative mania by affiliating with another crypto firm, with predictable results.

The other main point is: stop pretending crypto isn't in a speculative bubble. It is, and will remain so until an actual use case exists that generates revenue commensurate with the ridiculous market cap (or until the field collapses enough to be sustainable by the few niche applications that do exist).

If EA sacrifices its reputation to chase after unregulated crypto money, there's a very good chance it will be left with nothing if, as I suspect, crypto eventually fizzles out completely and ends up just a niche hobby thing. Whereas a reputation for responsibility and caution will actually pay off long term. And even if they do find the mysterious cry

Thanks for your thoughtful response.

I'm trying to figure out how much of a response to give, and how to balance saying what I believe vs. avoiding any chance to make people feel unwelcome, or inflicting an unpleasant politicized debate on people who don't want to read it. This comment is a bad compromise between all these things and I apologize for it, but:

I think the Kathy situation is typical of how effective altruists respond to these issues and what their failure modes are. I think "everyone knows" (in Zvi's sense of the term, where it's such strong co... (read more)

Thanks, I realize this is a tricky thing to talk about publicly (certainly trickier for you, as someone whose name people actually know, than for me, who can say whatever I want!). I'm coming in with a stronger prior from "the outside world", where I've seen multiple friends ignored/disbelieved/attacked for telling their stories of sexual violence, so maybe I need to better calibrate for intra-EA-community response. I agree/hope that our goals shouldn't be at odds, and that's what I was trying to say that maybe did not come across: I didn't want people to ... (read more)

Predictably, I disagree with this in the strongest possible terms.

If someone says false and horrible things to destroy other people's reputation, the story is "someone said false and horrible things to destroy other people's reputation". Not "in some other situation this could have been true". It might be true! But discussion around the false rumors isn't the time to talk about that.

Suppose the shoe was on the other foot, and some man (Bob) spread some kind of false and horrible rumor about a woman (Alice). Maybe he says that she only got a good position in... (read more)

ethai
1y
29
20
8

Thank you, this is clarifying for me and I hope for others.

Responses to me, including yours, have helped me update my thinking on how the EA community handles gendered violence. I wasn't aware of these cases and am glad to know of them now, and hope that other women seeing this might also feel more supported within EA knowing this. I realize there are obvious reasons why these things aren't very public, but I hope that somehow we can make it clearer to women that Kathy's case, and the community's response, was an outlier.

I would still push back against the gender-reversal fal... (read more)

EDIT: After some time to cool down, I've removed that sentence from the comment, and somewhat edited this comment which was originally defending it. 

I do think the sentence was true. By that I mean that (this is just a guess, not something I know from specifically asking them) the main reason other people were unwilling to post the information they had was that they were worried that someone would write a public essay saying "X doesn't believe sexual assault victims" or "EA has a culture of doubting sexual assault victims". And they all hoped some... (read more)

Hi Scott,

Thank you for both of your comments. I appreciate you explaining why you wrote a post about Kathy, and I think it's useful context for people to understand as they are thinking about these issues. My intention was not to call anybody out, but rather to point to a pattern of behavior that I observed and describe how it made me (and could make others) feel.

[anonymous]
1y
27
5
1

Thanks for removing the sentence.

I'm sorry you've gotten flak. I don't think you deserve it. I think you did the right thing, and the silence of other people "in the know" doesn't reflect particularly well on them. (Not in the sense that we should call them out, but in the sense that they should maybe think about whether they knowingly let a likely-innocent person suffer unjust reputation harm.)

I think there's a culture of fear around these kinds of issues that it's useful to bring to the foreground if we want to model them correctly.

Agreed. I thin... (read more)

I read about Kathy Forth, a woman who was heavily involved in the Effective Altruism and Rationalist communities. She committed suicide in 2018, attributing large portions of her suffering to her experiences of sexual harassment and sexual assault in these communities. She accuses several people of harassment, at least one of whom is an incredibly prominent figure in the EA community. It is unclear to me what, if any, actions were taken in response to (some) of her claims and her suicide. What is clear is the pages and pages of tumblr posts and Reddit thre

... (read more)

For the record, I knew Kathy for several years, initially through a local Less Wrong community, and considered her a friend for some time. I endorse Scott's assessment, but I'll emphasise that I think she believed the accusations she made.

Relevant to this post: Many people tried to help Kathy, from 3 groups that I'm aware of. People gave a lot of time and energy. Speaking for myself and what I observed in our local community, I believe we prioritised helping her over protecting our community and over our own wellbeing.

In the end things went poorly on all t... (read more)

Ruby
1y
45
7
0

I came to the comments here to also comment quickly on Kathy Forth's unfortunate death and her allegations. I knew her personally (she sublet in my apartment in Australia for 7 months in 2014, but more meaningfully in terms of knowing her, we also overlapped at Melbourne meetups many times and knew many mutual people). Like Scott, I believe she was not making true accusations (though I think she genuinely thought they were true).

I would have said more, but will follow Scott's lead in not sharing more details. Feel free to DM me.

(some of) Kathy's accusations were false

just to draw some attention to the "(some of)", Kathy claimed in her suicide note that her actions had led to one person being banned from EA events. My understanding is that she made a mixture of accusations that were corroborated and ones that weren't, including the ones you refer to. I think this is interesting because it means both:

  • Kathy was not just a liar who made everything up to cause trouble. I would guess she really was hurt, and directed responsibility for that hurt to a mixture of the right and wrong
... (read more)
7
Elika
1y
Responding to the attention on Kathy's specific case (I'm aware I'm adding more to it) - I think we're detracting from the key argument that the EA community as a whole is neglecting to validate and support community members who experience bad things in the community. In this post, it's primarily women and sexual assault. But there are other posts (1, 2) exemplifying ways the EA community itself can and should prioritise internal community health. To argue the truth of one specific example might be detracting from recognising that this might be a systemic problem.
-6
Guy Raveh
1y
4
ethai
1y
Edit: after discussion below & other comments on this post, I feel less strongly about the claim "EA community is bad at addressing harm", but stand by / am clarifying my general point, which is that the veracity of Kathy's claims doesn't detract from any of the other valid points that Maya makes, and I don't think people should discount the rest of these points.

A suggestion to people who are approaching this from a "was Kathy lying?" lens: I think it's also important to understand this post in the context of the broader movement around sexual assault and violence. The reason this kind of thing stings to a woman in the community is because it says "this is how this community will react if you speak up about harm; this is not a welcoming place for you if you are a survivor." It's not about whether Kathy, in particular, was falsely accusing others. The way I read Maya's critique here is "there were major accusations of major harm done, and we collectively brushed it off instead of engaging with how this person felt harmed," which is distinct from "she was right and the perpetrator should be punished". This is a call for the EA community to be more transparent and fair in how it deals with accusations of wrongdoing, not a callout post of anybody.

Perhaps I would feel differently if I knew of examples of the EA community publicly holding men accountable for harm to women, but as it stands AFAIK we have a lot of examples like those Maya pointed out and not much transparent accountability for them. :/ Would be very happy to be corrected about that.

(Maya, I know it's probably really hard to see that the first reply on your post is an example of exactly the problem you're describing, so I just want to add, in case you see this, that I relate to a lot of what you've shared and you have an open offer to DM me if you need someone to hold space for your anger!)

Regardless of the accuracy of this comment, it makes me sad that the top comment on this post is adversarial/argumentative and showing little emotional understanding/empathy (particularly the line "getting called out in posts like this one"). I think it unfortunately demonstrates well the point the author made about EA having an emotions problem:

On the forum in particular and in EA discourse in general, there is a tendency to give less weight/be more critical of posts that are more emotion-heavy and less rational. This tendency makes sense based on EA prin

... (read more)
[anonymous]
1y
131
68
13

I'm glad you made your post about how Kathy's accusations were false.  I believe that was the right thing to do -- certainly given the information you had available.

But I wish you had left this sentence out, or written it more carefully:

But they wouldn't do that, I'm guessing because they were all terrified of getting called out in posts like this one.

It was obvious to me reading this post that the author made a really serious effort to stay constructive. (Thanks for that, Maya!)  It seems to me that we should recognize that, and you're erasi... (read more)

While this is important (clarifying misinformation), I want to mention that I don't think it takes away from the main message of the post. I think it's important to remember that even with a culture of rationality, there are times when we won't have enough information to say what happened (unlike in Scott's case), and for that reason Maya's post is very relevant and I am glad it was shared.

It also doesn't seem appropriate to describe this post as "calling out". While it's legitimate to fear reputations being damaged with unsubstantiated claims, this post doesn't strike me as doing that.

Arepo
1y
80
20
1

I want to strong agree with this post, but a forum glitch is preventing me from doing so, so mentally add +x agreement karma to the tally. [Edit: fixed and upvoted now]

I have also heard from at least one very credible source that at least one of Kathy's accusations had been professionally investigated and found to be without any merit.

Maybe also worth adding that the way she wrote the post would, in a healthy person, have been intentionally misleading, and was at least incredibly careless given the strength of the accusation. Eg there was some line to the effect of 'CFAR are i... (read more)

I think I wrote that piece in 2010 (based on the timestamp on the version I have saved, though I'm not 100% sure that's the earliest draft). I would have been 25-26 then. I agree that's the first EA-relevant thing I wrote.

2
Zach Stein-Perlman
2y
See https://web.archive.org/web/20131230140344/http://squid314.livejournal.com/243765.html (Also I think the webpages you link to are from no later than 2008, and clustered up to November 2008.)

For what it's worth, I still don't feel like I understand CEA's model of how having extra people present hurts the most prestigious attendees.

If you are (say) a plant-based meat expert, you are already surrounded by many AI researchers, epidemiologists, developmental economists, biosecurity analysts, community-builders, PR people, journalists, anti-factory-farm activists, et cetera. You are probably going to have to plan your conversations pretty deliberately to stick to people within your same field, or who you are likely to have interesting things to sa... (read more)

I also don't get this. I can't help thinking about the Inner Ring essay by C.S. Lewis. I hope that's not what's happening.

Thanks for your response. I agree that the goal should be trying to hold the conference in a way that's best for the world and for EA's goals. If I were to frame my argument more formally, it would be something like - suppose that you reject 1000 people per year (I have no idea if this is close to the right number). 5% get either angry or discouraged and drop out of EA. Another 5% leave EA on their own for unrelated reasons, but would have stayed if they had gone to the conference because of some good experience they had there. So my totally made up Fermi ... (read more)
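A minimal sketch of the kind of Fermi arithmetic being gestured at here, using only the made-up numbers already in the comment (1,000 rejections per year, 5% who drop out from discouragement, 5% who would otherwise leave but would have stayed after a good conference experience):

```python
# Illustrative Fermi sketch using the made-up numbers from the comment above.
rejected_per_year   = 1000
p_drop_out_angry    = 0.05   # rejected, get discouraged, and drop out of EA
p_would_have_stayed = 0.05   # leave anyway, but would have stayed if admitted

lost_to_discouragement = rejected_per_year * p_drop_out_angry        # 50 people
lost_counterfactually  = rejected_per_year * p_would_have_stayed     # 50 people
print(f"~{lost_to_discouragement + lost_counterfactually:.0f} engaged people lost per year")
# The open question is whether that cost outweighs the benefit of a smaller,
# more curated conference for the attendees who do get in.
```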

Hi Scott — it’s hard to talk about these things publicly, but yes a big concern of opening up the conference is that attendees’ time won’t end up spent on the most valuable conversations they could be having. I also worry that a two-tiered app system would cause more tension and hurt feelings than it would prevent. A lot of conversations aren’t scheduled through the app but happen serendipitously throughout the event. (Of the things you mentioned, I’m not particularly worried about attendees disrupting talks.)

We’ve thought a fair bit about the “how costly ... (read more)

I'm having trouble figuring out how to respond to this. I understand that it's kind of an academic exercise to see how cause prioritization  might work out if you got very very rough numbers and took utilitarianism very seriously without allowing any emotional considerations to creep in. But I feel like that potentially makes it irrelevant to any possible question.

If we're talking about how normal people should prioritize...well, the only near-term cause close to x-risk here is animal welfare. If you tell a normal person "You can either work to preven... (read more)

3
Benjamin_Todd
2y
I think once you take account of diminishing returns and the non-robustness of the x-risk estimates, there's a good chance you'd end up estimating that the cost per present life saved is lower for GiveWell than for donating to x-risk. So the claim 'neartermists should donate to x-risk' seems likely wrong. I agree with Carl that the US govt should spend more on x-risk, even just to protect their own citizens.

I think the typical person is not a neartermist, so might well end up thinking x-risk is more cost-effective than GiveWell if they thought it through. Though it would depend a lot on which considerations you include or not.

From a pure messaging pov, I agree we should default to opening with "there might be an x-risk soon" rather than "there might be trillions of future generations", since it's the most important message and is more likely to be well-received. I see that as the strategy of the Precipice, or of pieces directly pitching AI x-risk. But I think it's also important to promote longtermism independently, and/or mention it as an additional reason to prioritise x-risk a few steps after opening with it.
8
NunoSempere
2y
The question is, I think, "how should FTX/SBF spend its billions?"
8
elifland
2y
I see what you mean, and I think I didn't do a good job of specifying this in the post; my impression is that one question your post and the other posts I'm responding to are trying to answer is "How should we pitch x-risks to people who we want to {contribute to them via work, donations, policy, etc.}?" So my post was (primarily) attempting to contribute to answering that question.

In your post, my understanding of part of your argument was: thoughtful short-termism usually leads to the same conclusion as longtermism, so when pitching x-risks we can just focus on the bad short-term effects without getting into debates about whether future people matter and how much, etc. My argument is that it's very unclear if this claim is true[1], so making this pitch feels intellectually dishonest to some extent. It feels important to have people who we want to do direct work on x-risks working on it for coherent reasons, so intellectual honesty feels very important when pitching there; I'm less sure about donors and even less sure about policymakers, but in general trying to be as intellectually honest as possible while maintaining similar first-order effectiveness feels good to me. It feels less intellectually dishonest if we're clear that a substantial portion of the reason we care about x-risks so much is that extinction is extra bad, as you mentioned here but wasn't in the original post:

----------------------------------------

A few reactions to the other parts of your comment: I agree, but it feels like the target audience matters here; in particular, as I mentioned above, I think the type of person I'd want to successfully pitch to directly work on x-risk should care about the philosophical arguments to a substantial extent. Agree, I'm not arguing to change the behavior/prioritization of leaders/big funders of the EA movement (who I think are fairly bought into longtermism with some worldview diversification but are constrained by good funding opportunities). I agree wit

Yes, I'm sorry, I talked to Claire about it and updated, sorry for the mixed messages and any stress this caused.

7
Devin Kalish
2y
For what it's worth, I also think approving it being posted was the right decision.

Thanks for the link, which I had previously missed and which does contain some important considerations.

I've been assuming that the people who set up the first impact market will have the opportunity to affect the "culture" around certificates, especially since many people will be learning what they are for the first time after the market starts to exist, but I agree that eventually it will depend on what buyers and sellers naturally converge to.

One way that preference could be satisfied is to give each share a number. Funders will value the first shares m

... (read more)
2
RyanCarey
2y
Agree on affecting culture. Sorry, I could have been clearer. Suppose we want to choose (B) on (2), and I ask for retro funding for a painting I drew. Let's consider three cases: a) it had $500 of impact, and I was 0% likely to make the painting anyway; b) it had $1k of impact, 50% likely anyway; c) $2k of impact, 75% likely anyway. The "obvious" solution is that in each case, I can sell some certs, say 100 for $5 ea. Alternatively, we could say that in case a) each cert 1-100 is worth $5; in b) certs 1-50 are worth $10 ea, 51-100 $0; in c) certs 1-25 are worth $20 ea, 26-100 worth $0. The second solution lets the world know how useful the painting/project was, though at the cost of some complexity. The bottom line, I think, is that the contract should be clear about which thing it purports to be doing.
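A small sketch of the arithmetic behind the two conventions, under my reading of the comment (not a spec): each project issues 100 certificates; under the second convention the per-certificate price encodes total impact and the fraction of certificates actually paid encodes counterfactual likelihood, so the retro funder's total outlay is impact × (1 − probability it would have happened anyway) in every case.

```python
# Sketch of the two pricing conventions described above (my reading, not a spec).
CERTS = 100

cases = [
    {"impact": 500,  "p_anyway": 0.00},
    {"impact": 1000, "p_anyway": 0.50},
    {"impact": 2000, "p_anyway": 0.75},
]

for c in cases:
    counterfactual_value = c["impact"] * (1 - c["p_anyway"])   # $500 in every case
    # Convention 1: spread the counterfactual value evenly over all 100 certs.
    flat_price = counterfactual_value / CERTS
    # Convention 2: price every cert at impact/100, but pay only the first
    # (1 - p_anyway) fraction of them, revealing both quantities separately.
    per_cert = c["impact"] / CERTS
    paid_certs = int(CERTS * (1 - c["p_anyway"]))
    print(f"impact ${c['impact']}, {c['p_anyway']:.0%} likely anyway: "
          f"flat ${flat_price:.0f}/cert vs ${per_cert:.0f} for the first {paid_certs} certs")
```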

Thanks, I had read that but failed to internalize how much it was saying this same thing. Sorry to Neel for accidentally plagiarizing him.

I didn't mean to imply that you were plagiarising Neel. I more wanted to point out that many reasonable people (see also Carl Shulman's podcast) are pointing out that the existential risk argument can go through without the longtermism argument.

I posted the graphic below on twitter back in Nov. These three communities & sets of ideas overlap a lot and I think reinforce one another, but they are intellectually & practically separable, and there are people in each section doing great work. Just because someone is in one section doesn't mea... (read more)

No worries, I'm excited to see more people saying this! (Though I did have some eerie deja vu when reading your post initially...)

I'd be curious if you have any easy-to-articulate feedback re why my post didn't feel like it was saying the same thing, or how to edit it to be better? 

(EDIT: I guess the easiest object-level fix is to edit in a link at the top to yours, and say that I consider you to be making substantially the same point...)

Thank you for doing this. I know many people debating this question, with some taking actions based on their conclusions, and this looks like a really good analysis.

3
kokotajlod
2y
FWIW I'd wildly guess this analysis underestimates by 1-2 orders of magnitude, see this comment thread. ETA: People (e.g. Misha) have convinced me that 2 is way too high, but I still think 1 is reasonable. ETA: This forecaster has a much more in-depth analysis that says it's 1 OOM.

I see the site lists "our bloggers", including Aria Babu, Sam Enright, Stian Weslake, etc. Are these people who are on your team (and not competing for the prize), or are these people who have already entered the competition?

The first two issues are the whole point of laundering your opinions through bloggers. 

I don't mean the bloggers should post the documents publicly, or even a play-by-play of the documents ("First Will MacAskill said, then Peter Singer said..."). I mean the bloggers should read the documents, understand the arguments, and post the key points/conclusions, perhaps with a "thanks to some anonymous people who helped me develop these ideas".

I agree the last issue is important, but this could be solved by good channels of communication and explanation about what should/shouldn't be posted.

EA is producing a ton of thoughtful writing, but the majority takes place in internal discussions and private documents. For some discussions, this would be the only sensible way to have them. But having other discussions in public should help to raise the salience of EA in the broader discourse and bring more people in. It could also help spark new ideas. 

 

Any thoughts about making some of this discussion available to bloggers so they can popularize it? Asking bloggers unconnected to the EA network to reinvent or equal the level of discourse that the top people have among themselves sounds much harder than figuring out a way to get the originals to the public.

6
ben.smith
2y
Perhaps an enterprising blogger could start an interview-format blog, where they interview EA authors of those "internal discussions and private documents" and ask them to elucidate their ideas in a way suitable for a general audience. I think that would make for a pretty neat and high-value blog!
[anonymous]
2y
22
0
0

I agree that this would be very valuable. I work at an EA org and even I miss out on a lot of discussions that happen between top people,  on googledocs or over lunch. Things must be much worse for people not at EA orgs. 

It would be useful if some of the top people could share why they prefer not to make these discussions public. I would guess that one reason is that people don't want arguments which they haven't backed up in formal ways to be classed as "the official view of EA leaders". Creating a forum for posts with shakier epistemic status seems valuable.

The coronavirus Fast Grants were great, but their competitive advantage seems to have been that they were the first (and fastest) people to move in a crisis.

The overall Emergent Ventures idea is interesting and worth exploring (I say, while running a copy of it), but has it had proven cost-effective impact yet? I haven't been following the people involved but I don't remember MR formally following up.

2
AppliedDivinityStudies
2y
There's a list of winners here, but I'm not sure how you would judge counterfactual impact. With a lot of these, it's difficult to demonstrate that the grantee would have been unable to do their work without the grant. At the very least, I think Alexey was fairly poor when he received the grant and would have had to get a day job otherwise.

Thank you for writing this. I've seen a lot of people get confused around this, and it's genuinely pretty confusing, and it's good to have a really good summary all in one place by someone who knows what's going on.

Thanks for asking!

On some of your graphs, eg https://ourworldindata.org/grapher/gdp-per-capita-maddison-2020, you have a box you can tick to get "relative change". On other graphs, eg https://ourworldindata.org/grapher/children-per-woman-un?tab=chart&time=1950..2015&country=OWID_WRL~HUN, you don't have that box. You can force the chart to do this by adding "?stackMode=relative" to the URL, but that is annoying and hard to remember. Please add the box to all graphs.

If you generate a graph like https://ourworldindata.org/grapher/children-per-woman-un... (read more)

Thanks for doing this. It's really interesting to see someone try to quantify the effects of activism. A few questions:

1. Can you further explain your estimate of a 0.5% - 10% higher chance of a bill passing because of climate activism?

2. Does that number claim that the Sunrise Movement in particular increased the chance that much, or that all activism (compared to some world with no active pro-climate grassroots movement) increased it that much? If the latter, is this being divided by the Sunrise Movement's budget, or by something else? Is the claim that ... (read more)

3
Dan Stein
3y
Hi Scott, thanks for your questions! Good questions, let me try some responses.

1. This is clearly the most difficult parameter to measure. We thought 0.5-10% represented a reasonable yet conservative range of potential values. I'd say "conventional wisdom" (ie what quite a number of the people we've spoken with have argued, but certainly not everyone agrees) is that you can draw a pretty straight line between the recent work of policy-focused climate activism groups like Sunrise and the subsequent placement of climate as a high priority for the Biden administration. All of that being said, we agree that it's relatively arbitrary. As we move forward to look at specific organizations, we'll try to do a bit more work to get more reasonable values for these parameters. But it's not clear how much better (if at all) we'll be able to do.
2. We were trying to make this CEA a bit more general, but it was clearly calibrated on Sunrise. I think you should interpret the model as representing a single, effective activism organization with a specific budget. So if we were trying to use the model to hone in on Sunrise's impact, we'd say that for Sunrise's budget of ~25 million over 5 years, we think this caused an X% increase in the chance of a broad, progressive bill being passed. You could also extrapolate this to the movement in general, but then you'd use a larger budget and a larger % change.
3. We aren't making that claim, but if you wanted to extrapolate this model to a future marginal increase in spending, then yes, you'd need this assumption. I'd agree this is dodgy, but I'm also not sure of the best way to extrapolate a total effect to a marginal effect. Probably we would just have to do some kind of discounting to account for diminishing marginal effects. This is something we should definitely think about, so thanks for bringing it up.
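To make the budget and probability parameters concrete, here is a rough cost-per-probability-point sketch using only the numbers mentioned in this thread (~$25M over five years, and a 0.5-10% increase in the chance a broad climate bill passes). Turning this into a cost-effectiveness estimate would of course also require an estimate of the bill's expected benefit, which isn't attempted here.

```python
# Rough sketch of the implied cost per percentage point of bill-passage probability,
# using only the budget and probability range mentioned in the thread above.
budget = 25_000_000              # ~Sunrise's budget over 5 years, per the comment
for delta_p in (0.005, 0.10):    # 0.5% to 10% increase in P(bill passes)
    cost_per_pp = budget / (delta_p * 100)
    print(f"{delta_p:.1%} boost -> ${cost_per_pp:,.0f} per percentage point")
# 0.5% -> $50M per point; 10% -> $2.5M per point. Multiplying delta_p by an
# estimate of the bill's expected benefit would give a cost-effectiveness figure.
```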

Whether or not you go this route, you might also want to talk to Lama, a rationalist-adjacent Saudi student who was in the last Emergent Ventures cohort. She might be able to give you some advice and connect you to any existing Saudi community. You can find her at https://lamaalrajih.com

4
Bojack
4y
Thanks, I've sent her an email. (UPDATE: She does not know any EAs in Saudi.)

Thanks for the link to the Enthea paper, I'll check it out.

Vox is looking for EA journalists. This is an opportunity to publicize EA and help shape its public perception. Their ad hints that they want people who are already in the movement, so take a look if you have any writing- or journalism-related skills.

1
Ro-bot-tens
6y
I think this is so huge. I was going to post it but saw you got to it first.

I support the change. I mean, I would, as someone who's taken advantage of the ambiguity in the current pledge to donate to x-risk-related causes, but I think even independent of that I support the change.

The GWWC pledge is a good institution. It provides a unified community norm of "at least ten percent" and helps keep people honest. It's a piece of "social technology" that makes effective altruism easier.

As such, if GWWC restricted it to the developing world, I would expect and encourage the animal rights movement and the x-risk movem... (read more)

Does anyone have any thoughts on whether the Ebola outbreak is a unique effective giving opportunity compared to better-studied issues like malaria and schistosomiasis? I tried to do a Fermi estimate here but I don't trust it further than I can throw it.

2
Owen Cotton-Barratt
10y
I think that the Fermi estimate is a good start, but I am more suspicious that it might be a substantial underestimate of the cost-effectiveness. The strongest case for the Ebola outbreak being an outstanding giving opportunity seems to be that the outbreak might grow a long way, and intervening now could be an easy way to help containment and give outside agencies enough time to get a proper plan into action.

Perhaps there's a story where we're en route to discovering a cure or vaccine, but we'll hit 1 million deaths (say) before this happens, and the outbreak will still be on its exponential growth curve at that stage. Then it might be pretty cheap to slow the whole thing down by 1% today, but that could translate to 10,000 fewer deaths at the point where the cure or vaccine comes in.

I don't really think that story has high likelihood, particularly as perturbations can lessen its impact -- if Ebola spreads slower today, perhaps that will also slow down the efforts of people who would eventually deal with it; or if it's no longer in the exponential growth phase when we get a solution, then the effect will be much smaller. But this looks like a scenario where the tail benefit could dominate, so I'd want to take the possibility seriously if looking at this. If this is correct, then early interventions could be quite a bit better (in expectation) than later ones. My best guess right now is that it's still not good enough to target, though.
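A toy sketch of the "tail benefit" reasoning above, under one reading of "slow the whole thing down by 1%": if the outbreak is still growing exponentially when a cure arrives, a 1% reduction in today's caseload carries through to roughly a 1% reduction in cumulative deaths at that point, so 1% of a hypothetical one million deaths is ~10,000 deaths averted. The growth rate, timeline, and current toll below are made up purely so the baseline lands near one million.

```python
# Toy model: deaths grow exponentially until a cure arrives; a 1% reduction in
# today's toll propagates to ~1% fewer cumulative deaths at cure time.
import math

r = 0.03                  # made-up daily growth rate
days_until_cure = 230     # made-up; chosen so the baseline lands near 1 million
baseline_now = 1_000      # made-up current death toll

deaths_at_cure = baseline_now * math.exp(r * days_until_cure)
deaths_if_slowed = 0.99 * baseline_now * math.exp(r * days_until_cure)
print(f"baseline: ~{deaths_at_cure:,.0f} deaths; averted by a 1% reduction today: "
      f"~{deaths_at_cure - deaths_if_slowed:,.0f}")
```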