All of AGB's Comments + Replies

EA Debate Championship & Lecture Series

I even explicitly said I am less familiar with BP as a debate format.

The fact that you are unfamiliar with the format, and yet are making a number of claims about it, is pretty much exactly my issue. Lack of familiarity is an anti-excuse for overconfidence.

The OP is about an event conducted in BP. Any future events will presumably also be conducted in BP. Information about other formats is only relevant to the extent that it provides information about BP.

I can understand not realising how large the differences between formats are initially, and so a... (read more)

Finally, even after a re-read and showing your comment to two other people seeking alternative interpretations, I think you did say the thing you claim not to have said. Perhaps you meant to say something else, in which case I'd suggest editing to say whatever you meant to say. I would suggest an edit myself, but in this case I don't know what it was you meant to say.

I've edited the relevant section. The edit was simply "This is also pretty common in other debate formats (though I don't know how common in BP in particular)".

By contrast, criticisms I think

... (read more)
EA Debate Championship & Lecture Series

You did give some responses elsewhere, so a few thoughts on your responses:

But this is really far from the only way policy debate is broken. Indeed, a large fraction of policy debates end up not debating the topic at all, but end up being full of people debating the institution of debating in various ways, and making various arguments for why they should be declared the winner for instrumental reasons. This is also pretty common in other debate formats.

(Emphasis added). This seems like a classic case for 'what do you think you know, and how do you think yo... (read more)

[2] Habryka, 5mo: To be clear, I think very little of my personal experience played a role in my position on this. Or at least very unlikely in the way you seem to suggest. A good chunk of my thoughts on this were formed talking to Buck Shlegeris and Evan Hubinger at some point and also a number of other discussions about debating with a bunch of EAs and rationalists. I was actually pretty in favor of debate ~4-5 years ago when I remember first discussing this with people, but changed my mind after a bunch of people gave their perspectives and experiences and I thought more through the broader problem of how to fix it. I also want to clarify the following: I didn't say that. I said "This is also pretty common in other debate formats". I even explicitly said I am less familiar with BP as a debate format. It seems pretty plausible to me that BP has less of the problem of meta-debate. But I do think evidence of problems like meta-debate in other formats is evidence of BP also having problems, even if I am specifically less familiar with BP.
Getting a feel for changes of karma and controversy in the EA Forum over time

Thanks for this, pretty interesting analysis.

Every time I come across an old post in the EA forum I wonder if the karma score is low because people did not get any value from it or if people really liked it and it only got a lower score because fewer people were around to upvote it at that time.

The other thing going on here is that the karma system got an overhaul when forum 2.0 launched in late 2018, giving some users 2x voting power and also introducing strong upvotes. Before that, one vote was one karma. I don't remember exactly when the new system came... (read more)

EA Debate Championship & Lecture Series

I think these concerns are all pretty reasonable, but also strongly discordant with my personal experience, so I figured it would help third parties if I explained the key insights/skills I think I learned or were strongly reinforced by my debating experience. 

Three notable caveats on that experience:

  • I spent more time judging debates than I did speaking in them, which is moderately unusual. It's plausible to me that judging was much more useful.
  • It was 8-12 years ago, and my independent impression is that the top levels of the sport have degenerated so
... (read more)

I don't have time to respond in super much depth because of a bunch of competing commitments but I want to say that all of these are good points and I appreciate you making them.

How much does performance differ between people?

So taking a step back for a second, I think the primary point of collaborative written or spoken communication is to take the picture or conceptual map in my head and put it in your head, as accurately as possible. Use of any terms should, in my view, be assessed against whether those terms are likely to create the right picture in a reader's or listener's head. I appreciate this is a somewhat extreme position.

If every time you use the term heavy-tailed (and it's used a lot - a quick CTRL + F tells me it's in the OP 25 times) I have to guess from context wh... (read more)

[4] Max_Daniel, 5mo: I'm not sure how extreme your general take on communication is, and I think at least I have a fairly similar view. I agree that the kind of practical experiences you mention can be a good reason to be more careful with the use of some mathematical concepts but not others. I think I've seen fewer instances of people making fallacious inferences based on something being log-normal, but if I had I think I might have arrived at similar aspirations as you regarding how to frame things. (An invalid type of argument I have seen frequently is actually the "things multiply, so we get a log-normal" part. But as you have pointed out in your top-level comment, if we multiply a small number of thin-tailed and low-variance factors we'll get something that's not exactly a 'paradigmatic example' of a log-normal distribution even though we could reasonably approximate it with one. On the other hand, if the conditions of the 'multiplicative CLT' aren't fulfilled we can easily get something with heavier tails than a log-normal. See also fn26 in our doc.)
How much does performance differ between people?

Briefly on this, I think my issue becomes clearer if you look at the full section.

If we agree that log-normal is more likely than normal, and log-normal distributions are heavy-tailed, then saying 'By contrast, [performance in these jobs] is thin-tailed' is just incorrect? Assuming you meant the mathematical senses of heavy-tailed and thin-tailed here, which I guess I'm not sure if you did.

This uncertainty and resulting inability to assess whether this section is true or false obviously loops back to why I would prefer not to use the term 'heavy-tailed' at... (read more)

[2] Max_Daniel, 5mo: I think the main takeaway here is that you find that section confusing, and that's not something one can "argue away", and does point to room for improvement in my writing. :) With that being said, note that we in fact don't say anywhere that anything 'is thin-tailed'. We just say that some paper 'reports' a thin-tailed distribution, which seems uncontroversially true. (OTOH I can totally see that the "by contrast" is confusing on some readings. And I also agree that it basically doesn't matter what we say literally - if people read what we say as claiming that something is thin-tailed, then that's a problem.) FWIW, from my perspective the key observations (which I apparently failed to convey in a clear way at least for you) here are:

  • The top 1% share of ex-post "performance" [though see elsewhere that maybe that's not the ideal term] data reported in the literature varies a lot, at least between 3% and 80%. So usually you'll want to know roughly where on the spectrum you are for the job/task/situation relevant to you rather than just whether or not some binary property holds.
  • The range of top 1% shares is almost as large for data for which the sources used a mathematically 'heavy-tailed' type of distribution as model. In particular, there are some cases where some source reports a mathematically 'heavy-tailed' distribution but where the top 1% share is barely larger than for other data based on a mathematically 'thin-tailed' distribution.
  • (As discussed elsewhere, it's of course mathematically possible to have a mathematically 'thin-tailed' distribution with a larger top 1% share than a mathematically 'heavy-tailed' distribution. But the above observation is about what we in fact find in the literature rather than about what's mathematically possible. I think the key point here is not so much that we haven't found a 'thin-tailed' distribution with larger top 1% share than some 'heavy-taile
How much does performance differ between people?

Hi Max and Ben, a few related thoughts below. Many of these are mentioned in various places in the doc, so seem to have been understood, but nonetheless have implications for your summary and qualitative commentary, which I sometimes think misses the mark. 

  • Many distributions are heavy-tailed mathematically, but not in the common use of that term, which I think is closer to 'how concentrated is the thing into the top 0.1%/1%/etc.', and thus 'how important is it I find top performers' or 'how important is it to attract the top performers'. For exam
... (read more)
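
One way to make the "mathematically heavy-tailed but not very concentrated" point concrete is to compute the top 1% share of a log-normal directly. The sketch below is illustrative only: the function name `lognormal_top_share` and the sigma values are assumptions, not taken from the thread or the doc. It relies on the standard closed form for a log-normal, under which the share of the total held by the top fraction p is Phi(sigma - z), where z is the standard normal quantile at 1 - p and the share does not depend on mu.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal: mean 0, SD 1


def lognormal_top_share(sigma: float, top_fraction: float = 0.01) -> float:
    """Share of the total held by the top `top_fraction` of a log-normal.

    Closed form: Phi(sigma - z), with z the standard normal quantile at
    1 - top_fraction. The location parameter mu cancels out.
    """
    z = _STD_NORMAL.inv_cdf(1.0 - top_fraction)
    return _STD_NORMAL.cdf(sigma - z)


# Hypothetical shape parameters, chosen only to span "narrow" to "wide".
for sigma in (0.1, 0.5, 1.0, 2.0):
    print(f"sigma={sigma}: top 1% share = {lognormal_top_share(sigma):.1%}")
```

With a small sigma the top 1% hold only a little over 1% of the total, barely more than under an equal split, even though the distribution is still log-normal; at sigma = 2 the share is roughly 37%. That is the sense in which "log-normal" alone says little about how much top performers matter.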
[2] Max_Daniel, 5mo: As an aside, for a good and philosophically rigorous criticism of cavalier assumptions of normality or (arguably) pseudo-explanations that involve the central limit theorem, I'd recommend Lyon (2014), Why are Normal Distributions Normal? Basically I think that whenever we are in the business of understanding how things actually work/"why" we're seeing the data distributions we're seeing, often-invoked explanations like the CLT or "multiplicative" CLT are kind of the tip of the iceberg that provides the "actual" explanation (rather than being literally correct by themselves), this iceberg having to do with the principle of maximum entropy / the tendency for entropy to increase / 'universality' and the fact that certain types of distributions are 'attractors' for a wide range of generating processes. I'm too much of an 'abstract algebra person' to have a clear sense of what's going on, but I think it's fairly clear that the folk story of "a lot of things 'are' normally distributed because of 'the' central limit theorem" is at best an 'approximation' and at worst misleading. (One 'mathematical' way to see this is that it's fishy that there are so many different versions of the CLT rather than one clear 'canonical' or 'maximally general' one. I guess stuff like this also is why I tend to find common introductions to statistics horribly unaesthetic and have had a hard time engaging with them.)
[2] Max_Daniel, 5mo: I kind of agree with this (and this is why I deliberately said that "they report a Gaussian distribution" rather than e.g. "performance is normally distributed"). In particular, yes, they just assumed a normal distribution and then ran with this in all cases in which it didn't lead to obvious problems/bad fits no matter the parameters. They did not compare Gaussian with other models. I still think it's accurate and useful to say that they were using (and didn't reject) a normal distribution as model for low- and medium-complexity jobs as this does tell you something about what the data looks like. (Since there is a lot of possible data where no normal distribution is a reasonable fit.) I also agree that probably a log-normal model is "closer to the truth" than a normal one. But on the other hand I think it's pretty clear that actually neither a normal nor a log-normal model is fully correct. Indeed, what would it mean that "jobs actually follow a certain type of distribution"? If we're just talking about fitting a distribution to data, we will never get a perfect fit, and all we can do is provide goodness-of-fit statistics for different models (which usually won't conclusively identify any single one). This kind of brute/naive empiricism just won't and can't get us to "how things actually work". On the other hand, if we try to build a model of the causal generating mechanism of job performance it seems clear that the 'truth' will be much more complex and messy - we will only have finitely many contributing things (and a log-normal distribution is something we'd get at best "in the limit"), the contributing factors won't all be independent etc. etc.
Indeed, "probability distribution" to me basically seems like the wrong type to talk about when we're in the business of understanding "how things actually work" - what we want then is really a richer and more complex model (in the sense that we could have several different models that would yield the same approximate
[4] Max_Daniel, 5mo: Yeah, I think we agree on the maths, and I'm quite sympathetic to your recommendations regarding framing based on this. In fact, emphasizing "top x% share" as metric and avoiding any suggestion that it's practically useful to treat "heavy-tailed" as a binary property were my key goals for the last round of revisions I made to the summary - but it seems like I didn't fully succeed. FWIW, I maybe wouldn't go quite as far as you suggest in some places. I think the issue of "mathematically 'heavy-tailed' distributions may not be heavy-tailed in practice in the everyday sense" is an instance of a broader issue that crops up whenever one uses mathematical concepts that are defined in asymptotic terms in applied contexts. To give just one example, consider that we often talk of "linear growth", "exponential growth", etc. I think this is quite useful, and that it would overall be bad to 'taboo' these terms and always replace them with some 'model-agnostic' metric that can be calculated for finitely many data points. But there we have the analogous issue that depending on the parameters an e.g. exponential function can for practical purposes look very much like a linear function over the relevant finite range of data. Another example would be computational complexity, e.g. when we talk about algorithms being "polynomial" or "exponential" regarding how many steps they require as function of the size of their inputs. Yet another example would be attractors in dynamical systems. In these and many other cases we encounter the same phenomenon that we often talk in terms of mathematical concepts that by definition only tell us that some property holds "eventually", i.e. in the limit of arbitrarily long amounts of time, arbitrarily much data, or similar. Of course, being aware of this really is important. In practice it often is crucial to have an intuition or more precise quantitative bounds on e.g. whether we have enough data points to be able to use some computational method
Forget replaceability? (for ~community projects)

I want to push back against a possible interpretation of this moderately strongly.

If the charity you are considering starting has a 40% chance of being 2x better than what is currently being done on the margin, and a 60% chance of doing nothing, I very likely want you to start it, naive 0.8x EV be damned. I could imagine wanting you to start it at much lower numbers than 0.8x, depending on the upside case. The key is to be able to monitor whether you are in the latter case, and stop if you are. Then you absorb a lot more money in the 40% case, and the actu... (read more)

Evidence Action are another great example of "stop if you are in the downside case" done really well.

[3] MichaelStJules, 6mo: I agree with giving more weight to upside when you can monitor results effectively and shut down if things don't go well, but you can actually model all of this explicitly. Maybe the model will be too imprecise to be very useful in many cases, but sensitivity analysis can help. You can estimate the effects in the case where things go well and you scale up, and in the case where you shut down, including the effects of diverting donations from effective charities in each case, and weight the conditional expectations by the probabilities of scaling up and shutting down. If I recall correctly, this is basically what Charity Entrepreneurship has done, with shutdown within the first 1 or 2 years in the models I looked at. Shutting down minimizes costs and diverting of funding. You wouldn't start a charity with a negative expected impact after including all of these effects, including the effects of diverting funding from other charities.
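
The "stop if you are in the downside case" reasoning can be sketched as a small expected-value calculation. Only the 40% / 2x / 60% figures come from the comment above; the pilot cost, the scale-up budget, the function name `relative_value`, and the assumption that money freed by an early shutdown goes to the marginal benchmark (valued at 1.0 per dollar) are hypothetical choices for illustration.

```python
def relative_value(p_success: float, multiplier: float,
                   pilot_cost: float, scale_budget: float,
                   can_stop: bool) -> float:
    """Expected value per dollar, relative to the marginal benchmark (1.0).

    Assumption: on success the charity absorbs pilot_cost + scale_budget at
    `multiplier` times the benchmark. On failure the pilot money achieves
    nothing; if the charity can stop early, the scale-up budget is assumed
    to be redirected to the benchmark instead of being wasted.
    """
    total = pilot_cost + scale_budget
    success_value = multiplier * total
    failure_value = 1.0 * scale_budget if can_stop else 0.0
    expected = p_success * success_value + (1.0 - p_success) * failure_value
    return expected / total


# 40% chance of 2x, 60% chance of nothing (from the comment);
# pilot of 1 unit and scale-up of 10 units are made-up numbers.
naive = relative_value(0.4, 2.0, pilot_cost=1, scale_budget=10, can_stop=False)
with_stop = relative_value(0.4, 2.0, pilot_cost=1, scale_budget=10, can_stop=True)
print(f"no monitoring: {naive:.2f}x, monitor and stop: {with_stop:.2f}x")
```

Under these assumptions the naive figure is the 0.8x from the comment, while the version that stops in the downside case comes out above 1x, because only the pilot spend is lost when things go badly. The result is sensitive to the made-up pilot and scale-up sizes, which is where the sensitivity analysis mentioned above would come in.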
RyanCarey's Shortform

I have a few thoughts here, but my most important one is that your (2), as phrased, is an argument in favour of outreach, not against it. If you update towards a much better way of doing good, and any significant fraction of the people you 'recruit' update with you, you presumably did much more good via recruitment than via direct work. 

Put another way, recruitment defers the question of how to do good into the future, and is therefore particularly valuable if we think our ideas are going to change/improve particularly fast. By contrast, recruitment (o... (read more)

[4] RyanCarey, 6mo: Good point - this has changed my model of this particular issue a lot (it's actually not something I've spent much time thinking about). I guess we should (by default) imagine that if at time T you recruit a person, that they'll do an activity that you would have valued, based on your beliefs at time T. Some of us thought that recruitment was even better, in that the recruited people will update their views over time. But in practice, they only update their views a little bit. So the uncertainty-bonus for recruitment is small. In particular, if you recruit people to a movement based on messaging in cause A, you should expect relatively few people to switch to cause B based on their group membership, and there may be a lot of within-movement tensions between those that do/don't. There are also uncertainty-penalties for recruitment. While recruiting, you crystallise your own ideas. You give up time that you might've used for thinking, and for reducing your uncertainties. On balance, recruitment now seems like a pretty bad way to deal with uncertainty.
Some quick notes on "effective altruism"

+1. A short version of my thoughts here is that I’d be interested in changing the EA name if we can find a better alternative, because it does have some downsides, but this particular alternative seems worse from a strict persuasion perspective.

Most of the pushback I feel when talking to otherwise-promising people about EA is not really as much about content as it is about framing: it’s people feeling EA is too cold, too uncaring, too Spock-like, too thoughtless about the impact it might have on those causes deemed ineffective, too naive to realise the imp... (read more)

Progress Open Thread: March 2021

I think similar adjustments should be made if you are extrapolating to crimes with very different prevalence. For example, the US murder rate is 4-5x that of the UK, but I wouldn’t expect the US to have that many more bike thefts.

Proxy seems fine if you’re focused on which country/city/etc. has higher overall crime, rather than estimating magnitude.

(FWIW, an attempt at Googling the above suggests ~300k bike thefts per year in UK versus 2m in US, US population 5x bigger so that’s only 1.33x the UK rate. A quick check on bicycle sales in the two countries does n... (read more)

[7] Gregory_Lewis, 6mo: A less important motivation/mechanism is that probabilities/ratios (instead of odds) are bounded above by one. For rare events 'doubling the probability' versus 'doubling the odds' get basically the same answer, but not so for more common events. Loosely, flipping a coin three times 'trebles' my risk of observing it landing tails, but the probability isn't 1.5. E.g. if you used the 80% definition instead of 20%, then the '4x' risk factor implied by 60% additional chance (with 20% base rate) would give instead an additional 240% chance. [(Of interest, 20% to 38% absolute likelihood would correspond to an odds ratio of ~2.5, in the ballpark of 3-4x risk factors discussed before. So maybe extrapolating extreme event ratios to less-extreme event ratios can do okay if you keep them in odds form. The underlying story might have something to do with logistic distributions closely resembling normal distributions (save at the tails), so thinking about shifting a normal distribution across the x axis so (non-linearly) more or less of it lies over a threshold loosely resembles adding increments to log-odds (equivalent to multiplying odds by a constant multiple) giving (non-linear) changes when traversing a logistic CDF. But it still breaks down when extrapolating very large ORs from very rare events. Perhaps the underlying story here may have something to do with higher kurtosis: '>2SD events' are only (I think) ~5X more likely than >3SD events for logistic distributions, versus ~20X in normal distribution land. So large shifts in likelihood of rare(r) events would imply large logistic-land shifts (which dramatically change the whole distribution, e.g. an OR of 10 makes evens --> >90%) but much more modest ones in normal-land (e.g. moving up an SD gives OR>10 for previously 3SD events, but ~2 for previously 'above average' ones)]
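
The difference between scaling probabilities and scaling odds can be checked with a few lines of arithmetic. This is a minimal sketch; the helper names (`to_odds`, `to_prob`, `odds_ratio`, `apply_odds_ratio`) are illustrative and not from the thread, while the 20%, 38%, 80% and 4x figures are the ones discussed above.

```python
def to_odds(p: float) -> float:
    """Convert a probability (0 <= p < 1) into odds in favour."""
    return p / (1.0 - p)


def to_prob(odds: float) -> float:
    """Convert odds in favour back into a probability."""
    return odds / (1.0 + odds)


def odds_ratio(p_before: float, p_after: float) -> float:
    """Odds ratio implied by moving from p_before to p_after."""
    return to_odds(p_after) / to_odds(p_before)


def apply_odds_ratio(p: float, ratio: float) -> float:
    """Probability obtained by multiplying the odds of p by `ratio`."""
    return to_prob(to_odds(p) * ratio)


# Multiplying a probability by 4 breaks down for common events ...
print(0.20 * 4, 0.80 * 4)              # second value exceeds 1, so it is not a probability
# ... whereas multiplying the odds by 4 always stays inside (0, 1).
print(apply_odds_ratio(0.20, 4), apply_odds_ratio(0.80, 4))
# Odds ratio implied by the 20% -> 38% estimate.
print(odds_ratio(0.20, 0.38))
# Three coin flips: chance of at least one tail, rather than 3 x 0.5.
print(1 - 0.5 ** 3)
```

Working in odds keeps every answer a valid probability, which is the sense in which extrapolating ratios "in odds form" behaves better for common events.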
Progress Open Thread: March 2021

(Arguably nitpicking, in the sense that I suspect this would not change the bottom line, posted because the use of stats here raised my eyebrows)

For some calibration, risk of drug abuse, which is a reasonable baseline for other types of violent behavior as well, is about 2-3x in adopted children. This is not conditioning on it being a teenager adoption, which I expect would likely increase the ratio to more something like 3-4x, given the additional negative selection effects. 

Sibling abuse rates are something like 20% (or 80% depending on your definit

... (read more)
[7] Habryka, 6mo: Yeah, I think this is a totally fair critique and I updated some after reading it! I wrote the above after a long Slack conversation with Aaron at like 2AM, just trying to capture the rough shape of the argument without spending too much time on it. I do think actually chasing this argument all the way through is interesting and possibly worth it. I think it's pretty plausible it could make a 2-3x difference in the final outcome (and possibly a lot more!), and I hadn't actually thought through it all the way. And while I had some gut sense it was important to differentiate between median and tail outcomes here, I hadn't properly thought through the exact relationship between the two and am appreciative of you doing some more of the thinking. I currently prefer your estimate of "moving it from 20% to 38%" as something like my best guess.
[8] Wei_Dai, 6mo: Thanks for this explanation. That part of Habryka's comment also struck me as very suspicious when I read it, but it wasn't immediately obvious what's wrong with it exactly.
Responses and Testimonies on EA Growth

I agree with a lot of this, and I appreciated both the message and the effort put into this comment. Well-substantiated criticism is very valuable.

I do want to note that GWWC being scaled back was flagged elsewhere, most explicitly in Ben Todd's comment (currently 2nd highest upvoted on that thread). But for example, Scott's linked reddit comment also alludes to this, via talking about the decreased interest in seeking financial contributions. 

But it's true that in neither case would I expect the typical reader to come away with the impression that a ... (read more)

[9] AnonymousEAForumAccount, 6mo: Thanks AGB! I do think it was a mistake to deprioritize GWWC, though I agree this is open to interpretation. But I want to clarify that my main point is that the EA community seems to have strong and worrisome cultural biases toward self-congratulation and away from critical introspection.
What do you make of the doomsday argument?
[3] Answer by AGB, Mar 19, 2021

With low confidence, I think I agree with this framing.

If correct, then I think the point is that seeing us at an 'early point in history' updates us against a big future, but the fact we exist at all updates in favour of a big future, and these cancel out.

You wake up in a mysterious box, and hear the booming voice of God:

“I just flipped a coin. If it came up heads, I made ten boxes, labeled 1 through 10 — each of which has a human in it.

If it came up tails, I made ten billion boxes, labeled 1 through 10 billion — also with one human in each box.

To get int

... (read more)
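
The "these cancel out" claim can be illustrated with a small Bayesian calculation. The quoted thought experiment is cut off before it says what you observe, so the sketch below assumes, purely for illustration, that you learn your box carries a low label (3 here). The function name `posterior_small_world` and the `weight_by_observers` switch are hypothetical: switching it on multiplies each hypothesis by its number of observers (the "fact we exist at all" update), while the likelihood of seeing any particular label is 1/N (the "early point in history" update).

```python
from fractions import Fraction


def posterior_small_world(n_small: int, n_large: int,
                          observed_label: int,
                          weight_by_observers: bool) -> Fraction:
    """Posterior probability of the small world after seeing a box label.

    Assumes a fair coin (prior 1/2 each) and that, within a world of N
    boxes, you are equally likely to be in any one of them.
    """
    weights = {}
    for n in (n_small, n_large):
        prior = Fraction(1, 2)
        existence = n if weight_by_observers else 1   # more boxes, more chances to exist
        likelihood = Fraction(1, n) if observed_label <= n else Fraction(0)
        weights[n] = prior * existence * likelihood
    return weights[n_small] / sum(weights.values())


# Ten boxes on heads, ten billion on tails (from the quote); label 3 is assumed.
print(posterior_small_world(10, 10**10, 3, weight_by_observers=True))
print(posterior_small_world(10, 10**10, 3, weight_by_observers=False))
```

With the existence weighting switched on, the factor of N and the factor of 1/N cancel and the posterior stays at the prior; with it switched off, the low label pushes almost all the probability onto the small world, which is the doomsday-style conclusion.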
[1] Ramiro, 6mo: In the "voice of God" example, we're guaranteed to minimize error by applying this reasoning; i.e., if God asks this question to every possible human created, and they all answer this way, most of them will be right. Now, I'm really unsure about the following, but imagine each new human predicts Doomsday through DA reasoning; in that case, I'm not sure it minimizes error the same way. We often assume human population will increase exponentially and then suddenly go extinct; but then it seems like most people will end up mistaken in their predictions. Maybe we're using the wrong priors?
Politics is far too meta

Weirdly, I found this post a bit 'too meta', in the sense that there are a lot of assertions and not a lot of effort to provide evidence or otherwise convince me that these claims are actually true. Some claims I agree with anyway (e.g. I think you can reasonably declare political feasibility 'out-of-scope' in early-stage brainstorming), some I don't. Here's the bit that my gut most strongly disagrees with:

A good test is to ask, when right things are done on the margin, what happens? When we move in the direction of good policies or correct statements, how

... (read more)
What Makes Outreach to Progressives Hard

This has been a philosophical commitment since the early days of EA, yet information on how we (or the charities we prioritize) actually confirm with recipients that our programs are having the predicted positive impact on them receives, AFAICT, little attention in EA.

[Within footnote] As an example, after ten minutes of searching I could not find information on GiveWell's overall view on this subject on their website.


FWIW, the most closely related Givewell article I'm aware of is How not to be a "white in shining armor". Relevant excerpts (emphasis ... (read more)

Worth noting that you might get increased meaningfulness in exchange for the lost happiness

FWIW, I think this accidentally sent this subthread off on a tangent because of the phrasing of 'in exchange for the lost happiness'. 

My read of the stats, similar to this Vox article and to what Robin actually said, is that people with children (by choice) are neither more nor less happy on average than childless people (by choice), so any substantial boost to meaning should be seen as a freebie, rather than something you had to give up happiness for.

I think th... (read more)

What Makes Outreach to Progressives Hard

Whether this is a ‘good’ answer would depend on your audience, but I think one true answer from a typical EA would be ‘I care about those things too, but I think that the global poor/nonhuman animals/future generations are even more excluded from decision-making (and therefore ignored) than POC/women/LGBT groups are, so that’s where I focus my limited time and money’.

I don’t actually think the cause area challenge is quite what is going on here; I can easily imagine advancing those things being considered cause areas if they had a stronger case.

What Makes Outreach to Progressives Hard

But also, I think a lot of people that end up at HLS don't think in those sort of Marxist/socialist class terms, but rather just have a sort of strong Rawlsian egalitarianism commitment.

I also think many people at HLS are hilariously unaware of their class privilege.

FWIW, I strongly agree with both of these statements for Oxbridge in the UK as well. 

The latter I think is a combination of a common dynamic where most people think they are closer to the middle of the income spectrum than they are, plus a natural human tendency to focus on the areas where you are being treated poorly or unfairly over the areas where you are being treated well. 

To this I would add:

Beware of the selection effect where I’d expect people with kids are less likely to come to meetups, less likely to post on this forum, etc. than EAs with overall-similar levels of involvement, so it can look like there are fewer than is actually the case, if you aren’t counting carefully.

For EA clusters in very-high-housing-cost areas specifically (Milan mentioned the Bay), I wouldn’t be surprised if the broader similar demographic is also avoiding children, since housing is usually the largest direct financial cost of having children,... (read more)

[2] Milan_Griffes, 6mo: fwiw I'm using "Bay Area Rationality" to point to a particular subculture (that which grew out of Overcoming Bias and LessWrong and is now centered around but not entirely contained by the Bay Area), and to disambiguate from the broader notion of "Rationality," which I understand to encompass many social movements, subcultures, and time periods.
How to make people appreciate asynchronous written communication more?
[20] Answer by AGB, Mar 10, 2021

Writing is just a lot more time-consuming to cover equivalent ground in my experience. I occasionally make the mistake of getting into multi-hour text conversations with people, and almost invariably look back and think we could have covered the same ground in a phone call lasting <25% as long.

[5] Khorton, 6mo: I agree with this, especially if you're trying to make a decision or solve a problem together. It's very difficult and time-consuming to negotiate solutions via text.
[1] konrad, 6mo: Yeah, agreed that your conclusion applies to the majority of interactions from a 1-off perspective. But I observe a decent amount of cases where it would be good to have literal documentation of statements, take-aways etc. because otherwise, you'll have to have many more phone calls. I'm especially thinking of co-working and other mutually agreed upon mid- to long-term coordination scenarios. In order to do collective world-modelling better, i.e. to find cruxes, prioritize, research, introspect, etc., it seems good to have more bandwidth AND more memory. But people routinely choose bandwidth over memory, without having considered the trade-off. I suspect that this is an unconscious choice and often leads to quite suboptimal outcomes for groups, as they become reliant on human superconnectors and those people's memory - as a local community-builder, this is what I am. And I can't trust my memory, so I outsource most of it in ways that are legible mostly to me - as it would be too costly for me to make it such that it's legible also for others. It is these superconnectors who have a disproportionate effect on the common knowledge and overall culture of a group. If the culture is being developed purposefully, you'd want really good documentation of it to remind, improve and onboard. Instead, most groups seem to have to rely on leadership and oral communication to coordinate. In part this might be because the pay-off of good documentation and building a culture that uses it is so long-term, that few are currently willing to pay for it? I am essentially wondering about the causal relationship here: are we (a) not paying for more resource-intensive coordination systems because we consciously aren't convinced of the value/possibility of it or are we (b) not convinced of the value/possibility of more resource-intensive coordination systems because we haven't actually tried all that much yet?
I suspect that we're in the scenario of "not actually having tried enough" b
Why Hasn't Effective Altruism Grown Since 2015?

Scattered thoughts on this, pointing in various directions. 

TL;DR: Measuring and interpreting movement growth is complicated.

Things I'm relatively confident about:

  1. You need to be careful about whether the thing you are looking at is a proxy for 'size of EA' or a proxy for a derivative, i.e. 'how fast is EA growing'. I think Google Trends searches for 'Effective Altruism' are mostly the latter; it's something people might do on the way into the movement, but not something I would ever do.
  2. After correcting for (1), my rough impression is that EA grew supe
... (read more)
Complex cluelessness as credal fragility

How to manage deep uncertainty over the long-run ramifications of one's decisions is a challenge across EA-land - particularly acute for longtermists, but also elsewhere: most would care about risks about how in the medium term a charitable intervention could prove counter-productive

This makes some sense to me, although if that's all we're talking about I'd prefer to use plain English since the concept is fairly common. I think this is not all other people are talking about though; see my discussion with MichaelStJules. 

FWIW, I don't think 'risks' is q... (read more)

2Gregory_Lewis6mo(Apologies in advance I'm rehashing unhelpfully) The usual cluelessness scenarios are more about that there may be powerful lever for impacting the future, and your intended intervention may be pulling it in the wrong direction (rather than a 'confirmed discovery'). Say your expectation for the EV of GiveDirectly on conflict has a distribution with a mean of zero but an SD of 10x the magnitude of the benefits you had previously estimated. If it were (e.g.) +10, there's a natural response of 'shouldn't we try something which targets this on purpose?'; if it were 0, we wouldn't attend to it further; if it meant you were -10, you wouldn't give to (now net EV = "-9") GiveDirectly. The right response where all three scenarios are credible (plus all the intermediates) but you're unsure which one you're in isn't intuitively obvious (at least to me). Even if (like me) you're sympathetic to pretty doctrinaire standard EV accounts (i.e. you quantify this uncertainty + all the others and just 'run the numbers' and take the best EV) this approach seems to ignore this wide variance, which seems to be worthy of further attention. The OP tries to reconcile this with the standard approach by saying this indeed often should be attended to, but under the guise of value of information rather than something 'extra' to orthodoxy. Even though we should still go with our best guess if we to decide (so expectation neutral but high variance terms 'cancel out'), we might have the option to postpone our decision and improve our guesswork. Whether to take that option should be governed by how resilient our uncertainty is. If your central estimate of GiveDirectly and conflict would move on average by 2 units if you spent an hour thinking about it, that seems an hour well spent; if you thought you could spend a decade on it and remain where you are, going with the current best guess looks better. 
This can be put in plain(er) English (although familiar-to-EA jargon like 'EV' may remain). Yet
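A rough way to put numbers on the value-of-information framing above is the sketch below. It assumes the direct benefit of the donation is +1 (implied by the "net EV = -9" figure), that not giving is worth 0, and that the three scenarios (+10, 0, -10) are equally likely; those probabilities and the function name are illustrative assumptions, not part of the comment.

```python
from fractions import Fraction


def expected_value_of_perfect_information(scenarios, direct_benefit):
    """Upper bound on what resolving the uncertainty is worth.

    scenarios: list of (probability, indirect_effect) pairs (hypothetical inputs).
    The alternative to giving is assumed to be worth 0.
    """
    # Best guess today: give only if the expected net effect is positive.
    ev_give = sum(p * (direct_benefit + effect) for p, effect in scenarios)
    ev_decide_now = max(ev_give, 0)
    # With perfect information we would give only in scenarios where net EV > 0.
    ev_decide_informed = sum(
        p * max(direct_benefit + effect, 0) for p, effect in scenarios
    )
    return ev_decide_informed - ev_decide_now


third = Fraction(1, 3)
scenarios = [(third, 10), (third, 0), (third, -10)]  # assumed equal credence
print(expected_value_of_perfect_information(scenarios, direct_benefit=1))  # 3
```

Under these assumptions, deciding now is worth 1 and deciding with full knowledge is worth 4, so investigation is worth at most 3. How much of that is actually available depends on how far an hour (or a decade) of thought would move the credences, which is the resilience point made above.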
1tae6moNice, thanks for sharing!
Complex cluelessness as credal fragility

Not sure if you were referring to that particular post or the whole sequence. If I follow it correctly, I think that particular post is trying to answer the question 'how can we plausibly impact the long-term future, assuming it's important to do so'. I think it's a pretty good treatment of that question!

But I wouldn't mentally file that under cluelessness as I understand the term, because that would also be an issue under ordinary uncertainty. To the extent you explain how cluelessness is different to garden-variety uncertainty and why we can't deal with ... (read more)

2Milan_Griffes7moIs there a tl;dr of the distinction you're drawing between normal uncertainty and clueless uncertainty?
Complex cluelessness as credal fragility

I think if you reject incomparability, you're essentially assuming away complex cluelessness and deep uncertainty.

That's really useful, thanks, at the very least I now feel like I'm much closer to identifying where the different positions are coming from. I still think I reject incomparability; the example you gave didn't strike me as compelling, though I can imagine it compelling others. 

So, while I might just pick an option if forced to choose between A, B and indifferent, it doesn't reveal a ranking, since you've eliminated the option I'd want to g

... (read more)
Complex cluelessness as credal fragility

Thanks again. I think my issue is that I’m unconvinced that incomparability applies when faced with ranking decisions. In a forced choice between A and B, I’d generally say you have three options: choose A, choose B, or be indifferent.

Incomparability in this context seems to imply that one could be indifferent between A and B, prefer C to A, yet be indifferent between C and B. That just sounds wrong to me, and is part of what I was getting at when I mentioned transitivity. I'm curious whether you have a concrete example where this feels intuitive.

For the second hal... (read more)

2MichaelStJules7moI think if you reject incomparability, you're essentially assuming away complex cluelessness and deep uncertainty. The point in this case is that there are considerations going in each direction, and I don't know how to weigh them against one another (in particular, no evidential symmetry). So, while I might just pick an option if forced to choose between A, B and indifferent, it doesn't reveal a ranking, since you've eliminated the option I'd want to give, "I really don't know". You could force me to choose among wrong answers to other questions, too. B = business as usual / "doing nothing". C = working on a cause you have complex cluelessness about, i.e. you're not willing to say it's better or worse than or equivalent to B (e.g. for me, climate change is an example). A = C but also torturing a dog that was about to be put down anyway (or maybe generally just being mean to others). I'm willing to accept that C>A, although I could see arguments made for complex cluelessness about that comparison (e.g. through the indirect effects of torturing a dog on your work, that you already have complex cluelessness about). Torturing a dog, however, could be easily dominated by the extra effects of climate change in A or C compared to B, so it doesn't break the complex cluelessness that we already had comparing B and C. Some other potential examples here, although these depend on how the numbers work out.
Deference for Bayesians

I mostly agree with this. Of course, to notice that you have to know (2)/(3) are part of the ‘expert belief set’, or at least it really helps, which you easily might not have done if you relied on Twitter/Facebook/headlines for your sense of ‘expert views’.

And indeed, I had conversations where pointing those things out to people updated them a fair amount towards thinking that masks were worth wearing.

In other words, even if you go and read the expert view directly and decide it doesn’t make sense, I expect you to end up in a better epistemic position than... (read more)

6Linch7moUpon reflection, I want to emphasize that I strongly agree with your general point that in the world we live in, on the margin people probably ought to listen directly to what experts say. Unfortunately, I think this is in the general category of other advice like "do the homework" (eg, read original sources, don't be sloppy with the statistics, read original papers, don't just read the abstract or press release, read the original 2-sentence quote before taking somebody else's 1-sentence summary at face value, etc), and time/attention/laziness constraints may make taking this advice to heart prohibitively costly (or be perceived this way). I certainly think it's unfortunate that the default information aggregation systems we have (headlines, social media, etc) are not quite up to the task of accurately representing experts. I think this is an important and (in the abstract) nontrivial point, and I'm a bit sad that our best solution here appears to be blaming user error. I strongly agree, though I usually feel much more strongly about this for evidence than for arguments! :P
Deference for Bayesians

A quibble on the masks point because it annoys me every time it's brought up. As you say, it's pretty easy to work out that masks stop an infected person from projecting nearly as many droplets into the air when they sneeze, cough, or speak, study or no study. But virtually every public health recommendation that was rounded off as 'masks don't work' did in fact recommend that infected people should wear masks. For example, the WHO advice that the Unherd article links to says:

Among the general public, persons with respiratory symptoms or those caring for C

... (read more)

I realize that this is kind of a tangent to your tangent, but I don't think the general conjunction of (Western) expert views in 2020 was particularly defensible. Roughly speaking, the views (which I still sometimes hear parroted by Twitter folks) were something like

  1. For most respiratory epidemics, (surgical) masks are effective at protecting wearers in medical settings.
  2. They are also effective as a form of source control in medical settings.
  3. They should be effective as a form of source control in community transmission.
  4. However, there is i
... (read more)
Complex cluelessness as credal fragility

I certainly wouldn't walk on by, but that's mainly due to a mix of factoring in moral uncertainty (deontologists would think me the devil) and not wanting the guilt of having walked on by.

This makes some sense, but to take a different example, I've followed a lot of the COVID debates in EA and EA-adjacent circles, and literally not once have I seen cluelessness brought up as a reason to be concerned that maybe saving lives via faster lockdowns or more testing or more vaccines or whatever is not actually a good thing to do. Yet it seems obvious that some le... (read more)

1jackmalde7moTo be fair I would say that taking the cluelessness critique seriously is still quite fringe even within EA (my poll on Facebook provided some indication of this). With an EA hat on I want us to sort out COVID because I think COVID is restricting our ability to do certain things that may be robustly good. With a non-EA hat on I want us to sort out COVID because lockdown is utterly boring (although it actually got me into EA and this forum a bit more which is good) and I don't want my friends and family (or myself!) to be at risk of dying from it. Most people have decided to obey lockdowns and be careful in how they interact with others, in order to save lives. In terms of EAs not doing more (e.g. donating money) I think this comes down to the regular argument of COVID not being that neglected and that there are probably better ways to do good. In terms of saving lives, I think deontologists require you to save a drowning child in front of you, but I'm not actually sure how far that obligation extends temporally/spatially. This is interesting and slightly difficult to think about. I think that when I encounter decisions in non-EA-life that I am complexly clueless about, I let my personal gut feeling take over. This doesn't feel acceptable in EA situations because, well, EA is all about not letting personal gut feelings take over. So I guess this is my tentative answer to Greaves' question.
In diversity lies epistemic strength

I'm claiming the latter, yes. I do agree it's hard to prove, but I place high subjective credence (~88%) on it. Put simply, if I can directly observe factors that would tend to lower the representation of WEIRD ethnic minorities, I don't necessarily need to have an estimate of the percentages of WEIRD people who are ethnic minorities, or even of the percentage of people in EA who are from ethnic minorities. I only need to think that the factors are meaningful enough to lead to meaningful differences in representation, and not being offset by comparably-mean... (read more)

Complex cluelessness as credal fragility

Medium-term indirect impacts are certainly worth monitoring, but they have a tendency to be much smaller in magnitude than the primary impacts being measured, in which case they don’t pose much of an issue; to the best of my current knowledge, carbon emissions from saving lives are a good example of this.

Of course, one could absolutely think that a dollar spent on climate mitigation is more valuable than a dollar spent saving the lives of the global poor. But that’s very different to the cluelessness line of attack; put harshly it’s the difference between choosi... (read more)

5jackmalde7moPerhaps, although I wouldn't say it's a priori obvious, so I would have to read more to be convinced. I didn't raise animal welfare concerns either which I also think are relevant in the case of saving lives. In other words I'm not sure you need to raise future effects for cluelessness worries to have bite, although I admit I'm less sure about this. I certainly wouldn't walk on by, but that's mainly due to a mix of factoring in moral uncertainty (deontologists would think me the devil) and not wanting the guilt of having walked on by. Also I'm certainly not 100% sure about the cluelessness critique, so there's that too. The cluelessness critique seems sufficient to me to want to search for other ways than AMF to do the most good, but not to literally walk past a drowning child.
Complex cluelessness as credal fragility

Thanks for the response, but I don't think this saves it. In the below I'm going to treat your ranges as being about the far future impacts of particular actions, but you could substitute for 'all the impacts of particular actions' if you prefer.

In order for there to be useful things to say, you need to be able to compare the ranges. And if you can rank the ranges ("I would prefer 2 to 1" "I am indifferent between 3 and 4", etc.), and that ranking obeys basic rules like transitivity, that seems equivalent to collapsing all the ranges to single numbers.... (read more)

2MichaelStJules7moYa, it's a weak ordering, so you can't necessarily collapse them to single numbers, because of incomparability. [1, 1000] and [100, 105] are incomparable. If you tried to make them equivalent, you could run into problems, say with [5, 50], which is also incomparable with [1, 1000] but dominated by [100, 105]. That is: [5, 50] < [100, 105], and [1, 1000] is incomparable to the other two. If your set of options was just these 3, then, sure, you could say [100, 105] and [1, 1000] are equivalent since neither is dominated, but if you introduce another option which dominates one but not the other, that equivalence would be broken. I think there are two ways of interpreting "make the far future better": 1. compared to doing nothing/business as usual, and 2. compared to a specific other option. 1 implies 2, but 2 does not imply 1. It might be the case that none of the options look robustly better than doing nothing, but still some options are better than others. For example, writing their expected values as the difference with doing nothing, we could have: 1. [-2, 1] 2. [-1, 2] 3. 0 (do nothing) and suppose specifically that our distributions are such that 2 always dominates 1, because of some correspondence between pairs of distributions. For example, although I can think up scenarios where the opposite might be true, it seems going out of your way to torture an animal to death (for no particular benefit) is dominated at least by killing them without torturing them. Basically, 1 looks like 2 but with extra suffering and the harms to your character. In this scenario, we can't reliably make the world better, compared to doing nothing, but we still have that option 2 is better than option 1.
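One way to make the "2 always dominates 1" case concrete is the sketch below. It assumes each option's expected value has been written down under the same list of candidate probability distributions, so the comparison can be made pairwise; the three EVs per option are made-up numbers chosen only to span the ranges [-2, 1] and [-1, 2] from the comment, and the function name is hypothetical.

```python
from typing import Sequence


def dominates_under_every_distribution(a: Sequence[float], b: Sequence[float]) -> bool:
    """a[i] and b[i] are the EVs of two options under the same i-th distribution.

    Returns True if a is at least as good as b under every distribution
    and strictly better under at least one.
    """
    if len(a) != len(b):
        raise ValueError("both options need an EV for every distribution")
    pairs = list(zip(a, b))
    return all(x >= y for x, y in pairs) and any(x > y for x, y in pairs)


# Hypothetical EVs relative to doing nothing, one entry per candidate distribution.
option_1 = [-2.0, -0.5, 1.0]   # spans the range [-2, 1]
option_2 = [-1.0, 0.5, 2.0]    # spans [-1, 2]; always 1 better than option 1
do_nothing = [0.0, 0.0, 0.0]

print(dominates_under_every_distribution(option_2, option_1))    # True
print(dominates_under_every_distribution(option_2, do_nothing))  # False
print(dominates_under_every_distribution(do_nothing, option_2))  # False
```

The ranges overlap, so comparing endpoints alone would call options 1 and 2 incomparable; it is the pairing of distributions that lets 2 beat 1 while both stay incomparable with doing nothing.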
Complex cluelessness as credal fragility

Just to be clear I also think that we can tractably influence the far future in expectation (e.g. by taking steps to reduce x-risk). I'm not really sure how that resolves things.

If you think you can tractably impact the far future in expectation, AMF can impact the far future in expectation. At which point it's reasonable to think that those far future impacts could be predictably negative on further investigation, since we weren't really selecting for them to be positive. I do think trying to resolve the question of whether they are negative is probably a... (read more)

1jackmalde7moOn a slightly different note, I can understand why one might not think we can tractably impact the far future, but what about the medium-term future? For example it seems that mitigating climate change is a pretty surefire way to improve the medium-term future (in expectation). Would you agree with that? If you accept that then you might also accept that we are clueless about giving to AMF based on its possible medium-term climate change impacts (e.g. maybe giving to AMF will increase populations in the near to medium term, and this will increase carbon emissions). What do you think about this line of reasoning?
Complex cluelessness as credal fragility

I'm not sure how to parse this 'expectation that is neither positive nor negative or zero but still somehow impacts decisions' concept, so maybe that's where my confusion lies. If I try to work with it, my first thought is that not giving money to AMF would seem to have an undefined expectation for the exact same reason that giving money to AMF would have an undefined expectation; if we wish to avoid actions with undefined expectations (but why?), we're out of luck and this collapses back to being decision-irrelevant.

I have read the paper. I'm surprised yo... (read more)

2MichaelStJules7moI would put it as entertaining multiple probability distributions for the same decision, with different expected values. Even if you have ranges of (so not singly defined) expected values, there can still be useful things you can say. Suppose you have 4 different acts with EVs in the following ranges: 1. [-100, 100] (say this is AMF in our example) 2. [5, 50] 3. [1, 1000] 4. [100, 105] I would prefer each of 2, 3 and 4 to 1, since they're all robustly positive, while 1 is not. 4 is also definitely better in expectation than 1 and 2 (according to the probability distributions we're considering), since its EV falls completely to the right of each's, so this means neither 1 nor 2 is permissible. Without some other decision criteria or information, 3 and 4 would both be permissible, and it's not clear which is better.
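The permissibility rule described above can be sketched as follows. This is a simplification that only looks at the endpoints of each EV range: one act is treated as better than another when its whole range lies at or to the right of the other's. The function names are hypothetical, and the rule ignores any correspondence between the underlying distributions.

```python
from typing import Dict, List, Tuple

Interval = Tuple[float, float]  # (lowest EV, highest EV) across the distributions considered


def dominates(a: Interval, b: Interval) -> bool:
    """True if a's range lies entirely at or to the right of b's range."""
    return a != b and a[0] >= b[1]


def permissible(options: Dict[str, Interval]) -> List[str]:
    """Return the options that no other option dominates."""
    return [
        name
        for name, ev in options.items()
        if not any(dominates(other, ev) for other in options.values())
    ]


acts = {
    "1": (-100, 100),  # AMF in the example
    "2": (5, 50),
    "3": (1, 1000),
    "4": (100, 105),
}
print(permissible(acts))  # ['3', '4']
```

Acts 1 and 2 drop out because act 4's range sits to the right of both, while 3 and 4 overlap and so neither rules the other out.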
1jackmalde7moIt's not so much that we should avoid doing it full stop, it's more that if we're looking to do the most good then we should probably avoid doing it because we don't actually know if it does good. If you don't have your EA hat on then you can justify doing it for other reasons. I've only properly read it once and it was a while back. I just remember it having quite an effect on me. Maybe I read it a few times to fully grasp it, can't quite remember. I'd be quite surprised if it immediately clicked for me to be honest. I clearly don't remember it that well because I forgot that Greaves had that discussion about the psychology of cluelessness which is interesting. Just to be clear I also think that we can tractably influence the far future in expectation (e.g. by taking steps to reduce x-risk). I'm not really sure how that resolves things. I'm surprised to hear you say you're unsure you disagree with Greaves. Here's another quote from her (from here). I'd imagine you disagree with this?
Complex cluelessness as credal fragility

So yes we are in fact predictably influencing the far future by giving to AMF, in that we know we will be affecting the number of people who will live in the future. However, I wouldn't say we are influencing the far future in a 'tractable way' because we're not actually making the future better (or worse) in expectation


If we aren't making the future better or worse in expectation, it's not impacting my decision whether or not to donate to AMF. We can then safely ignore complex cluelessness for the same reason we would ignore simple cluelessness.

Clue... (read more)

8jackmalde7moSaying that the long-run effects of giving to AMF are not positive or negative in expectation is not the same as saying that the long-run effects are zero in expectation. The point of complex cluelessness is that we don't really have a well-formed expectation at all because there are so many foreseeable complex factors at play. In simple cluelessness there is symmetry across acts so we can say the long-run effects are zero in expectation, but in complex cluelessness we can't say this. If you can't say the long-run effects are zero in expectation, then you can't ignore the long-run effects. I think all of this is best explained in Greaves' original paper.
Complex cluelessness as credal fragility

I think there's a difference between the muddy concept of 'cause areas' and actual specific charities/interventions here. At the level of cause areas, there could be overlap, because I agree that if you think the Most Important Thing is to expand the moral circle, then there are things in the animal-substitute space that might be interesting, but I'd be surprised and suspicious (not infinitely suspicious, just moderately so) if the actual bottom-line charity-you-donate-to was the exact same thing as what you got to when trying to minimise the suffering of ... (read more)

Complex cluelessness as credal fragility

If we have good reason to expect important far future effects to occur when donating to AMF, important enough to change the sign if properly included in the ex ante analysis, that is equivalent to (actually somewhat stronger than) saying we can tractably influence the far future, since by stipulation AMF itself now meaningfully and predictably influences the far future. I currently don't think you can believe the first and not the second, though I'm open to someone showing me where I'm wrong.

8jackmalde7moThere's an important and subtle nuance here. Note that complex cluelessness only arises when we know something about how the future will be impacted, but don't know enough about these foreseeable impacts to know if they are net good or bad when taken in aggregation. If we knew literally nothing about how the future would be impacted by an intervention this would be a case of simple cluelessness, not complex cluelessness, and Greaves argues we can ignore simple cluelessness. What Greaves argues is that we don't in fact know literally nothing about the long-run impacts of giving to GiveWell charities. For example, Greaves says we can be pretty sure there will be long-term population effects of giving to AMF, and that these effects will be very important in the moral calculus (so we know something). But she also says, amongst other things, that we can't be sure even if the long-run effect on population will be positive or negative (so we clearly don't know very much). So yes we are in fact predictably influencing the far future by giving to AMF, in that we know we will be affecting the number of people who will live in the future. However, I wouldn't say we are influencing the far future in a 'tractable way' because we're not actually making the future better (or worse) in expectation, because we are utterly clueless. Making the far future better or worse in expectation is the sort of thing longtermists want to do and they claim there are some ways to do so.
In diversity lies epistemic strength

FWIW, I don’t think your argument goes through for ethnic diversity either; EA is much whiter than its WEIRD base. I agree aiming to match the ethnic diversity of the world would be a mistake.

(Disclaimer: Not white)

3Aditya Vaze7moAre you saying ethnic minorities in the West are less likely to be WEIRD and hence underrepresented in EA, or that ethnic minorities who are WEIRD are underrepresented in EA? The former wouldn't surprise me at all, given the significant disparities in income and educational opportunity between ethnic minorities in the West. The latter would surprise me, but I'm not sure how you would go about proving it, since it would require you to already have an estimate of demographics of true WEIRDs, and I'm not sure how you'd go about collecting that. Unless the assumption is that any educated person from a Western developed country is WEIRD, which I would disagree with.
80,000 Hours one-on-one team plans, plus projects we’d like to see

Spitballing here, but have you considered putting some thoughts to this effect on your website? Currently, the relevant part of the 80k website reads as follows.

Why wasn’t I accepted?

Unfortunately, due to overwhelming demand, we can’t advise everyone who applies. However, we’re confident that everyone who is reading this has what it takes to lead a fulfilling, high impact career. Our key ideas series contains lots of our best advice on this topic – we hope you’ll find it useful.

If you’re thinking of re-applying, you can improve your chances by:

  1. Reading our
... (read more)
7Michelle_Hutchinson1moThanks for this feedback. I had a go at rewriting our 'why wasn't I accepted' FAQ. It now reads: WHY WASN’T I ACCEPTED? We sincerely regret that we can’t advise everyone who applies. We read every application individually and are thankful that you took the time to apply. It’s really touching reading about people who have come across 80,000 Hours and are excited about using their careers to help others. We aim to talk to the people we think we can help most. Our not speaking with you does not mean we think you won’t have a highly impactful career. Whether we can be helpful to you sometimes depends on contingent factors like whether one of our advisers happens to know of a role or introduction right now that might be a good fit for you. We also have far less information about you than you do, so we aren’t even necessarily making the right calls about who we can help most. You’re very welcome to reapply, particularly if your situation changes. If you’re thinking of doing so, it might be worth reading our key ideas series and trying out our career planning process, which we developed to help people think through their career decisions. You can also get involved in our community to get help from other people trying to do good with their careers.
Complex cluelessness as credal fragility

Many of the considerations regarding the influence we can have on the deep future seem extremely hard, but not totally intractable, to investigate. Offering naive guestimates for these, whilst lavishing effort to investigate easier but less consequential issues, is a grave mistake. The EA community has likely erred in this direction.


Yet others, those of complex cluelessness, do not score zero on ‘tractability’. My credence in “economic growth in poorer countries is good for the longterm future” is fragile: if I spent an hour (or a week, or a decade) mul

... (read more)
6Gregory_Lewis6moBelatedly: I read the stakes here differently to you. I don't think folks thinking about cluelessness see it as substantially an exercise in developing a defeater to 'everything which isn't longtermism'. At least, that isn't my interest, and I think the literature has focused on AMF etc. more as a salient example to explore the concepts, rather than an important subject to apply them to. The AMF discussions around cluelessness in the OP are intended as a toy example - if you like, deliberating purely on "is it good or bad to give to AMF versus this particular alternative?" instead of "Out of all options, should it be AMF?" Parallel to you, although I do think (per OP) AMF donations are net good, I also think (per the contours of your reply) it should be excluded as a promising candidate for the best thing to donate to: if what really matters is how the deep future goes, and the axes of these accessible at present are things like x-risk, interventions which are only tangentially related to these are so unlikely to be best they can be ruled-out ~immediately. So if that isn't a main motivation, what is? Perhaps something like this: 1) How to manage deep uncertainty over the long-run ramifications of one's decisions is a challenge across EA-land - particularly acute for longtermists, but also elsewhere: most would care about risks about how in the medium term a charitable intervention could prove counter-productive. In most cases, these mechanisms for something to 'backfire' are fairly trivial, but how seriously credible ones should be investigated is up for grabs. Although "just be indifferent if it is hard to figure out" is a bad technique which finds little favour, I see a variety of mistakes in and around here. E.g.: a) People not tracking when the ground of appeal for an intervention has changed. Although I don't see this with AMF, I do see this in and around animal advocacy. 
One crucial consideration around here is WAS, particularly an 'inverse logic of the larde
2Milan_Griffes7moDo you feel that the frame I offered here has no decision-relevance?
8jackmalde7moGreat comment. I agree with most of what you've said. Particularly that trying to uncover if donating to AMF is going to be a great way to improve the long-run future seems a fool's errand. This is where my quibble comes in: I don't think this is true. Cluelessness arguments intend to demonstrate that we can't be confident that we are actually doing good when we donate to say GiveWell charities, by noting that there are important indirect/long-run effects that we have good reason to expect will occur, and have good reason to suspect are sufficiently important such that they could change the sign of GiveWell's final number if properly included in their analysis. It seems to me that this should have an impact on any person donating to GiveWell charities unless for some reason these people just don't care about indirect/long-run effects of their actions (e.g. they have a very high rate of pure time preference). In reality though you'll be hard-pressed to find EAs who don't think indirect/long-run effects matter, so I would expect cluelessness arguments to have bite for many EAs. I don't think you need to demonstrate to people that we can tractably influence the far future for them to be impacted by cluelessness arguments. It is certainly possible to think cluelessness arguments are problematic for justifying giving to GiveWell charities, and also to think we can't tractably affect the far future. At that point you may be left in a tough place but, well, tough! You might at that point be forgiven for giving up on EA entirely as Greaves notes. (By the way I recall us having a similar convo on Facebook about this, but this is certainly a better place to have it!)
9MichaelStJules7moFor what it's worth, I think it's plausible that some interventions chosen for their short-term effects may be promising candidates for longtermist interventions. If you thought that s-risks were important and that larger moral circles mitigate s-risks, then plant-based and cultured animal product substitutes might be promising, since these seem most likely to shift attitudes towards animals the most and fastest, and this would (hopefully) help make the case for wild animals and artificial sentience next. Maybe direct advocacy for protections for artificial sentience would be best, though, but I wouldn't be surprised if you'd still at least want to target animals somewhat, since this seems more incremental and the step to artificial sentience is greater. That being said, depending on how urgent s-risks are, how exactly we should approach animal product substitutes may be different for the short term and long term. Longtermist-focused animal advocacy might be different in other ways, too; see this post. Furthermore, if growth is fast enough in the future (exponentially? EDIT: It might hit a cubic limit due to physical constraints), and the future growth rate can't reliably be increased, then growth today may have a huge effect on wealth in the long term. The difference $Xa^t - Ya^t$ goes to $\pm\infty$ as $t \to \infty$, if $X \neq Y$. If our sphere of influence grows exponentially fast, and our moral circle expands gradually, then you can make a similar argument supporting expanding our moral circle more quickly now.
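The growth claim can be spelled out in one line. This reads X and Y as two possible wealth levels today and a > 1 as a shared, fixed growth factor per period; that reading of the symbols, and the constant growth factor, are assumptions rather than something stated in the comment.

```latex
X a^{t} - Y a^{t} = (X - Y)\,a^{t}, \qquad a > 1
\quad\Longrightarrow\quad
\lim_{t \to \infty} (X - Y)\,a^{t} =
\begin{cases}
+\infty & \text{if } X > Y,\\
-\infty & \text{if } X < Y.
\end{cases}
```

So any gap in the starting level, however small, compounds without bound as long as the growth factor itself is unchanged. Under the cubic limit mentioned in the EDIT the gap would instead grow polynomially.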
Retention in EA - Part III: Retention Comparisons

So my first reaction to the Youth Ministry Adherence data was basically the opposite of yours, in that I looked at it and thought 'seems like they are doing a (slightly) better job of retention'. Reviewing where we disagree, I think there's a tricky thing here about distinguishing between 'dropout' rates and 'decreased engagement' rates. Ben Todd's estimates which you quote are explicitly trying to estimate the former, but when you compare to:

those listed as “engaged disciples” who continue to self-report as “high involvement”

...I think you might end u... (read more)

4 Ben_West 7mo Interesting, thanks! Something which probably isn’t obvious without reading the methods (pages 125-127) is that study participants were recruited through church mailing lists and Facebook groups. So the interpretation of that statistic is “of the people who answer surveys from their church, 92% report at least moderate engagement”. “Moderate engagement” is defined as an average of a bunch of questions, but roughly it means someone who attends church at least once per month. I think that definition of “moderate engagement” is a bit higher than “willing to answer surveys from my church” (as evidenced by the people who answered the survey but did not report moderate engagement), but it’s not a ton higher, so I’m hesitant to read too much into the percentage who report moderate engagement. I felt like “high engagement” was enough above “willing to answer a survey” that some value could be gotten from the statistic, but even there I’m hesitant to conclude too much, and wouldn’t blame someone who discounted the entire result because of the research method (or interpreted the result in a pretty different way from me). If we want to compare it to Ben’s EA estimates: I guess one analog would be to look at people who attended that weekend away but also answered the EA survey five years later. I’m not sure if such a data set exists.
Retention in EA - Part III: Retention Comparisons

Cool series, thanks for sharing on the forum. One nitpick:

ACE estimates that the average vegetarian stays vegetarian for 3.9-7.2 years, implying a five-year dropout rate of 14-26%.

I'm not sure how your rate is being calculated from ACE's figures here, but at first pass it seems wrong? Since 5 years is within but slightly towards the lower end of the range given for how long the average vegetarian stays vegetarian, I'd assume we'd end up with something more like a ~45% five-year dropout rate. By contrast, a 14-26% five-year dropout rate would suggest that &... (read more)

2 Ben_West 7mo Thanks Alex! You are correct. I accidentally put the annual dropout rates there instead of five-year dropout rates. The implied five-year rate is 53%-77%, approximately in line with Ben’s estimates for GWWC members. I’ve updated the text accordingly.
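The arithmetic behind the correction can be sketched as follows. It assumes a constant annual dropout probability equal to one over the mean time spent vegetarian, compounded over five years. That model is an assumption for illustration, not a method the thread spells out, though it does reproduce both the 14-26% annual figures and the 53%-77% five-year figures. The function names are hypothetical.

```python
def annual_dropout_rate(mean_years: float) -> float:
    """Yearly dropout probability implied by an average stay of `mean_years`.

    Assumption: dropout is equally likely every year (a geometric model),
    so the mean stay is 1 / annual_rate.
    """
    return 1.0 / mean_years


def cumulative_dropout(annual_rate: float, years: int) -> float:
    """Probability of having dropped out at some point within `years` years."""
    return 1.0 - (1.0 - annual_rate) ** years


# ACE's range for how long the average vegetarian stays vegetarian.
for mean_years in (7.2, 3.9):
    annual = annual_dropout_rate(mean_years)
    five_year = cumulative_dropout(annual, 5)
    print(f"mean stay {mean_years} years: annual {annual:.0%}, five-year {five_year:.0%}")
```

Under a different survival model (for example, one where most dropout happens in the first year) the five-year figure would differ, so the numbers are only as good as the constant-rate assumption.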
Introduction to Longtermism

Thank you for this! This is not the kind of post that I expect to generate much discussion, since it's relatively uncontroversial in this venue, but is the kind of thing I expect to point people to in future. 

I want to particularly draw attention to a pair of related quotes partway through your piece:

I've tried explaining the case for longtermism in a way that is relatively free of jargon. I've argued for a fairly minimal version — that we may be able to influence the long-run future, and that aiming to achieve this is extremely good and important. Th

... (read more)
Money Can't (Easily) Buy Talent

I was surprised to discover that this doesn't seem to have already been written up in detail on the forum, so thanks for doing so. The same concept has been written up in a couple of other (old) places, one of which I see you linked to and I assume inspired the title:

GiveWell: We can't (simply) buy capacity

80000 Hours: Focus more on talent gaps, not funding gaps

The 80k article also has a disclaimer and a follow-up post that felt relevant here; it's worth being careful about a word as broad as 'talent':

Update April 2019: We think that our use of the term ‘t

... (read more)
richard_ngo's Shortform

But for the purposes of my questions above, that's not the relevant factor; the relevant factor is: does someone know, and have they made those arguments [that specific intervention X will wildly outperform] publicly, in a way that we could learn from if we were more open to less quantitative analysis?

I agree with this. I think the best way to settle this question is to link to actual examples of someone making such arguments. Personally, my observation from engaging with non-EA advocates of political advocacy is that they don't actually make a case; when ... (read more)

richard_ngo's Shortform

I think we’re still talking past each other here.

You seem to be implicitly focusing on the question ‘how certain are we these will turn out to be best’. I’m focusing on the question ‘Denise and I are likely to make a donation to near-term human-centric causes in the next few months; is there something I should be donating to above GiveWell charities’.

Listing unaccounted-for second order effects is relevant for the first, but not decision-relevant until the effects are predictable-in-direction and large; it needs to actually impact my EV meaningfully. Curre... (read more)

2 richard_ngo 8mo Hmm, I agree that we're talking past each other. I don't intend to focus on ex post evaluations over ex ante evaluations. What I intend to focus on is the question: "when an EA makes the claim that GiveWell charities are the charities with the strongest case for impact in near-term human-centric terms, how justified are they?" Or, relatedly, "How likely is it that somebody who is motivated to find the best near-term human-centric charities possible, but takes a very different approach than EA does (in particular by focusing much more on hard-to-measure political effects) will do better than EA?" In my previous comment, I used a lot of phrases which you took to indicate the high uncertainty of political interventions. My main point was that it's plausible that a bunch of them exist which will wildly outperform GiveWell charities. I agree I don't know which one, and you don't know which one, and GiveWell doesn't know which one. But for the purposes of my questions above, that's not the relevant factor; the relevant factor is: does someone know, and have they made those arguments publicly, in a way that we could learn from if we were more open to less quantitative analysis? (Alternatively, could someone know if they tried? But let's go with the former for now.) In other words, consider two possible worlds. In one world GiveWell charities are in fact the most cost-effective, and all the people doing political advocacy are less cost-effective than GiveWell ex ante (given publicly available information). In the other world there's a bunch of people doing political advocacy work which EA hasn't supported even though they have strong, well-justified arguments that their work is very impactful (more impactful than GiveWell's top charities), because that impact is hard to quantitatively estimate. What evidence do we have that we're not in the second world? In both worlds GiveWell would be saying roughly the same thing (because they have a high bar for rigour). 
Would OpenPhi
richard_ngo's Shortform

Thanks for the write-up. A few quick additional thoughts on my end:

  • You note that OpenPhil still expect their hits-based portfolio to moderately outperform GiveWell in expectation. This is my understanding also, but one slight difference of interpretation is that it leaves me very baseline skeptical that most 'systemic change' charities people suggest would also outperform, given the amount of time OpenPhil has put into this question relative to the average donor. 
  • I think it's possible-to-likely I'm mirroring your 'overestimating how representative my
... (read more)
2 richard_ngo 8mo I have now read OpenPhil's sample of the back-of-the-envelope calculations on which their conclusion that it's hard to beat GiveWell was based. They were much rougher than I expected. Most of them are literally just an estimate of the direct benefits and costs, with no accounting for second-order benefits or harms, movement-building effects, political effects, etc. For example, the harm of a year of jail time is calculated as 0.5 QALYs plus the financial cost to the government - nothing about long-term effects of spending time in jail, or effects on subsequent crime rates, or community effects. I'm not saying that OpenPhil should have included these effects, they are clear that these are only intended as very rough estimates, but it means that I now don't think it's justified to treat this blog post as strong evidence in favour of GiveWell. Here's just a basic (low-confidence) case for the cost-efficacy of political advocacy: governmental policies can have enormous effects, even when they attract little mainstream attention (e.g. PEPFAR). But actually campaigning for a specific policy is often only the last step in the long chain of getting the cause into the Overton Window, building a movement, nurturing relationships with politicians, identifying tractable targets, and so on, all of which are very hard to measure, and which wouldn't show up at all in these calculations by OpenPhil. Given this, what evidence is there that funding these steps wouldn't outperform GiveWell for many policies? (See also Scott Alexander's rough calculations on the effects of FDA regulations, which I'm not very confident in, but which have always stuck in my head as an argument that dull-sounding policies might have wildly large impacts.) Your o
My mistakes on the path to impact

(Disclaimer: I am OP’s husband)

As it happens, there are a couple of examples in this post where poor or distorted versions of 80k advice arguably caused harm relative to no advice; over-focus on working at EA orgs due to ‘talent constraint’ claims probably set Denise’s entire career back by ~2 years for no gain, and a simplistic understanding of replaceability was significantly responsible for her giving up on political work.

Apart from the direct cost, such events leave a sour taste in people’s mouths and so can cause them to dissociate from the community;... (read more)

3 Habryka 8mo Yeah, totally agree that we can find individual instances where the advice is bad. Just seems pretty unlikely for that average to be worse, even just by the lights of the person who is given advice (and ignoring altruistic effects, which presumably are more heavy-tailed).
The Folly of "EAs Should"

The ‘any decent shot’ is doing a lot of work in that first sentence, given how hard the field is to get into. And even then you only say ‘probably stop’.

There’s a motte/bailey thing going on here, where the motte is something like ‘AI safety researchers probably do a lot more good than doctors’ and the bailey is ‘all doctors who come into contact with EA should be told to stop what they are doing and switch to becoming (e.g.) AI safety researchers, because that’s how bad being a doctor is’.

I don’t think we are making the world a better place by doing the s... (read more)

6 Habryka 8mo The "probably" there is just for the case of becoming an AI safety researcher. The argument for why being a doctor seems rarely the right choice does of course not just route through AI Alignment being important. It routes through a large number of alternative careers that seem more promising, many of which are analyzed and listed on 80k's website. That is what my second paragraph was trying to say. I think if you take into account all of those alternatives, the "probably" turns into a "very likely" and conditioning on "any decent shot" no longer seems necessary to me.