All of David Rhys Bernard's Comments + Replies

Section 2.2.2 of their report is titled "Choosing a fixed or random effects model". They discuss the points you make and clearly say that they use a random effects model. In section 2.2.3 they discuss the standard measures of heterogeneity they use. Section 2.2.4 discusses the specific 4-level random effects model they use and how they did model selection.
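
For readers less familiar with the terminology, here is a minimal sketch of a standard two-level random effects meta-analysis with the usual heterogeneity statistics (Q, tau² and I²). It is not the report's 4-level model. The DerSimonian-Laird estimator, the function name and the example numbers are all assumptions chosen for illustration.

```python
def random_effects_dl(effects, variances):
    """Pool study effects with a DerSimonian-Laird random effects model.

    effects   : list of study effect sizes (hypothetical inputs)
    variances : list of their sampling variances
    """
    k = len(effects)
    # Inverse-variance (fixed effect) weights and pooled estimate.
    w = [1.0 / v for v in variances]
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw

    # Cochran's Q measures dispersion of effects around the fixed estimate.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = k - 1

    # Between-study variance tau^2 (truncated at zero).
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # I^2: share of total variation attributed to heterogeneity.
    i2 = max(0.0, (q - df) / q) if q > 0 else 0.0

    # Random effects weights add tau^2 to each study's variance.
    w_re = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = (1.0 / sum(w_re)) ** 0.5
    return {"pooled": pooled, "se": se, "q": q, "tau2": tau2, "i2": i2}


# Hypothetical standardised effects and sampling variances, not from the report.
example = random_effects_dl([0.10, 0.45, 0.30, 0.80], [0.01, 0.02, 0.015, 0.03])
```

When tau² is estimated as zero, the random effects weights reduce to the fixed effect weights, which is why the choice between the two models matters mainly when heterogeneity is substantial.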

I reviewed a small section of the report prior to publication but none of these sections, and it only took me 5 minutes now to check what they did. I'd like the EA Forum to have a higher bar (as Gregory's parent comment exemplifies) before throwing around easily checkable suspicions about what (very basic) mistakes might have been made.

Innovations for Poverty Action just released their Best Bets: Emerging Opportunities for Impact at Scale report. It covers what they think are the best evidence-backed opportunities in global health and development. The opportunities are:

  1. Small-quantity lipid-based nutrient supplements to reduce stunting
  2. Mobile phone reminders for routine childhood immunization
  3. Social signaling for routine childhood immunization
  4. Cognitive behavioral therapy to reduce crime
  5. Teacher coaching to improve student learning
  6. Psychosocial stimulation and responsive care to promote ear
... (read more)

Thanks Vasco, I'm glad you enjoyed it! I corrected the typo and your points about inverse-variance weighting and lognormal distributions are well-taken.

I agree that doing more work to specify what our priors should be in this sort of situation is valuable, although I'm unsure if it rises to the level of a crucial consideration. Our ability to predict long-run effects has been an important crux for me, hence the work I've been doing on it, but in general, it seems to be more of an important consideration for people who lean neartermist than those who lean longtermist.

Hi Michael, thanks for this.

On 1: Thorstad argues that if you want to hold both claims (1) Existential Risk Pessimism - per-century existential risk is very high, and (2) Astronomical Value Thesis - efforts to mitigate existential risk have astronomically high expected value, then TOP is the most plausible way to jointly hold both claims. He does look at two arguments for TOP - space settlement and an existential risk Kuznets curve - but says these aren’t strong enough to ground TOP and we instead need a version of TOP that appeals to AI. It’s fair to thin... (read more)

Yep, I agree you can generate the time of perils conclusion if AI risk is the only x-risk we face. I was attempting to empirically describe a view that seems to be popular in the x-risk space here, that other x-risks besides AI are also cause for concern, but you're right that we don't necessarily need this full premise.

I was somewhat surprised by the lack of distinction between the cases where we go extinct and the universe is barren (value 0) and big negative futures filled with suffering. The difference between these cases seems large to me, and it seems like it will substantially affect the value of x-risk and s-risk mitigation. This is even more the case if you don't subscribe to symmetric welfare ranges and think our capacity to suffer is vastly greater than our capacity to feel pleasure, which would make the worst possible futures way worse than the best possible futu... (read more)

Thanks for highlighting this Michael and spelling out the different possibilities. In particular, it seems like if aliens are present and would expand into the same space we would have expanded into had we not gone extinct, then for the totalist, to the extent that aliens have similar values to us, the value of x-risk mitigation is reduced. If we are replaceable by aliens, then it seems like not much is lost if we do go extinct, since the aliens would still produce the large valuable future that we would have otherwise produced.

I have to admit though, it i... (read more)

Hi Geoffrey, thanks for these comments, they are really helpful as we move to submitting this to journals. Some miscellaneous responses:

  1. I'd definitely be interested in seeing a project where the surrogate index approach is applied to even longer-run settings, especially in econ history as you suggest. You could see this article as testing whether the surrogate index approach works in the medium-run, so thinking about how well it works in the longer-run is a very natural extension. I spent some time thinking about how to do this during my PhD and datasets y
... (read more)

The JPAL and IPA Dataverses have data from 200+ RCTs from development economics and the 3ie portal has 500+ studies with datasets available (and you can further filter by study type if you want to limit to RCTs). I can't point you to particular studies that have missing or mismeasured covariates, but from personal experience, a lot of them have lots of missing data.

3
Emma Skarstein
1y
Thank you for this, these are both new to me! I will definitely take a look through them.

Can you explain more why the bootstrapping approach doesn't give a causal effect (or something pretty close to one) here? The aggregate approach is clearly confounded since questions with more answers are likely easier. But once you condition on the question and directly control the number of forecasters via bootstrapping different sample sizes, it doesn't seem like there are any potential unobserved confounders remaining (other than the time issue Nikos mentioned). I don't see what a natural experiment or RCT would provide above the bootstrapping approach.
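
To make the bootstrapping idea concrete, here is a minimal sketch under stated assumptions: for one resolved question, resample k forecasters with replacement, aggregate them, and score the aggregate, repeating for several values of k. The median aggregation, the Brier score, the function names and the forecast values are all hypothetical choices for illustration, not the analysis from the post.

```python
import random
import statistics


def brier(prob, outcome):
    """Brier score for a single binary question (lower is better)."""
    return (prob - outcome) ** 2


def bootstrap_brier_by_size(forecasts, outcome, sizes, n_boot=2000, seed=0):
    """Average Brier score of the median forecast for each crowd size.

    Holding the question fixed and varying only the number of sampled
    forecasters removes between-question differences such as difficulty.
    """
    rng = random.Random(seed)
    results = {}
    for k in sizes:
        scores = []
        for _ in range(n_boot):
            sample = rng.choices(forecasts, k=k)  # draw k forecasters with replacement
            scores.append(brier(statistics.median(sample), outcome))
        results[k] = statistics.mean(scores)
    return results


# Hypothetical forecasts on one question that resolved "yes" (outcome = 1).
forecasts = [0.55, 0.70, 0.80, 0.60, 0.90, 0.40, 0.75, 0.65, 0.85, 0.50, 0.70, 0.30]
by_size = bootstrap_brier_by_size(forecasts, outcome=1, sizes=[1, 5, 10])
```

In a full analysis this would be repeated per question and then averaged across questions, so that each question contributes to every crowd size.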

3
Kenan Schaefkofer
1y
Predictors on Metaculus seeing the prior prediction history seems like a plausible confounder to me, but I'm not sure it would change the result.

Side note: a Cohen's d of .31 is not small. My opinion is that the rules of thumb used to interpret effect sizes in psychology are messed up, because so much p-hacking in the past produced way overinflated effect sizes. Regardless, 0.3 is typically seen as a moderate effect size. A 0.3 standard deviation increase in IQ would be 4.5 points which would lead to economically meaningful differences in income.
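
The conversion above is just the standardised effect multiplied by the standard deviation of the scale. A minimal sketch, assuming the conventional IQ standard deviation of 15 (the function name is hypothetical):

```python
IQ_SD = 15.0  # assumed: conventional standard deviation of IQ scores


def d_to_points(d, sd=IQ_SD):
    """Convert a standardised effect size into points on the original scale."""
    return d * sd


points_for_d_030 = d_to_points(0.3)   # the rounded effect size used in the text
points_for_d_031 = d_to_points(0.31)  # the reported effect size
```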

Within 3 days of departing the UK to return to the US, take another COVID test. This is required by the US CDC according to this link, and both PCR and Rapid Antigen tests are acceptable. I am planning to walk into an NHS location near the EA conference venue (like this) and get a free test. You don’t have to be a UK citizen to get free tests from the NHS (link).

My understanding is that you should not be using the free NHS test for travel and should instead book a private test, which is possible across London and at airports on the day of your flight. ... (read more)

Hi Edo!

Our funder was interested in How Asia Works, presumably from positive reviews it's received from people like Bill Gates and Noah Smith, and asked us to check the land section in more detail. We had a comparative advantage here given my background in development economics.

I wouldn't be particularly interested in more land redistribution research, given that there don't seem to be any clear funding opportunities in this space. If someone could find decent opportunities then that would make it a bit more interesting. But given the ambiguous results on... (read more)

If a whole book is too much, you could also try their article, Economic Lives of the Poor - https://www.aeaweb.org/articles?id=10.1257/jep.21.1.141 - but this is explicitly focused on people living below the extreme poverty line, who are an order of magnitude poorer than the global median.

For example, David and Jason's report on charter cities was completed in 100 hours, a reasonable fraction of which was extra legwork for external writeup/following up with affected parties, after the original report was delivered to Open Phil. My impression is that the bulk of the work was done on a fairly short calendar time cycle too, in ways that may be hard for external parties to replicate. But naively the report would still be useful to Open Phil and cost-effective to fund if it took 200 hours to complete and 3x the calendar time.

Just to clarify,... (read more)

4
Linch
3y
Thanks for the clarification!

I know you were explicit about these being your views and not Founders Pledge's, but is there anyone better placed to think through those implications than Founders Pledge? And similarly, it seems like Founders Pledge would be one of the most natural organisations to advocate against limits on patient philanthropy, given the work on the long-term investment fund.

I'm not convinced that our CEA is particularly useful for more generalised interventions. All we really do is assume that the intervention causes some growth increase (a distribution rather than a point estimate) and then model expected income with the intervention, with the intervention 10 years later and with no intervention. The amount the intervention increases growth is the key parameter and is very uncertain so further research on this will have the highest VoI, but this will be different for each intervention. We treat how the intervention increase... (read more)

Thanks Mark, both for your time and feedback while we were writing the report and your comments now.

On 1, I agree that charter cities sit somewhere between neartermist and longtermist so thinking about them as mid/mediumtermist makes sense. I imagine Rethink Priorities’ future work in this space will be a mixture of traditionally neartermist and mediumtermist topics. However, most of the current arguments for charter cities, especially Mason (2019), have an explicitly neartermist flavour, given the direct comparisons to GiveWell charities and a focus on th... (read more)

Thanks Jeremy!

That was just a typo. Previously we were unsure whether they would be an ally or an opponent and then Pure Earth told us they considered them to be an ally. I wasn't careful enough when editing that section so I've deleted "or an opponent" now.

8
Josh Jacobson
3y
I largely like this video, but I also think it’s good to be aware of some shortcomings of this: https://forum.effectivealtruism.org/posts/KShgaczKHwvxBbQXK/i-want-to-do-good-an-ea-puppet-mini-musical (can’t get permalink to work from mobile, but intended here to link to my comment on that post).

The tag seems focused on how much weight should be assigned to different moral patients. But some people and posts use the phrase moral weight to refer to the relative importance of different outcomes, e.g. how much should we care about consumption vs saving a life? Examples include:

Should we include both under this wiki-tag and broaden the definition? Or should we make a new... (read more)

2
Pablo
3y
Thanks for tagging all these posts! We already have a moral patienthood entry, though unfortunately it was "wiki-only", so you couldn't use it as a tag. I have now removed this restriction and re-tagged all the articles. For the two articles above, I used the moral uncertainty tag instead, which seems more appropriate. Feel free to review my changes, and if you are satisfied with them, I would suggest deleting this tag.

This paper was a chapter in the book Randomized Control Trials in the Field of Development: A Critical Perspective, a collection of articles on RCTs. Assuming the author of this chapter, Timothy Ogden, doesn't identify as a randomista, the only other author who maybe does is Jonathan Morduch, so it's a pretty one-sided book (which isn't necessarily a problem, just something to be aware of).

There was a launch event for the book with talks from Sir Angus Deaton, Agnès Labrousse, Jonathan Morduch, Lant Pritchett and moderated by William Easterly, which y... (read more)

8
Stephen Clare
3y
Ogden works with Innovations for Poverty Action (and, incidentally, is on GiveWell's board). I'm not sure he'd identify as a randomista but seems very likely he's favourable to RCTs.

Thanks for the post, but I don't think you can conclude from your analysis that your criteria weren't helpful, and the result is not necessarily that surprising.

If you look at professional NBA basketball players, there's not much of a correlation between how tall a basketball player is and how much they get paid or some other measure of how good they are. Does this mean NBA teams are making a mistake by choosing tall basketball players? Of course not!

The mistake your analysis is making is called 'selecting on the dependent variable' or 'collider bias'... (read more)
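
The basketball example can be illustrated with a small simulation. This is a minimal sketch under assumed numbers: height contributes to performance in the whole population, but once only the top performers are kept, the height-performance correlation among those selected shrinks. The coefficient 0.5, the 5% selection cut-off and all names are hypothetical.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def simulate_selection(n=20000, top_frac=0.05, seed=0):
    """Return (correlation in everyone, correlation among those selected)."""
    rng = random.Random(seed)
    people = []
    for _ in range(n):
        height = rng.gauss(0, 1)
        other_skill = rng.gauss(0, 1)
        performance = 0.5 * height + other_skill  # assumed data-generating process
        people.append((height, performance))

    # Selecting on the dependent variable: keep only the best performers.
    people.sort(key=lambda p: p[1], reverse=True)
    selected = people[: int(n * top_frac)]

    r_all = pearson([h for h, _ in people], [p for _, p in people])
    r_selected = pearson([h for h, _ in selected], [p for _, p in selected])
    return r_all, r_selected


r_all, r_selected = simulate_selection()
```

A weak correlation inside the selected group is therefore compatible with the criterion being useful for selection in the full applicant pool.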

Broadly, I agree with your points. You're right that we don't care about the relationship in the subpopulation, but rather about the relationship in the broader population. However, there are a couple of things I think are important to note here:

  1. As mentioned in my response on range restrictions, in some cases we did not reject many people at all. In those cases, our subpopulation was almost the entire population. This is not the case for the NBA or GRE examples.
  2. Lastly, possibly more importantly: we only know of maybe 3 cases of people being rejected from t
... (read more)
4
David_Moss
3y
The other big issue with this approach is that this would likely be confounded by the treatment effect of being selected for and undertaking the fellowship. i.e. we would hope that going through the fellowship actually makes people more engaged, which would lead to the people with higher scores (who get accepted to the fellowship) also having higher engagement scores. But perhaps what you had in mind was combining the simple approach with a more complex approach, like randomly selecting people for the fellowship across the range of predictor scores and evaluating the effects of the fellowship as well as the effect of the initial scores?

Thanks for this Luisa, I found it very interesting and appreciated the level of detail in the different cases. One thought and related questions that came up when reading the toy calculations at the end of each case:

For a fixed number of survivors, there is a trade-off between groups of different sizes. The larger the groups, the more likely each group is to survive, but the fewer groups need to be wiped out in order for humanity to go extinct. 

  • What might this trade-off look like and is there some optimal group size to minimise the risk of extinction?
... (read more)

I’m happy to see an increase in the number of temporary visiting researcher positions at various EA orgs. I found my time visiting GPI during their Early Career Conference Programme very valuable (hint: applications for 2021 are now open, apply!) and would encourage other orgs to run similar sorts of programmes to this and FHI’s (summer) research scholars programme. I'm very excited to see how our internship program develops as I really enjoy mentoring.

I think I was competitive for the RP job because of my T-shaped skills, broad knowledge in lots of ... (read more)

5
MichaelA
3y
I already strongly agreed with your first paragraph in a separate answer, so I'll just jump in here to strongly agree with the second one too!  I can confirm that I've been gobbling up EA content rather obsessively for the last 2 years. If anyone's interested in what this involved and how many hours I spent on it, I describe that here.

1. Thinking vs. reading. 

Another benefit of thinking before reading is that it can help you develop your research skills. Noticing some phenomenon and then developing a model to explain it is a super valuable exercise. If it turns out you reproduce something that someone else has already done and published, then great, you’ve gotten experience solving some problem and you’ve shown that you can think through it at least as well as some expert in the field. If it turns out that you have produced something novel then it’s time to see how it compares to ex... (read more)

2
EdoArad
3y
It was interesting to read, thanks for the answers :) A small remark, which may be of use as you said you used Anki and are now using Roam - The Roam Toolkit add-on allows you to use spaced-repetition in Roam.
3
Dawn Drescher
3y
Thank you! Using the thinking vs. reading balance as a feedback mechanism is an interesting take, and in my experience it’s also most fruitful in philosophy, though I can’t compare with those branches of economics. Survival mindset: I suppose it serves its purpose when you’re in a very low-trust environment, but it’s probably not necessary most of the time for most aspiring EA researchers. Thanks for linking that list of textbooks! It’s also been helpful for me in the past. :-D Planning the next day the evening before also seems like a good thing to try for me. Thanks! I wonder whether you all have such fairly high typing speeds simply because you all type a lot or whether 80+ WPM is a speed threshold that is necessary to achieve before one ceases to perceive typing speed as a limiting factor. (Mine is around 60 WPM.) I hope you can get your work hours down to a manageable level!

Thanks for the paper suggestions! Most of my own research is on internal validity in the LaLonde style so I definitely think it is important too. I'll add a section on replicability to the syllabus.

The first 5 paragraphs are repeated twice. Could someone fix this?

Hey Kaj, I just thought I'd let you know that you're not alone in Scandinavia! A few of us are starting an EA group in Uppsala, Sweden and Trondheim, Norway launched a couple of weeks ago. I know it's late notice, but we're having a Google Hangout this evening, 9pm your time so if you could join, that'd be great!

1
Kaj_Sotala
10y
Fantastic! (Replied more privately.)