Effective Altruism Forum
EA Forum

All of Michael_Wiebe's Comments + Replies

Should you "trust literatures, not papers"?
I replicated the literature on meritocratic promotion in China, and found that the evidence is not robust.

https://twitter.com/michael_wiebe/status/1750572525439062384

Michael_Wiebe's Quick takes

Michael_Wiebe3mo4

Do vaccinated children have higher income as adults?
I replicate a paper on the 1963 measles vaccine, and find that it is unable to answer the question.

https://twitter.com/michael_wiebe/status/1750197740603367689

Michael_Wiebe's Quick takes

Michael_Wiebe3mo2

I've written up my replication of Cook (2014) on racial violence and patenting by Black inventors.

Bottom line: I believe the conclusions, but I don't trust the results.

https://twitter.com/michael_wiebe/status/1749831822262018476

Michael_Wiebe's Quick takes

Michael_Wiebe3mo2

Thanks, will edit.

Michael_Wiebe's Quick takes

Michael_Wiebe3mo30

New replication: I find that the results in Moretti (AER 2021) are caused by coding errors.
The paper studies agglomeration effects for innovation, but the results supporting a causal interpretation don't hold up.

https://twitter.com/michael_wiebe/status/1749462957132759489

Karthik Tadepalli

3mo

You might want to add what the subject of Moretti (2021) is, and what the result is, just so people know if they're interested in learning more.

Can we drive development at scale? An interim update on economic growth work

Michael_Wiebe5mo2

Angus Deaton writes that in academia and policy circles, “Past development practice is seen as a succession of fads, with one supposed magic bullet replacing another—from planning to infrastructure to human capital to structural adjustment to health and social capital to the environment and back to infrastructure—a process that seems not to be guided by progressive learning.”

This framing is weird. Obviously these factors have a positive causal effect on growth. But why would you expect a silver bullet? Conditions change over time, so the constraints on growth will change as well.

What social science research do you want to see reanalyzed?

Michael_Wiebe5mo4

I'd expect this article to be pretty solid, but errors in top journals do happen.

Seth Ariel Green

5mo

Yep, I recall this case from Bryan Caplan as well: https://betonit.substack.com/p/a-correction-on-housing-regulation I happen to think Johannes is unusually careful about this stuff; per the original UCT evaluation: so I assume a similar level of care in Egger et al., on which he is coauthor

How many people are neartermist and have high P(doom)?

Michael_Wiebe9mo9

person-affecting views
supporting a non-zero pure discount rate

I think non-longtermists don't hold these premises; rather, they object to longtermism on tractability grounds.

Michael_Wiebe's Quick takes

Michael_Wiebe9mo9

What AI safety work should altruists do? For example, AI companies are self-interestedly working on RLHF, so there's no need for altruists to work on it. (And even stronger, working on RLHF is actively harmful because it advances capabilities.)

Michael_Wiebe's Quick takes

Michael_Wiebe9mo13

Tweet-thread promoting Rotblat Day on Aug. 31, to commemorate the spirit of questioning whether a dangerous project should be continued.

Some Reflections on EA Strategy Fortnight

Michael_Wiebe9mo4

On Rotblat day, people post what signs they would look for to determine if their work was being counterproductive.

How about May 7, the day of the German surrender?

Bill Gates' 400 million dollar bet - First Tuberculosis Vaccine in 100 years?

Michael_Wiebe10mo4

I found the opening paragraph a bit confusing. Suggested edits:

Last week a young man named Onekalit turned up
The dry cough wore him down for over a month, but last week he finally managed to cough a bit of sputum
- Why 'finally'? This makes it sound like the dry cough was preventing the collection of sputum.
The incredible Gates-Foundation-funded GeneXpert test [add hyphens]

NickLaing

10mo

Thanks so much appreciate it, have made the edits! Yes though, the dry cough was the thing preventing the collection of sputum.

DeepMind: Model evaluation for extreme risks

Michael_Wiebe10mo0

Side note: what's up with "model evals"? Seems like a jargony term that excludes outsiders.

How economists got Africa’s AIDS epidemic wrong

Michael_Wiebe10mo10

An evaluation by the National Academy of Sciences estimates PEPFAR has saved millions of lives (PEPFAR itself claims 25 million).

The dominant conceptual apparatus economists use to evaluate social policies—comparative cost-effectiveness analysis, which focuses on a specific goal like saving lives, and ranks policies by lives saved per dollar—suggested America’s foreign aid budget could’ve been better spent on condoms and awareness campaigns, or even malaria and diarrheal diseases.

As already mentioned by others, these two claims are consistent.

Hits-based development: funding developing-country economists

Michael_Wiebe11mo2

The Think Tank Initiative was dedicated to strengthening the capacity of independent policy research institutions in the developing world. Launched in 2008 and managed by IDRC, TTI was a partnership between five donors. The program ended in 2019.

Seeking important GH or IDEV working papers to evaluate

Michael_Wiebe11mo3

Great to see this moving forward!

Seeking (Paid) Case Studies on Standards

Michael_Wiebe1y2

Potential topic: state governments enforcing housing plans on municipalities.

First clean water, now clean air

Michael_Wiebe1y4

Has anyone thought about the effects of air pollution on animal welfare?

Michael_Wiebe's Quick takes

Michael_Wiebe1y8

Animal welfare

Has anyone looked at the effect of air pollution on animal welfare (farmed or wild)?

Pat Myron's Quick takes

Michael_Wiebe1y2

Those papers don't look very convincing. Check out:
https://vincentbagilet.github.io/inference_pollution/

This paper looks to be the best:
https://www.aeaweb.org/articles?id=10.1257/aer.20180279

More Is Probably More - Forecasting Accuracy and Number of Forecasters on Metaculus

Michael_Wiebe1y2

This is a relevant question if you're thinking about how hard you should try to drive engagement on a forecasting question.

What is the 'policy relevance' of answering the title question? Ie. if the answer is "yes, forecaster count strongly increases accuracy", how would you go about increasing the number of forecasters?

nikos

For Metaculus there are lots of ways to drive engagement: prioritise making the platform easier to use, increase cash prizes, community building and outreach etc. But as mentioned in the article the problem in practice is that the bootstrap answer is probably misleading, as increasing the number of forecasters likely changes forecaster composition. However, one specific example where the analysis might be actually applicable is when you're thinking about how many Pro Forecasters you hire for a job.

More Is Probably More - Forecasting Accuracy and Number of Forecasters on Metaculus

Michael_Wiebe1y2

Yes, this is the main difference compared to forecasters being randomly assigned to a question.

More Is Probably More - Forecasting Accuracy and Number of Forecasters on Metaculus

Michael_Wiebe1y7

I don't think you can learn much from observational data like this about the causal effect of the number of forecasters on performance. Do you have any natural experiments that you could exploit? (ie. some 'random' factor affecting the number of forecasters, that's not correlated with forecaster skill.) Or can you run a randomized experiment?

David Rhys Bernard

Can you explain more why the bootstrapping approach doesn't give a causal effect (or something pretty close to one) here? The aggregate approach is clearly confounded since questions with more answers are likely easier. But once you condition on the question and directly control the number of forecasters via bootstrapping different sample sizes, it doesn't seem like there are any potential unobserved confounders remaining (other than the time issue Nikos mentioned). I don't see what a natural experiment or RCT would provide above the bootstrapping approach.

More Is Probably More - Forecasting Accuracy and Number of Forecasters on Metaculus

Michael_Wiebe1y4

It sounds like you're doing subsampling. Bootstrapping is random sampling with replacement.

If, for example, we kept increasing the size of the sample we draw, then eventually the variance would be guaranteed to go to zero (when the sample size equals the total number of forecasters and there is only one possible sample we can draw).

With bootstrapping, there are $N^{N}$ possible draws when the bootstrap sample size is equal to the actual sample size $N$ . (And you could choose a bootstrap sample size $K > N$ .)

nikos

Ah snap! I forgot to remove that paragraph... I did subsampling initially, then switched to bootstrapipng. Resulsts remained virtually unchanged. Thanks for pointing that out, will update the text.

The Capability Approach to Human Welfare

Michael_Wiebe1y4

Imagine two cities. In one, it is safe for women to walk around at night and in the second it is not. I think the former city is better even if women don’t want to walk around at night, because I think that option is valuable to people even if they do not take it. Preference-satisfaction approaches miss this.

Don't people also have preferences for having more options?

ryancbriggs

I believe that’s generally outside the model. It’s like asking if people have preferences about the ranking of their preferences.

Livelihood interventions: overview, evaluation, and cost-effectiveness

Michael_Wiebe1y8

I'm surprised the Nigerian business plan competition was not included. (Chris Blattman writeup from 2015 here: "Is this the most effective development program in history?".)

Are you really in a race? The Cautionary Tales of Szilárd and Ellsberg

Michael_Wiebe1y2

I say "They were arguably right, ex ante, to advocate for and participate in a project to deter the Nazi use of nuclear weapons." Actions in 1939-42 or around 1957-1959 are defensible.

Given this, is it accurate to call Einstein's letter a 'tragedy'? The tragic part was continuing the nuclear program after the German program was shut down.

Are you really in a race? The Cautionary Tales of Szilárd and Ellsberg

Michael_Wiebe1y4

I suppose sprints start out as jogs.

Are you really in a race? The Cautionary Tales of Szilárd and Ellsberg

Michael_Wiebe1y2

2 August 1939: Einstein-Szilárd letter to Roosevelt advocates for setting up a Manhattan Project. [...]
June 1942: Hitler decides against an atomic program for practical reasons.

Is it accurate to say that the US and Germans were in a nuclear weapons race until 1942? So perhaps the takeaway is "if you're in a race, make sure to keep checking that the race is still on".

HaydnBelfield

I think the crucial thing is funding levels. It was only by October 1941 (after substantial nudging from the British) that Roosevelt approved serious funding. As a reminder, I'm particularly interested in ‘sprint’ projects with substantial funding: for example those in which the peak year funding reached 0.4% of GDP (Stine, 2009, see also Grace, 2015). So to some extent they were in a race 1939-1942, but I would suggest it wasn't particularly intense, it wasn't a sprint race.

Michael_Wiebe1y2

How much would I personally have to reduce X-risk to make this the optimal decision? Well, that’s simple. We just calculate:
25 billion * X = 20,000 lives saved
X = 20,000 / 25 billion
X = 0.0000008
That’s 0.00008% in x-risk reduction for a single individual.

I'm not sure I follow this exercise. Here's how I'm thinking about it:

Option A: spend your career on malaria.

Cost: one career
Payoff: save 20k lives with probability 1.

Option B: spend your career on x-risk.

Cost: one career
Payoff: save 25B lives with probability p (=P(prevent extinction)), save 0

... (read more)

Phosphorous

I think that the expected payoff and the reduction in P(extinction) are just equivalent. Like, a 1% chance of saving 25b is the same as reducing P(extinction) from 7% to 6%, that's what a "1% chance of saving" means, because: p(extinction) = 1 - p(extinction reduction from me) - p(extinction reduction from all other causes) So, if I had a 100% chance of saving 25b lives, then that'd be a 100% reduction in extinction risk. Of course, what we care about is the counterfactual, so if there's already only a 50% chance of extinction, then you could say colloquially that I brought P(extinction) from 0.5 to 0, and there I had a "100% chance of saving 25b lives" but that's not quite right, because I should only get credit for reducing it from 0.5 to 0, so it would be better in that scenario to say that I had a 50% chance of saving 25b, and that's just as high as that can get.

Michael_Wiebe1y2

How much would I personally have to reduce X-risk to make this the optimal decision?

Shouldn't this exercise start with the current P(extinction), and then calculate how much you need to reduce that probability? I think your approach is comparing two outcomes: save 25B lives with probability p, or save 20,000 lives with probability 1. Then the first option has higher expected value if p>20000/25B. But this isn't answering your question of personally reducing x-risk.

Also, I think you should calculate marginal expected value, ie., the value of additional resources conditional on the resources already allocated, to account for diminishing marginal returns.

Phosphorous

Hey thank you for this comment. We actually started by thinking about P(extinction) but came to believe that it wasn't relevant, because in terms of expected value, reducing P(extinction) from 95% to 94% is equivalent to reducing it from 3% to 2%, or from any other amount to any other amount (keeping the difference the same). All that matters is the change in P(extinction). Also, in terms of marginal expected value, that would be the next step in this process. I'm not saying with this post "Go work on X-Risk because it's marginal EV is likely to be X" I'm rather saying, "You should go work on X-Risk if it's marginal EV is above X." But to be honest, I have no idea how to figure the first question out. I'd really like to, but I don't know of anyone who has even attempted to give an estimate on how much a particular intervention might reduce x-risk (please, forum, tell me where I can find this.)

Air Pollution: Founders Pledge Cause Report

Michael_Wiebe2y5

Adding to the causal evidence, there's a 2019 paper that uses wind direction as an instrumental variable for PM2.5. They find that IV > OLS, implying that observational studies are biased downwards:

Comparing the OLS estimates to the IV estimates in Tables 2 and 3 provides strong evidence that observational studies of the relationship between air pollution and health outcomes suffer from significant bias: virtually all our OLS estimates are smaller than the corresponding IV estimates. If the only source of bias were classical measurement error, whi

Michael_Wiebe2y9

Related, John von Neumann on x-risk:

Finally and, I believe, most importantly, prohibition of technology (invention and development, which are hardly separable from underlying scientific inquiry), is contrary to the whole ethos of the industrial age. It is irreconcilable with a major mode of intellectuality as our age understands it. It is hard to imagine such a restraint successfully imposed in our civilization. Only if those disasters that we fear had already occurred, only if humanity were already completely disillusioned about technological civilization

... (read more)

Maxwell Tabarrok

Yes, this paper is great and it was an inspiration for my piece. I found his answer here pretty unsatisfying though so hopefully I was able to expand on it well.

Should you still use the ITN framework? [Red Teaming Contest]

Michael_Wiebe2y2

I didn't suggest otherwise.

Should you still use the ITN framework? [Red Teaming Contest]

Michael_Wiebe2y4

It sounds like you're arguing that we should estimate 'good done/additional resources' directly (via Fermi estimates), instead of indirectly using the ITN framework. But shouldn't these give the same answer?

Karthik Tadepalli

I don't think OP is opposed to multiplying them together.

Should you still use the ITN framework? [Red Teaming Contest]

Michael_Wiebe2y4

And even when you can multiply the three quantities together, I feel like speaking in terms of importance, neglectedness and tractability might make you feel that there is no total ordering of intervention (“some have higher importance, some have higher tractability, whether you prefer one or the other is a matter a personal taste”)

I don't follow this. If you multiply I*T*N and get 'good done/additional resources', how is that not an ordering?

frib

That's an ordering! It's mostly analyses like the ones of 80k Hours, which do not multiply the three together, which might let you think there is no ordering. Is there a way I can make that more precise?

It's OK not to go into AI (for students)

Michael_Wiebe2y13

There seems to be a "intentions don't matter, results do" lesson that's relevant here. Intending to solve AI alignment is secondary, and doesn't mean that you're making progress on the problem.

And we don't want people saying "I'm working on AI" just for the social status, if that's not their comparative advantage and they're not actually being productive.

ruthgrace2y10

Yes that's exactly it! Even if a lot of people think that AI is the most important problem to work on, I would expect only a small minority to have a comparative advantage. I worry that students are setting themselves up for burnout and failure by feeling obligated to work on what's been billed as some as the most pressing/impactful cause area, and I worry that it's getting in the way of people exploring with different roles and figuring out and building out their actual comparative advantage

Person-affecting intuitions can often be money pumped

Michael_Wiebe2y3

Hm, then I find necessitarianism quite strange. In practice, how do we identify people who exist regardless of our choices?

EJT

I think in ordinary cases, necessitarianism ends up looking a lot like presentism. If someone presently exists, then they exist regardless of my choices. If someone doesn't yet exist, their existence likely depends on my choices (there's probably something I could do to prevent their existence). Necessitarianism and presentism do differ in some contrived cases, though. For example, suppose I'm the last living creature on Earth, and I'm about to die. I can either leave the Earth pristine or wreck the environment. Some alien will soon be born far away and then travel to Earth. This alien's life on Earth will be much better if I leave the Earth pristine. Presentism implies that it doesn't matter whether I wreck the Earth, because the alien doesn't exist yet. Necessitarianism implies that it would be bad to wreck the Earth, because the alien will exist regardless of what I do.

An epistemic critique of longtermism

Michael_Wiebe2y2

The longtermist claim is that because humans could in theory live for hundreds of millions or billions of years, and we have potential to get the risk of extinction very almost to 0, the biggest effects of our actions are almost all in how they affect the far future. Therefore, if we can find a way to predictably improve the far future this is likely to be, certainly from a utilitarian perspective, the best thing we can do.

I don't find this framing very useful. The importance-tractability-crowdedness framework gives us a sophisticated method for evaluating... (read more)

An epistemic critique of longtermism

Michael_Wiebe2y2

Because of this heavy tailed distribution of interventions

Is it actually heavy-tailed? It looks like an ordered bar chart, not a histogram, so it's hard to tell what the tails are like.

Lorenzo Buonanno

What do you mean? It looks like a histogram to me Oh nevermind, I see what you mean, indeed the y axis seems to indicate the intervention, not the number of interventions. Still, wouldn't a histogram be very similar?

Announcing the Center for Space Governance

Michael_Wiebe2y12

Zach and Kelly Weinersmith are writing a book on space settlement. Might be worth reaching out to them.

Fanatical EAs should support very weird projects

Michael_Wiebe2y5

What do you think of the Bayesian solution, where you shrink your EV estimate towards a prior (thereby avoiding the fanatical outcomes)?

Derek Shiller

Thanks for sharing this. My (quick) reading is that the idea is to treat expected value calculations not as gospel, but as if they are experiments with estimated error intervals. These experiments should then inform, but not totally supplant, our prior. That seems sensible for givewell’s use cases, but I don’t follow the application to pascal’s mugging cases or better supported fanatical projects. The issue is that they don’t have expected value calculations that make sense to regard as experiments. Perhaps the proposal is that we should have a gut estimate and a gut confidence based on not thinking through the issues much, and another estimate based on making some guesses and plugging in the numbers, and we should reconcile those. I think this would be wrong. If anything, we should take our Bayesian prior to be our estimate after thinking through all the issues, (but perhaps before plugging in all of the exact numbers). If you’ve thought through all the issues above, I think it is appropriate to allow an extremely high expected value for fanatical projects even before trying to make a precise calculation. Or at least it is reasonable for your prior to be radically uncertain.

Thomas Kwa

There are ways to deal with Pascal's Mugger with leverage penalties, which IIRC deal with some problems but are not totally satisfying in extremes.

When Giving People Money Doesn't Help

Michael_Wiebe2y4

The three groups have completely converged by the end of the 180 day period

I find this surprising. Why don't the treated individuals stay on a permanently higher trajectory? Do they have a social reference point, and since they're ahead of their peers, they stop trying as hard?

Person-affecting intuitions can often be money pumped

Michael_Wiebe2y3

Is the difference between actualism and necessitarianism that actualism cares about both (1) people who exist as a result of our choices, and (2) people who exist regardless of our choices; whereas necessitarianism cares only about (2)?

EJT

Yup!

A Critical Review of Open Philanthropy’s Bet On Criminal Justice Reform

Michael_Wiebe2y2

I wonder if we can back out what assumptions the 'peace pact' approach is making about these exchange rates. They are making allocations across cause areas, so they are implicitly using an exchange rate.

A Critical Review of Open Philanthropy’s Bet On Criminal Justice Reform

Michael_Wiebe2y4

I get the weak impression that worldview diversification (partially) started as an approximation to expected value, and ended up being more of a peace pact between different cause areas. This peace pact disincentivizes comparisons between giving in different cause areas, which then leads to getting their marginal values out of sync.

Do you think there's an optimal 'exchange rate' between causes (eg. present vs future lives, animal vs human lives), and that we should just do our best to approximate it?

NunoSempere

Yes. To elaborate on this, I think that agents should converge on such an exchange as they become more wise and understand the world better. Separately, I think that there are exchange rates that are inconsistent with each other, and I would already consider it a win to have a setup where the exchange rates aren't inconsistent.

Global Health & Development - Beyond the Streetlight

Michael_Wiebe2y2

Have you seen this?

Kurzgesagt - The Last Human (Longtermist video)

Michael_Wiebe2y9

If we don't kill ourselves in the next few centuries or millennia, almost all humans that will ever exist will live in the future.

The idea is that, after a few millenia, we'll have spread out enough to reduce extinction risks to ~0?

Lorenzo Buonanno2y12

Even without considering that, if we stay at ~140 million births per year, in 800 years 50% of all humans will have been born in our future.
And in ~7 millennia 90% of all humans will have been born in our future.

Sharmake

Basically, yes. Assuming civilization survives the Singularity, existential risks are effectively zero thanks to the fact that it's almost impossible to destroy an interstellar civilization.

Results of a survey of international development professors on EA

Michael_Wiebe2y3

Nice work! Sounds like movement building is very important.

Longtermist slogans that need to be retired

Michael_Wiebe2y2

Do you disagree with FTX funding lead elimination instead of marginal x-risk interventions?

Zach Stein-Perlman

Not actively. I buy that doing a few projects with sharper focus and tighter feedback loops can be good for community health & epistemics. I would disagree if it took a significant fraction of funding away from interventions with a more clear path to doing an astronomical amount of good. (I almost added that it doesn't really feel like lead elimination is competing with more longtermist interventions for FTX funding, but there probably is a tradeoff in reality.)