I am a founder and editor of Works in Progress at Stripe, a researcher on health at Our World in Data, and an editor at Stripe Press.
My research interests are in health, epidemiology and the social sciences.
You can contact me at firstname.lastname@example.org or find me on Twitter at https://twitter.com/salonium
You gave lots of good examples of low-cost high-impact interventions like water chlorination, vaccines and lead removal. I agree that there are far more examples like that, particularly in health & medicine, which we already know about but where the scale of the benefits is underestimated. Water chlorination is a particularly good example because it's one where the large benefits were expected by experts but were surprising to others.
And thank you for linking to my article on RCTs, the arguments you made above were actually a big part of the reason that I wrote that!
I think my phrasing might have been unclear earlier – I'm Nick's colleague at Works in Progress, but I'm not involved with the blog prize.

1. I think blogs serve a different purpose from many of the other formats you mention, and are also more feasible than long-form writing: for people with other commitments, for short commentaries, for responding to topical events or stories, and for publishing independent parts of a series in a way that makes each part more shareable. I'm sure you can think of many examples of each. I don't think it's as important for blogs to be searchable as it is for encyclopedia entries, although that's a bonus. YouTube videos are an alternative, but they reach a different demographic and serve people who prefer to consume information in a different format, so in my opinion they're not interchangeable in that way.
2. Those aren't quite the same as what I suggested, which would be more like a prize for new blogs that are maintained over the long term, or an additional prize for long-term maintenance.
3. (I'm not involved in this so don't have any comments)
4. I think those are a different sort of problem. Prizes for open-ended endeavours – such as answering an unsolved problem – have no certainty of being resolved. Prizes for meeting set criteria, which essentially involve improving on existing methods, are more effective than those, as Anton Howes has written about here. But neither is the same as a prize that is certain to be won by one of the candidates that apply. A closer analogue of this prize is probably a competition.
This is an interesting critique! I think it misses a lot, though, so I'd like to push back on it. Full disclosure/conflict of interest – I'm one of Nick's colleagues; we're both editors at Works in Progress.

1) Blogs are short content. – I think a lot of your critiques here (that they become outdated quickly, don't provide much long-term value, fall prey to replication crises) actually apply to all forms of published, static content – newspapers, tweet threads, newsletters, books, etc. We've all heard of books and news content that have aged badly, so it's unclear why you've listed this as a drawback of blogs specifically. If anything, blogs are much easier to keep updated than the others (as you mention), especially compared to printed content – you can easily update an old blogpost and signpost when you've done so. You can't easily edit a tweet thread, a printed book, or a news article.
I don't think people need to read through a blog's entire archive for it to be useful. People often share links to particular old blogposts as references on a particular topic, and that's fine. In the same way, we don't say newspapers are bad because people don't read through newspaper archives.

2) Did anybody think about the incentives? – As you mention later on, the purpose of the prize is to incentivise new blogs to be created and maintained, so it wouldn't make sense to reward ones that already exist. I also don't think it's ironic, even though the topic is longtermism – it's likely that at least some of these blogs will be maintained over the long term, which may be another way to look at it. Many projects (and blogs) are likely to go unfinished, but that's a feature of new projects in general. One way to improve the prize might be to reward blogs that are still maintained years from now, but I don't think you make that point.
3) Examples – skipping over this as it looks like you liked many of them.
4) There aren't that many longtermist blogs around. – To me, it seems like the EA and longtermist movements are growing rapidly and could be a lot bigger than they currently are, so what may seem like a sufficient number of blogs now won't be later. If you recognise every blog on that list, that itself suggests there aren't very many of them. Ambitiously, people might want to reach a point where there are so many EA blogs that nobody can keep track of them all. The EA Forum is great, but if EA were huge, people might not use it as the central place to cross-post all their blogposts. It seems more like a central node or a bridge to other content than the one place to put everything EA-related.
5) Blogs are for discourse. – I think you may not be aware that Nick is already an editor at Works in Progress (the long-form magazine you mentioned). Blogs and longer-form discourse magazines don't seem like an either/or to me, nor to Nick, who's involved in both. I think they actually feed into each other: we sometimes recruit authors because we've read their blogs and want to develop their content further, and we'd be very happy if bloggers riffed on ideas we've published on platforms of their own, like their blogs. In short, why not both?
Hey Stephen, thanks very much!
I completely agree with you on the differences between clinical RCTs and development/public policy RCTs.
Part of the reason is that it was originally meant to be a longer piece, with some policy RCT examples, an explanation of how clustering works, etc., but it was already fairly long and those were harder to explain concisely. And secondly, simply because I have a background in health/medicine, it was easy to draw examples from that field.
Hopefully I signposted this a little by saying that the procedures I mention are those found in medicine / clinical RCTs, but from your comment I think it was probably not enough. I'll think about this and clarify or add some caveats to the article that make it clearer. Thank you!
Oh, I remember reading this paper now! It's great, thanks for sharing.
And thank you very much :) I will be here more often for sure.
Thank you very much!
Is there a paper by him you would recommend reading on the topic? I've seen this one, which I agree with in parts – with good theory and evidence from other research on which policies work, there's less need for RCTs, but I think there's a role for both to answer different questions.
Great post! I thought this was a very clear and useful summary of the literature, and all the links and references are very helpful.
Towards the end, you mention the difficulties in comparing happiness between countries – do you have a view on how big a problem these issues are for measuring happiness across the lifespan? Or views on the age-happiness curve more generally?
Also, in case you hadn't seen it already, I found this post by Pew Research Center a very useful summary of various problems in questionnaire design (some of which you mention, e.g. acquiescence bias), and how they try to get around them.
If you do find it, I'd be interested to read that.
I would guess that it's difficult for people to intuitively understand precisely why randomization is so useful, although other aspects of RCTs are probably easier to grasp – particularly, the experimental part of giving treatment A to one group and treatment B to another group and following up their outcomes. But overall I think I would agree with you; people need less understanding of confounders and selection bias to read an RCT than they'd need to read an observational study.
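The value of randomization can be shown with a toy simulation (all numbers here are hypothetical, purely for illustration): when people self-select into treatment based on a confounder, the naive comparison is biased, while a coin-flip assignment recovers the true effect.

```python
import random

random.seed(0)

# Hypothetical outcome driven partly by a confounder (baseline fitness)
# and partly by a true treatment effect of +2.
def outcome(fitness, treated):
    return fitness + (2 if treated else 0) + random.gauss(0, 1)

people = [random.gauss(10, 2) for _ in range(100_000)]  # baseline fitness

# Observational comparison: fitter people self-select into treatment,
# so the naive difference mixes the treatment effect with the confounder.
obs_t = [outcome(f, True) for f in people if f > 10]
obs_c = [outcome(f, False) for f in people if f <= 10]
obs_diff = sum(obs_t) / len(obs_t) - sum(obs_c) / len(obs_c)

# Randomized comparison: a coin flip decides treatment, so baseline
# fitness is balanced across groups and the difference isolates the effect.
rct_t, rct_c = [], []
for f in people:
    if random.random() < 0.5:
        rct_t.append(outcome(f, True))
    else:
        rct_c.append(outcome(f, False))
rct_diff = sum(rct_t) / len(rct_t) - sum(rct_c) / len(rct_c)

print(f"observational estimate: {obs_diff:.2f}")  # inflated well above 2
print(f"randomized estimate: {rct_diff:.2f}")     # close to the true +2
```

The observational groups differ in fitness before treatment even starts, which is exactly the kind of bias that's hard to grasp intuitively but obvious once simulated.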
Hey Marius, thank you!
I wish I could answer this better, but I don't know enough to give a good answer on how to scale policy RCTs, especially since they're quite different from clinical RCTs (the treatment often can't be administered in a standardised way, there's usually no way to blind participants to what they're receiving, participants usually aren't tracked or measured as regularly, etc.). Those are also the factors that make them messier in larger projects.
I've read this blog post by Michael Clemens, which I found was a useful summary of two books on the topic: https://cgdev.org/blog/scaling-programs-effectively-two-new-books-potential-pitfalls-and-tools-avoid-them
But I think there are often situations where they can be leveraged for large-scale interventions. A good recent example is this experiment on street lighting and its effect on reducing crime. Some features of that policy make it easier to study at scale: crime data already exists at the right scale (you don't need to track individual participants to find out about crime rates), street lighting is easy to standardise, and you can measure the effects at the level of neighbourhood clusters rather than individuals. So maybe that's a good way of thinking about how to scale up RCTs – find treatments and outcomes that are easier to implement and measure at a large scale.
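The cluster-level logic can be sketched in a few lines (the cluster counts, baseline crime rates, and effect size below are entirely made up): neighbourhoods, not individuals, are the unit of randomization, and outcomes are measured per neighbourhood, so no individual tracking is needed.

```python
import random

random.seed(1)

# Hypothetical cluster-randomized design: assign half of the
# neighbourhoods to receive street lighting, then compare mean
# crime counts per neighbourhood between the two arms.
n_clusters = 200
true_effect = -5  # assumed reduction in crimes per treated neighbourhood

baseline = [random.gauss(50, 10) for _ in range(n_clusters)]  # expected crimes
treated = set(random.sample(range(n_clusters), n_clusters // 2))

crimes = [random.gauss(b + (true_effect if i in treated else 0), 5)
          for i, b in enumerate(baseline)]

t_mean = sum(crimes[i] for i in treated) / len(treated)
c_mean = sum(crimes[i] for i in range(n_clusters)
             if i not in treated) / (n_clusters - len(treated))
est = t_mean - c_mean
print(f"estimated effect: {est:.1f} crimes per neighbourhood")
```

Because randomization happens at the cluster level, the effective sample size is the number of neighbourhoods rather than the number of residents, which is why such designs need many clusters to get a precise estimate.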