All of Sjlver's Comments + Replies

Thanks for the long reply!

These are good arguments. Some were new to me, many I was already aware of. For me, the overall effect of the arguments, benchmarks, and my own experience is to make me think that a lot of scenarios are plausible. There is a wide uncertainty range. It might well be that AGI takes a long time to happen, but I also see many trends that indicate it could arrive surprisingly quickly.

For you, the overall conclusion from all the arguments is to completely rule out near-term AGI. That still seems quite wildly overconfident, even if there is a decent case being made for long timelines.

1
Yarrow Bouchard 🔸
Important correction to my comment above: the AI Impacts survey was actually conducted in October 2023, which is 7 months after the release of GPT-4 in March 2023. So, it does actually reflect AI researchers' views on AGI timelines after they had time to absorb the impact of ChatGPT and GPT-4. The XPT superforecasting survey I mentioned was, however, indeed conducted in 2022, just before the launch of ChatGPT in November 2022. So, that's still a pre-ChatGPT forecast. I just published a post here about these forecasts. I also wrote a post about 2 weeks ago that adapted my comments above, although unfortunately it didn't lead to much discussion. I would love to stimulate more debate about this topic. It would be great, even, if the EA Forum did some kind of debate week or essay competition around whether near-term AGI is likely. Maybe I will suggest that.
1
Yarrow Bouchard 🔸
I don't really have a gripe with people who want to put relatively small probabilities on near-term AGI, like the superforecasters who guessed there's a 1% chance of AGI by 2030. Who knows anything about anything? Maybe Jill Stein has a 1% chance of winning in 2028! But 50% by 2032 is definitely way too high and I actually don't think there's a rational basis for thinking that.

I agree that whether or not we get AGI is a crux for this topic. Though it makes sense to update our cause priorities even if AI is merely transformational.

Your comment seems overconfident, however ("essentially no chance of AGI"). This seems to not take into account that many (most?) intellectual tasks see progress. For example, ARC-AGI-2 had scores below 10% at the beginning of the year, and within just a few months the best solution on https://arcprize.org/leaderboard scores 29%. Even publicly available models without custom scaffolding score >10% now.

O... (read more)

3
Yarrow Bouchard 🔸
Forgive me for the very long reply. I’m sure that you and others on the EA Forum have heard the case for near-term AGI countless times, often at great depth, but the opposing case is rarely articulated in EA circles, so I wanted to do it justice in a way that a tweet-length reply could not. Why does the information we have now indicate AGI within 7 years and not, say, 17 years or 70 years or 170 years? If progress in science and technology continues indefinitely, then eventually we will gain the knowledge required to build AGI. But when is eventually? And why would it be so incredibly soon? To say that some form of progress is being made is not the same as making an argument for AGI by 2032, as opposed to 2052 or 2132. I wouldn’t say that LLM benchmarks accurately represent what real intellectual tasks are actually like. First, the benchmarks are designed to be solvable by LLMs because they are primarily intended to measure LLMs against each other and to measure improvements in subsequent versions of the same LLM model line (e.g. GPT-5 vs GPT-4o). There isn’t much incentive to create LLM benchmarks where LLMs stagnate around 0%.[1] Even ARC-AGI 1, 2, and 3, which are an exception in terms of their purpose and design, are still intended to be in the sweet spot between too easy to be a real challenge and too hard to see progress on. If a benchmark is easy to solve or impossible to solve, it won’t encourage AI researchers and engineers to try hard to solve it and make improvements to their models in the process. The intention of ARC-AGI is to give people working on AI a shared point of focus and a target to aim for. The purpose is not to make a philosophical or scientific point about what current AI systems can’t do. The benchmarks are designed to be solved by current AI systems. It always bears repeating, since confusion is possible, that the ARC-AGI benchmarks are not intended to test whether a system is AGI or not, but are rather intended to test whether AI syste

If you're interested in studies that evaluate the impact of LLMs on productivity, I can recommend the blog of Ethan Mollick. For example this post from September 2023: https://www.oneusefulthing.org/p/centaurs-and-cyborgs-on-the-jagged

It found that consultants with AI access outperformed consultants without AI access, on most dimensions that were measured. Ethan has since participated in several other studies on the industry adoption of AI.

Thanks! This sounds like good advice

I have two related thoughts that I would love to hear your opinion on:

  1. There seems to be quite a large opportunity cost. Instead of investing, you could spend the money on effective causes now. Or take a lower-paying job now rather than wait until you've reached some investment goal. Presumably, many effective organizations would benefit from getting money/talent earlier? If you want to maximize your life's impact, would that be a good strategy?

  2. Depending on your AI timelines, money that is locked until retirement is

... (read more)
1
Dave Cortright 🔸
I think about it as 2 mechanisms for income: my effort or my investment's effort. I'm at the point now where my investments do 100% of the work for earning my income, leaving me 100% of my time to spend how I see fit.  You can give more to effective causes sooner, at the cost of investing more effort for longer to keep the money coming in. Ultimately, how you want to give back, at what stage of life, and how much are all unique, personal choices. Absolutely, make adjustments based on any personal values or beliefs. I consider this to be like many art forms. You need to learn and know the foundational basics before you can make good judgments on which parts to change or ignore.

The economic data seems to depend on one's point of view. I'm no economist and I certainly can't prove to you that AI is having an economic impact. Its use grows quickly though: Statistics on AI market size

It's also important, I think, to distinguish between AI capabilities and AI use. The AI-2027 text argues that a selected few AI capabilities matter most, namely those related to software and AI engineering. These will drive the recursive improvements. Changes to other parts of the industry are downstream of that. Both our viewpoints seem to be consistent... (read more)

1
Yarrow Bouchard 🔸
This is confusing two different concepts. Revenue generated by AI companies or by AI products and services is a different concept than AI’s ability to automate human labour or augment the productivity of human workers. By analogy, video games (another category of software) generate a lot of revenue, but automate no human labour and don’t augment the productivity of human workers.  LLMs haven’t automated any human jobs and the only scientific study I’ve seen on the topic found that LLMs slightly reduced worker productivity. (Mentioned in a footnote to the post I linked above.)

Why do you think that? Personally, I've lost several bets. For example, I've bet NO on "Will an AI win a gold medal on the IOI (competitive programming contest) before 2027?" and have already lost that 20 months before the start of 2027.

As a former IOI participant, that achievement feels amazing. As a software engineer, I absolutely find AI tools useful, practical, and economically valuable.

If AI is having an economic impact by automating software engineers' labour or augmenting their productivity, I'd like to see some economic data or firm-level financial data or a scientific study that shows this.

Your anecdotal experience is interesting, for sure, but the other people who write code for a living who I've heard from have said, more or less, AI tools save them the time it would take to copy and paste code from Stack Exchange, and that's about it. 

I think AI's achievements on narrow tests are amazing. I think AlphaStar's success on compet... (read more)

This was a valuable read for me. Thanks!

I share some of your skepticism. At the same time, I think the argument relies on reasons that are quite speculative, such as:

  • it's highly unclear if rates of progress can be extrapolated
  • benchmarks might not generalize to the full complexity of real tasks
  • here's a list of cognitive processes that AI can't do yet; it might continue to fail at them for years to come

I can't shake off the feeling that this type of argument has often aged poorly when it comes to AI. I've certainly been baffled many times by AI solving... (read more)

3
Yarrow Bouchard 🔸
This may be true for games like chess, go, and StarCraft, or for other narrow tests of AI. But for claims that AI will do something useful, practical, and economically valuable — like driving cars or replacing humans on assembly lines — the opposite is true. The predictions about rapid AI progress have been dead wrong and the AI skeptics have been right.

Thanks for the response!

I understand that you are worried about chicken and fish consumption. I have no knowledge about why these charts are the way they are, or why people in the UK consume twice as much chicken as those in Germany. It's also difficult to guess the impact of Veganuary on these trends. In that respect, I find the charts a bit distracting.

What I intended to say with my comment is that Veganuary has clearly visible impacts around me: when I go shopping, when I see ads, when I eat out. This seems to correlate with a general trend of seeing more vegan... (read more)

It's great to try and analyze the cost-effectiveness of Veganuary. I'm thankful for this post and also for the responses by @Toni Vernelli and others.

While I appreciate the effort, I find it hard to agree with Vasco's conclusions. There are many discounts in the analysis that feel pretty arbitrary to me. Toni has responded to this much better than I could. I'd just like to share a few personal impressions. These are of course biased, but might explain why I'm suspicious about the many downward adjustments (and lack of upward adjustments) in Vasco's analysis... (read more)

2
Vasco Grilo🔸
Thanks for the comment, Sjlver! My cost-effectiveness estimate is supposed to be unbiased in the sense of not being too low or high in expectation. To be clear, I think one single email or video can turn someone from omnivore to vegan. However, I believe that is super far from the expected effect. The supply per capita of poultry meat in Germany has not had a clear downward trend, although it does seem like it has already peaked. Likewise for the supply per capita of fish and other seafood in Germany. However, this is very weak evidence of the impact of Veganuary. There are many factors which affect meat consumption in Germany besides Veganuary, and that may well be the country which Veganuary targets with the most positive trends. In the UK, the consumption per capita of poultry meat has been increasing, although that of fish and other seafood has recently been decreasing. Nitpick: dairy accounts for a very small fraction of animal suffering. I think decreases in its consumption only matter to the extent they predict decreases in the consumption of eggs, poultry birds, fish, or other seafood.

EA charities can also combine education and global health, like https://healthlearn.org/blog/updated-impact-model

HealthLearn builds a mobile app for health workers (nurses, midwives, doctors, community health workers) in Nigeria and Uganda. Health workers use it to learn clinical best practices. This leads to better outcomes for patients.

I'm personally very excited by this. Health workers in developing countries often have few training resources available. There are several clinical practices that can improve patient outcomes while being easy to implement ... (read more)

Personally, I'm not using the forum as much as I could and as much as I used to, because it is a time-sink. I'm the kind of person who can easily get lost on the Internet; clicking a link here and opening another tab there, and... look where those two hours went. Because of this, I'm wary of spending too much time here.

I don't know whether my declining forum use is due to changes in my behavior or changes to the forum. Probably it's a combination. On the forum side, the home page feels a bit more cluttered than it used to be. The forum feels slightly more ... (read more)

1
Pat Myron 🔸
One concrete suggestion: https://forum.effectivealtruism.org/posts/oZff425xLnikfxeGD/pat-myron-s-shortform?commentId=s8Qoydhg3zZ3b3486
4
Sarah Cheng 🔸
That's very fair! I certainly don't want anyone to use the Forum more than they believe is worthwhile. My guess is that a healthy relationship with the Forum looks different for different people, and I don't think that every single reader should engage with the Forum more than they currently do. I recommend that people customize their Forum experience and customize their Frontpage to help them find a good balance.

OP here :) Thanks for the interesting discussion that the two of you have had!

Lukas_Gloor, I think we agree on most points. Your example of estimating a low probability of medical emergency is great! And I reckon that you are communicating appropriately about it. You're probably telling your doctor something like "we came because we couldn't rule out complication X" and not "we came because X has a probability of 2%" ;-)

You also seem to be well aware of the uncertainty. Your situation does not feel like one where you went to the ER 50 times, were sent home... (read more)

Richard Chappell writes something similar here, better than I could. Thanks Lizka for linking to that post!

Pascalian probabilities are instead (I propose) ones that lack robust epistemic support. They're more or less made up, and could easily be "off" by many, many orders of magnitude. Per Holden Karnofsky's argument in 'Why we can't take explicit expected value estimates literally', Bayesian adjustments would plausibly mandate massively discounting these non-robust initial estimates (roughly in proportion to their claims to massive impact), leading to

... (read more)

I agree that our different reactions come partly from having different intuitions about the boundaries of a thought experiment. Which factors should one include vs exclude when evaluating answers?

For me, I assumed that the question can't be just about expected values. This seemed too trivial. For simple questions like that, it would be clearer to ask the question directly (e.g., "Are you in favor of high-risk interventions with large expected rewards?") than to use a thought experiment. So I concluded that the thought experiment probably goes a bit further... (read more)

This is a great point.

Clearly you are right. That said, the examples that you give are the kind of frequentist probabilities for which one can actually measure rates. This is quite different from the probability given in the survey, which presumably comes from an imperfect Bayesian model with imprecise inputs.

I also don't want to belabor the point... but I'm pretty sure my probability of being struck by lightning today is far from 0.001%. Given where I live and today's weather, it could be a few orders of magnitude lower. If I use your unadjusted probabilit... (read more)

3
Sjlver
Richard Chappell writes something similar here, better than I could. Thanks Lizka for linking to that post! Maybe I should have titled this post differently, for example "Beware of non-robust probability estimates multiplied by large numbers".

Sorry for having been imprecise in my post -- I wrote the question from memory after having already submitted the survey. I'll change it to "avert".

There is some public information about this here: https://www.givewell.org/charities/amf#Registration

Details vary by country. It's often a process where enumerators go door-to-door and interview the head of household to determine how many people live in a household. There can be some incentives to over-report the number of people, to receive more bednets. However, there is a limit on the number of nets per household (usually 3 or 4), and some of the data is independently verified by a second team of enumerators.

For what it's worth, AMF has population data from distributing bednets to every household. As an organization that cares about being highly effective, AMF tries hard to get the number of nets right. The target is to have approximately one net per 1.8 people (a net covers two people usually, but then there are households with an odd number of people or with pregnant women).
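To make the "one net per 1.8 people" target a bit more concrete, here is a rough sketch in Python. It is not AMF's actual allocation rule: the function names, the "one net per two people, rounded up" rule, the cap of 4 nets, and the household sizes are all assumptions for illustration, and pregnant women are not modeled.

```python
import math


def nets_for_household(people: int, max_nets: int = 4) -> int:
    """Hypothetical rule: one net per two people, rounded up, capped per household."""
    if people <= 0:
        return 0
    return min(math.ceil(people / 2), max_nets)


def people_per_net(household_sizes: list[int], max_nets: int = 4) -> float:
    """Aggregate ratio of people to nets across a set of households."""
    total_people = sum(household_sizes)
    total_nets = sum(nets_for_household(p, max_nets) for p in household_sizes)
    return total_people / total_nets


# Made-up household sizes, purely for illustration.
households = [1, 2, 3, 4, 5, 6]
ratio = people_per_net(households)
print(f"{ratio:.2f} people per net")
```

Under these assumptions, households with an odd number of people pull the aggregate below two people per net, which is roughly why a target of about 1.8 is plausible.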

AMF distributed nets in five Nigerian states in the last two years. You can see these distributions here: https://www.againstmalaria.com/Distributions.aspx?MapID=68

AMF reports the populat... (read more)

3
DavidNash
Thanks for this, do you know what process AMF uses to verify the number of people in a house? And if there are any incentives to under/over report.

I've appreciated this response.

The biggest discrepancy seems to be around the number of nurses:

  • Lee writes that 1,709 nurses emigrated from Nigeria to the UK in a year, and that the UK takes ~85% of the total.
  • Nick cites a Guardian article claiming that 15,000 nurses emigrate per year, and says that less than 25% go to the UK.

Any insight on these large differences?
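For what it's worth, here is the arithmetic behind the two sets of figures as a quick Python sketch. The numbers are the ones quoted in the bullets above; the variable names are mine, and treating "~85%" and "less than 25%" as exact shares is a simplifying assumption.

```python
# Figures as quoted above (Lee's numbers).
lee_uk_nurses = 1709   # nurses emigrating from Nigeria to the UK in a year
lee_uk_share = 0.85    # UK share of total emigration, treated as exact

# Figures as quoted above (Nick's numbers).
nick_total = 15000     # nurses emigrating per year, per the Guardian article
nick_uk_share = 0.25   # upper bound on the UK share, treated as exact

# Total annual emigration implied by Lee's figures.
lee_total = lee_uk_nurses / lee_uk_share

# Upper bound on nurses going to the UK implied by Nick's figures.
nick_uk_max = nick_total * nick_uk_share

# How far apart the two totals are.
total_ratio = nick_total / lee_total

# UK share if Lee's UK count and Nick's total were both correct.
implied_uk_share = lee_uk_nurses / nick_total

print(f"Total implied by Lee's figures: {lee_total:.0f}")
print(f"Max UK count implied by Nick's figures: {nick_uk_max:.0f}")
print(f"Nick's total is about {total_ratio:.1f}x the total implied by Lee")
print(f"UK share if both raw counts hold: {implied_uk_share:.1%}")
```

If this is right, the UK count of 1,709 is not in itself inconsistent with Nick's figures. The disagreement is mostly about the total, and therefore about the UK's share of it.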

2
NickLaing
Thanks, yes, I would agree the article might hold water if UK nurses made up over perhaps 60 percent of emigration. Even on a sanity check, there's no chance only 2000ish nurses leave Nigeria every year. It's way too low to even be plausible. I think there's a major issue here (which is common and somewhat understandable) with giving credence to sources because they are perceived to be "trustworthy" even when their numbers are obviously meaningless. The WHO and OECD data cited should be dismissed out of hand for absurdity, but I think Lee gives it credence because it is seen as an official source. The nurses council in Nigeria has come out publicly multiple times saying that 45,000 nurses have left over the last 3 years and 16,000 last year. No one has refuted them, even though it would probably be in the interest of the Nigerian government and Western countries to do so for PR reasons. These numbers may well be exaggerated a little, sure (we can't know), but this is the most direct, proximal data we have, and it has been cited in news articles for months now. I don't see a good reason to take other secondary data sources seriously when they fail a sanity check.
Answer by Sjlver

TLDR: Full-stack software engineer (previously at Google and AMF) looking for part-time opportunities.

Skills & background: Expertise in software engineering for backend and frontend development, using a wide range of tech stacks. At AMF, I also worked on many data science tasks: automatic importing and cleaning of data, analyzing geospatial data, database design and optimizations. I have a security mindset and have done PhD research on software testing and hardening. I enjoy working with team members and partner organizations, and have excellent commun... (read more)

For European people on a budget, here's a multivitamin at €0.07 per day: https://www.amazon.de/-/en/Multivitamins-Minerals-Multivitamin-Essential-Vitamins/dp/B08BX439HX They don't deliver to the US, though. And you might want to add in some omega 3 fatty acids (DHA/EPA) for a more complete supplementation

What you write is almost right, but not 100%... we are getting at the heart of the problem here. Thanks for making me re-think this and state it more clearly!

Edited to add: I've now also read the discussion that you've linked to in your comment. It is now clear to me that the team has thought through issues like this... so I wouldn't be angry if you prefer to use your time more wisely than for responding to my ramblings :)

Assume as an example that, without my vote, there is the following situation:

  • candidate A received 933 points from other voters
  • candida
... (read more)

Thanks for setting up this donation election!

Choosing voting methods is difficult, and no voting method is without flaw. Nevertheless, I am somewhat unhappy with the method proposed here, because it is very difficult for users to support multiple candidates. The problem arises because the method tries to do two things: (1) determine which candidates are in the top three, and (2) determine their relative popularity.

The problem: as a voter who likes two candidates A and B, I cannot support A without harming B, and vice versa. My rational behavior is to alloc... (read more)

3
harfe
I think you are misunderstanding the mechanics of the elimination here. If you allocate nonzero points to both charities, then after one of A and B is eliminated, all points will be reallocated to the remaining charity. So, to maximize the chance that one of them ends up in the top three, it doesn't matter much whether you put 50 points on A and B each, or 99 points on A and 1 point on B (and actually, putting all points on one of A and B will do worse than these).

This is very well written. Thanks! It's the kind of article that sparks (my) curiosity.

I looked for some information on Helvetas' website. Helvetas is a Swiss charity that has been running safe water interventions for about 50 years; they are funded by private donors, but also receive development aid money from the Swiss government.

Helvetas provides some ideas why water interventions might help, besides diarrhea:

  • Disproportionally helps women and girls: Women and girls in poor communities often spend several hours a day fetching water => big opportunit
... (read more)

Thanks! I completely understand... putting these systems in place can be time-consuming, and the regulations differ for each country.

I hope you'll find great US/Canada candidates!


PS, but only tangentially related: I've recently documented the situation of someone working in Germany for an international organization, at https://blog.purpureus.net/posts/how-to-work-in-germany-for-a-foreign-organization/

This sounds interesting, thanks for posting!

I noted that the application is open to candidates in the US or Canada. Is that a strict requirement, or could you make exceptions?

1
JLRiedi
Hello! If folks have U.S. or Canadian identification but live in another country that's not a problem, but otherwise we don't have the administrative/payroll systems in place to hire outside those two countries at this time. As we grow we hope to consider employees outside these countries via an employer of record, but unfortunately we're not able to yet.

Here are some reasons why I think that units of ~100 households are ideal. The post itself has more examples.

  • It's best for detailed planning. There is a type of humanitarian/development work that tries to reach every household in a region. Think vitamin A supplementation, vaccination programs, bednet distributions, cash transfers, ... For these, one typically needs logistics per settlement, such as a contact person/agent/community health worker, some means of transportation, a specific amount of bednets/simcards/..., etc.

    Of course, the higher levels of

... (read more)

Thanks! This seems very relevant. I will try to contact the team.

Yes, I know about What Three Words. Thanks for the suggestion! It's a good opportunity to clarify the different aims of my project and W3W.

W3W is essentially the same as a GPS coordinate, except more memorable and easier to pronounce. A W3W place does not necessarily correspond to anything particular in the real world (like a settlement). Thus, W3W does not provide any added value for planning purposes.

There are some other downsides, such as W3W being proprietary and based on (IMO) bad design choices (e.g., hard to localize).

A better alternative to W3W is... (read more)

2
Sanjay
Can you expand on why the ideal unit is "the settlement, village, community, or neighborhood"?
4
Lorenzo Buonanno🔸
For more on this, and why I think we shouldn't advocate for W3W, see: https://shkspr.mobi/blog/2019/03/why-bother-with-what-three-words/ for theoretical reasons and https://w3w.me.ss/ for some practical examples. As you mention, https://plus.codes is indeed much better, although this is only tangentially related to your project

Prediction markets haven't moved all that much yet: https://manifold.markets/bcongdon/will-a-cell-cultured-meat-product-b

But I share your hopeful attitude :)

I find this an interesting discussion, and hope that it will continue.

My own knowledge of this domain is very limited. I'll just mention some points from World Without End (WWE)... not because I endorse them, but to keep the discussion going:

  • Because of low energy density, wind and solar require 1-2 orders of magnitude more land use, metal, and concrete per kWh than nuclear power. EROEI (Energy returned on energy invested) is worse.
  • If batteries are used, the numbers become even worse; also greenhouse gas emissions go up. WWE claims nuclear electricity em
... (read more)

Thanks for the write-up, Michelle! You write about your "hope that other like-minded parents will share their lessons and suggestions", so I decided to contribute a few thoughts.

I'm currently working as a software engineer for the Against Malaria Foundation (50%) and caring for our one-year-old (50%). My wife also has a 50% job.

Work time: Compared to what Michelle and Abby wrote, I have reduced my work time more strongly after becoming a parent. It felt important to me to experience my child growing up and to personally care for it. I can have 30 more pro... (read more)

4
ruthgrace
I'm ecstatic that AMF was able to arrange for you to work part time!! I've also been surprised by what good luck I've had with being able to get very flexible part time internships during my maternity leave and being able to go part time until my baby turned one at my day job. My advice for others on this is that if you've already cultivated a previous relationship with the people you work for or want to work for, it doesn't hurt to ask for a non traditional work arrangement. And then more generally, I think that people who want to have impact and also want to have kids can sometimes find creative solutions to have both.
3
Geoffrey Miller
Sjlver --thanks very much for these comments.  Regarding parental worries about financial security -- I agree that this is heavily dependent on where one lives. In countries with stronger social safety nets, parental leave, affordable housing, and socialized medicine (like Germany and the UK, to some degree), parents need not stress as much. In the US, parents worry a LOT about loss of jobs, which means loss of affordable health insurance; many jobs are less flexible in terms of hours, sick leave, and vacation time; and some cities are absurdly unaffordable for parents who need at least a 3 or 4-bedroom place. Another huge factor is whether public schools are good enough and safe enough for one's kids to actually go there -- or whether one needs to spend the extra for private schools. On the other hand, I agree with your point about kids not costing quite as much at a day-to-day level as one might think. In many cities there are thriving second-hand markets for kids' clothing, toys, equipment, strollers, etc -- we've bought almost nothing new.  It's easy for parents to get caught up in brand-conscious runaway consumerism --but hopefully EAs have the wit and perspective to avoid such nonsense! :)

I really liked this... the post made me think, and will continue to do that for some time. It doesn't seem all that unrealistic to me 🤯

One little nit: you seem to write "century" when you mean "decade".

3
mariushobbhahn
Thanks for pointing out the mistake. I fixed the "century" occurrences. 

Thanks for the thoughts!

I think we are getting closer to the core of your question here: the relationship between cases of malaria (or severe malaria more specifically) and deaths. I think that it would indeed be good to know more about the circumstances under which children die from malaria, and how this is affected by various kinds of medical care.

The question might partially touch upon SMC. Besides preventing malaria cases, it could also have an effect on severity (I'm thinking of Covid vaccines as an analogy). That said, the case for SMC (as I understa... (read more)

2
Seth Ariel Green 🔸
Thanks as always for your careful and helpful read! I was just telling someone yesterday that this exchange is a positive reflection on the EA community and ethos — as a comparison point, it’s been way more constructive and collaborative than any of my experiences with academic peer review. It sounds like I haven’t changed your mind on the core subject and that’s totally understandable. I speculate that this is something of a (professional) culture difference — the academics I discussed this essay with all started nodding along with the general idea the moment I mentioned “uncertainty about external validity” 😃 And thanks for the insight into AMF, y’all do great work.

Looks like I can confirm this. Relevant passages from Cissé et al (2006):

The study was designed to measure malaria, not deaths:

The primary outcome measure was a comparison of the occurrence of clinical malaria between children in the two study groups.

Children with positive malaria tests received treatment:

Malaria morbidity was monitored through home visits every week and by detection of study children who presented at one of three health centres in the study area. At each assessment, axillary temperature was measured, and if it was 37.5C or greater, o

... (read more)
1
Seth Ariel Green 🔸
Thank you for looking into it! Definitely interesting. To recap:
  • GiveWell's cost-benefit calculations hinge on the relationship between SMC and mortality.
  • The key mediator there is cases of malaria.
  • In the provided studies, the estimated relationship between cases of malaria and deaths is likely to be downwardly biased because of co-delivered interventions (ITN, HMM, and, as you've identified, just more attentiveness to malaria in general in treated areas).
  • As SMC is rolled out, is it rolled out along with more general medical care, or without? With co-interventions, or without? This seems like the key question we don't have a handle on and that GiveWell's materials don't shine much light on.
  • Let's say it's rolled out along with general medical care. In that case, what's actually doing the work in reducing mortality, SMC or medical care? And which set of costs (SMC, medical care, or the two combined) should factor into the $-per-life-saved calculation?
  • Let's say it's rolled out without that general medical care. In that case, do we really have a good estimate of the expected effects on mortality of just SMC? because that seems like the number GiveWell is basing its top charity title on, and at first glance, it's really not clear what percentage of the research actually estimates that directly.
  • So in sum, either SMC is typically going to be rolled out in places/contexts where its effect on mortality is likely to be much lower than broader data about the relationship between malaria and mortality would suggest, which means that our $-per-life-saved metrics might be seriously off-base; or it will be rolled out in places that are very much unlike the settings in which the studies were run, which is a serious external validity problem.
So all in all, a confusing situation. And given the high stakes, I suggest that GiveWell taps a team with expertise in both the subject matter and RCTs to design and run an intervention that maps directly o

I appreciate the thoughts! I'm going to think about this more thoroughly... but here's a quick guess about the low death numbers:

These trials involved measuring malaria prevalence in children. Presumably, children with a positive result would then get medication or be referred to a health center. Malaria is a curable disease, so this approach would save lives. Unfortunately, it's also quite likely that the child would not receive appropriate treatment in the absence of a diagnosis, due to lack of knowledge of the parents, distance to health facilities, etc.

Anyway, it's just a quick guess. Might be worth checking if the studies describe what happened to children with positive test results.

1
Sjlver
Looks like I can confirm this. Relevant passages from Cissé et al. (2006): The study was designed to measure malaria, not deaths: Children with positive malaria tests received treatment: I'll still think more about this... but here we have at least a lead towards better understanding of low death numbers in SMC trials.

The Right-Fit Evidence group provides good resources related to this post. They publish guidance on what types of evidence implementers should collect to demonstrate and monitor the impact of their programs.

Notably, different types of evidence are ideal depending on the stage of a program. In the beginning, when there is lots of uncertainty about an intervention, a randomized controlled trial is great. At a later stage, when the program is scaling to many recipients, it is more important to monitor the program and ensure that the implementation is done wel... (read more)

1
Seth Ariel Green 🔸
Thanks, this is very useful and new to me! (I briefly consulted/worked for IPA in 2015-2016.)

That seems fair. I agree that my request for an estimate is a big, maybe even unreasonable, request.

I asked because I am wondering if there really is enough reason to doubt the results of existing SMC trials. If I understand your post correctly, your main worry is not about actual errors in the trials; we don't have concrete reasons to believe they are wrong. Indeed, the trials provide high-quality evidence that SMC reduces malaria cases, including severe cases.

Your worries seem to be that (1) studies are underpowered to quantify reduction in malaria deat... (read more)

8
Seth Ariel Green 🔸
Hi Sjlver,

I've been thinking about this and I think you're right, I do believe that running this replication trial passes a cost-benefit test, and I should try to explain why. I think there's a 50% chance that a perfectly done SMC replication would find mortality effects that are statistically indistinguishable from a null, for two reasons: 1) the documented empirical effects are strange and don't gel with our underlying theory of malaria; 2) our theory also conflicts with the repeated observation that people living in extreme poverty don't seem to take malaria as seriously as outsiders do, which is prima facie evidence that we're misunderstanding something big.

  • My essay's thesis is that SMC's underlying RCT evidence, which is the foundation of GiveWell's cost-benefit analysis, is weaker than it appears at first glance.
  • Does the use of meta-analysis somewhat or largely obviate this problem? In my opinion, no, aggregation does not paper over structural issues in the data generation process.
  • One of the most striking things my co-authors found when meta-analyzing the contact hypothesis literature was the gap in effect size between studies that had a pre-analysis plan (d = 0.016) and those that didn't (d = 0.451). This obviously isn't dispositive that there's "no there there" with intergroup contact; but when subsequent high-quality studies on the subject found much more mixed results (e.g. here and here), at the very least, we can say we had a warning sign.
  • Can we supplement evidence that SMC reduces malaria cases with other putatively causal[1] evidence that intervening to reduce malaria leads to a sizeable reduction in deaths?
  • That depends on how seriously we take the argument that most published research findings are false. I myself take this very seriously, and I basically treat all research as provisional until it's been validated through a seriously well-identified study.
  • I'm not saying that we don't know that malaria causes dea

Yes, absolutely.

As far as I can tell, that type of RCT indeed is not being done. I don't know much about research on SMC specifically, but GiveWell reports the following quote from Christian Lengeler, author of the Cochrane Review of insecticide-treated bed nets:

To the best of my knowledge there have been no more RCTs with treated nets. There is a very strong consensus that it would not be ethical to do any more. I don't think any committee in the world would grant permission to do such a trial.

2
Karthik Tadepalli
That's fascinating, the norm is extremely different in economics and I have never heard of this norm. What is the boundary between a necessary replication and something that would be considered unethical?

Thanks for these thoughts!

A question: How large do you expect the effects of such a replication to be? Maybe you could estimate "a study of cost X would lead to a change in effect size of Y with probability Z" for some instances of X, Y, and Z. That would help to estimate whether the study would, in expectation, be worth more than one life saved per 5000 dollars.
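
To make the question concrete, here is a rough back-of-the-envelope sketch of such an estimate. Everything in it is an assumption for illustration: the function name, the idea that a fixed budget gets redirected to a next-best option after a disappointing result, and all example numbers are hypothetical and do not come from GiveWell or from this thread. Only the 5000-dollar benchmark is taken from the paragraph above.

```python
# Benchmark from the comment above: one life saved per 5000 dollars.
BENCHMARK_COST_PER_LIFE = 5_000


def expected_net_lives(study_cost, p_change, effect_drop,
                       budget_at_stake, alt_cost_per_life):
    """Expected lives gained by running the study, net of its cost.

    All inputs are hypothetical placeholders:
      study_cost        -- cost of the replication, in dollars
      p_change          -- probability that the study finds a smaller effect
      effect_drop       -- fractional reduction in effect size if it does (0..1)
      budget_at_stake   -- dollars that could be redirected after such a finding
      alt_cost_per_life -- cost per life saved of the next-best use of that money
    """
    # Lives the budget really buys if the effect is smaller than believed.
    lives_if_unchanged = (budget_at_stake / BENCHMARK_COST_PER_LIFE) * (1 - effect_drop)
    # Lives the same budget buys at the next-best option.
    lives_if_redirected = budget_at_stake / alt_cost_per_life
    # Funders only switch if switching actually saves more lives.
    gain_if_change = max(0.0, lives_if_redirected - lives_if_unchanged)
    # Lives forgone by spending the study's cost on research instead.
    lives_forgone = study_cost / BENCHMARK_COST_PER_LIFE
    return p_change * gain_if_change - lives_forgone


# Purely illustrative inputs, not estimates.
net = expected_net_lives(study_cost=5_000_000, p_change=0.2, effect_drop=0.5,
                         budget_at_stake=100_000_000, alt_cost_per_life=6_000)
print(f"Expected net lives from running the study: {net:.1f}")
```

A positive result would mean the study beats the benchmark in expectation under these made-up inputs; the conclusion is only as good as the probability and effect-size guesses that go in.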

And an observation: I think it would be very difficult to get ethical approval for such a study. SMC is (according to current knowledge) an amazing intervention. Any controlled trial would require a cont... (read more)

2
MHR🔸
I made an attempt to estimate the cost-effectiveness of replicating research on deworming in a previous post. There's especially large uncertainty in deworming's effect size, so I doubt you'd get as big an effect for SMC. But I think a similar Bayesian modeling approach could work for this!
1
Seth Ariel Green 🔸
Hi, and thanks for giving this a close read! I considered providing an estimate like the one you suggest, but shied away for two reasons: 1. I am not a subject matter expert and I don’t have a good sense of what the effect size would be; as GiveWell notes, across all seven studies, mortality in both groups is lower than you would expect, so there’s some disconnect between theory and empirics here that I/we lack context on; 2. the expected value of a new finding hinges on equilibrium effects that I can’t really get a handle on. Let’s say that GiveWell finds smaller effects than they expect and then shifts a different charity to be #1. Is that intervention’s evidence really solid, or should that intervention also be closely re-examined and then replicated? I do not know; if I had had more time I would have liked to do this type of analysis for the other three interventions as well. My hope is that if I help point GiveWell in the right direction, people who are more experienced at cost-benefit analysis can take it from there. My comparative advantage is reading RCTs and meta-analyses. As to the ethical concerns: that depends on whether the control group is likely to have received an anti-malaria treatment in the absence of an intervention, i.e. the point I made in section 2. If everybody is receiving bed nets anyway, let's study that population.
2
Karthik Tadepalli
Isn't that an objection to any RCT of treatments that have been shown to work in some contexts?

Jonas here, AMF software engineer.

Thank you for your research! I would really like to publish more of AMF's PDM data to enable this kind of work. Unfortunately, we have to prioritize how we spend our time in the small AMF team, and this task hasn't made it to the top yet.

If you were interested in doing a more in-depth analysis (and have the time required for this) it might be good to let Rob (our CEO) know. This can help in prioritizing this type of task.

1
brb243
Done, thanks.

(disclaimer: I work for AMF, but this is my personal opinion)

Yes, we have to prioritize. No, life quality seems the wrong metric for prioritization.

A few practical responses to the challenge first: AMF funds bednets at the scale of countries or provinces, that is, a few million nets at a time. This allows for efficient distributions that leverage economies of scale. Prioritization takes a number of factors into account, such as malaria prevalence (which might have an effect on the bednet use rate). Life quality metrics are currently not a factor for priori... (read more)

1
brb243
Thank you. This actually makes a lot of sense. The farming improvements (although they could differ across areas and studies) are astounding. For example, One Acre Fund increases farmers' annual income by about $100 or 50%, for the cost of about $25/farmer in 2021. Bednets have an equivalent nominal impact for about a fifth ($5) of the price. Sidenote: the lower % improvement suggests that AMF serves relatively affluent farmers (with average annual incomes of $633 ($76/12%*100%), which can have twice to five times the real value) (unless the $76 is real value). The agricultural productivity can increase because people are less sick and more productive. Also people could have a greater capacity to seek better farming practice information, livestock could be less ill (if bednets are used to cover livestock), and fishers could have better equipment. Also, children could be able to help with chores rather than occupy parents or older siblings to care for them. Reduced treatment spending can also be substantial. Assuming that malaria treatment costs $4 and a bednet prevents 2 cases of malaria per year, then a family with 5 children (who would be treated if they get malaria) can save $40/year, which can be a substantial proportion of their income. In terms of attendance, bednets can have limited effects (about an additional week of school per year?). That is about 10 days/year. If a bednet prevents half of the cases, that is 5 days or a week. The impacts on enrollment can be relatively larger due to the increased farming income and reduced treatment cost if education expenses are substantial. For example, if education costs $100/year, then an additional child can be educated. If education expenses are close to zero, then malaria does not affect enrollment. The quality of education or its relevance to employment is not directly addressed but can be addressed indirectly by enrolling a child in a better (higher paid) school. Reducing mortality can have positive

Related job ad, but not by the forum team; feel free to remove if not appropriate

The Against Malaria Foundation is close to Finnish time zones and completely remote. It currently has employees in the UK, Germany, and South Africa. One employee is working part-time due to parenting. AMF is hiring for several roles.

3
Lorenzo Buonanno🔸
Not sure about how I feel about making these sorts of comments, but potentially even more relevant roles: * Sweden Policy Consultant, Stockholm, Good Food Institute * Finland Policy Consultant, Helsinki, Good Food Institute * Expert, Communicable Diseases Prevention and Control, Stockholm, European Union

Sleeping under a bednet or getting a malaria vaccine are optional activities; people are free to choose to do that or not. (This is not quite accurate for children, where the decision probably lies with their caretakers.)

In post-distribution surveys, AMF consistently finds that most nets are being used as intended. People know that the nets protect against malaria. They also know the sickness, probably had it before, probably know someone who died from it. So it's an informed choice.

Based on this kind of observation, it seems to me that most people want to... (read more)

1
brb243
Let me challenge you here. Suppose that in a community inspired by Tsangano, Malawi, where people used 71% of nets which they freely received, the quality of life is -0.2 with an SD of 0.3 (normally distributed). 60 km away, in a place visually similar to Namisu, Malawi (where people used 95% of nets), the quality of life is 0.3 with an SD of 0.2. Each community has 2,000 people (who need about 1,000 nets). You have only 500 nets. Who are you going to give the nets to? Further challenge: You also have a pre-recorded radio show that improves farmers' agricultural productivity by coaching them to place only 1 grain 75 cm apart and cover with a few cm of soil rather than scattering the grain. This can increase people's productivity by an average of 20%. The airtime for the show in one community costs as much as 500 nets. Are you going to forgo any nets and buy the show? Are you subjectively assigning equivalent moral weights to the lives of the people in the two hypothetical communities?

People should be allowed to destroy the button (aka "x-risk reduction") ;-)

3
Answer by Sjlver

Better analytics for both authors and readers:

  • Readers can highlight sections of an article. The forum might then show a "featured highlight" similarly to how this works in Medium.
  • The forum can also measure how much screen time each paragraph in an article gets, and show this to users (a bit like a heatmap of where readers look). This could lead to improved writing, and incentivize shorter articles.
  • An article's engagement and read time can become factors used for ranking, as a complement to Karma.

In my previous job, we used the technique described below to prioritize feature requests and estimate their relative value. Feel free to skip this comment if you're not interested in slightly related survey techniques.

  • Show a random sample of five items to a survey participant
  • Participant selects the most important and least important (leaving three items "somewhere in-between")
  • Repeat

Each iteration creates six links between items (A > B, A > C, A > D, B > E, C > E, D > E) plus, transitively, A > E. After enough iterations, a prefer... (read more)
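
The link-generation step above can be sketched in a few lines of Python. This is only a minimal illustration: the function names are hypothetical, and the final "wins minus losses" score is a deliberately simple stand-in, not necessarily the ranking method we actually used.

```python
from collections import Counter


def links_from_choice(shown, best, worst):
    """Return (preferred, less_preferred) pairs implied by one best/worst pick."""
    if best == worst or best not in shown or worst not in shown:
        raise ValueError("best and worst must be two different shown items")
    middle = [item for item in shown if item not in (best, worst)]
    links = [(best, m) for m in middle]      # best beats every in-between item
    links += [(m, worst) for m in middle]    # every in-between item beats worst
    links.append((best, worst))              # follows transitively
    return links


def net_wins(all_links):
    """Stand-in score (an assumption): number of wins minus number of losses."""
    score = Counter()
    for winner, loser in all_links:
        score[winner] += 1
        score[loser] -= 1
    return score


# Two survey iterations with hypothetical items A..G.
round_one = links_from_choice(["A", "B", "C", "D", "E"], best="A", worst="E")
round_two = links_from_choice(["B", "C", "D", "F", "G"], best="C", worst="G")
scores = net_wins(round_one + round_two)
print(sorted(scores.items(), key=lambda kv: -kv[1]))
```

With five items per iteration, each call yields the six direct links plus the transitive best-over-worst link, so every answer contributes seven comparisons for only two clicks from the participant.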

3
Jonas Moss
Thank you for telling us about this! In economics, the discrete choice model is used to estimate a scale-free utility function in a similar way. It is used in health research for estimating QALYs, among other things, see e.g. this review paper. But discrete choice / the Schulze method should probably not be used by themselves, as they cannot give us information about scale, only ordering. A possibility, which I find promising, is to combine the methods. Say that I have ten items I_0, …, I_9 that I want you to rate. Then I can ask "Do you prefer I_i to I_j?" for some pairs and "How many times better is I_i than I_j?" for other pairs, hopefully in an optimal way. Then we would lessen the cognitive load of the study participants and make it easier to scale this kind of thing up. (The cognitive load of using distributions is the main reason why I'm skeptical about having participants use them in place of point estimates when doing pairwise comparisons.)

Hi, would you be interested in AMF's software engineer positions? We have Python-based data analysis tasks that you might find fun, and I bet you could pick up the rest of the tech stack quickly. I came to AMF from a similar background as you (Python/C++ @ Google) and found that many of the skills translated well into the new environment.

3
John Litborn
Yeah not a perfect fit for my current niche, but I have no problem picking up new techs and even coded some C# in school, so I'll definitely apply!