Bio

I am a generalist quantitative researcher. I am open to volunteering and paid work. I welcome suggestions for posts. You can give me feedback here (anonymously or not).

How others can help me

I am open to volunteering and paid work (I usually ask for 20 $/h). I welcome suggestions for posts. You can give me feedback here (anonymously or not).

How I can help others

I can help with career advice, prioritisation, and quantitative analyses.

Comments
3135

Topic contributions
42

Thanks for the reply, Vince.

For example, The Humane League's (THL's) future plans are a prioritized list of 24 ways that they would expand with additional funding. Not only is it not feasible for us to construct so many forward-looking CEAs

ACE's Recommended Charity Fund granted 399 k$ (= (85.4 + 314)*10^3), and the 3 highest priority future plans from THL have a cost of 455 k$. So, assuming that the amount granted to THL covers most of the additional funding it received as a result of your recommendation, you would only need to model the cost-effectiveness of 3 programs, or maybe just 2, as the 2 highest priority projects cost 78.7 % (= 358/455) as much as the 3 highest priority projects? In your last cost-effectiveness analysis of THL, you looked into 4 programs.
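The arithmetic in the comment above can be checked with a minimal sketch. The figures are the ones quoted there, in thousands of dollars. The variable names are hypothetical, and treating the 2 grant components as simply additive is an assumption taken from the comment's own formula.

```python
# Figures quoted in the comment above, in thousands of dollars (k$).
# Assumption: the 2 grant components simply add up, as in the comment's formula.
grant_parts_k = [85.4, 314]
total_grant_k = sum(grant_parts_k)

top_3_cost_k = 455  # cost of THL's 3 highest priority future plans
top_2_cost_k = 358  # cost of the 2 highest priority future plans

# Share of the top 3 projects' cost accounted for by the top 2.
share_top_2 = top_2_cost_k / top_3_cost_k

# Whether the grant alone would cover each set of projects.
grant_covers_top_2 = total_grant_k >= top_2_cost_k
grant_covers_top_3 = total_grant_k >= top_3_cost_k

print(f"Total granted: {total_grant_k:.0f} k$")
print(f"Top 2 as a share of top 3: {share_top_2:.1%}")
print(f"Grant covers top 2: {grant_covers_top_2}; top 3: {grant_covers_top_3}")
```

Under these figures, the grant would cover the 2 highest priority projects, but not all 3, which is why modelling just 2 or 3 programs may be enough.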

Funding is also fungible; dollars we direct to THL won't cleanly fund items 1-3 versus 4-6 on the list. Actual deployment depends on what other funders do, internal decisions, and new opportunities that emerge.

Makes sense. At the same time, I still think assessing the cost-effectiveness of marginal projects is valuable. Imagine THL ends up using ACE's funds to support projects 4 to 6 instead of projects 1 to 3, which had the highest priority at the time of your evaluation, and were the only ones covered by your cost-effectiveness analyses. Should one expect the cost-effectiveness of projects 4 to 6 to be closer to what ACE estimated for projects 1 to 3 than to the cost-effectiveness of THL as a whole? I think so. Mostly because projects 4 to 6 were much closer to being marginal at the time of your evaluation than random funds from THL. And partly because THL prioritising projects 4 to 6 would be an update towards projects 1 to 3 having a lower cost-effectiveness than you estimated, and projects 4 to 6 having a higher cost-effectiveness than you estimated.

Trying to predict which specific items our marginal dollar pays for is false precision, and we've seen charities greatly change their marginal plans within the two-year span of their recommendation.

I agree it would not make sense to predict changes in what charities will prioritise. I would just assess the cost-effectiveness of the highest priority projects covering the expected increase in funding resulting from your recommendation.

Current cost-effectiveness captures an organization's demonstrated ability to convert dollars into outcomes and that is often what is most likely to carry over to the next dollar.

I suspect the marginal cost-effectiveness of large organisations may be significantly lower than the cost-effectiveness of their whole spending. My intuition, having worked for the last 9 months at Anima International, is that our marginal cost-effectiveness is 10 % to 50 % of the cost-effectiveness of our whole spending. Relatedly, @abrahamrowe guessed 16 months ago that corporate campaigns for chicken welfare not funded by Coefficient Giving (CG), including ones run by organisations funded by CG, were as cost-effective as burning money, despite some of the ones funded by CG being beneficial. Here are some related reflections from Abraham. I was surprised by Abraham's claim at the time, but I have meanwhile come closer to his position, although I think I am still significantly more optimistic than he was at the time.

The method would probably work less well with an imagined reference scenario unless people have experienced something similar.

I agree. In addition, I think judgements about more recent experiences would be more reliable.

I am claiming that people's decision-making values the duration of pain in ways that may be irrational.

Makes sense. The peak-end rule suggests people may prefer experiences which are overall more painful if they end with pain of lower intensity.

The first is contexts where the ongoing work itself hasn’t been built out yet, so there isn’t a stable field at the margin to fund.

I would say thinking at the margin still makes sense for initial work. It is just that the thinking cannot be based on an established track record.

The second is one-off events like passing a regulation, where extra funding may not change the probability of winning at all, or might only at a specific moment, but it may still be worth taking the bet given high CE.

Extra funding which does not change the probability of winning at all has a cost-effectiveness of 0, neglecting effects besides those of passing the regulation?

On your offer to review CEAs, we’d love help on that for the next round and I’ll DM you to follow up.

Nice.

Instead, the key idea is to ask people to rate each experience on an unconstrained scale, with a reference point that is the same for everyone.

This is what I understood from Gemini's summary.

one could ask people to place their palm on a desk, then put a jug filled with three gallons of water on top of it, and then ask, "If the intensity of the pain you are feeling now is 20 [in the reference scenario], then what number best represents the intensity of the suffering you felt when X happened?" for different events X

The method also works, although less accurately, if people just read about the reference scenario (in the same way that they just read about X in your example)? If so, one could ask people to compare the intensity of different pain categories via time trade-offs? For example, if a person is indifferent between 10 h of annoying pain, and 1 h of hurtful pain, annoying pain would be 10 % (= 1/10) as intense for them as hurtful pain. 
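The time trade-off in the comment above can be written as a minimal sketch. It assumes, as the example there implicitly does, that indifference between 2 experiences means their intensity times duration is equal, which may not hold if people value duration non-linearly. The function name and parameters are hypothetical.

```python
def relative_intensity(hours_of_worse_pain: float, hours_of_milder_pain: float) -> float:
    """Intensity of the milder pain as a fraction of the intensity of the worse pain.

    Assumption: a person indifferent between the 2 experiences values them
    equally, and the value of each is intensity * duration.
    """
    if hours_of_worse_pain <= 0 or hours_of_milder_pain <= 0:
        raise ValueError("durations must be positive")
    return hours_of_worse_pain / hours_of_milder_pain


# Example from the comment: indifference between 10 h of annoying pain
# and 1 h of hurtful pain.
annoying_vs_hurtful = relative_intensity(hours_of_worse_pain=1, hours_of_milder_pain=10)
print(f"Annoying pain is {annoying_vs_hurtful:.0%} as intense as hurtful pain")
```

Repeating the same question for each pair of adjacent pain categories would give a chain of ratios, although errors would compound along the chain.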

Thanks for this, Chetan and Anita.

One-off events and policy lock-ins. The 2023 battery-cage ban failure. The empty regulatory space around broiler stocking density. The black soldier fly larvae industry, currently regulated by no one, and that won't last. These are problems where what matters is the size of the win, probabilities of success, how long it locks in, and whether it would have happened without you, not the marginal cost-effectiveness of one more dollar in the field.

How does the marginal cost-effectiveness (for example, the cost-effectiveness of a grant of a few k$) not matter? It is determined by factors like those you listed.

Cost-effectiveness scored lowest at 5.49.

How did you score the cost-effectiveness? Have you considered doing cost-effectiveness analyses (CEAs)? I would be happy to review some for free.

Hi Nick.

at the very least, given it's hard to detect if a human writes it or not, I think human readers should have the right to know who they are interacting with. Is it a human? Is it an AI? Is it a mix of both, and how did they mix?

For me, what matters more is that people who publish posts can be held accountable for their content, regardless of how much of it is written by LLMs. So I agree with keeping anonymous bots out. I even wonder whether it would be better to ban all anonymous users (after giving them the opportunity to identify themselves).

Hi Falk. Thanks for sharing that relevant article. Here is a summary from Gemini. To apply the method proposed there, one could assess the intensity of the Welfare Footprint Institute's (WFI's) pain and pleasure categories based on time trade-offs? I think the categories are still subjective, but more objective than a pain score from 0 to 10.

Hi Toby and Francis. Thanks for the update.

I have never used text generated by LLMs in my posts. However, I do not think authors should be required to disclose this. I would just let the visibility of posts be guided by their karma.

The laissez-faire option is flawed because LLM-generated writing is increasingly difficult to detect.

This is not good or bad in itself?

There are posts (I've seen a lot of these) which have the form of a good quality post which is worth reading, but on closer analysis turn out not to contain any ideas, or just to contain a couple of bullet points' worth of ideas, surrounded by a lot of fluff and repetition. This leads to quite a large waste of time for the reader.

The posts described above can be skimmed quickly, and then not voted on, or downvoted, thus not wasting much of readers' time, and not gaining visibility?

Hi Stefan. I liked your post. I remain open to bets against short AI timelines, or what they supposedly imply, up to 10 k$. Do you see any we could make that would be good for both of us under our own views?