Bio

I am a generalist quantitative researcher.

How others can help me

I am open to volunteering and paid work (I usually ask for 20 $/h). I welcome suggestions for posts. You can give me feedback here (anonymously or not).

How I can help others

I can help with career advice, prioritisation, and quantitative analyses.

Comments
2934

Topic contributions
40

Thanks for the post, Michael.

However, any specific function or set of coefficients would (to me) require justification, and it’s unclear that there can be any good justification.

I also worry about the arbitrariness of the weights (coefficients) of the models. In Bob Fischer's book about comparing welfare across species, there seems to be only one line about the weights used to aggregate the tentative estimates of the welfare range (the difference between the maximum and minimum hedonistic welfare per unit time): "We assigned 30 percent credence to the neurophysiological model, 10 percent to the equality model, and 60 percent to the simple additive model". People usually give weights of at least 0.1/"number of models", which is at least 3.33 % (= 0.1/3) for 3 models, even when it is quite hard to estimate the weights. However, giving weights which are not much smaller than the uniform weight of 1/"number of models" could easily lead to huge mistakes.

As a silly example, if I asked random 7-year-olds whether the gravitational force between 2 objects is proportional to "distance"^-2 (the correct answer), "distance"^-20, or "distance"^-200, I imagine a significant fraction would pick the exponents of -20 and -200. Assuming 60 % picked -2, 20 % picked -20, and 20 % picked -200, one may naively conclude the mean exponent of -45.2 (= 0.6*(-2) + 0.2*(-20) + 0.2*(-200)) is reasonable. Yet there is lots of empirical evidence against this of which the respondents are not aware. The right conclusion would be that the respondents have no idea about the right exponent, because they would not be able to adequately justify their picks. I think we are in a similar situation with respect to comparing hedonistic welfare across species.
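
To make the arithmetic explicit, here is a minimal sketch of the naive aggregation in the gravity example, using the hypothetical survey shares above:

```python
# Naive credence-weighted mean of the exponents in the gravity example.
# The survey shares are the hypothetical ones above, not real data.
credences = {-2: 0.6, -20: 0.2, -200: 0.2}  # exponent -> share of respondents

naive_mean_exponent = sum(e * c for e, c in credences.items())
print(naive_mean_exponent)  # -45.2, far from the empirically supported -2
```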

Thanks for the post, Michael.

The more or larger such changes are necessary to get from one brain to another, the less tight the bounds on the comparisons could become, the further they may go both negative and positive overall,[2] and the less reasonable it seems to make such comparisons at all.

I agree comparisons become increasingly uncertain as the difference between the states of the organisms increases. However, I do not think there is a point where comparisons go from possible, albeit extremely difficult, to not possible at all. I would say there is just a progressive widening of the distribution representing the hedonistic welfare per unit time of a given state of an organism as it moves away from typical human states. As an example, I could say my hedonistic welfare right now is 0.5 to 1.5 times that of a random human who is awake, whereas that of a random nematode might be 10^-17 to 1 times that of a random human who is awake. I estimate the ratio between the individual number of neurons of nematodes and humans is 2.79*10^-9, whose square is 7.78*10^-18, roughly 10^-17.
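
As a minimal sketch of the nematode numbers above (the neuron-count ratio is my estimate, and using its square as the lower bound is the assumption in this comment):

```python
# Lower bound on the welfare range of nematodes relative to humans,
# assuming it scales with the square of the ratio between the individual
# number of neurons of nematodes and humans (my estimate).
neuron_ratio = 2.79e-9

lower_bound = neuron_ratio**2  # 7.78e-18, roughly 10^-17
upper_bound = 1
print(f"{lower_bound:.2e} to {upper_bound} times a random human who is awake")
```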

Thanks, Seizal. I agree. On the other hand, I personally only care about increasing welfare in expectation, so I would be happy to support interventions which are very unlikely to significantly increase welfare if they increase it cost-effectively in expectation. If I were in an original position where I had an equal chance of reincarnating as any of the individuals of a population, my expected welfare after the reincarnation would be proportional to the expected total future welfare of the population. So I believe maximising this corresponds to being as impartial as possible.
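
A minimal sketch of why the two maximisations coincide (the individual welfares are illustrative):

```python
# Expected welfare after reincarnating as a uniformly random individual
# equals the population's total welfare divided by its size, so maximising
# expected welfare is equivalent to maximising total welfare.
welfares = [2.0, -1.0, 3.0]  # hypothetical individual welfares

expected_welfare = sum(welfares) / len(welfares)  # total / N
print(expected_welfare)  # 1.33..., proportional to the total of 4.0
```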

  • What makes two actions incomparable, under the imprecise EV model, is that the interval of EV differences crosses zero.

Imagine 2 states of the world which are exactly the same, and have an imprecise expected welfare of -1 to 1. The difference between their imprecise expected welfare is -2 (= -1 - 1) to 2 (= 1 - (-1)), which crosses 0. So their expected welfare would be incomparable under your framework? I would say their expected welfare would be comparable, and exactly the same.
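
A minimal sketch of the objection, with the interval endpoints as above:

```python
# The interval-wise difference of an interval [a, b] with itself is
# [a - b, b - a], which crosses 0 even for two identical states.
a, b = -1.0, 1.0  # imprecise expected welfare of both (identical) states

difference = (a - b, b - a)
crosses_zero = difference[0] < 0 < difference[1]
print(difference, crosses_zero)  # (-2.0, 2.0) True
```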

I did not have a particular structure in mind for how the people in LMICs would regrant the funds. Thanks for giving examples, Mo.

My claim is that people should take time to understand their tools and account for their weaknesses. Accounting for weaknesses should happen not just within the tool, but outside of it when making the final decision.

I think GiveWell is a good example of this. If CEAs made up 100% of their decision-making process, their decisions would be heavily influenced by the weaknesses of CEAs as a method. However, GiveWell acknowledges these weaknesses and uses CEAs as a primary deciding factor while also incorporating other considerations.

Agreed.

The Digital Consciousness Model (DCM) is a first attempt to assess the evidence for consciousness in AI systems in a systematic, probabilistic way.

Is "Digital Consciousness Model" a bit of a misnomer, considering the model can also be used to assess the consciousness of biological organisms? Maybe it should simply be called Consciousness Model (CM)?

I believe any pains are quantitatively comparable if the pains of any 2 infinitesimally different states are quantitatively comparable.

I think the weakest part of the strongest version of my argument is that it requires the pains of any 2 infinitesimally different states to be quantitatively comparable with certainty. If they are only quantitatively comparable with, for example, probability 99 %, pains which are 1 k infinitesimal steps apart would only be quantitatively comparable with probability 0.00432 % (= 0.99^(1*10^3)).
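
A minimal sketch of the compounding, with the 99 % per-step probability above:

```python
# Probability that pains 1 k infinitesimal steps apart are quantitatively
# comparable, if each step is only comparable with probability 0.99.
p_step = 0.99
n_steps = 1_000

p_chain = p_step**n_steps
print(f"{p_chain:.5%}")  # 0.00432 %
```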

The fundamental problem remains. Like I mentioned in my original post, any system for decision making is going to be trading away truth for practicality.

Agreed. At the same time, I struggle to see practical cases where it makes sense to spend significant time on WFMs. I would rather improve cost-effectiveness analyses (CEAs), for example by accounting better for priors, modelling more effects, and gathering more evidence to decrease uncertainty in key inputs. GiveWell uses CEAs all the time, but has never included a WFM in their public analyses as far as I know. Gemini did not find any examples either.

You cite GiveWell as an example of an organization that takes EV estimates "close to literally". I assume by this you mean the EV estimates they make with respect to cost-effectiveness.

Yes. Elie Hassenfeld, GiveWell's CEO, mentioned the following on the Clearer Thinking podcast.

GiveWell cost-effectiveness estimates are not the only input into our decisions to fund malaria programs and deworming programs, there are some other factors, but they're certainly 80% plus of the case.

Isabel Arjmand from GiveWell elaborated on the above.

The numerical cost-effectiveness estimate in the spreadsheet is nearly always the most important factor in our recommendations, but not the only factor. That is, we don’t solely rely on our spreadsheet-based analysis of cost-effectiveness when making grants. 

  • We don't have an institutional position on exactly how much of the decision comes down to the spreadsheet analysis (though Elie's take of "80% plus" definitely seems reasonable!) and it varies by grant, but many of the factors we consider outside our models (e.g. qualitative factors about an organization) are in the service of making impact-oriented decisions. See this post for more discussion.
  • For a small number of grants, the case for the grant relies heavily on factors other than expected impact of that grant per se. For example, we sometimes make exit grants in order to be a responsible funder and treat partner organizations considerately even if we think funding could be used more cost-effectively elsewhere.

Hi Evan.

EV reasoning is vulnerable to Pascal's Mugging and the Optimizer's Curse.

One can account for priors (information besides the new evidence) to mitigate these issues, as suggested in Holden Karnofsky's post about not taking expected value estimates literally when they do not incorporate priors. I think GiveWell does take expected value estimates close to literally when they incorporate the vast majority of the available evidence.
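
As a minimal sketch of that adjustment, assuming a normal prior and a normally distributed error on the explicit estimate (the numbers are illustrative, not GiveWell's):

```python
# Precision-weighted (Bayesian) adjustment of an explicit expected value
# estimate towards a prior, in the spirit of Karnofsky's post. Highly
# uncertain estimates are shrunk heavily towards the prior, which guards
# against Pascal's Mugging and the Optimizer's Curse.
prior_mean, prior_var = 1.0, 1.0          # prior cost-effectiveness
estimate, estimate_var = 100.0, 10_000.0  # explicit estimate, very uncertain

posterior_mean = (prior_mean / prior_var + estimate / estimate_var) / (
    1 / prior_var + 1 / estimate_var
)
print(posterior_mean)  # about 1.01, far below the naive estimate of 100
```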
