For low probability of other civilizations, see https://arxiv.org/abs/1806.02404.

Humans don't have obviously formalized goals. But you can formalize human motivation, in which case our final goal will be abstract and multifaceted, and it will probably include a very broad sense of well-being. The model applies just fine.

Because it is *tautologically* true that agents are motivated against changing their final goals, this is just not possible to dispute. The proof is trivial: it comes from the very stipulation of what a goal is in the first place. It is just a framework for describing an agent. Within this framework, humans' final goals happen to be complex and difficult to discern, and maybe AI goals will be like that too. But we tend to think that AI goals will not be like that. Omohundro gives some economic reasons in his paper on the "basic AI drives", but also, it just seems clear that you can program an AI with a particular goal function and that will be all there is to it.

Yes, AI may end up with very different interpretations of its given goal but that seems to be one of the core issues in the value alignment problem that Bostrom is worried about, no?

The Pascal's Mugging thing has been discussed a lot around here. There isn't an equivalence between all causes and muggings, because the probabilities and outcomes are distinct and still matter. It's not the case that every religion, every cause, and every technology has the same tiny probability of the same large consequences, and you cannot satisfy every one of them because they have major opportunity costs. If you apply EV reasoning to cases like this, you just end up with a strong focus on one or a few of the highest-impact issues (like AGI) at heavy short-term cost. Unusual, but not a reductio ad absurdum.

There is no philosophical or formal system that properly describes human beliefs, because human beliefs are messy, fuzzy neurophysiological phenomena. But we may choose to have a rational system for modeling our beliefs more consistently, and if we do then we may as well go with something that doesn't give us obviously wrong implications in Dutch book cases, because a belief system with wrong implications does not fit our picture of 'rational' (whether we actually encounter those cases or not).

I think the same sheltering happens if you talk about ignoring small probabilities, even if the probability of the x-risk is in fact extremely small.

The probability that $3000 to AMF saves a life is significant. But the probability that it saves the life of any one particular individual is extremely low. We can divide up the possibility space any number of ways. To me it seems like this is a pretty damning problem for the idea of ignoring small probabilities.
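To make that concrete, here's a toy calculation (the probability and population figures are invented purely for illustration; they are not AMF's actual numbers):

```python
# Toy numbers, invented for illustration only -- not AMF's real figures.
p_save_someone = 0.9               # assumed: $3000 to AMF very likely saves *a* life
population_covered = 100_000_000   # assumed size of the population the nets reach

# Probability the donation saves any one *particular* individual,
# if we divide the possibility space person-by-person:
p_save_specific = p_save_someone / population_covered
print(p_save_specific)  # on the order of 1e-8, far below any plausible cutoff
```

The same donation looks like a near-certainty or a vanishingly small probability depending entirely on how you carve up the outcome space, which is the problem for any "ignore small probabilities" rule.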

We can say that the outcome of the AMF donation has lower variance than the outcome of an x-risk donation, assuming equal EV. So we could talk about preferring low variance, or being averse to having no impact. But I don't know if that will seem as intuitively reasonable when we circle our new framework back to more everyday, tangible thought experiments.

>Zeke estimates the direct financial upside of a successful replication to be about 33B$/year. This is a 66000:1 ratio (33B/500K = 66000).

This is not directly relevant, because the money is being saved by other people and governments, who do not normally use their money very well. EA money is much more valuable because it is spent far more efficiently than Western individuals and governments usually manage. NB: this is also the reason why EAs should generally be considered funders of last resort.

If the study has a 0.5% (??? I have no idea) chance of leading to global approval and effective treatment, then it's 35k QALY in expectation per my estimate, which means a point estimate of about $14/QALY. IIRC that's comparable to global poverty interventions, but at a much lower robustness of evidence; other top EA efforts with a similar degree of robustness will presumably have a much higher EV. Of course, the other diseases you could work on may be much worse causes.
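Spelling out that point estimate (the 0.5% success probability and the 7M-QALY impact figure are my rough guesses from the other comment; the $500K is the quoted study cost):

```python
# Sketch of the point estimate above. The success probability and the
# total-impact figure are rough assumptions, not established numbers.
study_cost = 500_000        # dollars, the quoted cost of the replication
p_success = 0.005           # assumed chance it leads to global approval
total_impact = 7_000_000    # QALYs if the treatment is approved and effective

expected_qalys = p_success * total_impact    # 35,000 QALY in expectation
cost_per_qaly = study_cost / expected_qalys
print(round(cost_per_qaly, 2))  # ~14.29 dollars per QALY
```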

Also, that $33B comes from a study on the impact of the disease. Just because the study replicates well doesn't mean the treatment truly works, gets approved globally, etc. Hence the 0.5% number being very low.

Last thread you said the problem with the funnel is that it makes the decision arbitrarily dependent upon how far you go. But to stop evaluating possibilities violates the regularity assumption. It seems like you are giving an argument against people who follow solution 1 and reject regularity; it's those people whose decisions depend hugely and arbitrarily on where they define the threshold, especially when a hard limit for *p* is selected. Meanwhile, the standard view in the premises here has no cutoff.

> One needs a very extreme probability function in order to make this work; the probabilities have to diminish *very fast* to avoid being outpaced by the utilities.

I'm not sure what you mean by 'very fast'. The implausibility of such a probability function is an intuition that I don't share. I think appendix 8.1 is really going to be the core argument at stake.

Solution #6 seems like an argument about the probability function, not an argument about the decision rule.

Going from moderate disease to remission seems to be an increase of about 0.25 QALY/year (https://academic.oup.com/ecco-jcc/article-pdf/9/12/1138/984265/jjv167.pdf). If this research accelerates treatment for the roughly two million sufferers by an average of 10 years, that's an impact of 5 million QALY.

Crohn's also costs $33B per year in the US plus major European countries (https://www.ncbi.nlm.nih.gov/pubmed/25258034). If we convert that at a typical Western cost-per-statistical-life-saved of $7M, with the average life saved worth +25 QALY, that's another 1.2 million QALY over those ten years. Maybe 2 million worldwide, because Crohn's is mostly a Western phenomenon (https://www.thelancet.com/journals/lancet/article/PIIS0140-6736(17)32448-0/fulltext?code=lancet-site). So that's 7 million QALY overall, which of course we discount by whatever the probability of failure is.
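As a sanity check, the whole back-of-envelope estimate can be reproduced in a few lines (the ~2M sufferer count, the 10-year acceleration, and the worldwide scale-up are my assumptions, as above):

```python
# Back-of-envelope Crohn's estimate. The sufferer count, acceleration
# period, and worldwide scale-up are rough assumptions.
qaly_gain_per_year = 0.25      # moderate disease -> remission
years_accelerated = 10         # assumed average acceleration of treatment
sufferers = 2_000_000          # assumed number of sufferers reached

direct_qalys = qaly_gain_per_year * years_accelerated * sufferers   # 5M

annual_cost = 33e9             # $/yr, US + major European countries
cost_per_life = 7e6            # typical Western cost per statistical life saved
qalys_per_life = 25

economic_qalys = annual_cost / cost_per_life * qalys_per_life * years_accelerated  # ~1.2M
worldwide_economic_qalys = 2_000_000   # scaled up beyond the West (assumed)

total = direct_qalys + worldwide_economic_qalys
print(total)   # 7,000,000 QALY overall, before discounting by P(failure)
```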

It's very rough but it's a step forward, don't let the perfect be the enemy of the good.

A lot of baggage goes into the selection of a threshold for "highly accurate" or "ensured safe" or statements of that sort. The idea is that early safety work helps even though it won't get you a guarantee. I don't see any good reason to believe AI safety is any more or less tractable than preemptive safety for any other technology; it just happens to have greater stakes. You're right that the track record doesn't look great; however, I really haven't seen any strong reason to believe that preemptive safety is generally ineffective. It seems like it just isn't tried much.