All of DanielFilan's Comments + Replies

Is x-risk the most cost-effective if we count only the next few generations?

Aum intended to kill thousands of people with sarin gas, and produced enough to do so. But they a) were not able to get the gas to a sufficiently high level of purity, and b) had issues with dispersal. In the 1995 Tokyo subway attack, they ended up killing 13 people, far less than the thousands that they intended.

IIRC b) was largely a matter of the people getting nervous and not deploying it in the intended way, rather than a matter of a lack of metis.

DanielFilan2y3

Thanks!

Is x-risk the most cost-effective if we count only the next few generations?

DanielFilan2y15

No chance there's a ~10 word summary of the executive summary? I'm interested but pretty sleep-deprived/jetlagged and found it hard to interpret the table.

elifland2y20

These were the 3 snippets I was most interested in

Under pure risk-neutrality, whether an existential risk intervention can reduce more than 1.5 basis points per billion dollars spent determines whether the existential risk intervention is an order of magnitude better than the Against Malaria Foundation (AMF).

If you use welfare ranges that are close to Rethink Priorities’ estimates, then only the most implausible existential risk intervention is estimated to be an order of magnitude more cost-effective than cage-free campaigns and the hypothetical shr...

DanielFilan2y5

(1) Sorry, I was copy-pasting the formula in the spreadsheet, but missed the extra 0.1 factor you added at the end. First of all, I still think the factor of 50 you're shaving off due to diminishing marginal returns seems quite extreme, given the lack of articulable justification for why it should be so big. I guess the extra factor of 10 is because diet cola exists? I'm not sure why you're adding that - as I mentioned, the questions you asked people already mentioned that people could substitute to diet sodas. Quoting from the instructions to participants...

DanielFilan2y16

Beyond potential a priori scepticism as to whether such a significant number of people not dying or suffering ill health from diabetes really is less valuable than loss of freedom to drink sugar drinks

I guess I want to add something here about why one would have the opposite prior: by and large, a decent model of people is that when they make decisions, they roughly weigh costs and benefits to themselves (or follow a policy that they adopted when weighing costs and benefits). Diabetes is mostly a cost to oneself. At least in the US, people are broadly aw...

Joel Tan🔸

Consolidating over the number of comments:

(1) Per the formula (1-ABS(((1000/(0.83*1.2))-(1000/0.83))/(1000/0.83)))^0.1, it would appear that we have 98.2% remaining freedom, and a 1.8% reduction?

(2) I think you're right on the averaging issue - in a number of other areas (e.g. calculating probabilities from a bunch of reference classes), we've tended to use the geomean, but that's for extrapolating from estimates to the true value, as opposed to going from true individual means to the true population mean. The other related issue, however, is whether the sample is representative, and whether we think that highly educated people are more likely to report caring about abstract concerns. Will have to think about this, but thanks for the feedback!

(3) As to your larger point, I'm not sure if it's reasonable to interpret individual behaviour when it comes to trading off short-term against long-term gains as maximizing, given time-inconsistent preferences (relative to valuing your welfare equally at all times), people just not thinking too much about daily choices, lack of awareness of the precise degree of risk (even if the risk were 10x or 0.1x, it's not like you would likely see a meaningful shift in behaviour), and of course motivated reasoning.
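The formula in point (1) above can be checked with a short sketch. This is only an illustration of the arithmetic as written in the comment, not the spreadsheet itself: the function name is hypothetical, and reading 0.83 as the pre-tax price and 1000 as the arbitrary budget is an assumption based on how the formula is laid out.

```python
def remaining_freedom(price_increase: float,
                      exponent: float = 0.1,
                      budget: float = 1000.0,
                      base_price: float = 0.83) -> float:
    """Fraction of "freedom of choice" left after a proportional price rise.

    Mirrors (1 - ABS((after - before) / before)) ^ exponent from the comment.
    budget and base_price are assumed readings of the 1000 and 0.83 terms.
    """
    before = budget / base_price                            # drinks affordable pre-tax
    after = budget / (base_price * (1.0 + price_increase))  # drinks affordable post-tax
    proportional_reduction = abs((after - before) / before)
    return (1.0 - proportional_reduction) ** exponent


remaining = remaining_freedom(0.20)  # 20% tax
print(f"remaining freedom: {remaining:.3f}")
print(f"reduction: {1.0 - remaining:.3f}")
```

Under these assumptions the budget and base price cancel out, so the result depends only on the size of the price rise and the exponent; most of the shrinkage from a roughly 17% drop in purchasing power to a roughly 2% drop in "freedom" comes from the 0.1 exponent.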

Then, using the y=x^0.1 formula, we take (1-d)^0.1 to find that 98% of "freedom of choice" still remains, and correspondingly, there was a 2% reduction.

The number in the sheet is a 0.2% reduction, not a 2% reduction. [EDIT: my bad, it's a 2% reduction, there's just another factor of 10 reduction that I mistakenly lumped into that]

I still disagree with your belief that the accuracy of the iterated questions format was lower than the accuracy of the fraction of income format - both questions had standard deviations that were approximately the same multiple of their means.
I think your original strategy of aggregating across the population using the arithmetic mean made sense, and don't understand what the justification is supposed to be for replacing it with a geometric mean [1]. Concretely, imagine a decision that affects two friends' lives, making one 50% worse, and the other 0.005

...

DanielFilan2y6

also noticed that there are a number of issues – probability of success, and also substitution with diet coke – which should be factored in

Just reread this - surely these don't need to be factored in? Probability of success affects the numerator and the denominator equally, and your poll respondents probably already knew about diet coke.

DanielFilan2y11

(1) Fair enough re sample (although it obviously limits how much of a conclusion you can draw). Re: the different variances, I basically dispute that the cash value method has a meaningfully lower variance than the hypothetical sequence method, because the relevant error is relative to the mean. That said, this factor of 3 is the smallest issue I have with your calculation.

(2) Could you show the calculations (perhaps in a simplified format, like I did)? The tax vs ban seems like it should affect the value of freedom similarly to how it affects the disease bur...

Joel Tan🔸

Hi Daniel - thanks for all the previous feedback. I did a tentative update to the CEA (https://docs.google.com/spreadsheets/d/1kdnvaeP5iAUAF_Fgcf_ZgZDci2cYV02M51N7HMTMOBQ/edit#gid=1248986269), taking into account the points you raised plus some other considerations. I also separated out the analysis into its own tab, hopefully for better clarity.

(1) One thing we've had issues with is large variance in estimates - and we typically try to use the geometric (rather than arithmetic) mean to better capture the differences in magnitudes. Our previous average of different people's estimates used a pure arithmetic mean (because of the presence of zeroes - people didn't value drinking sugary drinks at all, I guess). Partly because of our discussion on weighing, I tried a different approach of getting a geomean of the non-zeroes, and then creating a weighted average with the zeroes. The results are pretty sensitive - 0.0002 with this mixed method vs 0.001 with a pure arithmetic mean. I do think the former is probably the better method, insofar as arithmetic means are too sensitive to the high-end number, and have updated accordingly - but I'm not sure if there's a good answer one way or another!

(2) For diminishing marginal returns - I basically think of it in terms of a graph (x-axis is number of sugary drinks you can buy with your current income, y-axis is "freedom of choice"), and I take the relationship to be y=x^0.1. The proportional reduction in the number of sugar drinks you can buy is estimated like this: (a) calculating how much you can buy without the tax (an arbitrary USD 1000, divided by prevailing prices sourced from Walmart), (b) calculating how much you can buy after the tax (USD 1000, divided by prices subject to 20% tax), and then (c) taking ABS((b-a)/a) to get the proportional reduction (0.17, for a 20% price rise). (d) 1-c then gives how much you can still buy, proportionally (0.83). (e) Then, using the y=x^0.1 formula, we take d^0.1 to find that

DanielFilan2y7

Sounds like you roughly agree with me - 8.1 / 3.5 = 230%, which is close to 167%. Difference is I use the 5% reduction number for proportion of burden due to sugary drinks, getting 90 mil / 20 = 4.5 mil, 8.1 / 4.5 = 180%, and the rest is error built into these calcs.

MHR🔸

Gotcha, makes sense!

DanielFilan2y12

Wait: your survey numbers are for DALYs lost per year. So if the disease burden is 90 million DALYs per year, banning sugary drinks gives a benefit of 0.0006 DALYs per person-year, compared to 0.001 DALYs lost per person-year, meaning that the loss of freedom reduces the benefit by 167% (or 500% if you believe this comment). So now I'm really curious how your adjustments are bringing that down so much.
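The per-person-year comparison above can be reproduced roughly as follows. This is a sketch, not the report's calculation: the 5% share of the burden attributed to sugary drinks and the 8.1 billion population figure are assumptions taken from other comments in this thread, and the variable names are hypothetical.

```python
# Inputs quoted in the thread (see the assumptions noted in the lead-in).
disease_burden_dalys_per_year = 90_000_000  # DMT2 burden, assumed to be per year
share_due_to_sugary_drinks = 0.05           # assumption: 5% figure from the thread
world_population = 8_100_000_000            # assumption: figure used in the thread
freedom_loss_per_person_year = 0.001        # survey-based DALYs lost per person-year

# Health benefit of a ban, spread over everyone.
benefit_per_person_year = (
    disease_burden_dalys_per_year * share_due_to_sugary_drinks / world_population
)

# Freedom cost as a multiple of the health benefit, with and without
# rounding the benefit to 0.0006 as in the comment.
loss_ratio_unrounded = freedom_loss_per_person_year / benefit_per_person_year
loss_ratio_rounded = freedom_loss_per_person_year / round(benefit_per_person_year, 4)

print(f"benefit per person-year: {benefit_per_person_year:.6f} DALYs")
print(f"loss / benefit (unrounded): {loss_ratio_unrounded:.0%}")
print(f"loss / benefit (rounded):   {loss_ratio_rounded:.0%}")
```

On these inputs the gap between the 167% and 180% figures quoted in the thread appears to come only from rounding the benefit to 0.0006 before dividing.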

MHR🔸

EDIT: reading the full report, the 0.02% reduction in diabetes burden is from eliminating sugary drink consumption in a single country. I've updated my comments below to correct for that.

I'm also confused here, but I get different numbers than you do. My BOTEC:

* For 100% elimination of sugary beverages in all countries, the benefits seem like they'd be 0.0002*193*90,000,000 = 3,474,000 DALYs averted/year
* For 100% elimination of sugary drinks, the costs in reduced freedom seem like they'd be 8,100,000,000 * 0.001 = 8,100,000 DALYs/year

So this looks net negative in expectation.
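The BOTEC above can be written out as a small script. It is a sketch of the arithmetic only: the variable names are hypothetical, and reading 193 as the number of countries is an assumption based on the "single country" remark in the EDIT.

```python
# Inputs as quoted in the comment.
burden_reduction_per_country = 0.0002       # 0.02% of diabetes burden, one country
number_of_countries = 193                   # assumption: 193 read as a country count
disease_burden_dalys_per_year = 90_000_000  # DMT2 burden, DALYs per year
world_population = 8_100_000_000
freedom_loss_per_person_year = 0.001        # DALYs per person-year

# Health benefit of eliminating sugary drinks everywhere.
benefit_dalys_per_year = (
    burden_reduction_per_country * number_of_countries * disease_burden_dalys_per_year
)

# Cost in reduced freedom across the whole population.
cost_dalys_per_year = world_population * freedom_loss_per_person_year

net_dalys_per_year = benefit_dalys_per_year - cost_dalys_per_year

print(f"benefit: {benefit_dalys_per_year:,.0f} DALYs averted/year")
print(f"cost:    {cost_dalys_per_year:,.0f} DALYs/year")
print(f"net:     {net_dalys_per_year:,.0f} DALYs/year")
print(f"cost / benefit: {cost_dalys_per_year / benefit_dalys_per_year:.2f}")
```

With these inputs the cost comes out at a bit more than twice the benefit, which is where the "net negative" reading comes from; the conclusion is only as good as the 0.001 DALYs per person-year figure.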

FWIW I'm also suspicious of the 0.001 DALYs per person number.

AFAICT, the way you get it is by combining two methods: method 1 is to ask people a chain of questions like "as a fraction of death, how bad is life imprisonment", "as a fraction of life imprisonment, how bad is not being able to eat tasty stuff", "as a fraction of not being able to eat tasty stuff, how bad is not being able to have sugary drinks", multiply their answers to get how bad losing sugary drinks is as a fraction of dying, and then multiply by the fraction 64/74 (for remaining life yea...

Let me try to do a rough calculation myself: if you world-wide banned sugary drinks, each person would lose 0.001 DALYs total over the rest of their lives [EDIT: this is wrong, it's per annum, see this comment for a corrected version of the following calculation].

What's the disease burden caused by DMT2? Your report says roughly 90 million DALYs, I'm going to assume that's per year (you probably say this somewhere or it's probably an obvious convention, but I couldn't find it easily and don't know the conventions in this field). The global average age is ~...

DanielFilan2y12

FWIW I'm also suspicious of the 0.001 DALYs per person number.

DanielFilan2y17

OK I'm really confused - you calculate ~0.001 DALYs (which I guess is ~9 disability-adjusted life-hours) lost per person to eliminating people's freedom to consume sugary drinks, adjust that down because you're not eliminating it but just restricting it, make a second adjustment which I don't understand but which I'll assume is OK, multiply by the population of the average country to get a total number of DALYs, then:

Fourthly and finally, we divide by overall diabetes mellitus type 2 disease burden, thus creating an estimate -0.001% to be used as a downw

...

Also in cell 1F of the results spreadsheet, "salty food" should be "sugary drinks".

DanielFilan2y25

Who are the 16 people you surveyed to determine the disvalue of reducing people's freedom to buy sugary drinks?

[EDITED TO ADD: this comment buries the lede a bit: in a great-grandchild comment I do some napkin math that, if correct, indicates that the survey results mean that the reduced value of freedom entirely negates the health benefits]

Joel Tan🔸2y11

Hi Daniel!

To answer the various issues you raised (in order of when they were posted):

(1) We surveyed EAs, via the forum mainly. Obviously not a particularly representative sample, but bigger surveys are more expensive (even if you do them online via e.g. Lucid) and my sense coming in was that this was unlikely to affect the CEA materially (a lot of people don't have libertarian intuitions, for better or for worse), and so we didn’t commit too much time or money towards this issue. I will say that I suspect EAs – being highly educated and more likely...

DanielFilan2y6

we don't use superintelligent singletons and probably won't, I hope. We instead create context limited model instances of a larger model and tell it only about our task and the model doesn't retain information.

FYI, current cutting-edge large language models are trained on a massive amount of text on the internet (in the case of GPT-4, likely approximately all the text OpenAI could get their hands on). So they certainly have tons of information about stuff other than the task at hand.

Gerald Monroe

This is not what that statement means. What it means is the model has no context of its history since training. It has no context on whether the task it has been given is "real". It does not know if other copies of itself or other AIs are checking its outputs for correctness, with serious consequences if it sabotages the output. It doesn't know it's not still in training. It doesn't know if there are a billion instances of it or just 1. We can scrub all this information fairly easily and we already do this as of right now. We can also make trick output where we try to elicit latent deception by giving information that would tell the model it's time to betray. We can also work backwards and find what the adversarial inputs are. When will the model change its answer for this question?

DanielFilan2y5

I asked Alex "no chance you can comment on whether you think assistance games are mostly irrelevant to modern deep learning?"

His response was "i think it's mostly irrelevant, yeah, with moderate confidence". He then told me he'd lost his EA forum credentials and said I should feel free to cross-post his message here.

(For what it's worth, as people may have guessed, I disagree with him - I think you can totally do CIRL-type stuff with modern deep learning, to the extent you can do anything with modern deep learning.)

DanielFilan2y71

The core argument of Nick Bostrom’s bestselling book Superintelligence has also aged quite poorly: In brief, the book mostly assumed we will manually program a set of values into an AGI, and argued that since human values are complex, our value specification will likely be wrong, and will cause a catastrophe when optimized by a superintelligence. But most researchers now recognize that this argument is not applicable to modern ML systems which learn values, along with everything else, from vast amounts of human-generated data.

For what it's worth, the bo...

Gerald Monroe

As a side note, the actual things that break this loop are (1) we don't use superintelligent singletons and probably won't, I hope. We instead create context-limited model instances of a larger model and tell it only about our task and the model doesn't retain information. This "break an ASI into a billion instances each which lives only in the moment" is a powerful alignment method. (2) It seems to take an absolutely immense amount of compute hardware to host even today's models, which are significantly below human intelligence in some expensive-to-fix ways. (For example, how many H100s would you need for useful realtime video perception?) This means a "rogue" singleton would have nowhere to exist, as it would be too heavy in weights and required bandwidth to run on a botnet. This breaks everything else. It's telling that Bostrom's PhD is in philosophy and I don't see any industry experience on his wiki page. He is correct if you ignore real-world limitations on AI.

Nora Belrose2y11

Yep I am aware of the value learning section of Chapter 12, which is why I used the "mostly" qualifier. That said he basically imagines something like Stuart Russell's CIRL, rather than anything like LLMs or imitation learning.

If we treat the Orthogonality Thesis as the crux of the book, I also think the book has aged poorly. In fact it should have been obvious when the book was written that the Thesis is basically a motte-and-bailey where you argue for a super weak claim (any combo of intelligence and goals is logically possible), which is itself dubious ...