659Joined Jun 2022


Answer by FroolowNov 23, 2022110

You might be interested in an Adversarial Collaboration I wrote on this topic a few years ago. My collaborator was a meat-eater who was very strong on finding representative statistics (in fact he wrote the first draft of Section 2.2 to keep me extra-honest)

Yes I will do, although some respondents asked to remain anonymous / not have their data publicly accessible and so I need to make some slight alterations before I share. I'd guess a couple of weeks for this

I agree that the arith-vs-geo question is basically the crux when it comes to whether this essay should move FF's 'fair betting probabilities' - it sounds like everyone is pretty happy with the point about distributions and I'm really pleased about that because it was the main point I was trying to get across. I'm even more pleased that there is background work going on in the analysis of uncertainty space, because that's an area where public statements by AI Risk organisations have sometimes lagged behind the state of the art in other risk management applications. 

With respect to the crux, I hate to say it - because I'd love to be able to make as robust a claim for the prize as possible - but I'm not sure there is a principled reason for using geomean over arithmean for this application (or vice versa). The way I view it, they are both just snapshots of what is 'really' going on, which is the full distribution of possible outcomes given in the graphs / model. By analogy, I would be very suspicious of someone who always argued the arithmean would be a better estimate of central tendency than the median for every dataset / use case! I agree with you the problem of which is best for this particular dataset / use case is subtle, and I think I would characterise it as being a question of whether my manipulations of people's forecasts have retained some essential 'forecast-y' characteristic which means geomean is more appropriate for various features it has, or whether they have been processed into having some sort of 'outcome-y' characteristic in which case arithmean is more appropriate. I take your point below in the coin example and the obvious superiority of arithmeans for that application, but my interpretation is that the FF didn't intend for the 'fair betting odds' position to limit discussion about alternate ways to think about probabilities ("Applicants need not agree with or use our same conception of probability")

However, to be absolutely clear, even if geomean was the right measure of central tendency I wouldn't expect the judges to pay that particular attention - if all I had done was find a novel way of averaging results then my argument would basically be mathematical sophistry, perhaps only one step better than simply redefining 'AI Risk' until I got a result I liked. I think the distribution point is the actually valuable part of the essay, and I'm quite explicit in the essay that neither geomean nor arithmean is a good substitute for the full distribution. While I would obviously be delighted if I could also convince you my weak preference for geomean as a summary statistic was actually robust and considered, I'm actually not especially wedded to the argument for one summary statistic over the other. I did realise after I got my results that the crux for moving probabilities was going to be a very dry debate about different measures of central tendency, but I figured since the Fund was interested in essays on the theme of "a bunch of this AI stuff is basically right, but we should be focusing on entirely different aspects of the problem" (even if they aren't being strictly solicited for the prize) the distribution bit of the essay might find a readership there anyway.

By the way, I know your four-step argument is intended just as a sketch of why you prefer arithmean for this application, but I do want to just flag up that I think it goes wrong on step 4, because acting according to arithmean probability (or geomean, for that matter) throws away information about distributions. As I mention here and elsewhere, I think the distribution issue is far more important than the geo-vs-arith issue, so while I don't really feel strongly if I lose the prize because the judges don't share my intuition that geomean is a slightly better measure of central tendency I would be sad to miss out because the distribution point was misunderstood! I describe in Section 5.2.2 how the distribution implied by my model would quite radically change some funding decisions, probably by more than an argument taking the arithmean to 3% (of course, if you're already working on distribution issues then you've probably already reached those conclusions and so I won't be changing your mind by making them - but in terms of publicly available arguments about AI Risk I'd defend the case that the distribution issue implies more radical redistribution of funds than changing the arithmean to 1.6%). So I think "act according to that mean probability" is wrong for many important decisions you might want to take - analogous to buying a lot of trousers with 1.97 legs in my example in the essay. No additional comment if that is what you meant though and were just using shorthand for that position.

Thanks, this is really interesting - in hindsight I should have included something like this when describing the SDO mechanism, because it illustrates it really nicely. Just to follow up on a comment I made somewhere else, the concept of a 'conjunctive model' is something I've not seen before and implies a sort of ontology of models which I haven't seen in the literature. A reasonable definition of a model is that it is supposed to reflect an underlying reality, and this will sometimes involve multiplying probabilities and sometimes involve adding two different sources of probabilities. 

I'm not an expert in AI Risk so I don't have much of a horse in this race, but I do note that if the one published model of AI Risk is highly 'conjunctive' / describes a reality where many things need to occur in order for AI Catastrophe to occur then the correct response from the 'disjunctive' side is to publish their own model, not argue that conjunctive models are inherently biased - in a sense 'bias' is the wrong term to use here because the case for the disjunctive side is that the conjunctive model accurately describes a reality which is not our own. 

(I'm not suggesting you don't know this, just that your comment assumes a bit of background knowledge from the reader I thought could potentially be misinterpreted!)

I completely agree that the survey demographic will make a big difference to the headline results figure. Since I surveyed people interested in existential risk (Astral Codex Ten, LessWrong, EA Forum) I would expect the results to bias upwards though. (Almost) every participant in my survey agreed the headline risk was greater than the 1.6% figure from this essay, and generally my results line up with the Bensinger survey. 

However, this is structurally similar to the state of Fermi Paradox estimates prior to SDO 'dissolving' this - that is, almost everyone working on the Drake Equation put the probable number of alien civilisations in this universe very high, because they missed the extremely subtle statistical point about uncertainty analysis SDO spotted, and which I have replicated in this essay. In my opinion, Section 4.3 indicates that as long as you have any order-of-magnitude uncertainty you will likely get asymmetric distribution of risk, and so in that sense I disagree that the mechanism depends on who you ask. The mechanism is the key part of the essay, the headline number is just one particular way to view that mechanism.

In practice these numbers wouldn't perfectly match even if there was no correlation because there is some missing survey data that the SDO method ignores (because naturally you can't sample data that doesn't exist). In principle I don't see why we shouldn't use this as a good rule-of-thumb check for unacceptable correlation.

The synth distribution gives a geomean of 1.6%, a simple mean of around 9.6%, as per the essay

The distribution of all survey responses multiplied together (as per Alice p1 x Alice p2 x Alice p3) gives a geomean of approx 2.3% and a simple mean of approx 17.3%.

I'd suggest that this implies the SDO method's weakness to correlated results is potentially depressing the actual result by about 50%, give or take. I don't think that's either obviously small enough not to matter or obviously large enough to invalidate the whole approach, although my instinct is that when talking about order-of-magnitude uncertainty, 50% point error would not be a showstopper.

I'm not sure we actually disagree about the fact on the ground, but I don't fully agree with the specifics of what you're saying (if that makes sense). In a general sense I agree the risk of 'AI is invented and then something bad happens because of that' is substantially higher than 1.6%. In the specific  scenario the Future Fund are interested in for the contest however, I think the scenario is too narrow to say with confidence what would happen on examination of structural uncertainty. I could think of ways in which a more disjunctive structural model could even plausibly diminish the risk of the specific Future Fund catastrophe scenario - for example in models where some of the microdynamics make it easier to misuse AI deliberately. That wouldn't necessarily change the overall risk of some AI Catastrophe befalling us, but it would be a relevant distinction to make with respect to the Future Fund question which asks about a specific kind of Catastrophe.

Also you're right the second and third quotes you give are too strong - it should read something like '...the actual risk of AI Catastrophe of this particular kind...' - you're right that this essay says nothing about AI Catastrophe broadly defined, just the specific kind of catastrophe the Future Fund are interested in. I'll change that, as it is undesirable imprecision.

Both of the links you suggest are strong philosophical arguments for 'disjunctive' risk, but are not actually model schema (although Soares does imply he has such a schema and just hasn't published it yet). The fact that I only use Carlsmith to model risk is a fair reflection of the state of the literature.

(As an aside, this seems really weird to me - there is almost no community pressure to have people explicitly draw out their model schema in powerpoint or on a piece of paper or something. This seems like a fundamental first step in communicating about AI Risk, but only Carlsmith has really done it to an actionable level. Am I missing something here? Are community norms in AI Risk very different to community norms in health economics, which is where I usually do my modelling?) 

I don't know if a rough analogy might help, but imagine you just bought a house . The realtor warns you that some houses in this neighbourhood have faulty wiring,  and your house might randomly set on fire during the 5 years or so you plan to live in it (that is, there is a 10% or whatever chance per year the house sets on fire). There are certain precautions you might take, like investing in a fire blanket and making sure your emergency exits are always clear, but principally buying very good home insurance, at a very high premium.

Imagine then you meet a builder in a bar and he says, "Oh yes, Smith was a terrible electrician and any house Smith built has faulty wiring, giving it a 50% chance of fire each year. If Smith didn't do your wiring then it is no more risky than any other house, maybe 1% per year". You don't actually live in a house with a 10% risk, you live in a house with a 1% or 50% risk. Each of those houses necessitates a different strategy - in a low risk house you can basically take no action, and save money on the premium insurance. In the high risk house you want to basically sell immediately (or replace the wiring completely). One important thing you would want to do straight away is discover if Smith or Jones built your house, which is irrelevant information in the first situation before you met the builder in the bar, where you implicitly have perfect certainty. You might reason inductively - "I saw a fire this year, so it is highly likely I live in a home that Smith built, so I am going to sell at a loss to avoid the fire which will inevitably happen next year" (compared to the first situation where you would just reason you were unlucky)

I totally agree with your final paragraph - to actually do anything with the information there is an asymmetrically distributed ex post AI Risk requires a totally different model. This is not an essay about what to actually do about AI Risk. However hopefully this comment gives perhaps a sketch picture of what might be accomplished when such a model is designed and deployed.

I think you're using a philosophical framework I just don't recognise here - 'conjunctive' and 'disjunctive' are not ordinary vocabulary in the sort of statistical modelling I do. One possible description of statistical modelling is that you are aiming to capture relevant insights about the world in a mathematical format so you can test hypotheses about those insights. In that respect, a model is good or bad based on how well its key features reflect the real world, rather than because it takes some particular position on the conjunctive-vs-disjunctive dispute. For example I am very excited to see the results of the MTAIR project, which will use a model a little bit like the below. This isn't really 'conjunctive' or 'disjunctive' in any meaningful sense - it tries to multiply probabilities when they should be multiplied and add probabilities when they should be added. This is more like the philosophical framework I would expect modelling to be undertaken in.

I'd add that one of the novel findings of this essay is that if there are 'conjunctive' steps between 'disjunctive' steps it is likely the distribution effect I find will still apply (that is, given order-of-magnitude uncertainty). Insofar as you agree that 4-ish steps in AI Risk are legitimately conjunctive as per your comment above, we probably materially agree on the important finding of this essay (that the distribution of risk is asymmetrically weighted towards low-risk worlds) even if we disagree about the exact point estimate around which that distribution skews

Small point of clarification - you're looking at the review table for Carlsmith (2021), which corresponds to Section 4.3.1. The correlation table I produce is for the Full Survey dataset, which corresponds to Section 4.1.1.  Perhaps to highlight the difference, in the Full Survey dataset of 42 people; 5 people give exactly one probability <10%,  2 people give exactly two probabilities <10%, 2 people give exactly three probabilities <10% and 1 mega-outlier gives exactly four probabilities <10%. To me this does seem like there is evidence of 'optimism bias' / correlation relative to what we might expect to see (which would be closer to 1 person giving exactly 2 probabilities <10% I suppose), but not enough to fundamentally alter the conclusion that low-risk worlds are more likely than high-risk worlds based on community consensus (eg see section 4.3.3)

Load More